Gridworks: Super Data Cleanup and Exploration Tool

In my presentation at the Spring 2010 Mid-Atlantic Regional Archives Conference (MARAC), Whirlwind Tour of Visualization-Land, I showed some screenshots of a tool called Gridworks. At the time, Gridworks was not available to the general public. The good news is that earlier this month Gridworks 1.0 was officially released and you can get Gridworks right now.

For those of you who didn’t see my presentation, Gridworks is a tool you run locally on your computer via a web browser. It permits you to load ‘grid-shaped data’ for examination, filtering and data cleanup. That makes it sound so much less exciting than it is. The best way to get a sense of what you can do is to watch the Gridworks Videos.

What sort of archival data do I think could be pumped into Gridworks? How about collection descriptive data and electronic record datasets? Since all the data is kept locally, you don’t need to worry about uploading your data to some anonymous server in order to work with it. It all stays safely on your local computer the whole time.

A quick list of things that Gridworks can do:

  • Cluster data to find values that are almost the same so you can normalize your data (for example – NYC vs N.Y.C.)
  • Create instant faceted browsing based on any column in your data
  • Provide scatterplots of the values from any two numeric columns as well as a way to spot the most interesting combinations across many possible columns
  • Reconcile and validate values against data from Freebase.com
  • Pull data from Freebase.com based on a matched column – such as the population of a country, if you have a column in your dataset with country specified
  • Split data within a cell based on a specified delimiter
  • Apply regular expressions and other simple code to data to create new columns

This list just scratches the surface, but it should give you a decent idea of the power of Gridworks. Even if the only feature you ever use is the one which lets you cluster and update your data to remove the ‘almost the same’ values, Gridworks can save you hours of painstaking data cleanup.
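
To give a rough feel for how finding ‘almost the same’ values can work, here is a minimal Python sketch of one common approach: reduce each value to a normalized ‘fingerprint’ and group the values whose fingerprints collide. This is an illustration only and not Gridworks’ own code. The function names fingerprint and cluster and the sample values are made up for this example.

```python
import re
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a normalized key: trim, lowercase, drop punctuation, sort unique tokens."""
    cleaned = re.sub(r"[^\w\s]", "", value.strip().lower())
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values by fingerprint; keep only groups with more than one spelling."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical sample column of place names
places = ["NYC", "N.Y.C.", "nyc", "Boston", "Washington, DC", "DC Washington"]

for group in cluster(places):
    print(group)
```

Each printed group is a set of spellings that probably mean the same thing, so you can pick one and update the rest. A person still needs to review each group, since a simple key like this can occasionally lump together values that really are different.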

Why is data cleanup exciting? Because once you have nice clean data with all the attributes that are useful to have for your data set, you can start playing with the data in visualization tools! So go watch some Gridworks Videos, get Gridworks for yourself and start playing with data. It is free and it makes working with data fun!


Posted on 29th May 2010
Under: electronic records, information visualization, learning technology, MARAC, metadata, software
