Tutorial and Assignment 2 - Data Cleaning

In this tutorial you will use Google Refine (now called OpenRefine) to clean a dataset. We will perform some of the cleaning together in class.

Getting Started


Install Google Refine. You can download and install it from here:

 http://openrefine.org/download.html

You should install the version called "OpenRefine 2.7" at the top of the page.

The documentation for Google Refine / Open Refine is available here.

There is also a set of nice introductory tutorials available on YouTube: Part 1, Part 2, Part 3.

Here are helpful pointers to the Open Refine Expression Language.

Files


universityData.csv - A file containing sample data we will use in the tutorial.