Data Collection

For this assignment you continue to work on the project that you chose previously and for which you have derived first research questions. The purpose of this assignment is to collect the first datasets that will form the basis of your project work.

Your mission for this assignment is the following:

  1. Each team member searches for 2 datasets relevant to your project work, for a total of 4 different datasets in a team of 2. These datasets may be accessible in any form (API, download, data you have collected yourself, ...). The only requirement is that the data is already available, or will at least be available by the assignment deadline. Do not download these datasets yet, but keep track of where you found them.
  2. Each team member reads the documentation for their datasets and notes the dimensions/variables and the number of data points they contain, as well as other relevant factors such as when the data was collected.
  3. You meet as a team and decide on which of the found datasets might best provide enough interesting exploratory content to support your research question. Your result may be that a single dataset is interesting enough or that you might want to combine some of the datasets you found. It is possible that in this process you might want to narrow down or modify your research question. Keep track of decisions that you make in this regard.
  4. As a team, download the chosen dataset/s and share them among the team.
  5. Write a brief report in which you - in the following order:
    1. Have a title page with your team name and the name of each team member
    2. Write a short paragraph about each dataset each team member found in the first step above. Provide a link to the source and a short description.
    3. Write a short description of which dataset/s you chose to download. Give the rationale for your choice and additional detail such as a) where it was found and how it was collected, b) when it was downloaded/extracted/scraped, c) how it was processed (if at all), d) who (which person/organization) is behind the dataset.
    4. In the process of choosing a dataset you will be able to narrow down from the three research questions you posed in the previous assignment. You might have to also modify your initial research questions. In the report describe which question/s you choose to focus on. Describe your initial question proposed in the last assignment and your modified version. Explain why you changed it.
    5. Include a link to the dataset/s you collected (e.g. Dropbox, Google Drive, Box).
    6. OPTIONAL: Any supporting material you wish to add, such as code written for a scraper or for querying an API.
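
As an illustration of steps 2 and 5.3 above, the following is a minimal Python sketch, not a required method. It counts the variables and data points in a CSV dataset and records the provenance details a) to d) requested in the report. The function names, the field names, the sample data, and the example.org URL are all hypothetical and only stand in for your own dataset.

```python
import csv
import io
import json
from datetime import date


def summarize_csv(text):
    """Return the column names and the number of data rows in CSV text."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])                 # first row holds the variable names
    n_rows = sum(1 for row in reader if row)  # count non-empty data rows
    return {"columns": header, "n_columns": len(header), "n_rows": n_rows}


def provenance_record(source_url, publisher, collected_how, downloaded_on,
                      processing="none"):
    """Bundle the provenance details asked for in the report (hypothetical fields)."""
    return {
        "source_url": source_url,                    # a) where it was found
        "collected_how": collected_how,              # a) how it was collected
        "downloaded_on": downloaded_on.isoformat(),  # b) when it was downloaded
        "processing": processing,                    # c) how it was processed
        "publisher": publisher,                      # d) who is behind the dataset
    }


# Hypothetical sample standing in for a real dataset file.
SAMPLE = "city,year,population\nParis,2020,2148000\nLyon,2020,516000\n"

if __name__ == "__main__":
    report = {
        "summary": summarize_csv(SAMPLE),
        "provenance": provenance_record(
            source_url="https://example.org/data.csv",  # placeholder URL
            publisher="Example Statistics Office",      # placeholder name
            collected_how="annual census",
            downloaded_on=date.today(),
        ),
    }
    print(json.dumps(report, indent=2))
```

For a real file, the contents could be read with `open(path, newline="", encoding="utf-8").read()` and passed to `summarize_csv`; datasets served through an API or in other formats would need a different loader.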

Submitting the Assignment


WHAT - To complete the assignment you should:

  1. Submit a single PDF report called "YOUR_TEAMNAME-Assignment-3.pdf" via email.

WHERE - You should email the file to petra.isenberg@inria.fr with the subject VA-Assignment-3.

WHEN - Assignment 3 is due before 23:00 on Oct 7th.