Data Science (DS) Guide

This guide will introduce you to data resources, as well as how to cite data, work with data, and ask questions about data.

Researching Data

Step 1: If you're not sure where to start, ask a librarian for help!

Step 2: Conduct a search using keywords that match your research question.  Consider using a database relevant to your area of study.

Step 3: When reading an article, use the Methods section.  Every empirical research paper should have one, and it will describe the data.  Ask yourself these questions:

  • What data did they use?
  • Did they cite their data? (Data citation can make it easier to track down where to find the data)
  • Who is the author?  What is the year of publication?
  • Track the following descriptors: claim, data, dependent variable/estimation technique, and significant finding

Step 4: Repeat Steps 1-3 with different keywords to find more articles.

*Consider using Google Scholar in addition to any resources found in this LibGuide!
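One way to keep the descriptors from Step 3 organized is a simple reading log saved as a spreadsheet-friendly CSV file. The sketch below is only an illustration: the `ArticleNote` class, its field names, and the sample entry are hypothetical and not part of any standard, so adapt them to your own project.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class ArticleNote:
    """One row of a reading log. Field names are illustrative assumptions."""
    author: str
    year: int
    claim: str
    data: str
    dependent_variable: str
    estimation_technique: str
    significant_finding: str
    data_cited: bool


def notes_to_csv(notes):
    """Serialize a list of ArticleNote objects to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(ArticleNote)],
        lineterminator="\n",
    )
    writer.writeheader()
    for note in notes:
        writer.writerow(asdict(note))
    return buffer.getvalue()


# Placeholder entry for illustration only; this is not a real article.
example = ArticleNote(
    author="Example Author",
    year=2020,
    claim="Placeholder claim",
    data="Placeholder dataset name",
    dependent_variable="Placeholder outcome",
    estimation_technique="Placeholder method",
    significant_finding="Placeholder finding",
    data_cited=True,
)

print(notes_to_csv([example]))
```

Adding one entry per article makes it easier to compare which datasets and techniques recur across the papers you find in Step 4.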

Think about your research question.  What is your topic?  What kind of claim do you want to make?  What kind of evidence do you need to support your claim?  Use the answers to these questions to find your ideal datasets.

Keep in mind that the ideal dataset may not exist.  Here are some obstacles that may make finding your perfect dataset difficult:

  1. The data doesn't exist.
  2. The data exists but is restricted, so you can't access it.
  3. The data you want costs money to access.
  4. The data is covered by intellectual property rights or terms of use that may prohibit your use of it.
  5. The data you want may not be in "machine readable" form.  This is often true of historical data, which may exist only in a book or on paper.  Ask a librarian for help accessing it.

Consider which organization may have collected this data.  Common sources of data include:

  • Researchers
  • Government agencies (e.g., Census, BLS, BEA)
  • NGOs and IGOs (e.g., U.N., World Bank)
  • Research organizations
  • Private companies

Content for this page has been adapted from:

Evaluating Data

Be sure to ask yourself if the potential data is reliable:

  1. Find overview information
    • Who created the data and why?
    • Where and when was the data collected?
    • Is this data still accurate or relevant?
  2. Find technical documentation
    • Look for documentation on how the dataset was created, how its variables are defined, and what information was included or excluded.
  3. Identify the download options and access restrictions
    • Who can use the data?  What download formats are available (.csv, .txt, .json, etc.)?
    • Ask a librarian for help if you are unsure about the data's reliability or accessibility, or if you have any other questions.
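Once you have downloaded a dataset, a quick first check is to compare its columns and gaps against the technical documentation. The sketch below assumes a .csv download and uses only Python's standard library. The `summarize_csv` helper and the sample rows are made up for illustration and stand in for a real file.

```python
import csv
import io


def summarize_csv(text_stream):
    """Return (column names, row count, empty-cell count per column)."""
    reader = csv.DictReader(text_stream)
    columns = list(reader.fieldnames or [])
    missing = {name: 0 for name in columns}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in columns:
            value = row.get(name)
            # Treat absent or blank cells as missing values.
            if value is None or value.strip() == "":
                missing[name] += 1
    return columns, row_count, missing


# Made-up sample standing in for a downloaded file.
sample = io.StringIO(
    "state,year,median_income\n"
    "A,2019,50000\n"
    "B,2019,\n"
    "C,,61000\n"
)

columns, row_count, missing = summarize_csv(sample)
print(columns)    # ['state', 'year', 'median_income']
print(row_count)  # 3
print(missing)    # {'state': 0, 'year': 1, 'median_income': 1}
```

To run the same check on a real download, pass an open file instead, for example `open("your_file.csv", newline="")`, where the file name is a placeholder. Columns that are missing from the documentation, or that contain many blanks, are worth raising with a librarian.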

Developing Keywords

Keywords are the building blocks of research. However, finding ones that produce the kind of search results you are looking for can be tricky. It can be helpful to think through these questions when developing keywords:

  1. What are the important nouns in your research question?
  2. What are synonyms for those important nouns? How might different groups refer to your topic?
  3. Who are the key figures, and what are the key events, that you learned about during your research with reference sources?

Like the research process, developing keywords is an iterative process, so don't panic if your first keywords don't produce the results that you were expecting.
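The questions above can be turned into a search string by grouping synonyms together and then combining the groups. The sketch below assumes your database accepts AND, OR, parentheses, and quoted phrases. Many do, but check the database's help page. The `build_query` helper and the example terms are hypothetical.

```python
def quote_phrase(term):
    """Wrap multi-word terms in double quotes so they are searched as a phrase."""
    term = term.strip()
    return f'"{term}"' if " " in term else term


def build_query(concepts):
    """Join synonyms with OR inside parentheses, then join concepts with AND."""
    groups = []
    for synonyms in concepts:
        terms = [quote_phrase(t) for t in synonyms if t.strip()]
        if terms:
            groups.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(groups)


# Each inner list holds one important noun plus its synonyms (example terms only).
concepts = [
    ["minimum wage", "wage floor"],
    ["employment", "jobs"],
]

print(build_query(concepts))
# ("minimum wage" OR "wage floor") AND (employment OR jobs)
```

If the first search returns too little, add synonyms to a group. If it returns too much, add another concept group to narrow the results.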

Below are some resources from other libraries with further tips for developing keywords.