The purpose of this guide is to serve as a starting point for your research needs in the field of Data Science. Because Data Science is a rapidly growing and evolving field, this guide will be updated as new resources become available. If you have any questions, feedback, or suggestions, let the Library Staff know!
Data is made up of individual pieces of information that are collected, organized, and stored for later use. Data is a human artifact: it’s always made by and for people in a specific context. Therefore, it’s important to think about how and why the data you are using was collected, or how and why you are collecting data yourself. Because data is both a core building block of digital information systems and a widespread commodity, you must always think consciously and ethically about data collection.
Data Science is a multidisciplinary field of study that combines principles and practices from the fields of mathematics, statistics, artificial intelligence, and computer engineering to analyze large amounts of data. This analysis helps data scientists to ask and answer a wide variety of research questions in many different fields and disciplines. Data Science work spans the areas of data collection and experimental design, exploratory data analysis and visualization, statistical modeling and machine learning, and interpretation and communication of results.
Washington & Jefferson College offers several programs and areas of study in Data Science. For more, refer to the Data Science Programs and Courses page of this guide.
Data reliability refers to how well data can be relied upon to be complete, accurate, consistent, and free from errors across time and sources. Relatedly, data validity concerns the accuracy, structure, and integrity of the data. As you decide which resources to use for datasets, consider the following questions:
To choose reliable data resources, consider the accuracy, authority, objectivity, currency, and coverage of the resource. Evaluate the source's reputation, credentials, and potential biases. Make sure that the data is up to date and relevant to your needs. To read more about determining the reliability and validity of a data resource, use the following links:
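Once you have a dataset in hand, a few quick checks can help you see the ideas of completeness, consistency, and accuracy in practice. The Python sketch below is only an illustration under stated assumptions: the sample records, the field names (`id`, `year`, `value`), the function names, and the acceptable year range are all hypothetical and are not drawn from any particular data resource. It uses only Python's standard library.

```python
from collections import Counter

# Hypothetical sample records; in practice these would come from
# the dataset you are evaluating.
records = [
    {"id": 1, "year": 2021, "value": 10.5},
    {"id": 2, "year": 2022, "value": None},  # missing value
    {"id": 2, "year": 2022, "value": None},  # exact duplicate of the row above
    {"id": 3, "year": 1850, "value": 7.2},   # year outside the assumed range
]


def completeness(rows, field):
    """Return the share of rows in which `field` is present and not None."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(field) is not None)
    return filled / len(rows)


def duplicate_rows(rows):
    """Return each row that appears more than once (compared on all fields)."""
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return [dict(key) for key, n in counts.items() if n > 1]


def out_of_range(rows, field, low, high):
    """Return rows whose `field` is present but falls outside [low, high]."""
    return [
        row for row in rows
        if row.get(field) is not None and not (low <= row[field] <= high)
    ]


if __name__ == "__main__":
    print("Completeness of 'value':", completeness(records, "value"))
    print("Duplicate rows:", duplicate_rows(records))
    # The 1900-2030 range is an assumption chosen for this example only.
    print("Suspicious years:", out_of_range(records, "year", 1900, 2030))
```

Checks like these do not replace evaluating a source's authority, objectivity, and currency. They only flag rows that deserve a closer look before you rely on the data.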
Gaining background information on a topic is a critical first step in the research process. Background information helps you learn more about your topic, identify important facts related to your topic (keywords, dates, events, history, and names and organizations), refine your topic, and find additional sources of information through bibliographies and works cited pages.
You might do this by conducting a Google search or going to Wikipedia and reading up on a subject. Below are some reference resources from the library that can also help you familiarize yourself with your topic.