Understand These Data Science Terms Before Starting Your Career
Starting out in any field in tech can seem intimidating, especially if you’re not familiar with the terminology. These 20 data science terms to know can help you understand the jargon and give you an idea of what concepts you can learn in our Data Science and Analytics Bootcamp.
Algorithm: a step-by-step procedure, implemented in code, that is run on data to produce a result. In machine learning, an “algorithm” is a procedure that is run on data to create a machine learning “model,” which makes it a type of automatic programming.
Artificial Intelligence (AI): a machine or system that exhibits traits associated with the human mind, such as learning and problem-solving.
Bias: the systematic favoring of certain data or outcomes over others, which skews results. There are different types of bias, including confirmation bias, selection bias, and recall bias.
Correlation: a measure of the relationship between two variables, which can be strong or weak. It is useful for understanding how data sets relate to one another and for developing predictions.
Cross-Validation: a technique professionals use to test machine learning models. It estimates how a model will perform on unseen (or future) data that was not used in the training phase.
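To make the idea of a strong or weak relationship concrete, here is a minimal Python sketch that computes the Pearson correlation coefficient, one common way to measure correlation. The function name and the study-hours data are made up for illustration. A result near 1 or -1 suggests a strong relationship, and a result near 0 suggests a weak one.

```python
import math

def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much the two variables move together around their means.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return covariance / (spread_x * spread_y)

# Hypothetical data: hours studied and exam scores for five students.
hours_studied = [1, 2, 3, 4, 5]
exam_scores = [52, 58, 65, 70, 78]

r = pearson_correlation(hours_studied, exam_scores)
print(round(r, 3))  # close to 1, so the relationship is strong and positive
```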
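One common form of cross-validation is k-fold: the data is split into k parts, and each part takes a turn as the unseen test set while the rest is used for training. The sketch below is a simplified illustration, not a production recipe. The helper name, the data, and the stand-in “model” (which just predicts the average of the training values) are all assumptions, and in practice the data is usually shuffled first.

```python
def k_fold_splits(n_items, k):
    """Yield (train_indices, test_indices) pairs for k roughly equal folds."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and the number of items")
    indices = list(range(n_items))
    # Spread any remainder across the first few folds.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# Hypothetical target values; the stand-in "model" predicts the training mean.
values = [10, 12, 11, 13, 12, 14]

fold_errors = []
for train_idx, test_idx in k_fold_splits(len(values), k=3):
    prediction = sum(values[i] for i in train_idx) / len(train_idx)
    # Mean absolute error on the held-out fold only.
    error = sum(abs(values[i] - prediction) for i in test_idx) / len(test_idx)
    fold_errors.append(error)

average_error = sum(fold_errors) / len(fold_errors)
print(fold_errors, average_error)
```

Averaging the per-fold errors gives a single estimate of how the model might do on data it has not seen.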
Big Data: a volume of data so large or complex that it requires non-traditional methods to process it. Big data will always have at least one of the following characteristics: high velocity, high volume, or high variety.
Dark Data: data that an organization collects and stores but does not use for insight. From call center logs to social media feeds, these are chunks of data that typically go unanalyzed.
Database: an organized collection of structured data, stored electronically on a computer system, that can be accessed in various ways.
Data Analytics: the process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
Data Science: a field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from several sources of structured and unstructured data. Data science is related to data mining, machine learning, and big data.
Data Scraping: the act of extracting data from a human-readable output (like a website) and saving it into another program. Professionals can scrape data to compare pricing, research for web content, find sales leads, and more.
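As a rough illustration of scraping for price comparison, the sketch below pulls prices out of a small HTML snippet using Python's built-in html.parser module. The snippet, the “price” class name, and the PriceParser class are invented for this example. A real site would have its own markup, and its terms of use should be checked before scraping.

```python
from html.parser import HTMLParser

class PriceParser(HTMLParser):
    """Collect the text inside every <span class="price"> element."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price and data.strip():
            self.prices.append(data.strip())

# Hypothetical page content standing in for a downloaded web page.
page = """
<ul>
  <li>Notebook <span class="price">$4.99</span></li>
  <li>Backpack <span class="price">$39.00</span></li>
</ul>
"""

parser = PriceParser()
parser.feed(page)
print(parser.prices)  # ['$4.99', '$39.00']
```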
Data Visualization: visual elements that make data easily digestible. This is where storytelling comes into play. Not to be confused with a “report.”
Data Wrangling: the process of cleaning, structuring, and enriching raw data into the desired format for better decision-making in less time. Prepares data for analysis down the line.
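Here is a small sketch of what wrangling can look like in Python, assuming some made-up customer records with messy names, an impossible date, and a missing value. The field names and cleaning rules are illustrative choices. Real projects decide case by case whether to drop, fix, or flag bad rows.

```python
from datetime import date

# Hypothetical raw records, as they might arrive from a form or export.
raw_rows = [
    {"name": "  Ana ", "signup": "2023-01-15", "spend": "120.50"},
    {"name": "BEN", "signup": "2023-02-30", "spend": "80"},   # impossible date
    {"name": "cara", "signup": "2023-03-02", "spend": ""},    # missing spend
]

def clean_row(row):
    """Return a tidied copy of one raw record, or None if its date is unusable."""
    try:
        signup = date.fromisoformat(row["signup"].strip())
    except ValueError:
        return None  # drop rows whose date cannot be parsed
    spend_text = row["spend"].strip()
    return {
        "name": row["name"].strip().title(),                 # consistent spacing and casing
        "signup": signup,                                    # a real date, not text
        "spend": float(spend_text) if spend_text else None,  # a number, or None if missing
    }

cleaned = [r for r in (clean_row(row) for row in raw_rows) if r is not None]
for row in cleaned:
    print(row)
```

After this step, every remaining record has the same shape and data types, which is what makes later analysis easier.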
Machine Learning: part of artificial intelligence, machine learning is the study of computer algorithms that improve automatically through experience.
Model: a representation of one or more concepts that may be realized in the physical world. A “model” in machine learning is the output of an algorithm that has been run on data. A model represents what was learned by a machine learning algorithm.
Neural Networks: computing systems made up of interconnected nodes, loosely modeled on the way neurons in the human brain process information.
Structured Data: data that conforms to a data model and can be accessed by a computer program or identified by a person. Structured data can be sourced from online forms, metadata, and servers.
Unstructured Data: data that lacks an identifiable structure or format. It does not fit a predefined data model, so it is hard for computer programs to read and interpret. Unstructured data can include memos, social media posts, or web pages.
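The difference between an algorithm and a model is easier to see in code. In this minimal sketch, fit_line is the algorithm (ordinary least squares for one input variable) and the dictionary it returns is the model. The function names and training data are assumptions chosen for illustration.

```python
def fit_line(xs, ys):
    """Algorithm: ordinary least squares for a single input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    # The returned values are the "model": what was learned from the data.
    return {"slope": slope, "intercept": intercept}

def predict(model, x):
    """Use the learned model to estimate y for a new x."""
    return model["slope"] * x + model["intercept"]

# Hypothetical training data that follows y = 2x + 1 exactly.
model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(model)               # {'slope': 2.0, 'intercept': 1.0}
print(predict(model, 10))  # 21.0
```

The same algorithm run on different data would produce a different model, which is why the two terms are kept separate.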
Turn Curiosity Into a Career
These 20 terms are just the tip of the iceberg when it comes to data science. Learn more about this rich profession and the career opportunities that come with it. Schedule a call with a dedicated admissions advisor and ask about how you can test-drive the program with our Introductory Course.