What Is Data Science?

Data science…explained.

Flatiron School / 29 May 2018

Data science is so hot right now.

Surely, there’s been a point where you’ve come across a conversation about data science and all you could do was nod in agreement because you didn’t know what was going on. Data science can seem overwhelming, but it’s also an incredibly exciting field at the cutting edge of technology. Whether you’re curious to learn what data science is or want to learn what data science opportunities are available to you, we’re here to help with the classic question: “What is data science?”

What is data science?

There are many definitions for data science, but we like to think of the field as the multidisciplinary approach to unlocking stories and insights from the data being collected on a variety of behaviors, topics, and trends. Data science is everywhere, and chances are you’ve already interacted with it many times today. Take Google’s search engine, for example. Its ranking algorithm and search results are firmly in the realm of data science. If you’ve uploaded a photo to Facebook and the platform suggested tagging a friend, you’ve interacted with data science. That Netflix recommendation to continue your binge watching, Amazon’s product recommendations, and targeted advertisements are all the result of data science.

Credit: Calvin Andrus

There’s plenty of rigor involved with data science, but the field also needs people who are interested in finding their own path or who want to solve complex problems.

Judging by the buzz surrounding data science, you’d be forgiven for thinking that the field is relatively new. But, depending on how you define data science, the field actually predates computers. In the broadest sense, data science has been around since the beginning of recorded history. Understanding how data science evolved shows just how expansive and versatile the field is.

History of data science

To put data science into context, it’s good to look at its place in history. It’s a pretty broad way of thinking about data science, but it helps to highlight that data science has been around for a while. From Sumerian cuneiform to Egyptian hieroglyphs to Ancient Greek to Latin, there were individuals tasked with storing, compiling, and interpreting those written records. Ancient philosophers, librarians, and scholars all could be considered data scientists or data analysts. The great ancient Alexandrian mathematician and scholar Hypatia used existing data to produce commentaries and teach students. That’s not too far removed from what data scientists are doing today. Statisticians, for a more modern example, could also fall under the data science umbrella.

Computers have come a long way since the Electronic Numerical Integrator and Computer.

More pragmatically, the field of data science as we know it today dates back to ENIAC, the machine unveiled in 1946 and widely considered the first general-purpose electronic digital computer. Computers made data processing more efficient, and as computers became more sophisticated, so too did data processing and storage. Alongside these technological innovations came advancements in data analysis and in the theory of how to use that data. Discussions surrounding data and data analysis have been around since the 1940s, according to Forbes. In 1974, Peter Naur, a computer scientist and data science pioneer, discussed data analysis and provided a definition for data science in “Concise Survey of Computer Methods.” “Data science is the science of dealing with data, once they have been established, while the relation of data to what they represent is delegated to other fields and sciences,” he wrote.

The mid-1990s were another pivotal moment in shaping data science, with the proliferation of personal computers and commercial internet access. By the early 2000s, the data science we know today had solidified, and it still shapes how we think about the field and the job opportunities within it. Fast forward 20 years from the mid-1990s and we’re no longer talking about data. Instead, we’re talking about “big data.”

Data science jobs today

Data science is so hot — to use a scientific term — because so many companies rely on data to figure out what they need to do to beat their competition. Demand far exceeds supply, which is great news for anyone looking to enter this flourishing field.

Companies are not just looking for data scientists or data analysts, the two data science roles you may be most familiar with, but also data architects, data engineers, machine learning engineers, and analytics managers. Not surprisingly, LinkedIn’s Emerging Jobs Report is full of data science roles with soaring job growth rates. Machine Learning Engineer was LinkedIn’s top emerging position in 2017. Other data science jobs on the list include Data Scientist in second place, Big Data Developer in fifth place, and Director of Data Science in eighth place. Data scientist jobs have increased 650% since 2012, and data scientist has been the hottest job in America since 2016, according to Glassdoor. A lack of qualified candidates means a high median base salary for data scientists.

Contrary to popular belief, not all data science jobs require coding expertise. A data analyst may have familiarity with data visualization, SQL, or Python, but it’s not a must. However, coding knowledge is an incredibly valuable skill and is typically associated with the most sought-after data science jobs. And aside from plenty of job opportunities and high salary potential, data science jobs aren’t going away anytime soon.

Data science jobs in a nutshell

We’ve been discussing data science in a pretty abstract way, so it helps to see data science in action. There are thousands of data science jobs, but chances are you’re most familiar with data analysts and data scientists. Both roles are well defined, and together they show the range of work found within data science.

A data analyst role is a great entry point into data science because it does not require high-level technical skills or coding knowledge. Looking through job descriptions on Glassdoor, data analyst job requirements include managing, analyzing, and reporting on data. A data analyst typically reports to a manager or supports multiple departments. Because of that versatility, data analyst jobs can be found across all industries, from marketing to finance to healthcare. Most job descriptions ask candidates to be familiar with analytics software such as SAS, data management tools such as SQL, and data visualization software, where Tableau is a popular option. Data analysts are in high demand, with over 4,200 job openings, according to Glassdoor. The median base salary for a data analyst is $60,000. If you’re a natural storyteller who likes working with data, a data analyst career may be right for you.
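To make "managing, analyzing, and reporting on data" a little more concrete, here is a minimal sketch of the kind of summary an analyst might produce with SQL. It uses Python's built-in sqlite3 module with an in-memory database. The table, column names, and figures are all made up for illustration, and the function name revenue_by_department is our own; real analyst work would run against a company's actual database and tooling.

```python
import sqlite3

# Hypothetical sales records: (department, region, revenue in dollars).
# These values are invented purely for illustration.
SALES = [
    ("marketing", "east", 1200.0),
    ("marketing", "west", 800.0),
    ("finance", "east", 1500.0),
    ("finance", "west", 700.0),
    ("healthcare", "east", 3000.0),
]


def revenue_by_department(rows):
    """Return (department, total, average) tuples, largest total first."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute(
            "CREATE TABLE sales (department TEXT, region TEXT, revenue REAL)"
        )
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        # Aggregate per department: the "analyzing" step.
        cursor = conn.execute(
            """
            SELECT department, SUM(revenue), AVG(revenue)
            FROM sales
            GROUP BY department
            ORDER BY SUM(revenue) DESC
            """
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Print a small text report: the "reporting" step.
    for department, total, average in revenue_by_department(SALES):
        print(f"{department:<12} total={total:>8.2f} average={average:>8.2f}")
```

The query is the interesting part: grouping and aggregating records is the bread and butter of reporting, whether it is done in SQL, a spreadsheet, or a visualization tool.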

Data scientists are on the opposite side of the technical spectrum and have been referred to as unicorns for their rarity. The role is a hybrid that combines software engineering, coding, statistical analysis, and data visualization. Because of the job’s complexity and interdisciplinary nature, there aren’t a lot of data scientists without work. That’s why it’s the hottest job, according to Glassdoor, and the sexiest job, according to the Harvard Business Review. There are over 4,500 open data science jobs with a median base salary of $110,000, according to Glassdoor.

What separates a data scientist from a data analyst is the scientist’s ability to code. Not only are data scientists telling stories through data, they’re using that information to build new models, algorithms, and programs that produce more nuanced insights. A typical job description for a data scientist requires knowledge of programming and query languages used for statistical work, such as Python, SQL, and R, along with experience creating models.
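As a rough illustration of what "creating a model" can mean at its simplest, the sketch below fits a straight line to a handful of data points with ordinary least squares and then uses it to predict a new value. The data (ad spend versus units sold) and the names fit_line and predict are hypothetical, chosen only for this example. In practice a data scientist would more likely reach for an established library and far richer data, so treat this as a toy, not a recipe.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    # Spread of x around its mean; zero means every x is the same.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    # How x and y vary together around their means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new x value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical observations, invented for illustration only:
# ad spend (in thousands of dollars) and units sold.
AD_SPEND = [1.0, 2.0, 3.0, 4.0, 5.0]
UNITS_SOLD = [12.0, 14.0, 16.0, 18.0, 20.0]

if __name__ == "__main__":
    model = fit_line(AD_SPEND, UNITS_SOLD)
    print("slope, intercept:", model)
    print("predicted units at spend 6.0:", predict(model, 6.0))
```

The pattern, fit on known data and then predict on new inputs, is the same one that far more sophisticated models follow.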

Data scientists are asked to create algorithms or models that can solve a company’s most complex problems. For example, billions of photos have been uploaded to Facebook. That’s an incredible amount of data to sift through, but also a tremendous opportunity. Facebook has teams of researchers, engineers, analysts, and scientists tasked with developing algorithms and AI that are then trained and refined over time. The more photos and hashtags people post on Facebook, the smarter the AI becomes. Facebook uses machine learning to suggest tags and uses hashtags to make image recognition even smarter.

These models can also be used to make predictions regarding consumer behavior or industry trends. Data scientists are highly coveted across all industries.

Credit: Stephan Kolassa

What does a data scientist do?

As with any role, your job depends on the company and your team. However, you can expect to spend time in meetings discussing business needs, reviewing code, building models, monitoring and analyzing data for accuracy and performance, or formatting data into something that’s accessible across teams, according to feedback from data scientists on Quora. Data scientists have experience across multiple disciplines, and their resumes may include experience in research, education, technology, or statistics.

Interestingly, most data scientists say they spend the least amount of time on algorithms. Coding is incredibly important, but it’s not the sole responsibility of a data scientist. Instead, most data scientists spend their day understanding the problems facing their company and figuring out how data can be used to solve them. Once that’s done, a data scientist will spend time with their team communicating the needs of the company. Once everyone is aligned, the actual work can begin. Along with coding skills, sharp communication skills are a necessary, albeit less discussed, requirement for becoming a data scientist.

Becoming a data scientist no longer requires a Master’s degree or PhD, although many companies seek individuals with advanced degrees. Individuals looking to change careers can enroll in bootcamps to acquire the necessary skills to become a data scientist. Instead of taking years, bootcamps can provide the skills and a working portfolio in weeks.

Future of data science

Considering many data science jobs were non-existent five years ago, let alone a decade ago, we can expect even greater demand for data science jobs over the next few years. Just take a look at the latest tech news and you can see how that demand will keep growing. For every company touting something related to algorithms, artificial intelligence, machine learning, pattern recognition, or forecasting, there will be new jobs to fill.

Instead of worrying about robots or AI taking away jobs, we should be embracing the possibilities that come with technological innovation. Enrolling in a data science bootcamp is a great step into the lucrative and exciting world of data science. Flatiron School offers a free online Data Science Bootcamp prep course with over 75 hours of curriculum covering all the basics of data science. For individuals looking to change careers or gain a competitive edge in their field, we recommend Intro to Data Science, a part-time course available at our NYC and London campuses. For those eager to become data scientists, our Immersive Data Science Bootcamp in NYC provides all the data science skills you need in just 15 weeks.
