
A comparison between Computer Science and Data Science – Differences, Similarities and Common Ground

Data Science has become a buzzword you hear every day: a relatively new area of knowledge that is getting all the spotlight. In 2012, Harvard Business Review[1] named data scientist the sexiest job of the 21st century, and a well-cited McKinsey report[2] predicted a need for more than 150,000 deeply analytical data scientists and another 1.5 million managerial analysts by 2018. It is arguably more than a buzzword now: as The Economist[3] notes, data has become a global commodity more valuable than oil.

 

Computer Science has changed significantly since Data Science matured as a term and as a field. Although data science draws on many other fields of study, the entire discipline is built upon, and processed through, the infrastructure of computer science. For those on the inside, these two fields are more than related: they form a symbiotic relationship, each with nuances that nurture the other.

 

What is Computer Science?

 

Computer Science is the study of the design and architecture of computers and of their application in science and technology. It spans hardware, software, networking, and the internet, with a vast number of open research areas.

 

It covers the architecture, design, development, and manufacturing of the computing machinery and devices that drive the Information Technology industry and its growth.

 

Computer scientists analyze algorithms and study the performance of computer software and hardware. The main areas of study include computer systems and networks, artificial intelligence, human-computer interaction, vision and graphics, programming languages, numerical analysis, bioinformatics, software engineering, and the theory of computing.

 

What is Data Science?

 

One of the best descriptions of data science comes from David Blei, Professor of Computer Science and Professor of Statistics at Columbia University, and Padhraic Smyth, Professor of Statistics at the University of California, Irvine. In their Proceedings of the National Academy of Sciences (PNAS) article “Science and Data Science,” they write that “data science is the child of statistics and computer science.” They further elaborate:

 

Data science focuses on exploiting the modern deluge of data for prediction, exploration, understanding, and intervention. It emphasizes the value and necessity of approximation and simplification; it values effective communication of the results of a data analysis and of the understanding about the world and data that we glean from it; it prioritizes an understanding of the optimization algorithms and transparently managing the inevitable tradeoff between accuracy and speed; it promotes domain-specific analyses, where data scientists and domain experts work together to balance appropriate assumptions with computationally efficient methods [Blei and Smyth 2017].

 

The “child” metaphor appropriately implies that data science inherits (ideally the best) from both its parents but eventually grows into its own entity. Its focus is what separates it from its parents.

 

In short, Data Science emphasizes communicating the outcomes of data analysis effectively. It also prioritizes understanding optimization algorithms well enough to manage the unavoidable tradeoff between speed and accuracy.
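
To make the tradeoff between speed and accuracy concrete, here is a minimal sketch in Python. It is not taken from Blei and Smyth; it uses a Monte Carlo estimate of pi purely as an illustration, and the function names and sample sizes are assumptions chosen for the example. More random samples usually give a smaller error but take longer to compute.

```python
import math
import random
import time


def estimate_pi(num_samples, rng):
    """Monte Carlo estimate of pi: the fraction of random points in the
    unit square that land inside the quarter circle, multiplied by 4."""
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / num_samples


def run_tradeoff(sample_sizes, seed=0):
    """Return (samples, estimate, absolute error, seconds) per sample size."""
    results = []
    for n in sample_sizes:
        rng = random.Random(seed)  # fixed seed so runs are repeatable
        start = time.perf_counter()
        estimate = estimate_pi(n, rng)
        elapsed = time.perf_counter() - start
        results.append((n, estimate, abs(estimate - math.pi), elapsed))
    return results


if __name__ == "__main__":
    # Illustrative sample sizes only; pick whatever fits your time budget.
    for n, estimate, error, elapsed in run_tradeoff([100, 10_000, 100_000]):
        print(f"n={n:>8,}  estimate={estimate:.5f}  "
              f"error={error:.5f}  time={elapsed:.4f}s")
```

The analyst's job is to choose a point on that curve deliberately and to say so: a rough answer in milliseconds may be enough for a dashboard, while a report may justify the slower, tighter estimate.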

 

Comparison between Computer Science and Data Science

 

Computer scientists and data scientists have overlapping skills. Both rely on computational processes, and a working understanding of programming languages and algorithms is a must in both fields. What one does with that understanding is the primary difference between the two tracks: computer science focuses on the “how,” while data science looks at the “why.”

 

As Jeannette M. Wing describes in her article “How Does Data Science Differ from Computer Science and from Statistics?”, data science embraces uncertainty and approximation as first-class concepts.  For both, it uses probability modeling for mathematical formulation and reasoning.  In contrast, computer science’s foundations sit squarely on symbolic logic; much of computing rests on the abstraction from voltages to bits.  In the logical framework of computer science, uncertainty is traditionally represented as non-determinism.  This distinction is a gross over-simplification of computer science, since many subareas of computing use probabilistic reasoning, but often these probabilistic framings are built as scaffolding over its discrete and logic-based elements.  Thinking as a computer scientist, but with the perspective of a data scientist, takes us beyond the discrete, combinatorial, and exact.
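
The contrast between an exact answer and an answer that carries its own uncertainty can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an example from Wing's article: the data is synthetic, the function names are hypothetical, and the interval uses the common normal approximation (1.96 standard errors).

```python
import random
import statistics


def exact_mean(values):
    """Exact and deterministic: the same input always gives the same answer."""
    return sum(values) / len(values)


def estimated_mean(values, sample_size, rng):
    """Approximate: estimate the mean from a random sample and report an
    approximate 95% interval (normal approximation) around the estimate."""
    sample = rng.sample(values, sample_size)
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / sample_size ** 0.5
    margin = 1.96 * std_err
    return mean, (mean - margin, mean + margin)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
# Synthetic data, assumed only for this example.
population = [rng.gauss(50, 10) for _ in range(100_000)]

truth = exact_mean(population)
estimate, interval = estimated_mean(population, 500, rng)

print(f"exact mean:     {truth:.3f}")
print(f"estimated mean: {estimate:.3f}")
print(f"approx. 95% interval: ({interval[0]:.3f}, {interval[1]:.3f})")
```

The first function returns a single number; the second returns a number together with a statement of how far off it might be, which is what treating uncertainty as a first-class concept looks like in practice.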

 

In data science, algorithmic principles are applied to areas of greater uncertainty, often producing probabilistic answers to interdisciplinary questions about business. Modern data scientists are typically proficient in computer science, but they can come from mathematical, statistical, or even business backgrounds. Working on top of what computer science has built, data scientists design ways to filter the massive amounts of data that flow through network systems and then extract actionable insights. Those insights can make a business more efficient and effective; they can widen the world’s understanding of health sciences; and they can even feed back into computer science to create better datasets and customer experiences.

 

Beyond the more academic definition, Data Science can be described as the study of data of every kind (structured, semi-structured, and unstructured), in whatever form or format it is available, in order to extract information from it. It draws on techniques such as data mining, storage, purging, archival, and transformation to keep data well ordered and efficient to work with.
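
As a rough illustration of the three kinds of data mentioned above, the following Python sketch pulls the same piece of information (a payment amount) out of a structured CSV table, a semi-structured JSON document, and an unstructured sentence. All of the records, field names, and the text pattern are made up for the example.

```python
import csv
import io
import json
import re

# Structured: fixed columns, and every row has the same fields.
structured = "order_id,amount\n1001,25.50\n1002,40.00\n"
rows = list(csv.DictReader(io.StringIO(structured)))
amounts_from_csv = [float(row["amount"]) for row in rows]

# Semi-structured: self-describing keys, and fields may vary per record.
semi_structured = (
    '[{"order_id": 1003, "amount": 12.75},'
    ' {"order_id": 1004, "amount": 8.25, "coupon": "SPRING"}]'
)
records = json.loads(semi_structured)
amounts_from_json = [record["amount"] for record in records]

# Unstructured: free text, so the amount must be extracted by pattern.
unstructured = "Customer called about order 1005 and paid 19.99 by card."
match = re.search(r"paid (\d+\.\d{2})", unstructured)
amount_from_text = float(match.group(1)) if match else None

print(amounts_from_csv, amounts_from_json, amount_from_text)
```

The less structure the data has, the more assumptions the extraction step has to make, which is one reason data transformation takes up so much of a data scientist's work.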

 

The relationship between Computer Science and Data Science keeps growing closer as both fields evolve.

 

The following infographic summarizes, in a few bullet points, the main differences and similarities between Data Science and Computer Science.

[Infographic: Computer Science vs Data Science]

 

Conclusion

 

We can conclude that Computer Science and Data Science are two different fields that grow from the same root. A computer scientist studies computing and can practice Data Science by adding knowledge and experience in statistics and analytics. Computer Science gives us the technologies for computing with data, whereas Data Science lets us work on existing data to put it to useful purposes.

 

What will be exciting to see is how Data Science grows up.  What new kinds of problems will data science be able to solve?  What new techniques will be invented that would not have come into existence if not for the marriage of Computer Science, Data Science, Analytics and Statistics?  And finally, what will Data Science look like once it becomes a more mature and developed field?  After all, five decades ago, no one could have predicted the revolutionary impact that Computer Science has had on our lives. Data Science has the same potential to revolutionize the entire world.

 

Sources

https://www.educba.com/computer-science-vs-data-science/

https://medium.com/oniverse/the-difference-between-computer-science-and-data-science-6e4aef973ceb

https://www.quora.com/How-is-computer-science-different-from-data-science

https://www.onlineengineeringprograms.com/faq/computer-vs-data-science

https://datascience.columbia.edu/data-science-vs-computer-science-vs-statistics

 

[1] Data Scientist: The Sexiest Job of the 21st Century. https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century. Harvard Business Review, 2012.

[2] Big data: The next frontier for innovation, competition, and productivity. https://www.mckinsey.com/business-functions/digital-mckinsey/our-insights/big-data-the-next-frontier-for-innovation. McKinsey Global Institute, 2011.

[3] The world’s most valuable resource is no longer oil, but data. https://www.economist.com/leaders/2017/05/06/the-worlds-most-valuable-resource-is-no-longer-oil-but-data. The Economist, 2017.
