If data is the new oil, then data science is the new refinery.
I was recently asked whether studying Data Science has helped me in my day-to-day job. My response was yes, but not in an obvious way - it has improved my empathy, and that has resulted in better-designed customer solutions.
Let me take a step back. For the past few years, I've been leading Software-as-a-Service (SaaS) platform integrations for enterprise clients. I often describe the work as similar to being a clothing tailor. If a software consultancy is a bespoke tailor that customizes every detail at a premium price, then a SaaS platform is a made-to-measure tailor who cuts from an existing pattern at an economical price. Over time, I've learned how to measure and cut software for customers of all shapes, sizes, and sophistication.
However, the analogy ends here: cloth is something we can touch and see, so we naturally understand its limitations; software architecture, on the other hand, exists in our minds, and most of us aren't able to judge its quality. With fabric, we don't question why it can't be made from a liquid, but I often find myself explaining to customers why our platform can't do what they want because of how data is stored.
So how does studying data science fit into all of this?
A large part of pragmatic data science involves the process of Extracting, Transforming, and Loading (ETL) data. Extract data from a source (e.g. a database, API, or CSV file), Transform the data by cleaning it up, for example by removing outliers and incomplete records, and Load the cleaned data into your machine learning training pipeline. Lather, rinse, and repeat for every project.
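To make that loop concrete, here is a minimal Python sketch of the three steps using only the standard library. Everything in it is an assumption for illustration: the sample data, the column names (`customer_id`, `monthly_spend`), the `spend` table, and the choice of a median-based outlier rule are all hypothetical, and an in-memory SQLite table stands in for wherever your training data would really live.

```python
import csv
import io
import sqlite3
import statistics

# Hypothetical sample data; the column names are made up for illustration.
RAW_CSV = """customer_id,monthly_spend
c1,120
c2,
c3,95
c4,110
c5,105
c6,5000
"""


def extract(text):
    """Extract: read rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows, max_deviations=3.0):
    """Transform: drop incomplete records, then drop outliers."""
    complete = [
        {"customer_id": r["customer_id"], "monthly_spend": float(r["monthly_spend"])}
        for r in rows
        if r["customer_id"] and r["monthly_spend"]
    ]
    values = [r["monthly_spend"] for r in complete]
    median = statistics.median(values)
    # Median absolute deviation (MAD): one simple outlier rule among many,
    # chosen here because a single extreme value barely shifts it.
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return complete
    return [
        r for r in complete
        if abs(r["monthly_spend"] - median) / mad <= max_deviations
    ]


def load(rows, conn):
    """Load: write the cleaned rows into a table (a stand-in for a training set)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spend "
        "(customer_id TEXT PRIMARY KEY, monthly_spend REAL)"
    )
    conn.executemany(
        "INSERT INTO spend VALUES (:customer_id, :monthly_spend)", rows
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load(transform(extract(RAW_CSV)), conn)
    for row in conn.execute("SELECT * FROM spend ORDER BY customer_id"):
        print(row)
```

In this sample, the record with a missing spend value is dropped as incomplete and the 5000 entry is dropped as an outlier, so four rows reach the table. The threshold of three deviations is a judgment call, not a standard; a real project would pick the rule that fits its data.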
Let me give some examples. If I have a task that requires bulk automation, a developer will likely prefer a well-formatted CSV from which they can easily extract the information. If I have a task that requires a computer to read thousands of records, a developer will likely prefer an input with standardized punctuation and identifiers (e.g. JSON). If I have a task that requires loading data into a new table or database, a developer will likely prefer working with someone who weighs the risks to the existing data.
All the hands-on ETL practice I've done over the past year has honed my compass for working with data - whether that means a better grasp of what's a reasonable request to make of a developer or being more articulate with a customer in explaining what's possible with data. It leads to improved communication, more credibility, faster decision-making, and ultimately, a timely, well-designed solution.