Where do Data Scientists Come From?
Here we see some interesting patterns: data scientists, machine learning engineers, and software engineers are more likely to start straight out of academia. Many of the “other” previous jobs are unrelated, such as catering, tutoring, store clerks, and other positions people can often hold while completing their degrees.
Many roles transition into data scientists or machine learning engineers, but rarely do we see data scientists and machine learning engineers transitioning into any of the other roles. This is likely due in part to the relative sizes of the fields, the infancy of the “data scientist” and “machine learning engineer” titles, and the recent growth in popularity of those titles. However, I believe we are also observing an interesting phenomenon that speaks to how individuals are moving between and progressing³ through each role.
This chord diagram illustrates the main transitions we see between these roles. The color of the chord indicates which role people are transitioning from.
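For readers who want to build a similar view from their own data, here is a minimal sketch of the table that sits behind a chord diagram: a square matrix counting moves from each previous role to each current role. The role pairs below are made up for illustration and are not the data behind this article, and `build_flow_matrix` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical (previous_role, current_role) pairs, for illustration only.
# These are NOT the figures from this study.
transitions = [
    ("software engineer", "data analyst"),
    ("software engineer", "data scientist"),
    ("data analyst", "data scientist"),
    ("academia", "data scientist"),
    ("software engineer", "data engineer"),
    ("software engineer", "data analyst"),
]


def build_flow_matrix(pairs):
    """Return (roles, matrix), where matrix[i][j] counts moves from roles[i] to roles[j]."""
    counts = Counter(pairs)
    # Every role that appears as a source or a destination gets a row and a column.
    roles = sorted({role for pair in pairs for role in pair})
    matrix = [[counts[(src, dst)] for dst in roles] for src in roles]
    return roles, matrix


roles, matrix = build_flow_matrix(transitions)
for src, row in zip(roles, matrix):
    print(f"{src:>18}: {row}")
```

Each row is the role people came from, which corresponds to the chord color described above. A row of zeros means nobody in the sample left that role.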
Software engineers make up a big slice of the pie. Many transition to analyst roles, while others hop straight to data science.
Data science is equally fed by academia, analysts, and software engineers. Software engineers are far more likely to hop into a data analyst role, although this is in part due to the larger number of analyst roles than data scientist roles.
Again, we see few individuals leaving data science at this moment. It’s unclear if this pattern will change in the future. The key takeaway here is that the data science field is fed by a wide variety of backgrounds, and it is relatively common to see software engineers become data analysts and data analysts become data scientists. This may represent a viable path for anyone looking to transition out of a software engineering role.
Transitions into data engineering come almost exclusively from software engineering⁴.
Conclusion
Where do data scientists come from? Everywhere! Although the field is predominantly populated by individuals with MAs and PhDs, there are still plenty of individuals with bachelor’s degrees (26%) in the role. No field of study seems to dominate data science at this time; instead, we see a great diversity in backgrounds for data scientists, especially compared to fields like software engineering. In addition, we see a large number of individuals moving from other tech roles, such as software engineering and data analytics, into data science.
While machine learning engineers reflect data scientists in their levels of academic achievement, they seem to be more heavily focused in engineering backgrounds, and are more likely to have transitioned from a software engineer role. Data engineers also have more of an engineering focus, but tend to have lower levels of degree achievement when compared to the other roles in this study.
What does this mean for data science job seekers?
Graduate school is still the dominant way data scientists get into the field. Data science degrees have a growing presence, and now appear to be a somewhat common way to gain entry into the field. Any field of study seems viable if one has obtained an advanced degree. If you’re in a graduate program now, there’s almost certainly someone in your field of study working in data science. I suggest you reach out to them and find out how they made the leap!
Software engineers and data analysts seem to transition into data science roles quite regularly, and represent substantial portions of new data scientists. Future job seekers should consider these routes as well.
What does this mean for employers looking for data scientists?
If you’re looking for a generalist data scientist, don’t throw out a resume just because the field or degree isn’t what you expect. Data scientists are diverse in their education and background. Although most have an advanced degree in some field, there is no one field that dominates the job market.
If you’re having difficulty hiring experienced data scientists or scientists out of academia, consider bringing in individuals from software engineering or data analyst roles, as that is clearly a common pathway to data science.
Also, as we’ll discuss in a later article, make sure you know the role you’re actually hiring for. Do you think you need a data scientist, but feel your role is heavier on engineering? Consider introducing a “machine learning engineer” role. Do you think you need a data scientist, but with more focus on a business background? Consider hiring an analyst. Do you need someone with a focus on database and infrastructure skills? Consider a data engineer, and don’t focus as much on their educational background.
Finally, if you think you do need generalist data scientists for your team, consider looking for a variety of educational backgrounds. At Indeed, the members of our data science and product science teams span a wide range of fields, including astronomy, sociology, biology, mathematics, economics, and business. Having a diverse data science team, both in demographics and in field of study, is essential for doing great work⁵ ⁶.