3 Problems with the Data Science Labor Market

By Yogita Malik Arora, 7 years ago

We are already in the era of big data, and it is only getting bigger. Every day we gain faster access to a larger volume of data, which then must be analyzed in order to fully capitalize on this expanding resource.

Image courtesy of infocux via Flickr

This wealth of data can be incredibly useful. According to McKinsey, analyzing big data gives access to more accurate and detailed information that can boost a company’s performance, support better, narrower segmentation of customers, and provide a thorough analytical backbone for the decision-making process.

Despite its improving reputation, there are still three main problems with data science as a field:

1. It is an understaffed field

Even though many universities have added data science courses and related job offerings have surged, a massive shortage of workers for data science jobs is still predicted. According to a 2011 study by McKinsey, “By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills.” While the field is growing in popularity and companies continue to realize the necessity of data analytics, demand for data scientists still outstrips supply.

2. Data scientists are expensive

Although there is undeniable value in understanding your company’s data, hiring a data scientist is certainly an investment. According to Glassdoor, the median income for data scientists nationally as of August 2014 was $115,000, and some earn annual salaries of $300,000 or more. While this is driven in part by the basic economics of demand outstripping supply, it is also due to their highly specialized skill sets and quantitative abilities. Quite simply, it takes a certain caliber of person to be an effective data scientist.

3. Much of what you pay for is janitorial work

Even if you decide to hire a data scientist, the majority of their time is not spent actually evaluating data; a huge part of a data scientist’s job involves converting data into a form that algorithms can understand. The New York Times named the extensive collection and preparation of data necessary for any data scientist “data janitor work.” The disparity between the knowledge of humans and of computers makes the substantial and laborious “janitor work” a prominent and necessary aspect of the job.
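To make “janitor work” concrete, here is a minimal sketch in Python. The raw answers, the `parse_amount` helper, and the suffix table are all hypothetical and invented for illustration; real cleanup usually involves many more cases than this.

```python
import re

# Hypothetical free-text answers to "What was last year's revenue?"
RAW_REVENUE_ANSWERS = ["$1.2M", "1,200,000", " 850k ", "n/a", "2.5 million"]

# Suffixes people commonly type, mapped to numeric multipliers (assumed list).
_MULTIPLIERS = {"k": 1_000, "m": 1_000_000, "million": 1_000_000}


def parse_amount(text):
    """Return the amount as a float, or None if the answer is unusable."""
    cleaned = text.strip().lower().replace("$", "").replace(",", "")
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(k|m|million)?", cleaned)
    if match is None:
        return None  # e.g. "n/a": flag for follow-up instead of guessing
    number, suffix = match.groups()
    return float(number) * _MULTIPLIERS.get(suffix, 1)


# One number (or None) per answer, in a form an algorithm can use.
cleaned_answers = [parse_amount(answer) for answer in RAW_REVENUE_ANSWERS]
```

Every new way of typing a number means another rule to write and test, which is why this preparation can consume so much of a data scientist’s time.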

For these reasons, a company can invest a considerable amount in a data scientist (if it can find one), yet most of his or her time will not be spent actually analyzing the data, which is what he or she specializes in.

We need a solution to these issues, because in a world of data, the last thing we need is another black box with magical guardians who benefit by not working themselves out of a job.

What can a company do to avoid these problems and still access the analytical data it needs?

1. Do more with less: Start with better data

How you collect your data to begin with will determine the amount of janitorial work necessary. Instead of piecing together information from surveys or traditional big data sources, ask in-depth questions of a business’s internal experts through a software platform and analyze the results.

2. Pre-structure your data

Insist on capturing your data into an integrated framework, not as a series of one-offs. Using a platform with an incorporated schema and data/logic tier further eliminates janitorial work. Data architecture leads to automation as well, saving your resources and allowing you to access better data insights faster.
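As a rough illustration of capturing data against a schema, the Python sketch below validates each response at the moment it is recorded. The `Response` fields, the department list, and the 1-to-5 score scale are assumptions made for the example, not a description of any particular platform.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical set of valid departments; a real schema would define its own.
ALLOWED_DEPARTMENTS = {"sales", "finance", "operations"}


@dataclass(frozen=True)
class Response:
    """One structured answer, validated when it is captured."""
    respondent_id: str
    department: str
    question_id: str
    score: int  # assumed scale: 1 (low) to 5 (high)

    def __post_init__(self):
        if self.department not in ALLOWED_DEPARTMENTS:
            raise ValueError(f"unknown department: {self.department!r}")
        if not isinstance(self.score, int) or not 1 <= self.score <= 5:
            raise ValueError(f"score must be an integer from 1 to 5, got {self.score!r}")


def average_score_by_department(responses):
    """Group scores by department and average them; no cleanup step needed."""
    grouped = {}
    for response in responses:
        grouped.setdefault(response.department, []).append(response.score)
    return {department: mean(scores) for department, scores in grouped.items()}
```

Because bad values are rejected on entry, the analysis function can assume every record is well formed, and the same function can be rerun on each new batch of responses.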

3. Insist on repeatable tools

Using repeatable tools will not only allow your company to frequently assess how it’s doing and to track its progress over time, but will also lead to self-service analytics. With self-service analytics, all levels of your business can consistently make better, data-driven decisions.

So, the messier the data, the more pay and the longer the contract. Focus on a single platform that questions stakeholders internally, captures the data into a structured framework, and allows for self-service analytics in order to get the insights you need without exhausting resources.

With this elimination of janitorial work and access to better quality data, data scientists can focus on doing what they do best – actually analyzing the data and revealing its significance – therefore adding value to your investment in hiring a data scientist in the first place and providing useable, analytical data to your company as a whole.

One Comment

  • Jerry Overton says:

    Great article. Most issues I come across in data science fall within one of the 3 big problems you mentioned. In fact, I think that problem #2 is a symptom of problem #1. So, I would venture to say that there are, really, 2 big problems:

    1. Availability of Talent
    2. Availability of Data

    I agree that better data collection, better data stewardship, and better data platforms are paramount to solving the problem. To your list I would add the need for:

    1. Better (not necessarily more) Data Science Education
    2. Better Integration of Existing Data Science Tools

    Some of our talent shortage seems self-inflicted. I think we’d have more data scientists if the fundamental concepts were explained differently. I think that cleaning up the data would help, but data scientists are kinda like archaeologists: they seek out the dirt because that’s where the good stuff is buried.
