We are already in the era of big data, and it is only getting bigger. Every day we gain faster access to a larger volume of data, which then must be analyzed in order to fully capitalize on this expanding resource.
Image courtesy of infocux via Flickr
This wealth of data can be incredibly useful. According to McKinsey, analyzing big data gives companies access to more accurate and detailed information that can be applied to boost performance, segment customers more narrowly, and give the decision-making process a thorough analytical backbone.
Despite its improving reputation, data science as a field still has three main problems:
1. It is an understaffed field
Even with many universities adding data science courses and a surge in related job offerings, a massive shortage of data science workers is still predicted everywhere. According to a 2011 study by McKinsey, “By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills.” While the field is growing in popularity and companies continue to realize the necessity of data analytics, demand for data scientists still far exceeds supply.
2. Data scientists are expensive
Although there is undeniable value in understanding your company’s data, hiring a data scientist is certainly an investment. According to Glassdoor, the median income for data scientists nationally as of August 2014 was $115,000, and some earn annual salaries of $300,000 or more. While this is driven in part by the basic economics of demand outstripping supply, it is also due to their highly specialized skill sets and quantitative abilities. Quite simply, it takes a certain caliber of person to be an effective data scientist.
3. Much of what you pay for is janitorial work
Even if you decide to hire a data scientist, the majority of their time is not spent actually evaluating data; a huge part of a data scientist’s job involves converting data into a form that algorithms can understand. The New York Times named the extensive collection and preparation of data necessary for any data scientist “data janitor work.” The gap between what humans know and what computers can interpret makes this substantial and laborious “janitor work” a prominent and necessary aspect of the job.
For these reasons, a company can invest a considerable amount in a data scientist (if it can find one), yet most of that person’s time will not be spent on the analysis he or she was hired to do.
We need a solution to these issues, because in a world of data, the last thing we need is another black box with magical guardians who benefit by not working themselves out of a job.
What can a company do to avoid these problems and still access the analytical data it needs?
1. Do more with less: Start with better data
How you collect your data to begin with will determine the amount of janitorial work necessary. Instead of piecing together information from surveys or traditional big data sources, ask a business’s internal experts in-depth questions through a software platform and analyze the results.
2. Pre-structure your data
Insist on capturing your data in an integrated framework, not as a series of one-offs. Using a platform with an incorporated schema and data/logic tier further eliminates janitorial work. A consistent data architecture also makes automation possible, saving resources and giving you better data insights faster.
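As a rough illustration of what pre-structured capture might look like, the sketch below validates each answer against a fixed schema at the moment it is recorded, so no cleanup is needed before analysis. Everything here is an assumption made for the example: the `Response` record, its field names, the 1-to-5 score scale, and the `average_by_department` helper are hypothetical and do not describe any particular platform.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import mean


# Hypothetical schema for one stakeholder answer; field names are illustrative.
@dataclass(frozen=True)
class Response:
    respondent: str
    department: str
    question_id: str
    score: int  # rating on an assumed 1-5 scale
    answered_on: date

    def __post_init__(self):
        # Reject bad values at capture time instead of cleaning them up later.
        if not 1 <= self.score <= 5:
            raise ValueError(f"score must be between 1 and 5, got {self.score}")
        if not (self.respondent and self.department and self.question_id):
            raise ValueError("respondent, department and question_id are required")


def average_by_department(responses, question_id):
    """Repeatable summary: mean score per department for one question."""
    scores = defaultdict(list)
    for r in responses:
        if r.question_id == question_id:
            scores[r.department].append(r.score)
    return {dept: mean(values) for dept, values in scores.items()}
```

Because every record already has the same shape, the same summary can be rerun each quarter without any reformatting step in between.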
3. Insist on repeatable tools
Using repeatable tools will not only allow your company to frequently assess how it’s doing and track its progress over time, but will also lead to self-service analytics. With self-service analytics, all levels of your business can consistently make better, data-driven decisions.
In short, the messier the data, the higher the pay and the longer the contract. Focus on a single platform that questions stakeholders internally, captures the data in a structured framework, and allows for self-service analytics in order to get the insights you need without exhausting resources.
With janitorial work eliminated and better-quality data available, data scientists can focus on doing what they do best – actually analyzing the data and revealing its significance. That adds value to your investment in hiring a data scientist in the first place and provides usable, analytical data to your company as a whole.