Get That Cool Job: data miner

As the internet makes its way into every aspect of our everyday lives, there is more data on us out there than ever before. Companies are collecting and analyzing that data to deliver better customer experiences and to generate more sales. And with the high demand for data comes the hiring of data miners and scientists to extract it.

According to McKinsey, in the U.S., there is a shortage of 140,000 to 190,000 people with the necessary skills to work in big data. By 2018, the U.S. is projected to have a 50 to 60 percent gap between supply and demand for analytical talent in big data.

If you want to enter this growing field, you need to be both analytical and creative. So we spoke with working data miners and scientists, and found out what it takes. Here’s what they had to say.

You crave discovery

Data miners and scientists are naturally curious people. They want to know the answers to those burning questions that are plaguing their clients and themselves.

Kenny Darrell, a lead data scientist at Elder Research, says that in order to work in this trade, you need to be able to “dive into a problem and find what is really at the core of it.”

“You need to be driven to look for other things outside of the problem,” he says. “Tunnel-vision is bad.”

If you’re great at math and computer science, but you don’t have imagination, you won’t be useful to a company.

“Even the most skilled statistician had to learn statistics,” says Darrell. “Learning curiosity is a little harder.”

Dean Abbott, the president of Abbott Analytics, says that he loves locating patterns and relationships in data that have never been discovered.

“Often, the data miner is the first person to have ever looked at and analyzed the data he or she is using,” he says. “It truly is exploring a new frontier.”

You’re a problem solver

Companies value data miners and scientists because they help them find solutions for their issues, including ways to better connect with customers, produce more sales, and save money.

Data scientist Kurt Thearling, vice president of analytics at WEX Inc., says that fancy math doesn’t matter “if it doesn’t help a business solve a problem that they care about.”

“Knowing how the math can have an impact on a business,” he explains, “and then following through and using that math every day in a production environment will determine the ultimate value of the math.”

You look at problems from different perspectives

Data miners and scientists have to do more than find one solution for one problem. They must also be able to look at these issues from various perspectives, and always strive to create better algorithms.

Abbott says you need a forensics or a Freakonomics mindset.

“In that book, the analyses were interesting not because of the algorithms used,” he explains, “but because of the questions the analyst asked along the way to understand the data. The question isn’t just ‘How accurate is the model?’ but ‘Why are these particular patterns found, [and] why other patterns are not found by the models,’ and ‘Is there anything we can do to help the algorithms find patterns better?’”

You have to be able to foresee what kind of impact your findings might have on a social and economic level, too.

“When we build a model, we often think of the good it can do, but we also need to think of the bad,” says Darrell. “Whenever a model is built that helps something become more efficient, return more dollars, etc., we need to reflect how. If we built a model that could place employees in jobs with better results, we have improved many things, but we may have also marginalized the careers of many people.”

You can deal with organizational bureaucracy

Companies aren’t always willing to make changes or employ your models. You have to be able to manage roadblocks that may keep you from putting your algorithms to work, and prepare for these issues from the start.

According to Abbott, “there are hurdles politically in an organization to get sign-off by the key stakeholders before a model can be used.”

“There are often technical hurdles to overcome to get the models to run automatically, or at least fast enough to be useful,” Abbott says. “Often, these problems were never considered at the beginning of a project and should have been.”

How to start your data mining career

If you’d like to become a data miner or scientist, Abbott recommends taking courses in statistics and machine learning, and then actually practicing data mining.

“Academic instruction, while good at teaching the science, doesn’t have time to teach the art of data mining,” Abbott says. “It takes far too much time to describe how to overcome the seemingly endless ways data can be wrong, and needs further preparation or ways the data misleads the algorithms. Experience applying these techniques is critical.”

To get that valuable experience, Abbott says you can try to find problems within your company that need to be solved. If that’s not possible, participate in Kaggle’s data science competitions, which give you problems to solve.

According to Thearling, you should learn to program in languages like R, JavaScript, and Python to manage and analyze the data in front of you.

“Most data science tools today are really software development environments, and you write code to access the data, manipulate it to get it ready for analysis, and then do the actual analysis,” he says.
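As a rough illustration of the three steps Thearling describes (access the data, get it ready, then analyze it), here is a minimal Python sketch. The sample data, column names, and function names are all made up for this example, and a real project would likely read from a file or database and use a dedicated analysis library.

```python
import csv
import io
import statistics

# Hypothetical sample data standing in for a real file or database export.
RAW_CSV = """customer,region,spend
alice,east,120.50
bob,west,
carol,east,80.00
dave,west,200.25
"""

def load_rows(text):
    """Step 1: access the data (here, parse CSV text into dictionaries)."""
    return list(csv.DictReader(io.StringIO(text)))

def clean_rows(rows):
    """Step 2: prepare the data by dropping rows with a missing spend value."""
    cleaned = []
    for row in rows:
        if row["spend"].strip():
            cleaned.append({**row, "spend": float(row["spend"])})
    return cleaned

def mean_spend_by_region(rows):
    """Step 3: analyze the data (average spend per region)."""
    by_region = {}
    for row in rows:
        by_region.setdefault(row["region"], []).append(row["spend"])
    return {region: statistics.mean(values) for region, values in by_region.items()}

rows = clean_rows(load_rows(RAW_CSV))
summary = mean_spend_by_region(rows)
print(summary)
```

Note that the row with the missing value is dropped before the averages are computed, a small taste of the data preparation work Abbott mentions above.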

You should also learn visualization techniques in order to show your results to your co-workers and bosses. Thearling calls this “data artistry,” which he says “involves being able to visually display the results of a data analysis problem in such a way that others get the point and understand what the pattern looks like.”

“It’s part aesthetics, part data visualization, and part graphic design,” he says.

Photo credit: NYC Media Lab/Flickr
