Big data only gains commercial value when you can identify patterns in its accumulated mass. Once you’ve got the pattern, you can predict the behaviour of the people generating it, and from there you can begin to understand the market.

Thomas H. Davenport and D.J. Patil, writing in a 2012 article in Harvard Business Review, illustrate this by relating the experience of Jonathan Goldman (one of the pioneers of data science) during his early days at LinkedIn.

Goldman was confronted with vast arrays of data generated by millions of LinkedIn users. After in-depth scans and analysis of the data-sets he was able to see patterns emerging in how user activity related to shared employers and contacts.

He discovered that people are more likely to connect with others when they already have contacts in common.

So he came up with the “People You May Know” ads on LinkedIn, which proved to be a powerful tool that boosted user activity and many of the company’s core metrics.
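The article does not say how Goldman built the feature, but the underlying idea of ranking possible connections by shared contacts can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the function name `suggest_contacts`, the toy `connections` data and the simple count-based ranking are hypothetical, not LinkedIn’s actual method.

```python
from collections import Counter


def suggest_contacts(connections, user, top_n=3):
    """Rank people `user` is not yet connected to by number of shared contacts.

    `connections` maps each person to the set of people they are connected to.
    This is an illustrative heuristic only, not LinkedIn's real algorithm.
    """
    direct = connections.get(user, set())
    shared = Counter()
    for contact in direct:
        for candidate in connections.get(contact, set()):
            # Skip the user themselves and anyone they already know.
            if candidate != user and candidate not in direct:
                shared[candidate] += 1
    # Most shared contacts first; ties broken alphabetically for stable output.
    return sorted(shared.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical toy network, invented for this example.
connections = {
    "ana": {"ben", "cara"},
    "ben": {"ana", "cara", "dev"},
    "cara": {"ana", "ben", "dev", "eli"},
    "dev": {"ben", "cara"},
    "eli": {"cara"},
}

suggestions = suggest_contacts(connections, "ana")
print(suggestions)  # [('dev', 2), ('eli', 1)]
```

In this toy network, "dev" shares two contacts with "ana" and "eli" shares one, so "dev" is suggested first. A real system would weigh many more signals, such as the shared employers mentioned above.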

Goldman’s discovery is an example of something that’s central to what the data scientist does: he picked up on flashes of a pattern in the data, analysed them exhaustively and eventually extracted commercial value for the LinkedIn product.

The data scientist’s role is about seeing connections between data that are randomly distributed across the rows and columns of the dataset and the database; for the emerging generation of data scientists, the wilder and more unstructured the data the better.

Like any scientist, they will develop a hypothesis about a pattern or trend in the data and design tests to check whether it holds up.
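As a rough illustration of what such a test might look like, here is a minimal permutation test in Python. It asks whether the difference between two groups could plausibly be down to chance. The function name `permutation_test` and the two small samples are hypothetical, invented for this sketch; a real analysis would use far more data and a test chosen to suit it.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Repeatedly shuffles the pooled data and counts how often a random split
    produces a difference at least as large as the one actually observed.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical data: 1 = the pair of users connected, 0 = they did not.
with_shared_contacts = [1, 1, 1, 1, 1, 1, 0, 1]
without_shared_contacts = [0, 0, 0, 1, 0, 0, 0, 0]

p_value = permutation_test(with_shared_contacts, without_shared_contacts)
print(f"Estimated p-value: {p_value:.4f}")
```

A small p-value suggests the pattern is unlikely to be a fluke of this particular sample, which is the point at which the idea is worth examining in its business context.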

Once they have validated the idea, they’ll analyse it in the context of the market or business process to find out how it can add value to the product they’re working on.

Any data scientist will of course have to be highly technical. They’ll need to have demonstrated a mastery of advanced maths and stats as well as skills in software coding.

But the job isn’t just about working deep in the data. It also demands other skills like good communication with other people.

The data scientist will have to translate patterns pulsing through the database into a story they can communicate to the rest of the company, either in writing or graphically. It’s not uncommon for data scientist job specs to look for web authoring and graphic design skills.

Additionally, they’ll need that creative impulse that lets them see value and meaning in what everyone else simply sees as a hopeless mess of symbols, letters and numbers.

Marketing Distillery makes this clear in the infographic below.


The trigger could be a unique pattern that keeps repeating in the dataset, or something that reminds them of what they’ve seen in a completely different area of study. Davenport and Patil relate a story about a data scientist who was tracking sequencing in financial data and was reminded of similar patterns they had seen in a previous job analysing DNA sequences.

The talent emerging in the nascent data science profession often comes from backgrounds not necessarily related to IT, but where working with data is fundamental, such as ecology or the social sciences.

These are disciplines that try to understand mass or large-scale system behaviour in the real world.

A good hiring manager will look for a range of qualities in a data scientist. At the outset they’ll confirm the candidate has a solid foundation in maths and stats as well as thorough programming skills.

However, they’ll go further and look for the business acumen that gives a candidate an intuitive understanding of how ripples in the data can portend tectonic shifts in the market.

Data scientists perform best when working closely with front-line staff like product managers and product owners.

The product manager has to be Janus-faced, looking at the development process in-house but at the same time monitoring how the product will perform in an ever-evolving market in the real world outside.

The data scientist, too, has to work with raw data generated by customers in the real world and apply what they learn from that to the product their team is developing.

They’re a rare talent, and even when academia produces more of them they will remain a scarce professional profile with a highly marketable skill-set.

A good data scientist can listen to big data and be fluent in its strange, layered nuances but at the same time will be a very capable communicator. And just to give the skill-set another twist, they’ll also need to be first-rate entrepreneurs: they must immediately and intuitively grasp how faint whispers in the data flows can mean huge changes in the market and great opportunities for the product.