To whom does pharma turn when facing cutting-edge research challenges such as designing algorithms to comb unstructured EHR data hunting for undiagnosed patients? Who helps glean patient insights from digital data streams such as social media? Who is designing the value-based frameworks behind pharma’s latest wave of performance-based agreements with payers?

These are just some of the tasks being fielded by data scientists, a new kind of insight and analytics professional showing up in the biopharma ranks. These experts, skilled as they are in advanced data techniques, are being seriously courted by the industry.

“We’re hiring engineers, quantitative pharmacologists, economists, mathematicians, and machine-learning experts. It’s with diversity in mind — diversity not only in skill set, but also in experience,” reports Sandy Allerheiligen, Merck VP of predictive and economic modeling, at September’s AI and the Near-term seminar organized by the consultancy Luminary Labs.

Cut from a different cloth than the traditional quants charged with tracking multichannel campaign management, these data scientists are just as likely to have experience in classic metrics as in machine learning.

And they take their cue from consumer tech advances such as Amazon Alexa and Apple’s Siri, the virtual assistants for the home and phone, respectively, that use artificial intelligence (AI). That term was coined in 1956 and roughly translates to “endowing computers with human-like intelligence.” More recently, much of the field’s work has gone under the banner of machine learning, a branch of AI in which computers learn from data to generate insights.

What’s behind the latest recruiting wave in pharma? Some cite well-known factors such as a renewed focus on outcomes-based payment, the greater diversity of data, and the sheer volume of data.

Others say it’s a response to pressure from consumer tech. Apple and Uber have had to figure out quickly how to use huge amounts of data intelligently to deliver value to the consumer, and machine-learning projects such as Microsoft’s Hanover aim to predict which drugs and drug combinations are most effective against cancer. And there’s IBM, whose Watson Oncology Advisor uses AI to develop individualized treatments.

Hilary Mason, the former chief data scientist for URL-shortening firm Bitly who now heads her own consultancy, Fast Forward Labs, suggests another reason, which she summed up as “technical possibility meets business opportunity.”

To illustrate, Mason speaks of the ability to analyze images in rich media. While at Bitly, she says, she observed that about 16% of the links pointed primarily to media objects: an image, audio, or video.

“At the time, we weren’t able to do anything with that object itself,” recalls Mason, who also spoke at the Luminary Labs event. “We were restricted to analyzing text around it or what people said about it.”

Now, Instagram — thanks to a data technique called deep learning — can automatically ascertain what users like to take pictures of by analyzing only the images.

“This complex data that has been unavailable for machine-learning techniques before is now accessible,” she explains. “It opens up many possibilities.”


All of these factors have led to a kind of cognitive-computing inflection point in pharma, where more emphasis is being placed on preparing for big data and hiring more data scientists.

More than 40% of insights and analytics leaders cited “preparing for or managing big data” and the need to “upskill our organization’s analytic capabilities” among their chief priorities for the next two years, according to a 2016 survey conducted by TGaS Advisors.

Half say the advent of big data will have a profound impact on shaping the insights and analytics discipline and will heighten the need for data science-based skill sets, according to the survey, which included 18 leaders of insights and analytics across large, mid-sized, and small biopharma organizations (not a statistically representative sample, but a broad one).

What’s more, adds Sharon Getty, TGaS executive director of marketing science, anecdotal evidence suggests data science is starting to supplant more traditional forms of market research as the latter are increasingly viewed as outmoded.

“We see continued downward pressure on primary market research budgets, which signals a reprioritization of available funding [toward analytics] and a deprioritization or overall reduction in traditional project-based market research,” notes Getty. “Some organizations are taking a progressive stance and making concerted momentum in this direction.”

While it’s not known whether the company took part in the above survey, biotech firm Celgene formed an organizational capability known as IKU (information, knowledge, utilization) to manage healthcare data as a core asset and harness the insights derived from it to drive decision-making across the entire organization.

The company is hiring senior data scientists whose qualifications include machine learning analyses using real-world evidence, among other healthcare-related data sources, according to an online job description.

Merck’s own Center for Observational and Real-world Evidence (CORE) grew out of a need to provide stakeholders with robust evidence that can be used for decision-making. Its accomplishments include harnessing natural language processing — a technique that relies on pattern recognition — to find patients with peripheral arterial disease (PAD) by extracting information from unstructured EHR data.

Researchers from Merck and the Regenstrief Institute used machine-learning techniques to improve fourfold the identification of PAD patients in an EHR compared with using structured data alone, according to a paper published this past April in the Journal of the American College of Cardiology.
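
To make the idea concrete, here is a minimal Python sketch of how pattern matching over free-text notes can surface patients that structured codes alone miss. It is an illustration only and not the method used in the Merck and Regenstrief study: the toy patient records, the use of the I73 code family as a stand-in for a PAD code list, and both regular expressions are hypothetical assumptions.

```python
import re

# Hypothetical toy records; real EHR data would be far messier.
PATIENTS = [
    {"id": "p1", "codes": ["I73.9"], "note": "Follow-up for peripheral arterial disease."},
    {"id": "p2", "codes": ["E11.9"], "note": "Reports claudication; ABI 0.7, likely PAD."},
    {"id": "p3", "codes": ["I10"], "note": "Hypertension, well controlled."},
    {"id": "p4", "codes": [], "note": "No evidence of peripheral arterial disease."},
]

# Assumed patterns, for illustration only. A real system would need a
# clinically validated vocabulary and far more careful negation handling.
PAD_MENTION = re.compile(
    r"\b(?:peripheral arter(?:y|ial) disease|PAD|claudication)\b", re.IGNORECASE
)
NEGATION_CUE = re.compile(
    r"\b(?:no evidence of|denies|negative for|without)\b", re.IGNORECASE
)


def has_pad_code(codes):
    """Structured-data check: any code in the assumed I73 family."""
    return any(code.startswith("I73") for code in codes)


def note_mentions_pad(note):
    """Free-text check: the first PAD mention in a sentence has no negation cue before it."""
    for sentence in re.split(r"[.;]\s*", note):
        mention = PAD_MENTION.search(sentence)
        if mention and not NEGATION_CUE.search(sentence[:mention.start()]):
            return True
    return False


structured_hits = [p["id"] for p in PATIENTS if has_pad_code(p["codes"])]
note_hits = [p["id"] for p in PATIENTS if note_mentions_pad(p["note"])]
combined_hits = sorted(set(structured_hits) | set(note_hits))

print("structured codes only:", structured_hits)
print("free-text notes:", note_hits)
print("combined:", combined_hits)
```

In this toy data the notes flag one patient whose diagnosis codes say nothing about PAD, while the negation check keeps "no evidence of" from counting as a hit. Real clinical NLP has to cope with abbreviations, misspellings, and family-history mentions that simple patterns like these would get wrong.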

“External partnerships are one way in which we augment our expertise,” explains Merck’s Aman Bhandari, PhD, executive director of data science and partnerships. “You can’t always hire a team that’s multidisciplinary. We’re very fortunate to have one.”

The partnership also showcased what happens when different data sets and approaches are linked or compared. That pooling of data sets and capabilities will birth yet more opportunities across the life sciences continuum, according to a report from QuintilesIMS.

On the commercial side, as many payers ask for financial guarantees if drugs do not meet agreed performance thresholds, insights generated from real-world data are being used to provide more accurate information on patients and expected outcomes, the report’s authors observe.

They cite insurer Harvard Pilgrim Health Care’s use of real-world evidence in three separate cases: tracking whether Amgen’s Repatha lowers patients’ lipids to the levels seen during clinical trials; tracking whether Novartis heart drug Entresto reduces hospitalizations for congestive heart failure; and evaluating the ability of Eli Lilly’s Trulicity to lower HbA1c in diabetic patients.
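
The basic logic of such a performance-based agreement can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the terms of any actual contract: the function name, the benchmark, the rebate rate, and the patient figures are all hypothetical.

```python
from statistics import mean


def rebate_due(observed_reductions, trial_benchmark, drug_spend, rebate_rate):
    """Return the rebate owed when the real-world average falls short of the benchmark.

    All parameters are hypothetical: observed_reductions holds one outcome
    value per patient (for example, percent lipid reduction), and rebate_rate
    is the share of drug spend refunded when the benchmark is missed.
    """
    if not observed_reductions:
        raise ValueError("need at least one patient outcome")
    real_world_average = mean(observed_reductions)
    if real_world_average >= trial_benchmark:
        return 0.0  # drug performed as well as in trials; no rebate
    return drug_spend * rebate_rate


# Hypothetical cohort: percent reduction observed for four patients.
reductions = [62.0, 48.0, 55.0, 40.0]
rebate = rebate_due(
    reductions, trial_benchmark=55.0, drug_spend=100_000.0, rebate_rate=0.2
)
print(f"average reduction: {mean(reductions):.2f}%, rebate owed: ${rebate:,.2f}")
```

Actual agreements tend to be more involved, with adherence requirements, tiered rebates, and rules about which patients count, but the core idea is the same: compare real-world outcomes with an agreed threshold and adjust payment accordingly.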


Having diverse teams sometimes presents its own challenges in terms of giving everyone the space to collaborate, which is, by Allerheiligen’s estimation, a good problem to have.

This year, her firm created a data innovation lab, giving CORE members the opportunity to pull together teams and come up with their own ideas and solutions.

“We put tools at their disposal and allowed them to try some of these methods,” she says. “As you can imagine, with such a wide array of quantitative backgrounds, how do you bring them together and just let them tackle problems? We got way more proposals than we [expected], which was great.”

Another challenge is instilling the mindset that diverse thinking is OK. Asked by Luminary Labs CEO Sara Holoubek how she makes an AI investment case, Allerheiligen responds, “How do we work with partners outside the company [who] make us think about it in different ways? How do we unleash staff? Some have been told ‘this is your job, in this box,’ and now we’re asking them after 15 years to get out of the box.”

Indeed, data scientists sometimes need to come at research questions with unconventional solutions. In a study published in 2015, Merck partnered with Boston Children’s Hospital to capture data on insomnia via Twitter. The research could help uncover new, previously undescribed populations of patients suffering from sleep problems, notes John Brownstein, chief innovation officer of Boston Children’s, in a statement.

Speaking about data mining from social networks and online chatter, Brownstein says, “We think of this as the digital phenotype, our digital exhaust being critical in terms of understanding our health.”

Among their prerequisites, this new breed of researcher also needs to be a good communicator. They need to be able to speak with clinicians, lab scientists, and payers alike. Still, big pharma typically trains its staff. “It’s rare somebody comes in with a full package,” Allerheiligen notes.

As in most other areas of biopharma, there’s a talent scarcity here, too. Respondents to the TGaS survey said they expect fierce competition for data scientists, “particularly as they must now compete with other data intensive industries and draw talent to biopharma as their preferred destination,” notes Getty.

She adds that momentum is being aided by what she terms a notable cultural shift within many biopharma organizations away from gut- or intuition-guided decisions and toward data-driven decision-making.

“We don’t get rewarded for how many models we built,” Allerheiligen explains. “Where the value comes is, did we answer a question? So if we can answer it with a simple correlation, that’s great. It’s all about bringing a robust answer to the critical question.”


Pharma, like every industry, also faces ethical issues in building AI-based products, and there has been no shortage of ethical conundrums arising from the use of machine learning.

For example, a 2016 ProPublica investigation revealed racial bias in an algorithm used at every stage of the criminal justice system to assess who can be set free, as part of a larger examination of “the hidden effect of algorithms in American life.”

Mason cites the investigation as a cautionary tale in the march toward near-term use of AI. “That should not be allowed and should be kept in mind as we build these products,” she explains.

Interestingly, a conversation is percolating around the creation of an ethical code of conduct for data scientists. The medical community already has one, and human-subject laws and protections govern clinical research around the world. This is a place where the scientific community within healthcare has something to offer the more general, tech-oriented data-science community.

“We’re concerned with making sure we’re maintaining the highest ethics in the research that we do,” says Bhandari. No matter the technique being used, “[we need to ensure] we’re applying the scientific method in a rigorous and ethical manner as we do for all of our work.”

There are myriad other ethical questions, from AI’s impact on jobs to national security, not to mention the overriding ethical conundrum: What does it mean when machines are more intelligent than humans? The Partnership on AI, a group that includes IBM, Google, Amazon, Facebook, and others, is studying some of the implications.

The Luminary Labs event revealed there are really two stories about AI and its promise: one about automating tasks, as with a factory robot or driverless car, and the other about discovery and helping people do things better than they can today.

In the broader business environment, a lot of AI’s potential seems centered around augmenting human capability and taking away what Mason called the “cognitive drudgery” of ordinary tasks. Meanwhile, healthcare is poised to derive many more insights.

“We will know more about our individual genetics, long-term patient response, and adherence to regimens, as well as compliance,” predicts Allerheiligen. “For me, it’s how do we actually learn how to translate a clinical-trial patient for a real-world patient? When we’re projecting doses, it’s what’s the right dose of the drug and how do you find that? It’s still a journey, but one that is within reach.”