Big Data: Big headache for the industry?
Big Data, big opportunities. But if data sets aren’t linked, the possibilities become limited.
The McKinsey Global Institute estimates that applying Big Data strategies to better inform decision-making could generate up to US$100 billion in value annually across the US healthcare system by optimizing innovation, improving the efficiency of research and clinical trials, and building new tools for physicians, consumers, insurers, and regulators to meet the promise of more individualized approaches. eyeforpharma met with J.P. Errico, Chairman, Founder, Principal Investor and CEO, electroCore, who spoke at the Financial Times Global Pharmaceutical and Biotechnology Conference 2014, to discuss his experience with Big Data.
“Big Data is like a natural resource: it is most valuable when extracted and purified. It can be anything you want it to be, but you need to ask the right questions,” Errico said. “The value of Big Data is realized through the quality of the queries; intelligent interrogation is what yields the novel and useful insights,” he added, before launching into a story of his company’s research on healthcare resource utilization.
The project, made possible through the NHS database of patients belonging to participating GP practices, served to confirm what had been previously supported by anecdotal evidence: patients with headaches are extremely expensive. “In their first year after diagnosis, they [patients with headaches] are more expensive than individuals with asthma, mental disorders, and diabetes,” Errico pointed out.
Although it might appear counterintuitive – after all, treating a headache itself is not expensive – one should consider the number of comorbidities headache patients suffer from. “Those individuals have two or three additional conditions. Patients who complain of headaches consult their physician 2.5-3.0 times more frequently than average. In addition, they are 2.5-3.0 times more likely to seek help from secondary care, and they take 2.5-4.0 times more medication,” Errico reported. This shows that patients with headaches are much more susceptible to other conditions, and thanks to Big Data, Errico’s team was able to show that the association may not be random. “When disorders co-occur, there is likely to be a set of underlying pathologies,” he pointed out, adding that traditional research was only able to look at 2-3 comorbidities at a time, while the introduction of Big Data expanded the possibilities to 7-8 conditions that can be studied simultaneously in a large group of patients.
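The kind of multi-comorbidity count Errico describes is, at its core, a simple query over a patient-level dataset. A minimal sketch of the idea, using toy data and hypothetical column names (this is not electroCore’s actual analysis):

```python
# Illustrative sketch: counting comorbid conditions among headache patients
# in a patient-level diagnosis table. Data and column names are assumptions.
import pandas as pd

# Toy diagnosis records, one row per patient-condition pair.
records = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "condition":  ["headache", "asthma", "depression",
                   "headache", "diabetes",
                   "headache", "asthma", "diabetes", "anxiety"],
})

# Pivot into a patient-by-condition indicator matrix.
indicators = pd.crosstab(records["patient_id"], records["condition"]).astype(bool)

# For each headache patient, count how many other conditions they carry.
headache = indicators[indicators["headache"]]
other_counts = headache.drop(columns="headache").sum(axis=1)
print(other_counts.to_dict())  # patient_id -> number of comorbid conditions
```

At population scale the same indicator matrix supports the 7-8-way co-occurrence questions the article mentions, since any combination of condition columns can be intersected in a single pass.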
Are we there yet?
It is easy to get lost in the enormity of Big Data. Errico advises keeping things in perspective: “It’s no different than having a spreadsheet with 100 items on it,” he noted. So why isn’t pharma getting on with it?
“There are two sides to it,” Errico suggested. “First, it’s being able to manipulate large datasets in the most economically effective way, to use the right math tools for answering the questions. Second, it’s the importance of the question that you ask. You have to have a reason to ask a question. It has to make sound scientific, rational, and logical sense, otherwise it’s an exercise in futility.”
Asking the right questions is fundamental to science. Sometimes, however, even if you do have the right questions, practical considerations might prevent you from pursuing them. “Right now there are data sets that are not linked together,” Errico said. Indeed, over 50,000 patients have volunteered their DNA to a genomic dataset, but it is not connected with any database that contains the real-world medical information about these patients. Why is this a problem?
If you believe that a certain gene is associated with, for example, pancreatic cancer, you can, in principle, study the association in a petri dish, but it is helpful to have a large population sample you can screen for that gene, to see what proportion of carriers developed pancreatic cancer. If your hypothesis is correct, you will find that about 50% of the carriers did get cancer, and you can then show a link between the gene and the outcome. What you cannot do, without a linked full medical record, is go back and identify what conditions, other than pancreatic cancer, the carriers who stayed cancer-free developed – conditions that might have shielded them. If you knew that, you could figure out how to introduce that protective condition, in some benign way, into people.
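The query Errico has in mind only becomes possible once the genomic and clinical tables share an identifier. A hedged sketch of what that linked query might look like, with invented table and column names standing in for real datasets:

```python
# Hypothetical sketch of a linked-dataset query: join a genomic table to a
# medical-records table on a shared patient ID, then ask what fraction of
# variant carriers developed a given disease. All data are illustrative.
import pandas as pd

genomics = pd.DataFrame({
    "patient_id":  [1, 2, 3, 4, 5],
    "has_variant": [True, True, False, True, False],
})
medical = pd.DataFrame({
    "patient_id":        [1, 2, 3, 4, 5],
    "pancreatic_cancer": [True, False, False, True, False],
})

# This join is only possible because both datasets carry the same patient ID,
# which is exactly the linkage the article says is missing today.
linked = genomics.merge(medical, on="patient_id")

carriers = linked[linked["has_variant"]]
proportion = carriers["pancreatic_cancer"].mean()
print(f"{proportion:.0%} of variant carriers developed pancreatic cancer")

# With full records linked, you could go further and examine the unaffected
# carriers' other diagnoses for potentially protective conditions.
unaffected = carriers[~carriers["pancreatic_cancer"]]
print("Unaffected carriers:", unaffected["patient_id"].tolist())
```

The point of the sketch is the `merge` step: without a shared key between the datasets, neither the proportion nor the follow-up question about unaffected carriers can be asked at all.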
“You can’t query that unless the data sets are linked. The industry needs to recognize that even though they can’t imagine today the questions that will be asked tomorrow, linking datasets together would allow people to come up with revolutionary questions,” Errico concluded.
Big Data is a big opportunity, but the right infrastructure must be put in place. Only by linking existing data sets can researchers come up with questions that, when explored in a rigorous way, can lead to revolutionary discoveries.