Turning information overload into competitive advantage in the pharma industry

According to TripleHop Technologies, developers of information filtering technologies for the life sciences, some staggering statistics bring to light the overwhelming information management challenges pharma is facing:

  • Every minute scientific knowledge increases by 2,000 pages;
  • Every day we send the equivalent of more than 300 million pages of text over the Internet;
  • It takes five years to read the new scientific material produced every 24 hours; and
  • Over half of all scientific knowledge becomes outdated every ten years.

Most companies are awash in data that largely goes untapped. In fact, research indicates 35-50% of the information available within a given enterprise is not centrally indexed, and IDC estimates an enterprise employing 1,000 knowledge workers wastes nearly $2.5 million per year due to an inability to locate and retrieve information. In an industry where time-to-market is perhaps the most critical success factor, improved information harvesting technologies are crucial.

Gilles Rousseau, Head of Life Sciences and Pharmaceuticals for TripleHop, believes the company's MatchPoint enterprise software solution, already in wide use within the financial, media, manufacturing and travel industries, offers unique advantages to pharmaceutical clients looking for new approaches to capitalize on their information resources. According to Rousseau, MatchPoint's single interface provides unified access to disparate information sources, both internal and external. "MatchPoint allows users to search within their entire enterprise information universe with just one query," he said.

Through a combination of proprietary search technology and connectors to a client's existing online subscription-based services, MatchPoint provides access to a variety of internal and external sources of information. Regardless of location and format, MatchPoint provides unified access to a wide range of disparate information resources, including the Internet, corporate intranets, shared corporate files, databases, annotations and reports, e-mails and attachments, local documents and online libraries.
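The idea of one query fanned out to many sources can be illustrated with a minimal sketch. This is not MatchPoint's implementation, which is proprietary; the `Hit` record, the `make_connector` helper and the term-overlap scoring are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    source: str   # which repository the document came from
    title: str
    score: float  # share of query terms found in the document


# A connector is a hypothetical function: query -> list of hits.
Connector = Callable[[str], List[Hit]]


def make_connector(source: str, docs: Dict[str, str]) -> Connector:
    """Build a toy connector that scores documents by query-term overlap."""
    def search(query: str) -> List[Hit]:
        terms = set(query.lower().split())
        hits = []
        for title, text in docs.items():
            overlap = len(terms & set(text.lower().split()))
            if overlap:
                hits.append(Hit(source, title, overlap / len(terms)))
        return hits
    return search


def federated_search(query: str, connectors: List[Connector]) -> List[Hit]:
    """Send one query to every connector and merge the results by score."""
    merged = [hit for connector in connectors for hit in connector(query)]
    return sorted(merged, key=lambda h: h.score, reverse=True)


intranet = make_connector("intranet", {"Trial memo": "phase two trial results for compound"})
email = make_connector("email", {"Re: dosing": "question about dosing in the trial"})
results = federated_search("trial results", [intranet, email])
for hit in results:
    print(hit.source, hit.title, hit.score)
```

In a real deployment each connector would wrap an intranet index, a mail store or a subscription service, and merging scores from different sources would need more care than a simple sort.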

But according to Rousseau, it is MatchPoint's concept-based searching mechanisms that really raise the bar over other search technologies. "Keyword searches work fine, but they're outdated," Rousseau explained. "The big problem is what you get depends on what you type in. And even if you ask the right question, it becomes an issue of have you asked it in all the right ways? Our concept-based search enables users to retrieve not only documents that specifically contain the word(s) entered by the user, but also documents that may not contain such word(s), yet contain words related to the same concept. In addition, MatchPoint also provides more personalized and context-sensitive search results, proven to be more precise than Bayesian and semantic searches."
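The difference between keyword and concept-based retrieval can be shown with a small sketch. It assumes a hand-written concept map (`CONCEPTS`) and simple query expansion; how MatchPoint actually represents concepts is not described in this article, so every name below is hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical concept map: each concept groups terms treated as related.
CONCEPTS: Dict[str, Set[str]] = {
    "cardio": {"heart", "cardiac", "cardiovascular"},
    "approval": {"approval", "approved", "clearance"},
}


def expand(query: str) -> Set[str]:
    """Return the query terms plus every term sharing a concept with them."""
    terms = set(query.lower().split())
    expanded = set(terms)
    for related in CONCEPTS.values():
        if terms & related:
            expanded |= related
    return expanded


def keyword_search(query: str, docs: Dict[str, str]) -> List[str]:
    """Match only documents that contain a word the user typed."""
    terms = set(query.lower().split())
    return [title for title, text in docs.items() if terms & set(text.lower().split())]


def concept_search(query: str, docs: Dict[str, str]) -> List[str]:
    """Match documents that contain any term related to the query's concepts."""
    expanded = expand(query)
    return [title for title, text in docs.items() if expanded & set(text.lower().split())]


docs = {
    "A": "cardiac safety signals in the new compound",
    "B": "heart rate data from the pilot study",
    "C": "quarterly sales forecast",
}
print(keyword_search("heart", docs))  # only the document containing "heart"
print(concept_search("heart", docs))  # also the document that says "cardiac"
```

Document A never mentions "heart", so a keyword search misses it, while the expanded query finds it through the shared concept.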

The challenge in concept searching, according to Rousseau, is determining which words are related terms. "MatchPoint learns from a client's documents which terms are related to which concepts," he said. "At TripleHop, we believe the relevancy of search results depends on the task at hand and the intended use of the information collected, and thus changes with the context. Therefore, we designed MatchPoint to be context-sensitive. It continuously learns and categorizes search concepts from each client's corpus of documents, as well as from the interaction of the users. While being essentially based on statistical algorithms, our classification technology also includes a layer of semantic analysis in order to bootstrap the system and make it efficient as soon as it is installed. As a result, our technology solves the typical integration and maintenance problems encountered with Bayesian and semantic-based technologies while providing much better scalability."
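One simple statistical way to learn related terms from a corpus is to count how often words appear in the same document. The sketch below uses that technique purely as an illustration; it is an assumption, not TripleHop's algorithm, and the stopword list, the `min_count` threshold and the sample corpus are invented for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, List

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "of", "in", "and", "a", "for", "to"}


def cooccurrence(corpus: List[str]) -> Dict[str, Counter]:
    """Count, for each pair of words, how many documents contain both."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for doc in corpus:
        words = sorted(set(doc.lower().split()) - STOPWORDS)
        for a, b in combinations(words, 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def related_terms(term: str, counts: Dict[str, Counter], min_count: int = 2) -> List[str]:
    """Return words that share at least min_count documents with term."""
    neighbours = counts.get(term, Counter())
    return [word for word, n in neighbours.most_common() if n >= min_count]


corpus = [
    "statin therapy lowers cholesterol",
    "cholesterol reduction with statin treatment",
    "statin dosing and cholesterol targets",
    "aspirin reduces clotting risk",
]
counts = cooccurrence(corpus)
print(related_terms("statin", counts))
```

Because the counts come from the client's own documents, the related terms reflect that client's vocabulary, which is the property Rousseau describes. Raw counts favor frequent words, so a production system would normally normalize them.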

The beauty of the technology, however, Rousseau points out, is its ability to continue to learn with use. "As this system gets used on a regular basis, the concepts become more and more refined," he said. "Although it works well on Day 1, it's constantly improving, based on users' explicit and implicit feedback."

MatchPoint's concept searching technology and learning mechanisms are, according to TripleHop, based on the latest generation of artificial intelligence-based algorithms, including semantically tagged information (automation of the semantic Web) and statistical analysis of terms (keywords and phrases) and patterns within a corpus. The company believes concept searching decreases the time spent on searches, prevents work redundancy, minimizes the risk of missing important documents, and optimizes the effectiveness of searches.

However, Rousseau and his colleagues believe that simply accessing a broad range of information isn't enough, and that the real strategic payoff comes from sharing knowledge and expertise throughout the enterprise. "Subject to existing network privileges and privacy rules, MatchPoint can refer users to relevant search results obtained and saved by co-workers on a similar topic," Rousseau said. "This enables users to learn from each other, identify experts on a specific topic, locate the most relevant company documents on a particular subject, and ensure they don't reinvent the wheel every time they need to produce a document or research a specific topic."

In addition, Rousseau stresses the importance of being able to stay abreast of continuing developments related to a specific topic of interest. "If you want to keep track of a particular topic, like competitor activities, for instance, or who gets FDA approvals and when, you can program alert searches and have them run at particular intervals," he said.
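An alert search of this kind amounts to a saved query that is re-run on a schedule and reports only documents it has not shown before. The sketch below illustrates that idea under stated assumptions: the `AlertSearch` class, the hour-based clock and the in-memory `index` are hypothetical and say nothing about how MatchPoint schedules alerts.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class AlertSearch:
    query: str
    interval_hours: float
    next_run: float = 0.0  # hours since an arbitrary start time
    seen: Set[str] = field(default_factory=set)

    def run_if_due(self, now: float, search: Callable[[str], List[str]]) -> List[str]:
        """Run the saved query if its interval has elapsed; return only new hits."""
        if now < self.next_run:
            return []
        self.next_run = now + self.interval_hours
        new_hits = [doc for doc in search(self.query) if doc not in self.seen]
        self.seen.update(new_hits)
        return new_hits


# Toy document index standing in for the searchable sources.
index = ["FDA approves drug X"]


def search(query: str) -> List[str]:
    """Case-insensitive substring match over the toy index."""
    return [doc for doc in index if query.lower() in doc.lower()]


alert = AlertSearch(query="fda", interval_hours=24)
first = alert.run_if_due(0, search)    # runs immediately
index.append("FDA rejects drug Y")
early = alert.run_if_due(12, search)   # not due yet, nothing returned
second = alert.run_if_due(24, search)  # due again, only the new item
print(first, early, second)
```

Passing the current time in as an argument keeps the example deterministic; a real scheduler would read the clock and notify the user rather than return a list.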

Although Rousseau sees many potential applications of the MatchPoint technology within pharma companies, he believes the discovery and regulatory submission processes will see the greatest benefit. "One important aspect of where we're heading is to create bridges between all the documents we can search and the actual databases," he said. "Databases are structured information, and the big challenge, and where the information retrieval world is headed, is to create bridges between all the unstructured data, in emails and documents, and the structured data. In other words, if you're in the middle of your database looking at testing numbers on a particular drug, you'd be able to click on that number and see all the internal memos, Web pages and emails generated on the topic."

MatchPoint's real advantage, according to Rousseau and his colleagues, lies in its ability to search a wide range of data and information levels, relating each level to other documents in other sources. "MatchPoint makes a whole new range of information available that you may not have had access to before, because you didn't even know it was out there," Rousseau said.

To learn more about MatchPoint and TripleHop, visit the company's Web site at triplehop.com or e-mail contact@triplehop.com.

To find out more about the Knowledge Management strategies available to your company and to hear best practices from across the world, apply now for a ticket to eyeforpharma's Knowledge Management for R&D One Day Intensive Workshops, December 2, 2002 in Central London.

To apply for this exclusive pharma-only event, or for more information, contact Event Director Jonathan Gardner on +44 (0)20 73 75 75 63 or email him at jgardner@eyeforpharma.com.