AlmaKnowledgeExplorer



Brief info

AlmaKnowledgeExplorer is a novel system for detecting relationships between biological entities as published in the literature. It is a valuable aid in the tedious and time-consuming task of gathering published information, and it displays the knowledge associated with a gene, a protein, a pathway, etc. Some examples of possible uses of the system include:

  • Finding the relationships described for a given entry.

  • Assessing the importance of a relationship.

  • Distinguishing the type of relation (interaction, activation, repression, etc.).

  • Reconstructing pathways based on bibliographic information.

  • Structuring the knowledge according to year of study, authors, etc.

AlmaKnowledgeExplorer can be combined with AlmaTextMiner, producing a powerful system for biological analysis from knowledge sources.



Description:
You want to analyze the relationships between specific objects, for example genes or proteins, drugs and other metabolites, or diseases, in order to explore biochemical pathways, target genes for drugs, relations between genes, drugs and diseases, or the reasons for pathological states. To do this efficiently, all these entities first have to be detected and classified in the text. The relevant sentences describing a relation between these entities then have to be separated from the vast number of sentences that contain both entities but do not express the relation of interest. It is crucial for an information extraction system to spot the "needle in the haystack" and to present the results in a way that makes the connections in the extracted knowledge visible.

Solution:
The KnowledgeExplorer was developed to extract bio-entities automatically from scientific text and to fit them into a collection of pre-defined grammatical constructions commonly used to indicate different types of relations between these entities. The system combines computational linguistics methods (such as text tagging) with statistical methods, which guarantee high performance without losing the flexibility necessary for adaptation to new problems. The results are stored in the central database of relationships and presented through a dynamic interface that lets human experts manipulate the underlying information, access the text sources and explore relationships at different levels, all at the same time.

Current applications:
We currently apply the KnowledgeExplorer to the analysis and reconstruction of complete gene and protein interaction networks extracted from the literature for specific organisms such as Saccharomyces cerevisiae. The predefined rules detect constructions like "protein A binds to/interacts with protein B", among many others, which were chosen for their high coverage and specificity in detecting this information. Genes and proteins mentioned in the text are detected in a preceding step, so the only user input required is the set of documents to be analyzed.
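As a rough illustration of the rule-based step described above, the following Python sketch matches the construction "A binds to/interacts with B" in sentences whose entities are looked up in a small dictionary. The entity list, the table `RELATION_PATTERNS`, the function `extract_relations` and the example sentences are all hypothetical and were made up for this sketch; the actual rules and entity detection of the KnowledgeExplorer are not described here and are certainly richer than a few regular expressions.

```python
import re

# Hypothetical entity dictionary; a real system detects entities in a prior step.
KNOWN_ENTITIES = {"Cdc28", "Cln2", "Sic1", "Swi4"}

# Hypothetical trigger phrases, each mapped to a relation type.
RELATION_PATTERNS = {
    "interaction": r"(?:binds to|interacts with)",
    "activation": r"(?:activates)",
    "repression": r"(?:represses|inhibits)",
}


def extract_relations(sentence):
    """Return (entity_a, relation_type, entity_b) triples found in one sentence."""
    entity = "|".join(re.escape(e) for e in sorted(KNOWN_ENTITIES))
    triples = []
    for relation_type, trigger in RELATION_PATTERNS.items():
        # Pattern shape: <entity> <trigger phrase> <entity>
        pattern = re.compile(rf"\b({entity})\b\s+{trigger}\s+\b({entity})\b")
        for match in pattern.finditer(sentence):
            triples.append((match.group(1), relation_type, match.group(2)))
    return triples


# Made-up example sentences: the second mentions both entities
# but does not state a relation between them, so it yields no triple.
sentences = [
    "Cdc28 interacts with Cln2 during late G1.",
    "Cdc28 and Cln2 were both detected in the extract.",
]
for s in sentences:
    print(extract_relations(s))
```

The second sentence shows why co-occurrence alone is not enough: both entities are present, but no rule fires, so the sentence is left out of the results.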
The interface lets the user browse the network at different levels, or focus on specific nodes and explore their interaction partners from there. The lower part of the window presents the textual information about the genes and proteins in the network and about the individual interactions, which allows immediate verification of the results.
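The kind of browsing described above can be pictured with a small data structure. The Python sketch below stores each interaction together with the sentences that support it, lists the partners of a chosen node, and returns the supporting text for a chosen pair. The class `InteractionNetwork`, its methods and the example sentences are hypothetical and only illustrate the idea; the actual interface and database layout are not documented here.

```python
from collections import defaultdict


class InteractionNetwork:
    """Undirected network whose edges keep the sentences that support them."""

    def __init__(self):
        # node -> partner -> list of evidence sentences
        self._edges = defaultdict(lambda: defaultdict(list))

    def add_interaction(self, a, b, sentence):
        """Record one sentence as evidence for an interaction between a and b."""
        self._edges[a][b].append(sentence)
        self._edges[b][a].append(sentence)

    def partners(self, node):
        """Return the interaction partners of a node, sorted by name."""
        return sorted(self._edges.get(node, {}))

    def evidence(self, a, b):
        """Return the sentences that support the interaction between a and b."""
        return list(self._edges.get(a, {}).get(b, []))


# Made-up example data.
net = InteractionNetwork()
net.add_interaction("Cdc28", "Cln2", "Cdc28 interacts with Cln2 during late G1.")
net.add_interaction("Sic1", "Cdc28", "Sic1 binds to Cdc28.")

print(net.partners("Cdc28"))
print(net.evidence("Cdc28", "Sic1"))
```

Keeping the source sentences on each edge is what makes immediate verification possible: an expert who focuses on a node can read the text behind every listed partner.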




Synergies:
These techniques perform dramatically better when they are used together. The TextMiner finds relevant documents and gives an overview of the available publications. The KnowledgeExplorer then establishes relationships between the entities. The TextMiner is used again to extract information about a set of entity pairs, to compare them with other sets in order to spot hidden differences, and to extend the text corpus with the new information. All this knowledge is accumulated in KnowledgeDB to be mined in further sessions.
Contact us for further information.