Databases of genomic information assist molecular biologists in understanding the biochemical functions, macromolecular structure, and evolutionary history of organisms. We present an introduction to molecular biology, discuss how and why molecular biologists use genomic databases, describe search algorithms, discuss the notion of relevance in genomics, and outline current and future directions for genomic information retrieval (IR). Information retrieval is important in various biomedical research fields. We cover the theoretical background, the state of the art, and future trends in biomedical information retrieval. Techniques for literature searches, genomic information retrieval, and database searches are discussed. Literature search techniques cover named entity extraction, document indexing, document clustering, and event extraction.
Genomic information retrieval techniques are based on sequence alignment algorithms. We also briefly describe widely used biological databases and discuss issues related to retrieving information from these databases. Terminology systems are involved in almost every aspect of information retrieval. The various types of terminology systems and their use in supporting information retrieval are reviewed. The key topics are homology, databases, indexing, clustering, and ontologies. To retrieve relevant information efficiently and improve precision, modern information retrieval systems usually index documents or group similar documents together, which makes relevant documents easier to identify.
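As a rough illustration of document indexing, the sketch below builds a simple inverted index and answers boolean AND queries against it. The function names (`build_inverted_index`, `search`), the whitespace tokenizer, and the toy documents are assumptions made for this example; they are not taken from any system described here, and real retrieval systems typically add stemming, stop-word removal, and ranking.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: naive whitespace tokenization is enough for a sketch.
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every query token (boolean AND)."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical toy collection, invented for illustration only.
docs = {
    1: "BRCA1 mutation and breast cancer risk",
    2: "sequence alignment of BRCA1 homologs",
    3: "document clustering for literature search",
}
index = build_inverted_index(docs)
```

With this toy collection, `search(index, "BRCA1 alignment")` looks up each token's posting set and intersects them, so only documents containing both terms are returned. Requiring every term to match favors precision over recall.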
New information retrieval techniques are being developed that may lead to more sophisticated information retrieval systems with both high recall and high precision. The techniques covered consist of literature searches, genomic sequence searches, and database searches. We also review the use of terminology systems in information retrieval and their importance in supporting information retrieval and integration (e.g., the integration of semantically related but syntactically variant information). Homology searches are the building blocks of many studies, such as comparative genomics, gene prediction, and phylogenetic analysis.
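To make the idea of a homology search concrete, the sketch below scores a query against each database sequence with a Smith-Waterman style local alignment and ranks the results. The scoring values (match 2, mismatch -1, linear gap -2), the function names, and the toy sequences are assumptions chosen for illustration. Widely used tools such as BLAST rely on faster heuristics and statistically derived substitution matrices rather than this exhaustive dynamic program.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b (Smith-Waterman).

    Assumption: simple match/mismatch scoring with a linear gap penalty.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere (local alignment).
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def rank_by_homology(query, database):
    """Return (score, name) pairs sorted by descending alignment score."""
    scored = [(local_alignment_score(query, seq), name)
              for name, seq in database.items()]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical toy database, invented for illustration only.
database = {"seq_x": "TTACGTTT", "seq_y": "GGGG"}
ranked = rank_by_homology("ACGT", database)
```

Here `seq_x` contains the query as an exact substring, so it ranks above `seq_y`, which shares only a single base with the query.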