Introducing the potential of text mining to animal sciences.

by: Sudeep Sahadevan, Martin Hofmann-Apitius, Karl Schellander, Dawit Tesfaye, Juliane Fluck, Christoph M. Friedrich
Journal of Animal Science (4 June 2012), doi:10.2527/jas.2011-4841 Key: citeulike:11175440

Abstract

In biological research, establishing the prior art by searching for and collecting information already available in the domain is as important as the experiments themselves. To obtain a complete overview of the relevant knowledge, researchers rely mainly on two information sources: (i) various biological databases and (ii) scientific publications in the field. The major difference between the two is that information in databases is readily available and typically well structured and condensed, whereas the information in scientific literature is largely unstructured; that is, dispersed among the many different sections of scientific text. The traditional method of extracting information from scientific literature, generating a list of relevant publications in the field of interest and manually scanning these texts for relevant information, is very time consuming. With this "classical" approach, it is more than likely that the researcher misses some relevant information mentioned in the literature or has to go through biological databases to extract further information. Text mining and named entity recognition methods have already been employed in human genomics and related fields as a solution to this problem. These methods can process and extract information from large volumes of scientific text. Text mining is defined as the automatic extraction of previously unknown and potentially useful information from text. Named entity recognition (NER) is defined as the identification of named entities (names of real-world objects, for example gene/protein names, drugs, enzymes) in text. In the animal sciences, text mining and related methods have so far seen only limited use, in murine genomics and associated fields, leaving behind other fields such as livestock genomics.
The aim of this work was to develop an information retrieval platform for the livestock domain, focusing on livestock publications and the recognition of relevant data from cattle and pig. For this purpose, the rather non-comprehensive gene and protein terminologies for pig and cattle were enriched with orthologue synonyms and integrated into the NER platform ProMiner, which is used successfully in the human genomics domain. In performance tests, the present system achieved fair performance, with a precision of 0.64, a recall of 0.74 and an F(1) measure of 0.69 in a test scenario based on cattle literature.
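
The reported F(1) measure can be checked against the reported precision and recall. The sketch below assumes the standard definition of F(1) as the harmonic mean of precision and recall; the abstract does not state which formula the authors used, and the function name `f1_score` is purely illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition, assumed here)."""
    if precision + recall == 0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the cattle literature test scenario.
reported_precision = 0.64
reported_recall = 0.74

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # prints 0.69
```

Under that assumption, the three reported figures are mutually consistent: 2 x 0.64 x 0.74 / (0.64 + 0.74) is approximately 0.686, which rounds to 0.69.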