Archive for May, 2009

XTractor Premium for scientists to access data stored in PubMed says Knowledgespeak

Life sciences informatics company Molecular Connections, India, has announced the availability of XTractor Premium, a knowledge management platform, for the biomedical research community. The product is projected to provide scientists with new and refined ways to access the latest scientific facts from PubMed. Pharmaceutical companies, researchers and scientists can use the platform to identify, verify and develop predictive relationships from PubMed Literature.

The ability to transform data into knowledge and to assist experts is seen as crucial to the success of research today. XTractor Premium offers several enhancements over the beta version in managing large volumes of published biomedical data to accelerate decision making.

The XTractor knowledgebase currently contains 150,000 relationships on biomarkers, clinical trials, pathways, knockout studies, mutations, drug interactions and disease mechanisms. The knowledgebase is updated every day with manually annotated relationships from the latest PubMed publications. Other premium features include semantic search functionality, concept linking, bibliographic search, and the option to save and export data in user-friendly formats.


May 19, 2009

The XTractor Premium Knowledgebase surpasses the 150,000 mark

XTractor Premium is a platform for discovery, analysis and modelling of published biomedical facts. The application also comes with the XTractor Premium Knowledgebase, the only knowledgebase that provides manually annotated facts from PubMed on a daily basis.
The XTractor Premium Knowledgebase now has more than 150,000 relationships.

May 19, 2009

Increased time and cost of mining PubMed every day

www.xtractor.in/premium

Researchers benefit most when accurate content from the biomedical literature is delivered in real time. We estimated the time, effort and cost involved in mining PubMed for relevant biomedical facts every single day.

We performed a search on PubMed for Breast Neoplasm over different time spans:

We then estimated the time a typical human annotator needs to pick and categorize the relevant sentences from each of these abstracts and annotate them for Proteins, Diseases, Drugs or Biological Processes.

We found that annotating 30 days of data on breast neoplasm would take at least 1 man-day (based on our past estimates that it takes at least 10 min to annotate one abstract). Extrapolating, the time required to annotate data for the single search “Breast Neoplasm” over 3 months (90 days) comes to 11 man-days of effort and considerable cost.

The same would apply if you were to read and analyze the facts yourself.

So annotating 50,000 abstracts would take 50,000 abstracts x 10 min = 500,000 min, or about 1,041 man-days (at 8 hours per man-day): roughly 3 years to annotate 1 month of literature findings from PubMed.
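The arithmetic above can be reproduced with a minimal Python sketch. The 10 minutes per abstract and the 50,000 abstracts come from this post; the 8-hour man-day is an assumption (it is the value that reproduces the 1,041 figure), and the function names are hypothetical.

```python
MINUTES_PER_ABSTRACT = 10   # estimate quoted in this post
HOURS_PER_MAN_DAY = 8       # assumption: one man-day is 8 working hours


def annotation_man_days(n_abstracts,
                        minutes_per_abstract=MINUTES_PER_ABSTRACT,
                        hours_per_man_day=HOURS_PER_MAN_DAY):
    """Man-days one annotator needs to annotate n_abstracts."""
    total_minutes = n_abstracts * minutes_per_abstract
    return total_minutes / (hours_per_man_day * 60)


def abstracts_per_man_day(minutes_per_abstract=MINUTES_PER_ABSTRACT,
                          hours_per_man_day=HOURS_PER_MAN_DAY):
    """How many abstracts fit into a single man-day."""
    return (hours_per_man_day * 60) // minutes_per_abstract


if __name__ == "__main__":
    n = 50_000
    days = annotation_man_days(n)
    print(f"{n * MINUTES_PER_ABSTRACT} minutes = {days:.1f} man-days")
    print(f"abstracts per man-day: {abstracts_per_man_day()}")
```

Changing either constant shows how sensitive the estimate is to the per-abstract time and to the length of a working day.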

Solution:

The major bottleneck in manual curation, as demonstrated above, is the considerable time and cost involved. In XTractor we have reduced the time spent on manual annotation by significantly cutting down the process steps, which boosts our internal productivity and turnaround time. As a result, we are able to serve you the latest manually annotated scientific facts every single day, on an almost real-time basis.

ROI:

At XTractor we manually annotate more than 700 abstracts every day, and at a cost of less than $1 per day.

XTractor delivers handpicked, manually annotated facts from PubMed within 10 days of the date of publication.

FREE Trial at  http://www.xtractor.in/premium/trial.do

text mining, manual annotation, data alerts, collaborations, pubmed, curation, genes, drugs, processes, diseases, free, data mining, tag, annotations, drug discovery, web 2.0, biomedical literature, publishing, abstracts, natural language processing, NLP, data analysis, visualization, concept linking, abstraction, categorization, precision, recall, data accuracy, proteins, interactions, molecules, text gathering, indexing, index, query, MeSH, biological process, protein function, NLM, accuracy, accurate data, manual curation, curate, annotate, protein interactions, literature

May 4, 2009

Precision/accuracy based mining of PubMed data

www.xtractor.in/premium


The major concern of most biomedical researchers is the accuracy of data mining. Making computers understand human language, and more so analyzing the extracted data, is still a problem for most text mining engines based on Natural Language Processing (NLP). Though NLP engines have a quicker turnaround time, they are not accurate and, much of the time, not comprehensive. We conducted an analysis of XTractor manual curation versus one of the best Natural Language Processing engines in the biomedical space. We procured PubMed abstracts from different dates and passed them through the NLP engine, and the XTractor team also manually annotated the same set for biomedical relevance.

We found that our manual curators reduced the false positive rate of NLP-picked abstracts by 10-38%. In other words, our manual annotation effort enabled us to:

  1. Pick up additional abstracts that were totally missed by the NLP engine
  2. Correct abstracts that were wrongly annotated for Proteins, Diseases, Drugs and Biological Processes.

Some examples of the misses by the NLP engine include:

Common English term mismatches: MICE, PEG, DAMAGE, RAW, which overlap with protein names.

Common isoform mismatches: p16-INK4 to p14ARF and cd11c to cd11d

Common protein mismatches: S1P (sphingosine-1-phosphate) matched to sphingosine-1-phosphate receptor and ERK to ephrin type-B receptor 2

Protein-disease mismatches: VHL protein mismatched to von Hippel-Lindau disease and progressive multifocal leukoencephalopathy mismatched to PML protein

Protein-process mismatches: cell growth tagged to growth factor

Protein-drug mismatches: rapamycin tagged to the protein mammalian target of rapamycin
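To illustrate how the common English term mismatches in the list above can arise, here is a minimal Python sketch of a naive dictionary lookup. It is not the NLP engine that was evaluated; the lexicon, the function names and the example sentence are hypothetical, and only the four symbols are taken from this post.

```python
import re

# Hypothetical mini-lexicon: the symbols this post lists as overlapping
# with ordinary English words.
PROTEIN_SYMBOLS = {"MICE", "PEG", "DAMAGE", "RAW"}


def tokenize(sentence):
    """Split a sentence into simple alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9-]+", sentence)


def naive_tag(sentence):
    """Case-insensitive lookup: tags any token that matches a symbol."""
    return [t for t in tokenize(sentence) if t.upper() in PROTEIN_SYMBOLS]


def case_sensitive_tag(sentence):
    """Stricter lookup: tags only tokens written exactly like the symbol."""
    return [t for t in tokenize(sentence) if t in PROTEIN_SYMBOLS]


if __name__ == "__main__":
    # Invented example sentence in which no protein is mentioned.
    sentence = "DNA damage was measured in raw extracts from mice."
    print(naive_tag(sentence))           # ordinary words tagged as proteins
    print(case_sensitive_tag(sentence))  # no hits for this sentence
```

A case-sensitive lookup avoids these particular hits, but it would not help with the isoform, protein-disease or protein-drug mismatches listed above, which depend on the meaning of the sentence.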

Together, these two aspects amounted to 10-38% false positives across the abstract sets we tested.
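A comparison of this kind can be sketched in Python as follows, treating the manual annotations as the reference set. The annotation tuples below are invented toy data, not results from our analysis, and the function name and the (abstract, term, category) layout are assumptions made for illustration.

```python
def compare_annotations(nlp_tags, manual_tags):
    """Compare NLP annotations against manual ones used as the reference.

    Each annotation is a hashable tuple, e.g. (abstract_id, term, category).
    """
    nlp, manual = set(nlp_tags), set(manual_tags)
    true_pos = nlp & manual    # NLP tags confirmed by the curators
    false_pos = nlp - manual   # NLP tags the curators rejected
    missed = manual - nlp      # facts only the curators found
    return {
        "false_positive_share": len(false_pos) / len(nlp) if nlp else 0.0,
        "precision": len(true_pos) / len(nlp) if nlp else 0.0,
        "recall": len(true_pos) / len(manual) if manual else 0.0,
        "missed": sorted(missed),
    }


if __name__ == "__main__":
    # Invented toy data for illustration only.
    nlp = [
        ("a1", "VHL", "protein"),
        ("a1", "MICE", "protein"),   # ordinary word mistaken for a protein
        ("a2", "rapamycin", "drug"),
        ("a2", "ERK", "protein"),
        ("a3", "S1P", "protein"),
    ]
    manual = [
        ("a1", "VHL", "protein"),
        ("a2", "rapamycin", "drug"),
        ("a2", "ERK", "protein"),
        ("a3", "S1P", "protein"),
        ("a4", "PML", "protein"),    # abstract the NLP run missed entirely
    ]
    for name, value in compare_annotations(nlp, manual).items():
        print(name, value)
```

Here the false positive share is simply one minus precision; the "missed" entries correspond to abstracts or facts that only manual annotation picked up.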

Solution:
XTractor manual annotation begins with screening all the relevant records from PubMed every day. This is followed by manual annotation and sentence categorization. Finally, the categorized sentences are manually quality checked again for accuracy, and automated validations are run on them to keep false positives out of the knowledgebase.

Each fact presented in the XTractor knowledgebase therefore passes through 2 rounds of quality checks before it is presented to the user, so you can rest assured that the data provided will be 98-99% accurate.

ROI:

Erroneous data from text mining engines can lead to wrong assumptions and hypotheses, which may ultimately cause major losses in your drug discovery program, potentially amounting to millions of dollars. So try XTractor, the world’s first knowledgebase of PubMed data manually curated every day. FREE Trial at  http://www.xtractor.in/premium/trial.do


May 4, 2009

