Precision/accuracy based mining of PubMed data

www.xtractor.in/premium

Precision and accuracy of mining PubMed data

The major concern for most biomedical researchers is the accuracy of data mining. Getting computers to understand human language, and even more so to analyse the extracted data, remains a problem for most text mining engines based on Natural Language Processing (NLP). Although NLP engines have a quicker turnaround time, they are not accurate and, most of the time, not comprehensive. We compared XTractor manual curation against one of the best NLP engines in the biomedical space. We collected PubMed abstracts from different dates and passed them through the NLP engine, while the XTractor team manually annotated the same set for biomedical relevance.

We found that our manual curators reduced the false positive rate of NLP-picked abstracts by 10-38%. In other words, our manual annotation effort enabled us to:

  1. Pick up additional abstracts that the NLP engine missed entirely
  2. Correct abstracts that were wrongly annotated for Proteins, Diseases, Drugs and Biological Processes

Some examples of the errors made by the NLP engine include:

Common English term mismatches: MICE, PEG, DAMAGE and RAW, which overlap with protein names.

Common isoform mismatches: p16-INK4 matched to p14ARF, and CD11c matched to CD11d.

Common protein mismatches: S1P (sphingosine-1-phosphate) matched to sphingosine-1-phosphate receptor, and ERK matched to ephrin type-B receptor 2.

Protein-disease mismatches: VHL protein matched to von Hippel-Lindau disease, and progressive multifocal leukoencephalopathy matched to PML protein.

Protein-process mismatches: cell growth tagged to growth factor.

Protein-drug mismatches: rapamycin tagged to the protein mammalian target of rapamycin.

Together, these two aspects amounted to a false positive rate of 10-38%.
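The common English term mismatches listed above can be illustrated with a small sketch. It assumes a hypothetical dictionary-based tagger that matches tokens against a toy protein lexicon without regard to case or context; this is neither the NLP engine that was evaluated nor XTractor's own pipeline. The lexicon, the sentence and the curator's decision are invented for illustration only.

```python
import re

# Toy lexicon of symbols that collide with ordinary English words.
# Illustrative only; not taken from any real engine's dictionary.
PROTEIN_LEXICON = {"MICE", "PEG", "RAW", "DAMAGE", "ERK"}


def naive_tag(sentence):
    """Return lexicon entries found in the sentence, ignoring case and context."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    return [t.upper() for t in tokens if t.upper() in PROTEIN_LEXICON]


def false_positive_share(tagged, curated_true):
    """Fraction of tagged mentions that the human curator rejected."""
    if not tagged:
        return 0.0
    rejected = [t for t in tagged if t not in curated_true]
    return len(rejected) / len(tagged)


# Hypothetical sentence: only ERK is meant as a protein here.
sentence = "Knockout mice showed DNA damage after ERK inhibition in RAW 264.7 cells."
tagged = naive_tag(sentence)
curated_true = {"ERK"}  # assumed curator decision
share = false_positive_share(tagged, curated_true)
print(tagged, share)
```

In this toy case the tagger flags "mice", "damage" and "RAW" alongside ERK, so three of the four tagged mentions would be rejected by a curator who reads the sentence in context.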

Solution:
XTractor manual annotation begins with a daily screening of all relevant records from PubMed. This is followed by manual annotation and sentence categorization. Finally, the categorized sentences are manually quality-checked for accuracy, and automated validations are run on them to keep false positives out of the knowledgebase.

Each fact in the XTractor knowledgebase therefore passes through two rounds of quality checks before it is presented to the user, so you can rest assured that the data provided will be 98-99% accurate.

ROI:

Erroneous data from text mining engines can lead to wrong assumptions and flawed hypotheses, which may ultimately cause major losses, potentially millions of dollars, in your drug discovery program. So try XTractor, the world's first knowledgebase of PubMed data manually curated every day. Free trial at http://www.xtractor.in/premium/trial.do

text mining, manual annotation, data alerts, collaborations, PubMed, curation, genes, drugs, processes, diseases, free, data mining, tag, annotations, drug discovery, web 2.0, biomedical literature, publishing, abstracts, natural language processing, NLP, data analysis, visualization, concept linking, abstraction, categorization, precision, recall, data accuracy, proteins, interactions, molecules, text gathering, indexing, index, query, MeSH, biological process, protein function, NLM, accuracy, accurate data, manual curation, curate, annotate, protein interactions, literature

May 4, 2009

XTractor Premium FREE TRIAL ACCESS

XTractor Premium is a platform for the discovery, analysis and modelling of published biomedical facts. The application also comes with the XTractor Premium Knowledgebase, the only knowledgebase that provides manually annotated facts from PubMed on a weekly basis. XTractor (basic version) has been widely adopted by the life sciences research community and has more than 2000 users from 300 organizations across the globe.

FREE TRIAL at: http://www.xtractor.in/premium/trial.do

http://www.xtractor.in/premium/purchase.do


April 25, 2009

It takes just 15 minutes to analyse thousands of biomedical scientific facts

Check this out:

http://www.xtractor.in/premium/case_study.do
Tracking common gene polymorphisms across multiple Diseases with XTractor

http://www.xtractor.in/premium


April 15, 2009

Cut your data mining time and cost with XTractor

Free trial at: www.xtractor.in/premium

XTractor Premium is a platform for the discovery, analysis and modelling of published biomedical facts. The application also comes with the XTractor Premium Knowledgebase, the only knowledgebase that provides manually annotated facts from PubMed on a weekly basis. XTractor (basic version) has been widely adopted by the life sciences research community and has more than 2000 users from 300 organizations across the globe.

Semantic Search:

A first-of-its-kind advanced search feature that lets you query the XTractor Premium Knowledgebase using multiple ontologies, so you can narrow down your targets with ease.

Concept Linking:

Our advanced query feature enables you to draw inferences across multiple abstracts and discover facts that have not yet been published.
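One common way to read this kind of linking is the "A-B, B-C, therefore perhaps A-C" pattern used in literature-based discovery. The sketch below is an assumption about the general idea, not a description of XTractor's query engine: the function name `link_concepts`, the triple layout and the placeholder entities are all hypothetical.

```python
from collections import defaultdict


def link_concepts(facts):
    """Suggest indirect (a, b, c) links where b connects a and c.

    `facts` is an iterable of (entity, entity, abstract_id) triples.
    A link is suggested only if no fact connects a and c directly and
    the two supporting facts come from different abstracts.
    """
    neighbours = defaultdict(dict)  # entity -> {neighbour: abstract_id}
    for a, b, abstract_id in facts:
        neighbours[a][b] = abstract_id
        neighbours[b][a] = abstract_id

    suggestions = set()
    for b, linked in neighbours.items():
        for a, src_a in linked.items():
            for c, src_c in linked.items():
                already_linked = c in neighbours.get(a, {})
                # a < c keeps each unordered pair once
                if a < c and not already_linked and src_a != src_c:
                    suggestions.add((a, b, c))
    return sorted(suggestions)


# Placeholder facts; entity names and abstract IDs are invented.
facts = [
    ("drug X", "protein Y", "abstract-1"),
    ("protein Y", "disease Z", "abstract-2"),
]
print(link_concepts(facts))
```

Each suggestion is only a candidate hypothesis: the two supporting sentences come from different abstracts, and no abstract in the input states the link directly.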

Bibliographic search:

Search based on author, organization or country.

Save, Export and Visualize:

Save and export the data in user-friendly formats. XTractor Premium is also compatible with Cytoscape, so you can visualize networks with ease.
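The post does not say which export formats XTractor Premium offers. Purely as an illustration of what Cytoscape-readable output can look like, the sketch below writes relationship triples in Cytoscape's Simple Interaction Format (SIF), one tab-delimited "source, interaction type, target" line per relationship. The function name `to_sif` and the sample relationship are assumptions made for this example.

```python
def to_sif(relationships):
    """Render (source, interaction, target) triples as tab-delimited SIF text."""
    lines = []
    for source, interaction, target in relationships:
        for field in (source, interaction, target):
            # Tabs and newlines would break the one-line-per-edge layout.
            if "\t" in field or "\n" in field:
                raise ValueError(f"field contains a tab or newline: {field!r}")
        lines.append("\t".join((source, interaction, target)))
    return "\n".join(lines) + "\n"


# Hypothetical exported relationship.
relationships = [("rapamycin", "inhibits", "mTOR")]
sif_text = to_sif(relationships)
print(sif_text, end="")
```

Saving `sif_text` to a file with a `.sif` extension gives a network file that Cytoscape can import, with entities as nodes and the interaction type as the edge label.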

Updates made through web services.

Web based and onsite installation available.


April 3, 2009

XTractor Premium Utilities

XTractor enables:

  • Target and drug reusability studies
  • Understanding the biological mechanisms of one or more diseases
  • Biomarker-related studies
  • Studying common pathways and processes across multiple diseases
  • Comparing adverse drug effects across drugs of the same family
  • Identifying similar polymorphism studies across diseases leading to changes in drug response
  • Comparing closely related disease types
  • Studying knockout and loss-of-function studies and correlating them with drug effects
  • Extrapolating and grouping results based on protein family and drug class

Try XTractor Premium. Free trial at http://www.xtractor.in/premium/


April 3, 2009

Data mining PubMed manually at a very low cost

We launched the XTractor service in July 2008, and this month we have released the premium version of the product.

XTractor provides the latest manually extracted scientific facts on drug discovery related topics from PubMed within 10 days of publication.

This version provides more built-in analytical features and access to the complete knowledgebase of 100,000 scientific facts, which is manually annotated and updated from PubMed every single day. More than 600 annotated sentences are added to XTractor every day, so in no time you will be accessing one of the largest biomedical databases.

We are the only literature mining service in the world that provides manually annotated data from PubMed every day for as little as $450 a year. We also give you one year of data updates absolutely free.

So subscribe to XTractor Premium and save $300 right now.

For more information visit: www.xtractor.in/premium

For trial access: http://www.xtractor.in/premium/trial.do


March 19, 2009

XTractor Premium now at Introductory Offer of $450 for Academic Researchers. Subscribe now and save $300!

XTractor Premium is the only knowledgebase that provides accurate, manually annotated scientific relationships, updated from PubMed on a daily basis.

Subscribe Now and Get…

Information on Disease mechanisms, Biomarkers, Drug effects, Clinical trials, Prognosis, Pathways, Knockouts, Mutations, RNAi studies, Protein studies and much more…

+

Access to the XTractor Premium Knowledgebase of more than 100,000 relationships from PubMed

+

Database updates for one complete year from the date of purchase, absolutely free

+

All scientific facts manually annotated by our expert annotators every day

XTractor is currently used by more than 2000 users from over 300 organizations, including USFDA, NIH/NCI, MD Anderson, Scripps, Harvard Medical, Max Planck and others.


March 17, 2009

Get manually annotated PubMed abstracts every day

Highlights: www.xtractor.in/premium

  1. Find information extracted by our expert scientists from PubMed, including information on Proteins, Drugs, Biological and Disease processes
  2. Use classified Literature findings to support experimental studies on Biomarkers, Clinical trials, Drug interactions, Pathways, Disease mechanisms and much more
  3. Form new hypotheses using the latest relationships from PubMed within 10 days of publication
  4. Shorten your discovery and analysis time through innovative search features
  5. Track organizations and their hot areas of research

Drug Discovery Researcher:

XTractor Premium can help you:

  • Find closely associated Proteins, Drugs, Biological Processes and Diseases
  • Sort your facts with our manually categorised sentences on biomarkers, drug studies, pathway studies, knockout studies, mutation studies and many more
  • Obtain highly accurate data every day for your research
  • Discover new relationships and frame hypotheses
  • Track competitors and organizations working in similar or related areas
  • Import the data into your in-house applications with ease

Academic researcher:

XTractor Premium can help you with:

  • Quick collection of facts required for setting up your experiments
  • Enhanced review writing with our manually annotated sentences
  • Regular updates on your areas of interest
  • Sorting of the most relevant abstracts with ease
  • Customised downloadable reports for your favorite biological and chemical entities
  • Time savings on PubMed searches

March 17, 2009

Molecular Connections Launches XTractor Premium for Effective Knowledge Management

March 17, 2009

XTractor Premium: www.xtractor.in/premium

Overview

XTractor Premium is a platform for the discovery, analysis and modelling of published biomedical facts. The application also comes with the XTractor Premium Knowledgebase, the only knowledgebase that provides manually annotated facts from PubMed on a weekly basis. XTractor (basic version) has been widely adopted by the life sciences research community and has more than 2000 users from 300 organizations across the globe.

XTractor Premium gives you:

Precision-based mining of relevant facts

Compared to natural language processing engines, our manual annotation approach reduces the false positive rate by 12-35%.

Reduced data mining time

Manually mined scientific data from PubMed, delivered within 10 days of publication.

Sentences categorised into 13 different categories for easy refining.

Comprehensive coverage of information published every day in PubMed

More than 600 abstracts are mined for relationships from PubMed every single day. Between 15,000 and 20,000 relationships are added to the XTractor Premium knowledgebase each month.

Analysis of the extracted facts

Highly advanced semantic and bibliographic search, concept linking and extensive reports for quicker analysis.

Stay current with weekly updates

Latest updates from PubMed delivered every week.

Watchlist feature to enable topic tracking.

www.xtractor.in/premium

For free trial access: http://www.xtractor.in/premium/trial.do

March 16, 2009
