Posts filed under 'Uncategorized'
Increased Time and Cost of mining PubMed every day
Researchers benefit most when accurate content from the biomedical literature is delivered in near real time. We estimated the time, effort and cost involved in mining PubMed for relevant biomedical facts every single day.
We performed a search on PubMed for Breast Neoplasm over different time spans.
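For readers who want to run a comparable search, here is a minimal sketch. It assumes the NCBI E-utilities ESearch endpoint, which this post does not say was used. The `pdat` date type, the 30- and 90-day windows as request parameters, and the helper name `build_count_url` are our assumptions too; the search term is copied verbatim from above. The code only builds the request URLs and does not contact NCBI.

```python
from urllib.parse import urlencode

# Public ESearch endpoint of NCBI E-utilities.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_count_url(term: str, days_back: int) -> str:
    """Build an ESearch URL that asks only for the number of matching records."""
    params = {
        "db": "pubmed",
        "term": term,
        "reldate": days_back,  # only records dated within the last N days
        "datetype": "pdat",    # assumption: filter on publication date
        "rettype": "count",    # return the hit count, not the ID list
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # The time spans mentioned in this post: 30 and 90 days.
    for days in (30, 90):
        print(days, build_count_url("Breast Neoplasm", days))
```

Fetching each URL would return a small XML document containing the count; that step is left out so the sketch runs offline.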
We then estimated the time a typical human annotator needs to pick and categorize the relevant sentences from each of these abstracts and annotate the sentences for Proteins, Diseases, Drugs or Biological Processes.
We found that annotating 30 days of data on breast neoplasm would take at least 1 man-day (based on our past estimate that annotating one abstract takes at least 10 minutes). Extrapolating, annotating the results of the single search “Breast Neoplasm” for 3 months (90 days) comes to 11 man-days of effort, at considerable cost.
The same would apply if you were to read and analyze the facts yourself.
So annotating 50000 abstracts would take 50000 abstracts x 10 mins = 500000 mins, or 1041 man-days (a figure that corresponds to 8-hour man-days), which is ~3 years to annotate 1 month of literature findings from PubMed.
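The arithmetic behind this estimate can be checked with a few lines of Python. This is only a sketch: the 10 minutes per abstract comes from our estimate above, while the 8-hour man-day and the 365-day year are assumptions chosen because they reproduce the quoted figures.

```python
MINUTES_PER_ABSTRACT = 10  # from the estimate quoted in this post
HOURS_PER_MAN_DAY = 8      # assumption: not stated in the post
DAYS_PER_YEAR = 365        # assumption: calendar days, to match "~3 years"


def man_days(abstracts: int) -> float:
    """Annotation effort in man-days for a given number of abstracts."""
    total_minutes = abstracts * MINUTES_PER_ABSTRACT
    return total_minutes / 60 / HOURS_PER_MAN_DAY


effort = man_days(50_000)
print(f"{int(effort)} man-days, roughly {effort / DAYS_PER_YEAR:.1f} years")
```

Counting only working days per year instead of calendar days would push the estimate in years higher still.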
Solution:
As shown above, the major bottleneck with manual curation is the considerable time and cost involved. In XTractor we have reduced the time spent on manual annotation by significantly cutting down the process steps, which boosts our internal productivity and turnaround time. This lets us serve you the latest manually annotated scientific facts every single day, in almost real time.
ROI:
At XTractor we manually annotate more than 700 abstracts every day, at a cost of less than $1 per day.
XTractor delivers handpicked, manually annotated facts within 10 days of their publication in PubMed.
FREE Trial at http://www.xtractor.in/premium/trial.do
text mining, manual annotation, data alerts, collaborations, pubmed, curation, genes, drugs, processes, diseases, free, data mining, tag, annotations, drug discovery, web 2.0, biomedical literature, publishing, abstracts, natural language processing, NLP, data analysis, visualization, concept linking, abstraction, categorization, precision, recall, data accuracy, proteins, interactions, molecules, text gathering, indexing, index, query, MeSH, biological process, protein function, NLM, accuracy, accurate data, manual curation, curate, annotate, protein interactions, literature
May 4, 2009
Precision/accuracy-based mining of PubMed data
The major concern for most biomedical researchers is the accuracy of data mining. Making computers understand human language, and more so analyzing the extracted data, is still a problem for most text mining engines, which are based on Natural Language Processing (NLP). Though NLP engines have a quicker turnaround time, they are not accurate and, most of the time, not comprehensive. We compared XTractor manual curation against one of the best NLP engines in the biomedical space. We procured PubMed abstracts over different dates and passed them through the NLP engine, and the XTractor team manually annotated the same set for biomedical relevance.
We found that our manual curators reduced the false positive rate of NLP-picked abstracts by 10-38%. In other words, our manual annotation effort enabled us to:
- Pick additional abstracts that were totally missed by the NLP engine
- Correct abstracts that were wrongly annotated for Proteins, Diseases, Drugs and Biological Processes
Some examples of the NLP engine's misses include:
- Common English term mismatches: MICE, PEG, DAMAGE, RAW, which overlap with protein names
- Common isoform mismatches: p16-INK4 to p14ARF and cd11c to cd11d
- Common protein mismatches: S1P (sphingosine-1-phosphate) matched to sphingosine-1-phosphate receptor, and ERK matched to ephrin type-B receptor 2
- Protein-disease mismatches: VHL protein mismatched to von Hippel-Lindau disease, and progressive multifocal leukoencephalopathy mismatched to PML protein
- Protein-process mismatches: cell growth tagged to growth factor
- Protein-drug mismatches: rapamycin tagged to the protein Mammalian target of rapamycin
Together, these two aspects accounted for 10-38% false positives.
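To see how common English terms end up tagged as proteins, consider a deliberately naive dictionary tagger. This is an illustrative sketch only, not the NLP engine we tested: the lexicon `PROTEIN_SYMBOLS` and the sample sentence are hypothetical, built from the term examples listed above.

```python
import re
from typing import List

# Hypothetical mini-lexicon of protein symbols, taken from the examples above.
PROTEIN_SYMBOLS = {"MICE", "PEG", "DAMAGE", "RAW", "ERK"}


def naive_protein_tags(sentence: str) -> List[str]:
    """Tag every token whose upper-cased form is in the lexicon.

    There is no context check, so ordinary English words that happen to
    share a spelling with a protein symbol are tagged as well.
    """
    tokens = re.findall(r"[A-Za-z0-9-]+", sentence)
    return [token for token in tokens if token.upper() in PROTEIN_SYMBOLS]


sentence = "Raw data from mice showed that ERK activation preceded DNA damage."
print(naive_protein_tags(sentence))  # ['Raw', 'mice', 'ERK', 'damage']
```

Only `ERK` is meant as a protein in this sentence; the other three hits are the kind of false positive a human curator would catch.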
Solution:
XTractor manual annotation starts with a daily screening of all the relevant records from PubMed. This is followed by manual annotation and sentence categorization. Finally, the categorized sentences are manually quality checked again for accuracy, and automated validations are run on them to keep false positives out of the knowledgebase.
Each fact presented in the XTractor knowledgebase thus passes through 2 rounds of quality checks before it is presented to the user, so you can rest assured that the data provided will be 98-99% accurate.
ROI:
Erroneous data from text mining engines can lead to wrong assumptions and hypotheses, which may ultimately lead to major losses in your drug discovery program, possibly amounting to millions of dollars. So try XTractor, the world’s first knowledgebase of manually curated PubMed data, updated every day. FREE Trial at http://www.xtractor.in/premium/trial.do
May 4, 2009
XTractor Premium FREE TRIAL ACCESS
XTractor Premium is a platform for discovery, analysis and modelling of published biomedical facts. The application also comes with the XTractor Premium Knowledgebase, the only knowledgebase that provides “manually” annotated facts from PubMed on a weekly basis. XTractor (basic version) has been widely adopted by the life sciences research community and has more than 2000 users from 300 organizations across the globe.
FREE TRIAL at: http://www.xtractor.in/premium/trial.do
http://www.xtractor.in/premium/purchase.do
April 25, 2009
It just takes 15 minutes to analyse thousands of biomedical scientific facts
Check this out:
http://www.xtractor.in/premium/case_study.do
Tracking common gene polymorphisms across multiple Diseases with XTractor
http://www.xtractor.in/premium
April 15, 2009
Cut down on your data mining time and cost with XTractor
Free Trial at: www.xtractor.in/premium
XTractor Premium is a platform for discovery, analysis and modelling of published biomedical facts. The application also comes with the XTractor Premium Knowledgebase, the only knowledgebase that provides “manually” annotated facts from PubMed on a weekly basis. XTractor (basic version) has been widely adopted by the life sciences research community and has more than 2000 users from 300 organizations across the globe.
Semantic Search:
A first-of-its-kind, highly advanced search feature that lets you query the XTractor Premium Knowledgebase using multiple ontologies. This enables you to narrow down on your targets with ease.
Concept Linking:
Our advanced query feature enables you to draw inferences from multiple abstracts and discover facts that have not yet been published.
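One simple way to picture linking concepts across abstracts is shown below. This is a sketch under our own assumptions, not a description of how XTractor Premium implements the feature: the facts, entity names and abstract ids are hypothetical placeholders, and the rule used (two entities that share a linked "bridge" entity in different abstracts, but have no direct link, become a candidate relationship) is a generic technique chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical annotated facts: (entity, entity, source abstract id).
facts = [
    ("drug X", "protein P", "abstract-1"),
    ("protein P", "disease D", "abstract-2"),
]


def candidate_links(facts):
    """Propose unpublished A-C links that are bridged by a shared entity B."""
    neighbours = defaultdict(dict)  # entity -> {linked entity: abstract id}
    for a, b, source in facts:
        neighbours[a][b] = source
        neighbours[b][a] = source

    candidates = []
    for bridge, linked in neighbours.items():
        for a, c in combinations(sorted(linked), 2):
            already_linked = c in neighbours[a]
            # Keep only pairs with no direct fact, supported by two abstracts.
            if not already_linked and linked[a] != linked[c]:
                candidates.append((a, bridge, c))
    return sorted(candidates)


print(candidate_links(facts))  # [('disease D', 'protein P', 'drug X')]
```

A candidate produced this way is only a hypothesis to examine, since neither abstract states the link directly.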
Bibliographic search:
Search based on author, organization or country.
Save, Export and Visualize:
Save and export the data in user-friendly formats. XTractor Premium is also compatible with Cytoscape, so you can visualize the networks with ease.
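As an illustration of what a Cytoscape-ready export can look like, the sketch below writes relationships in the Simple Interaction Format (SIF), a plain-text network format that Cytoscape can import. Whether XTractor Premium exports SIF specifically is not stated here, so treat the format choice as an assumption; the helper `write_sif` and the relationships are hypothetical placeholders.

```python
import io

# Hypothetical relationships: (source entity, interaction type, target entity).
relationships = [
    ("drug X", "associated_with", "protein P"),
    ("protein P", "associated_with", "disease D"),
]


def write_sif(relationships, handle):
    """Write one tab-delimited 'source<TAB>interaction<TAB>target' line per edge."""
    for source, interaction, target in relationships:
        # Tabs are used as delimiters so that entity names may contain spaces.
        handle.write(f"{source}\t{interaction}\t{target}\n")


buffer = io.StringIO()
write_sif(relationships, buffer)
print(buffer.getvalue(), end="")
```

Passing an open file instead of the `io.StringIO` buffer would produce a `.sif` file that can be loaded as a network.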
Updates made through web services.
Web based and onsite installation available.
April 3, 2009
XTractor Premium Utilities
April 3, 2009
Data mining PubMed manually at a very low cost
We began this service, called XTractor, in July 2008, and this month we have come out with the premium version of the product.
XTractor provides the latest manually extracted scientific facts on drug discovery related aspects from PubMed within 10 days of publication.
This version provides more inbuilt analytical features and access to the complete knowledgebase of 100000 scientific facts, which is manually annotated and updated from PubMed every single day. More than 600 annotated sentences are added to XTractor every single day, so in no time you will be accessing one of the largest biomedical databases.
We are the only literature mining service across the globe that provides manually annotated data from PubMed every day for as little as $450 a year. We also give you one year of data updates absolutely free.
So subscribe to XTractor Premium and save $300 right now.
For more information visit: www.xtractor.in/premium
For trial access: http://www.xtractor.in/premium/trial.do
March 19, 2009
XTractor Premium now at Introductory Offer of $450 for Academic Researchers. Subscribe now and save $300!
XTractor Premium: the only knowledgebase that provides accurate, manually annotated scientific relationships, updated from PubMed on a daily basis.
Subscribe Now and Get…
Information on Disease mechanisms, Biomarkers, Drug effects, Clinical trials, Prognosis, Pathways, Knockouts, Mutations, RNAi studies, Protein studies and much more…
+
Access to the XTractor Premium Knowledgebase of more than 100,000 relationships from PubMed
+
Database updates for one complete year from the date of purchase absolutely FREE!!
+
All Scientific Facts manually annotated by our expert annotators every day
XTractor is currently being used by more than 2000 users from over 300 organizations including USFDA, NIH/NCI, MD Anderson, Scripps, Harvard Medical, Max Planck, etc.
March 17, 2009
Get manually annotated PubMed abstracts every day
Highlights: www.xtractor.in/premium

- Find information extracted by our expert scientists from PubMed, including information on Proteins, Drugs, Biological and Disease processes
- Use classified Literature findings to support experimental studies on Biomarkers, Clinical trials, Drug interactions, Pathways, Disease mechanisms and much more
- Form new hypotheses using the latest relationships from PubMed, available within 10 days of publication
- Cut short your discovery and analysis time through innovative search features
- Track organizations and their hot areas of research
Drug Discovery Researcher:
XTractor Premium can help you:
- Find closely associated Proteins, Drugs, Biological Processes and Diseases
- Sort your facts with our manually categorised sentences on Biomarkers, drug studies, pathway studies, knockout studies, mutation studies and many more
- Obtain highly accurate data every day for your research
- Discover new relationships and frame hypotheses
- Track competitors and organizations working on similar or related areas
- Import the data into your in-house applications with ease
Academic researcher:
XTractor Premium can help you with:
- Quick collection of facts required for setting up your experiments
- Enhanced review writing with our manually annotated sentences
- Regular updates on your areas of interest
- Sorting of the most relevant abstracts with ease
- Customised downloadable reports for your favorite biological and chemical entities
- Time savings on PubMed searches
March 17, 2009