The right to read is the right to mine!
The ContentMine uses machines to liberate 100 000 000 facts from the scientific literature.
Blog
Mining Images for Identifiers
Figure images in scholarly articles typically contain a wealth of interesting data. In terms of textual data, identifiers such as GenBank / ENA accession numbers can be relatively common in certain types of figure images. Take, for example, the figure below, reproduced from an Open Access article in IJSEM: Most tip labels of the …
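As a rough illustration of what spotting accession numbers in figure text could look like, here is a minimal sketch. It assumes the figure text has already been extracted (for example by OCR) into a plain string, and it only covers the common GenBank nucleotide accession shapes (one letter plus five digits, or two letters plus six or eight digits). The function name `find_accessions` and the sample string are hypothetical; this is not the ContentMine implementation.

```python
import re

# Common GenBank nucleotide accession shapes (assumption: other identifier
# families, such as protein or RefSeq accessions, are out of scope here).
ACCESSION_RE = re.compile(r"\b(?:[A-Z]\d{5}|[A-Z]{2}\d{6}|[A-Z]{2}\d{8})\b")


def find_accessions(text):
    """Return accession-like tokens found in text, in order of appearance."""
    return ACCESSION_RE.findall(text)


# Hypothetical tip labels, as they might come out of an OCR step.
sample = "Strain A (AB123456); Strain B U00096; Strain C (no accession)"
print(find_accessions(sample))
```

A pattern like this will also match unrelated tokens that happen to have the same shape, so in practice candidates would need checking against the database itself.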
The ContentMine Workshop at Bath
On Tuesday 28th July, the ContentMine team held an introductory content mining workshop for biologists at the University of Bath. With primarily internet-based advertising of the course, we had signups for the event from all over the world, including, notably, Sri Lanka and Jordan. So I’m glad all the content we went over is captured …
Examining variation in XML section tagging
This post is a brief overview, with examples, of the variation one can encounter in section tagging of academic journal article full text in XML. The ContentMine aims to provide tools with which to mine academic articles by section (as an option), e.g. abstract, introduction, materials and methods, results, conclusion, acknowledgements, and references. The process of …
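To make the kind of variation concrete, here is a small sketch under stated assumptions: it uses JATS-style `<sec>` elements, where some publishers label a section with a `sec-type` attribute and others rely on the `<title>` text alone. The sample XML and the function name `section_labels` are made up for illustration and are not taken from the post or from ContentMine's tools.

```python
import xml.etree.ElementTree as ET

# Hypothetical article body: one section tagged by attribute and title,
# one by title only, one with a differently worded attribute and title.
SAMPLE = """<body>
  <sec sec-type="methods"><title>Materials and Methods</title><p>text</p></sec>
  <sec><title>RESULTS</title><p>text</p></sec>
  <sec sec-type="conclusions"><title>Conclusion</title><p>text</p></sec>
</body>"""


def section_labels(xml_text):
    """Return one label per <sec>: the sec-type if present, else the lowercased title."""
    root = ET.fromstring(xml_text)
    labels = []
    for sec in root.iter("sec"):
        sec_type = sec.get("sec-type")
        title = sec.findtext("title", default="")
        labels.append(sec_type if sec_type else title.strip().lower())
    return labels


print(section_labels(SAMPLE))
```

Falling back from the attribute to the title is only one possible design choice; a real tool would likely also need a mapping from the many title wordings to a fixed set of section names.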

