The right to read is the right to mine!

The ContentMine uses machines to liberate 100 000 000 facts from the scientific literature.

Blog

Figure 3 from Nakai et al. (2014), doi: 10.1099/ijs.0.060798-0, CC-BY

Mining Images for Identifiers

Figure images in scholarly articles typically contain a wealth of interesting data. Among the textual data, identifiers such as GenBank / ENA accession numbers are relatively common in certain types of figure. Take, for example, this figure reproduced from an Open Access article in IJSEM: Most tip labels of the …
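As a rough illustration of the idea, the sketch below scans a piece of text (for example, OCR output from a figure image) for strings shaped like GenBank / ENA nucleotide accession numbers. The pattern is a deliberately simplified assumption (one letter plus five digits, or two letters plus six digits). The function name, the sample string and the accession values in it are made up for illustration and do not come from the ContentMine tools or from the figure.

```python
import re

# Simplified, assumed pattern for nucleotide accession numbers:
# one uppercase letter + 5 digits, or two uppercase letters + 6 digits.
ACCESSION_RE = re.compile(r"\b(?:[A-Z]\d{5}|[A-Z]{2}\d{6})\b")


def find_accessions(text):
    """Return accession-like strings in the order they appear in text."""
    return ACCESSION_RE.findall(text)


# Hypothetical OCR output from a phylogenetic tree figure (placeholder values).
ocr_text = "Strain A (AB123456) Strain B (X12345) 0.01"
print(find_accessions(ocr_text))
```

A pattern this loose will also match other codes of the same shape, so in practice any hits would need checking against the database itself.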

The ContentMine Workshop at Bath

On Tuesday 28th July, the ContentMine team held an introductory content mining workshop for biologists at the University of Bath. Because the course was advertised primarily online, we had signups from all over the world, notably including Sri Lanka and Jordan. So I’m glad all the content we went over is captured …

xml

Examining variation in XML section tagging

This post is a brief overview, with examples, of the variation one can encounter in the section tagging of full-text academic journal articles in XML. The ContentMine aims to provide tools that can optionally mine academic articles by section, e.g. abstract, introduction, materials and methods, results, conclusion, acknowledgements, and references. The process of …
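To make the kind of variation concrete, here is a minimal sketch that lists how each section of an article is labelled. It assumes JATS-style markup, where a section is a `sec` element with an optional `sec-type` attribute and a `title` child. The sample XML and the function name are hypothetical and are not taken from the post or from the ContentMine tools.

```python
import xml.etree.ElementTree as ET

# Hypothetical article fragment: one section carries no sec-type attribute,
# and another has a numbered title.
SAMPLE = """<article><body>
<sec sec-type="intro"><title>Introduction</title><p>Background.</p></sec>
<sec><title>Materials and Methods</title><p>We sampled.</p></sec>
<sec sec-type="results"><title>3. Results</title><p>Findings.</p></sec>
</body></article>"""


def section_labels(xml_text):
    """Return (sec-type, title) for every <sec> element, in document order."""
    root = ET.fromstring(xml_text)
    labels = []
    for sec in root.iter("sec"):
        sec_type = sec.get("sec-type")  # None when the attribute is absent
        title = sec.findtext("title", default="").strip()
        labels.append((sec_type, title))
    return labels


for sec_type, title in section_labels(SAMPLE):
    print(sec_type, "|", title)
```

Because the attribute may be missing and titles vary in wording and numbering, a tool that selects sections would probably need to fall back on matching the title text.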