Oscar3

Oscar3 is used in Sciborg. According to the tool’s web site:

Oscar3 is a tool for shallow, chemistry-specific parsing of chemical documents. It identifies (or attempts to identify):

  • Chemical names: singular nouns, plurals, verbs etc., also formulae and acronyms, some enzymes and reaction names.
  • Ontology terms: if you can do it by string-matching, you can get OSCAR to do it.
  • Chemical data: Spectra, melting/boiling point, yield etc. in experimental sections.

The second bullet is a problem as far as ABLE is concerned, because we cannot identify terms by string-matching in the absence of a comprehensive taxonomic database to match against.
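To make the limitation concrete, here is a minimal Python sketch of dictionary-based string matching. It is not Oscar3's code or API; the `LEXICON` contents, the `find_terms` function, and the sample sentence are all hypothetical and chosen only for illustration. It shows that a name absent from the lexicon is simply never found.

```python
import re

# Hypothetical mini-lexicon (an assumption for illustration only).
# A real system would need a comprehensive taxonomic database here.
LEXICON = {"Homo sapiens", "Apis mellifera", "Quercus robur"}


def find_terms(text, lexicon):
    """Return (term, start_offset) pairs for each lexicon entry found in text."""
    hits = []
    for term in lexicon:
        # Escape the term and require word boundaries so partial words do not match.
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text):
            hits.append((term, match.start()))
    # Sort by position so results follow the order of the text.
    return sorted(hits, key=lambda hit: hit[1])


text = "Workers of Apis mellifera were observed near Bombus terrestris nests."
found = find_terms(text, LEXICON)
print(found)
```

In this sketch "Apis mellifera" is matched because it is in the lexicon, while "Bombus terrestris" is missed even though it is just as valid a name. Coverage is bounded by the lexicon, which is the difficulty described above.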

Oscar3 has additional tools to support the enhancement and maintenance of its own dictionary: “online management of a chemical/stopword lexicon”, as well as support for the “manual editing of SciXML fragments containing named entities, for creating of gold standards and training data.”

This tool therefore has the potential to form the basis of a follow-on project that reworks it to support taxonomic parsing.
