XSLT to extract taxon names from taXMLit file

About this site

This site is moderated by agw96 on behalf of the contributors who retain copyright.

Content can be used in accordance with a CC Licence.

This site uses Drupal and is based on a set of templates and modules defined by the Scratchpad team at the Natural History Museum, London.

The attached XSLTs let you recover taxon names from a file marked up in taXMLit.

In taXMLit, several elements contain taxonomic material, as explained on the taXMLit documentation page. These XSLTs work with the TaxonName element only.

The XSLTs are:

extractTaxonName - simply retrieves the content of all TaxonName nodes
extractTaxonNameSortUnique - retrieves the content of all TaxonName nodes, and presents the results in alphabetical order with duplicate names removed
extractTaxonNamePartOne - retrieves the explicitly cited taxon names into an XML file
extractTaxonNamePartTwo - uses the XML file from PartOne, removes duplicates and sorts the taxon names to produce a text file.
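
As an illustration of the sort-and-deduplicate idea, here is a minimal XSLT 1.0 sketch. It is not the attached extractTaxonNameSortUnique file. It assumes that TaxonName elements are in no namespace and that two names count as duplicates when their whitespace-normalised text is identical. The key name nameByText is made up for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Plain-text output, one taxon name per line -->
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Index every TaxonName by its whitespace-normalised text.
       Assumption: TaxonName is in no namespace. -->
  <xsl:key name="nameByText" match="TaxonName" use="normalize-space(.)"/>

  <xsl:template match="/">
    <!-- Keep only the first TaxonName of each group of identical names
         (Muenchian grouping), skipping empty elements -->
    <xsl:for-each select="//TaxonName[normalize-space(.)]
        [generate-id() = generate-id(key('nameByText', normalize-space(.))[1])]">
      <!-- Alphabetical order on the normalised text -->
      <xsl:sort select="normalize-space(.)"/>
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

A single pass like this works in any XSLT 1.0 processor. The PartOne and PartTwo pair reaches a similar result in two steps, with an intermediate XML file that can be inspected in between.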

The output from each XSLT is attached as a result_ file.

Note:

the XSLTs remove superfluous whitespace and indentation present in the XML source. This step can be commented out without changing which text content is retrieved
the XML and XSLT files have a .txt extension because they cannot be attached to this post otherwise. To use the XSLTs, change their extension to .xslt.
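
For readers wondering what such a whitespace step can look like, the sketch below shows two common XSLT 1.0 mechanisms. Whether the attached files use these exact instructions is an assumption. As before, TaxonName is assumed to be in no namespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Drops whitespace-only text nodes (typically indentation) from the
       source tree. Comment this element out to keep them. -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <xsl:for-each select="//TaxonName">
      <!-- normalize-space() trims leading and trailing whitespace and
           collapses internal runs to a single space. Use select="."
           instead to keep the original spacing. -->
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Either mechanism only changes spacing. The same TaxonName nodes are selected with or without it.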

Attachment	Size
extractTaxonNamePartOne.txt	806 bytes
extractTaxonNamePartTwo.txt	769 bytes
result_extractTaxonNamePartOne.txt	84.42 KB
result_extractTaxonNamePartTwo.txt	17.25 KB
extractTaxonName.txt	676 bytes
result_extractTaxonName.txt	64.11 KB
extractTaxonNameSortUnique.txt	824 bytes
result_extractTaxonNameSortUnique.txt	22.42 KB

Comments

Thu, 07/09/2009 - 20:26 — Dauvit

And in PHP/DOM too

Should you wish to avoid the mysteries of XPath statements, this PHP script will also extract the TaxonName nodes. It handles the XML file through the document object model, so the file can be manipulated using the same techniques as if the source document were a web page. Note that this is a quick and dirty example: I have hard-coded the input file Gold_BCA.xml within the script:

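<?php
// Load the XML file into a DOM tree (the input file name is hard-coded).
$dom = new DOMDocument();
$dom->load('Gold_BCA.xml');

echo "Taxon names in Gold_BCA.xml are:\n";

// Collect every TaxonName element, wherever it occurs in the document.
$taxonNames = $dom->getElementsByTagName('TaxonName');
foreach ($taxonNames as $node) {
    // textContent is the element's text, including that of any child elements.
    echo $node->textContent . "\n";
}
?>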
