Identifies paragraph types based on keywords.
Run apply_paragraph_types.php against the *_abbyy_to_tei_by_xmlreader.xml files.
Keywords are defined in a separate file, paragraph_keywords.php, to make them easier to change.
The script inserts a comment before each paragraph giving its proposed paragraph type.
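The keyword matching and comment insertion described above could look roughly like the sketch below. It is not the attached script: the keyword map, the function names, the comment wording and the use of DOMDocument are all assumptions made for illustration, and the real keyword list is the one in paragraph_keywords.php.

```php
<?php
// Hypothetical keyword map: paragraph type => keywords that suggest it.
// The real list lives in paragraph_keywords.php and will differ.
$paragraphKeywords = [
    'distribution' => ['distribution', 'locality', 'localities'],
    'description'  => ['holotype', 'length', 'colour'],
];

// Return the first paragraph type whose keywords occur in the text
// (case-insensitive), or null when nothing matches.
function proposeParagraphType(string $text, array $keywords): ?string
{
    foreach ($keywords as $type => $words) {
        foreach ($words as $word) {
            if (stripos($text, $word) !== false) {
                return $type;
            }
        }
    }
    return null;
}

// Insert an XML comment with the proposed type before each matching <p>.
function annotateParagraphs(string $xml, array $keywords): string
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    // Copy the live node list to an array before changing the tree.
    $paragraphs = iterator_to_array($doc->getElementsByTagName('p'));

    foreach ($paragraphs as $p) {
        $type = proposeParagraphType($p->textContent, $keywords);
        if ($type !== null) {
            // Assumed comment format; the attached script may word it differently.
            $comment = $doc->createComment(" paragraph type: $type ");
            $p->parentNode->insertBefore($comment, $p);
        }
    }
    return $doc->saveXML();
}
```

Keeping the keyword map in its own array (or file) means the matching code does not need to change when keywords are added or removed.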
All eleven exemplar documents have been marked up using the preferred comment approach (*_tei_annotated.xml).
One exemplar document, bulletinofbritis51entolond, has also been marked up using the TEI <div> approach (*_tei_annotated_with_div.xml). (See Mark up options and apply_paragraph_types_v1-1.php below.)
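For comparison, a minimal sketch of what the <div> approach might involve: wrapping a paragraph in a <div> whose type attribute carries the proposed paragraph type. The helper name wrapParagraphInDiv, the sample XML and the choice of the type attribute are assumptions for illustration; apply_paragraph_types_v1-1.php may do this differently.

```php
<?php
// Hypothetical helper: wrap a <p> element in <div type="..."> in place.
function wrapParagraphInDiv(DOMElement $p, string $type): DOMElement
{
    $div = $p->ownerDocument->createElement('div');
    $div->setAttribute('type', $type);

    // Put the new <div> where the paragraph was, then move the
    // (now detached) paragraph inside it.
    $p->parentNode->replaceChild($div, $p);
    $div->appendChild($p);

    return $div;
}

// Small made-up input, not taken from the exemplar documents.
$doc = new DOMDocument();
$doc->loadXML('<body><p>Distribution: Kenya.</p></body>');

$p = $doc->getElementsByTagName('p')->item(0);
wrapParagraphInDiv($p, 'distribution');

echo $doc->saveXML($doc->documentElement), "\n";
```

Unlike the comment approach, this changes the element structure of the document, so anything that relies on paragraphs being direct children of their original parent would need to be checked.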
Attachment | Size |
---|---|
apply_paragraph_types.php_.txt [1] | 1.68 KB |
paragraph_keywords.php_.txt [2] | 3.71 KB |
apply_paragraph_types_v1-1.php_.txt [3] | 2.45 KB |
Links:
[1] http://able.myspecies.info/sites/able.myspecies.info/files/apply_paragraph_types.php_.txt
[2] http://able.myspecies.info/sites/able.myspecies.info/files/paragraph_keywords.php_.txt
[3] http://able.myspecies.info/sites/able.myspecies.info/files/apply_paragraph_types_v1-1.php_.txt