Allow taxonomies and lists as root elements.
TomazErjavec committed Oct 5, 2024
1 parent e53fab7 commit 4bbd8dd
Showing 7 changed files with 16 additions and 9 deletions.
2 changes: 1 addition & 1 deletion TEI/ParlaMint-schemaSpecs.editing.odd.xml
@@ -6,7 +6,7 @@
xmlns:rng="http://relaxng.org/ns/structure/1.0"
xmlns:sch="http://purl.oclc.org/dsdl/schematron"
ident="parlamint"
-start="TEI teiCorpus"
+start="teiCorpus TEI listPerson listOrg taxonomy"
prefix="tei_"
docLang="en"
xml:lang="en">
2 changes: 1 addition & 1 deletion TEI/ParlaMint-schemaSpecs.odd.xml
@@ -6,7 +6,7 @@
xmlns:rng="http://relaxng.org/ns/structure/1.0"
xmlns:sch="http://purl.oclc.org/dsdl/schematron"
ident="parlamint"
-start="TEI teiCorpus"
+start="teiCorpus TEI listPerson listOrg taxonomy"
prefix="tei_"
docLang="en"
xml:lang="en">
5 changes: 3 additions & 2 deletions TEI/ParlaMint.odd.rnc
@@ -11,7 +11,7 @@ namespace xi = "http://www.w3.org/2001/XInclude"
namespace xlink = "http://www.w3.org/1999/xlink"
namespace xsl = "http://www.w3.org/1999/XSL/Transform"

-# Schema generated from ODD source 2024-05-14T10:06:52Z. 2024-05-14.
+# Schema generated from ODD source 2024-10-05T08:48:34Z. 2024-05-14.
# TEI Edition: Version 4.6.0a. Last updated on
# 5th January 2023, revision 9074b9038
# TEI Edition Location: https://www.tei-c.org/Vault/P5/Version 4.6.0a./
@@ -3198,4 +3198,5 @@ tei_include =
}?,
empty
}
-start = tei_TEI | tei_teiCorpus
+start =
+tei_teiCorpus | tei_TEI | tei_listPerson | tei_listOrg | tei_taxonomy
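
The regenerated `start` pattern in this hunk lines up with the ODD `start` attribute and the `prefix="tei_"` setting changed in the two `schemaSpec` files. A minimal Python sketch of that correspondence follows; `start_pattern` is a hypothetical helper written for illustration and is not how the TEI Stylesheets actually generate the schema.

```python
def start_pattern(start_attr: str, prefix: str) -> str:
    """Build a RELAX NG compact `start` line from an ODD start list.

    Illustrative only: splits the whitespace-separated element names and
    joins the prefixed names into a choice.
    """
    names = start_attr.split()
    return "start = " + " | ".join(prefix + name for name in names)


# Values taken from the old and new schemaSpec attributes in this commit.
old_start = start_pattern("TEI teiCorpus", "tei_")
new_start = start_pattern("teiCorpus TEI listPerson listOrg taxonomy", "tei_")

print(old_start)
print(new_start)
```

Under this reading, the three new names in the ODD list are what add `tei_listPerson`, `tei_listOrg` and `tei_taxonomy` to the choice of permitted root elements.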
7 changes: 5 additions & 2 deletions TEI/ParlaMint.odd.rng
@@ -5,7 +5,7 @@
xmlns:xlink="http://www.w3.org/1999/xlink"
datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
ns="http://www.tei-c.org/ns/1.0"><!--
-Schema generated from ODD source 2024-05-14T10:06:48Z. 2024-05-14.
+Schema generated from ODD source 2024-10-05T08:48:29Z. 2024-05-14.
TEI Edition: Version 4.6.0a. Last updated on
5th January 2023, revision 9074b9038
TEI Edition Location: https://www.tei-c.org/Vault/P5/Version 4.6.0a./
@@ -4650,8 +4650,11 @@ On <name/>, either the @marks attribute should be used, or a paragraph of descri
</define>
<start>
<choice>
-<ref name="tei_TEI"/>
<ref name="tei_teiCorpus"/>
+<ref name="tei_TEI"/>
+<ref name="tei_listPerson"/>
+<ref name="tei_listOrg"/>
+<ref name="tei_taxonomy"/>
</choice>
</start>
</grammar>
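
To see what the widened `<start>` choice means for a given file, one can read the referenced pattern names out of the grammar and compare them with the document's root element. The sketch below assumes that the grammar is in the standard RELAX NG namespace and that a pattern named `tei_X` defines the element `X`; the embedded fragment and the helper names `start_refs` and `root_allowed` are illustrative, not part of the repository.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"

# Cut-down stand-in for the end of ParlaMint.odd.rng (assumed namespace).
RNG_FRAGMENT = """<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <choice>
      <ref name="tei_teiCorpus"/>
      <ref name="tei_TEI"/>
      <ref name="tei_listPerson"/>
      <ref name="tei_listOrg"/>
      <ref name="tei_taxonomy"/>
    </choice>
  </start>
</grammar>"""


def start_refs(rng_text):
    """Return the pattern names referenced inside <start>, in document order."""
    grammar = ET.fromstring(rng_text)
    start = grammar.find(f"{{{RNG_NS}}}start")
    if start is None:
        return []
    return [ref.get("name") for ref in start.iter(f"{{{RNG_NS}}}ref")]


def root_allowed(doc_text, refs, prefix="tei_"):
    """Check whether the document's root element is one of the start patterns.

    Assumes pattern `prefix + X` defines element `X`; this checks the root
    name only and is no substitute for real RELAX NG validation.
    """
    root_tag = ET.fromstring(doc_text).tag
    local_name = root_tag.rsplit("}", 1)[-1]  # drop the "{namespace}" part
    return prefix + local_name in refs


refs = start_refs(RNG_FRAGMENT)
print(refs)
print(root_allowed('<taxonomy xmlns="http://www.tei-c.org/ns/1.0"/>', refs))
```

With the previous two-entry choice, a stand-alone `<taxonomy>`, `<listPerson>` or `<listOrg>` file would have failed this root check.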
2 changes: 1 addition & 1 deletion TEI/ParlaMint.odd.sch
@@ -1,7 +1,7 @@
<?xml version="1.0" encoding="utf-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
<title>ISO Schematron rules</title>
-<!-- This file generated 2024-05-14T10:06:56Z by 'extract-isosch.xsl'. -->
+<!-- This file generated 2024-10-05T08:48:37Z by 'extract-isosch.xsl'. -->
<!-- ********************* -->
<!-- namespaces, declared: -->
<!-- ********************* -->
5 changes: 4 additions & 1 deletion TEI/validate.pl
@@ -24,7 +24,10 @@
else {
die "First parameter must be 'samples' or 'master'\n"
}
-$black = '(taxonomy|list)';
+# Skip validation of taxonomies, personList, orgList:
+#$black = '(taxonomy|list)';
+# Validate all files:
+$black = 'NULL';

$inDir = shift;
unless (-d $inDir) {
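
The hunk does not show how `$black` is consumed, so the following is only a sketch under the assumption that `validate.pl` skips every file whose name matches that regular expression. The Python function `files_to_validate` and the file names are hypothetical stand-ins for illustration.

```python
import re


def files_to_validate(filenames, black):
    """Return the files that are NOT matched by the skip pattern `black`.

    Assumption: the pattern is searched anywhere in the file name.
    """
    skip = re.compile(black)
    return [name for name in filenames if not skip.search(name)]


# Hypothetical file names in the style of a ParlaMint corpus directory.
files = [
    "ParlaMint-XX.xml",
    "ParlaMint-XX-listPerson.xml",
    "ParlaMint-XX-listOrg.xml",
    "ParlaMint-taxonomy-example.xml",
]

before = files_to_validate(files, "(taxonomy|list)")  # old setting
after = files_to_validate(files, "NULL")              # new setting

print(before)
print(after)
```

Note that under this assumption `'NULL'` is not a special value: it is an ordinary pattern that excludes nothing only as long as no file name contains the literal string `NULL`.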
2 changes: 1 addition & 1 deletion docs/index.html
@@ -4436,6 +4436,6 @@
tei_teidata.xpath = text<a href="#index.xml-eg-d33e35142" class="anchorlink">⚓</a></pre></td></tr><tr><td class="wovenodd-col1"><span xml:lang="en" lang="en" class="label">Note</span></td><td class="wovenodd-col2"><p>Any XPath expression using the syntax defined in <a class="link_ref" href="https://www.tei-c.org/release/doc/tei-p5-doc/en/html/BIB.html#XSLT2">6.2.</a>.</p><p>When writing programs that evaluate XPath expressions, programmers should be mindful of the possibility of malicious code injection attacks. For further information about XPath injection attacks, see the <a class="link_ref" href="https://owasp.org/www-community/attacks/XPATH_Injection">article at OWASP</a>.</p></td></tr></table></div></div></div></section></div><!--Notes in [TEI]--><div class="notes"><div class="noteHeading">Notes</div><div class="note" id="Note1"><span class="noteLabel">1 </span><div class="noteBody">Note that this is a illustrative example, i.e. a valid ParlaMint corpus would also need certain attributes to be defined on the illustrated elements. This holds for all the examples in this section.</div> <a class="link_return" title="Go back to text" href="#Note1_return">↵</a></div><div class="note" id="Note2"><span class="noteLabel">2 </span><div class="noteBody">Note that parliaments also have unaffiliated (or independent) MPs, that can either belong to a special <span class="q">‘unaffiliated’</span> parliamentary group or don't belong to any parliamentary group. For the former, they are simply not affiliated to any parliamentary group. 
For the latter, an <span class="q">‘unaffiliated’</span> parlimentaryGroup organisation must be created, and such MPs are affiliated with it as members.</div> <a class="link_return" title="Go back to text" href="#Note2_return">↵</a></div><div class="note" id="Note3"><span class="noteLabel">3 </span><div class="noteBody">The ideal situation is that the organisation somebody is affiliated with is specificed as a organisation, using the <a class="link_ref" href="#TEI.org" title="&lt;org&gt;">&lt;org&gt;</a> element (cf. the Section on <a class="link_ref" href="#sec-orgs" title="Organisations">Organisations</a>) but if this is not the case, using <a class="link_ref" href="#TEI.orgName" title="&lt;orgName&gt;">&lt;orgName&gt;</a> directly in the <a class="link_ref" href="#TEI.affiliation" title="&lt;affiliation&gt;">&lt;affiliation&gt;</a> is an alternative encoding.</div> <a class="link_return" title="Go back to text" href="#Note3_return">↵</a></div><div class="note" id="Note4"><span class="noteLabel">4 </span><div class="noteBody">Note that, in general, the utterance can also be split in the middle of a sentence, which brings with it problems for automatic linguistic processing, as, ideally, the parts should be first joined, and only then processed.</div> <a class="link_return" title="Go back to text" href="#Note4_return">↵</a></div><div class="note" id="Note5"><span class="noteLabel">5 </span><div class="noteBody">These are typically tagset developed and used for specific languages and can be found in the XPOS column of CoNLL-U files, which is the native format for UD treebanks.</div> <a class="link_return" title="Go back to text" href="#Note5_return">↵</a></div><div class="note" id="Note6"><span class="noteLabel">6 </span><div class="noteBody">Note that the example is rendered in three lines, however, the correct encoding in the corpus is actually in a single line, without any spaces between the elements, as otherwise the new line and indenting spaces are actually a 
part of the word <span class="q">‘abyste’</span>.</div> <a class="link_return" title="Go back to text" href="#Note6_return">↵</a></div><div class="note" id="Note7"><span class="noteLabel">7 </span><div class="noteBody">Because <a class="link_ref" href="#TEI.name" title="&lt;name&gt;">&lt;name&gt;</a> and <a class="link_ref" href="#TEI.phr" title="&lt;phr&gt;">&lt;phr&gt;</a> can give conflicting markup (i.e. crossing tags) the current script annotates phrases only where they are not related to names, i.e. not only conflicting markup, but also nestings of phr/name and name/phr are forbidden and such MWEs are not retained in the XML. Furthermore, due to a bug in the script, phrases adjecent to names are also not retained. We hope to introduce a better script and encoding in the future.</div> <a class="link_return" title="Go back to text" href="#Note7_return">↵</a></div></div><div class="stdfooter autogenerated"><div class="footer"><!--standard links to project, institution etc--><a class="plain" href="https://github.com/clarin-eric/parla-clarin">Home</a> </div><address>Tomaž Erjavec, [email protected], Matyáš Kopp, [email protected] and Andrej Pančur, [email protected]. Date: 2024-05-14<br/><!--
Generated from index.xml using XSLT stylesheets version 7.55.0a
based on http://www.tei-c.org/Stylesheets/
-on 2024-05-14T10:07:38Z.
+on 2024-10-05T08:49:23Z.
SAXON HE 10.3.
--></address></div></body></html>
