Introduction to the ISO Standards Tag Library

This document describes ISOSTS (ISO Standards Tag Set), a tag set developed for the International Organization for Standardization (ISO) and based on the Journal Publishing Tag Set of the current draft of NISO Z39.96 - JATS: Journal Article Tag Suite. The tag set was created from the DTDs in the JATS supporting documentation provided by the National Library of Medicine, and all changes were implemented using the procedures recommended in the JATS version 0.4 Tag Library.

The metadata structures needed to describe standards are significantly different from those needed for journal articles, so the metadata in this tag set has been designed for ISO. Since the body of standards is very similar to that for most other technical materials, the body structures of the JATS have been adopted with very little change.

See below for lists of elements added to the JATS base tag set, or modified from it, as part of the ISOSTS customization.

Structure of this Tag Library

This Tag Library contains the following sections:

Introduction

This introduction to the Tag Library contents.

Elements

Descriptions of all of the elements used in the tag set. Elements are nouns, like “Standard” and “International Classification for Standards”, that represent components of ISO Standards documents.

The element descriptions are listed in alphabetical order by tag name. Note: Each element has two names: a “tag name” (sometimes called an element-type name) that is used in the DTDs and by the software, and an “element name” (usually longer) that provides a fuller name for the benefit of readers. For example, the tag name <ics> corresponds to the element name International Classification for Standards, and the tag name <proj-id> corresponds to the element name ISO Project Identifier.

Attributes

Descriptions of all of the attributes in the DTDs. Attributes are facts about an element, such as what type of information (e.g., foreword, symbols, or tests) is contained in a <sec> tag. Like elements, attributes have two names: a shorter machine-readable one and a (usually longer) human-readable one. Attributes are listed in order by the shorter machine-readable name, for example under “@valign” rather than under “Vertical Alignment (XHTML table model)”.

Context Table

Listing of where an element may occur. The Context Table is formatted in two columns. The first column lists an element, showing its tag name (its descriptive element name is available by hovering over the tag name). The second column lists the tags of all the elements which may contain the listed element (with element names available by hovering over the tag names).

For example, if the first column contains “<doc-ident>” and the second column contains “<iso-meta>”, this means that the Document Identification Section element may only be used inside an ISO Metadata element. Many elements may be used inside more than one other element. For example, the <prefix> element may be used inside the <name>, <speaker>, and <string-name> elements.

Note: This Context Table (which lists where an element may be used) is the reverse of the Contents section that is within each element description (which lists the content of an element, that is, what can be inside the named element).
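The relationship between the Contents sections and the Context Table can be illustrated with a minimal sketch. It assumes a hypothetical `contents` mapping (parent tag to the tags it may contain) that holds only the relationships named in the examples above; the mapping and the function name `build_context_table` are illustrative and are not part of the tag set or its tooling.

```python
from collections import defaultdict

# Hypothetical "Contents" data: parent tag -> tags it may contain.
# Only the relationships mentioned in the text above are listed here.
contents = {
    "iso-meta": ["doc-ident"],
    "name": ["prefix"],
    "speaker": ["prefix"],
    "string-name": ["prefix"],
}


def build_context_table(contents):
    """Invert parent -> children into child -> sorted list of parents."""
    context = defaultdict(set)
    for parent, children in contents.items():
        for child in children:
            context[child].add(parent)
    return {child: sorted(parents) for child, parents in context.items()}


context_table = build_context_table(contents)

# First column: the element; second column: the elements that may contain it.
for child, parents in sorted(context_table.items()):
    print(f"<{child}> may occur in: " + ", ".join(f"<{p}>" for p in parents))
```

Each row printed corresponds to one row of a Context Table: the listed element, followed by every element in whose content it may appear.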

Document Hierarchy Diagrams

Graphical representations of portions of the hierarchies defined in the DTDs.

Index

Where to find elements, tags, and terms used in this Tag Library. Includes synonyms (terms not used in this Tag Set) that direct the reader to elements used in this Tag Library, for example, “technical page count” is paired with <page-count>.

Typographic Conventions

<nat-meta>: The tag name of an element. Almost always written in lower case with the entire name surrounded by “< >”.
National-body Metadata: Descriptive name of an element or an attribute. Written in title case (important words capitalized) in italics with the words separated by spaces.
must not: Emphasis added to stress a point.

Origin and design principles

This tag set, ISOSTS, was originally developed in 2011 by the ISO Central Secretariat for use by ISO and other standards bodies, both in internal processes and for the exchange of standards documents among standards organizations. ISOSTS is based on the current draft of NISO Z39.96 - JATS: Journal Article Tag Suite (which in turn is based on the tag set developed at the U.S. National Library of Medicine).

Wherever possible, the ISOSTS tag set has been closely aligned with JATS, and changes to the inventory of tags and to their definitions have been kept to a minimum. This simplifies development, makes maintenance of the tag set easier, and enables the straightforward reuse of tools set up to work with JATS. In consequence, the vocabulary includes some tags for textual structures which are rare, or perhaps even non-existent, in standards documents. The effort required to remove them entirely from the tag set was deemed out of proportion to the benefit of doing so, especially in view of the small but not negligible likelihood that some of these structures might in fact occur in a few of the documents that will, over time, need to be encoded using this tag set.

Specific goals of the development project include making possible the straightforward production of typeset output and the production of XML that can usefully be shared with ISO member bodies and reused.

The metadata fields in the <iso-meta> element include all the information needed to produce typeset versions of the document.

Among the several tag sets defined in the suite, the Journal Publishing Tag Set was chosen as the base of ISOSTS largely because it has better support for multilingual documents than the most plausible alternatives. In ISOSTS, multilingual documents are handled by having structures in the XML which can repeat and which bear language information (typically the @xml:lang attribute). Multilingual documents will typically be encoded with the text of a section or paragraph given first in one language and then in the other; ISOSTS does not prescribe whether to interleave the different languages section by section or paragraph by paragraph. For the special case of terminological data, the <tbx:termEntry> element can contain information about synonymous terms in several languages; see the accompanying documentation on the TBX tag set.
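As a rough illustration of repeating structures that carry language information, the sketch below interleaves two languages paragraph by paragraph inside a single <sec> and then groups the paragraphs by their @xml:lang value. The sample text, the choice of paragraph-level interleaving, the fallback value "und", and the function name `paragraphs_by_language` are assumptions made for this example; the fragment has not been validated against the ISOSTS DTDs.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Illustrative fragment only: paragraphs alternate between two languages.
sample = """
<sec>
  <p xml:lang="en">This document specifies test methods.</p>
  <p xml:lang="fr">Le présent document spécifie des méthodes d'essai.</p>
  <p xml:lang="en">It applies to all product types.</p>
  <p xml:lang="fr">Il s'applique à tous les types de produits.</p>
</sec>
"""


def paragraphs_by_language(xml_text):
    """Group the text of every <p> by its xml:lang value, in document order."""
    root = ET.fromstring(xml_text)
    grouped = {}
    for p in root.iter("p"):
        lang = p.get(XML_LANG, "und")  # "und" is an assumed fallback label
        grouped.setdefault(lang, []).append("".join(p.itertext()))
    return grouped


grouped = paragraphs_by_language(sample)
for lang, texts in grouped.items():
    print(lang, len(texts))
```

The same grouping would apply if the languages were interleaved section by section, with @xml:lang placed on the repeated <sec> elements instead of on each paragraph.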

The metadata section of the JATS Journal Publishing Tag Set has been completely suppressed and replaced with three distinct metadata blocks designed to hold ISO metadata (<iso-meta>), CEN metadata (<cen-meta>), or national-body metadata (<nat-meta>). In the current version of ISOSTS, the <nat-meta> element is a stub, containing only one or more occurrences of the <custom-meta-group> element. National bodies making use of this tag set may use the <nat-meta> as defined, or may customize the tag set by supplying their own definitions for the element, with more specific and more tightly constrained structures for their metadata.
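A national body using the stub <nat-meta> as defined might record its metadata as name and value pairs. The sketch below assumes that <custom-meta-group> keeps its JATS content model (<custom-meta> elements holding <meta-name> and <meta-value>); the metadata names and values shown, and the function name `read_nat_meta`, are invented for illustration only.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment: the names and values are hypothetical, not prescribed.
sample = """
<nat-meta>
  <custom-meta-group>
    <custom-meta>
      <meta-name>national-reference</meta-name>
      <meta-value>EXAMPLE 12345</meta-value>
    </custom-meta>
    <custom-meta>
      <meta-name>adoption-status</meta-name>
      <meta-value>identical</meta-value>
    </custom-meta>
  </custom-meta-group>
</nat-meta>
"""


def read_nat_meta(xml_text):
    """Return the custom metadata in <nat-meta> as a name -> value dict."""
    root = ET.fromstring(xml_text)
    pairs = {}
    for item in root.iter("custom-meta"):
        name = item.findtext("meta-name", default="").strip()
        value = item.findtext("meta-value", default="").strip()
        pairs[name] = value
    return pairs


nat_meta = read_nat_meta(sample)
for name, value in nat_meta.items():
    print(f"{name}: {value}")
```

A national body that supplies its own definition of <nat-meta> would replace these generic pairs with more specific, more tightly constrained elements.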

Like the JATS Journal Publishing Tag Set, ISOSTS contains a number of rendering-oriented elements (e.g. <italic>, <bold>, or <sans-serif>). When the specific rendering of the text in a document has a specific meaning, and it is possible to mark the document up using semantically richer elements to capture that meaning, then it is wise to use the semantically richer markup and avoid the rendering-oriented tags. Doing so will tend to produce documents which are more easily reused. The rendering elements in the tag set are provided for use in cases where semantically richer markup is not feasible or where the vocabulary lacks elements with the appropriate semantics.

Like the JATS Journal Publishing Tag Set, ISOSTS contains standard definitions for a large number of special characters. The entities included are those in standard predefined entity sets; in practice the standard entity sets are constructed to cover the special characters most frequently needed, but no claim is made that they are complete in any particularly demanding sense. Note that there is no requirement that entity references be used instead of native Unicode characters or numeric character references, nor is there a requirement that standard entity references be used instead of entities defined in the document itself. If native Unicode is more convenient for users of a document production system using this tag set, the users should by all means employ native Unicode encoding. The entity declarations are provided as a convenience, especially for people creating XML documents using older systems, not as a suggestion that all users of ISOSTS should use entity references in preference to native Unicode characters.
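The equivalence of these encodings can be checked with a small sketch. It parses the same paragraph three times: with a native Unicode character, with a numeric character reference, and with an entity declared in the document's own internal subset. The <p> content and the locally declared entity name `micro` are illustrative assumptions; the ISOSTS entity declarations themselves are not loaded here.

```python
import xml.etree.ElementTree as ET

# 1. Native Unicode character (U+00B5 MICRO SIGN).
native = ET.fromstring("<p>5 \u00b5m</p>")

# 2. Numeric character reference for the same code point.
numeric = ET.fromstring("<p>5 &#xB5;m</p>")

# 3. Entity declared in the document itself (internal DTD subset).
local = ET.fromstring(
    '<!DOCTYPE p [<!ENTITY micro "&#xB5;">]><p>5 &micro;m</p>'
)

# After parsing, all three forms carry identical character data.
all_equal = native.text == numeric.text == local.text
print(all_equal)
```

Because the parsed text is the same in every case, the choice between entity references, numeric character references, and native Unicode is a matter of convenience for the production system, as stated above.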

Relation of this tag set to JATS

New elements

A number of elements have been added to the JATS tag set as part of the ISOSTS customization.

The <standard> element is the root element for documents encoded using the ISOSTS tag set.

Three new elements are intended for markup of terminological information. The <tbx:termEntry> and <term-display> elements are used to encode terminology defined in standards: <tbx:termEntry> uses a namespace-qualified version of the TermBase eXchange vocabulary (TBX) defined by ISO 30042, while <term-display> is used to provide a simple visually oriented encoding of the term entry, for cases where the correct formatted representation is difficult to generate from the TBX entry (or for cases where the TBX entry is difficult to generate from the form of the document provided by the authors). The <term-sec> element is a specialized form of <sec> which is modified to allow terminological data.

The <annex-type> element is used in <app> elements to distinguish normative from non-normative annexes.

The <non-normative-note> and <non-normative-example> elements are used to tag non-normative material integrated into the text of the standard.

Three new elements are metadata containers for metadata relevant for the purposes of ISO, CEN, or national bodies: <iso-meta>, <cen-meta>, <nat-meta>.

Most of the new elements are for specific pieces of metadata and are included in the <iso-meta> element: <comm-ref>, <compl>, <content-language>, <doc-ident>, <doc-number>, <doc-ref>, <doc-type>, <full>, <ics>, <intro>, <is-proof>, <language>, <main>, <originator>, <part-number>, <proj-id>, <release-date>, <release-version>, <sdo>, <secretariat>, <std-ident>, <std-ref>, <suppl-number>, <suppl-type>, <suppl-version>, <title-wrap>, <urn>, and <version>.

Modifications to JATS elements

For a few elements in the JATS base vocabulary, changes have been made either to their formal definition or to their documentation, to make them serve the purposes of ISOSTS better. Those whose formal definitions have been modified include: