Preliminaries

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in BCP 14, RFC2119 and RFC8174 when, and only when, they appear in all capitals, as shown here.

This document is licensed under CC-BY-ND-4.0 (via SPDX).

The style of this documentation is adapted from the OpenAPI 3.0.2 specification.

Status of this Document

The current version of this document is 0.4.0.

As long as the version number has not reached 1.0.0, this document is considered an early draft, and major changes MAY be made overnight and without any announcement.

Once the document leaves this very early stage, version numbers will be assigned according to Semantic Versioning 2.0.0.

Definitions

Cardinalities

| Cardinality | Description | REQUIRED |
|---|---|---|
| 1 | exactly one | yes |
| + | one or more | yes |
| ? | zero or one | no |
| * | zero or more | no |
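
The cardinality symbols can be read as minimum and maximum counts. The following Python sketch illustrates one possible check. The names `CARDINALITIES`, `count_values` and `satisfies` are hypothetical, and treating an absent field as `None` and a list as "several values" is an assumption of this example, not something the specification prescribes.

```python
# Cardinality symbols mapped to (minimum, maximum) counts.
# None as maximum means "no upper bound".
CARDINALITIES = {
    "1": (1, 1),
    "+": (1, None),
    "?": (0, 1),
    "*": (0, None),
}


def count_values(value):
    """Count values: absent (None) is 0, a list counts its members, anything else is 1."""
    if value is None:
        return 0
    if isinstance(value, list):
        return len(value)
    return 1


def satisfies(cardinality, value):
    """Return True if value fits the given cardinality symbol."""
    low, high = CARDINALITIES[cardinality]
    n = count_values(value)
    return n >= low and (high is None or n <= high)
```

Under these assumptions, `satisfies("+", [])` is false because at least one value is needed, while `satisfies("?", None)` is true.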

Data Types

| Type | Description |
|---|---|
| […] | array |
| string | a sequence of characters, MAY be empty |
| URL | a valid URL pointing to a resource |
| MIME type | a valid MIME type according to IANA. See MDN |
| semver | a string matching the pattern ^\d+\.\d+\.\d+$, representing a semantic version number |
| iso639-3 | alpha-3 language code according to ISO 639-3 |
| SPDX | a valid SPDX identifier. See https://spdx.org/licenses/ |
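
As an illustration of the semver data type, the pattern can be checked with Python's `re` module. This is a minimal sketch: the function name `is_semver` is hypothetical, and restricting `\d` to ASCII digits is an assumption of this example.

```python
import re

# Pattern from the Data Types table. fullmatch() anchors both ends, so the
# explicit ^ and $ are not needed; re.ASCII restricts \d to the digits 0-9.
SEMVER = re.compile(r"\d+\.\d+\.\d+", re.ASCII)


def is_semver(value: str) -> bool:
    """Return True if value has the form MAJOR.MINOR.PATCH, digits only."""
    return SEMVER.fullmatch(value) is not None
```

Note that this pattern is narrower than Semantic Versioning 2.0.0 itself: pre-release suffixes such as `1.0.0-beta` do not match.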

URI Syntax

The URI syntax follows the one described by IIIF.

Delivery Service

The TextAPI delivery service describes the endpoints. All endpoints are REQUIRED to use the https protocol.

Collection

Returns a collection object.

https://{server}{/prefix}/{collection}/collection.json

Manifest

Returns a manifest object.

https://{server}{/prefix}/{collection}/{manifest}/manifest.json

Item

Returns an item object.

https://{server}{/prefix}/{collection}/{manifest}/{item}/{revision}/item.json

https://{server}{/prefix}/{collection}/{manifest}/{item}/latest/item.json

When a single item represents a complete text (this specification does not define what counts as complete), the corresponding endpoint is REQUIRED to be available at

https://{server}{/prefix}/{collection}/{manifest}/{revision}/full.json
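
The URI templates above can be expanded with plain string formatting. The sketch below is one way a client might do this. All function names are hypothetical, and the example assumes that the path segments need no percent-encoding.

```python
def _base(server, prefix=""):
    """Build https://{server}{/prefix}; the prefix part is optional."""
    prefix = prefix.strip("/")
    return f"https://{server}" + (f"/{prefix}" if prefix else "")


def collection_url(server, collection, prefix=""):
    return f"{_base(server, prefix)}/{collection}/collection.json"


def manifest_url(server, collection, manifest, prefix=""):
    return f"{_base(server, prefix)}/{collection}/{manifest}/manifest.json"


def item_url(server, collection, manifest, item, revision="latest", prefix=""):
    # "latest" is the revision keyword shown in the second item template.
    return (f"{_base(server, prefix)}/{collection}/{manifest}"
            f"/{item}/{revision}/item.json")


def full_url(server, collection, manifest, revision, prefix=""):
    # Endpoint for an item that represents a complete text.
    return f"{_base(server, prefix)}/{collection}/{manifest}/{revision}/full.json"
```

For example, `item_url("example.org", "poems", "m1", "p3", prefix="api")` yields `https://example.org/api/poems/m1/p3/latest/item.json`.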

Schema

All fields that are not explicitly REQUIRED or described with MUST or SHALL are considered OPTIONAL. It is RECOMMENDED, however, to provide as much information as possible.

Collection Object

A collection contains a curated list of texts. It is REQUIRED to be served at the corresponding endpoint.

| Field Name | Cardinality | Type | Description |
|---|---|---|---|
| textapi | 1 | semver | the TextAPI version covered by the implementation |
| title | 1 | [Title Object] | the title of the collection |
| collector | + | [Actor Object] | a personal entity responsible for the collection (collector) |
| description | ? | string | description of the collection |
| sequence | 1 | [Sequence Object] | a set of manifests included in this collection |
| annotationCollection | ? | URL | URL pointing to an Annotation Collection for the complete collection |
| modules | ? | [Module Object] | the modules in use for this collection |

Sequence Object

Represents a sequence of collections, manifests or items. Within a manifest it SHOULD contain items exclusively.

| Field Name | Cardinality | Type | Description |
|---|---|---|---|
| id | 1 | URL | URL to find a Manifest Object, Collection Object or Item Object |
| type | 1 | string | one of collection, manifest, item |
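
The two rules for sequence entries (both fields present, and items only within a manifest) could be checked as in the following sketch. The function name `sequence_problems` and the wording of the messages are hypothetical.

```python
ALLOWED_TYPES = {"collection", "manifest", "item"}


def sequence_problems(sequence, inside_manifest=False):
    """Return a list of human-readable problems found in a sequence array."""
    problems = []
    for index, entry in enumerate(sequence):
        entry_id = entry.get("id")
        if not isinstance(entry_id, str) or not entry_id:
            problems.append(f"entry {index}: id is missing")
        entry_type = entry.get("type")
        if entry_type not in ALLOWED_TYPES:
            problems.append(
                f"entry {index}: type must be one of collection, manifest, item")
        elif inside_manifest and entry_type != "item":
            # SHOULD-level rule: a manifest sequence contains items exclusively.
            problems.append(f"entry {index}: a manifest sequence SHOULD contain items only")
    return problems
```

An empty result means that no problem was found. Because the items-only rule is a SHOULD, a caller may prefer to treat that message as a warning rather than an error.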

Module Object

Gives information about which API modules are in use. If the Module Object is set, at least one module has to be selected.

Note: The list of modules is continuously extended.

| Field Name | Cardinality | Type | Description |
|---|---|---|---|
| edition-manuscripts | ? | string | true or false |
| edition-prints | ? | string | true or false |
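
The rule that at least one module has to be selected can be sketched as follows. The example assumes that "selected" means the field carries the string value `true`; this reading, like the function names, is an assumption of the example.

```python
# Module names currently listed in the table above.
KNOWN_MODULES = ("edition-manuscripts", "edition-prints")


def selected_modules(module_object):
    """Return the names of all known modules whose value is the string "true"."""
    return [name for name in KNOWN_MODULES if module_object.get(name) == "true"]


def is_valid_module_object(module_object):
    """A Module Object that is present needs at least one selected module."""
    return len(selected_modules(module_object)) > 0
```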

Manifest Object

This is the main object in the schema. It represents a single text and its derivatives (e.g. HTML) and therefore contains the metadata. It is REQUIRED to be served at the corresponding endpoint.

| Field Name | Cardinality | Type | Description |
|---|---|---|---|
| textapi | 1 | semver | version number satisfied by the implementation |
| id | 1 | URL | URL pointing to this manifest |
| label | 1 | string | human-readable name or title |
| sequence | 1 | [Sequence Object] | a sequence of items |
| actor | ? | [Actor Object] | a personal entity related to the document (e.g. author or editor) |
| repository | ? | [Repository Object] | a repository archiving the document(s) or source |
| image | ? | Image Object | an image representing the resources (e.g. thumbnail or logo) |
| metadata | ? | [Metadata Object] | a list of further metadata |
| support | ? | [Support Object] | |
| license | 1 | [License Object] | license under which the resource MUST be used |
| description | ? | string | a short description of the object |
| annotationCollection | ? | URL | URL pointing to an Annotation Collection for the complete manifest |
| modules | ? | [Module Object] | the modules in use for this manifest. Has to correspond to the Collection Object’s modules entry |

Repository Object

A repository archiving the source or derivatives, e.g. facsimiles or digitized versions.

| Field Name | Cardinality | Type | Description |
|---|---|---|---|
| label | ? | string | the label as given by the hosting institution |
| url | 1 | URL | URL pointing to the website of the institution |
| baseUrl | 1 | URL | a base URL where id can be resolved |
| id | 1 | string | the identifier at the hosting institution |

Actor Object

| Field Name | Cardinality | Type | Description |
|---|---|---|---|
| role | + | [string] | the role of a personal entity in relation to the parent object, MUST be collector in case of collections |
| name | 1 | string | the principal name of the person |
| id | ? | string | internal identifier |
| idref | * | Idref Object | authority files related to the person |
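
For collections, the role constraint can be checked as in the sketch below. It assumes that "MUST be collector" is satisfied when the role array contains the string `collector`; whether further roles are allowed next to it is not stated in this document. The function name is hypothetical.

```python
def collectors_are_valid(collection):
    """Check the collector field of a Collection Object (cardinality +)."""
    collectors = collection.get("collector", [])
    if not collectors:
        return False  # at least one collector is needed
    for actor in collectors:
        role = actor.get("role")
        # role is an array of strings and has to name the collector role
        if not isinstance(role, list) or "collector" not in role:
            return False
        # name has cardinality 1, so it must be present
        if not isinstance(actor.get("name"), str):
            return False
    return True
```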

Metadata Object

A set of metadata describing the source or its context. Mainly used for key-value pairs.

| Field Name | Cardinality | Type | Description |
|---|---|---|---|
| key | 1 | string | label |
| value | 1 | string | [Metadata Object] property |
| metadata | ? | [Metadata Object] | further metadata that is subordinate to the current metadata entry |
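
Because a Metadata Object may carry subordinate Metadata Objects, a client that wants a flat key-value listing has to walk the hierarchy. A minimal recursive sketch follows. The function name and the " / " separator are choices of this example, not part of the specification.

```python
def flatten_metadata(entries, parents=()):
    """Turn nested Metadata Objects into (path, value) pairs, depth first."""
    rows = []
    for entry in entries:
        path = parents + (entry["key"],)
        rows.append((" / ".join(path), entry["value"]))
        # Subordinate entries inherit the path of their parent entry.
        rows.extend(flatten_metadata(entry.get("metadata", []), path))
    return rows
```

Given an entry with key `Origin` that contains a subordinate entry with key `Date`, the result lists `Origin` first and then `Origin / Date`.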

Image Object

An image representing the source or its provider. It is recommended that a IIIF Image API service is available for this image for manipulations such as resizing.

| Field Name | Cardinality | Type | Description |
|---|---|---|---|
| id | 1 | URL | URL pointing to the image |
| manifest | ? | URL | URL pointing to the image’s manifest object |
| license | 1 | License Object | the license for the image that MUST be used |

License Object

The license or any other appropriate rights statement the resource is served under. It is REQUIRED to use one of the SPDX identifiers.

| Field Name | Cardinality | Type | Description |
|---|---|---|---|
| id | 1 | SPDX | an SPDX identifier |
| notes | ? | string | further notes concerning the license. Can be used e.g. for the attribution statement of CC BY |

Support Object

Any material supporting the view is described and referenced in this object. This encompasses fonts and CSS; other material that supports the rendering MAY be added on request.

| Field Name | Cardinality | Type | Description |
|---|---|---|---|
| type | 1 | string | one of font, css |
| mime | 1 | MIME type | the MIME type for the resource |
| url | 1 | URL | URL pointing to the resource |

Item Object

An item is REQUIRED to be served at the corresponding endpoint.

When an item serves a complete version of a text, the type SHOULD be full. Any other (sliced) material is either a section or a page.

| Field Name | Cardinality | Type | Description |
|---|---|---|---|
| textapi | 1 | semver | version number satisfied by the implementation |
| title | ? | [Title Object] | the title of the item |
| type | 1 | string | one of section, page, full |
| n | ? | string | division number |
| lang | 1 | [iso639-3] | language codes describing the resource |
| langAlt | ? | [string] | alternative language name or code (when there is no iso639-3 code, e.g. karshuni) |
| content | 1 | [Content Object] | different serializations of the item, e.g. HTML, plain text, XML, … |
| description | ? | string | a short description of the object |
| image | ? | Image Object | corresponding image |
| annotationCollection | ? | URL | URL pointing to an Annotation Collection for this item |
| modules | ? | [Module Object] | the modules in use for this item. Has to correspond to the Collection Object’s modules entry |

Content Object

| Field Name | Cardinality | Type | Description |
|---|---|---|---|
| url | 1 | URL | URL pointing to the content |
| type | 1 | MIME type | a MIME type. If several Content Objects with the same MIME type are provided, these SHOULD be distinguished with a MIME type parameter where the key is type and the value can be freely chosen, e.g. text/html;type=transcription. |
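
A client that receives several Content Objects with the same MIME type can use the `type` parameter to choose between them. The following sketch splits the parameter off with plain string operations. The function names `split_mime` and `pick_content` are hypothetical, and the parser is deliberately simplified (it does not handle semicolons inside quoted parameter values).

```python
def split_mime(mime):
    """Split 'text/html;type=transcription' into ('text/html', {'type': 'transcription'})."""
    essence, *raw_params = [part.strip() for part in mime.split(";")]
    params = {}
    for raw in raw_params:
        if "=" in raw:
            key, value = raw.split("=", 1)
            params[key.strip().lower()] = value.strip().strip('"')
    return essence.lower(), params


def pick_content(contents, essence, kind=None):
    """Return the first Content Object matching the MIME type and, if given, the type parameter."""
    for content in contents:
        found_essence, params = split_mime(content["type"])
        if found_essence == essence and (kind is None or params.get("type") == kind):
            return content
    return None
```

With two `text/html` entries, `pick_content(contents, "text/html", kind="transcription")` returns the entry whose MIME type carries `type=transcription`.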

Title Object

| Field Name | Cardinality | Type | Description |
|---|---|---|---|
| title | 1 | string | a single title |
| type | 1 | string | one of main, sub |

Idref Object

| Field Name | Cardinality | Type | Description |
|---|---|---|---|
| base | ? | URL | the base URL to the authority file |
| type | 1 | string | short title of the referenced authority |
| id | 1 | string | the main ID referenced |

Extensibility

All objects MAY be extended with custom keywords, which are always prefixed with x-.
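
A consumer can separate such extension keywords from the fields defined in this document by looking at the prefix. A minimal sketch, with a hypothetical function name:

```python
def split_extensions(obj):
    """Split a JSON object into (specified fields, x- prefixed extension fields)."""
    core = {key: value for key, value in obj.items() if not key.startswith("x-")}
    extensions = {key: value for key, value in obj.items() if key.startswith("x-")}
    return core, extensions
```

This allows strict validation of the core fields while unknown `x-` keywords pass through untouched.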

Revision History / Changelog

| Version | Date | Description |
|---|---|---|
| 0.0.1 | 2019-04 | very early draft |
| 0.0.2 | 2019-05 | manifest specified |
| 0.0.3 | 2019-10 | collection specified |
| 0.1.0 | 2020-01 | first version used in project |
| 0.2.0 | 2020-06 | annotations added, language style improved |
| 0.3.0 | 2020-11 | update item endpoints |
| 0.4.0 | 2021-01 | add license information for images, add notes to License Object |
| 0.5.0 | 2021-02 | add Content Object for providing several serializations |
| 0.6.0 | 2021-02 | allow for hierarchy in Metadata Objects |

Appendix

Class Diagram

UML class diagram

// Text-API
// ------------------

// classes
[Collection|entrypoint|- textapi; -description; -annotationCollection ? ; -modules ? {bg:yellow}]

[Manifest|entrypoint| -textapi 1; -id 1; label 1; -description ?; -annotationCollection ? ; -modules ? {bg:yellow}]

[Item| -textapi 1; -type 1; -n ?; -lang 1; -langAlt ?; -content 1; -description ?; -image ?; - annotationCollection ? ; -modules ? ]

// objects
[Sequence|-id;-type]

[Repository| -label; -url; baseUrl; -id]

[Actor| -role; -name; -id]

[Metadata| -key; -value; -metadata ?]

[Image| -id; -manifest; -license]

[License| -id; -notes ?]

[Support| -type 1; -mime 1; -url 1]

[Title| -title 1; -type 1]

[Idref| -base ?; -type 1; -id 1]

[Modules| -edition-manuscripts ?; -edition-prints ?]
[Content| -url 1; -type 1]

// imports
[Collection]-[Title]
[Collection]-[Actor]
[Collection]-[Sequence]
[Collection]-[Modules]

[Manifest]-[Sequence]
[Manifest]-[Actor]
[Manifest]-[Repository]
[Manifest]-[Image]
[Manifest]-[Metadata]
[Manifest]-[Support]
[Manifest]-[License]
[Manifest]-[Modules]

[Item]-[Title]
[Item]-[Image]
[Item]-[Modules]
[Item]-[Content]

[Actor]-[Idref]

[Image]-[License]
[Metadata]-[Metadata]

Example Objects

collection.json

{
  "textapi": "0.4.0",
  "title": [
    { "title": "Example Collection", "type": "main" }
  ],
  "collector": [
    { "role": ["collector"], "name": "John Doe" }
  ],
  "sequence": [
    {
      "id": "https://{server}{/prefix}/{collection}/{document}/manifest.json",
      "type": "manifest"
    },
    {
      "id": "https://{server}{/prefix}/{collection}/{document}/latest/item.json",
      "type": "item"
    }
  ]
}

manifest.json

{
  "textapi": "0.4.0",
  "id": "https://{server}{/prefix}/{collection}/{document}/manifest.json",
  "label": "Example Document",
  "sequence": [
    {
      "id": "https://{server}{/prefix}/{collection}/{document}/latest/item.json",
      "type": "item"
    },
    {
      "id": "https://{server}{/prefix}/{collection}/{document}/latest/item.json",
      "type": "item"
    }
  ],
  "actor": [
    {
      "role": ["editor"],
      "name": "John Doe"
    },
    {
      "role": ["author"],
      "name": "Max Musterfrau"
    }
  ],
  "license": [{"id": "copyleft-next-0.3.1"}],
  "annotationCollection": "https://{server}{/prefix}/{collection}/{document}/annotationCollection.json"
}

item.json

{
  "textapi": "0.4.0",
  "title": [
    { "title": "Example Document", "type": "main" }
  ],
  "type": "section",
  "lang": ["eng"],
  "content": [
    {
      "url": "https://example.com/some/path/to/resource.html",
      "type": "text/html"
    },
    {
      "url": "https://example.com/some/path/to/resource.txt",
      "type": "text/plain"
    }
  ]
}

Todos