API Documentation - Response

The response is always given in JSON format. It contains the analyzed text, the language of the text, and a response block for every service requested with its analysis result. The structure of all service responses is described in detail further below.

For an example of a complete response, see section Overview.

Basic Response Format

{
  "text": "TXTWerk ist die Textmining-API der Neofonie GmbH, ein in Berlin ansässiger Fullservice-Provider. Neben Entitäten und Schlagwörtern erkennt TXTWerk in Texten unter anderem auch Datumsangaben (z.B. 08.09.2023) und Maßzahlen (z.B. 24h) und ordnet jeden Text einer passenden Textklasse zu.",
  "timestamp": 1400247994051,
  "language": "de",
  "entities": [],
  "lexiconEntities": [],
  "nerEntities": [],
  "lexiconTags": [],
  "tags": [],
  "dates": [],
  "categories": [],
  "measures": [],
  "fingerprints": [],
  "legals": []
}

If a service analyzed the text successfully but found no results, it returns an empty result list. If a single service fails, the HTTP status is still 200 and the response contains the results of all other services; only the block of the failed service is missing. Services that were not included in the request are generally not included in the response either.
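
The distinction between a missing block and an empty list can be checked on the client side. The following is a minimal sketch, assuming the response has already been parsed into a dictionary; the function name classify_services and the list of requested block names are illustrative and not part of the API.

```python
def classify_services(requested, response):
    """Map each requested response block to 'failed', 'empty' or 'ok'."""
    status = {}
    for name in requested:
        if name not in response:
            # Block is missing although the service was requested.
            status[name] = "failed"
        elif not response[name]:
            # Service ran successfully but found nothing.
            status[name] = "empty"
        else:
            status[name] = "ok"
    return status


# Hand-written sample: 'entities' was requested but its block is absent.
sample_response = {
    "text": "...",
    "language": "de",
    "tags": [],
    "dates": [{"surface": "08.09.2023"}],
}
service_status = classify_services(["entities", "tags", "dates"], sample_response)
print(service_status)
```

Because the HTTP status is 200 in both cases, a check of this kind is the only way for a client to notice that a single service failed.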

Description of each field:

text	The analyzed text. If you passed a URL, the extracted plain text (with boilerplate removed) is returned. If you passed plain text in the parameter 'text', the text is returned unchanged. If you passed a JSON document, all paragraphs are concatenated in the response.
language	The language of the text, e.g. "de" , "en", or "ru".
timestamp	The timestamp of the response (in milliseconds since January 1, 1970).
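
Since the timestamp is given in milliseconds, it has to be scaled before it can be used with most date libraries. A small sketch using only the Python standard library; the helper name response_time is an assumption for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def response_time(response):
    """Convert the millisecond 'timestamp' field to an aware UTC datetime."""
    # Integer milliseconds are added to the epoch to avoid float rounding.
    return EPOCH + timedelta(milliseconds=response["timestamp"])


ts = response_time({"timestamp": 1400247994051})
print(ts.isoformat())
```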

Response Format: Entities

{
  "entities": [
    {
      "confidence": 36.218177795410156,
      "relevance": 25.53207015991211,
      "surface": "GmbH",
      "label": "Gesellschaft mit beschränkter Haftung",
      "uri": "https://www.wikidata.org/wiki/Q460178",
      "type": "CONCEPT",
      "start": 44,
      "end": 48
    },
    {
      "confidence": 39.26929473876953,
      "relevance": 11.950702667236328,
      "surface": "Berlin",
      "label": "Berlin",
      "uri": "https://www.wikidata.org/wiki/Q64",
      "type": "PLACE",
      "start": 57,
      "end": 63
    },
    {
      "confidence": 95.73828125,
      "relevance": 35.542537689208984,
      "surface": "Texten",
      "label": "Text",
      "uri": "https://www.wikidata.org/wiki/Q234460",
      "type": "CONCEPT",
      "start": 150,
      "end": 156
    }
  ]
}

Description of each field:

label	The unique label of the entity.
surface	The surface form of the entity in the text.
type	Type of entity. Possible values are "PERSON", "PLACE", "ORGANISATION", "JOB TITLE", "WORK", "EVENT" and "CONCEPT". This is determined heuristically and may differ in some cases from the expected value. For example: A city can act as an employer and may therefore be classified as an organization.
uri	The Wikidata URI of the named entity. Set to 'null' if there is no entity URI in the Wikidata knowledge base.
confidence	Confidence value of the discovered entity. A higher value indicates a more reliable detection. The value has no upper bound.
relevance	Relevance value of the discovered entity. A higher value indicates a more important entity in the given context. The value has no upper bound.
start	The start position of the entity in the text.
end	The end position of the entity in the text.
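
The sample values above suggest that start is inclusive and end is exclusive, and that both count characters rather than bytes. Under that assumption, slicing the returned text with start and end yields the surface form. A minimal sketch; the function name mismatched_offsets is illustrative.

```python
def mismatched_offsets(text, entities):
    """Return all entities whose text[start:end] differs from their surface."""
    return [e for e in entities if text[e["start"]:e["end"]] != e["surface"]]


# First sentence of the sample text and two entities from the example above.
text = ("TXTWerk ist die Textmining-API der Neofonie GmbH, "
        "ein in Berlin ansässiger Fullservice-Provider.")
entities = [
    {"surface": "GmbH", "start": 44, "end": 48},
    {"surface": "Berlin", "start": 57, "end": 63},
]
mismatches = mismatched_offsets(text, entities)
print(mismatches)
```

An empty result means that every entity can be located in the text by its offsets, which is useful for highlighting matches.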

Depending on which optional parameters were requested, the following fields may also be included in the response:

annotations	Additional annotation information about the entities. Only present if the request contains the additional parameter nerAnnotations.
annotations.aliases	Also-known-as for the entity. Not included in the response block, if there are no aliases in Wikidata.
annotations.wikipedia	Link to a suitable Wikipedia page. If no page is linked in Wikidata, this field is not included in the response.
candidates	All entities that were disambiguation candidates for the entity concerned. Only present if the request parameter nerFormat is set to 'candidates'.
candidates.uri	The Wikidata URI of the disambiguation candidate.
candidates.type	Type of disambiguation candidate. Possible values are "PERSON", "PLACE", "ORGANISATION", "JOBTITLE", "WORK", "EVENT" and "CONCEPT". This is determined heuristically and may differ in some cases from the expected value. For example: A city can act as an employer and may therefore be classified as an organization.
candidates.confidence	Confidence value of the disambiguation candidate. A higher value indicates a more reliable detection. The value has no upper bound.
candidates.label	The unique label of the disambiguation candidate.
userDefinedFields	Additional user-dependent information on the entities. Only present if the two request parameters nerMetadata and nerMetadataProperties are set. The response contains keys and values that depend on the additional fields defined by the user.

Response Format: Top Entities

Top entities are included in the response if the request parameter nerFormat was set to 'aggregate'. The top entities are the entities with the highest relevance values.

{
  "topEntities": [
    {
      "confidence": 95.73828125,
      "relevance": 35.542537689208984,
      "label": "Text",
      "uri": "https://www.wikidata.org/wiki/Q234460",
      "type": "CONCEPT",
      "matches": [
        {
          "surface": "Texten",
          "start": 150,
          "end": 156
        }
      ]
    },
    {
      "confidence": 100.0,
      "relevance": 33.77956008911133,
      "label": "Neofonie GmbH",
      "uri": "Neofonie",
      "type": "Organisation",
      "userDefinedFields": {},
      "matches": [
        {
          "surface": "Neofonie GmbH",
          "start": 35,
          "end": 48
        }
      ]
    },
    {
      "confidence": 36.218177795410156,
      "relevance": 25.53207015991211,
      "label": "Gesellschaft mit beschränkter Haftung",
      "uri": "https://www.wikidata.org/wiki/Q460178",
      "type": "CONCEPT",
      "matches": [
        {
          "surface": "GmbH",
          "start": 44,
          "end": 48
        }
      ]
    },
    {
      "confidence": 39.26929473876953,
      "relevance": 11.950702667236328,
      "label": "Berlin",
      "uri": "https://www.wikidata.org/wiki/Q64",
      "type": "PLACE",
      "matches": [
        {
          "surface": "Berlin",
          "start": 57,
          "end": 63
        }
      ]
    }
  ]
}

Description of each field:

label	The unique label of the entity.
type	Type of entity. Possible values are "PERSON", "PLACE", "ORGANISATION", "JOB TITLE", "WORK", "EVENT" and "CONCEPT". This is determined heuristically and may differ in some cases from the expected value. For example: A city can act as an employer and may therefore be classified as an organization.
uri	The Wikidata URI of the named entity. Set to 'null' if there is no entity URI in the Wikidata knowledge base.
confidence	Confidence value of the discovered entity. A higher value indicates a more reliable detection. The value has no upper bound.
relevance	Relevance value of the discovered entity. A higher value indicates a more important entity in the given context. The value has no upper bound.
matches	Matches of the entity in the text.
matches.surface	The surface form of the entity match in the text.
matches.start	The start position of the entity match in the text.
matches.end	The end position of the entity match in the text.
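
In the aggregated format each entity carries a list of matches, so a client that needs one record per occurrence has to flatten the structure. A minimal sketch, assuming a parsed response dictionary; the function name flatten_top_entities and the tuple layout are illustrative choices.

```python
def flatten_top_entities(response):
    """Return one (start, end, surface, label, type) tuple per match, ordered by start."""
    rows = []
    for entity in response.get("topEntities", []):
        for match in entity.get("matches", []):
            rows.append((match["start"], match["end"], match["surface"],
                         entity["label"], entity["type"]))
    return sorted(rows)


# Two entries taken from the example above.
sample = {
    "topEntities": [
        {"label": "Text", "type": "CONCEPT",
         "matches": [{"surface": "Texten", "start": 150, "end": 156}]},
        {"label": "Berlin", "type": "PLACE",
         "matches": [{"surface": "Berlin", "start": 57, "end": 63}]},
    ]
}
rows = flatten_top_entities(sample)
for row in rows:
    print(row)
```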

Response Format: Lexicon Entities

These named entities are based on a lexicon managed in TXTWerk. Unlike the Wikidata entities, they are determined without any disambiguation. The response format is the same as for 'entities', except that the response block is named 'lexiconEntities'.

Description of each field:

label	The unique label of the entity.
surface	The surface form of the entity in the text.
type	Type of entity. Possible values are managed in the lexicon and depend on its state.
uri	A URI associated with this named entity, typically an identifier in an external system.
relevance	Relevance value of the discovered entity. A higher value indicates a more important entity in the given context. The value has no upper bound.
confidence	Confidence value of the discovered entity. For this service it is always 1, since the service is based on the user lexicon.
start	The start position of the entity in the text.
end	The end position of the entity in the text.
userDefinedFields	Additional user-dependent information on the entities.

Response Format: NER Entities

{
  "nerEntities": [
    {
      "type": "ORGANISATION",
      "confidence": 0.6897694170475006,
      "start": 35,
      "end": 48,
      "surface": "Neofonie GmbH"
    },
    {
      "type": "PLACE",
      "confidence": 0.9957075119018555,
      "start": 57,
      "end": 63,
      "surface": "Berlin"
    }
  ]
}

Description of each field:

surface	The surface form of the entity in the text.
type	Type of entity. Possible values are "PERSON", "PLACE" and "ORGANISATION".
confidence	Confidence value of the discovered entity. A higher value indicates a more reliable detection. The value has no upper bound.
start	The start position of the entity in the text.
end	The end position of the entity in the text.

Response Format: Tags

{
  "tags": [
    {
      "confidence": 0.9989658313414402,
      "term": "TXTWerk"
    },
    {
      "confidence": 0.9782419755349671,
      "term": "Entitäten"
    },
    {
      "confidence": 0.9732933133596776,
      "term": "Textmining-API"
    },
    {
      "confidence": 0.9365462323616698,
      "term": "Neofonie GmbH"
    },
    {
      "confidence": 0.8993179739843555,
      "term": "Schlagwörter"
    },
    {
      "confidence": 0.8814831569459867,
      "term": "Berlin"
    },
    {
      "confidence": 0.874798029178814,
      "term": "Fullservice-Provider"
    }
  ]
}

Description of each field:

term	The keyword found.
confidence	Confidence value of the keyword. Always between 0 and 1.
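
Because the tag confidence is bounded, a fixed threshold can be used to keep only the strongest keywords. A minimal sketch; the function name confident_terms and the threshold of 0.9 are arbitrary choices for illustration, not recommendations of the API.

```python
def confident_terms(tags, threshold=0.9):
    """Return the terms whose confidence reaches the given threshold."""
    return [tag["term"] for tag in tags if tag["confidence"] >= threshold]


# Three entries taken from the example above.
tags = [
    {"confidence": 0.9989658313414402, "term": "TXTWerk"},
    {"confidence": 0.9782419755349671, "term": "Entitäten"},
    {"confidence": 0.8814831569459867, "term": "Berlin"},
]
strong_terms = confident_terms(tags)
print(strong_terms)
```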

Response Format: Lexicon Tags

{
  "text": "TXTWerk ist die Textmining-API der Neofonie GmbH, ein in Berlin ansässiger Fullservice-Provider. Neben Entitäten und Schlagwörtern erkennt TXTWerk in Texten unter anderem auch Datumsangaben (z.B. 08.09.2023) und Maßzahlen (z.B. 24h) und ordnet jeden Text einer passenden Textklasse zu.",
  "lexiconTags": [
    {
      "id": "[unique id]",
      "tag": "ansässig",
      "score": 7.6243725,
      "analyzed": "ansässig",
      "observedSurfaces": [
        {
          "start": 64,
          "end": 74,
          "type": "TAG",
          "observedSurface": "ansässiger",
          "analyzed": "ansässig"
        }
      ]
    }
  ]
}

Description of each field:

id	Unique ID of the tag in the user lexicon.
tag	Unique label of the tag.
score	Value representing the quality of the match. Determined by the matching algorithm.
analyzed	Tags are algorithmically converted into different (synonym) word forms. This field lists the word form of the tag that matched.
observedSurfaces	Matches of the tag in the text.
observedSurfaces.start	The start position of the tag match in the text.
observedSurfaces.end	The end position of the tag match in the text.
observedSurfaces.type	Type of match. Possible values are "TAG", "SYNONYM" and "GENDER".
observedSurfaces.observedSurface	The surface form of the match in the text.
observedSurfaces.analyzed	All tokens of the text are algorithmically converted into different (synonym) word forms. This field lists the word form of the token that matched.
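
To see where each lexicon tag occurs in the text, the observedSurfaces lists can be collected per tag. A minimal sketch, assuming a parsed lexiconTags list; the function name tag_spans and the tuple layout are illustrative.

```python
def tag_spans(lexicon_tags):
    """Map each tag label to its (start, end, match type) occurrences."""
    spans = {}
    for entry in lexicon_tags:
        occurrences = [(s["start"], s["end"], s["type"])
                       for s in entry.get("observedSurfaces", [])]
        spans.setdefault(entry["tag"], []).extend(occurrences)
    return spans


# The single entry from the example above, reduced to the fields used here.
lexicon_tags = [
    {
        "tag": "ansässig",
        "observedSurfaces": [
            {"start": 64, "end": 74, "type": "TAG",
             "observedSurface": "ansässiger", "analyzed": "ansässig"},
        ],
    }
]
spans = tag_spans(lexicon_tags)
print(spans)
```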

Response Format: Dates

{
  "dates": [
    {
      "surface": "08.09.2023",
      "start": 196,
      "end": 206,
      "dateStart": {
        "day": 8,
        "month": 9,
        "year": 2023,
        "bc": false
      },
      "dateEnd": {
        "day": 8,
        "month": 9,
        "year": 2023,
        "bc": false
      }
    }
  ]
}

Description of each field:

surface	The surface form of the date in the text.
start	The start position of the date in the text.
end	The end position of the date in the text.
dateStart	The start date. A date is always represented as a time period, so the start and end date may have the same value.
dateEnd	The end date.
day	The day of the start or end date. Possible values are 1-31.
month	The month of the start or end date. Possible values are 1-12.
year	The year of the start or end date.
bc	Indicates whether the date lies before Christ (BC). Possible values are true and false.
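
The dateStart and dateEnd objects map naturally onto calendar date types. The following is a minimal sketch using the Python standard library. It assumes that day, month and year are all present, and it returns None for BC dates because datetime.date cannot represent them; the function names to_date and date_range are illustrative.

```python
from datetime import date


def to_date(part):
    """Convert a dateStart or dateEnd object to datetime.date (None for BC dates)."""
    if part.get("bc"):
        # datetime.date only supports years from 1 onwards.
        return None
    return date(part["year"], part["month"], part["day"])


def date_range(entry):
    """Return the (start, end) pair of one entry of the 'dates' block."""
    return to_date(entry["dateStart"]), to_date(entry["dateEnd"])


# The entry from the example above.
entry = {
    "surface": "08.09.2023",
    "dateStart": {"day": 8, "month": 9, "year": 2023, "bc": False},
    "dateEnd": {"day": 8, "month": 9, "year": 2023, "bc": False},
}
start_date, end_date = date_range(entry)
print(start_date, end_date)
```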

Response Format: Categories

{
  "categories": [
    {
      "confidence": 0.9999914614615732,
      "label": "internet"
    },
    {
      "confidence": 8.5340630740002E-6,
      "label": "kultur"
    },
    {
      "confidence": 3.4390082461387908E-9,
      "label": "auto+technik"
    },
    {
      "confidence": 7.942384268635301E-10,
      "label": "wirtschaft"
    },
    {
      "confidence": 1.1799574174439144E-10,
      "label": "reisen"
    },
    {
      "confidence": 8.06441429999464E-11,
      "label": "wissenschaft"
    },
    {
      "confidence": 4.031349737157026E-11,
      "label": "politik"
    },
    {
      "confidence": 3.152736753221788E-12,
      "label": "sport"
    }
  ]
}

Description of each field:

label	The name of the category. Possible values are "Politics", "Business", "Car & Technology", "Internet", "Culture", "Travel", "Sports", "Human interest", and "Science".
confidence	Confidence value of the category. Always between 0 and 1.
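
A client that only needs a single text class can select the category with the highest confidence instead of relying on the order of the list. A minimal sketch; the function name top_category is illustrative.

```python
def top_category(categories):
    """Return the label with the highest confidence, or None for an empty list."""
    if not categories:
        return None
    return max(categories, key=lambda c: c["confidence"])["label"]


# Three entries taken from the example above, deliberately out of order.
categories = [
    {"confidence": 8.5340630740002e-06, "label": "kultur"},
    {"confidence": 0.9999914614615732, "label": "internet"},
    {"confidence": 3.152736753221788e-12, "label": "sport"},
]
best_label = top_category(categories)
print(best_label)
```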

Response Format: Measures

{
  "measures": [
    {
      "start": 228,
      "end": 231,
      "text": "24h",
      "valueString": "24",
      "unitString": "h",
      "type": "TIME",
      "alias": [
        "24 h",
        "24h",
        "24Stunde",
        "24 Stunde",
        "24 Stunden",
        "24Stunden"
      ]
    }
  ]
}

Description of each field:

start	The start position of the measurement in the text.
end	The end position of the measurement in the text.
text	The measurement string, exactly as it occurs in the text.
valueString	The value of the measurement as a string, exactly as it occurs in the text.
unitString	The unit as a string, exactly as it occurs in the text.
unitCanonical	Only for currencies. Regardless of the concrete unit string in the text, this is the three-letter code of the respective currency.
type	The type of measurement. Possible values are "LENGTH", "AREA", "MASS", "TEMPERATURE", "VOLTAGE", "AMPERAGE", "RESISTANCE", "CHARGE", "CAPACITY", "CONDUCTANCE", "INDUCTANCE", "MAGNETIC_STRENGTH", "POWER", "ENERGY", "FORCE", "PRESSURE", "FREQUENCY", "VOLUME", "LUMINOSITY", "ILLUMINANCE", "SPIN", "SUBSTANCE", "RADIOACTIVITY", "CURRENCY", "TIME" and "UNKNOWN".
alias	Further variants of the measurement string (with and without space, units with and without abbreviation, conversions).
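
Since valueString is returned exactly as it occurs in the text, it has to be converted before it can be used numerically. The following is a minimal sketch that groups measures by type and tries to parse the value. It assumes that a comma in valueString is a decimal comma, as is common in German text; values it cannot parse are returned as None. The function names parse_value and measures_by_type are illustrative.

```python
def parse_value(value_string):
    """Parse a value string to float, treating ',' as decimal separator (assumption)."""
    try:
        return float(value_string.replace(",", "."))
    except ValueError:
        return None


def measures_by_type(measures):
    """Group (value, unit) pairs by measurement type."""
    grouped = {}
    for measure in measures:
        pair = (parse_value(measure["valueString"]), measure["unitString"])
        grouped.setdefault(measure["type"], []).append(pair)
    return grouped


# The entry from the example above, reduced to the fields used here.
measures = [{"text": "24h", "valueString": "24", "unitString": "h", "type": "TIME"}]
grouped = measures_by_type(measures)
print(grouped)
```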

Response Format: Fingerprints

{
  "fingerprints": [
    7493129, 18632078, 48467713, 64740551, 61803666,
    57602, 20683602, 7169662, 124073776, 1324512,
    48689911, 63618400, 82739683, 57114900, 86498997,
    5531749, 43615458, 63266708, 35312651, 1767346,
    166345084, 20994017, 10618634, 35187378, 52012568,
    62221932, 101283997, 194238108, 24943142, 48857582,
    214343186, 8807040, 11737208, 29004557, 33563369,
    23510317, 54409541, 58494605, 55886581, 88208507,
    10609552, 7042020, 21855281, 9560326, 22894461,
    19569052, 11695122, 59192088, 11647472, 25992587
  ]
}