API Documentation - Response
The response is always given in JSON format. It contains the analyzed text, the language of the text, and a response block for every service requested with its analysis result. The structure of all service responses is described in detail further below.
For an example of a complete response, see section Overview.
Basic Response Format
{
  "text": "TXTWerk ist die Textmining-API der Neofonie GmbH, ein in Berlin ansässiger Fullservice-Provider. Neben Entitäten und Schlagwörtern erkennt TXTWerk in Texten unter anderem auch Datumsangaben (z.B. 08.09.2023) und Maßzahlen (z.B. 24h) und ordnet jeden Text einer passenden Textklasse zu.",
  "timestamp": 1400247994051,
  "language": "de",
  "entities": [],
  "lexiconEntities": [],
  "nerEntities": [],
  "lexiconTags": [],
  "tags": [],
  "dates": [],
  "categories": [],
  "measures": [],
  "fingerprints": [],
  "legals": []
}
If a service analyzed the text successfully but found no results, it returns an empty result list. If a single service fails, the HTTP status is still 200 and the response contains the results of all other services; only the block of the failed service is omitted. Services that were not included in the request are generally not included in the response either.
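The rules above can be turned into a small client-side check. The following Python sketch assumes the response has already been parsed into a dictionary and that the caller knows which response blocks it asked for. The function name `block_status` and the status strings are illustrative and not part of the API.

```python
def block_status(response: dict, requested_blocks: list) -> dict:
    """Classify each requested response block as 'ok', 'empty' or 'missing'."""
    status = {}
    for name in requested_blocks:
        if name not in response:
            # Requested but absent: according to the rules above, the service failed.
            status[name] = "missing"
        elif not response[name]:
            # Present but empty: the service ran and found nothing.
            status[name] = "empty"
        else:
            status[name] = "ok"
    return status


# Hand-written sample: "dates" was requested but is absent from the response.
sample = {
    "text": "Berlin",
    "language": "de",
    "entities": [{"label": "Berlin"}],
    "tags": [],
}
print(block_status(sample, ["entities", "tags", "dates"]))
```

Because a failed service still yields HTTP status 200, a check of this kind is the only way for a client to notice the failure.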
Description of each field:
text | The analyzed text. If you passed a URL, the extracted plain text (after boilerplate removal) is returned. If you passed plain text in the parameter 'text', the text is returned unchanged. If you passed a JSON document, its paragraphs are concatenated in the response. |
language | The language of the text, e.g. "de", "en", or "ru". |
timestamp | The timestamp of the response (in milliseconds since January 1, 1970). |
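Because `timestamp` is given in milliseconds, it has to be scaled before most date libraries accept it. A minimal Python sketch, assuming the value counts from midnight UTC; the helper name `response_time` is illustrative.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def response_time(timestamp_ms: int) -> datetime:
    """Convert the millisecond timestamp of a response to an aware datetime."""
    # Integer milliseconds avoid the rounding of a float division by 1000.
    return EPOCH + timedelta(milliseconds=timestamp_ms)


# Timestamp taken from the basic response example.
print(response_time(1400247994051).isoformat())
```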
Response Format: Entities
{
  "entities": [
    {
      "confidence": 36.218177795410156,
      "relevance": 25.53207015991211,
      "surface": "GmbH",
      "label": "Gesellschaft mit beschränkter Haftung",
      "uri": "https://www.wikidata.org/wiki/Q460178",
      "type": "CONCEPT",
      "start": 44,
      "end": 48
    },
    {
      "confidence": 39.26929473876953,
      "relevance": 11.950702667236328,
      "surface": "Berlin",
      "label": "Berlin",
      "uri": "https://www.wikidata.org/wiki/Q64",
      "type": "PLACE",
      "start": 57,
      "end": 63
    },
    {
      "confidence": 95.73828125,
      "relevance": 35.542537689208984,
      "surface": "Texten",
      "label": "Text",
      "uri": "https://www.wikidata.org/wiki/Q234460",
      "type": "CONCEPT",
      "start": 150,
      "end": 156
    }
  ]
}
Description of each field:
label | The unique label of the entity. |
surface | The surface form of the entity in the text. |
type | Type of entity. Possible values are "PERSON", "PLACE", "ORGANISATION", "JOB TITLE", "WORK", "EVENT" and "CONCEPT". This is determined heuristically and may differ in some cases from the expected value. For example: A city can act as an employer and may therefore be classified as an organization. |
uri | The Wikidata URI of the named entity. Set to 'null' if there is no entity URI in the Wikidata knowledge base. |
confidence | Confidence value of the discovered entity. A higher value indicates a more reliable detection. The confidence has no upper bound. |
relevance | Relevance value of the discovered entity. A higher value indicates a more important entity in the given context. The relevance has no upper bound. |
start | The start position of the entity in the text. |
end | The end position of the entity in the text. |
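In the example above, `start` and `end` behave like zero-based character offsets with an exclusive end, so slicing the analyzed text yields the surface form. The Python sketch below relies on that observation, which the field descriptions do not state explicitly. The helper names are illustrative.

```python
def surface_matches(text: str, entity: dict) -> bool:
    """Check that text[start:end] reproduces the reported surface form."""
    return text[entity["start"]:entity["end"]] == entity["surface"]


def by_relevance(entities: list) -> list:
    """Return the entities sorted from most to least relevant."""
    return sorted(entities, key=lambda e: e["relevance"], reverse=True)


# First sentence of the example text and two entities from the example response.
text = ("TXTWerk ist die Textmining-API der Neofonie GmbH, "
        "ein in Berlin ansässiger Fullservice-Provider.")
entities = [
    {"surface": "Berlin", "label": "Berlin",
     "relevance": 11.950702667236328, "start": 57, "end": 63},
    {"surface": "GmbH", "label": "Gesellschaft mit beschränkter Haftung",
     "relevance": 25.53207015991211, "start": 44, "end": 48},
]

for entity in by_relevance(entities):
    print(entity["label"], surface_matches(text, entity))
```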
Depending on which optional parameters were requested, the following fields may also be included in the response:
annotations | Additional annotation information about the entity. Only present if the request contains the additional parameter nerAnnotations. |
annotations.aliases | Alternative names ("also known as") of the entity. Omitted if Wikidata lists no aliases. |
annotations.wikipedia | Link to a suitable Wikipedia page. Omitted if no page is linked in Wikidata. |
candidates | All entities that were disambiguation candidates for this entity. Only present if the request parameter nerFormat is set to 'candidates'. |
candidates.uri | The Wikidata URI of the disambiguation candidate. |
candidates.type | Type of the disambiguation candidate. Possible values are "PERSON", "PLACE", "ORGANISATION", "JOBTITLE", "WORK", "EVENT" and "CONCEPT". This is determined heuristically and may differ in some cases from the expected value. For example: A city can act as an employer and may therefore be classified as an organization. |
candidates.confidence | Confidence value of the disambiguation candidate. A higher value indicates a more reliable detection. The confidence has no upper bound. |
candidates.label | The unique label of the disambiguation candidate. |
userDefinedFields | Additional, user-dependent information on the entity. Only present if the two request parameters nerMetadata and nerMetadataProperties are set. The keys and values in the response depend on which additional fields the user has defined. |
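Since these fields are only present when the corresponding request parameters were set, client code should not assume that they exist. The Python sketch below uses defensive lookups. The exact shape of `annotations` (an object holding an `aliases` list) and of `candidates` (a list of objects) is inferred from the field names above and is an assumption, and the sample data is invented for illustration.

```python
from typing import Optional


def entity_aliases(entity: dict) -> list:
    """Return the aliases of an entity, or an empty list if none were sent."""
    annotations = entity.get("annotations") or {}
    return annotations.get("aliases") or []


def best_candidate(entity: dict) -> Optional[dict]:
    """Return the candidate with the highest confidence, if any exist."""
    candidates = entity.get("candidates") or []
    return max(candidates, key=lambda c: c["confidence"], default=None)


# Invented sample data: one entity without and one with optional fields.
plain = {"label": "Berlin", "uri": "https://www.wikidata.org/wiki/Q64"}
enriched = {
    "label": "Berlin",
    "annotations": {"aliases": ["alias-1", "alias-2"]},
    "candidates": [
        {"label": "candidate-a", "confidence": 12.5},
        {"label": "candidate-b", "confidence": 40.0},
    ],
}
print(entity_aliases(plain), best_candidate(plain))
print(entity_aliases(enriched), best_candidate(enriched)["label"])
```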
Response Format: Top Entities
Top entities are included in the response if the request parameter nerFormat was set to 'aggregate'. Entities with the highest relevance values represent top entities.
{
  "topEntities": [
    {
      "confidence": 95.73828125,
      "relevance": 35.542537689208984,
      "label": "Text",
      "uri": "https://www.wikidata.org/wiki/Q234460",
      "type": "CONCEPT",
      "matches": [
        {
          "surface": "Texten",
          "start": 150,
          "end": 156
        }
      ]
    },
    {
      "confidence": 100.0,
      "relevance": 33.77956008911133,
      "label": "Neofonie GmbH",
      "uri": "Neofonie",
      "type": "Organisation",
      "userDefinedFields": {},
      "matches": [
        {
          "surface": "Neofonie GmbH",
          "start": 35,
          "end": 48
        }
      ]
    },
    {
      "confidence": 36.218177795410156,
      "relevance": 25.53207015991211,
      "label": "Gesellschaft mit beschränkter Haftung",
      "uri": "https://www.wikidata.org/wiki/Q460178",
      "type": "CONCEPT",
      "matches": [
        {
          "surface": "GmbH",
          "start": 44,
          "end": 48
        }
      ]
    },
    {
      "confidence": 39.26929473876953,
      "relevance": 11.950702667236328,
      "label": "Berlin",
      "uri": "https://www.wikidata.org/wiki/Q64",
      "type": "PLACE",
      "matches": [
        {
          "surface": "Berlin",
          "start": 57,
          "end": 63
        }
      ]
    }
  ]
}
Description of each field:
label | The unique label of the entity. |
type | Type of entity. Possible values are "PERSON", "PLACE", "ORGANISATION", "JOB TITLE", "WORK", "EVENT" and "CONCEPT". This is determined heuristically and may differ in some cases from the expected value. For example: A city can act as an employer and may therefore be classified as an organization. |
uri | The Wikidata URI of the named entity. Set to 'null' if there is no entity URI in the Wikidata knowledge base. |
confidence | Confidence value of the discovered entity. A higher value indicates a more reliable detection. The confidence has no upper bound. |
relevance | Relevance value of the discovered entity. A higher value indicates a more important entity in the given context. The relevance has no upper bound. |
matches | Matches of the entity in the text. |
matches.surface | The surface form of the entity match in the text. |
matches.start | The start position of the entity match in the text. |
matches.end | The end position of the entity match in the text. |
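In the aggregated format, each entity carries a list of matches instead of a single position. If downstream code expects one record per occurrence, as in the 'entities' block, the structure can be flattened. A Python sketch of that conversion; the function name `flatten_top_entities` is illustrative.

```python
def flatten_top_entities(top_entities: list) -> list:
    """Turn aggregated top entities into one record per match, in text order."""
    rows = []
    for entity in top_entities:
        # Fields shared by all matches of this entity (label, uri, type, ...).
        shared = {key: value for key, value in entity.items() if key != "matches"}
        for match in entity.get("matches", []):
            rows.append({**shared, **match})
    return sorted(rows, key=lambda row: row["start"])


# Two top entities from the example response, reduced to a few fields.
top_entities = [
    {"label": "Text", "type": "CONCEPT",
     "matches": [{"surface": "Texten", "start": 150, "end": 156}]},
    {"label": "Neofonie GmbH", "type": "Organisation",
     "matches": [{"surface": "Neofonie GmbH", "start": 35, "end": 48}]},
]

for row in flatten_top_entities(top_entities):
    print(row["start"], row["end"], row["label"])
```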
Response Format: Lexicon Entities
These named entities are based on a lexicon managed in TXTWerk. Unlike the Wikidata entities, they are determined without any disambiguation. The response format is the same as for 'entities', except that the response block is named 'lexiconEntities'.
Description of each field:
label | The unique label of the entity. |
surface | The surface form of the entity in the text. |
type | Type of entity. Possible values are managed in the lexicon and depend on its state. |
uri | A URI associated with this named entity, typically an identifier in an external system. |
relevance | Relevance value for the discovered entity. A higher value represents a more important entity in the given context. The upper value of the relevance is unlimited. |
confidence | Confidence value of the discovered entity. Here it is always 1, since the service is based on the user lexicon. |
start | The start position of the entity in the text. |
end | The end position of the entity in the text. |
userDefinedFields | Additional, user-dependent information on the entities. |
Response Format: NER Entities
{
  "nerEntities": [
    {
      "type": "ORGANISATION",
      "confidence": 0.6897694170475006,
      "start": 35,
      "end": 48,
      "surface": "Neofonie GmbH"
    },
    {
      "type": "PLACE",
      "confidence": 0.9957075119018555,
      "start": 57,
      "end": 63,
      "surface": "Berlin"
    }
  ]
}
Description of each field:
surface | The surface form of the entity in the text. |
type | Type of entity. Possible values are "PERSON" and "PLACE"; the example above additionally shows "ORGANISATION". |
confidence | Confidence value of the discovered entity. A higher value indicates a more reliable detection. The confidence has no upper bound. |
start | The start position of the entity in the text. |
end | The end position of the entity in the text. |
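The NER block and the 'entities' block can report overlapping spans for the same text, as with "Neofonie GmbH" (35 to 48) here and "GmbH" (44 to 48) in the entities example. A Python sketch for detecting such overlaps, assuming exclusive end offsets as the examples suggest; the function names are illustrative.

```python
def overlaps(a: dict, b: dict) -> bool:
    """True if the half-open spans [start, end) of a and b share a character."""
    return a["start"] < b["end"] and b["start"] < a["end"]


def overlapping_pairs(ner_entities: list, entities: list) -> list:
    """List (NER surface, entity surface) pairs whose spans overlap."""
    return [
        (ner["surface"], entity["surface"])
        for ner in ner_entities
        for entity in entities
        if overlaps(ner, entity)
    ]


# Spans taken from the nerEntities and entities examples.
ner_entities = [
    {"surface": "Neofonie GmbH", "start": 35, "end": 48},
    {"surface": "Berlin", "start": 57, "end": 63},
]
entities = [
    {"surface": "GmbH", "start": 44, "end": 48},
    {"surface": "Berlin", "start": 57, "end": 63},
]
print(overlapping_pairs(ner_entities, entities))
```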
Response Format: Tags
{
  "tags": [
    {
      "confidence": 0.9989658313414402,
      "term": "TXTWerk"
    },
    {
      "confidence": 0.9782419755349671,
      "term": "Entitäten"
    },
    {
      "confidence": 0.9732933133596776,
      "term": "Textmining-API"
    },
    {
      "confidence": 0.9365462323616698,
      "term": "Neofonie GmbH"
    },
    {
      "confidence": 0.8993179739843555,
      "term": "Schlagwörter"
    },
    {
      "confidence": 0.8814831569459867,
      "term": "Berlin"
    },
    {
      "confidence": 0.874798029178814,
      "term": "Fullservice-Provider"
    }
  ]
}
Description of each field:
term | The keyword found. |
confidence | Confidence value of the keyword. Always between 0 and 1. |
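Because the tag confidence is bounded, a fixed cut-off is a simple way to keep only the strongest keywords. A Python sketch; the threshold of 0.9 is an arbitrary illustrative choice, not a recommendation from the API, and the function name is hypothetical.

```python
def strong_keywords(tags: list, min_confidence: float = 0.9) -> list:
    """Return the terms whose confidence reaches the given threshold."""
    return [tag["term"] for tag in tags if tag["confidence"] >= min_confidence]


# A few tags from the example response.
tags = [
    {"confidence": 0.9989658313414402, "term": "TXTWerk"},
    {"confidence": 0.9782419755349671, "term": "Entitäten"},
    {"confidence": 0.8993179739843555, "term": "Schlagwörter"},
    {"confidence": 0.8814831569459867, "term": "Berlin"},
]
print(strong_keywords(tags))
```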
Response Format: Lexicon Tags
{
  "text": "TXTWerk ist die Textmining-API der Neofonie GmbH, ein in Berlin ansässiger Fullservice-Provider. Neben Entitäten und Schlagwörtern erkennt TXTWerk in Texten unter anderem auch Datumsangaben (z.B. 08.09.2023) und Maßzahlen (z.B. 24h) und ordnet jeden Text einer passenden Textklasse zu.",
  "lexiconTags": [
    {
      "id": "[unique id]",
      "tag": "ansässig",
      "score": 7.6243725,
      "analyzed": "ansässig",
      "observedSurfaces": [
        {
          "start": 64,
          "end": 74,
          "type": "TAG",
          "observedSurface": "ansässiger",
          "analyzed": "ansässig"
        }
      ]
    }
  ]
}
Description of each field:
id | Unique ID of the tag in the user lexicon. |
tag | Unique label of the tag. |
score | Value representing the quality of the match, as determined by the matching algorithm. |
analyzed | Tags are algorithmically converted into different (synonymous) word forms. This field lists the word form of the tag that matched. |
observedSurfaces | Matches of the tag in the text. |
observedSurfaces.start | The start position of the tag match in the text. |
observedSurfaces.end | The end position of the tag match in the text. |
observedSurfaces.type | Type of match. Possible values are "TAG", "SYNONYM" and "GENDER". |
observedSurfaces.observedSurface | The surface form of the match in the text. |
observedSurfaces.analyzed | All tokens of the text are algorithmically converted into different (synonymous) word forms. This field lists the word form of the token that matched. |
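Each lexicon tag groups its occurrences under `observedSurfaces`. The Python sketch below collects, per tag, the surface forms found in the text together with the type of match. The function name is illustrative and the input is the entry from the example above.

```python
def surfaces_by_tag(lexicon_tags: list) -> dict:
    """Map each tag label to a list of (observed surface, match type) pairs."""
    result = {}
    for entry in lexicon_tags:
        found = result.setdefault(entry["tag"], [])
        for hit in entry.get("observedSurfaces", []):
            found.append((hit["observedSurface"], hit["type"]))
    return result


lexicon_tags = [
    {
        "id": "[unique id]",
        "tag": "ansässig",
        "score": 7.6243725,
        "analyzed": "ansässig",
        "observedSurfaces": [
            {"start": 64, "end": 74, "type": "TAG",
             "observedSurface": "ansässiger", "analyzed": "ansässig"},
        ],
    },
]
print(surfaces_by_tag(lexicon_tags))
```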
Response Format: Dates
{
  "dates": [
    {
      "surface": "08.09.2023",
      "start": 196,
      "end": 206,
      "dateStart": {
        "day": 8,
        "month": 9,
        "year": 2023,
        "bc": false
      },
      "dateEnd": {
        "day": 8,
        "month": 9,
        "year": 2023,
        "bc": false
      }
    }
  ]
}
Description of each field:
surface | The surface form of the date in the text. |
start | The start position of the date in the text. |
end | The end position of the date in the text. |
dateStart | The start date. A date is always represented as a time period, so the start and end date may have the same value. |
dateEnd | The end date. |
day | The day of the start or end date. Possible values are 1-31. |
month | The month of the start or end date. Possible values are 1-12. |
year | The year of the start or end date. |
bc | Describes whether the date refers to the time before Christ. Possible values are true and false. |
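The `dateStart` and `dateEnd` objects map naturally onto a date type. A Python sketch, assuming `day`, `month` and `year` are always present (the descriptions above do not say whether partial dates can occur). Python's `datetime.date` cannot represent years before 1, so dates flagged with `bc` are returned as `None` here. The helper names are illustrative.

```python
from datetime import date
from typing import Optional, Tuple


def to_date(parts: dict) -> Optional[date]:
    """Convert a dateStart/dateEnd object to a date, or None for BC dates."""
    if parts.get("bc"):
        return None  # datetime.date has no representation for BC years
    return date(parts["year"], parts["month"], parts["day"])


def to_period(entry: dict) -> Tuple[Optional[date], Optional[date]]:
    """Return the (start, end) period of one entry of the 'dates' block."""
    return to_date(entry["dateStart"]), to_date(entry["dateEnd"])


# Entry from the example response.
entry = {
    "surface": "08.09.2023", "start": 196, "end": 206,
    "dateStart": {"day": 8, "month": 9, "year": 2023, "bc": False},
    "dateEnd": {"day": 8, "month": 9, "year": 2023, "bc": False},
}
print(to_period(entry))
```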
Response Format: Categories
{
  "categories": [
    {
      "confidence": 0.9999914614615732,
      "label": "internet"
    },
    {
      "confidence": 8.5340630740002E-6,
      "label": "kultur"
    },
    {
      "confidence": 3.4390082461387908E-9,
      "label": "auto+technik"
    },
    {
      "confidence": 7.942384268635301E-10,
      "label": "wirtschaft"
    },
    {
      "confidence": 1.1799574174439144E-10,
      "label": "reisen"
    },
    {
      "confidence": 8.06441429999464E-11,
      "label": "wissenschaft"
    },
    {
      "confidence": 4.031349737157026E-11,
      "label": "politik"
    },
    {
      "confidence": 3.152736753221788E-12,
      "label": "sport"
    }
  ]
}
Description of each field:
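label | The name of the category. The categories are Politics, Business, Car & Technology, Internet, Culture, Travel, Sports, Human interest, and Science. As the example above shows, the response uses lowercase German labels such as "politik", "wirtschaft", "auto+technik", "internet", "kultur", "reisen", "sport", and "wissenschaft". |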
confidence | Confidence value of the category. Always between 0 and 1. |
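In most cases only the best-scoring category is of interest. A short Python sketch that selects it explicitly rather than relying on the order of the list; the function name is illustrative.

```python
from typing import Optional


def top_category(categories: list) -> Optional[dict]:
    """Return the category with the highest confidence, or None if empty."""
    return max(categories, key=lambda c: c["confidence"], default=None)


# Three categories from the example response.
categories = [
    {"confidence": 8.5340630740002E-6, "label": "kultur"},
    {"confidence": 0.9999914614615732, "label": "internet"},
    {"confidence": 3.152736753221788E-12, "label": "sport"},
]
best = top_category(categories)
print(best["label"], best["confidence"])
```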
Response Format: Measures
{
  "measures": [
    {
      "start": 228,
      "end": 231,
      "text": "24h",
      "valueString": "24",
      "unitString": "h",
      "type": "TIME",
      "alias": [
        "24 h",
        "24h",
        "24Stunde",
        "24 Stunde",
        "24 Stunden",
        "24Stunden"
      ]
    }
  ]
}
Description of each field:
start | The start position of the measurement in the text. |
end | The end position of the measurement in the text. |
text | The measurement string, exactly as it occurs in the text. |
valueString | The value of the measurement as a string, exactly as it occurs in the text. |
unitString | The unit as a string, exactly as it occurs in the text. |
unitCanonical | Currencies only. The three-letter code of the currency, regardless of the concrete unit string in the text. |
type | The type of measurement. Possible values are "LENGTH", "AREA", "MASS", "TEMPERATURE", "VOLTAGE", "AMPERAGE", "RESISTANCE", "CHARGE", "CAPACITY", "CONDUCTANCE", "INDUCTANCE", "MAGNETIC_STRENGTH", "POWER", "ENERGY", "FORCE", "PRESSURE", "FREQUENCY", "VOLUME", "LUMINOSITY", "ILLUMINANCE", "SPIN", "SUBSTANCE", "RADIOACTIVITY", "CURRENCY", "TIME", and "UNKNOWN". |
alias | Further variants of the measurement string (with and without space, units with and without abbreviation, conversions). |
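Since `valueString` is returned exactly as it occurs in the text, it has to be parsed before it can be used in calculations. The Python sketch below assumes that a comma in the value is a decimal separator, as is usual in German text. That is an assumption about the input, not documented API behavior, and values it cannot parse are returned as `None`. The function name is illustrative.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def numeric_value(measure: dict) -> Optional[Decimal]:
    """Parse the valueString of a measure, or return None if it is not numeric."""
    # Assumption: a comma is a decimal separator ("1,5" means 1.5).
    raw = measure["valueString"].strip().replace(",", ".")
    try:
        return Decimal(raw)
    except InvalidOperation:
        return None


# Measure from the example response, reduced to the relevant fields.
measure = {"text": "24h", "valueString": "24", "unitString": "h", "type": "TIME"}
print(numeric_value(measure), measure["unitString"])
```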
Response Format: Fingerprints
{
  "fingerprints": [
    7493129, 18632078, 48467713, 64740551, 61803666,
    57602, 20683602, 7169662, 124073776, 1324512,
    48689911, 63618400, 82739683, 57114900, 86498997,
    5531749, 43615458, 63266708, 35312651, 1767346,
    166345084, 20994017, 10618634, 35187378, 52012568,
    62221932, 101283997, 194238108, 24943142, 48857582,
    214343186, 8807040, 11737208, 29004557, 33563369,
    23510317, 54409541, 58494605, 55886581, 88208507,
    10609552, 7042020, 21855281, 9560326, 22894461,
    19569052, 11695122, 59192088, 11647472, 25992587
  ]