API Documentation - Request
The request can be sent as a GET or POST request to the URL
https://api.txtwerk.de/rest/txt/analyzer
The document and the requested services are passed as parameters.
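As a rough illustration of passing the document and the services as parameters, the following Python sketch builds the URL for a GET request. The helper name `build_get_url` is hypothetical, and authentication (an `X-Api-Key` header in the curl examples in this document) is deliberately left out. The sketch only constructs the URL and sends nothing.

```python
from urllib.parse import urlencode

ANALYZER_URL = "https://api.txtwerk.de/rest/txt/analyzer"


def build_get_url(text, services):
    """Return the analyzer URL with document and services as query parameters.

    `build_get_url` is a hypothetical helper, not part of the API.
    """
    # The services parameter is a comma-separated list of service names.
    query = urlencode({"text": text, "services": ",".join(services)})
    return f"{ANALYZER_URL}?{query}"


url = build_get_url("Angela Merkel wurde 1954 in Hamburg geboren.", ["entities", "dates"])
print(url)
```

For longer texts a POST request is the better fit, since the whole document would otherwise have to be encoded into the URL.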
Document
The document to be annotated can be passed directly as text.
Alternatively, you can specify the URL of a website to be analyzed. In this case, the site is crawled, and its main text content is determined and processed. Foreign elements, such as navigation or teaser text, are removed.
To weight parts of the input, provide the text as a JSON string: the relevance of each result is multiplied by the weight of the input text it comes from. The response is structured according to the provided input texts.
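A weighted JSON document can be assembled programmatically, for instance as in this minimal Python sketch. The helper name `build_document_param` is hypothetical. The field names `text` and `weight`, and the choice to write weights as strings, mirror the curl example in this document; whether numeric weights are also accepted is not stated here.

```python
import json


def build_document_param(parts):
    """Serialize (text, weight) pairs into a JSON string for the `document` parameter.

    Hypothetical helper; weights are written as strings, as in the curl example.
    """
    return json.dumps(
        [{"text": text, "weight": str(weight)} for text, weight in parts],
        ensure_ascii=False,  # keep umlauts and other non-ASCII characters readable
    )


document = build_document_param([("Titel", 2.0), ("Teaser", 1.5)])
print(document)
```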
Services
The document can be analyzed with different techniques. Choose from the following services:
Service | Description |
---|---|
entities | Named entities based on the Wikidata ontology. |
tags | Keywords that appear in the text and describe and summarize its content. |
categories | Assignment of the text to the categories politics, business, cars & technology, internet, culture, travel, sports, human interest, science. |
dates | Dates and periods. |
measures | Measurements that occur in the text. |
authors | Authors of an article provided as an HTML document. |
fingerprints | Fingerprints of the text for near-duplicate detection. |
lexiconEntities | Named entities based on a lexicon maintained in TXTWerk. |
lexiconTags | Keywords based on a lexicon maintained in TXTWerk. |
nerEntities | Flair/NER entities. |
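Since the services are requested by name, a client may want to check the names before sending a request. The following Python sketch does that with a hypothetical helper, `build_services_param`. The list of names is copied from this document, and the check runs purely on the client side.

```python
SUPPORTED_SERVICES = (
    "entities", "tags", "categories", "dates", "measures", "authors",
    "fingerprints", "lexiconEntities", "lexiconTags", "nerEntities",
)


def build_services_param(services):
    """Validate service names and join them into a comma-separated string.

    Hypothetical client-side helper, not part of the API.
    """
    if not services:
        raise ValueError("at least one service is required")
    unknown = [name for name in services if name not in SUPPORTED_SERVICES]
    if unknown:
        raise ValueError(f"unsupported services: {unknown}")
    return ",".join(services)


print(build_services_param(["entities", "tags"]))
```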
Service Control
Additional parameters are available for individual services; they affect the analysis or the result.
Example Request
Example of a POST request where the document is passed directly as text:
curl "https://api.txtwerk.de/rest/txt/analyzer" \
  -H "X-Api-Key: ..." \
  -d text='Angela Merkel wurde am 17. Juli 1954 in Hamburg als Angela Dorothea Kasner geboren.' \
  -d services='entities'
Example of a POST request where an HTML file is passed directly as an input parameter:
curl "https://api.txtwerk.de/rest/txt/analyzer" \
  -H "X-Api-Key: ..." \
  -F htmlFile='@' \
  -F services='entities'
Example of a POST request where the document is provided as JSON:
curl "https://api.txtwerk.de/rest/txt/analyzer" \
  -H "X-Api-Key: ..." \
  -d services='entities' \
  -d document='[{ "text": "Titel", "weight": "2.0" }, { "text": "Teaser", "weight": "1.5" }]'
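For readers who prefer a script over curl, the first example (document passed directly as text) might be written in Python roughly as follows. This is a sketch using only the standard library. The helper name `build_post_request` and the placeholder `YOUR_API_KEY` are assumptions. The request is only constructed, not sent, and the response format is not covered here.

```python
from urllib.parse import urlencode
from urllib.request import Request

ANALYZER_URL = "https://api.txtwerk.de/rest/txt/analyzer"


def build_post_request(api_key, text, services):
    """Build a form-encoded POST request like `curl -d text=... -d services=...`.

    Hypothetical helper; it only constructs the request object.
    """
    body = urlencode({"text": text, "services": ",".join(services)}).encode("utf-8")
    return Request(
        ANALYZER_URL,
        data=body,
        headers={"X-Api-Key": api_key},
        method="POST",
    )


request = build_post_request(
    "YOUR_API_KEY",  # placeholder, replace with a real key
    "Angela Merkel wurde am 17. Juli 1954 in Hamburg als Angela Dorothea Kasner geboren.",
    ["entities"],
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```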
Overview of Parameters
Parameter | Area | Description |
---|---|---|
text | Document | Contains the document to be annotated as text. For longer texts, send the request as a POST request and pass the text in the request body. Mandatory: either text, htmlFile, or document. Values: text. |
htmlFile | Document | Contains the document to be annotated as HTML text. Mandatory: either text, htmlFile, or document. Values: HTML file text. |
document | Document | Contains the document to be annotated as a JSON string. Mandatory: either text, htmlFile, or document. Values: JSON string. |
title | Document | Title of the document. Specifying a title can improve the result; it is only applied to the following service: tags. Mandatory: no. Values: text. |
teaser | Document | Teaser of the document. Adding a teaser can improve the result; it is only applied to the following service: tags. Mandatory: no. Values: text. |
services | Services | List of requested services. Mandatory: yes. Values: comma-separated list that contains at least one of the supported services: [entities, tags, categories, dates, measures, authors, fingerprints, lexiconEntities, lexiconTags, nerEntities]. |
language | Service control | Language of the document. Setting this parameter activates language-dependent components. Mandatory: no (the language is auto-detected if omitted). Values: 'en' or 'de'. |
ntags | Service control | Maximum number of keywords (tags) to return. Service: tags. Mandatory: no, default: 10. Values: non-negative integer. |
ncategories | Service control | Number of returned categories. Service: categories. Mandatory: no. Values: non-negative integer. |
nentities | Service control | Number of returned entities. Service: entities. Mandatory: no. Values: non-negative integer. |
nerMinConfidence | Service control | Threshold for the entity confidence. Service: entities. Mandatory: no. Values: non-negative integer. |
nerMinRelevance | Service control | Threshold for the entity relevance. Service: entities. Mandatory: no. Values: non-negative integer. |
nerFormat | Service control | Response format for the entities. Service: entities. Mandatory: no. Values: 'list', 'aggregate' (aggregated list of entities, sorted by relevance), 'candidates' (a list of disambiguation candidates is delivered for each possible entity). |
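The constraints in the parameter overview can be checked on the client before a request is sent. The Python sketch below is one way to do so. The function `validate_params` is hypothetical, and it reads "either text or htmlFile or document" as "exactly one of the three", which is an assumption. Only a subset of the parameters is checked.

```python
INPUT_PARAMS = ("text", "htmlFile", "document")
COUNT_PARAMS = ("ntags", "ncategories", "nentities")
LANGUAGES = ("en", "de")
NER_FORMATS = ("list", "aggregate", "candidates")


def validate_params(params):
    """Return a list of problems found in a parameter dict (empty if none).

    Hypothetical client-side check; assumes exactly one input parameter is given.
    """
    problems = []
    given = [name for name in INPUT_PARAMS if params.get(name)]
    if len(given) != 1:
        problems.append("exactly one of text, htmlFile, document is required")
    if not params.get("services"):
        problems.append("services is mandatory")
    for name in COUNT_PARAMS:
        value = params.get(name)
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if value is not None and (
            isinstance(value, bool) or not isinstance(value, int) or value < 0
        ):
            problems.append(f"{name} must be a non-negative integer")
    if params.get("language") not in (None, *LANGUAGES):
        problems.append("language must be 'en' or 'de'")
    if params.get("nerFormat") not in (None, *NER_FORMATS):
        problems.append("nerFormat must be 'list', 'aggregate' or 'candidates'")
    return problems


print(validate_params({"text": "Beispieltext", "services": "tags", "ntags": 5}))
print(validate_params({"services": "entities", "nerFormat": "xml"}))
```

Returning a list of problems rather than raising on the first one lets a caller report all issues at once.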