📦 Data Models & Schemas

This guide documents all request and response data structures for the Sentor ML API.

Request Models

PredictInput

Schema for sentiment prediction requests.

Fields:

docs (array, required): Array of documents to analyze

Document Object:

doc_id (string, required): Unique identifier for the document
doc (string, required): Text content to analyze
entities (array of strings, required): Entities to focus analysis on

Example:

{
  "docs": [
    {
      "doc_id": "doc1",
      "doc": "Apple's new iPhone is amazing!",
      "entities": ["Apple", "iPhone"]
    }
  ]
}
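
As a minimal sketch of how a client might assemble this payload, the following Python builds a PredictInput body from typed records. The names `PredictDocument` and `build_predict_input` are hypothetical helpers for illustration, not part of the Sentor API, and the type checks only mirror the field types listed above.

```python
from dataclasses import asdict, dataclass
import json


@dataclass
class PredictDocument:
    doc_id: str          # unique identifier for the document
    doc: str             # text content to analyze
    entities: list[str]  # entities to focus the analysis on


def build_predict_input(documents: list[PredictDocument]) -> dict:
    """Return a PredictInput-shaped dict after basic type checks."""
    for d in documents:
        if not isinstance(d.doc_id, str) or not isinstance(d.doc, str):
            raise TypeError("doc_id and doc must be strings")
        if not all(isinstance(e, str) for e in d.entities):
            raise TypeError("entities must be an array of strings")
    return {"docs": [asdict(d) for d in documents]}


payload = build_predict_input(
    [PredictDocument("doc1", "Apple's new iPhone is amazing!", ["Apple", "iPhone"])]
)
print(json.dumps(payload, indent=2))
```

The resulting dict serializes to the same JSON shape as the example above.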

ClusteringRequest

Schema for document clustering requests.

Fields:

documents (array, required): Array of documents to cluster (minimum 5 required)

Document Object:

doc_id (string, required): Unique identifier
text (string, required): Document text
entities (array of strings, required): Associated entities

Example (abbreviated to two documents; a real request needs at least 5):

{
  "documents": [
    {
      "doc_id": "doc1",
      "text": "Apple announced new iPhone features.",
      "entities": ["Apple", "iPhone"]
    },
    {
      "doc_id": "doc2",
      "text": "Samsung launched Galaxy smartphone.",
      "entities": ["Samsung", "Galaxy"]
    }
  ]
}
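
Because the schema requires at least 5 documents, a client may want to check the count before sending. The sketch below shows one way to do that; `build_clustering_request` is a hypothetical helper, and the placeholder documents are invented purely for illustration.

```python
MIN_CLUSTER_DOCS = 5  # minimum stated in the ClusteringRequest schema
REQUIRED_FIELDS = ("doc_id", "text", "entities")


def build_clustering_request(documents: list[dict]) -> dict:
    """Return a ClusteringRequest-shaped dict or raise ValueError."""
    if len(documents) < MIN_CLUSTER_DOCS:
        raise ValueError(
            f"clustering needs at least {MIN_CLUSTER_DOCS} documents, got {len(documents)}"
        )
    for index, doc in enumerate(documents):
        missing = [field for field in REQUIRED_FIELDS if field not in doc]
        if missing:
            raise ValueError(f"document {index} is missing {missing}")
    return {"documents": documents}


# Placeholder documents, invented purely for illustration.
sample_docs = [
    {"doc_id": f"doc{i}", "text": f"Placeholder text {i}.", "entities": ["Example"]}
    for i in range(1, 6)
]
request = build_clustering_request(sample_docs)
print(len(request["documents"]))
```

Checking locally avoids a round trip that would otherwise end in an `insufficient_documents` error.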

TopicNamingRequest

Schema for topic name generation requests.

Fields:

cluster_id (integer, required): ID of the cluster
documents (array, required): Documents in the cluster
entities (array of strings, optional): Entities found in cluster
top_words (array of strings, optional): Top words characterizing the cluster

Document Object:

doc_id (string, required): Document identifier
text (string, required): Document text
entities (array of strings, required): Document entities
cluster_probability (number, required): Probability of belonging to cluster (0-1)

Example:

{
  "cluster_id": 0,
  "documents": [
    {
      "doc_id": "doc1",
      "text": "Apple announced new iPhone features.",
      "entities": ["Apple", "iPhone"],
      "cluster_probability": 0.92
    }
  ],
  "entities": ["Apple", "iPhone", "iOS"],
  "top_words": ["apple", "iphone", "features", "technology"]
}
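
The fields of a TopicNamingRequest line up with those of a cluster object in a ClusteringResponse, so one plausible workflow is to derive the request from a cluster directly. The sketch below assumes that workflow; `topic_naming_request_from_cluster` is a hypothetical helper name.

```python
def topic_naming_request_from_cluster(cluster: dict) -> dict:
    """Map one cluster object onto the TopicNamingRequest shape."""
    request = {
        "cluster_id": cluster["cluster_id"],  # required
        "documents": cluster["documents"],    # required
    }
    # entities and top_words are optional, so copy them only when present.
    for optional_field in ("entities", "top_words"):
        if cluster.get(optional_field):
            request[optional_field] = cluster[optional_field]
    return request


cluster = {
    "cluster_id": 0,
    "documents": [
        {
            "doc_id": "doc1",
            "text": "Apple announced new iPhone features.",
            "entities": ["Apple", "iPhone"],
            "cluster_probability": 0.92,
        }
    ],
    "top_words": ["apple", "iphone", "features", "technology"],
    "entities": ["Apple", "iPhone", "iOS"],
}
topic_request = topic_naming_request_from_cluster(cluster)
print(sorted(topic_request))
```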

Response Models

PredictionResponse

Schema for sentiment prediction responses.

Fields:

results (array): Array of prediction results

Result Object:

doc_id (string): Document identifier
predicted_class (integer): Predicted class (0=negative, 1=neutral, 2=positive)
predicted_label (string): Label name ("negative", "neutral", "positive")
probabilities (object): Probability scores for each sentiment
- negative (number): Probability of negative sentiment (0-1)
- neutral (number): Probability of neutral sentiment (0-1)
- positive (number): Probability of positive sentiment (0-1)
details (array): Sentence-level breakdown

Detail Object:

sentence_index (integer): Index of the sentence
sentence_text (string): Text of the sentence
predicted_class (integer): Predicted class for this sentence
predicted_label (string): Label for this sentence
probabilities (object): Probabilities for this sentence

Example:

{
  "results": [
    {
      "doc_id": "doc1",
      "predicted_class": 2,
      "predicted_label": "positive",
      "probabilities": {
        "negative": 0.0001,
        "neutral": 0.0003,
        "positive": 0.9996
      },
      "details": [
        {
          "sentence_index": 0,
          "sentence_text": "Apple's new iPhone is amazing!",
          "predicted_class": 2,
          "predicted_label": "positive",
          "probabilities": {
            "negative": 0.0001,
            "neutral": 0.0003,
            "positive": 0.9996
          }
        }
      ]
    }
  ]
}
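
To illustrate how a client might read this structure, the sketch below extracts the label and its probability for each result and cross-checks `predicted_class` against `predicted_label` using the class mapping listed above. The function name `summarize_result` is hypothetical.

```python
# Class-to-label mapping as documented for predicted_class.
LABELS = {0: "negative", 1: "neutral", 2: "positive"}


def summarize_result(result: dict) -> tuple[str, str, float]:
    """Return (doc_id, label, probability of that label) for one result."""
    label = LABELS[result["predicted_class"]]
    if label != result["predicted_label"]:
        raise ValueError(
            f"class {result['predicted_class']} does not match label "
            f"{result['predicted_label']!r}"
        )
    return result["doc_id"], label, result["probabilities"][label]


response = {
    "results": [
        {
            "doc_id": "doc1",
            "predicted_class": 2,
            "predicted_label": "positive",
            "probabilities": {"negative": 0.0001, "neutral": 0.0003, "positive": 0.9996},
            "details": [],
        }
    ]
}
for result in response["results"]:
    print(summarize_result(result))
```

The same function works on entries of `details`, provided a `doc_id` key is added, since sentence-level objects carry the same class, label, and probability fields.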

ClusteringResponse

Schema for clustering responses.

Fields:

total_clusters (integer): Total number of clusters found
min_clusters (integer): Natural cluster count from HDBSCAN
outliers (integer): Number of outlier documents
clusters (array): Array of cluster objects

Cluster Object:

cluster_id (integer): Cluster identifier
documents (array): Documents belonging to this cluster
top_words (array of strings): Most characteristic words
entities (array of strings): Entities found in cluster documents

Cluster Document Object:

doc_id (string): Document identifier
text (string): Document text
entities (array of strings): Document entities
cluster_probability (number): Probability of belonging to this cluster (0-1)

Example (only one of the three clusters is shown):

{
  "total_clusters": 3,
  "min_clusters": 2,
  "outliers": 1,
  "clusters": [
    {
      "cluster_id": 0,
      "documents": [
        {
          "doc_id": "doc1",
          "text": "Apple announced new iPhone features.",
          "entities": ["Apple", "iPhone"],
          "cluster_probability": 0.92
        },
        {
          "doc_id": "doc3",
          "text": "Apple revealed iOS updates.",
          "entities": ["Apple", "iOS"],
          "cluster_probability": 0.88
        }
      ],
      "top_words": ["apple", "iphone", "ios", "features", "updates"],
      "entities": ["Apple", "iPhone", "iOS"]
    }
  ]
}
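
As a sketch of consuming this response, the code below groups document IDs by cluster and keeps only members whose `cluster_probability` meets a cutoff. The helper name `confident_members` and the 0.9 threshold are illustrative assumptions, not values recommended by the API.

```python
def confident_members(response: dict, threshold: float = 0.9) -> dict[int, list[str]]:
    """Map each cluster_id to the doc_ids at or above the probability threshold."""
    return {
        cluster["cluster_id"]: [
            doc["doc_id"]
            for doc in cluster["documents"]
            if doc["cluster_probability"] >= threshold
        ]
        for cluster in response["clusters"]
    }


clustering_response = {
    "total_clusters": 3,
    "min_clusters": 2,
    "outliers": 1,
    "clusters": [
        {
            "cluster_id": 0,
            "documents": [
                {"doc_id": "doc1", "text": "Apple announced new iPhone features.",
                 "entities": ["Apple", "iPhone"], "cluster_probability": 0.92},
                {"doc_id": "doc3", "text": "Apple revealed iOS updates.",
                 "entities": ["Apple", "iOS"], "cluster_probability": 0.88},
            ],
            "top_words": ["apple", "iphone", "ios", "features", "updates"],
            "entities": ["Apple", "iPhone", "iOS"],
        }
    ],
}
print(confident_members(clustering_response))
```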

TopicNamingResponse

Schema for topic naming responses.

Fields:

cluster_id (integer): ID of the cluster
topic_name (string): Generated descriptive topic name
generated_using (string): API key source ("company_key" or "customer_key")

Example:

{
  "cluster_id": 0,
  "topic_name": "Apple iPhone and iOS Technology Updates",
  "generated_using": "company_key"
}

HealthResponse

Schema for health check responses.

Fields:

status (string): Service health status ("healthy" or "unhealthy")
version (string): API version
models (object, optional): Model information
- sentiment (string): Sentiment model version
uptime (string, optional): Service uptime percentage

Example:

{
  "status": "healthy",
  "version": "1.0.0",
  "models": {
    "sentiment": "v2.1"
  },
  "uptime": "99.99%"
}
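
Since `models` and `uptime` are optional, client code should not assume they are present. A minimal sketch of defensive access follows; both helper names are hypothetical.

```python
from typing import Optional


def is_healthy(health: dict) -> bool:
    """True only when status is exactly "healthy"."""
    return health.get("status") == "healthy"


def sentiment_model_version(health: dict) -> Optional[str]:
    """Return the sentiment model version, or None when models is absent."""
    models = health.get("models") or {}
    return models.get("sentiment")


health = {
    "status": "healthy",
    "version": "1.0.0",
    "models": {"sentiment": "v2.1"},
    "uptime": "99.99%",
}
print(is_healthy(health), sentiment_model_version(health))
```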

Error Response Models

Standard Error Response

All API errors follow this consistent structure.

Fields:

error (object): Error details
- code (string): Error code
- message (string): Human-readable error message
- status (integer): HTTP status code
- details (object, optional): Additional error context
- documentation_url (string, optional): Link to error documentation

Example:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded",
    "status": 429,
    "details": {
      "limit": 3,
      "remaining": 0,
      "reset": 1234567890,
      "retry_after": 30
    },
    "documentation_url": "https://sentor.app/docs/api/rate-limits"
  }
}

Common Error Codes

| Code | HTTP Status | Description |
|------|-------------|-------------|
| `invalid_request` | 400 | Malformed request or missing required fields |
| `invalid_api_key` | 401 | Invalid or missing API key |
| `rate_limit_exceeded` | 429 | Rate limit exceeded for your plan |
| `insufficient_documents` | 400 | Too few documents for clustering (minimum 5 required) |
| `internal_error` | 500 | Internal server error |
| `service_unavailable` | 503 | Service temporarily unavailable |
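
One way a client could act on this error structure is to decide whether and when to retry. The sketch below makes two assumptions that the documentation does not state: that `rate_limit_exceeded` and `service_unavailable` are the codes worth retrying, and that `retry_after` is expressed in seconds. The helper name `retry_delay` is hypothetical.

```python
from typing import Optional

# Assumption: only these codes are treated as retryable in this sketch.
RETRYABLE_CODES = {"rate_limit_exceeded", "service_unavailable"}


def retry_delay(body: dict, default: float = 1.0) -> Optional[float]:
    """Return a wait time (assumed seconds) for retryable errors, else None."""
    error = body["error"]
    if error["code"] not in RETRYABLE_CODES:
        return None
    details = error.get("details") or {}  # details is optional
    return float(details.get("retry_after", default))


rate_limited = {
    "error": {
        "code": "rate_limit_exceeded",
        "message": "Rate limit exceeded",
        "status": 429,
        "details": {"limit": 3, "remaining": 0, "reset": 1234567890, "retry_after": 30},
    }
}
print(retry_delay(rate_limited))
```

Errors such as `invalid_api_key` return `None` here, signalling that the request should be fixed rather than repeated.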

Query Parameters

Language Parameter

Used across multiple endpoints to specify content language.

Parameter: language
Type: String (enum)
Values: en (English), nl (Dutch)
Default: en
Required: No

Example:

POST /predicts?language=nl
POST /predicts/cluster?language=en
POST /predicts/topic-name?language=nl
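
A minimal sketch of building these URLs in client code follows. `BASE_URL` is a placeholder, since the real host is not given in this guide, and `endpoint_url` is a hypothetical helper that rejects values outside the documented enum.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # placeholder host, not the real API address
SUPPORTED_LANGUAGES = {"en", "nl"}    # values documented for the language parameter


def endpoint_url(path: str, language: str = "en") -> str:
    """Append a validated language query parameter to an endpoint path."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    return f"{BASE_URL}{path}?{urlencode({'language': language})}"


print(endpoint_url("/predicts/cluster", "nl"))
```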

Headers

Required Headers

x-api-key

Description: Your Sentor API key for authentication
Type: String
Required: Yes (for all endpoints)
Example: x-api-key: sk_live_abc123...

Content-Type

Description: Request content type
Type: String
Required: Yes (for POST requests)
Value: application/json

Optional Headers

X-Google-API-Key

Description: Your own Google API key, used for topic naming; requests that supply it are priced lower
Type: String
Required: No
Endpoint: /predicts/topic-name only
Example: X-Google-API-Key: AIza...
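
The header rules above can be sketched as a small helper that always sets the two required headers and attaches the Google key only for the topic naming endpoint. `build_headers` is a hypothetical name, and the key values shown are dummies.

```python
from typing import Optional

TOPIC_NAME_PATH = "/predicts/topic-name"  # only endpoint that accepts X-Google-API-Key


def build_headers(api_key: str, path: str, google_api_key: Optional[str] = None) -> dict:
    """Return request headers for a POST to the given endpoint path."""
    headers = {
        "x-api-key": api_key,                # required for all endpoints
        "Content-Type": "application/json",  # required for POST requests
    }
    if google_api_key is not None:
        if path != TOPIC_NAME_PATH:
            raise ValueError("X-Google-API-Key applies to /predicts/topic-name only")
        headers["X-Google-API-Key"] = google_api_key
    return headers


print(build_headers("dummy-sentor-key", TOPIC_NAME_PATH, google_api_key="dummy-google-key"))
```

In real use, both keys would typically be read from environment variables or a secrets store rather than written into source code.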

Support

Email: sentor@nikx.one
Discord: Join our community