📦 Data Models & Schemas

This guide documents all request and response data structures for the Sentor ML API.

Request Models

PredictInput

Schema for sentiment prediction requests.

Fields:

  • docs (array, required): Array of documents to analyze

Document Object:

  • doc_id (string, required): Unique identifier for the document
  • doc (string, required): Text content to analyze
  • entities (array of strings, required): Entities to focus analysis on

Example:

{
  "docs": [
    {
      "doc_id": "doc1",
      "doc": "Apple's new iPhone is amazing!",
      "entities": ["Apple", "iPhone"]
    }
  ]
}
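
As a rough sketch of how a client might assemble this payload, the Python below models the Document Object as a dataclass and serializes it into the PredictInput shape. The names `PredictDocument` and `build_predict_input` are hypothetical helpers, not part of the Sentor API, and the non-empty check on `doc_id` and `doc` is an assumption about what "required" means here.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class PredictDocument:
    """One entry of the `docs` array (hypothetical client-side helper)."""
    doc_id: str
    doc: str
    entities: List[str]


def build_predict_input(documents: List[PredictDocument]) -> dict:
    """Return a PredictInput payload after basic required-field checks."""
    for d in documents:
        # Assumption: required string fields must also be non-empty.
        if not d.doc_id or not d.doc:
            raise ValueError("doc_id and doc are required")
        if not isinstance(d.entities, list):
            raise ValueError("entities must be an array of strings")
    return {"docs": [asdict(d) for d in documents]}


payload = build_predict_input(
    [PredictDocument("doc1", "Apple's new iPhone is amazing!", ["Apple", "iPhone"])]
)
print(json.dumps(payload, indent=2))
```

The printed JSON has the same structure as the example above and can be sent as the request body.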

ClusteringRequest

Schema for document clustering requests.

Fields:

  • documents (array, required): Array of documents to cluster (minimum 5 required)

Document Object:

  • doc_id (string, required): Unique identifier
  • text (string, required): Document text
  • entities (array of strings, required): Associated entities

Example (shortened to two documents for readability; a real request needs at least 5):

{
  "documents": [
    {
      "doc_id": "doc1",
      "text": "Apple announced new iPhone features.",
      "entities": ["Apple", "iPhone"]
    },
    {
      "doc_id": "doc2",
      "text": "Samsung launched Galaxy smartphone.",
      "entities": ["Samsung", "Galaxy"]
    }
  ]
}
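
Because the documents array has a stated minimum of 5, it may be worth checking a payload locally before sending it. The sketch below is one way to do that; `validate_clustering_request` is a hypothetical helper, and the server's own validation may be stricter than what is shown.

```python
MIN_CLUSTER_DOCS = 5  # minimum stated in the `documents` field description


def validate_clustering_request(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    documents = payload.get("documents")
    if not isinstance(documents, list):
        return ["'documents' must be an array"]
    problems = []
    if len(documents) < MIN_CLUSTER_DOCS:
        problems.append(
            f"need at least {MIN_CLUSTER_DOCS} documents, got {len(documents)}"
        )
    for i, doc in enumerate(documents):
        # All three Document Object fields are marked as required.
        for key in ("doc_id", "text", "entities"):
            if key not in doc:
                problems.append(f"documents[{i}] is missing '{key}'")
    return problems


short_request = {
    "documents": [
        {"doc_id": "doc1", "text": "Apple announced new iPhone features.",
         "entities": ["Apple", "iPhone"]},
        {"doc_id": "doc2", "text": "Samsung launched Galaxy smartphone.",
         "entities": ["Samsung", "Galaxy"]},
    ]
}
print(validate_clustering_request(short_request))
```

A payload with only two documents is reported as too small, which matches the insufficient_documents error the API documents for this case.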

TopicNamingRequest

Schema for topic name generation requests.

Fields:

  • cluster_id (integer, required): ID of the cluster
  • documents (array, required): Documents in the cluster
  • entities (array of strings, optional): Entities found in cluster
  • top_words (array of strings, optional): Top words characterizing the cluster

Document Object:

  • doc_id (string, required): Document identifier
  • text (string, required): Document text
  • entities (array of strings, required): Document entities
  • cluster_probability (number, required): Probability of belonging to cluster (0-1)

Example:

{
  "cluster_id": 0,
  "documents": [
    {
      "doc_id": "doc1",
      "text": "Apple announced new iPhone features.",
      "entities": ["Apple", "iPhone"],
      "cluster_probability": 0.92
    }
  ],
  "entities": ["Apple", "iPhone", "iOS"],
  "top_words": ["apple", "iphone", "features", "technology"]
}

Response Models

PredictionResponse

Schema for sentiment prediction responses.

Fields:

  • results (array): Array of prediction results

Result Object:

  • doc_id (string): Document identifier
  • predicted_class (integer): Predicted class (0=negative, 1=neutral, 2=positive)
  • predicted_label (string): Label name ("negative", "neutral", "positive")
  • probabilities (object): Probability scores for each sentiment
    • negative (number): Probability of negative sentiment (0-1)
    • neutral (number): Probability of neutral sentiment (0-1)
    • positive (number): Probability of positive sentiment (0-1)
  • details (array): Sentence-level breakdown

Detail Object:

  • sentence_index (integer): Index of the sentence
  • sentence_text (string): Text of the sentence
  • predicted_class (integer): Predicted class for this sentence
  • predicted_label (string): Label for this sentence
  • probabilities (object): Probabilities for this sentence

Example:

{
  "results": [
    {
      "doc_id": "doc1",
      "predicted_class": 2,
      "predicted_label": "positive",
      "probabilities": {
        "negative": 0.0001,
        "neutral": 0.0003,
        "positive": 0.9996
      },
      "details": [
        {
          "sentence_index": 0,
          "sentence_text": "Apple's new iPhone is amazing!",
          "predicted_class": 2,
          "predicted_label": "positive",
          "probabilities": {
            "negative": 0.0001,
            "neutral": 0.0003,
            "positive": 0.9996
          }
        }
      ]
    }
  ]
}
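
To illustrate how the Result Object fields relate to one another, the following sketch reads a parsed response, checks that predicted_class and predicted_label agree with the documented mapping, and pulls out the probability of the predicted label. `summarize_result` is a hypothetical helper name; the response literal is copied from the example above.

```python
# Class-to-label mapping as documented for predicted_class.
LABELS = {0: "negative", 1: "neutral", 2: "positive"}


def summarize_result(result: dict) -> tuple:
    """Return (doc_id, label, confidence, per-sentence labels) for one result."""
    label = result["predicted_label"]
    if LABELS.get(result["predicted_class"]) != label:
        raise ValueError("predicted_class and predicted_label disagree")
    confidence = result["probabilities"][label]
    sentence_labels = [d["predicted_label"] for d in result.get("details", [])]
    return result["doc_id"], label, confidence, sentence_labels


response = {
    "results": [
        {
            "doc_id": "doc1",
            "predicted_class": 2,
            "predicted_label": "positive",
            "probabilities": {"negative": 0.0001, "neutral": 0.0003, "positive": 0.9996},
            "details": [
                {
                    "sentence_index": 0,
                    "sentence_text": "Apple's new iPhone is amazing!",
                    "predicted_class": 2,
                    "predicted_label": "positive",
                    "probabilities": {"negative": 0.0001, "neutral": 0.0003, "positive": 0.9996},
                }
            ],
        }
    ]
}

summary = summarize_result(response["results"][0])
print(summary)
```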

ClusteringResponse

Schema for clustering responses.

Fields:

  • total_clusters (integer): Total number of clusters found
  • min_clusters (integer): Natural cluster count from HDBSCAN
  • outliers (integer): Number of outlier documents
  • clusters (array): Array of cluster objects

Cluster Object:

  • cluster_id (integer): Cluster identifier
  • documents (array): Documents belonging to this cluster
  • top_words (array of strings): Most characteristic words
  • entities (array of strings): Entities found in cluster documents

Cluster Document Object:

  • doc_id (string): Document identifier
  • text (string): Document text
  • entities (array of strings): Document entities
  • cluster_probability (number): Probability of belonging to this cluster (0-1)

Example:

{
  "total_clusters": 3,
  "min_clusters": 2,
  "outliers": 1,
  "clusters": [
    {
      "cluster_id": 0,
      "documents": [
        {
          "doc_id": "doc1",
          "text": "Apple announced new iPhone features.",
          "entities": ["Apple", "iPhone"],
          "cluster_probability": 0.92
        },
        {
          "doc_id": "doc3",
          "text": "Apple revealed iOS updates.",
          "entities": ["Apple", "iOS"],
          "cluster_probability": 0.88
        }
      ],
      "top_words": ["apple", "iphone", "ios", "features", "updates"],
      "entities": ["Apple", "iPhone", "iOS"]
    }
  ]
}
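
The Cluster Object carries the same field names as TopicNamingRequest (cluster_id, documents, entities, top_words), and the Cluster Document Object matches the TopicNamingRequest Document Object. The sketch below uses that overlap to turn one cluster into a topic-naming payload. `to_topic_naming_request` is a hypothetical helper, and treating the two schemas as directly interchangeable is an assumption based only on the field lists in this guide.

```python
def to_topic_naming_request(cluster: dict) -> dict:
    """Build a TopicNamingRequest payload from one Cluster Object."""
    return {
        "cluster_id": cluster["cluster_id"],
        "documents": [
            {
                "doc_id": d["doc_id"],
                "text": d["text"],
                "entities": d["entities"],
                "cluster_probability": d["cluster_probability"],
            }
            for d in cluster["documents"]
        ],
        # entities and top_words are optional in TopicNamingRequest.
        "entities": cluster.get("entities", []),
        "top_words": cluster.get("top_words", []),
    }


cluster = {
    "cluster_id": 0,
    "documents": [
        {"doc_id": "doc1", "text": "Apple announced new iPhone features.",
         "entities": ["Apple", "iPhone"], "cluster_probability": 0.92},
        {"doc_id": "doc3", "text": "Apple revealed iOS updates.",
         "entities": ["Apple", "iOS"], "cluster_probability": 0.88},
    ],
    "top_words": ["apple", "iphone", "ios", "features", "updates"],
    "entities": ["Apple", "iPhone", "iOS"],
}

topic_request = to_topic_naming_request(cluster)
print(topic_request["cluster_id"], len(topic_request["documents"]))
```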

TopicNamingResponse

Schema for topic naming responses.

Fields:

  • cluster_id (integer): ID of the cluster
  • topic_name (string): Generated descriptive topic name
  • generated_using (string): API key source ("company_key" or "customer_key")

Example:

{
  "cluster_id": 0,
  "topic_name": "Apple iPhone and iOS Technology Updates",
  "generated_using": "company_key"
}

HealthResponse

Schema for health check responses.

Fields:

  • status (string): Service health status ("healthy" or "unhealthy")
  • version (string): API version
  • models (object, optional): Model information
    • sentiment (string): Sentiment model version
  • uptime (string, optional): Service uptime percentage

Example:

{
  "status": "healthy",
  "version": "1.0.0",
  "models": {
    "sentiment": "v2.1"
  },
  "uptime": "99.99%"
}
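
Since uptime arrives as a percentage string and both models and uptime are optional, a client has to parse the response defensively. A minimal sketch, assuming the uptime string always has the form shown in the example (a number followed by "%"); `parse_health` is a hypothetical helper:

```python
from typing import Optional, Tuple


def parse_health(body: dict) -> Tuple[bool, Optional[float]]:
    """Return (is_healthy, uptime_percent); uptime is None when the field is absent."""
    healthy = body.get("status") == "healthy"
    uptime = body.get("uptime")
    # Assumption: uptime looks like "99.99%", as in the documented example.
    percent = float(uptime.rstrip("%")) if uptime is not None else None
    return healthy, percent


health = {
    "status": "healthy",
    "version": "1.0.0",
    "models": {"sentiment": "v2.1"},
    "uptime": "99.99%",
}
print(parse_health(health))
```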

Error Response Models

Standard Error Response

All API errors follow this consistent structure.

Fields:

  • error (object): Error details
    • code (string): Error code
    • message (string): Human-readable error message
    • status (integer): HTTP status code
    • details (object, optional): Additional error context
    • documentation_url (string, optional): Link to error documentation

Example:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded",
    "status": 429,
    "details": {
      "limit": 3,
      "remaining": 0,
      "reset": 1234567890,
      "retry_after": 30
    },
    "documentation_url": "https://sentor.app/docs/api/rate-limits"
  }
}
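
One way a client could consume this envelope is to wrap it in an exception and read the rate-limit hint from details. The sketch below is illustrative only: `SentorAPIError`, `parse_error`, and `retry_delay` are hypothetical names, and interpreting retry_after as a number of seconds is an assumption, since this guide does not state the unit.

```python
from typing import Optional


class SentorAPIError(Exception):
    """Hypothetical exception wrapping the documented error object."""

    def __init__(self, code: str, message: str, status: int,
                 details: Optional[dict] = None):
        super().__init__(f"{status} {code}: {message}")
        self.code = code
        self.message = message
        self.status = status
        self.details = details or {}  # details is optional in the schema


def parse_error(body: dict) -> SentorAPIError:
    """Convert a parsed error response body into a SentorAPIError."""
    err = body["error"]
    return SentorAPIError(err["code"], err["message"], err["status"], err.get("details"))


def retry_delay(error: SentorAPIError) -> Optional[float]:
    """Return the wait before retrying, or None when no hint is available.

    Assumption: details.retry_after is expressed in seconds.
    """
    if error.code != "rate_limit_exceeded":
        return None
    value = error.details.get("retry_after")
    return float(value) if value is not None else None


body = {
    "error": {
        "code": "rate_limit_exceeded",
        "message": "Rate limit exceeded",
        "status": 429,
        "details": {"limit": 3, "remaining": 0, "reset": 1234567890, "retry_after": 30},
        "documentation_url": "https://sentor.app/docs/api/rate-limits",
    }
}
error = parse_error(body)
print(error, retry_delay(error))
```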

Common Error Codes

Code                    HTTP Status  Description
invalid_request         400          Malformed request or missing required fields
invalid_api_key         401          Invalid or missing API key
rate_limit_exceeded     429          Rate limit exceeded for your plan
insufficient_documents  400          Too few documents for clustering (minimum 5 required)
internal_error          500          Internal server error
service_unavailable     503          Service temporarily unavailable
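
A client may want to branch on these codes rather than on raw HTTP statuses. The sketch below copies the codes and statuses from the table above into a lookup and sorts them into coarse categories. Treating rate_limit_exceeded and service_unavailable as the retryable ones is an assumption about sensible client behavior, not something this guide specifies, and `classify` is a hypothetical helper.

```python
# Codes and HTTP statuses copied from the Common Error Codes table.
ERROR_STATUS = {
    "invalid_request": 400,
    "invalid_api_key": 401,
    "rate_limit_exceeded": 429,
    "insufficient_documents": 400,
    "internal_error": 500,
    "service_unavailable": 503,
}

# Assumption: these are transient and can be retried with the same request.
RETRYABLE = {"rate_limit_exceeded", "service_unavailable"}


def classify(code: str) -> str:
    """Map an error code to 'retry', 'client_error', 'server_error', or 'unknown'."""
    if code not in ERROR_STATUS:
        return "unknown"
    if code in RETRYABLE:
        return "retry"
    return "client_error" if ERROR_STATUS[code] < 500 else "server_error"


for code in ERROR_STATUS:
    print(f"{code}: {classify(code)}")
```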

Query Parameters

Language Parameter

Used across multiple endpoints to specify content language.

  • Parameter: language
  • Type: String (enum)
  • Values: en (English), nl (Dutch)
  • Default: en
  • Required: No

Example:

POST /predicts?language=nl
POST /predicts/cluster?language=en
POST /predicts/topic-name?language=nl
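
The query string can be built and validated on the client side before a request is sent. In the sketch below, `BASE_URL` is a placeholder (this guide does not give the API host) and `endpoint_url` is a hypothetical helper; only the paths and the two language values come from this section.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # placeholder host, not the real Sentor URL
SUPPORTED_LANGUAGES = {"en", "nl"}    # values listed for the language parameter


def endpoint_url(path: str, language: str = "en") -> str:
    """Return the full URL for `path` with a validated language parameter."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    return f"{BASE_URL}{path}?{urlencode({'language': language})}"


print(endpoint_url("/predicts/cluster"))
print(endpoint_url("/predicts/topic-name", "nl"))
```

Because the parameter is optional and defaults to en, omitting it entirely should be equivalent to the first call.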

Headers

Required Headers

x-api-key

  • Description: Your Sentor API key for authentication
  • Type: String
  • Required: Yes (for all endpoints)
  • Example: x-api-key: sk_live_abc123...

Content-Type

  • Description: Request content type
  • Type: String
  • Required: Yes (for POST requests)
  • Value: application/json

Optional Headers

X-Google-API-Key

  • Description: Your own Google API key, used for topic naming; requests that supply it are billed at a lower rate
  • Type: String
  • Required: No
  • Endpoint: /predicts/topic-name only
  • Example: X-Google-API-Key: AIza...
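
The header rules above can be collected into a single helper. This is a sketch under the rules stated in this section only; `build_headers` is a hypothetical function and the key values are placeholders.

```python
from typing import Optional

TOPIC_NAME_PATH = "/predicts/topic-name"


def build_headers(api_key: str, path: str, method: str = "POST",
                  google_api_key: Optional[str] = None) -> dict:
    """Assemble request headers following the documented requirements."""
    if not api_key:
        raise ValueError("x-api-key is required for all endpoints")
    headers = {"x-api-key": api_key}
    if method.upper() == "POST":
        # Content-Type is required for POST requests.
        headers["Content-Type"] = "application/json"
    if google_api_key and path == TOPIC_NAME_PATH:
        # The Google key applies to the topic-name endpoint only.
        headers["X-Google-API-Key"] = google_api_key
    return headers


print(build_headers("sk_live_placeholder", "/predicts/topic-name",
                    google_api_key="AIza_placeholder"))
print(build_headers("sk_live_placeholder", "/predicts/cluster",
                    google_api_key="AIza_placeholder"))
```

The second call drops the Google key because the path is not the topic-name endpoint.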
