

Use this quickstart to get your first successful analysis response in a few minutes. In this guide, you will:
  1. Upload a local video file to POST /v1/upload/analyze
  2. Read engagement_state, signals[] (including per-signal rationale), and optional conversation_quality outputs
You’ll need an API key. Follow the API key guide for details. You’ll also need a video file. You can download an example from here.

1) Upload and analyze a video

Use one of the requests below to send a local video file (mp4, avi, mov, mkv, mpeg-ts, webm; minimum 10 KB, maximum 32 MB) to POST /v1/upload/analyze. The API returns core analysis by default. You can optionally request conversation-quality sections by passing include[] flags (shown below).
export API_KEY="YOUR_API_KEY"
export VIDEO_PATH="path_to_your_video.mp4"

curl -X POST https://api.interhuman.ai/v1/upload/analyze \
  -H "Authorization: Bearer ${API_KEY}" \
  -F "file=@${VIDEO_PATH};type=video/mp4" \
  -F "include[]=conversation_quality_overall" \
  -F "include[]=conversation_quality_timeline"
If you want only core outputs, remove the include[] lines. Reference: Upload & Analyze API
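If you prefer Python, here is a minimal sketch of the same request using the third-party requests library. The endpoint, header, and include[] fields mirror the curl example above; the environment-variable names and error handling are illustrative, not part of the API.

import os
import requests

API_KEY = os.environ["API_KEY"]
VIDEO_PATH = os.environ.get("VIDEO_PATH", "path_to_your_video.mp4")

# Same endpoint and multipart fields as the curl example above.
with open(VIDEO_PATH, "rb") as f:
    resp = requests.post(
        "https://api.interhuman.ai/v1/upload/analyze",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": (os.path.basename(VIDEO_PATH), f, "video/mp4")},
        data=[
            ("include[]", "conversation_quality_overall"),
            ("include[]", "conversation_quality_timeline"),
        ],
    )
resp.raise_for_status()
analysis = resp.json()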

2) Read the response

After your upload is processed, the API returns a structured response with three complementary outputs:
  • engagement_state: Time-bounded labels such as engaged, disengaged, or neutral.
  • signals[]: Time-bounded social signals, each with type, probability, and rationale.
  • conversation_quality (optional): Reuses the conversation_quality_values shape in both overall and each timeline window’s values object.
Time fields (start, end) are expressed in seconds from the start of the uploaded video.

conversation_quality_values shape (reused by conversation_quality.overall and conversation_quality.timeline[].values):
{
  "quality_index": 45,
  "energy": 53,
  "rapport": 50,
  "authority": 49,
  "learning": 50,
  "clarity": 48
}
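If you map the response onto typed objects, this reused shape only needs to be declared once. A minimal Python sketch (the TypedDict name is illustrative; only the six field names come from the API):

from typing import TypedDict

class ConversationQualityValues(TypedDict):
    # The six metrics reused by conversation_quality.overall
    # and by each timeline window's values object.
    quality_index: int
    energy: int
    rapport: int
    authority: int
    learning: int
    clarity: int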
Here’s an example of what the API returns:
{
  "engagement_state": [
    {
      "start": 0,
      "end": 10,
      "state": "engaged"
    },
    {
      "start": 10,
      "end": 20,
      "state": "disengaged"
    }
  ],
  "signals": [
    {
      "start": 0,
      "end": 10,
      "type": "agreement",
      "probability": "high",
      "rationale": "The speaker provides a quick affiliative nod while the partner is speaking."
    },
    {
      "start": 5,
      "end": 15,
      "type": "confidence",
      "probability": "medium",
      "rationale": "The speaker maintains upright posture and responds with steady, fluent delivery."
    }
  ],
  "conversation_quality": {
    "overall": {
      "quality_index": 45,
      "energy": 53,
      "rapport": 50,
      "authority": 49,
      "learning": 50,
      "clarity": 48
    },
    "timeline": [
      {
        "start": 0,
        "end": 10,
        "values": {
          "quality_index": 72,
          "energy": 80,
          "rapport": 75,
          "authority": 68,
          "learning": 70,
          "clarity": 67
        }
      }
    ]
  }
}
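Assuming analysis holds the parsed JSON from step 1, here is a short sketch of reading the three outputs (field names match the example above; the print formatting is illustrative):

# Moment-level signals, each with a rationale explaining the inference.
for sig in analysis.get("signals", []):
    print(f'{sig["start"]}-{sig["end"]}s {sig["type"]} ({sig["probability"]}): {sig["rationale"]}')

# Contiguous engagement windows.
for window in analysis.get("engagement_state", []):
    print(f'{window["start"]}-{window["end"]}s: {window["state"]}')

# Conversation quality is present only if requested via include[].
quality = analysis.get("conversation_quality")
if quality:
    print("Overall quality_index:", quality["overall"]["quality_index"])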

How to interpret it quickly

  • signals[] gives moment-level events; rationale explains why each signal was inferred.
  • engagement_state shows attention level over contiguous windows.
  • conversation_quality.overall is a single interaction summary.
  • conversation_quality.timeline[] shows how quality changes over time.
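To turn these outputs into a quick summary, you can aggregate them yourself. A sketch under the same assumptions as above (analysis comes from step 1; the particular summaries chosen here are examples, not API features):

# Share of analyzed time spent in the "engaged" state.
windows = analysis.get("engagement_state", [])
total = sum(w["end"] - w["start"] for w in windows) or 1
engaged = sum(w["end"] - w["start"] for w in windows if w["state"] == "engaged")
print(f"Engaged for {100 * engaged / total:.0f}% of the analyzed time")

# Timeline window with the lowest quality_index (only if conversation_quality was requested).
timeline = analysis.get("conversation_quality", {}).get("timeline", [])
if timeline:
    worst = min(timeline, key=lambda w: w["values"]["quality_index"])
    print(f'Lowest quality from {worst["start"]}s to {worst["end"]}s: {worst["values"]["quality_index"]}')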

Next steps