Build a small app that uploads a video from your computer to Interhuman and returns structured analysis results. In this codealong, you will:
  • exchange your key_id and key_secret for an access token
  • upload a local video to POST /v1/upload/analyze
  • optionally request conversation-quality outputs
  • render returned signals[] (including per-signal rationale) and quality metrics in your UI
This page is written so you can follow the build without the video, while the video helps with pacing and implementation details.

Watch the codealong video

Keep the video and your editor side by side, and pause after each major step. Reference implementation: interhuman-video-analyzer-codealong on GitHub.

Build steps

1) Prepare credentials and auth

Before calling the upload endpoint, make sure you have an API key from platform.interhuman.ai. If needed, follow Get an API key. Then exchange your key_id and key_secret for an access token using the Authentication endpoint, and request the interhumanai.upload scope. Without this scope, upload requests fail with an insufficient-scope error.
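As a sketch, the token exchange might look like the following in JavaScript. The token endpoint URL and the request/response field names (key_id, key_secret, scope, access_token) are assumptions here; confirm the exact contract in the Authentication reference.

```javascript
// Token endpoint URL is an ASSUMPTION — check the Authentication reference.
const TOKEN_URL = "https://api.interhuman.ai/v1/auth/token";

// Build the token request body separately so it is easy to inspect and test.
function buildTokenBody(keyId, keySecret) {
  return JSON.stringify({
    key_id: keyId,
    key_secret: keySecret,
    scope: "interhumanai.upload", // required, or uploads fail with insufficient-scope
  });
}

// Exchange credentials for a bearer token (sketch, field names assumed).
async function fetchAccessToken(keyId, keySecret) {
  const res = await fetch(TOKEN_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildTokenBody(keyId, keySecret),
  });
  if (!res.ok) throw new Error(`Token exchange failed: ${res.status}`);
  const data = await res.json();
  return data.access_token; // response field name assumed
}
```

Keeping buildTokenBody separate from the network call makes the scope requirement easy to verify without hitting the API.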

2) Add a simple file-upload UI

Create a minimal UI with:
  • a file input that accepts a local video file
  • an Analyze button
  • a results panel (for JSON output or parsed cards)
For this codealong, keep it simple. A plain HTML input and a single action button are enough to validate your integration end to end.
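A minimal wiring sketch, assuming a page with a file input, an Analyze button, and a pre element for results. The element ids (video, analyze, results) are hypothetical; match them to your own markup.

```javascript
// Returns an error message for an unusable selection, or null when it is OK.
function validateSelection(file) {
  if (!file) return "Choose a video file first.";
  if (file.type && !file.type.startsWith("video/")) {
    return "Selected file is not a video.";
  }
  return null;
}

// DOM wiring (guarded so the helper above is also usable outside a browser).
// Element ids are hypothetical — adjust them to your own HTML.
if (typeof document !== "undefined") {
  document.getElementById("analyze").addEventListener("click", () => {
    const file = document.getElementById("video").files[0];
    const results = document.getElementById("results");
    const error = validateSelection(file);
    if (error) {
      results.textContent = error;
      return;
    }
    results.textContent = "Uploading…"; // replace with the analyze call from step 3
  });
}
```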

3) Call Upload & Analyze

When the user clicks Analyze, send the selected file to:
  • POST /v1/upload/analyze
  • with Authorization: Bearer <access_token>
  • as multipart form data (file)
Reference: Upload & Analyze API
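The request itself can be sketched like this. The path, bearer header, and multipart field name (file) come from the steps above; the API base URL is an assumption.

```javascript
// Base URL is an ASSUMPTION — the path comes from the Upload & Analyze reference.
const ANALYZE_URL = "https://api.interhuman.ai/v1/upload/analyze";

// Bearer header built from the access token obtained in step 1.
function authHeader(accessToken) {
  return { Authorization: `Bearer ${accessToken}` };
}

// Upload the selected file as multipart form data and return the parsed JSON.
async function analyzeVideo(file, accessToken) {
  const form = new FormData();
  form.append("file", file); // multipart field name per the docs

  const res = await fetch(ANALYZE_URL, {
    method: "POST",
    // Note: do not set Content-Type yourself; fetch adds the multipart boundary.
    headers: authHeader(accessToken),
    body: form,
  });
  if (!res.ok) throw new Error(`Analyze failed: ${res.status}`);
  return res.json();
}
```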

4) Optionally include conversation quality

You can request conversation-quality outputs with include flags:
  • conversation_quality_overall
  • conversation_quality_timeline
If included, the API returns conversation quality both as a single overall view and as a series of timeline windows.
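One way to attach the flags to the upload form, as a sketch. The multipart field name include is an assumption; the two flag values are the ones documented above.

```javascript
// Map UI options to the documented include-flag values.
function includeFlags({ overall = false, timeline = false } = {}) {
  const flags = [];
  if (overall) flags.push("conversation_quality_overall");
  if (timeline) flags.push("conversation_quality_timeline");
  return flags;
}

// Append each requested flag to the upload form.
// The field name "include" is an ASSUMPTION — confirm it in the API reference.
function appendIncludeFlags(form, options) {
  for (const flag of includeFlags(options)) {
    form.append("include", flag);
  }
  return form;
}
```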

5) Render results in the app

Once processing completes, render the response in your results panel. At minimum, show:
  • signals[] entries (type, start, end, probability, rationale)
  • optional conversation_quality.overall
  • optional conversation_quality.timeline[]
This gives users both event-level insight (what happened and why) and quality-level insight (how the interaction performed across time).
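A rendering sketch that flattens the fields listed above into plain-text lines for the results panel. The field names are taken from this page; rationale is rendered as-is, so if it arrives as a structured object in practice, stringify it first.

```javascript
// Turn an analyze response into display lines for the results panel.
function summarizeResponse(response) {
  const lines = [];
  for (const s of response.signals ?? []) {
    lines.push(`${s.type} @ ${s.start}s-${s.end}s (p=${s.probability}): ${s.rationale}`);
  }
  const cq = response.conversation_quality;
  if (cq?.overall) {
    lines.push(`overall quality_index: ${cq.overall.quality_index}`);
  }
  for (const w of cq?.timeline ?? []) {
    lines.push(`window ${w.start}s-${w.end}s: quality_index ${w.quality_index}`);
  }
  return lines;
}
```

In the click handler from step 2, you would set results.textContent to summarizeResponse(response).join("\n").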

What to inspect in the response

  • Signal timing: start/end are in seconds from the beginning of the uploaded video.
  • Probability: confidence bucket for each detected signal.
  • Rationale: a structured explanation of observed behavior (for example speech pauses, tone shifts, gaze, and pacing cues).
  • Conversation quality: overall plus core dimensions (quality_index, energy, rapport, authority, learning, clarity), and optionally a timeline that shows change over time.
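Putting the fields above together, a response might be shaped roughly like this. This is an illustrative sketch with made-up values, not the authoritative schema; verify the real shape against the API reference.

```javascript
// Illustrative response shape only — values are fabricated for the example.
const sampleResponse = {
  signals: [
    {
      type: "tone_shift",        // kind of detected signal
      start: 12.4,               // seconds from the beginning of the uploaded video
      end: 15.1,
      probability: "high",       // confidence bucket
      rationale: "speech pause followed by a tone shift and slower pacing",
    },
  ],
  conversation_quality: {
    overall: {
      quality_index: 0.74,
      energy: 0.61,
      rapport: 0.7,
      authority: 0.58,
      learning: 0.66,
      clarity: 0.8,
    },
    timeline: [
      { start: 0, end: 30, quality_index: 0.69 }, // change over time, per window
    ],
  },
};

// The six core dimensions named in the docs:
const CORE_DIMENSIONS = ["quality_index", "energy", "rapport", "authority", "learning", "clarity"];
const hasAllDimensions = CORE_DIMENSIONS.every(
  (d) => typeof sampleResponse.conversation_quality.overall[d] === "number"
);
```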

Next steps