# Stream analysis

> Codealong: connect over WebSocket, capture camera and microphone, and stream segments to Interhuman.

Analyze a **live camera** feed: connect to Interhuman over WebSocket, capture media, and send **binary** chunks as they are recorded.

In this codealong, you will:

1. **Connect** to `wss://api.interhuman.ai/v1/stream/analyze`
2. **Get camera** (and microphone where your stack supports it)
3. **Send** each recorded segment to Interhuman and read typed server events

Wire the three steps together in your app. Use **JavaScript** in the browser (`getUserMedia` + `MediaRecorder`) or **Python** on the desktop (`opencv-python` for video; the Python examples on this page capture video only, not audio).

<Note>
  You’ll need an API key. Follow the [API key guide](/how-to/get-api-key) for details.
</Note>

## 1) Connect to the WebSocket

Open a TLS WebSocket to the stream endpoint. On connect, send **session config** as a **text** frame (UTF-8 JSON), then listen for **text** replies and parse JSON. Branch on `type` (`signal.detected`, `engagement.updated`, `conversation_quality.updated`, `error`).

<CodeGroup>
  ```javascript JavaScript icon="square-js" theme={null}
  const WS_URL = "wss://api.interhuman.ai/v1/stream/analyze";
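  const apiKey = "YOUR_API_KEY"; // In production, do not hardcode the key; use your app's auth flow.

  // Browsers cannot set custom headers on a WebSocket, so the API key is passed as the subprotocol argument.
  const ws = new WebSocket(WS_URL, apiKey);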
  ws.binaryType = "arraybuffer";

  ws.addEventListener("open", () => {
    const sessionConfig = {
      include: [
        "conversation_quality_overall",
        "conversation_quality_timeline",
      ],
    };
    ws.send(JSON.stringify(sessionConfig));
  });

  ws.addEventListener("message", (event) => {
    if (typeof event.data !== "string") return;
    const payload = JSON.parse(event.data);
    console.log(payload.type, payload);
  });
  ```

  ```python Python icon="python" theme={null}
  import asyncio
  import json
  import os

  import websockets

  WS_URL = "wss://api.interhuman.ai/v1/stream/analyze"


  async def connect():
      api_key = os.environ["API_KEY"]
      headers = {"Authorization": f"Bearer {api_key}"}

      ws = await websockets.connect(
          WS_URL,
          additional_headers=headers,
          max_size=None,
      )

      session_config = {
          "include": [
              "conversation_quality_overall",
              "conversation_quality_timeline",
          ],
      }
      await ws.send(json.dumps(session_config))
      return ws


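  # Example, from inside a coroutine: ws = await connect()
  # Keep every use of ws inside the same event loop (a single asyncio.run call);
  # a connection returned from asyncio.run(connect()) is unusable once that loop closes.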
  ```
</CodeGroup>

Reference: [Stream & analyze](/api-reference/stream-analyze)

## 2) Get camera and microphone

<CodeGroup>
  ```javascript JavaScript icon="square-js" theme={null}
  const mediaStream = await navigator.mediaDevices.getUserMedia({
    video: true,
    audio: true,
  });

  const preview = document.querySelector("#preview");
  preview.srcObject = mediaStream;
  preview.play();
  ```

  ```python Python icon="python" theme={null}
  import cv2

  cap = cv2.VideoCapture(0)
  if not cap.isOpened():
      raise RuntimeError("Could not open camera (device index 0)")

  # Optional local preview while you build:
  ok, frame = cap.read()
  if ok:
      cv2.imshow("preview", frame)
      cv2.waitKey(1)  # give the GUI event loop a chance to draw the window
</CodeGroup>

## 3) Send segments to Interhuman

Send each non-empty recording as a **binary** WebSocket frame. Start recording **after** the WebSocket is open and session config is sent.

<CodeGroup>
  ```javascript JavaScript icon="square-js" theme={null}
  const SEGMENT_MS = 3000;

  const mimeType =
    ["video/webm;codecs=vp9,opus", "video/webm;codecs=vp8,opus", "video/webm"].find(
      (m) => MediaRecorder.isTypeSupported(m)
    ) || "";

  const recorder = new MediaRecorder(
    mediaStream,
    mimeType ? { mimeType } : undefined
  );

  recorder.addEventListener("dataavailable", async (event) => {
    if (!event.data || event.data.size === 0) return;
    if (ws.readyState !== WebSocket.OPEN) return;

    const buffer = await event.data.arrayBuffer();
    ws.send(buffer);
  });

  // Call once the WebSocket is open and session config is sent:
  recorder.start(SEGMENT_MS);
  ```

  ```python Python icon="python" theme={null}
  import asyncio
  import time

  import cv2

  SEGMENT_SECONDS = 3
  SEGMENT_PATH = "segment.mp4"


  def record_segment(cap: cv2.VideoCapture, path: str, seconds: float) -> None:
      fps = cap.get(cv2.CAP_PROP_FPS) or 20.0
      width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
      height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
      fourcc = cv2.VideoWriter_fourcc(*"mp4v")
      writer = cv2.VideoWriter(path, fourcc, fps, (width, height))

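      # Use a monotonic clock so system clock adjustments cannot shorten or lengthen the segment.
      deadline = time.monotonic() + seconds
      while time.monotonic() < deadline: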
          ok, frame = cap.read()
          if ok:
              writer.write(frame)
      writer.release()


  async def send_segment(ws) -> None:
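      # Record in a worker thread so the blocking capture loop does not stall the event loop.
      await asyncio.to_thread(record_segment, cap, SEGMENT_PATH, SEGMENT_SECONDS)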
      with open(SEGMENT_PATH, "rb") as f:
          await ws.send(f.read())

      msg = await asyncio.wait_for(ws.recv(), timeout=60.0)
      print(msg)


  # Call from the same coroutine that awaited connect(): await send_segment(ws)
  ```
</CodeGroup>
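
The Python example above sends a single segment and waits for one reply. For a continuous session, one possible arrangement is a send loop with a concurrent reader, sketched below. The names `stream_segments`, `next_segment`, `on_message`, and `stop` are hypothetical and not part of the Interhuman API; the sketch only assumes a connection object with an async `send` method that supports `async for`, as `websockets` connections do.

```python
import asyncio
from typing import Awaitable, Callable


async def stream_segments(
    ws,
    next_segment: Callable[[], Awaitable[bytes]],
    on_message: Callable[[str], None],
    stop: asyncio.Event,
) -> int:
    """Send segments until `stop` is set while reading replies concurrently.

    Returns the number of binary frames sent.
    """

    async def reader() -> None:
        # Iteration ends when the connection closes.
        async for message in ws:
            if isinstance(message, str):  # server events arrive as text frames
                on_message(message)

    reader_task = asyncio.create_task(reader())
    sent = 0
    try:
        while not stop.is_set():
            segment = await next_segment()
            if segment:  # skip empty recordings
                await ws.send(segment)
                sent += 1
    finally:
        # Stop reading once sending is done; replies still in flight are dropped.
        reader_task.cancel()
        await asyncio.gather(reader_task, return_exceptions=True)
    return sent
```

Here `next_segment` could wrap `record_segment` in `asyncio.to_thread` and return the file bytes. Because the reader is cancelled as soon as sending stops, you may want to wait briefly before cancelling if you need the events for the final segment.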

When the user stops, release the camera and close the connection:

<CodeGroup>
  ```javascript JavaScript icon="square-js" theme={null}
  if (recorder && recorder.state !== "inactive") {
    recorder.stop();
  }
  ws.close();
  mediaStream.getTracks().forEach((track) => track.stop());
  ```

  ```python Python icon="python" theme={null}
  # Run inside the coroutine that owns ws (top-level await is not valid in a script):
  cap.release()
  await ws.close()
  cv2.destroyAllWindows()  # if you opened a preview window
  ```
</CodeGroup>

## 4) Read server envelopes

Every server message shares the same outer shape: `type`, `timestamp`, `correlation_id`, and `data`. Narrow on `type` before reading fields inside `data`.
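
Narrowing on `type` can be sketched as a small dispatcher in Python. This is a minimal illustration and not an official client: `dispatch_event` and the handler bodies are hypothetical names, and the only assumption taken from this page is the envelope shape (`type` and `data`).

```python
import json
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], None]


def dispatch_event(raw: str, handlers: Dict[str, Handler]) -> str:
    """Parse one text frame and pass its `data` to the handler for its `type`.

    Returns the event type so callers can log or count it.
    Unknown types are ignored rather than treated as errors.
    """
    envelope = json.loads(raw)
    event_type = envelope.get("type", "")
    handler = handlers.get(event_type)
    if handler is not None:
        handler(envelope.get("data") or {})
    return event_type


# Hypothetical handlers; replace the bodies with your own app logic.
seen = []
handlers = {
    "signal.detected": lambda data: seen.append(("signals", len(data.get("signals", [])))),
    "engagement.updated": lambda data: seen.append(("engagement", data.get("state"))),
    "error": lambda data: seen.append(("error", data.get("code"))),
}

example = json.dumps({
    "type": "engagement.updated",
    "timestamp": "2025-01-01T00:00:00.000000Z",
    "correlation_id": "550e8400-e29b-41d4-a716-446655440000",
    "data": {"state": "engaged", "start": 3.0, "end": 11.0},
})
event_type = dispatch_event(example, handlers)
```

Keeping the handlers in a dictionary means a new event type only needs a new entry, and message types the client does not know about pass through without raising.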

### `signal.detected`

```json theme={null}
{
  "type": "signal.detected",
  "timestamp": "2025-01-01T00:00:00.000000Z",
  "correlation_id": "550e8400-e29b-41d4-a716-446655440000",
  "data": {
    "signals": [
      {
        "type": "agreement",
        "start": 3.0,
        "end": 11.0,
        "probability": "high",
        "rationale": "Subject nodded repeatedly while maintaining eye contact."
      }
    ]
  }
}
```

Each entry in `data.signals[]` uses the same shape as upload responses: `type`, `start`, `end`, `probability`, and `rationale`.
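
As an illustration of reading those fields, the Python sketch below keeps only signals whose `probability` is `"high"` and formats one line per signal. The helper name `high_probability_signals` is hypothetical, and `"high"` is the only probability value shown on this page, so treat the filter as an assumption about what your app cares about.

```python
from typing import Any, Dict, List


def high_probability_signals(data: Dict[str, Any]) -> List[str]:
    """Return one summary line per signal whose probability is "high"."""
    lines = []
    for signal in data.get("signals", []):
        if signal.get("probability") != "high":
            continue
        duration = signal["end"] - signal["start"]
        lines.append(f'{signal["type"]} ({duration:.1f}s): {signal["rationale"]}')
    return lines


# `data` copied from the signal.detected example above.
data = {
    "signals": [
        {
            "type": "agreement",
            "start": 3.0,
            "end": 11.0,
            "probability": "high",
            "rationale": "Subject nodded repeatedly while maintaining eye contact.",
        }
    ]
}
summary = high_probability_signals(data)
```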

### `engagement.updated`

```json theme={null}
{
  "type": "engagement.updated",
  "timestamp": "2025-01-01T00:00:00.000000Z",
  "correlation_id": "550e8400-e29b-41d4-a716-446655440000",
  "data": {
    "state": "engaged",
    "start": 3.0,
    "end": 11.0
  }
}
```

### `conversation_quality.updated` (when opted in)

When your session config `include` lists `conversation_quality_overall` and/or `conversation_quality_timeline`, you may receive `conversation_quality.updated` with `data.overall` and/or `data.timeline` for the window that was just processed. See [Conversation quality](/explanations/conversation-quality).
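
Because either part can be absent from a given event, a client may want to check which ones arrived before reading them. The Python sketch below does only that; `quality_parts` is a hypothetical helper, and the contents of `overall` and `timeline` are treated as opaque because this page does not define them.

```python
from typing import Any, Dict, List


def quality_parts(data: Dict[str, Any]) -> List[str]:
    """List which conversation-quality parts are present in an event's `data`."""
    return [key for key in ("overall", "timeline") if data.get(key) is not None]


# Placeholder payloads; the real field contents come from the API reference.
only_overall = quality_parts({"overall": {"placeholder": True}})
both = quality_parts({"overall": {"placeholder": True}, "timeline": [{"placeholder": True}]})
```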

### `error`

Errors use the same envelope with `type: "error"` and structured fields under `data` (for example `code`, `message`, `link`, and `segment` when applicable). See [Error handling](/api-reference/error-handling).
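
For logging, the structured fields can be flattened into a single line. The Python sketch below is one way to do it, assuming only the field names listed above; `describe_error` is a hypothetical helper, and the code, message, and link in the sample are invented placeholders, not real Interhuman error values.

```python
from typing import Any, Dict


def describe_error(envelope: Dict[str, Any]) -> str:
    """Build a one-line description from an `error` envelope, tolerating missing fields."""
    data = envelope.get("data") or {}
    parts = [f'[{data.get("code", "unknown")}] {data.get("message", "")}'.strip()]
    if data.get("segment") is not None:
        parts.append(f'segment={data["segment"]}')
    if data.get("link"):
        parts.append(f'see {data["link"]}')
    return " | ".join(parts)


# Invented placeholder values for illustration only.
sample = {
    "type": "error",
    "data": {
        "code": "example_code",
        "message": "Example message",
        "link": "https://example.com/docs",
        "segment": 2,
    },
}
line = describe_error(sample)
```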

### How to interpret it quickly

* **`signal.detected`**: `data.signals[]` lists moment-level [social signals](/explanations/social-signals) for the segment; each `rationale` explains that detection.
* **`engagement.updated`**: attention level for a time window within the segment (`start` / `end` are seconds within that segment).
* **`conversation_quality.updated`**: optional overall and per-window quality metrics when requested.

## Next steps

* [Stream & analyze](/api-reference/stream-analyze) — full AsyncAPI channel, headers, and message schemas.
* [Authentication](/api-reference/authentication) — API key usage; browser WebSockets use the subprotocol as above.
* [Error handling](/api-reference/error-handling) — structured error codes and recovery.
* [Video upload quickstart](/getting-started/video-upload-quickstart) — one-shot `POST /v1/upload/analyze` flow.
* [Agent Skills](/how-to/agent-skills) — installable skills that wrap upload and stream calls.
* [Social signals](/explanations/social-signals) and [Conversation quality](/explanations/conversation-quality) — meaning of outputs.
