You can provide raw text or JSON; it will be forwarded to the token service.
Leave blank to use the URL returned by the token service (when available).

Audio Monitor

Status

Transcripts

User

    Agent

      Latency Metrics Explained

      Transcription Delay

      Time (seconds) between the end of speech and when the final transcript text is available.

      End of Utterance

      Time (seconds) from the end of speech, as detected by voice activity detection (VAD), until the user turn is considered complete. This already includes any transcription_delay.

      TTS Duration

      Time (seconds) for the TTS model to synthesize the entire audio stream.

      TTS Time To First Byte

      Time (seconds) for the TTS model to produce the first byte of audio.

      LLM Time To First Token

      Time (seconds) for the LLM to emit the first token of the completion.

      LLM Duration

      Time (seconds) for the LLM to stream the entire completion.

      Total Latency

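      Total latency (seconds) is the sum of the perception, reasoning, and speech delays: eou.end_of_utterance_delay + llm_metric.ttft + tts_metric.ttfb. Because end_of_utterance_delay already includes transcription_delay, the transcription delay is not added a second time.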
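
As a rough illustration of the Total Latency formula, the sketch below adds the three components for a single turn. The `TurnMetrics` class, its field names, and the sample values are hypothetical and chosen only for this example; they are not the metric objects the page itself uses.

```python
from dataclasses import dataclass


@dataclass
class TurnMetrics:
    """Hypothetical container for one conversational turn (all values in seconds)."""

    transcription_delay: float     # end of speech -> final transcript available
    end_of_utterance_delay: float  # end of speech -> user turn considered complete
    llm_ttft: float                # LLM time to first token
    tts_ttfb: float                # TTS time to first byte


def total_latency(m: TurnMetrics) -> float:
    """Sum perception, reasoning, and speech delays.

    transcription_delay is deliberately left out of the sum because
    end_of_utterance_delay already includes it.
    """
    return m.end_of_utterance_delay + m.llm_ttft + m.tts_ttfb


# Made-up sample values, for illustration only.
turn = TurnMetrics(
    transcription_delay=0.25,
    end_of_utterance_delay=0.5,
    llm_ttft=0.25,
    tts_ttfb=0.125,
)
print(f"total latency: {total_latency(turn):.3f} s")
```

Note that the sum uses time-to-first-token and time-to-first-byte, not the full LLM or TTS durations, so it describes how long the user waits before hearing the first audio rather than how long the whole reply takes.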