Gladia Help Center home

How to detect speaker turns?

Gladia provides speech_start and speech_end events so you can detect exactly when a person starts and stops talking, much faster than waiting for transcripts.

Why use them?

  • Ultra low latency: trigger in real time as soon as speech is detected.

  • Speaker turn detection: combine both events to know when a speaker takes or ends a turn.

  • Better UX: react instantly (show a “speaking” state, cut latency, manage bot handoffs).

💡 Pro tip

When building voice agents, low latency is crucial. By combining speech activity detection with partial transcriptions, you can feed text to your LLM faster, which means it can start making sense of the conversation even before the final, polished transcription is ready.
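As a rough illustration of this tip, the sketch below keeps the most recent partial transcript and forwards it as soon as speech_end arrives, without waiting for the final transcript. The message shape is an assumption for illustration (a "type" field, with transcript text under data.utterance.text), and EarlyTextFeeder and send_to_llm are hypothetical names, so check the API reference for the exact payloads.

```python
from typing import Callable


class EarlyTextFeeder:
    """Forwards the latest partial transcript when a speaker stops talking.

    Assumed (not confirmed) message shape:
      {"type": "speech_start" | "speech_end" | "transcript", "data": {...}}
    with transcript text under data["utterance"]["text"].
    """

    def __init__(self, send_to_llm: Callable[[str], None]) -> None:
        self.send_to_llm = send_to_llm  # hypothetical callback into your LLM pipeline
        self.latest_text = ""

    def handle(self, message: dict) -> None:
        kind = message.get("type")
        if kind == "speech_start":
            # A new turn begins: discard text left over from the previous one.
            self.latest_text = ""
        elif kind == "transcript":
            text = message.get("data", {}).get("utterance", {}).get("text", "")
            if text:
                self.latest_text = text
        elif kind == "speech_end":
            # The turn is over: send what we have instead of waiting for the final text.
            if self.latest_text:
                self.send_to_llm(self.latest_text)
                self.latest_text = ""


# Usage with hand-written messages standing in for the live stream.
sent: list = []
feeder = EarlyTextFeeder(sent.append)
feeder.handle({"type": "speech_start", "data": {"time": 0.4}})
feeder.handle({"type": "transcript", "data": {"utterance": {"text": "book a table"}}})
feeder.handle({"type": "speech_end", "data": {"time": 1.9}})
```

In a real stream a partial transcript may still arrive after speech_end, so a production version would likely also reconcile the early text with the final transcript once it lands.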

How it works

  1. speech_start → fired when human speech begins.

  2. speech_end → fired when speech stops.

  3. Transcript text might arrive later, but you already know the turn boundaries.
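The steps above can be sketched as a small turn tracker that records boundaries from the two events alone. This is a minimal sketch, assuming each message is a dict with a "type" field and a timestamp in seconds under data.time; that shape, and the Turn and TurnTracker names, are illustrative assumptions rather than the documented payload.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Turn:
    start: float                 # seconds from the start of the audio (assumed unit)
    end: Optional[float] = None  # None while the speaker is still talking


@dataclass
class TurnTracker:
    turns: List[Turn] = field(default_factory=list)

    def handle(self, message: dict) -> None:
        """Update turn boundaries from one message; other message types are ignored."""
        kind = message.get("type")
        time = message.get("data", {}).get("time")  # assumed field name
        if kind == "speech_start":
            self.turns.append(Turn(start=time))
        elif kind == "speech_end" and self.speaking:
            self.turns[-1].end = time

    @property
    def speaking(self) -> bool:
        """True while the most recent turn has started but not yet ended."""
        return bool(self.turns) and self.turns[-1].end is None


tracker = TurnTracker()
tracker.handle({"type": "speech_start", "data": {"time": 1.0}})
tracker.handle({"type": "speech_end", "data": {"time": 2.5}})
```

The speaking property is what a UI or bot handoff would typically poll, since it flips as soon as an event arrives rather than when transcript text does.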

API References