How to combine real-time transcription with translation?

When you enable translation in a real-time transcription session, Gladia sends not only the usual transcription payloads but also a dedicated translation payload. This message provides the translated text alongside the original utterance and its metadata.

Enable translation

To enable translation for a real-time transcription session, add the following to your configuration:


"realtime_processing": {
   "translation": true,
   "translation_config": {
      "target_languages": ["fr", "de"]
   }
}

To learn more about the translation parameters, see: PLAIN CENTER TRANSLATION PAGE

The Translation Payload

A translation message always has type: "translation". It contains both the original utterance and its translated counterpart, so you can display or store them together.


{
  "session_id": "4a39145c-2844-4557-8f34-34883f7be7d9",
  "created_at": "2021-09-01T12:00:00.123Z",
  "error": null,
  "type": "translation",
  "data": {
    "utterance_id": "00-00000011",
    "utterance": {
      "language": "en",
      "start": 123,
      "end": 123,
      "confidence": 123,
      "channel": 1,
      "speaker": 1,
      "words": [
        { "word": "<string>", "start": 123, "end": 123, "confidence": 123 }
      ],
      "text": "<string>"
    },
    "original_language": "en",
    "target_language": "fr",
    "translated_utterance": {
      "language": "fr",
      "start": 123,
      "end": 123,
      "confidence": 123,
      "channel": 1,
      "speaker": 1,
      "words": [
        { "word": "<string>", "start": 123, "end": 123, "confidence": 123 }
      ],
      "text": "<string>"
    }
  }
}

To learn more about the translation event, see: https://docs.gladia.io/api-reference/v2/live/callback/translation
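
As an illustration, here is a minimal Python sketch of how a client might pick translation messages out of the incoming stream and extract the fields shown above. The TranslatedCaption class and the parse_translation_message function are hypothetical names introduced for this example, not part of any Gladia SDK. The sketch assumes each message arrives as a JSON string with the shape of the payload above.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class TranslatedCaption:
    """Hypothetical container for one original/translated utterance pair."""
    utterance_id: str
    original_language: str
    target_language: str
    original_text: str
    translated_text: str
    start: float
    end: float
    speaker: Optional[int]


def parse_translation_message(raw: str) -> Optional[TranslatedCaption]:
    """Return a TranslatedCaption for a translation message, else None."""
    message = json.loads(raw)
    # Ignore other message types and messages that carry an error.
    if message.get("type") != "translation" or message.get("error"):
        return None
    data = message["data"]
    source = data["utterance"]
    target = data["translated_utterance"]
    return TranslatedCaption(
        utterance_id=data["utterance_id"],
        original_language=data["original_language"],
        target_language=data["target_language"],
        original_text=source["text"],
        translated_text=target["text"],
        start=source["start"],
        end=source["end"],
        speaker=source.get("speaker"),  # treated as optional in this sketch
    )
```

Returning None for every other message type lets the same handler sit in front of the full message stream without special-casing transcription payloads.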

How to Use It

Display dual captions: Show the source text (utterance.text) and the translation (translated_utterance.text) together.
Align by utterance ID: Use utterance_id to match each translation to the transcription it belongs to.
Leverage metadata: Use speaker, start, and end to keep captions well-timed and attributed.
Store bilingual transcripts: Save both utterance objects to allow replay, subtitle generation, or export to SRT/VTT later.
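
The points above could be combined roughly as follows. This is a sketch under stated assumptions, not a reference implementation: BilingualTranscript and srt_timestamp are hypothetical helpers, the input is the data object of a translation message, and start and end are assumed to be expressed in seconds.

```python
from typing import Dict


def srt_timestamp(seconds: float) -> str:
    """Format a time offset (assumed to be in seconds) as HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


class BilingualTranscript:
    """Hypothetical store that keeps both utterance objects per utterance_id."""

    def __init__(self) -> None:
        self._by_id: Dict[str, dict] = {}

    def add(self, data: dict) -> None:
        """Store the `data` object of one translation message."""
        entry = self._by_id.setdefault(
            data["utterance_id"],
            {"utterance": data["utterance"], "translations": {}},
        )
        # One translated utterance per target language.
        entry["translations"][data["target_language"]] = data["translated_utterance"]

    def to_srt(self, target_language: str) -> str:
        """Export dual captions (source line, then translation) as SRT text."""
        entries = [
            e for e in self._by_id.values() if target_language in e["translations"]
        ]
        entries.sort(key=lambda e: e["utterance"]["start"])
        blocks = []
        for index, entry in enumerate(entries, start=1):
            source = entry["utterance"]
            translated = entry["translations"][target_language]
            blocks.append(
                f"{index}\n"
                f"{srt_timestamp(source['start'])} --> {srt_timestamp(source['end'])}\n"
                f"{source['text']}\n"
                f"{translated['text']}\n"
            )
        return "\n".join(blocks)
```

Keying the store on utterance_id keeps each translation attached to its source utterance even when several target languages are enabled, and keeping the full utterance objects leaves room for other export formats later.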