How to deal with noisy or poor-quality audio?

Overview

If real-time transcription is not working as expected, the cause may be the quality of your audio. The speech_threshold parameter is there to help with this. This guide explains what it does and how to set it. It belongs to the pre_processing object of the live session configuration:


"pre_processing": {
   "speech_threshold": 0.7
}
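
As a minimal sketch, the same fragment can be built and range-checked in Python before it is included in the live session request body. The helper name build_pre_processing is hypothetical, and the accepted range of 0 to 1 is an assumption based on this guide. Only the pre_processing and speech_threshold keys come from the configuration itself.

```python
import json


def build_pre_processing(speech_threshold: float = 0.6) -> dict:
    """Return the pre_processing fragment for a live session config.

    Hypothetical helper for illustration; 0.6 is the default stated in this guide.
    """
    # Assumption: valid values lie between 0 and 1 inclusive.
    if not 0.0 <= speech_threshold <= 1.0:
        raise ValueError("speech_threshold must be between 0 and 1")
    return {"pre_processing": {"speech_threshold": speech_threshold}}


if __name__ == "__main__":
    # Print the fragment as JSON, ready to merge into the request body.
    print(json.dumps(build_pre_processing(0.7), indent=2))
```

Rejecting out-of-range values locally is a design choice that surfaces typos before a session is opened. The API may apply its own validation as well.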

Understanding the Speech Threshold Parameter

The speech_threshold parameter determines what is treated as speech and what is treated as silence during audio processing. Set to the right value, it filters out background noise so that only clean speech is passed to the model.

Setting this parameter to 1 will disable all speech detection, causing the system to interpret all audio as silence.

Recommended Settings

The default value for speech_threshold is 0.6.
- For poor-quality audio sources, try 0.7.
- Test values between 0.5 and 0.8 to find the optimal setting for your needs.
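
One way to organize that testing is to generate the candidate values up front and run the same audio sample once per value. The sketch below only produces the list of values. The function name candidate_thresholds and the step of 0.1 are illustrative assumptions, not part of the API.

```python
from typing import List


def candidate_thresholds(start: float = 0.5, stop: float = 0.8,
                         step: float = 0.1) -> List[float]:
    """Return evenly spaced threshold values from start to stop, inclusive."""
    if step <= 0 or stop < start:
        raise ValueError("step must be positive and stop must not be below start")
    # Count the steps first, then round each value to avoid float drift
    # such as 0.7999999999999999.
    count = int(round((stop - start) / step))
    return [round(start + i * step, 2) for i in range(count + 1)]


if __name__ == "__main__":
    for value in candidate_thresholds():
        # Each value would be used as pre_processing.speech_threshold for one test run.
        print(value)
```

A smaller step such as 0.05 gives a finer sweep at the cost of more test runs.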

Steps to Resolve the Issue

- Check your current speech_threshold setting.
- If it is set to 1, lower it to a value between 0.1 and 0.9 (0.5 to 0.8 is the recommended range to test).
- Test the transcription again to confirm that it is working correctly.
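
The first two steps can be sketched as a small check on the configuration before a session is started. This is an illustration under assumptions: fix_speech_threshold is a hypothetical helper, the configuration is assumed to be a plain dictionary, and falling back to the default of 0.6 is one reasonable choice among the values suggested above. Re-testing the transcription is left to the caller.

```python
import copy

DEFAULT_THRESHOLD = 0.6  # default value stated in this guide


def fix_speech_threshold(config: dict, fallback: float = DEFAULT_THRESHOLD) -> dict:
    """Return a copy of config in which speech_threshold is below 1."""
    if not 0.1 <= fallback <= 0.9:
        raise ValueError("fallback must be between 0.1 and 0.9")
    fixed = copy.deepcopy(config)  # leave the caller's dictionary untouched
    # Step 1: read the current setting (missing means the default applies).
    current = fixed.get("pre_processing", {}).get("speech_threshold", DEFAULT_THRESHOLD)
    # Step 2: a value of 1 makes all audio count as silence, so replace it.
    if current >= 1:
        fixed["pre_processing"]["speech_threshold"] = fallback
    return fixed


if __name__ == "__main__":
    broken = {"pre_processing": {"speech_threshold": 1}}
    print(fix_speech_threshold(broken))
```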

Additional Troubleshooting

If adjusting the speech_threshold does not resolve the issue, consider the following:

- Verify that your API key and session IDs are correct.
- Ensure that there are no network issues affecting the API calls.
- Contact support if the problem persists.

For details, see the speech_threshold reference for real-time speech-to-text: https://docs.gladia.io/api-reference/v2/live/init#body-pre-processing-speech-threshold