Improving audio quality for better transcription

Vijayalakshmi Raghavan (Viji)

Co-Founder & CEO | November 1, 2024

Good audio quality is key to accurate transcription with automatic speech recognition (ASR) models. Poor recording quality can leave sections inaudible, which reduces the accuracy of the transcript.

At pradhi.ai, we recommend that users follow these suggestions to improve the quality of their audio:

1. Pay close attention to the room acoustics. Large, empty rooms produce echoes, which degrade sound quality. Simple steps such as closing doors and windows to keep out outside noise can improve the audio.

2. Aim for a consistent environment: the same room and the same meeting recorder or recording device. This helps keep audio quality consistent.

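3. Record in the M4A file format, as this produces a high-quality sound file that is also small. Note that M4A files usually contain AAC audio, which is a lossy (compressed) encoding. The WAV format is a good alternative: it typically stores uncompressed, lossless audio, at the cost of larger files. MP3, another lossy format, is also an acceptable choice.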

4. Test your microphone or recording device beforehand. Doing a trial run allows you to choose the best spot for the microphone in advance.

5. Use a good omnidirectional microphone (Bluetooth-enabled ones are available) that can be moved around for a group recording.

6. For online meetings on common meeting tools such as Zoom, GMeet or Teams, ask participants to turn on the advanced noise cancellation and background sound suppression settings. Test the audio quality once these settings are in place to confirm that recordings sound good.
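A trial recording can also be checked with a few lines of code before the real session. The sketch below is only an illustration: it assumes the test clip was saved as an uncompressed PCM WAV file (Python's standard `wave` module does not read M4A or MP3), and the function name `describe_wav` and the file name `trial.wav` are hypothetical. It reports basic properties of the file and makes no judgment about which values are best.

```python
import wave


def describe_wav(path):
    """Return basic properties of an uncompressed PCM WAV file.

    Hypothetical helper for inspecting a trial recording.
    """
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),                # 1 = mono, 2 = stereo
            "sample_rate_hz": rate,                       # samples per second
            "bits_per_sample": wf.getsampwidth() * 8,     # sample width in bits
            "duration_s": frames / rate,                  # length in seconds
        }


# Example usage (the file name is an assumption):
# print(describe_wav("trial.wav"))
```

Running this on trial clips from the same room and device makes it easy to confirm that each session is recorded with the same settings.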

During the recording:

1. Place your microphone close to the speaker.

The best spot is either directly below or to the side of the speaker’s mouth. The farther the microphone is from the speaker’s mouth, the more muffled the sound. Ideally, the microphone should be no more than a foot away. In a meeting, use the meeting recorder rather than a phone to record the audio. Ask speakers to speak close to their microphone. For a group discussion, place multiple microphones strategically around the room if required. In case of an online meeting, ensure that everyone

2. Limit background noise and overlapping conversation.

When conversations overlap, transcription quality suffers. If several people speak at the same time, the facilitator should afterwards ask each person to repeat their point one at a time so that everything said is captured. The facilitator should also limit unnecessary noise as much as possible to ensure clarity.

3. To ensure good speaker recognition, have speakers introduce themselves at the beginning of the conversation so that they can be properly identified in the transcript.

4. Repeat questions or important statements: The facilitator should repeat questions and any important statements made by the group. This ensures that nothing crucial is missed by the recording.

5. Monitor input volume as you record: If someone in the group is speaking too softly, ask them to speak up or adjust the microphone so that they are audible. Make sure you stay in the ideal “green zone”, where the volume is neither too high nor too low. Both extremes negatively affect audio quality.
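The "green zone" check in the last tip can be approximated in code on a short test clip. What follows is a minimal sketch, assuming a 16-bit PCM WAV file. The function names and both thresholds are illustrative assumptions, not values recommended in this article or by any particular recorder. It measures the peak and average (RMS) level in dBFS, where 0 dBFS is the loudest value the file can store, and flags clips that look too loud or too quiet.

```python
import array
import math
import sys
import wave

# Illustrative thresholds only (assumptions, tune them for your own setup).
TOO_LOUD_PEAK_DBFS = -1.0    # peaks this close to full scale risk clipping
TOO_QUIET_RMS_DBFS = -40.0   # average level this low is likely hard to hear


def _to_dbfs(ratio):
    """Convert a 0..1 amplitude ratio to decibels relative to full scale."""
    return 20 * math.log10(ratio) if ratio > 0 else float("-inf")


def measure_levels(path):
    """Return (peak_dbfs, rms_dbfs) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        raw = wf.readframes(wf.getnframes())
    samples = array.array("h")       # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()           # WAV sample data is little-endian
    if not samples:
        raise ValueError("the file contains no audio frames")
    full_scale = 32768.0
    peak = max(abs(s) for s in samples) / full_scale
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / full_scale
    return _to_dbfs(peak), _to_dbfs(rms)


def classify(peak_dbfs, rms_dbfs):
    """Label a clip as 'too loud', 'too quiet', or 'ok'."""
    if peak_dbfs >= TOO_LOUD_PEAK_DBFS:
        return "too loud"
    if rms_dbfs <= TOO_QUIET_RMS_DBFS:
        return "too quiet"
    return "ok"


# Example usage (the file name is an assumption):
# peak, rms = measure_levels("trial.wav")
# print(classify(peak, rms))
```

This looks at the whole clip at once, so it suits a trial run before the session. The level meter on the recording device remains the practical way to watch volume while recording.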

Audio limiters and sound editors can enhance a recording with options such as background noise reduction, echo cancellation, and volume boost. A transcription service can handle files with difficult audio and still deliver high accuracy and quick turnaround. However, following these best practices helps make the process smooth for customers.
