To process audio in real time, you stream your audio to Deepgram as it is captured. Deepgram returns interim transcripts that finalize and self-correct as additional words are spoken.
Deepgram gives you streamlined access to automatic transcription from its off-the-shelf and trained speech recognition models. The product is fast, accepts nearly every audio format available, and is customizable.
When streaming audio in real time, you send live audio to Deepgram and receive both live transcriptions and transcript corrections in return. Throughout the stream, you receive multiple response messages as new transcripts become available and earlier transcripts are corrected.
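The flow of new and corrected transcripts described above can be handled on the client by keeping finalized text separate from the latest interim text. The following is a minimal sketch, not Deepgram's documented response schema: the message shape (`transcript` and `is_final` fields) and the `TranscriptAssembler` class are assumptions made for illustration.

```python
import json


class TranscriptAssembler:
    """Keeps finalized text plus the latest interim (still-changing) text."""

    def __init__(self):
        self.final_segments = []  # segments that will no longer change
        self.interim = ""         # latest guess for the segment in progress

    def handle_message(self, raw):
        # Assumed (hypothetical) message shape:
        #   {"transcript": "<text>", "is_final": true | false}
        message = json.loads(raw)
        text = message.get("transcript", "")
        if message.get("is_final", False):
            # A final message replaces whatever interim text was displayed.
            if text:
                self.final_segments.append(text)
            self.interim = ""
        else:
            # An interim message overwrites the previous interim guess.
            self.interim = text
        return self.current_text()

    def current_text(self):
        parts = list(self.final_segments)
        if self.interim:
            parts.append(self.interim)
        return " ".join(parts)


if __name__ == "__main__":
    assembler = TranscriptAssembler()
    for raw in (
        '{"transcript": "hello whirled", "is_final": false}',
        '{"transcript": "hello world", "is_final": true}',
        '{"transcript": "how are", "is_final": false}',
    ):
        print(assembler.handle_message(raw))
```

Each call returns the text a user interface would display at that moment, so an interim phrase that starts out wrong is overwritten once its corrected or final version arrives.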
Deepgram's real-time streaming is ideal for use cases involving live audio streams that need to be analyzed and transcribed as words are being spoken.
You need to dictate a message to your phone. You want to be able to see the words appear as soon as they are spoken, so that you can check the message and make sure that it is correct. As you speak, words begin to appear on your screen. As you continue talking, more words appear (as new transcripts become available). Eventually, phrases that start out wrong correct themselves (as old transcripts are corrected) until you have a final message.
You manage a call center. You want all calls taken by your agents to be transcribed in real time, so you can give each agent timely information that helps them deliver more effective customer service.
Real-time streaming uses WebSockets, a communications protocol that enables full-duplex communication: you can stream new audio to Deepgram at the same time as the latest transcription results stream back to you. A wide variety of third-party WebSocket client libraries exist for a range of languages and production environments, which makes the protocol easier to adopt.
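To illustrate what full-duplex means for client code, the sketch below runs a sender and a receiver concurrently with `asyncio`. It makes no network connection: `FakeDuplexSocket` is a hypothetical in-memory stand-in for a WebSocket connection, and its `send`, `recv`, and `close_send` methods are assumptions for this example, not the API of any particular client library.

```python
import asyncio


class FakeDuplexSocket:
    """Hypothetical in-memory stand-in for a WebSocket connection."""

    def __init__(self):
        self._results = asyncio.Queue()
        self._count = 0

    async def send(self, chunk):
        # Pretend the server answers every audio chunk with one result.
        self._count += 1
        await self._results.put(f"result {self._count} ({len(chunk)} bytes)")

    async def close_send(self):
        # A None sentinel tells the receiver that no more results will come.
        await self._results.put(None)

    async def recv(self):
        return await self._results.get()


async def send_audio(sock, chunks, log):
    for chunk in chunks:
        await sock.send(chunk)
        log.append("sent")
        await asyncio.sleep(0)  # yield so the receiver can run between sends
    await sock.close_send()


async def receive_results(sock, log):
    results = []
    while True:
        message = await sock.recv()
        if message is None:
            break
        log.append("received")
        results.append(message)
    return results


async def stream(chunks):
    # Create the socket inside the running event loop.
    sock = FakeDuplexSocket()
    log = []
    # Both directions run at the same time over the same connection.
    _, results = await asyncio.gather(
        send_audio(sock, chunks, log),
        receive_results(sock, log),
    )
    return results, log


if __name__ == "__main__":
    results, log = asyncio.run(stream([b"\x00" * 320] * 3))
    print(results)
    print(log)
```

The point of the pattern is that the receiver does not wait for the sender to finish: results are consumed while later audio chunks are still being sent. With a real client library, the two coroutines would share an actual WebSocket connection instead of the in-memory queue.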