Deepgram
Enterprise speech-to-text, text-to-speech, and voice-agent APIs metered by the minute.
What is Deepgram?
Deepgram sells enterprise voice infrastructure built around the Nova-3 speech-to-text family, the Aura text-to-speech engine, and Flux, a unified voice-agent API that bundles ASR, TTS, and LLM orchestration into one endpoint. It supports real-time streaming and batch (pre-recorded) transcription, deployed in the cloud or self-hosted, with custom model training available for regulated industries. Pricing is metered per minute of audio for transcription and per thousand characters for synthesis, with a published Pay-As-You-Go tier and a committed (prepaid) Growth tier. The positioning is squarely enterprise: high-throughput, low-latency contact-center and voice-agent workloads where milliseconds and per-minute cost both matter.
Voice agents and conversational AI platforms for calls, qualification, scheduling, support, and audio workflows.
Use cases to evaluate
Powering real-time voice agents that combine STT, LLM reasoning and TTS in one Flux call
Transcribing high-volume call-center recordings with redaction and diarization
Generating natural-sounding voice responses with Aura-2 for IVR and assistant products
Running self-hosted ASR in regulated environments where audio cannot leave the network
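For the call-recording use case, the request shape can be sketched as follows. This is a minimal illustration, not an official client: the endpoint `https://api.deepgram.com/v1/listen`, the query parameters `model`, `diarize`, and `redact`, and the `Token` authorization scheme are assumptions to verify against Deepgram's current API reference. The helper name `build_transcription_request` is hypothetical. The code only builds the request and does not send it.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; confirm against the vendor's current API reference.
LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_transcription_request(audio_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a pre-recorded transcription request.

    Hypothetical helper: parameter names below are assumptions.
    """
    params = {
        "model": "nova-3",   # speech-to-text model family named in this profile
        "diarize": "true",   # assumed flag: label speakers in the transcript
        "redact": "pii",     # assumed flag: mask personal data in the output
    }
    query = urllib.parse.urlencode(params)
    # The recording is referenced by URL in a JSON body.
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{LISTEN_URL}?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_transcription_request("https://example.com/call.wav", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes it easy to inspect the parameters during an evaluation before any audio leaves your network.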
Fit to evaluate
Voice-AI product teams shipping conversational agents at production scale
Contact-center platforms needing low-latency streaming transcription
Healthcare, finance and government buyers requiring on-prem or VPC deployment
Developers who want a single API for both ASR and TTS rather than stitching vendors
Business fit
Right for you if you are building a voice agent, transcribing call-center audio at scale, or need self-hosted ASR for compliance reasons. Skip if you only transcribe a handful of meetings a week or want a finished consumer transcription product rather than a developer API.
How to evaluate Deepgram
Use this category when missed calls, slow qualification, or phone support volume affects revenue.
Confirm the exact workflow
Map Deepgram to one concrete workflow first, such as powering real-time voice agents that combine STT, LLM reasoning, and TTS in one Flux call. Avoid buying before the owner, trigger, output, and success metric are clear.
Check category fit
Test voice quality, latency, interruptions, and escalation behavior.
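One way to make the latency part of that test repeatable is to time a fixed number of calls and report percentiles rather than a single average. The sketch below is vendor-neutral and makes no assumptions about any API: `time_calls` and `percentile` are hypothetical helper names, and the function you pass in stands for whatever request you are evaluating.

```python
import math
import time
from typing import Callable, Sequence


def time_calls(fn: Callable[[], object], runs: int) -> list[float]:
    """Call fn `runs` times and return each wall-clock duration in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


if __name__ == "__main__":
    # Stand-in workload; replace the lambda with the call under test.
    latencies = time_calls(lambda: time.sleep(0.01), runs=20)
    print(f"p50={percentile(latencies, 50):.1f} ms  p95={percentile(latencies, 95):.1f} ms")
```

Tail percentiles such as p95 matter more than the median for voice agents, because occasional slow responses are what callers perceive as awkward pauses.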
Compare practical alternatives
Shortlist Deepgram against Retell AI, Vapi, and Bland AI so the decision is based on fit, effort, and workflow ownership rather than brand recognition alone.
Validate cost and rollout effort
On Pay-As-You-Go, Nova-3 streaming costs $0.0048/min monolingual and $0.0058/min multilingual; pre-recorded costs $0.0077/min and $0.0092/min respectively. The Growth tier (from $4K+/year prepaid) drops streaming to $0.0042/min and pre-recorded to $0.0065/min, saving up to 20%. Aura-2 TTS costs $0.030 per 1k characters on PAYG and $0.027 on Growth; Aura-1 costs $0.0150 per 1k characters. New accounts get a $200 free credit. Enterprise pricing for large-volume or self-hosted deployments requires contacting sales. Also confirm implementation time, support needs, and whether the medium setup and maintenance effort matches your team's capacity.
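To turn the quoted rates into a budget figure, a small estimator helps. The sketch below uses only the prices listed above and assumes the Growth rates apply to monolingual audio, which the listing does not state explicitly. `RATES` and `monthly_cost` are hypothetical names, and the volumes in the example are invented for illustration.

```python
from decimal import Decimal

# Rates as quoted above (USD). Growth rates are assumed to be monolingual.
RATES = {
    "payg": {
        "streaming_per_min": Decimal("0.0048"),
        "prerecorded_per_min": Decimal("0.0077"),
        "aura2_per_1k_chars": Decimal("0.030"),
    },
    "growth": {
        "streaming_per_min": Decimal("0.0042"),
        "prerecorded_per_min": Decimal("0.0065"),
        "aura2_per_1k_chars": Decimal("0.027"),
    },
}


def monthly_cost(tier: str, streaming_min: int = 0,
                 prerecorded_min: int = 0, tts_chars: int = 0) -> Decimal:
    """Estimate one month's spend for the given usage on one tier."""
    rate = RATES[tier]
    return (
        rate["streaming_per_min"] * streaming_min
        + rate["prerecorded_per_min"] * prerecorded_min
        + rate["aura2_per_1k_chars"] * Decimal(tts_chars) / 1000
    )


if __name__ == "__main__":
    # Illustrative volumes only.
    usage = dict(streaming_min=100_000, prerecorded_min=50_000, tts_chars=2_000_000)
    payg = monthly_cost("payg", **usage)
    growth = monthly_cost("growth", **usage)
    print(f"PAYG ${payg:.2f}  Growth ${growth:.2f}  saving ${payg - growth:.2f}")
```

At these rates the per-line discounts are 12.5% for streaming, about 15.6% for pre-recorded, and 10% for Aura-2, so the blended saving depends on your mix of workloads and sits below the headline "up to 20%".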
Compare Deepgram with alternatives
Use this quick comparison before booking demos or moving data into a new system.
| Criterion | Deepgram |
|---|---|
| Primary workflow | Powering real-time voice agents that combine STT, LLM reasoning and TTS in one Flux call; transcribing high-volume call-center recordings with redaction and diarization |
| Best-fit team | Voice-AI product teams shipping conversational agents at production scale; contact-center platforms needing low-latency streaming transcription |
| Implementation effort | Medium setup and maintenance profile |
| Pricing check | Per-minute audio metering for transcription, per-1k-characters for TTS, with PAYG, Growth (prepaid annual), and Enterprise tiers; $200 signup credit. |
| Closest alternatives | Retell AI, Vapi, Bland AI, Synthflow |
Deepgram pricing
| Item | Details |
|---|---|
| Model | Per-minute audio metering for transcription, per-1k-characters for TTS, with PAYG, Growth (prepaid annual), and Enterprise tiers; $200 signup credit. |
| Snapshot | Pay-As-You-Go Nova-3 streaming at $0.0048/min monolingual and $0.0058/min multilingual; pre-recorded at $0.0077/min and $0.0092/min respectively. Growth tier (from $4K+/year prepaid) drops streaming to $0.0042/min and pre-recorded to $0.0065/min, saving up to 20%. Aura-2 TTS at $0.030 per 1k characters PAYG, $0.027 on Growth; Aura-1 at $0.0150 per 1k characters. New accounts get $200 free credit. Enterprise pricing for large-volume or self-hosted deployments is contact-sales. |
| Checked | |
Common questions about Deepgram
What is Deepgram used for?
Common use cases: powering real-time voice agents that combine STT, LLM reasoning and TTS in one Flux call; transcribing high-volume call-center recordings with redaction and diarization; generating natural-sounding voice responses with Aura-2 for IVR and assistant products; and running self-hosted ASR in regulated environments where audio cannot leave the network.
How much does Deepgram cost?
On Pay-As-You-Go, Nova-3 streaming costs $0.0048/min monolingual and $0.0058/min multilingual; pre-recorded costs $0.0077/min and $0.0092/min respectively. The Growth tier (from $4K+/year prepaid) drops streaming to $0.0042/min and pre-recorded to $0.0065/min, saving up to 20%. Aura-2 TTS costs $0.030 per 1k characters on PAYG and $0.027 on Growth; Aura-1 costs $0.0150 per 1k characters. New accounts get a $200 free credit. Enterprise pricing for large-volume or self-hosted deployments requires contacting sales.
Who is Deepgram best for?
Deepgram fits voice-AI product teams shipping conversational agents at production scale; contact-center platforms needing low-latency streaming transcription; healthcare, finance, and government buyers requiring on-prem or VPC deployment; and developers who want a single API for both ASR and TTS rather than stitching vendors together. It is right for you if you are building a voice agent, transcribing call-center audio at scale, or need self-hosted ASR for compliance reasons. Skip it if you only transcribe a handful of meetings a week or want a finished consumer transcription product rather than a developer API.
What are alternatives to Deepgram?
Common alternatives to Deepgram include Retell AI, Vapi, Bland AI, Synthflow, ElevenLabs Conversational AI, and PolyAI.