Speech-to-Text

Convert speech to text using AI

Overview

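Transcribe audio and video files to text using OpenAI Whisper, Deepgram, ElevenLabs, AssemblyAI, or Google Gemini. The block supports 20+ languages, sentence- or word-level timestamps, and, with Deepgram or AssemblyAI, speaker diarization.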

Setup

1. Add the Speech-to-Text block to your workflow
2. Select your preferred provider
3. Enter the provider's API key
4. Upload an audio/video file or provide a URL

Providers

Provider	Models	Features
OpenAI Whisper	whisper-1	Translation to English
Deepgram	nova-3, nova-2, enhanced, base	Speaker diarization
ElevenLabs	scribe_v1	High accuracy
AssemblyAI	best, nano	Sentiment analysis, entity detection, PII redaction, summarization
Google Gemini	gemini-2.0-flash, gemini-2.5-pro	Large file support
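
The provider and model pairs in the table can be checked before a workflow runs. The following is a minimal sketch, not the block's own validation logic: the dictionary keys reuse the display names from the table, and the actual values behind the `provider` dropdown may differ. `PROVIDER_MODELS` and `is_valid_model` are hypothetical names introduced for illustration.

```python
# Provider -> models, copied from the Providers table.
# Keys are the display names; the real dropdown values are an assumption.
PROVIDER_MODELS = {
    "OpenAI Whisper": ["whisper-1"],
    "Deepgram": ["nova-3", "nova-2", "enhanced", "base"],
    "ElevenLabs": ["scribe_v1"],
    "AssemblyAI": ["best", "nano"],
    "Google Gemini": ["gemini-2.0-flash", "gemini-2.5-pro"],
}


def is_valid_model(provider: str, model: str) -> bool:
    """Return True if the table lists `model` under `provider`."""
    return model in PROVIDER_MODELS.get(provider, [])
```

For example, `is_valid_model("Deepgram", "nova-3")` is true, while `is_valid_model("ElevenLabs", "whisper-1")` is false because `whisper-1` belongs to a different provider.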

Configuration

Parameter	Type	Required	Description
`provider`	dropdown	Yes	STT provider
`model`	dropdown	Yes	Provider-specific model
`apiKey`	string	Yes	Provider API key
`audioFile`	file	Yes	Audio/video file to transcribe
`audioUrl`	string	No	Publicly accessible audio/video URL, as an alternative to uploading a file
`language`	dropdown	Yes	Language code or auto-detect
`timestamps`	dropdown	Yes	Timestamp granularity: `none`, `sentence`, `word`
`diarization`	boolean	No	Speaker diarization (Deepgram/AssemblyAI)
`translateToEnglish`	boolean	No	Translate to English (Whisper only)
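
The constraints in this table can be expressed as a small pre-flight check. This is a sketch under stated assumptions rather than the block's actual behavior: it treats either `audioFile` or `audioUrl` as a valid audio source (as the Setup steps suggest), uses the provider display names as values, and uses `"auto"` as a placeholder for auto-detect. `validate_config` and `example_config` are hypothetical names.

```python
ALLOWED_TIMESTAMPS = {"none", "sentence", "word"}
DIARIZATION_PROVIDERS = {"Deepgram", "AssemblyAI"}  # per the `diarization` row
TRANSLATION_PROVIDERS = {"OpenAI Whisper"}          # per the `translateToEnglish` row


def validate_config(config: dict) -> list:
    """Return a list of problems found in a block configuration (empty if none)."""
    errors = []
    for key in ("provider", "model", "apiKey", "language", "timestamps"):
        if not config.get(key):
            errors.append(f"missing required parameter: {key}")
    # Assumption: a URL can stand in for an uploaded file.
    if not config.get("audioFile") and not config.get("audioUrl"):
        errors.append("provide audioFile or audioUrl")
    timestamps = config.get("timestamps")
    if timestamps and timestamps not in ALLOWED_TIMESTAMPS:
        errors.append(f"unsupported timestamps value: {timestamps}")
    provider = config.get("provider")
    if config.get("diarization") and provider not in DIARIZATION_PROVIDERS:
        errors.append("diarization is listed for Deepgram and AssemblyAI only")
    if config.get("translateToEnglish") and provider not in TRANSLATION_PROVIDERS:
        errors.append("translateToEnglish is listed for Whisper only")
    return errors


# Illustrative values only; the key, URL, and language code are placeholders.
example_config = {
    "provider": "Deepgram",
    "model": "nova-3",
    "apiKey": "YOUR_API_KEY",
    "audioUrl": "https://example.com/meeting.mp3",
    "language": "auto",
    "timestamps": "word",
    "diarization": True,
}
```

Calling `validate_config(example_config)` returns an empty list. Setting `translateToEnglish` to `True` on the same configuration would add one error, since that option is listed for Whisper only.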

AssemblyAI-specific Options

Parameter	Type	Description
`sentiment`	boolean	Enable sentiment analysis
`entityDetection`	boolean	Enable entity detection
`piiRedaction`	boolean	Enable PII redaction
`summarization`	boolean	Enable auto-summarization

Output

Parameter	Type	Description
`transcript`	string	Full transcribed text
`segments`	array	Timestamped segments with speaker labels
`language`	string	Detected or specified language
`duration`	number	Audio duration in seconds
`confidence`	number	Confidence score (Deepgram/AssemblyAI/Gemini)
`sentiment`	array	Sentiment results (AssemblyAI only)
`entities`	array	Detected entities (AssemblyAI only)
`summary`	string	Auto-generated summary (AssemblyAI only)
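
A downstream step might turn `segments` into a readable, speaker-labeled transcript. The table does not specify the shape of each segment, so the sketch below assumes every segment is a dictionary with `start` and `end` (seconds), `text`, and an optional `speaker`; those field names are hypothetical, as is `format_segments`.

```python
def format_segments(output: dict) -> str:
    """Render the block output as one line per segment.

    Assumes each segment looks like
    {"start": 0.0, "end": 2.5, "speaker": "A", "text": "..."};
    the real field names may differ.
    """
    lines = []
    for seg in output.get("segments", []):
        stamp = f"[{seg['start']:.2f}s-{seg['end']:.2f}s]"
        speaker = seg.get("speaker")
        # Speaker labels only appear when diarization is enabled.
        prefix = f"{speaker}: " if speaker is not None else ""
        lines.append(f"{stamp} {prefix}{seg['text']}")
    return "\n".join(lines)
```

When `segments` is empty or missing, for instance with `timestamps` set to `none`, the function returns an empty string and the caller can fall back to `transcript`.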

Supported Formats

Audio: MP3, M4A, WAV, WebM, OGG, FLAC, AAC, OPUS

Video: MP4, MOV, AVI, MKV
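
A file name or URL can be checked against these lists before it is passed to the block. This is a minimal sketch that looks only at the file extension, which is an assumption; the block itself may inspect the file contents instead. `media_kind` is a hypothetical helper.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Extensions taken from the Supported Formats lists, lowercased.
AUDIO_FORMATS = {"mp3", "m4a", "wav", "webm", "ogg", "flac", "aac", "opus"}
VIDEO_FORMATS = {"mp4", "mov", "avi", "mkv"}


def media_kind(path_or_url: str) -> Optional[str]:
    """Return "audio", "video", or None based on the file extension."""
    # urlparse drops any query string; plain file names pass through unchanged.
    path = urlparse(path_or_url).path
    ext = PurePosixPath(path).suffix.lower().lstrip(".")
    if ext in AUDIO_FORMATS:
        return "audio"
    if ext in VIDEO_FORMATS:
        return "video"
    return None
```

The check is case-insensitive, so `meeting.MP3` is classified as audio, and query strings on a URL are ignored.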

Notes

Category: tools
Type: stt
20+ language options available
