Audio & Voice
Text-to-speech, voice cloning, transcription, and music generation
10 tools
Deploy real-time AI agents that talk, type, and take action with voice synthesis, multimodal capabilities, and enterprise integration for customer support and business automation.
Deepgram helps developers build conversational AI using enterprise voice AI APIs for speech-to-text, text-to-speech, and real-time voice agent processing.
Cartesia Sonic is a real-time text-to-speech API with ultra-low latency (90ms). It generates natural, expressive voices, with laughter and emotion controls, in 40+ languages for AI voice agents.
Noiz AI is a text-to-speech platform that generates emotionally expressive voice output using emoji-based tone control. It enables users to create natural, nuanced voices for storytelling and messaging.
Beatoven.ai generates royalty-free background music and soundscapes for videos and podcasts using AI composition technology.
Play.ht generates natural-sounding AI voiceovers from text, offering multiple voice options for content creators, marketers, and developers.
Krisp uses AI to remove background noise and distractions from audio during calls and recordings, improving voice quality in virtual meetings.
Synthwave is an AI music creation platform that generates original compositions and soundtracks in various genres for creative projects.
Vocapia uses AI to transcribe and index audio and video content for search and accessibility, helping organizations make media searchable.
Udio is an AI music generation platform that lets creators compose, customize, and generate original music tracks.