
Show HN: SpeechSDK – free, open-source SDK that unifies all AI voice models

Hacker News AI Top · by PiersonMarks · April 3, 2026 · 1 min read

Article URL: https://www.speechsdk.dev/
Comments URL: https://news.ycombinator.com/item?id=47633441
Points: 3
Comments: 0

The Unified Text-to-Speech SDK

The SpeechSDK is a free, open-source toolkit for building AI audio applications with multiple voice providers.

12

Providers

25+

Models

Built

For Production

Open Source

MIT License

Multi-Provider

One interface across OpenAI, ElevenLabs, Deepgram, Cartesia, Google, Mistral, Hume, and more. Unified model strings, consistent response format, BYO API keys.

Cross-Platform

Runs everywhere — Node.js, Edge runtimes, and the browser. Same API, zero platform-specific code.

Node.js · Edge · Browser

Minimal Dependencies

Lightweight by design. Built-in retries, typed errors, and lazy base64 encoding. No heavy frameworks.

AI Engineering

For Production Voice Applications

Lazy base64 conversion

Only computes the format you access — uint8Array or base64 — and caches it. No unnecessary encoding or wasted memory.
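
The lazy, cached conversion described above can be sketched roughly as follows. This is a minimal illustration and not SpeechSDK's actual code: `LazyAudio`, `encodeBase64`, `decodeBase64`, and the `conversions` counter are hypothetical names, and the base64 routines are written by hand only to keep the sketch free of platform-specific APIs.

```typescript
// Hypothetical sketch of lazy, cached format conversion.
// None of these names come from SpeechSDK itself.
const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

function encodeBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    const n = (b0 << 16) | (b1 << 8) | b2;
    out += B64.charAt((n >> 18) & 63) + B64.charAt((n >> 12) & 63);
    out += i + 1 < bytes.length ? B64.charAt((n >> 6) & 63) : "=";
    out += i + 2 < bytes.length ? B64.charAt(n & 63) : "=";
  }
  return out;
}

function decodeBase64(text: string): Uint8Array {
  const clean = text.replace(/=+$/, "");
  const out = new Uint8Array(Math.floor((clean.length * 3) / 4));
  let buffer = 0;
  let bits = 0;
  let pos = 0;
  for (let i = 0; i < clean.length; i++) {
    const value = B64.indexOf(clean.charAt(i));
    if (value === -1) throw new Error("Invalid base64 character");
    // Keep only the low 16 bits; at most 12 are ever pending.
    buffer = ((buffer << 6) | value) & 0xffff;
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out[pos++] = (buffer >> bits) & 0xff;
    }
  }
  return out;
}

class LazyAudio {
  private bytes?: Uint8Array;
  private text?: string;
  conversions = 0; // counts real conversions, for demonstration only

  private constructor(bytes?: Uint8Array, text?: string) {
    this.bytes = bytes;
    this.text = text;
  }

  static fromBytes(bytes: Uint8Array): LazyAudio {
    return new LazyAudio(bytes, undefined);
  }

  static fromBase64(text: string): LazyAudio {
    return new LazyAudio(undefined, text);
  }

  // Decodes on first access only, then returns the cached bytes.
  get uint8Array(): Uint8Array {
    if (this.bytes === undefined) {
      this.conversions++;
      this.bytes = decodeBase64(this.text as string);
    }
    return this.bytes;
  }

  // Encodes on first access only, then returns the cached string.
  get base64(): string {
    if (this.text === undefined) {
      this.conversions++;
      this.text = encodeBase64(this.bytes as Uint8Array);
    }
    return this.text;
  }
}
```

In this sketch, a caller that only ever reads `uint8Array` from byte-backed audio never pays for base64 encoding, and repeated reads of `base64` reuse the first result.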

Content-type awareness

The mediaType is read directly from each provider's response headers. You always know the actual audio format — MP3 from OpenAI, WAV from Cartesia, etc.
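
As a rough illustration of reading the media type from response headers, the sketch below normalizes a `content-type` header into a bare MIME type. `HeaderSource` and `readMediaType` are hypothetical names, and the fallback value is an assumption; the SDK's own handling may differ.

```typescript
// Hypothetical sketch; not SpeechSDK's real helper.
// Any object exposing headers.get(name) fits this shape.
interface HeaderSource {
  headers: { get(name: string): string | null };
}

// Returns the bare MIME type, dropping parameters such as "; rate=44100".
function readMediaType(
  response: HeaderSource,
  fallback: string = "application/octet-stream", // assumed fallback
): string {
  const raw = response.headers.get("content-type");
  if (raw === null) return fallback;
  const mime = raw.split(";")[0].trim().toLowerCase();
  return mime === "" ? fallback : mime;
}
```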

Custom fetch & Base URL

Every provider accepts a custom fetch and baseURL. Point at OpenAI-compatible proxies, Azure OpenAI, LiteLLM, or local models. Swap in undici, a proxy-aware fetch, or a mock.
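
One way such injection could work is sketched below. Only the option names `fetch` and `baseURL` come from the text above; `createProvider`, `speak`, the `/audio/speech` path, and the request body are placeholders chosen for illustration and should not be read as SpeechSDK's API.

```typescript
// Hypothetical sketch of a provider that accepts a custom fetch and baseURL.
interface MinimalResponse {
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}

type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<MinimalResponse>;

interface ProviderOptions {
  apiKey: string;
  baseURL: string;  // e.g. a proxy or local server
  fetch: FetchLike; // e.g. a proxy-aware fetch or a test mock
}

function createProvider(options: ProviderOptions) {
  // Strip trailing slashes so path joining stays predictable.
  const base = options.baseURL.replace(/\/+$/, "");
  return {
    async speak(model: string, text: string): Promise<Uint8Array> {
      // Path and body shape are assumptions, not a documented contract.
      const response = await options.fetch(`${base}/audio/speech`, {
        method: "POST",
        headers: {
          authorization: `Bearer ${options.apiKey}`,
          "content-type": "application/json",
        },
        body: JSON.stringify({ model, input: text }),
      });
      if (!response.ok) {
        throw new Error(`Request failed with status ${response.status}`);
      }
      return new Uint8Array(await response.arrayBuffer());
    },
  };
}
```

Because every network call goes through the injected function, a test can pass a mock that records the URL and returns canned bytes without touching the network.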

Smart retries

Built-in retry with exponential backoff via p-retry. Retries 5xx and network errors automatically. 4xx errors (auth failures, bad requests) abort immediately — no wasted time.
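
The retry policy above can be approximated with a small hand-rolled helper. Note that this sketch deliberately does not use p-retry, so it stays self-contained; `HttpError`, `isRetryable`, `withRetry`, and the option names are all hypothetical, and the real SDK's delays and limits are not stated in the source.

```typescript
// Hand-rolled sketch of "retry 5xx and network errors, abort on 4xx".
// It stands in for p-retry and is not SpeechSDK's implementation.
class HttpError extends Error {
  readonly status: number;
  constructor(status: number) {
    super(`HTTP ${status}`);
    this.status = status;
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, HttpError.prototype);
  }
}

interface RetryOptions {
  retries: number;     // extra attempts after the first one
  baseDelayMs: number; // delay before the first retry; doubles each time
}

function isRetryable(error: unknown): boolean {
  // Only 5xx responses are retried; 4xx and anything else below 500 fail fast.
  if (error instanceof HttpError) return error.status >= 500;
  // Any other thrown value is treated as a network failure and retried.
  return true;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(
  task: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      if (!isRetryable(error) || attempt >= options.retries) throw error;
      // Exponential backoff: baseDelayMs, then 2x, 4x, and so on.
      await sleep(options.baseDelayMs * 2 ** attempt);
    }
  }
}
```

With these rules, a transient 503 is retried until it succeeds or the retry budget runs out, while a 401 surfaces to the caller after a single attempt.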

Minimal runtime dependencies

The only runtime dependency is p-retry. The SDK uses raw fetch and Uint8Array, with no heavy audio libraries and no provider SDK wrappers. Works anywhere fetch works.

Works seamlessly with Speech Gateway

Speech Gateway adds production infrastructure — queuing, quality processing, voice management, and analytics. One config change to connect. Coming Soon.

Provider        Model String                          Default
OpenAI          openai/gpt-4o-mini-tts                Yes
OpenAI          openai/tts-1                          -
OpenAI          openai/tts-1-hd                       -
ElevenLabs      elevenlabs/eleven_multilingual_v2     Yes
ElevenLabs      elevenlabs/eleven_v3                  -
ElevenLabs      elevenlabs/eleven_flash_v2_5          -
ElevenLabs      elevenlabs/eleven_flash_v2            -
Deepgram        deepgram/aura-2                       Yes
Cartesia        cartesia/sonic-3                      Yes
Hume            hume/octave-2                         Yes
Google          google/gemini-2.5-flash-preview-tts   Yes
Google          google/gemini-2.5-pro-preview-tts     -
Fish Audio      fish-audio/s2-pro                     Yes
Unreal Speech   unreal-speech/default                 Yes
Murf            murf/GEN2                             Yes
Resemble        resemble/default                      Yes
fal             fal-ai/                               -
Mistral         mistral/voxtral-mini-tts-2603         Yes

  • Pass just the provider name to use its default model — e.g. model: 'openai' resolves to openai/gpt-4o-mini-tts.
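
The default-model rule can be sketched as a small resolver. `DEFAULT_MODELS` and `resolveModel` are hypothetical names used only for illustration; the defaults are a subset copied from the provider table, and the SDK's real parsing logic may differ.

```typescript
// Hypothetical sketch of resolving "provider" or "provider/model" strings.
// Defaults are a subset taken from the provider table.
const DEFAULT_MODELS = new Map<string, string>([
  ["openai", "gpt-4o-mini-tts"],
  ["elevenlabs", "eleven_multilingual_v2"],
  ["deepgram", "aura-2"],
  ["cartesia", "sonic-3"],
]);

interface ResolvedModel {
  provider: string;
  model: string;
}

function resolveModel(input: string): ResolvedModel {
  const slash = input.indexOf("/");
  if (slash === -1) {
    // Bare provider name: fall back to that provider's default model.
    const model = DEFAULT_MODELS.get(input);
    if (model === undefined) {
      throw new Error(`No default model known for provider "${input}"`);
    }
    return { provider: input, model };
  }
  const provider = input.slice(0, slash);
  const model = input.slice(slash + 1);
  if (provider === "" || model === "") {
    throw new Error(`Malformed model string "${input}"`);
  }
  return { provider, model };
}
```

Under these assumptions, `resolveModel("openai")` and `resolveModel("openai/gpt-4o-mini-tts")` yield the same provider and model pair.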

Frequently asked questions

Each provider has its own SDK, request format, auth pattern, and response shape. SpeechSDK gives you one interface for all of them — same function call, same result type, same error handling. Switch providers by simply changing a model string.

One SDK, every provider. Add text-to-speech to your app in minutes with a unified, open-source interface.
