Voice Notes API Documentation

Add voice recording and transcription to your web app in minutes. VocaFuse handles browser audio recording, secure upload, automatic transcription via OpenAI Whisper, and webhook delivery. Works with React, Vue, vanilla JavaScript, Python, and Node.js.

What is VocaFuse?

VocaFuse provides a complete voice notes solution for web applications:

  • 🎤 Browser Recording - Pre-built recording widget for any web app
  • ☁️ Secure Upload - Automatic file upload with authentication
  • 📝 Auto-Transcription - OpenAI Whisper transcription, ready in about 20-30 seconds
  • 🔔 Webhook Delivery - Real-time transcription delivery to your backend
  • 🔒 Production-Ready - Secure, scalable, and reliable

Example Use Cases

  • Note-Taking Apps - Voice-to-text note capture
  • Healthcare - Patient notes and clinical documentation
  • Education - Lecture transcription and student notes
  • CRM Systems - Sales call notes and customer interactions
  • Productivity Tools - Voice memos and task capture

Frequently Asked Questions

Can I add voice notes without a DevOps team?

Yes. VocaFuse is designed for small teams and solo developers. You don't need to manage infrastructure, storage, queues, or scaling: integrate the SDK, receive transcriptions via webhook, and you're done. No DevOps expertise is required.
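To make "receive transcriptions via webhook" concrete, here is a minimal sketch of a backend endpoint using only the Python standard library. It is an illustration, not the official integration: the payload field names `recording_id` and `text`, the port, and the helper names are all assumptions, so check the actual webhook payload before relying on them.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_transcription_event(raw_body: bytes) -> tuple[int, str]:
    """Parse a webhook body and return (HTTP status, message).

    The field names "recording_id" and "text" are hypothetical; replace
    them with the fields in the real webhook payload.
    """
    try:
        event = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and invalid UTF-8
        return 400, "invalid JSON"
    if not isinstance(event, dict) or "text" not in event:
        return 422, "missing transcription text"
    # Application-specific work goes here, e.g. saving the note to a database.
    print(f"recording {event.get('recording_id')}: {event['text']}")
    return 200, "ok"


class WebhookHandler(BaseHTTPRequestHandler):
    """Accepts POST requests and hands the body to the parser above."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, message = handle_transcription_event(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(message.encode())


def serve(port: int = 8000) -> None:
    """Run the webhook endpoint until interrupted (the port is arbitrary)."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the endpoint locally. Keeping the parsing in `handle_transcription_event` separate from the HTTP handler makes it easy to unit test without opening a socket.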

How is this different from using OpenAI Whisper directly?

OpenAI Whisper handles transcription, but you still need to build everything around it: secure file upload, audio storage, webhook delivery with retries, multi-tenant authentication, and monitoring. VocaFuse provides that production infrastructure around Whisper.
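As a rough idea of what just one of those pieces involves, the sketch below shows webhook delivery with retries using exponential backoff. It is a generic illustration of the technique, not VocaFuse's implementation: the function names, attempt count, delays, and timeout are all assumptions.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Delay before each retry: base * 2**n seconds, capped at `cap`."""
    return [min(cap, base * 2 ** n) for n in range(attempts - 1)]


def deliver_with_retries(
    send: Callable[[], int],
    attempts: int = 5,
    base: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `send` until it returns a 2xx status or attempts run out."""
    delays = backoff_delays(attempts, base)
    for attempt in range(attempts):
        try:
            if 200 <= send() < 300:
                return True
        except OSError:
            pass  # network failure; urllib.error.URLError subclasses OSError
        if attempt < len(delays):
            sleep(delays[attempt])  # wait longer after each failure
    return False


def post_json(url: str, body: bytes) -> int:
    """POST a JSON body and return the HTTP status code."""
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses arrive as exceptions
```

A delivery would then look like `deliver_with_retries(lambda: post_json(url, payload))`. A production system would also need persistent queuing, idempotency, and monitoring on top of this loop.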

Do I need to handle audio file storage?

No. VocaFuse handles secure upload, storage, and cleanup automatically. You receive transcriptions via webhook and can optionally retrieve audio files through the API, so there are no S3 buckets or storage to manage.

Get Your API Keys

Ready to get started? Sign up for an account to get your API keys.