Voice Notes Quickstart Guide

Add voice recording and transcription to your web app in 15 minutes. Install the VocaFuse JavaScript SDK, record audio in the browser, and receive transcriptions via webhook. Works with React, Vue, vanilla JavaScript, Python, and Node.js.

Prerequisites

  • Node.js 16.x or higher
  • Python 3.7+
  • VocaFuse API keys (get keys)

Step 1: Install SDKs

JavaScript SDK

Install the frontend SDK:

npm install @vocafuse/frontend-sdk

Python SDK

Install the backend SDK:

pip install vocafuse

Step 2: Environment Setup

Create a .env file:

VOCAFUSE_API_KEY=sk_live_your_api_key_here
VOCAFUSE_API_SECRET=your_api_secret_here
VOCAFUSE_WEBHOOK_SECRET=your_webhook_secret_here

Warning: never commit your .env file to version control.
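
Before wiring up the endpoints, it can help to fail fast when one of these variables is missing. The sketch below is one way to do that using only the Python standard library. The helper name `load_vocafuse_config` is hypothetical and not part of the VocaFuse SDK, and the sketch assumes the variables have already been loaded into the process environment (for example by your process manager or a dotenv loader).

```python
import os

# Variable names taken from the .env file above.
REQUIRED_VARS = (
    "VOCAFUSE_API_KEY",
    "VOCAFUSE_API_SECRET",
    "VOCAFUSE_WEBHOOK_SECRET",
)


def load_vocafuse_config(env=None):
    """Return the required settings as a dict, or raise if any are missing.

    Hypothetical helper for illustration; not part of the VocaFuse SDK.
    """
    env = os.environ if env is None else env
    # Treat empty strings the same as unset variables.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_vocafuse_config()` once at startup surfaces a misconfigured deployment immediately, rather than as a `KeyError` on the first request.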

Step 3: Backend Token Endpoint

Create an endpoint that generates JWT tokens for authenticated users:

from flask import Flask, jsonify
from vocafuse import AccessToken
import os

app = Flask(__name__)

@app.route('/api/vocafuse-token', methods=['POST'])
def generate_token():
    user_id = get_current_user_id()  # Your auth logic

    token = AccessToken(
        api_key=os.environ['VOCAFUSE_API_KEY'],
        api_secret=os.environ['VOCAFUSE_API_SECRET'],
        identity=str(user_id)
    )

    return jsonify(token(expires_in=3600))
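
When debugging the token endpoint, it can be useful to inspect what a JWT contains. The sketch below decodes the payload segment of any standard JWT with the Python standard library. It does not verify the signature, so treat it as a debugging aid only. Whether VocaFuse tokens carry an `exp` claim, and how the endpoint's JSON response is shaped, are assumptions not confirmed by this guide; the function names are hypothetical.

```python
import base64
import json
import time


def decode_jwt_payload(token):
    """Decode the payload (middle) segment of a JWT WITHOUT verifying it."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("Expected a JWT with three dot-separated segments")
    segment = parts[1]
    # base64url segments omit '=' padding; restore it before decoding.
    segment += "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(segment))


def seconds_until_expiry(token, now=None):
    """Return seconds left before the `exp` claim, or None if it is absent.

    Assumes `exp` is a Unix timestamp, as in standard JWTs.
    """
    payload = decode_jwt_payload(token)
    if "exp" not in payload:
        return None
    now = time.time() if now is None else now
    return payload["exp"] - now
```

With `expires_in=3600` as in the endpoint above, a freshly issued token would be expected to report roughly one hour remaining, if it includes an `exp` claim.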

Step 4: Frontend Integration

Initialize the SDK and add voice note functionality:

import { VocaFuseSDK } from '@vocafuse/frontend-sdk';

const sdk = new VocaFuseSDK({ tokenEndpoint: '/api/vocafuse-token' });
await sdk.init();

// Create a recorder instance you control
const recorder = sdk.createRecorder({
  maxDuration: 60,
  onComplete: (result) => console.log('Uploaded:', result.voicenote_id),
  onError: (e) => console.error('Error:', e)
});

// Wire up any UI
const btn = document.querySelector('#record-btn');
btn.addEventListener('click', async () => {
  if (recorder.isRecording) {
    await recorder.stop(); // auto-uploads by default
    btn.textContent = 'Start Recording';
  } else {
    await recorder.start();
    btn.textContent = 'Stop & Upload';
  }
});

Step 5: Webhook Handler

Create an endpoint to receive transcription results:

from flask import request, jsonify
from vocafuse import RequestValidator
import os

validator = RequestValidator(os.environ['VOCAFUSE_WEBHOOK_SECRET'])

# Reuses the Flask `app` created in Step 3
@app.route('/api/webhooks/vocafuse', methods=['POST'])
def handle_webhook():
    # Validate the raw request body before parsing it as JSON
    payload = request.get_data(as_text=True)
    signature = request.headers.get('X-VocaFuse-Signature')

    if not validator.validate(payload, signature):
        return jsonify({'error': 'Invalid signature'}), 401

    data = request.get_json()

    if data['event'] == 'voicenote.completed':
        # `client` is assumed to be a VocaFuse API client created elsewhere in your app
        transcription = client.voicenotes(data['voicenote']['id']).transcription.get()
        print(f"Transcription: {transcription['data']['text']}")

    return jsonify({'status': 'received'}), 200
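
To make the signature check less of a black box, here is a rough sketch of how webhook signature validation commonly works. This guide does not say how `RequestValidator` computes signatures, so the scheme below (a hex-encoded HMAC-SHA256 of the raw request body, keyed with the webhook secret) is an assumption for illustration, and the function names are hypothetical. In a real handler, keep using `RequestValidator`.

```python
import hashlib
import hmac


def compute_signature(secret, payload):
    """Hex HMAC-SHA256 of the raw request body (assumed scheme, see lead-in)."""
    return hmac.new(
        secret.encode("utf-8"),
        payload.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


def is_valid_signature(secret, payload, signature):
    """Check a signature in constant time; a missing header is never valid."""
    if not signature:
        return False
    expected = compute_signature(secret, payload)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Schemes like this sign the exact body text that was sent, which is why the handler above reads the raw body with `request.get_data(as_text=True)` before calling `request.get_json()`: re-serialized JSON may differ from the original and would no longer match the signature.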

Frequently Asked Questions

How long does it take to add voice notes to my app?

About 15 minutes with VocaFuse. Building voice recording, transcription, and webhook delivery yourself typically takes 2-4 weeks, because you have to handle browser compatibility, secure file upload, transcription API integration, and webhook infrastructure.

Can I use VocaFuse with React or Vue?

Yes. VocaFuse works with React, Vue, Svelte, Angular, and vanilla JavaScript, and the SDK can be used with any frontend framework. See the React and Vue examples for implementation patterns.

What backends does VocaFuse support?

VocaFuse works with any backend that can generate tokens and receive webhooks, including Python (Flask/Django), Node.js (Express), Ruby on Rails, and PHP. The Python SDK simplifies server-side integration.

Is there a free plan to try VocaFuse?

Yes! The free tier includes 100 voice note transcriptions per month with no credit card required. Sign up to start building immediately.

Next Steps