Voice Notes Quickstart Guide
Add voice recording and transcription to your web app in about 15 minutes. Install the VocaFuse JavaScript SDK, record audio in the browser, and receive transcriptions via webhook. Works with React, Vue, or vanilla JavaScript on the frontend and Python or Node.js on the backend.
Prerequisites
- Node.js 16.x or higher
- Python 3.7+
- VocaFuse API keys (get keys)
Step 1: Install SDKs
JavaScript SDK
Install the frontend SDK:
npm:
npm install @vocafuse/frontend-sdk

yarn:
yarn add @vocafuse/frontend-sdk

pnpm:
pnpm add @vocafuse/frontend-sdk
Python SDK
Install the backend SDK:
pip install vocafuse
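Node.js SDK
The Node.js backend examples in Steps 3 and 5 use @vocafuse/backend-sdk; if your backend runs Node.js, install it as well:
npm install @vocafuse/backend-sdk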
Step 2: Environment Setup
Create a .env file:
VOCAFUSE_API_KEY=sk_live_your_api_key_here
VOCAFUSE_API_SECRET=your_api_secret_here
VOCAFUSE_WEBHOOK_SECRET=your_webhook_secret_here
Never commit your .env file to version control!
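How these variables get loaded depends on your stack. If your Node.js server reads them from a .env file, a loader such as dotenv (not installed by this guide, so add it yourself) is one common option; a minimal sketch:
// Load .env into process.env before any code reads the keys.
// Assumes you have run: npm install dotenv
require('dotenv').config();

// The credentials are now available to the backend code in Step 3.
console.log('API key loaded:', Boolean(process.env.VOCAFUSE_API_KEY));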
Step 3: Backend Token Endpoint
Create an endpoint that generates JWT tokens for authenticated users:
Python (Flask):
from flask import Flask, jsonify
from vocafuse import AccessToken
import os

app = Flask(__name__)

@app.route('/api/vocafuse-token', methods=['POST'])
def generate_token():
    user_id = get_current_user_id()  # Your auth logic

    # Short-lived token scoped to this user's identity
    token = AccessToken(
        api_key=os.environ['VOCAFUSE_API_KEY'],
        api_secret=os.environ['VOCAFUSE_API_SECRET'],
        identity=str(user_id)
    )

    return jsonify(token(expires_in=3600))
Node.js (Express):
const express = require('express');
const { AccessToken } = require('@vocafuse/backend-sdk');

const app = express();
app.use(express.json()); // parse JSON bodies (needed by the webhook handler in Step 5)

app.post('/api/vocafuse-token', async (req, res) => {
    const userId = getCurrentUserId(); // Your auth logic

    const token = new AccessToken(
        process.env.VOCAFUSE_API_KEY,
        process.env.VOCAFUSE_API_SECRET,
        userId
    );

    res.json(await token.generate({ expiresIn: 3600 }));
});
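Before wiring up the frontend, you can hit the token endpoint directly to confirm it responds. The sketch below is a quick smoke test: it assumes the Express server above is listening on http://localhost:3000, that your auth logic lets the request through, and Node 18+ for the built-in fetch. The exact shape of the returned payload depends on the VocaFuse SDK.
// test-token.js — smoke test for the token endpoint (assumed port 3000).
async function main() {
  const res = await fetch('http://localhost:3000/api/vocafuse-token', {
    method: 'POST',
  });
  console.log('status:', res.status);      // expect 200 for an authenticated request
  console.log('body:', await res.json());  // token payload consumed by the frontend SDK
}

main().catch(console.error);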
Step 4: Frontend Integration
Initialize the SDK and add voice note functionality:
Vanilla JS:
import { VocaFuseSDK } from '@vocafuse/frontend-sdk';
const sdk = new VocaFuseSDK({ tokenEndpoint: '/api/vocafuse-token' });
await sdk.init();
// Create a recorder instance you control
const recorder = sdk.createRecorder({
  maxDuration: 60,
  onComplete: (result) => console.log('Uploaded:', result.voicenote_id),
  onError: (e) => console.error('Error:', e)
});
// Wire up any UI
const btn = document.querySelector('#record-btn');
btn.addEventListener('click', async () => {
  if (recorder.isRecording) {
    await recorder.stop(); // auto-uploads by default
    btn.textContent = 'Start Recording';
  } else {
    await recorder.start();
    btn.textContent = 'Stop & Upload';
  }
});
React:
import { useEffect, useState } from 'react';
import { VocaFuseSDK } from '@vocafuse/frontend-sdk';
function VoiceRecorder() {
  const [recorder, setRecorder] = useState(null);
  const [status, setStatus] = useState('idle');
  useEffect(() => {
    (async () => {
      const sdk = new VocaFuseSDK({ tokenEndpoint: '/api/vocafuse-token' });
      await sdk.init();
      setRecorder(sdk.createRecorder({ onStateChange: setStatus }));
    })();
  }, []);
  if (!recorder) return <button disabled>Loading…</button>;
  const toggle = async () => {
    if (recorder.isRecording) {
      await recorder.stop();
    } else {
      await recorder.start();
    }
  };
  return (
    <button onClick={toggle}>
      {recorder.isRecording ? 'Stop & Upload' : 'Start Recording'} ({status})
    </button>
  );
}
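Browsers only expose the microphone over HTTPS (or localhost) and after the user grants permission. The recorder presumably requests access when it starts, but a small pre-flight check with the standard getUserMedia browser API, independent of VocaFuse, lets you surface permission problems before the user taps record:
// Optional pre-flight check using the browser's MediaDevices API.
// Not part of the VocaFuse SDK; it only confirms microphone access is possible.
async function ensureMicrophoneAccess() {
  try {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    stream.getTracks().forEach((track) => track.stop()); // release the mic immediately
    return true;
  } catch (err) {
    console.warn('Microphone unavailable or permission denied:', err);
    return false;
  }
}
Call it before recorder.start() and disable the record button if it returns false.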
Step 5: Webhook Handler
Create an endpoint to receive transcription results:
Python (Flask):
# Extends the Flask app from Step 3
from flask import request, jsonify
from vocafuse import RequestValidator

validator = RequestValidator(os.environ['VOCAFUSE_WEBHOOK_SECRET'])

@app.route('/api/webhooks/vocafuse', methods=['POST'])
def handle_webhook():
    payload = request.get_data(as_text=True)
    signature = request.headers.get('X-VocaFuse-Signature')

    if not validator.validate(payload, signature):
        return jsonify({'error': 'Invalid signature'}), 401

    data = request.get_json()

    if data['event'] == 'voicenote.completed':
        # `client` is your initialized VocaFuse API client (see the Python SDK docs)
        transcription = client.voicenotes(data['voicenote']['id']).transcription.get()
        print(f"Transcription: {transcription['data']['text']}")

    return jsonify({'status': 'received'}), 200
Node.js (Express):
const { RequestValidator } = require('@vocafuse/backend-sdk');

const validator = new RequestValidator(process.env.VOCAFUSE_WEBHOOK_SECRET);

app.post('/api/webhooks/vocafuse', (req, res) => {
    const payload = JSON.stringify(req.body);
    const signature = req.headers['x-vocafuse-signature'];

    if (!validator.validate(payload, signature)) {
        return res.status(401).json({ error: 'Invalid signature' });
    }

    if (req.body.event === 'voicenote.completed') {
        console.log('Transcription:', req.body.voicenote.transcription);
    }

    res.json({ status: 'received' });
});
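If your backend has no VocaFuse SDK (Rails, PHP, or a minimal Node service), you can verify the signature by hand. The sketch below assumes the X-VocaFuse-Signature header carries a hex-encoded HMAC-SHA256 of the raw request body, signed with your webhook secret; confirm the exact scheme in the webhook docs before relying on it.
const crypto = require('crypto');
const express = require('express');

const app = express();

// Use the raw body so the HMAC is computed over exactly the bytes VocaFuse sent.
app.post(
  '/api/webhooks/vocafuse',
  express.raw({ type: 'application/json' }),
  (req, res) => {
    // Assumed scheme: hex HMAC-SHA256 of the raw body with the webhook secret.
    const expected = crypto
      .createHmac('sha256', process.env.VOCAFUSE_WEBHOOK_SECRET)
      .update(req.body) // req.body is a Buffer here
      .digest('hex');

    const signature = req.headers['x-vocafuse-signature'] || '';
    const valid =
      signature.length === expected.length &&
      crypto.timingSafeEqual(Buffer.from(signature), Buffer.from(expected));

    if (!valid) {
      return res.status(401).json({ error: 'Invalid signature' });
    }

    const data = JSON.parse(req.body.toString('utf8'));
    // Handle data.event (e.g. 'voicenote.completed') as in the SDK examples above.
    res.json({ status: 'received' });
  }
);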
Frequently Asked Questions
How long does it take to add voice notes to my app?
About 15 minutes with VocaFuse. Building voice recording, transcription, and webhook delivery yourself typically takes 2-4 weeks—handling browser compatibility, secure file upload, transcription API integration, and webhook infrastructure.
Can I use VocaFuse with React or Vue?
Yes! VocaFuse works with React, Vue, Svelte, Angular, and vanilla JavaScript. The SDK integrates seamlessly with any frontend framework. Check out React and Vue examples for implementation patterns.
What backends does VocaFuse support?
VocaFuse works with any backend that can generate tokens and receive webhooks—Python (Flask/Django), Node.js (Express), Ruby on Rails, PHP, and more. The Python SDK simplifies server-side integration.
Is there a free plan to try VocaFuse?
Yes! The free tier includes 100 voice note transcriptions per month with no credit card required. Sign up to start building immediately.
Next Steps
- API Reference - Explore all available endpoints
- Python SDK - Learn advanced backend features
- JavaScript SDK - Customize the frontend experience
- Webhooks - Production webhook patterns