Convert text to speech using advanced TTS providers and stream the audio response. Code blocks and inline code are automatically processed for better speech generation.
This feature requires special activation. Please contact our support team to enable TTS for your account.
curl --request POST \
--url https://api.gurubase.io/api/v1/{guru_slug}/text-to-speech-v2/stream/ \
--header 'x-api-key: YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data '{
"text": "Hello, this is a test message using Gurubase text to speech"
}'
Path Parameters
The guru slug that identifies your guru. This value replaces `{guru_slug}` in the request URL path.
Your API key for authentication. You can obtain your API key from the Gurubase.io dashboard.
Body Parameters
The text to convert to speech
Response
The response is a streaming audio file in MP3 format.
Streaming audio response in MP3 format
{
"msg": "No text provided"
}
Streaming Response
The endpoint returns a streaming audio response with the following headers:
Content-Type: audio/mpeg
Cache-Control: no-cache
X-Accel-Buffering: no
(disables nginx buffering for real-time streaming)
Code Examples
The following examples show how to implement streaming TTS in your web application. These are complete, working examples that you can use immediately.
Streaming the TTS
This example demonstrates how to send a request and start streaming without waiting for the whole response to finish. It is a full HTML + JS example that you can run immediately.
Setup Instructions:
- Save the HTML code below as `index.html`
- Save the JavaScript code as `tts.js` in the same directory
- Update the constants in `tts.js` with your API key and guru slug
- Open `index.html` in your browser to test
Make sure both files are in the same directory so the script can load properly. The HTML file references `./tts.js` — update this path if you place the JavaScript file elsewhere.
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<title>Gurubase TTS Demo</title>
<!-- All styling is inline so the demo works as a single self-contained file
     (plus ./tts.js). No external stylesheets or frameworks are required. -->
<style>
* {
box-sizing: border-box;
}
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, sans-serif;
max-width: 600px;
margin: 0 auto;
padding: 40px 20px;
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
min-height: 100vh;
color: #333;
}
.container {
background: white;
border-radius: 16px;
padding: 32px;
box-shadow: 0 20px 40px rgba(0,0,0,0.1);
}
h1 {
text-align: center;
margin: 0 0 32px 0;
color: #2d3748;
font-weight: 600;
font-size: 28px;
}
.input-group {
margin-bottom: 24px;
}
label {
display: block;
margin-bottom: 8px;
font-weight: 500;
color: #4a5568;
}
textarea {
width: 100%;
padding: 12px 16px;
border: 2px solid #e2e8f0;
border-radius: 12px;
font-size: 16px;
font-family: inherit;
resize: vertical;
min-height: 120px;
transition: all 0.2s ease;
}
textarea:focus {
outline: none;
border-color: #667eea;
box-shadow: 0 0 0 3px rgba(102, 126, 234, 0.1);
}
.controls {
display: flex;
gap: 12px;
align-items: center;
margin-bottom: 24px;
}
.btn {
padding: 12px 24px;
border: none;
border-radius: 12px;
font-size: 16px;
font-weight: 500;
cursor: pointer;
transition: all 0.2s ease;
display: inline-flex;
align-items: center;
gap: 8px;
}
.btn-primary {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
}
.btn-primary:hover:not(:disabled) {
transform: translateY(-2px);
box-shadow: 0 10px 20px rgba(102, 126, 234, 0.3);
}
.btn-secondary {
background: #f7fafc;
color: #4a5568;
border: 2px solid #e2e8f0;
}
.btn-secondary:hover:not(:disabled) {
background: #edf2f7;
border-color: #cbd5e0;
}
.btn:disabled {
opacity: 0.6;
cursor: not-allowed;
}
.status {
display: inline-flex;
align-items: center;
gap: 8px;
font-size: 14px;
color: #718096;
font-weight: 500;
}
.status.loading {
color: #667eea;
}
.status.error {
color: #e53e3e;
}
/* Spinner shown by tts.js while a request is streaming (status "loading"). */
.spinner {
width: 16px;
height: 16px;
border: 2px solid #e2e8f0;
border-top: 2px solid #667eea;
border-radius: 50%;
animation: spin 1s linear infinite;
}
@keyframes spin {
0% { transform: rotate(0deg); }
100% { transform: rotate(360deg); }
}
.audio-container {
background: #f7fafc;
border-radius: 12px;
padding: 16px;
margin-top: 16px;
}
audio {
width: 100%;
}
.error-message {
background: #fed7d7;
color: #c53030;
padding: 12px 16px;
border-radius: 8px;
margin-bottom: 16px;
font-size: 14px;
display: none;
}
</style>
</head>
<body>
<div class="container">
<h1>Text to Speech Demo - Gurubase</h1>
<!-- Hidden by default; tts.js fills and shows it via showError(). -->
<div id="errorMessage" class="error-message"></div>
<div class="input-group">
<label for="text">Enter text to convert to speech:</label>
<textarea id="text" placeholder="Type your text here...">Hello, this is an example for streaming TTS using Gurubase API.</textarea>
</div>
<!-- Generate/Stop buttons and the inline status indicator, driven by tts.js. -->
<div class="controls">
<button id="generateBtn" class="btn btn-primary">
🎵 Generate Speech
</button>
<button id="stopBtn" class="btn btn-secondary" style="display: none;">
⏹️ Stop
</button>
<div id="status" class="status"></div>
</div>
<!-- The <audio> element is fed via Media Source Extensions; revealed once streaming starts. -->
<div class="audio-container" id="audioContainer" style="display: none;">
<audio id="tts" controls></audio>
</div>
</div>
<!-- Must live next to this file; update the path if you move the script. -->
<script src="./tts.js"></script>
</body>
</html>
// ====== Constants - UPDATE THESE VALUES ======
const API_KEY = 'your-api-key-here'; // Get from https://gurubase.io/api-keys
const BASE_URL = 'https://api.gurubase.io/api/v1';
const GURU_SLUG = 'your-guru-slug'; // Replace with your actual guru slug
const API_URL = `${BASE_URL}/${GURU_SLUG}/text-to-speech-v2/stream/`;
// ====== DOM Elements ======
// These IDs must match the markup in index.html.
const audioEl = document.getElementById("tts");
const generateBtn = document.getElementById("generateBtn");
const stopBtn = document.getElementById("stopBtn");
const textEl = document.getElementById("text");
const statusEl = document.getElementById("status");
const errorMessageEl = document.getElementById("errorMessage");
const audioContainer = document.getElementById("audioContainer");
// ====== State ======
let abortCtrl = null; // AbortController for the in-flight fetch request
let mediaSource = null; // MediaSource backing the <audio> element
let sourceBuffer = null; // "audio/mpeg" SourceBuffer receiving MP3 chunks
let reader = null; // ReadableStream reader over the response body
let queue = []; // Chunks waiting while the SourceBuffer is busy
let isGenerating = false; // True while a generation run is active
// ====== UI Helper Functions ======
// Surface an error to the user: flag the status line and reveal the banner
// with the supplied message.
function showError(message) {
setStatus('error', 'Error occurred');
errorMessageEl.textContent = message;
errorMessageEl.style.display = 'block';
}
// Hide the error banner; its text is left in place but invisible.
function hideError() {
errorMessageEl.style.display = 'none';
}
// Update the inline status indicator. `type` selects the CSS modifier
// ('loading' / 'error' / '' for neutral); 'loading' also shows a spinner.
function setStatus(type = '', message = '') {
statusEl.className = `status ${type}`;
if (type === 'loading') {
// Spinner markup is trusted static HTML; only `message` varies.
statusEl.innerHTML = `<div class="spinner"></div> ${message}`;
return;
}
statusEl.textContent = message;
}
// Reflect whether a generation run is in progress: disable/relabel the
// generate button and toggle the stop button's visibility.
function updateButtons(generating) {
isGenerating = generating;
generateBtn.disabled = generating;
if (generating) {
generateBtn.innerHTML = '⏳ Generating...';
stopBtn.style.display = 'inline-flex';
} else {
generateBtn.innerHTML = '🎵 Generate Speech';
stopBtn.style.display = 'none';
}
}
// Tear down any in-flight generation: stop playback, cancel the network
// request, close the MediaSource, clear streaming state, and reset the UI.
function resetAudio() {
try {
audioEl.pause();
} catch {}
try {
abortCtrl?.abort();
} catch {}
if (mediaSource?.readyState === "open") {
try {
mediaSource.endOfStream();
} catch {}
}
audioEl.removeAttribute("src");
audioEl.load();
// Drop all streaming state so the next run starts clean.
abortCtrl = null;
mediaSource = null;
sourceBuffer = null;
reader = null;
queue = [];
// Return the UI to its idle appearance.
updateButtons(false);
setStatus('', '');
audioContainer.style.display = 'none';
}
// ====== Media Source & Streaming Functions ======
// Create a fresh MediaSource, attach it to the <audio> element, and resolve
// once its "audio/mpeg" SourceBuffer is ready to accept chunks. Rejects if
// the browser cannot create the SourceBuffer (e.g. codec unsupported).
function ensureMediaSource() {
return new Promise((resolve, reject) => {
mediaSource = new MediaSource();
audioEl.src = URL.createObjectURL(mediaSource);
const onSourceOpen = () => {
try {
sourceBuffer = mediaSource.addSourceBuffer("audio/mpeg");
// "sequence" mode appends chunks back-to-back regardless of timestamps.
sourceBuffer.mode = "sequence";
sourceBuffer.addEventListener("updateend", onUpdateEnd);
resolve();
} catch (e) {
reject(e);
}
};
mediaSource.addEventListener("sourceopen", onSourceOpen, { once: true });
});
}
// SourceBuffer "updateend" handler: drain the next queued chunk if one is
// waiting; otherwise, once the network pump has finished, close the stream.
function onUpdateEnd() {
if (queue.length && !sourceBuffer.updating) {
sourceBuffer.appendBuffer(queue.shift());
return;
}
const streamFinished =
!isGenerating &&
!sourceBuffer.updating &&
mediaSource?.readyState === "open";
if (!streamFinished) return;
try {
mediaSource.endOfStream();
setStatus('', 'Ready to play');
} catch {}
}
// POST `text` to the streaming TTS endpoint and begin reading the MP3 body.
// Throws on HTTP or network failures; user-initiated aborts are swallowed.
async function startNetworkStream(text) {
abortCtrl = new AbortController();
try {
const resp = await fetch(API_URL, {
method: "POST",
headers: {
"Content-Type": "application/json",
"X-API-Key": API_KEY
},
// The API rejects empty text, so fall back to a single space.
body: JSON.stringify({ text: text || " " }),
signal: abortCtrl.signal
});
if (!resp.ok) {
const errorText = await resp.text().catch(() => 'Unknown error');
throw new Error(`API Error (${resp.status}): ${errorText}`);
}
if (!resp.body) {
throw new Error('No response body received from API');
}
reader = resp.body.getReader();
setStatus('loading', 'Streaming audio...');
// Pump chunks in the background so the UI stays responsive.
pump().catch((err) => {
if (err.name !== "AbortError") {
showError(`Streaming error: ${err.message}`);
}
});
} catch (error) {
// Aborts are expected when the user presses Stop; ignore them.
if (error.name !== "AbortError") {
throw error;
}
}
}
/**
 * Read MP3 chunks from the response body and feed them to the SourceBuffer.
 * Chunks are queued whenever the buffer is busy (or earlier chunks are still
 * queued, to preserve order); onUpdateEnd drains the queue.
 *
 * Fix: the previous version called mediaSource.endOfStream() in `finally`
 * even when `queue` still held unappended chunks, which truncated the audio
 * and made the later appendBuffer in onUpdateEnd throw an InvalidStateError.
 * It also never restarted draining if the buffer was idle with a non-empty
 * queue (no future "updateend" would fire), stalling playback. Now the stream
 * is only closed when the queue is empty; otherwise one append is kicked off
 * so onUpdateEnd can drain the remainder and close the stream itself
 * (isGenerating is already false at that point).
 */
async function pump() {
try {
while (true) {
const { done, value } = await reader.read();
if (done) break;
if (value && value.byteLength) {
// Preserve chunk order: queue if the buffer is busy or a backlog exists.
if (!sourceBuffer || sourceBuffer.updating || queue.length) {
queue.push(value.buffer);
} else {
try {
sourceBuffer.appendBuffer(value.buffer);
} catch {
// QuotaExceededError etc. — retry via the queue on the next updateend.
queue.push(value.buffer);
}
}
}
}
} catch (e) {
// Abort is a normal outcome of the Stop button; anything else propagates.
if (e.name !== "AbortError") {
throw e;
}
} finally {
updateButtons(false);
if (sourceBuffer && !sourceBuffer.updating) {
if (queue.length) {
// Buffer idle but chunks remain: restart the drain; onUpdateEnd
// will append the rest and end the stream once the queue is empty.
try {
sourceBuffer.appendBuffer(queue.shift());
} catch {}
} else if (mediaSource?.readyState === "open") {
try {
mediaSource.endOfStream();
setStatus('', 'Audio ready');
} catch {}
}
}
}
}
// ====== Main Functions ======
/**
 * Entry point for the Generate button: validate input, reset any previous
 * run, open a MediaSource, start streaming, and begin playback.
 *
 * Fix: the previous version called updateButtons(true) and
 * setStatus('loading', 'Preparing audio...') BEFORE resetAudio(), which
 * immediately undid both (resetAudio calls updateButtons(false) and clears
 * the status), so the "Preparing audio..." indicator never appeared and the
 * buttons flickered. The UI is now put into the generating state once, after
 * the reset.
 */
async function generateSpeech() {
if (isGenerating) return;
const text = textEl.value.trim();
if (!text) {
showError('Please enter some text to convert to speech');
return;
}
hideError();
try {
// Tear down any previous run first — this clears buttons and status.
resetAudio();
updateButtons(true);
setStatus('loading', 'Preparing audio...');
// Create MediaSource & SourceBuffer.
await ensureMediaSource();
// Start the network stream.
await startNetworkStream(text);
// Show the audio player and attempt autoplay; browsers may block it,
// in which case the user is prompted to press play manually.
audioContainer.style.display = 'block';
await audioEl.play().catch(() => {
setStatus('', 'Click play to listen');
});
} catch (error) {
console.error('Generation error:', error);
showError(error.message);
updateButtons(false);
}
}
// Stop button handler: abort the in-flight request (pump sees AbortError
// and winds down) and return the UI to idle.
function stopGeneration() {
if (!isGenerating) return;
try {
abortCtrl?.abort();
} catch {}
updateButtons(false);
setStatus('', 'Stopped');
}
// ====== Event Listeners ======
// Wire the buttons to their handlers once the script loads.
generateBtn.addEventListener("click", generateSpeech);
stopBtn.addEventListener("click", stopGeneration);
// Auto-focus text area on load so the user can type immediately
textEl.focus();
Testing Tips:
- Ensure your browser supports the MediaSource Extensions API (most modern browsers do)
- If you encounter CORS issues, make sure you’re testing from a proper web server, not just opening the HTML file directly
- The audio will start playing as soon as data begins streaming from the server
- Check the browser console for any error messages if the audio doesn’t play