LingoX API Documentation

This document describes how to interact with the LingoX real-time translation API through its REST and WebSocket endpoints.

REST API Endpoints

GET /api/v1/languages

Fetches the list of supported languages as [language code, language name] pairs, along with the default source and target languages for UI configuration.

Success Response (200)

{
  "supported_languages": [
    ["en-US", "English"],
    ["vi", "Vietnamese"],
    ...
  ],
  "default_source": "en-US",
  "default_target": "vi"
}

Error Response (503)

{
  "error": "Service temporarily unavailable."
}
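
As a rough illustration, a client could turn the 200 payload above into a code-to-name lookup. The function name `parse_languages` and the sample body (limited to the two entries listed above) are assumptions made for this sketch, not part of the API.

```python
import json
from typing import Dict, Tuple


def parse_languages(body: str) -> Tuple[Dict[str, str], str, str]:
    """Return (code -> name lookup, default source, default target)."""
    payload = json.loads(body)
    # Each entry in "supported_languages" is a [code, name] pair.
    names = {code: name for code, name in payload["supported_languages"]}
    return names, payload["default_source"], payload["default_target"]


# Sample body mirroring the documented response (only the two listed entries).
sample = json.dumps({
    "supported_languages": [["en-US", "English"], ["vi", "Vietnamese"]],
    "default_source": "en-US",
    "default_target": "vi",
})
names, source, target = parse_languages(sample)
print(names[source], "->", names[target])
```

A lookup keyed by language code makes it straightforward to label the default source and target in a language picker.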

Languages Table

Language Code Language Name

POST /api/v1/session/create

Creates a session (a room) for a Multi-Device Mode conversation. Sessions expire after 15 minutes (900 seconds) of inactivity.

Request Body

{
  "language": "en-US",
  "translate_language": "vi"
}

Success Response (200)

{
  "conversation_id": "a-unique-uuid-string",
  "join_url": "http://your-host/join/a-unique-uuid-string"
}

Error Response (503)

{
  "error": "Redis service is unavailable."
}
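
A minimal sketch of preparing this call with Python's standard library follows. The helper names, the `http://your-host` base URL (borrowed from the placeholder in `join_url`), and the `Content-Type` header are assumptions; the request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Tuple


def build_create_request(base_url: str, language: str,
                         translate_language: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a new session."""
    body = json.dumps({
        "language": language,
        "translate_language": translate_language,
    }).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/session/create",
        data=body,
        # Assumption: the server expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_create_response(body: str) -> Tuple[str, str]:
    """Extract (conversation_id, join_url) from a 200 response body."""
    payload = json.loads(body)
    return payload["conversation_id"], payload["join_url"]


req = build_create_request("http://your-host", "en-US", "vi")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would send it. A 503 surfaces there as `urllib.error.HTTPError`, whose body carries the `error` field shown above.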

GET /api/v1/session/{conversation_id}

Retrieves details for an existing multi-device session, primarily to check the initiator's language settings.

Success Response (200)

{
  "user_a_lang": "en-US",
  "user_a_translate": "vi"
}

Error Response (404)

{
  "error": "Session not found."
}

Error Response (503)

{
  "error": "Redis service is unavailable."
}
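
One way a joining client might map the three documented outcomes onto control flow is sketched below. The `read_session` function and the `SessionUnavailable` exception are hypothetical names; only the status codes and field names come from the responses above.

```python
import json
from typing import Dict, Optional


class SessionUnavailable(Exception):
    """Raised for a 503 response; the caller may choose to retry later."""


def read_session(status: int, body: str) -> Optional[Dict[str, str]]:
    """Return the initiator's language settings, or None if the session is not found."""
    payload = json.loads(body)
    if status == 200:
        return {
            "language": payload["user_a_lang"],
            "translate_language": payload["user_a_translate"],
        }
    if status == 404:
        return None  # "Session not found."
    if status == 503:
        raise SessionUnavailable(payload["error"])
    raise ValueError(f"unexpected status {status}")


settings = read_session(200, '{"user_a_lang": "en-US", "user_a_translate": "vi"}')
print(settings)
```

Returning `None` for 404 and raising for 503 keeps "the session does not exist" separate from "the backend is temporarily down", which usually call for different UI handling.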

WebSocket Endpoints & Data Formats

Flow 1: Personal & Single-Device Modes

These modes use a single, stateful WebSocket connection. The client sends a configuration message to set the languages and can send a new one at any time to change them.

Endpoint

wss://your-domain.com/api/v1/ws/single

Sequence of Events

  1. Client establishes a WebSocket connection.
  2. Client sends a JSON `config` message to set the source and target languages.
  3. Client begins streaming binary audio data.
  4. Server streams back JSON messages containing transcription and translation results.
  5. To switch languages, the client sends a new `config` message and then resumes streaming audio.
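
The ordering in the steps above can be sketched as a generator of outgoing frames. This is an illustration only: `single_mode_frames` is a hypothetical helper, the `config` payload is left to the caller because its fields are not specified in this section, and sending `config` as a text frame is an assumption.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Tuple, Union

# str -> text frame carrying JSON, bytes -> binary frame carrying audio.
Frame = Union[str, bytes]


def single_mode_frames(
    segments: Iterable[Tuple[Dict[str, Any], Iterable[bytes]]]
) -> Iterator[Frame]:
    """Yield frames in protocol order: a config message, then its audio chunks."""
    for config, audio_chunks in segments:
        # A config message always precedes the audio it applies to.
        yield json.dumps(config)
        for chunk in audio_chunks:
            yield chunk


# Placeholder payloads: the real config fields are defined by the server.
first = {"placeholder": "first config"}
second = {"placeholder": "second config"}
frames = list(single_mode_frames([(first, [b"a1", b"a2"]), (second, [b"b1"])]))
print([type(f).__name__ for f in frames])
```

Each frame would then be passed to the WebSocket library's send call. Switching languages amounts to starting a new segment.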

Client-to-Server Messages

Server-to-Client Messages (JSON)

{
  "is_final": false,
  "original": "The transcribed text from the source language.",
  "translation": "The translated text in the target language."
}
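
A client will typically want to treat interim and final results differently. The sketch below assumes that each non-final message supersedes the previous interim text; the `Transcript` class is a hypothetical helper, and that display policy is a client-side choice rather than something the API prescribes.

```python
import json
from typing import List, Optional, Tuple

Pair = Tuple[str, str]  # (original, translation)


class Transcript:
    """Keeps finalized lines plus the most recent interim line."""

    def __init__(self) -> None:
        self.final_lines: List[Pair] = []
        self.pending: Optional[Pair] = None

    def handle(self, raw: str) -> None:
        msg = json.loads(raw)
        pair = (msg["original"], msg["translation"])
        if msg["is_final"]:
            # Final result: commit it and clear the interim slot.
            self.final_lines.append(pair)
            self.pending = None
        else:
            # Interim result: overwrite whatever was shown before.
            self.pending = pair


transcript = Transcript()
transcript.handle('{"is_final": false, "original": "Hel", "translation": "Xin"}')
transcript.handle('{"is_final": true, "original": "Hello", "translation": "Xin chào"}')
print(transcript.final_lines, transcript.pending)
```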

Flow 2: Multi-Device Mode

This mode uses a REST endpoint to create a session. Each participant then opens their own WebSocket connection to that session's endpoint, passing their language as a query parameter.

Endpoint

wss://your-domain.com/api/v1/ws/{conversation_id}?language={user_language_code}

Example: wss://localhost:5566/api/v1/ws/xyz-123?language=en-US
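
The URL can be assembled from the template above as in the following sketch. `build_ws_url` is a hypothetical helper; percent-encoding the path segment and query value is a defensive choice, not a documented requirement.

```python
from urllib.parse import quote, urlencode


def build_ws_url(host: str, conversation_id: str, language: str) -> str:
    """Fill in the multi-device WebSocket URL template."""
    path = "/api/v1/ws/" + quote(conversation_id, safe="")
    query = urlencode({"language": language})
    return f"wss://{host}{path}?{query}"


url = build_ws_url("localhost:5566", "xyz-123", "en-US")
print(url)  # wss://localhost:5566/api/v1/ws/xyz-123?language=en-US
```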

Connection Errors

Query Parameters

Sequence of Events

  1. Client A (Initiator) calls `POST /api/v1/session/create` to get a `conversation_id`.
  2. Client A connects to the WebSocket endpoint using the `conversation_id` and their chosen language as a query parameter.
  3. Client B (Joiner) gets the `conversation_id` (e.g., via QR code or link) and connects to the same WebSocket endpoint with their chosen language.
  4. The server notifies clients when users join or leave via system messages.
  5. When a client sends audio, the server sends interim transcripts back to the speaker and the final, translated transcript to both participants.

Client-to-Server Messages

Server-to-Client Messages (JSON)

System Events:

// Notifies clients of connection status changes.
// "type" is either "user_joined" or "user_left".
{
  "type": "user_joined",
  "client_count": 2
}

// Informs a client of their partner's language upon connection
{
  "type": "partner_language",
  "lang": "vi"
}

Transcription/Translation:

{
  "is_mine": true,
  "is_interim": false,
  "original": "The transcribed text.",
  "translation": "The translated text (or original if no translation needed)."
}
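
Because system events and transcription results arrive on the same socket, a client needs to tell them apart. The sketch below assumes, based on the examples above, that only system events carry a `type` field; `describe` is a hypothetical function that returns a display string for each message.

```python
import json


def describe(raw: str) -> str:
    """Classify one server message and return a short description."""
    msg = json.loads(raw)
    kind = msg.get("type")
    if kind in ("user_joined", "user_left"):
        return f"{kind}: {msg['client_count']} connected"
    if kind == "partner_language":
        return f"partner speaks {msg['lang']}"
    # Assumption: a message without "type" is a transcription/translation result.
    who = "me" if msg["is_mine"] else "partner"
    state = "interim" if msg["is_interim"] else "final"
    return f"[{who}/{state}] {msg['original']} => {msg['translation']}"


print(describe('{"type": "user_joined", "client_count": 2}'))
print(describe('{"type": "partner_language", "lang": "vi"}'))
print(describe('{"is_mine": true, "is_interim": false, '
               '"original": "Hello", "translation": "Xin chào"}'))
```

Note that this flow marks unfinished results with `is_interim`, whereas Flow 1 marks finished results with `is_final`, so handlers for the two flows cannot be shared without adjustment.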