ITH-Backend

Info

Currently this page is only available in English.

What does the ITH-Backend do?

Provides the API for ITH sessions
Handles transcription and speech generation
Computes mouth movements and facial expressions
Serves all necessary files for the client

Hosting ITH-Backend with Docker

1. Install Docker

If Docker is not already installed, install it from docker.com.

Linux: Install Docker Engine.
Windows, Mac: Install Docker Desktop.

2. Create docker-compose.yaml

services:
  caddy:
    image: caddy:latest
    container_name: caddy
    restart: unless-stopped
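    # With network_mode: host, Caddy binds ports 80 and 443 directly on the host,
    # so Docker ignores the ports mapping below (older Compose versions reject it).
    network_mode: host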
    ports:
      - "443:443"
      - "80:80"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
  ith-backend:
    build:
      context: .
    ports:
      - "8000:8000"
    volumes:
      - ./assets:/app/assets
      - ./configs/.docker.env:/app/configs/.env
      - ./static:/app/static
      - ./templates:/app/templates
    env_file:
      - ./configs/.secret_keys.env
  redis:
    image: redis:latest
    container_name: redis
    ports:
      - "6379:6379"
    command: [ "redis-server", "--requirepass", "123456" ] # example password: change it and keep REDIS_PASSWORD in sync
  influxdb:
    image: influxdb:2.7
    container_name: influxdb
    ports:
      - "8086:8086"
    volumes:
      - influxdb-data:/var/lib/influxdb2
      - influxdb-config:/etc/influxdb2
    environment:
      - DOCKER_INFLUXDB_INIT_MODE=setup
      - DOCKER_INFLUXDB_INIT_USERNAME=admin
      - DOCKER_INFLUXDB_INIT_PASSWORD=adminpassword
      - DOCKER_INFLUXDB_INIT_ORG=avaluma
      - DOCKER_INFLUXDB_INIT_BUCKET=ith-v1
      - DOCKER_INFLUXDB_INIT_ADMIN_TOKEN=secret-influx-token
      - DOCKER_INFLUXDB_INIT_RETENTION=14d
  whisper:
    image: onerahmet/openai-whisper-asr-webservice
    container_name: whisper
    environment:
      ASR_ENGINE: openai_whisper # [openai_whisper, faster_whisper, whisperx]
      ASR_MODEL: base # (tiny, base, small, medium, large-v3, etc.)
      ASR_DEVICE: cpu
    ports:
      - "9000:9000"
    volumes:
      - $PWD/.cache/whisper:/root/.cache/ # reduce container startup time

volumes:
  influxdb-data:
  influxdb-config:

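To use OpenAI-hosted Whisper instead of the self-hosted container, remove the whisper service from docker-compose.yaml. An OPENAI_API_KEY is required in this case.

Then set the following environment variables for ith-backend: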

  WHISPER_ENDPOINT="https://api.openai.com/v1/audio/transcriptions" # or any OpenAI-compatible endpoint
  WHISPER_MODEL="whisper-1"
  WHISPER_REQUEST_BODY_AUDIO_FIELD_NAME="file"
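
To illustrate how these three variables fit together, the following Python sketch builds (but does not send) a multipart upload request from them, using only the standard library. It is not taken from the ITH-Backend source: the function name build_transcription_request, the placeholder file name speech.wav, and the Bearer authorization header (the usual OpenAI API convention) are assumptions made for illustration.

```python
import os
import uuid
import urllib.request


def build_transcription_request(audio: bytes, filename: str = "speech.wav",
                                env=os.environ) -> urllib.request.Request:
    """Build a multipart/form-data transcription request without sending it."""
    endpoint = env.get("WHISPER_ENDPOINT",
                       "https://api.openai.com/v1/audio/transcriptions")
    model = env.get("WHISPER_MODEL", "whisper-1")
    # "file" for OpenAI-hosted Whisper, "audio_file" for the self-hosted service.
    field = env.get("WHISPER_REQUEST_BODY_AUDIO_FIELD_NAME", "file")
    boundary = uuid.uuid4().hex

    # First part carries the model name, second part carries the audio bytes.
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    body = head + audio + tail

    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    api_key = env.get("OPENAI_API_KEY")
    if api_key:  # assumption: the key is sent as a Bearer token
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(endpoint, data=body, headers=headers,
                                  method="POST")
```

Passing the returned object to urllib.request.urlopen would send the request; swapping the field name is the only change needed between the two Whisper variants in this sketch.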

Edit Environment Variables

ITH-Backend

Variable	Description	Default Value	Possible Values
SERVE_DEMO_PAGE	If TRUE, you can test the speaking and transcribing functionality at localhost:8000/demo. Should be FALSE in production.	FALSE	TRUE, FALSE
TTS_SYSTEM	Default TTS provider. More providers coming soon.	elevenlabs	elevenlabs, cartesia
REDIS_HOST	Hostname or address of the Redis server	redis	hostname or address
REDIS_PORT	Redis server port	6379	port number
REDIS_DB	Redis database index	0	0-15
REDIS_PASSWORD	Redis password	123456	a strong password
OPENAI_API_KEY	Necessary for using OpenAI-hosted Whisper	-	sk-….
WHISPER_ENDPOINT	Transcription endpoint: the OpenAI API or the endpoint of a self-hosted Whisper	https://api.openai.com/v1/audio/transcriptions	https://api.openai.com/v1/audio/transcriptions, http://whisper:9000/asr?encode=true&task=transcribe&output=json
WHISPER_REQUEST_BODY_AUDIO_FIELD_NAME	Use "file" for OpenAI-hosted Whisper or "audio_file" for self-hosted Whisper	file	file, audio_file
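
Before starting the server, it can help to check the values against the table. The Python sketch below is one possible pre-flight check, not part of ITH-Backend: the function name check_settings is hypothetical, and the defaults and allowed values are simply copied from the table above.

```python
import os

# Defaults and allowed values copied from the table above.
DEFAULTS = {
    "SERVE_DEMO_PAGE": "FALSE",
    "TTS_SYSTEM": "elevenlabs",
    "REDIS_HOST": "redis",
    "REDIS_PORT": "6379",
    "REDIS_DB": "0",
    "WHISPER_ENDPOINT": "https://api.openai.com/v1/audio/transcriptions",
    "WHISPER_REQUEST_BODY_AUDIO_FIELD_NAME": "file",
}
ALLOWED = {
    "SERVE_DEMO_PAGE": {"TRUE", "FALSE"},
    "TTS_SYSTEM": {"elevenlabs", "cartesia"},
    "WHISPER_REQUEST_BODY_AUDIO_FIELD_NAME": {"file", "audio_file"},
}


def check_settings(env=os.environ) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    cfg = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    problems = []
    for name, allowed in ALLOWED.items():
        if cfg[name] not in allowed:
            problems.append(f"{name}={cfg[name]!r} is not one of {sorted(allowed)}")
    if not cfg["REDIS_DB"].isdecimal() or not 0 <= int(cfg["REDIS_DB"]) <= 15:
        problems.append("REDIS_DB must be an integer from 0 to 15")
    if not cfg["REDIS_PORT"].isdecimal() or not 1 <= int(cfg["REDIS_PORT"]) <= 65535:
        problems.append("REDIS_PORT must be a valid port number")
    # The OpenAI-hosted endpoint needs an API key; a self-hosted one does not.
    if "api.openai.com" in cfg["WHISPER_ENDPOINT"] and not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is required for OpenAI-hosted Whisper")
    return problems
```

Calling check_settings() with no arguments inspects the current process environment; passing a plain dict makes it easy to try out a configuration before writing it to an env file.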

Start server

docker-compose up
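
Once the containers are running, a quick way to see whether the published ports respond is a plain TCP connection test. The Python sketch below assumes the host ports from the docker-compose.yaml above (8000, 6379, 8086, 9000) and that it runs on the Docker host; the helper name port_open is hypothetical. A successful connection only shows that something is listening, not that the service is healthy.

```python
import socket

# Host ports published in docker-compose.yaml above.
SERVICES = {"ith-backend": 8000, "redis": 6379, "influxdb": 8086, "whisper": 9000}


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if port_open("localhost", port) else "unreachable"
        print(f"{name:12} port {port}: {state}")
```

If the whisper service was removed in favor of OpenAI-hosted Whisper, port 9000 is expected to be reported as unreachable.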

3. Set Up Reverse Proxy for HTTPS

Info

Modern browsers only allow microphone access in secure contexts (HTTPS) or on localhost.
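
As a rough illustration of this rule, the Python sketch below approximates which URLs a browser would treat as eligible for microphone access. It is a simplification of the browsers' secure-context check, not a complete implementation, and the function name allows_microphone is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def allows_microphone(url: str) -> bool:
    """Approximate the browser rule: HTTPS, or any scheme on a loopback host."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return True
    host = parts.hostname or ""
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        # 127.0.0.0/8 and ::1 count as loopback addresses.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, and not a localhost name.
        return False
```

For example, http://localhost:8000/demo passes this check, while the same page served over plain HTTP from the server's public IP address does not, which is why the reverse proxy is needed.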

Using Caddy as a reverse proxy is a quick and easy way to provide TLS certificates for your own domain or for the server's IP address.

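Create Caddyfile

Replace YOUR-IP with your server's IP address. With tls internal, Caddy signs the certificate with its own local CA, so browsers show a warning until that CA is trusted.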

YOUR-IP:443 {
    reverse_proxy YOUR-IP:8000
    tls internal
}

Add Caddy Service to docker-compose.yaml (skip this if your file already contains the caddy service)

# services:
  caddy:
    image: caddy:latest
    container_name: caddy
    restart: unless-stopped
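    # With network_mode: host, Caddy binds ports 80 and 443 directly on the host,
    # so Docker ignores the ports mapping below (older Compose versions reject it).
    network_mode: host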
    ports:
      - "443:443"
      - "80:80"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile

Run docker-compose up

Done!