Info
Currently this page is only available in English.
What does the ITH-Backend do?
- Provides the API for ITH sessions
- Handles transcription and speech generation
- Computes mouth movements and facial expressions
- Serves all necessary files for the client
Hosting ITH-Backend with Docker
1. Install Docker

If Docker is not already installed, install it from docker.com:
- Linux: install Docker Engine.
- Windows, Mac: install Docker Desktop.
2. Create docker-compose.yaml
```yaml
services:
  caddy:
    image: caddy:latest
    container_name: caddy
    restart: unless-stopped
    network_mode: host
    ports:
      - "443:443"
      - "80:80"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile

  ith-backend:
    build:
      context: .
    ports:
      - "8000:8000"
    volumes:
      - ./assets:/app/assets
      - ./configs/.docker.env:/app/configs/.env
      - ./static:/app/static
      - ./templates:/app/templates
    env_file:
      - ./configs/.secret_keys.env

  redis:
    image: redis:latest
    container_name: redis
    ports:
      - "6379:6379"
    command: [ "redis-server", "--requirepass", "123456" ]

  influxdb:
    image: influxdb:2.7
    container_name: influxdb
    ports:
      - "8086:8086"
    volumes:
      - influxdb-data:/var/lib/influxdb2
      - influxdb-config:/etc/influxdb2
    environment:
      - DOCKER_INFLUXDB_INIT_MODE=setup
      - DOCKER_INFLUXDB_INIT_USERNAME=admin
      - DOCKER_INFLUXDB_INIT_PASSWORD=adminpassword
      - DOCKER_INFLUXDB_INIT_ORG=avaluma
      - DOCKER_INFLUXDB_INIT_BUCKET=ith-v1
      - DOCKER_INFLUXDB_INIT_ADMIN_TOKEN=secret-influx-token
      - DOCKER_INFLUXDB_INIT_RETENTION=14d

  whisper:
    image: onerahmet/openai-whisper-asr-webservice
    container_name: whisper
    environment:
      ASR_ENGINE: openai_whisper  # [openai_whisper, faster_whisper, whisperx]
      ASR_MODEL: base             # (tiny, base, small, medium, large-v3, etc.)
      ASR_DEVICE: cpu
    ports:
      - "9000:9000"
    volumes:
      - $PWD/.cache/whisper:/root/.cache/  # reduce container startup time

volumes:
  influxdb-data:
  influxdb-config:
```
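Once the stack is running, the self-hosted Whisper service can be smoke-tested directly. This is a sketch assuming a local audio file named sample.wav; note the audio_file field name, which this webservice expects:

```shell
# Send a local audio file to the self-hosted Whisper service and print
# the JSON transcription. Assumes the stack is running and sample.wav exists.
WHISPER_URL="http://localhost:9000/asr?encode=true&task=transcribe&output=json"
curl -s -F "audio_file=@sample.wav" "$WHISPER_URL" || echo "whisper not reachable yet"
```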
Using OpenAI-hosted Whisper instead of the self-hosted service:
- Remove the whisper service from docker-compose.yaml.
- Set the following environment variables for ith-backend:

```shell
WHISPER_ENDPOINT="https://api.openai.com/v1/audio/transcriptions"  # or an OpenAI-compatible endpoint
WHISPER_MODEL="whisper-1"
WHISPER_REQUEST_BODY_AUDIO_FIELD_NAME="file"
```
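For reference, the OpenAI-hosted route corresponds to a request like the one below. It assumes OPENAI_API_KEY is set and a local sample.wav exists, and shows why the audio field name is file here rather than audio_file:

```shell
# Direct request to OpenAI's hosted transcription endpoint.
# The audio goes in the "file" multipart field; the model is passed explicitly.
OPENAI_TRANSCRIBE_URL="https://api.openai.com/v1/audio/transcriptions"
curl -s "$OPENAI_TRANSCRIBE_URL" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F model=whisper-1 \
  -F file=@sample.wav || echo "request failed"
```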
Edit Env Variables
ITH-Backend
VARIABLE | Description | Default Value | Possible Values |
---|---|---|---|
SERVE_DEMO_PAGE | If TRUE, the speaking and transcription functionality can be tested at localhost:8000/demo. Should be FALSE in production. | FALSE | TRUE, FALSE |
TTS_SYSTEM | Defines the default TTS provider. More providers coming soon. | elevenlabs | elevenlabs, cartesia |
REDIS_HOST | Redis server host | redis | URL of the Redis server |
REDIS_PORT | Redis server port | 6379 | port number |
REDIS_DB | Redis database index | 0 | 0-15 |
REDIS_PASSWORD | Redis password | 123456 | a strong password |
OPENAI_API_KEY | Necessary for using OpenAI-hosted Whisper | - | sk-…. |
WHISPER_ENDPOINT | OpenAI-compatible endpoint of a self-hosted Whisper | https://api.openai.com/v1/audio/transcriptions | "https://api.openai.com/v1/audio/transcriptions", "http://whisper:9000/asr?encode=true&task=transcribe&output=json" |
WHISPER_REQUEST_BODY_AUDIO_FIELD_NAME | Use "file" for OpenAI-hosted Whisper or "audio_file" for self-hosted Whisper | file | file, audio_file |
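The defaults above (the Redis password 123456, the InfluxDB admin token) are placeholders and should be replaced before deploying. One quick way to generate a strong secret, assuming openssl is available:

```shell
# Generate a random 24-byte secret, base64-encoded (32 characters)
REDIS_PASSWORD=$(openssl rand -base64 24)
echo "$REDIS_PASSWORD"
```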
3. Start server

```shell
docker-compose up
```
4. Set Up Reverse Proxy for HTTPS
Info
Modern browsers only allow microphone access over secure contexts (HTTPS) or on localhost.
To provide TLS certificates for your own domain or for the server's IP address, using Caddy as a reverse proxy is a quick and easy solution.
- Create a Caddyfile. Replace YOUR-IP with your server's IP address:

```
YOUR-IP:443 {
    reverse_proxy YOUR-IP:8000
    tls internal
}
```
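The tls internal directive issues a locally-trusted certificate, which browsers will warn about. If the server has a public DNS name instead of a bare IP, Caddy can obtain a trusted certificate automatically; a sketch with a hypothetical domain:

```
ith.example.com {
    reverse_proxy localhost:8000
}
```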
- Add the Caddy service to docker-compose.yaml:

```yaml
services:
  # ...existing services...
  caddy:
    image: caddy:latest
    container_name: caddy
    restart: unless-stopped
    network_mode: host
    ports:
      - "443:443"
      - "80:80"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
```
- Run docker-compose up.

Done!
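To confirm the proxy is serving HTTPS, a quick check from the server (replace YOUR-IP as above; -k is needed because tls internal issues a self-signed, locally-trusted certificate rather than one from a public CA):

```shell
# Print the HTTP status code returned over HTTPS; -k skips certificate
# verification because "tls internal" does not use a public CA.
PROXY_URL="https://YOUR-IP/"  # replace YOUR-IP with the server's IP address
curl -sk -o /dev/null -w "%{http_code}\n" "$PROXY_URL" || echo "proxy not reachable yet"
```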