Wordcab Transcribe

💬 Speech recognition is now a commodity

FastAPI-based API for transcribing audio files using faster-whisper, with speaker diarization via Auto-Tuning-Spectral-Clustering (based on this GitHub implementation).

Important

To see how Wordcab-Transcribe performs compared with other ASR tools on the market, check out our benchmark project: Rate that ASR.

Key features

⚡ Fast: The faster-whisper library and CTranslate2 make audio processing incredibly fast compared to other implementations.
🐳 Easy to deploy: You can deploy the project on your workstation or in the cloud using Docker.
🔥 Batch requests: The API supports batch requests, so you can transcribe multiple audio files at once.
💸 Cost-effective: As an open-source solution, you won't have to pay for costly ASR platforms.
🫶 Easy-to-use API: With just a few lines of code, you can use the API to transcribe audio files or even YouTube videos.
🤗 Open-source (commercial use under the WTLv0.1 license; please reach out to info@wordcab.com): Our project is open-source and based on open-source libraries, so you can customize and extend it as needed, as long as you don't sell it as a hosted service.

Requirements

Local development

Linux (tested on Ubuntu Server 20.04/22.04)
Python >=3.8, <3.12
Hatch
FFmpeg
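
The local requirements above can be checked before installing anything. The following is a minimal sketch, not part of the project: the helper names `python_version_ok` and `missing_tools` are hypothetical, and it assumes the Hatch and FFmpeg executables are named `hatch` and `ffmpeg` on your PATH.

```python
import shutil
import sys

# Version bounds taken from the requirements list: Python >=3.8, <3.12.
MIN_PYTHON = (3, 8)
MAX_PYTHON_EXCLUSIVE = (3, 12)


def python_version_ok(version=None):
    """Return True if the (major, minor) version satisfies >=3.8, <3.12."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor < MAX_PYTHON_EXCLUSIVE


def missing_tools(tools=("hatch", "ffmpeg")):
    """Return the names in `tools` that are not found on PATH.

    The executable names are an assumption; adjust them if your
    installation uses different ones.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    if not sys.platform.startswith("linux"):
        print("Warning: only Linux (Ubuntu Server 20.04/22.04) is tested.")
    if not python_version_ok():
        print("Unsupported Python version:", sys.version.split()[0])
    for tool in missing_tools():
        print("Not found on PATH:", tool)
```

If the script prints nothing, the interpreter version and both tools look fine for local development.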

Deployment

Docker (optional for deployment)
NVIDIA GPU + NVIDIA Container Toolkit (optional for deployment)

How to start?

You need to clone the repository and install the dependencies:

git clone https://github.com/Wordcab/wordcab-transcribe.git

cd wordcab-transcribe

hatch env create

Then, you can start using the API. Head to the Usage section to learn more.

Last update: 2023-10-12
Created: 2023-10-12