BanglaSpeech2Text: An open-source, offline speech-to-text package for the Bangla language. Fine-tuned on the latest Whisper speech-to-text model for optimal performance.
Updated Mar 1, 2025 - Python
🔊😊 A FastAPI voice-assistant framework for quickly prototyping LLM-powered voice assistants in under 5 minutes.
French audio transcription using Gradio.
A real-time voice-to-text and text-to-speech AI pipeline using Whisper, an LLM, and Edge-TTS with tunable parameters for low-latency audio processing and response generation.
📝 Turn audio into text effortlessly. Audio transcription powered by OpenAI's Whisper API.
Subtitles Generator: An automatic subtitle generator for videos, with support for translation into various languages, using OpenAI's Whisper model.
This model predicts grammar scores (1–5) from audio files. It uses Whisper to transcribe speech to text, cleans the text, and extracts features with TF-IDF. A Random Forest Regressor is trained to learn grammar score patterns. Evaluation via Pearson Correlation showed good results.
The Whisper Subtitle Generator leverages OpenAI's Whisper model to generate subtitles from audio and video files. This Python-based tool supports multiple languages and employs advanced audio processing techniques to ensure high accuracy in transcription.
Generates subtitles from the speech in a video (using OpenAI's Whisper model) or extracts existing subtitles, translates them into a different language using a Mistral LLM, and adds them to the video. Uses ffmpeg for extraction and encoding.
Convert YouTube videos to text files. Why spend 30 minutes watching a video when you can skim the transcript in a couple of minutes?
MinutesOfMeeting and Gmail: a collaborative crew of AI agents that autonomously transcribes audio, summarizes the transcript, and drafts an email in a Gmail account.
A project that transcribes and translates into Portuguese in real time.
A real-time chat application using Next, Redis, Pub/Sub, an audio-to-text LLM, and Next-auth. I am still working on it.
This repository contains a notebook that shows how to fine-tune OpenAI's Whisper model on a custom Hindi dataset.
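Several of the projects listed above turn Whisper output into subtitles. As a rough illustration of that step, here is a minimal sketch that converts transcription segments into SRT text. It assumes segments shaped like the `start`/`end`/`text` dictionaries that the open-source `whisper` package returns from `transcribe()`. The function names `format_timestamp` and `segments_to_srt` are hypothetical and are not taken from any of the listed repositories.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from an iterable of segment dictionaries.

    Assumption: each segment has "start" and "end" (seconds) and "text",
    as in the "segments" list of a Whisper transcription result.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        text = seg["text"].strip()
        # Each SRT cue: running index, time range, then the cue text.
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hand-written sample segments; a real run would take them from a
    # transcription result instead.
    sample = [
        {"start": 0.0, "end": 1.25, "text": " Hello"},
        {"start": 1.25, "end": 3.0, "text": " world"},
    ]
    print(segments_to_srt(sample))
```

The resulting text can be written to a `.srt` file and handed to a tool such as ffmpeg for muxing, which is the kind of workflow some of the entries above describe.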