asr - In-Depth Technical Topic Guide | GitHub Chinese Community

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

speech speech-recognition speech-to-text whisper asr

Updated Jul 13, 2026
Python

modelscope / FunASR

Open-source speech recognition toolkit for training, inference, streaming ASR, VAD, punctuation, speaker diarization pipelines, and OpenAI-compatible/MCP serving.

Updated Jul 27, 2026
Python

NVIDIA-NeMo / Speech

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

machine-translation tts speech-synthesis neural-networks deeplearning speaker-recognition asr speech-translation speaker-diariazation generative-ai

Updated Jul 28, 2026
Python

alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Updated Jul 2, 2026
Jupyter Notebook

k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, a websocket server/client, and 12 programming languages.

android windows macos linux lazarus raspberry-pi ios text-to-speech csharp cpp dotnet speech-to-text aarch64 mfc risc-v object-pascal asr arm32 onnx vits

Updated Jul 28, 2026
C++

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Updated Jul 20, 2026
Python

speechbrain / speechbrain

A PyTorch-based Speech Toolkit

Updated Jun 15, 2026
Python

debpalash / OmniVoice-Studio

Local voice clone, video dubbing, dictation and audiobook maker. The open-source ElevenLabs alternative.

text-to-speech self-hosted audiobook tts speech-recognition speech-to-text transcription video-editing asr dubbing voice-cloning voice-generation voice-ai elevenlabs local-ai dubbing-ai omnivoice omnivoice-studio

Updated Jul 27, 2026
Python

QwenAudio / SenseVoice

Open-source SenseVoiceSmall model for Mandarin, Cantonese, English, Japanese, and Korean ASR, language ID, emotion recognition, and audio event detection.

Updated Jul 27, 2026
C

jdepoix / youtube-transcript-api

This is a Python API that lets you get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions.

python cli youtube youtube-video youtube-api captions subtitles transcript subtitle transcripts asr youtube-subtitles youtube-transcripts youtube-captions youtube-transcript translating-transcripts youtube-asr

Updated May 19, 2026
Python

wzpan / wukong-robot

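🤖 wukong-robot is a simple, flexible, and elegant Chinese voice dialogue robot / smart speaker project. It supports ChatGPT multi-turn conversation and may be the first open-source smart speaker project to support brain-computer interaction.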

alexa ai amazon-echo muse tts openai google-home unit bci speaker homeassistant snowboy asr anyq raspeberry-pi gpt3 chatgpt

Updated Oct 25, 2024
Python

modelscope / FunClip

FunASR-powered video transcription, subtitle generation, and LLM-assisted clipping tool with a local Gradio UI.

Updated Jul 26, 2026
Python

xiangyuecn / Recorder

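HTML5 JS audio recording in mp3, wav, ogg, webm, amr, g711a, and g711u formats. Supports PC and some Android and iOS web browsers, Hybrid Apps (Android and iOS app source code provided), and WeChat. Provides ASR speech-to-text, an H5 voice call/chat example, and DTMF encoding/decoding.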

audio javascript html browser web html5 dtmf webrtc webm mp3 wav recording recorder amr ogg record h5 asr sound-record luyin

Updated Jul 9, 2026
JavaScript

MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

speech speech-recognition speech-to-text whisper asr speaker-diarization

Updated Feb 23, 2026
Jupyter Notebook

wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

pytorch transformer speech-recognition automatic-speech-recognition production-ready whisper asr conformer e2e-models

Updated Jun 15, 2026
Python

umlx5h / LLPlayer

The media player for language learning, with dual subtitles, AI-generated subtitles, real-time translation, and more!

player ocr video csharp video-player wpf language-learning media-player whisper asr flyleaf yt-dlp llm ollama

Updated Jul 19, 2026
C#

PeterH0323 / Streamer-Sales

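Streamer-Sales (Sales Champion): a sales-streamer LLM 🛒🎁 that presents products based on their given features, with the aim of stimulating users' desire to buy. 🚀⭐ Includes a detailed data generation workflow ❗ 📦 It also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS text-to-speech 🔊, digital human generation 🦸, an Agent that queries the web for real-time information 🌐, ASR speech-to-text 🎙️, a front end built on the Vue ecosystem 🍍, a back end built with FastAPI 🗝️, and Docker Compose packaging and deployment 🐋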

chat chatbot text-generation tts gpt chat-application asr rag digital-human llm chatgpt internlm-chat-7b internlm2 meta-human

Updated Mar 8, 2025
Python

ahmetoner / whisper-asr-webservice

OpenAI Whisper ASR Webservice API

docker speech speech-recognition automatic-speech-recognition speech-to-text asr openai-whisper

Updated Nov 23, 2025
Python

Purfview / whisper-standalone-win

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

subtitles speech-recognition openai speech-to-text whisper asr speaker-diarization uvr transcriber diarization faster-whisper ctranslate2 whisperx whisper-faster vocal-extractor

Updated Nov 7, 2025

CheshireCC / faster-whisper-GUI

faster_whisper GUI with PySide6

openai vad whisper asr transcribe voice-transcription faster-whisper whisperx

Updated Dec 8, 2024
Python

asr - Technical Topic

Here are 2,431 public repositories matching this topic...

m-bain / whisperX

modelscope / FunASR

NVIDIA-NeMo / Speech

alphacep / vosk-api

k2-fsa / sherpa-onnx

PaddlePaddle / PaddleSpeech

speechbrain / speechbrain

debpalash / OmniVoice-Studio

QwenAudio / SenseVoice

jdepoix / youtube-transcript-api

wzpan / wukong-robot

modelscope / FunClip

xiangyuecn / Recorder

MahmoudAshraf97 / whisper-diarization

wenet-e2e / wenet

umlx5h / LLPlayer

PeterH0323 / Streamer-Sales

ahmetoner / whisper-asr-webservice

Purfview / whisper-standalone-win

CheshireCC / faster-whisper-GUI