List: TTS / STT / Audio Processing | Curated by Simon Kwan

Feb 14, 2025
141 stories
2 saves
Tommy Huang
Using OpenAI Whisper for Speech-to-Text
To transcribe speech to text with Whisper, you need to install ffmpeg and the whisper package.
Jan 8
vignesh yaadav
Fine-Tuning wav2vec2 on your Google Colab: Take a deep dive into advanced audio classification…
The inspiration for embarking on this remarkable audio classification journey struck during my early morning jogs. As I ventured into the…
Sep 8, 2023
In Akvelon by Dmitrii Lukianov
Generating Subtitles with OpenAI Whisper
Key Takeaways
Mar 15, 2023
2
tttzof351
Build text-to-speech from scratch.
In the series of small articles, we will write step-by-step a toy text-to-speech model. It will be a simple model with a modest goal — to…
Aug 2, 2023
1
In Level Up Coding by Ali
5 Killer Python Libraries For Audio Processing
Data Science projects, music composition, and lots more…
Nov 28, 2022
1
In Python in Plain English by Everton Gomede, PhD
Audio Segmentation and Artificial Intelligence: A Harmonious Symphony
Introduction
Oct 4, 2023
1
In axinc-ai by David Cochard
SileroVAD : Machine Learning Model to Detect Speech Segments
This is an introduction to 「SileroVAD」, a machine learning model that can be used with ailia SDK. You can easily use this model to create…
Dec 26, 2023
In sho.jp by Sho Nakagome
Fourier Transform 101 — Part 5: Fast Fourier Transform (FFT)
Disclaimer! It’s not Final Fantasy Tactics! I mean I like the game, but this one is cool too.
Mar 17, 2019
1
In Analytics Vidhya by David Castro Piñol
Multichannel Speech Enhancement Using Deep Neural Networks
A brief description of current methods in the state of the art literature
Feb 25, 2022
In sho.jp by Sho Nakagome
Fourier Transform 101 — Part 1: Real Fourier Series
In this series, I’m going to explain about Fourier Transform. Have you heard of the term? If not, that’s totally fine. This will be the…
Sep 27, 2018
1
In sho.jp by Sho Nakagome
Fourier Transform 101 — Part 2: Complex Fourier Series
Last time, I covered Real Fourier Series based on lectures from Dr. Wim van Drongelen.
Oct 2, 2018
1
In sho.jp by Sho Nakagome
Fourier Transform 101 — Part 3: Fourier Transform
Previously, we covered the basic ideas behind Fourier Series starting from the “Real Fourier Series”.
Oct 10, 2018
In TDS Archive by Kung-Hsiang, Huang (Steeve)
Automatic Speech Recognition Data Collection with Youtube V3 API, Mask-RCNN and Google Vision API
Background
Aug 26, 2018
6
In Linagora LABS by Rudy BARAGLIA
Voice Activity Detection for Voice User Interface.
As a part of a R&D team at Linagora, I have been working on several Speech based technologies involving Voice Activity Detection (VAD) for…
Jun 20, 2018
1
In ViVoLab by Pablo Gimeno Jordán
ViVoVAD: A Voice Activity Detection Tool Based on Recurrent Neural Networks
Pablo Gimeno Jordán, @Ignacio Viñals, @Alfonso Ortega, Antonio Miguel Artiaga, Eduardo Lleida
Jun 4, 2019
Antony M. Gitau
From Raw Data to Accurate Speech Recognition (ASR): My journey of Data Preparation.
My journey started by visiting the Mozilla Common Voice project[1], a publicly available database of crowd-sourced voice datasets for…
Feb 15, 2023
Onkar Patil
How to Remove Silence from an Audio using Python
There are many ways available that remove the silence part or the dead spaces from an audio file but it’s time consuming to know which one…
Jun 30, 2022
1
In TDS Archive by Max Hilsdorf
AI Music Source Separation: How it Works and Why It Is So Hard
Source Separation AI, explained
Sep 21, 2023
3
In TDS Archive by Ahmed Besbes
Deploy a Voice-Based Chatbot with BentoML, LangChain, and Gradio
BentoML is Like Lego for ML engineers
May 2, 2023
3
In TDS Archive by Zoumana Keita
How to Perform Speech-to-Text and Translate Any Speech to English With OpenAI’s Whisper
How to use cutting-edge NLP models for audio transcription to text and machine translation.
Dec 14, 2022
3