Monkey-patched WhisperX with changed segmentation
Jul 26, 2024
Because it was quicker than looking at the API examples
Jul 25, 2024
tl;dr: OWSM-CTC is good enough for alignment for Irish
Jun 27, 2024
Creating synthetic data for training
Jun 19, 2024
For a student project
Mar 3, 2024
For a student project
Feb 29, 2024
Generating sentences from Riksdag: in progress
Feb 26, 2024
Also basic pieces for scraping Sveriges Radio pages
Feb 17, 2024
For a student project
Feb 16, 2024
Runs ASR + phonetic recognition on two versions of Dubliners from LibriVox: one (v2) with correct pronunciations, the other read by Americans
Feb 15, 2024
I can't remember what this was for; I'm sure I'll be reminded
Dec 15, 2023
Reading old data
Oct 17, 2023
Between English Wiktionary and phoneme recognition output
Oct 10, 2023
Fairseq data preparation for Waxholm phonetic transcriptions
Aug 10, 2023
Mostly, it's the push_to_hub part that I'll forget
Jan 21, 2023