Task list, 28/9/2021

Daily todo

Sep 27, 2021

Today

Separation script: spleeter (see run_spleeter.py)
Extend abair XML to return a list of timestamps; segment long recordings: notebook
Rebase w2v notebook on this or this
Add LM and timings: see here, repo, file, this issue, parlance/ctcdecode, wav2vec2_kenlm.py
Fingerprint for known audio: dejavu
Pass over input data, with this or something similar
MFA, based on this
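A rough sketch for the "segment long recordings" item, assuming the timestamps come back as (start, end) pairs in seconds. The function name `group_spans` and the `max_len` parameter are made up for illustration and are not part of any existing script here; the idea is just to greedily merge consecutive spans until a segment would exceed a length limit.

```python
from typing import List, Optional, Tuple

# Assumption: each timestamp is a (start_seconds, end_seconds) pair,
# sorted by start time.
Span = Tuple[float, float]


def group_spans(spans: List[Span], max_len: float = 20.0) -> List[Span]:
    """Greedily merge consecutive spans into segments of at most max_len seconds.

    A single span that is already longer than max_len is kept as its own
    segment rather than being split.
    """
    segments: List[Span] = []
    current: Optional[Span] = None
    for start, end in spans:
        if current is None:
            # First span opens the first segment.
            current = (start, end)
        elif end - current[0] <= max_len:
            # Extending the open segment keeps it within the limit.
            current = (current[0], end)
        else:
            # Close the open segment and start a new one at this span.
            segments.append(current)
            current = (start, end)
    if current is not None:
        segments.append(current)
    return segments
```

The resulting (start, end) pairs could then be handed to whatever tool does the actual audio cutting.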

Look into:

Personal

Run this. See this:

--match-filter "license='Creative Commons Attribution license (reuse allowed)'"
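A small sketch of wrapping the filter above in a command, assuming the downloader is youtube-dl (the note does not name the tool; `--match-filter` is a youtube-dl option). The helper name `build_command` and the URL in the usage lines are hypothetical.

```python
from typing import List

# The filter string is copied from the note above.
CC_BY_FILTER = "license='Creative Commons Attribution license (reuse allowed)'"


def build_command(url: str, tool: str = "youtube-dl") -> List[str]:
    """Return an argument list that downloads only matching videos.

    Assumption: `tool` is youtube-dl or a compatible downloader.
    """
    # Passing a list avoids shell quoting problems with the nested quotes.
    return [tool, "--match-filter", CC_BY_FILTER, url]


# Usage (hypothetical URL); the command is only run if uncommented:
# import subprocess
# subprocess.run(build_command("https://example.com/playlist"), check=True)
```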

Longer term

TG4 Foghlaim scraper Lessons
Scrape more Ros na Rún
Compare this with stuff from last year
Segmentation: run_cleanup_segmentation.sh, tedlium, AMI
VOSK LM
CUNY-CL

Look at: