What little Kashubian text there is on the internet seems to be in PDF. smh
Apr 23, 2021
Checking that I haven't left anything out
Apr 23, 2021
Apr 22, 2021
tl;dr - there's a missing symlink
Apr 20, 2021
Apr 19, 2021
Case folding in Irish is odd; ICU can be used from most languages
Apr 18, 2021
M2M100 used CC-Aligned
Apr 14, 2021
Making/testing the dataset
Apr 14, 2021
So, do massively multilingual MT models trained on massively crawled datasets lead to great output? No.
Apr 13, 2021
This took a while
Apr 6, 2021
TTS test corpus for Irish from IDLAK
Apr 6, 2021
How does it fare with closely related languages? Part 1: Processing
Mar 28, 2021
Speech recognition with wav2vec2, as input to the DSAlign aligner
Mar 27, 2021
An attempt to use a word processor for Wikisource. Not a success.
Jan 9, 2021
Compared with Abair's IPA
Nov 25, 2020