Original here

The "Preparation of speech and text data" section of the readme says:

Similar to wav2vec 2.0, data folders contain {train,valid,test}.{tsv,wrd,phn} files, where audio paths are stored in tsv files, and word, letter or phoneme transcriptions are stored in .{wrd,ltr,phn}. The .wrd and .ltr files are outputs of libri_labels.py

%%capture
!pip install phonemizer
%%capture
!apt-get -y install espeak
%%capture
!apt-get -y install zsh

This is just my best guess at what the .wrd files contain, but it seems to match what libri_labels.py does. Given input like

1272-128104-0000 MISTER QUILTER IS THE APOSTLE OF THE MIDDLE CLASSES AND WE ARE GLAD TO WELCOME HIS GOSPEL

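it writes " ".join(items[1:]), that is, the transcript without the leading utterance ID, which is basically the same as the one-sentence-per-line output produced here.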
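
As a minimal sketch of that step, assuming each transcript line is an utterance ID followed by the words (the helper name `transcript_to_labels` is hypothetical and not part of fairseq), the .wrd and .ltr lines could be derived like this:

```python
def transcript_to_labels(line):
    """Split a LibriSpeech-style transcript line into (id, .wrd line, .ltr line)."""
    items = line.strip().split()
    utt_id = items[0]
    # The .wrd line is the transcript without the leading utterance ID.
    wrd = " ".join(items[1:])
    # The .ltr line spells out every character, with "|" as the word boundary.
    ltr = " ".join(list(wrd.replace(" ", "|"))) + " |"
    return utt_id, wrd, ltr


utt_id, wrd, ltr = transcript_to_labels("1272-128104-0000 MISTER QUILTER IS THE APOSTLE")
print(wrd)  # MISTER QUILTER IS THE APOSTLE
print(ltr)  # M I S T E R | Q U I L T E R | I S | T H E | A P O S T L E |
```

The letter format is the same one that appears in the train.ltr output further down.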

!cat /kaggle/input/download-common-voice-swedish/cv-corpus-6.1-2020-12-11/sv-SE/test.tsv | awk -F'\t' '{print $3}'|grep -v '^sentence$' | perl -C7 -ane 'chomp;$_=lc($_);s/[^\p{L}\p{N}\p{M}'"\'"' \-]/ /g;s/  +/ /g;s/ $//;s/^ //;print "$_\n";' > test.wrd
!cat /kaggle/input/download-common-voice-swedish/cv-corpus-6.1-2020-12-11/sv-SE/dev.tsv | awk -F'\t' '{print $3}'|grep -v '^sentence$' | perl -C7 -ane 'chomp;$_=lc($_);s/[^\p{L}\p{N}\p{M}'"\'"' \-]/ /g;s/  +/ /g;s/ $//;s/^ //;print "$_\n";' > valid.wrd
!cat /kaggle/input/download-common-voice-swedish/cv-corpus-6.1-2020-12-11/sv-SE/train.tsv | awk -F'\t' '{print $3}'|grep -v '^sentence$' | perl -C7 -ane 'chomp;$_=lc($_);s/[^\p{L}\p{N}\p{M}'"\'"' \-]/ /g;s/  +/ /g;s/ $//;s/^ //;print "$_\n";' > train.wrd
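# Build the .ltr (letter) files from the .wrd files: characters separated by
# spaces, with "|" marking word boundaries, in the same way as libri_labels.py.
for i in ['train', 'test', 'valid']:
    with open(f'/kaggle/working/{i}.wrd', 'r') as inf, open(f'/kaggle/working/{i}.ltr', 'w') as out:
        for line in inf:
            print(" ".join(list(line.strip().replace(" ", "|"))) + " |", file=out)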
!head train.ltr
v a d | ä r | d e t | i | e u r o |
d u | s k a | v e t a | a t t | d e t | ä r | d u | s o m | h a r | f e l |
g å | n e r | p å | k n ä |
f ö r s t | m å s t e | j a g | s l å | s ö n d e r | d e n | d ä r | s t o r a | s k r o t h ö g e n |
d e t | b l i r | s v å r t |
v a d | f ö r | j ä v l a | f r å g a | ä r | d e t |
j a g | å t e r v ä n d e r | i n t e | t i l l | s k i t h å l e t |
t i t t a | p å | s ö m m a r n a |
f e s | d u | p r e c i s |
a k t r i s e r | h a r | e t t | b ä s t | f ö r e d a t u m |

There are some warnings about language switching, so echo the filename first to know where the errors are

!for i in train test valid; do echo $i.wrd; cat $i.wrd | PHONEMIZER_ESPEAK_PATH=$(which espeak) phonemize -o $i.phn -p ' ' -w '' -l sv  -j 70 --language-switch remove-flags ;done
train.wrd
[WARNING] 2 utterances containing language switches on lines 254, 1457
[WARNING] extra phones may appear in the "sv" phoneset
[WARNING] language switch flags have been removed (applying "remove-flags" policy)
test.wrd
[WARNING] 1 utterances containing language switches on lines 81
[WARNING] extra phones may appear in the "sv" phoneset
[WARNING] language switch flags have been removed (applying "remove-flags" policy)
valid.wrd
[WARNING] 1 utterances containing language switches on lines 1831
[WARNING] extra phones may appear in the "sv" phoneset
[WARNING] language switch flags have been removed (applying "remove-flags" policy)
!cat test.wrd|awk 'BEGIN{ln=1}{if(ln==81){print $0};ln++}'
!cat train.wrd|awk 'BEGIN{ln=1}{if(ln==254||ln==1457){print $0};ln++}'
!cat valid.wrd|awk 'BEGIN{ln=1}{if(ln==1831){print $0};ln++}'
det är taskigt
och så unik design
internet slutade fungera
det finns inget internet
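
The lookup above can also be done without copying line numbers by hand. The following is only a sketch: it assumes the warning text has the shape shown in the log above, and the helper names `switch_lines` and `lines_at` are made up for illustration. The demo runs on a small temporary file rather than the real .wrd files.

```python
import os
import re
import tempfile


def switch_lines(warning):
    """Extract the 1-indexed line numbers from a language-switch warning."""
    match = re.search(r"on lines ([\d, ]+)", warning)
    if match is None:
        return []
    return [int(n) for n in match.group(1).split(",") if n.strip()]


def lines_at(path, wanted):
    """Return {line number: text} for the 1-indexed line numbers in `wanted`."""
    wanted = set(wanted)
    found = {}
    with open(path, encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            if number in wanted:
                found[number] = line.rstrip("\n")
    return found


warning = "[WARNING] 2 utterances containing language switches on lines 254, 1457"
numbers = switch_lines(warning)
print(numbers)  # [254, 1457]

# Demo on a temporary stand-in for a .wrd file (contents are placeholders).
with tempfile.NamedTemporaryFile("w", suffix=".wrd", encoding="utf-8", delete=False) as tmp:
    tmp.write("första raden\nandra raden\ntredje raden\n")
demo = lines_at(tmp.name, [2, 3])
os.remove(tmp.name)
print(demo)
```

With the real files, `lines_at("train.wrd", numbers)` would play the role of the awk one-liner.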
!cat test.phn|awk 'BEGIN{ln=1}{if(ln==81){print $0};ln++}'
!cat train.phn|awk 'BEGIN{ln=1}{if(ln==254||ln==1457){print $0};ln++}'
!cat valid.phn|awk 'BEGIN{ln=1}{if(ln==1831){print $0};ln++}'
d eː t ɛː r  t a s k ɪ ɡ t  
ɔ k s oː ɵ n iː k  d ɪ z aɪ n  
 ɪ n t ə n ɛ t  s l ʉ t a d ə f ɵ n ɡ eː r a 
d eː t f ɪ n s ɪ ŋ ə t  ɪ n t ə n ɛ t  

"design" and "internet" are clearly the English words that are causing the switch in their respective sentences, but I'm not sure what the problem in test.wrd is: "taskigt"?

!echo taskigt|espeak -v sv --ipa 2> /dev/null
 (en)tˈaskɪɡt(sv)
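
The `(en)` and `(sv)` markers in that output are the language-switch flags the warnings refer to. A rough sketch for spotting them programmatically follows; it assumes the flags always look like a lowercase language code in parentheses, as in the output above, and the function names are hypothetical.

```python
import re

# Matches flags such as "(en)" or "(sv)"; the exact flag syntax is an assumption
# based on the espeak output shown above.
FLAG = re.compile(r"\(([a-z][a-z-]*)\)")


def language_switches(ipa, main="sv"):
    """Return the language codes flagged in `ipa` other than the main language."""
    return [code for code in FLAG.findall(ipa) if code != main]


def strip_flags(ipa):
    """Remove all language flags, roughly what the remove-flags policy leaves behind."""
    return FLAG.sub("", ipa).strip()


out = " (en)tˈaskɪɡt(sv)"
print(language_switches(out))  # ['en']
print(strip_flags(out))        # tˈaskɪɡt
```

So espeak really does treat "taskigt" as English here, which explains the warning for line 81 of test.wrd.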
!cat test.phn|sed -e 's/^ //;s/t a s k ɪ ɡ t/t a s k ɪ t/' > tmp
!mv tmp test.phn
!cat train.phn|sed -e 's/^ //;s/d ɪ z aɪ n/d ɛ s a j n/;s/ɪ n t ə n ɛ t/ɪ n t ɛ r n ɛ t/' > tmp
!mv tmp train.phn
!cat valid.phn|sed -e 's/^ //;s/ɪ n t ə n ɛ t/ɪ n t ɛ r n ɛ t/' > tmp
!mv tmp valid.phn
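
For reference, the same corrections can be written as a small Python table. This is a sketch only: unlike the sed commands, which apply a different set of fixes to each file, it applies every correction to every line, and the name `fix_line` is made up for illustration.

```python
# Phoneme strings as produced above, mapped to the hand-corrected Swedish versions.
CORRECTIONS = {
    "t a s k ɪ ɡ t": "t a s k ɪ t",
    "d ɪ z aɪ n": "d ɛ s a j n",
    "ɪ n t ə n ɛ t": "ɪ n t ɛ r n ɛ t",
}


def fix_line(line, corrections=CORRECTIONS):
    """Apply the manual phoneme corrections to one line of a .phn file."""
    # Drop a single leading space, as `s/^ //` does.
    if line.startswith(" "):
        line = line[1:]
    for wrong, right in corrections.items():
        # sed without the g flag replaces only the first match on a line.
        line = line.replace(wrong, right, 1)
    return line


fixed = fix_line(" ɪ n t ə n ɛ t  s l ʉ t a d ə f ɵ n ɡ eː r a ")
print(fixed)
```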
!for i in train test valid; do cat $i.wrd|tr ' ' '\n'|sort|uniq |grep -v '^internet$'|grep -v '^design$'|grep -v '^taskigt$' > /tmp/$i.wl; cat /tmp/$i.wl | PHONEMIZER_ESPEAK_PATH=$(which espeak) phonemize -o /tmp/$i.wl.phn -p ' ' -w '' -l sv  -j 70 --language-switch remove-flags;paste /tmp/$i.wl /tmp/$i.wl.phn > dict.$i; done
!printf "taskigt\tt a s k ɪ t\n" >> dict.test
!printf "design\td ɛ s a j n\n" >> dict.train
!printf "internet\tɪ n t ɛ r n ɛ t\n" >> dict.train
!printf "internet\tɪ n t ɛ r n ɛ t\n" >> dict.valid
!for i in dic*;do cat $i |sort > tmp;mv tmp $i;done
cat: valid: No such file or directory
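
The dictionary step above pairs each unique word with its phonemization, adds the hand-corrected entries, and sorts the result. A minimal Python sketch of the same idea is below. It assumes the phonemizer returns exactly one output line per input word (which `paste` also relies on), `build_dict` is a hypothetical helper, and the sample entries are illustrative values taken from the output earlier in this notebook.

```python
def build_dict(words, phones, overrides=None):
    """Pair words with phoneme strings, apply manual overrides, and sort by word."""
    if len(words) != len(phones):
        raise ValueError("word list and phoneme list must have the same length")
    entries = dict(zip(words, phones))
    # Manual entries replace or extend whatever the phonemizer produced.
    entries.update(overrides or {})
    # Tab-separated "word<TAB>phonemes" lines, like the output of paste.
    return [f"{word}\t{phone}" for word, phone in sorted(entries.items())]


words = ["finns", "det"]
phones = ["f ɪ n s", "d eː t"]
overrides = {"internet": "ɪ n t ɛ r n ɛ t"}
result = build_dict(words, phones, overrides)
for entry in result:
    print(entry)
```

Python sorts by code point here, while the shell `sort` follows the current locale, so the order of words containing å, ä or ö may differ between the two.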