Phone-Level Pronunciation Scoring for L1 Using Weighted-Dynamic Time Warping

@INPROCEEDINGS{sini2023wdtw,
  author={Sini, Aghilas and Perquin, Antoine and Lolive, Damien and Delhay, Arnaud},
  booktitle={2022 IEEE Spoken Language Technology Workshop (SLT)},
  title={Phone-Level Pronunciation Scoring for L1 Using Weighted-Dynamic Time Warping},
  year={2023},
  pages={1081--1087},
  doi={10.1109/SLT54892.2023.10023182}
}

!pip install rapidfuzz

Scoring method (WMPS)

$$\text{WMPS}(u_{1..K,i}, v_{j}) = \max_{k\in[1,K]} \bigl(p(u_{k,i}) \times \operatorname{Lev}(a_{u_{k,i}}, a_{v_{j}})\bigr) \tag{2}$$
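A minimal Python sketch of Eq. (2) follows. It rests on several assumptions that the equation alone does not confirm: each of the K candidates u_{k,i} is represented as a pair of a string a_u and a probability p(u), the reference v_j is represented by its string a_v, and Lev is a normalized Levenshtein similarity in [0, 1] (1 for identical strings). The names `lev_similarity` and `wmps` are hypothetical. The sketch uses `rapidfuzz` when it is installed and otherwise falls back to a small pure-Python implementation.

```python
try:
    from rapidfuzz.distance import Levenshtein

    def lev_similarity(a: str, b: str) -> float:
        # Normalized Levenshtein similarity in [0, 1] from rapidfuzz.
        return Levenshtein.normalized_similarity(a, b)

except ImportError:

    def lev_similarity(a: str, b: str) -> float:
        # Fallback: 1 - edit_distance / max(len(a), len(b)).
        if not a and not b:
            return 1.0
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, start=1):
            curr = [i]
            for j, cb in enumerate(b, start=1):
                curr.append(min(prev[j] + 1,                # deletion
                                curr[j - 1] + 1,            # insertion
                                prev[j - 1] + (ca != cb)))  # substitution
            prev = curr
        return 1.0 - prev[-1] / max(len(a), len(b))


def wmps(candidates, reference):
    """Eq. (2): max over candidates of probability times similarity.

    candidates: iterable of (string, probability) pairs (assumed encoding).
    reference:  string for the reference unit v_j (assumed encoding).
    """
    return max(p * lev_similarity(a, reference) for a, p in candidates)


# Illustrative values only, not taken from the paper.
score = wmps([("abd", 0.9), ("abc", 0.5)], "abc")
```

In this toy call the near-match "abd" wins despite not being identical to the reference, because its higher probability outweighs the similarity penalty.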

WMPS with probability threshold

$$\begin{gathered} s = \operatorname{Lev}(a_{u_{k,i}}, a_{v_{j}}) \\ d = \begin{cases} p(u_{k,i}) & \text{if } s = 1 \\ p(u_{k,i}) \times \alpha & \text{if } \beta \leq s < 1 \\ 0 & \text{if } s < \beta \end{cases} \\ \text{WMPS}(u_{1..K,i}, v_{j}) = \max_{k\in[1,K]} d \end{gathered} \tag{3}$$
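The thresholded variant in Eq. (3) can be sketched as below. To keep the example self-contained, it assumes the similarity s has already been computed for each candidate (for instance with `rapidfuzz.distance.Levenshtein.normalized_similarity`) and takes (probability, similarity) pairs as input. The function names and the values of `alpha` and `beta` in the usage lines are hypothetical, chosen only for illustration.

```python
def thresholded_weight(p: float, s: float, alpha: float, beta: float) -> float:
    """Compute d from Eq. (3) for one candidate.

    p: candidate probability p(u_{k,i})
    s: similarity between candidate and reference, assumed to lie in [0, 1]
    """
    if s >= 1.0:
        return p            # exact match: keep the full probability
    if s >= beta:
        return p * alpha    # partial match: scale the probability by alpha
    return 0.0              # below the threshold: no credit


def wmps_thresholded(pairs, alpha: float, beta: float) -> float:
    """Eq. (3): max of d over the K candidates, given (p, s) pairs."""
    return max(thresholded_weight(p, s, alpha, beta) for p, s in pairs)


# Illustrative values only: three candidates as (probability, similarity).
pairs = [(0.5, 1.0), (0.9, 2 / 3), (1.0, 0.0)]
result = wmps_thresholded(pairs, alpha=0.5, beta=0.6)
```

With these values the exact match contributes 0.5, the partial match 0.9 × 0.5 = 0.45, and the dissimilar candidate 0, so the maximum comes from the exact match even though its probability is lower.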