CC-G2PnP: Streaming Grapheme-to-Phoneme and prosody with Conformer-CTC for unsegmented languages

BEST-STD: Bidirectional Mamba-Enhanced Speech Tokenization for Spoken Term Detection, code — MIT

Language-Agnostic Speech Tokenizer for Spoken Term Detection with Efficient Retrieval, code

facebookresearch/eb_jepa

Scaling Open Discrete Audio Foundation Models with Interleaved Semantic, Acoustic, and Text Tokens

ReMoRa: Multimodal Large Language Model based on Refined Motion Representation for Long-Video Understanding

hyperreality/American-British-English-Translator

spelling_convention_nlm

ELITR/online-text-flow — Online event streaming to improve data and text flows

kyutai-labs/delayed-streams-modeling — Kyutai’s Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.

mlx-examples - T5

kyutai-labs/unmute

cohogain/whisper-large-v2-ga-IE

47 Years of HARDCORE Riffs

Irish

Caint Chonamara

RnaG shows

Bailiúchán Béaloidis Árann

Íbíotsa

Teanga Submissions, Special edition — 30 June 2026