Interesting links, 20/02/2026

Misc. interesting things.

Feb 20, 2026

Tiny Aya reimplementation From Scratch!

Have been reading through the technical reports of the recent wave of open-weight LLM releases (more on that soon).
Tiny Aya (2 days ago) was a bit under the radar. Looks like a nice, small 3.35B model with strongest multilingual support… pic.twitter.com/aWb0FwKheW
— Sebastian Raschka (@rasbt) February 19, 2026

CC-G2PnP: Streaming Grapheme-to-Phoneme and prosody with Conformer-CTC for unsegmented languages

BEST-STD: Bidirectional Mamba-Enhanced Speech Tokenization for Spoken Term Detection, code — MIT

Language-Agnostic Speech Tokenizer for Spoken Term Detection with Efficient Retrieval, code

facebookresearch/eb_jepa

Scaling Open Discrete Audio Foundation Models with Interleaved Semantic, Acoustic, and Text Tokens

ReMoRa: Multimodal Large Language Model based on Refined Motion Representation for Long-Video Understanding

hyperreality/American-British-English-Translator

spelling_convention_nlm

ELITR/online-text-flow — Online event streaming to improve data and text flows

kyutai-labs/delayed-streams-modeling — Kyutai’s Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.

mlx-examples - T5

kyutai-labs/unmute

cohogain/whisper-large-v2-ga-IE

47 Years of HARDCORE Riffs

Irish

Caint Chonamara

Bailiúchán Béaloidis Árann

Teanga Submissions, Special edition — 30 June 2026