Interesting links, 20/02/2026
Misc. interesting things.
Tiny Aya reimplementation From Scratch!
Have been reading through the technical reports of the recent wave of open-weight LLM releases (more on that soon).
Tiny Aya (2 days ago) was a bit under the radar. Looks like a nice, small 3.35B model with strongest multilingual support… pic.twitter.com/aWb0FwKheW
— Sebastian Raschka (@rasbt) February 19, 2026
CC-G2PnP: Streaming Grapheme-to-Phoneme and prosody with Conformer-CTC for unsegmented languages
BEST-STD: Bidirectional Mamba-Enhanced Speech Tokenization for Spoken Term Detection, code — MIT
Language-Agnostic Speech Tokenizer for Spoken Term Detection with Efficient Retrieval, code
Scaling Open Discrete Audio Foundation Models with Interleaved Semantic, Acoustic, and Text Tokens
hyperreality/American-British-English-Translator
ELITR/online-text-flow — Online event streaming to improve data and text flows
kyutai-labs/delayed-streams-modeling — Kyutai’s Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.
cohogain/whisper-large-v2-ga-IE
Irish
Teanga Submissions, Special edition — 30 June 2026