Interesting links, 29/01/2024

Gaussian Adaptive Attention is All You Need: Robust Contextual Representations Across Multiple Modalities, code

EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty, code

The 3 papers I summarized above:

- Averaging Weights Leads to Wider Optima and Better Generalization, Izmailov et al. (2018), https://t.co/S4LsGICsnH

- Early Weight Averaging meets High Learning Rates for LLM Pre-training, @SunnySanyal9 et al. (2023),…
— Sebastian Raschka (@rasbt) January 24, 2024

The Brain Processes Speech in Parallel With Other Sounds

SonicVisionLM — no paper, no code

makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch

VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech

resemble-ai/Resemblyzer

ml-explore/mlx

lucidrains/meshgpt-pytorch — Implementation of MeshGPT, SOTA Mesh generation using Attention, in Pytorch

Real-time speech MRI datasets with corresponding articulator ground-truth segmentations, code, data, weights

A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images, data