LLaVA-VL/LLaVA-NeXT

Adapting WavLM for Speech Emotion Recognition

SWAN: SubWord Alignment Network for HMM-free word timing estimation in end-to-end automatic speech recognition

drmfinlay/pyjsgf

kaldiasr/kaldi: docker run -it --runtime=nvidia kaldiasr/kaldi:gpu-latest

espnet/owsm_v3

ljuvela/GlotNet

DoDi’s Visual Basic 4 Decompiler

brandjamie/midi2hydro_pattern

libyal — libraries for many obscure file formats

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching (code; model not open).

Even with a vast amount of data, the samples on their demo page still contain errors.

The Book of Shaders

FLAN templates

google-research/nisaba

KBLab/rixvox

Lauler/rixvox-alignments

MediaPipe Pose

Google Research deleted EEG stuff

tracel-ai/burn — Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals.

lutzroeder/netron — Visualizer for neural network, deep learning and machine learning models

FSM card generator