Current

OlaWod/FreeVC

RaPID-5

Harvard sentences

Metavoice demo colab

ffmpeg - concatenate:

printf "file '%s'\n" *.wav > mylist.txt
ffmpeg -f concat -i mylist.txt -c copy output.mkv
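
The `printf` one-liner above produces a broken list entry when a filename contains a single quote. A minimal sketch of a safer list builder follows; `make_concat_list` is a hypothetical helper name, not part of ffmpeg, and it assumes ffmpeg's documented quoting rule of writing an embedded single quote as `'\''`.

```shell
#!/bin/sh
# Hypothetical helper (not part of ffmpeg): print one concat-list line per file.
make_concat_list() {
    for f in "$@"; do
        # Escape each embedded single quote as '\'' so the quoted entry stays valid.
        esc=$(printf '%s' "$f" | sed "s/'/'\\\\''/g")
        printf "file '%s'\n" "$esc"
    done
}
```

Usage would look like `make_concat_list *.wav > mylist.txt`, followed by the same `ffmpeg -f concat -i mylist.txt -c copy output.mkv` call. If ffmpeg rejects an entry as an unsafe file name (absolute paths or unusual characters), adding `-safe 0` before `-i` is the usual workaround.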

FAVE-extract

The TORGO Database: Acoustic and articulatory speech from speakers with dysarthria

TTS Arena

Interspeech 2024 CfP

PolyAI-LDN/pheme

Dysarthria Classification

SIKOR North Saami corpus, DOI

SIKOR North Saami free corpus

GiellaLT Translation Memories

Code LoRA from Scratch

kyegomez/AudioFlamingo — Implementation of the model “AudioFlamingo” from the paper: “Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities”

Map of Swedish counties

Map of Swedish dialects


Peppa Malac - A Nagy Beteg

Peppa Malac - Az álruha, Peppa Pig - Dressing Up

A besúgó - 1. rész

DubDB - Hungarian


lucidrains/RETRO-pytorch — Implementation of RETRO, Deepmind’s Retrieval based Attention net, in Pytorch

The Illustrated Retrieval Transformer

IDEFICS perceiver

SpiRit-LM: Interleaved Spoken and Written Language Model

NeMo - rnnt

espnet - Support external dataset library

DeAL: Decoding-time Alignment for Large Language Models

Textually Pretrained Speech Language Models, code, project

Master 6 French Tenses In Just 10 Minutes

MINISTRY - New Religion

RTE 2FM Classic Irish Track Uncovered - Sally by Kerbdog with Cormac Battle

CNChTu/FCPE — Fast Context-based Pitch Estimation

TSP Speech Database

Scalable Diffusion Models with Transformers (Meta), GitHub: facebookresearch/DiT, not open source.

Diffusion Models for Audio Restoration

LibriSpeech Alignments

NST N-gram – Swedish