@misc{Duquenne:2023:sonar_arxiv,
  author = {Paul-Ambroise Duquenne and Holger Schwenk and Benoit Sagot},
  title = {SONAR: Sentence-Level Multimodal and Language-Agnostic Representations},
  publisher = {arXiv},
  year = {2023},
  url = {https://arxiv.org/abs/2308.11466},
}

In general, this is quite useless, because only a handful of speech encoders are open source, while the the rest are CC-BY-NC, including the text encoder and decoder, so... in a pure open source environment, it's unusable. The licence refers to the models, not the code.

Fortunately, the Swedish model is one of the open ones, and I'm only interested in encoding speech.

SWE_MODEL = "https://dl.fbaipublicfiles.com/SONAR/spenc.v3ap.swe.pt"

from sonar.models.sonar_speech.loader import load_sonar_speech_model

speech_encoder_model = load_sonar_speech_model("sonar_speech_encoder_swe", device=device).eval()