Qwen3-Omni-30B-A3B (30B/3B active MoE) lands in llama.cpp: image + audio input, text output. Qwen3-ASR 1.7B (speech-to-text only) also supported. GGUFs from ggml-org: Q4_K_M 18.6 GB, Q8_0 32.5 GB. Apache 2.0. llama-server -hf ggml-org/Qwen3-Omni-30B-A3B-Instruct-GGUF
21Apr 14, 2026, 1:18 AM