Finetuning Qwen models to do interesting and useful things.

Demos

prompt-router — a retrieval-augmented router built on Qwen/Qwen3-Embedding-8B. Given a prompt, it embeds it, retrieves the most similar past prompts, and routes to the cheapest model that will still answer well (Qwen3-vLLM → Haiku → Sonnet → Opus). Minimal terminal-style interface, running live on ZeroGPU.
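The routing step described above can be sketched roughly as follows. This is a minimal illustration, not the Space's actual code: the `PastPrompt` record, the per-tier `quality` scores, the `k` and `threshold` parameters, and the similarity-weighted voting rule are all assumptions made for the example. Embeddings are passed in as plain vectors, where the real router would get them from Qwen/Qwen3-Embedding-8B.

```python
import math
from dataclasses import dataclass

# Tiers in ascending cost order, as listed in the description above.
TIERS = ["Qwen3-vLLM", "Haiku", "Sonnet", "Opus"]


@dataclass
class PastPrompt:
    embedding: list[float]     # precomputed prompt embedding (assumed)
    quality: dict[str, float]  # hypothetical per-tier answer quality in [0, 1]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(query_embedding, history, k=3, threshold=0.8):
    """Return the cheapest tier whose estimated quality clears the threshold."""
    # Retrieve the k most similar past prompts.
    scored = sorted(
        ((cosine(query_embedding, p.embedding), p) for p in history),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]

    # Weight each neighbour by its similarity, ignoring negative similarities.
    weights = [(max(sim, 0.0), p) for sim, p in scored]
    total = sum(w for w, _ in weights)
    if total == 0:
        return TIERS[-1]  # no usable evidence: fall back to the strongest tier

    # Walk the tiers from cheapest to most expensive.
    for tier in TIERS:
        estimate = sum(w * p.quality.get(tier, 0.0) for w, p in weights) / total
        if estimate >= threshold:
            return tier
    return TIERS[-1]


# Toy history: two "easy" prompts near [1, 0] and two "hard" ones near [0, 1].
history = [
    PastPrompt([1.0, 0.0], {"Qwen3-vLLM": 0.9, "Haiku": 0.95, "Sonnet": 0.95, "Opus": 0.97}),
    PastPrompt([0.9, 0.1], {"Qwen3-vLLM": 0.85, "Haiku": 0.9, "Sonnet": 0.95, "Opus": 0.97}),
    PastPrompt([0.0, 1.0], {"Qwen3-vLLM": 0.2, "Haiku": 0.5, "Sonnet": 0.9, "Opus": 0.95}),
    PastPrompt([0.1, 0.9], {"Qwen3-vLLM": 0.3, "Haiku": 0.6, "Sonnet": 0.85, "Opus": 0.95}),
]

easy_choice = route([1.0, 0.0], history, k=2)
hard_choice = route([0.0, 1.0], history, k=2)
```

With this toy history, a query close to the easy prompts stays on the cheapest tier, while a query close to the hard prompts escalates until a tier clears the threshold. Falling back to the most expensive tier when there is no evidence is one conservative choice among several.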

🔒 Private Space: hosted under masoudc/ (a PRO account, which gives free ZeroGPU access); request access to try it.