ByteShape Blog

Insights, research, and updates on AI acceleration technology

Qwen 3.5 35B A3B Release
10 April 2026 Model Optimization

Blackwell Picks Favorites: Qwen 3.5 35B A3B

ByteShape's ShapeLearn-quantized release of Qwen 3.5 35B A3B, an MoE model on which CPUs are surprisingly consistent but GPUs are much pickier. See the best quality/speed trade-offs across RTX 4090, 4080, 5090, RTX Pro 6000 Blackwell, Intel i7, Ryzen 9, Ultra 7, and Raspberry Pi.

Qwen 3.5 9B Release
31 March 2026 Model Optimization

Happy GPUs, Moody CPUs: Qwen 3.5 9B

ByteShape's ShapeLearn-quantized release of Qwen 3.5 9B. GPUs agree on the best models, while CPUs have strong opinions. See the best quality/speed trade-offs across RTX 5090, 4080, 3090, 5060 Ti, Intel i7, Ryzen 9, Ultra 7, and Raspberry Pi.

Qwen3-30B TPS Optimization
5 January 2026 Model Optimization

A 30B Qwen Model Walks Into a Raspberry Pi… and Runs in Real Time

ByteShape's device-optimized release of Qwen3-30B-A3B-Instruct-2507, showing superior trade-offs between tokens per second (TPS) and quality across Raspberry Pi, Intel CPUs, and NVIDIA GPUs. Discover how we treat memory as a budget and optimize for what matters: the balance of speed and quality. The model runs in real time on a Raspberry Pi at 8+ TPS with 94% accuracy retention.
