Google releases Gemma 4 12B, a compact multimodal model without separate encoders

Hacker News·5d·Google

Google's new Gemma 4 12B handles text, images, and video in one unified architecture instead of relying on separate encoders. For indie developers, this means a smaller model footprint and simpler local deployment without sacrificing multimodal capabilities.
