Google releases Gemma 4 12B, a single multimodal model for text and images

AI Devtools Open source

Hacker News·1mo·Google

Google's new Gemma 4 12B combines text and image understanding in one model without separate encoders, aimed at developers building on-device or cost-constrained AI applications. For indie makers, this means easier deployment of multimodal features without juggling multiple model architectures or managing complex pipelines.
