Running LLMs offline on Apple Silicon can cost more than cloud inference

Hacker News·1w·datadrivenangel

A cost analysis comparing local inference on Apple Silicon against OpenRouter's API pricing finds that the upfront hardware cost and electricity use often make cloud APIs cheaper per inference, especially for variable workloads. For indie makers building AI features, this challenges the assumption that running models locally is always more economical.
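
The comparison comes down to simple arithmetic, which can be sketched as below. This is a minimal illustration and not the analysis's own model: the function name, its parameters, and every number in the usage example are hypothetical placeholders, and the sketch assumes straight-line hardware amortization with no resale value and counts electricity only while the machine is generating tokens.

```python
def local_cost_per_million_tokens(
    hardware_usd: float,
    lifetime_years: float,
    utilization: float,
    tokens_per_second: float,
    watts: float,
    usd_per_kwh: float,
) -> float:
    """Amortized hardware plus electricity cost per one million generated tokens.

    Assumptions (not from the source analysis): straight-line amortization,
    no resale value, idle power draw ignored.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    if tokens_per_second <= 0 or lifetime_years <= 0:
        raise ValueError("tokens_per_second and lifetime_years must be positive")

    # Seconds the machine actually spends generating over its lifetime.
    active_seconds = lifetime_years * 365 * 24 * 3600 * utilization
    total_tokens = active_seconds * tokens_per_second

    # Hardware cost spread over every token the machine will ever produce.
    hardware_per_million = hardware_usd / total_tokens * 1_000_000

    # Energy needed to generate one million tokens (1 kWh = 3.6e6 joules).
    seconds_per_million = 1_000_000 / tokens_per_second
    kwh_per_million = watts * seconds_per_million / 3.6e6
    energy_per_million = kwh_per_million * usd_per_kwh

    return hardware_per_million + energy_per_million


if __name__ == "__main__":
    # Every value here is a made-up placeholder; substitute measured figures.
    api_usd_per_million = 0.50  # hypothetical cloud API price
    for utilization in (0.05, 0.25, 1.0):
        local = local_cost_per_million_tokens(
            hardware_usd=3000,
            lifetime_years=3,
            utilization=utilization,
            tokens_per_second=30,
            watts=60,
            usd_per_kwh=0.15,
        )
        verdict = "local cheaper" if local < api_usd_per_million else "API cheaper"
        print(f"utilization {utilization:.0%}: ${local:.2f} per 1M tokens ({verdict})")
```

Utilization drives the result: the hardware term shrinks in proportion to how busy the machine is, while the electricity term per token stays constant. That is why variable or bursty workloads tend to favor pay-per-token APIs.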
