How the model was built
Lola is a 3B-parameter Llama model fine-tuned with LoRA on Oumi, using about 800 examples that a larger model generated from author-written seeds. She currently runs on Oumi’s hosted inference, but she can be deployed anywhere and does not need expensive hardware.
The thesis
The right-sized model uses less power, costs less to run, and keeps your data closer. The biggest model is rarely the best answer.
How she decides
- Use case + budget + traffic + privacy + latency → 5-line answer.
- Six recommendations: SLM-only, Hybrid, API-only, Just-prompt-for-now, Automation, No-AI.
- Footprint figures are validated for W↔kWh consistency before display.
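As an illustration of the W↔kWh check mentioned in the last bullet, here is a minimal sketch in Python. It is not Lola's actual validation code: the `Footprint` fields, the `is_consistent` helper, and the 1% tolerance are all assumptions made for this example. The only fixed fact it relies on is the unit relation kWh = W × hours / 1000.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    """Hypothetical footprint record; field names are assumptions."""
    avg_power_w: float  # average power draw, in watts
    hours: float        # how long that draw is sustained, in hours
    energy_kwh: float   # reported energy, in kilowatt-hours


def expected_kwh(avg_power_w: float, hours: float) -> float:
    """Convert an average power draw and a duration into kilowatt-hours."""
    return avg_power_w * hours / 1000.0


def is_consistent(fp: Footprint, rel_tol: float = 0.01) -> bool:
    """Return True if the reported kWh matches W x hours within rel_tol.

    The 1% default tolerance is an illustrative choice, not a known setting.
    """
    if fp.avg_power_w < 0 or fp.hours < 0 or fp.energy_kwh < 0:
        return False  # negative figures are never displayable
    return math.isclose(
        fp.energy_kwh,
        expected_kwh(fp.avg_power_w, fp.hours),
        rel_tol=rel_tol,
        abs_tol=1e-9,  # lets an all-zero footprint pass
    )


# A 50 W device running for 24 hours should report 1.2 kWh.
good = Footprint(avg_power_w=50.0, hours=24.0, energy_kwh=1.2)
# The same draw reported as 12 kWh is off by a factor of ten.
bad = Footprint(avg_power_w=50.0, hours=24.0, energy_kwh=12.0)
```

Under these assumptions, a figure like `bad` would be held back before display, since a factor-of-ten slip is the kind of unit error such a check is meant to catch.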
Stay in touch
Reach out at [email protected].
Our avatar is named for Lola, a tiny Morkie with a mighty heart who left us too soon.