The Mission
Democratizing the metrics of intelligence.
Neural Inference was born from a simple frustration: the benchmarks for AI hardware were disconnected from reality. Reviewers would test raw matrix-multiplication speed or run training simulations, but no one was answering the question that matters to developers: how fast can I actually run an agent on this laptop?
We are building the world's first open-source laboratory dedicated to Local Inference. We don't use simulations. We don't use cloud APIs. Every number on this site represents a real quantized model (GGUF/ExLlama) running on physical hardware in our facility.
Our Methodology
- 100% Local: All benchmarks are run offline to isolate hardware performance.
- Real Contexts: We test at 4k, 32k, and 64k context windows to expose the memory-bandwidth bottlenecks that short tests miss (see the sketch below).
- Consumer Focus: We prioritize hardware you can buy at a store—from the GTX 1080 to the M4 Max.
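If you want to run this kind of context-window sweep on your own machine, here is a minimal sketch using llama-cpp-python. This is not our exact harness; the model path, the prompt construction, and the 128-token generation budget are all illustrative placeholders.

```python
import time

from llama_cpp import Llama  # pip install llama-cpp-python

MODEL_PATH = "model.gguf"  # placeholder: any local quantized GGUF model

# Sweep the same context windows we benchmark: 4k, 32k, and 64k.
for n_ctx in (4096, 32768, 65536):
    llm = Llama(
        model_path=MODEL_PATH,
        n_ctx=n_ctx,
        n_gpu_layers=-1,  # offload as many layers as the GPU can hold
        verbose=False,
    )

    # Build a prompt that nearly fills the context window so the KV cache
    # is actually exercised; " the" is roughly one token in most tokenizers.
    prompt = " the" * (n_ctx - 256)

    start = time.perf_counter()
    out = llm(prompt, max_tokens=128)
    elapsed = time.perf_counter() - start

    generated = out["usage"]["completion_tokens"]
    print(f"ctx={n_ctx:>5}: {generated / elapsed:.1f} tok/s (end-to-end)")

    del llm  # free the model before loading the next configuration
```

Note that this sketch reports a single end-to-end number, conflating prompt processing with generation; a proper harness times the two separately, since long-context decode, where every new token must read back the full KV cache, is exactly where memory bandwidth bites.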
Support the Lab
Neural Inference is an independent project. The best way to support our testing is to use the affiliate links found on our Hardware dashboard when upgrading your own rig.
View Recommended Hardware