Make supercomputing
feel super

A research compute platform built for AI. Launch on one node or one thousand, straight from our Python API. Run directly on bare-metal to maximize performance and minimize abstraction.

Sign up for free
No credit card required

Better foundations for your matrix multiplications

Clusterfudge is designed from the ground up for AI research.

Your workload is a Python dataclass. Request GPUs and specify replica counts. Sweep hyperparameters with a for-loop.
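A minimal sketch of what that launch flow could look like. The class and field names below (`Workload`, `gpus`, `replicas`, `args`) are illustrative assumptions, not the actual Clusterfudge API:

```python
from dataclasses import dataclass, field

# Hypothetical workload spec, assumed for illustration only; the real
# Clusterfudge dataclass and field names may differ.
@dataclass
class Workload:
    name: str
    gpus: int = 1          # GPUs requested per replica
    replicas: int = 1      # number of identical replicas to run
    args: dict = field(default_factory=dict)

# Sweep a hyperparameter with an ordinary for-loop: one workload per value.
sweep = [
    Workload(name=f"train-lr-{lr}", gpus=8, replicas=4, args={"lr": lr})
    for lr in (1e-4, 3e-4, 1e-3)
]

for w in sweep:
    print(w.name, w.gpus, w.replicas, w.args)
```

Because the spec is plain Python, sweeps, conditionals, and shared defaults come from the language itself rather than a templating layer.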

Hi, performance computing!

Onboard new clusters in minutes, not weeks. Track utilization and hardware health, from a single GPU to your entire InfiniBand network.

One-line install

Ready to go out of the box

Install the Fudgelet on your own hardware using industry-standard tools. No external dependencies.

People-first scheduling

A scheduler that explains its decisions.

The Clusterfudge scheduler makes transparent decisions. If a workload is unschedulable, we’ll tell you why. Workloads are never preempted by the scheduler. Advanced scheduling features, like gang-scheduling and replica failure policies, are built-in.
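To illustrate the gang-scheduling idea, here is a toy admission check, written from scratch for this page rather than taken from Clusterfudge's scheduler: either every replica gets a placement, or none do, and the caller is told why.

```python
def gang_schedule(replicas, gpus_per_replica, free_gpus_per_node):
    """Place all replicas or none; return (placement, reason).

    Toy first-fit sketch, not Clusterfudge's actual algorithm.
    placement maps each replica to a node index, or is None if the
    gang cannot be admitted as a whole.
    """
    placement = []
    free = list(free_gpus_per_node)  # copy so we can tentatively allocate
    for _ in range(replicas):
        for node, gpus in enumerate(free):
            if gpus >= gpus_per_replica:
                free[node] -= gpus_per_replica
                placement.append(node)
                break
        else:
            # Explain the decision instead of failing silently.
            return None, (f"unschedulable: only {len(placement)} of "
                          f"{replicas} replicas fit")
    return placement, "scheduled"

print(gang_schedule(2, 4, [8, 2]))  # both replicas land on node 0
print(gang_schedule(3, 4, [8, 2]))  # third replica cannot fit: admit none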
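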

Quotas

Partition your resources for teams and projects.

Quotas limit the resources a team or project can consume. They prevent any one group from monopolizing the cluster.
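The bookkeeping behind a quota can be sketched in a few lines. This is an illustrative model, not the real Clusterfudge implementation: a quota caps the GPUs a team may hold across its running workloads, and a request is admitted only if it fits.

```python
from dataclasses import dataclass

# Hypothetical quota model, assumed for illustration; Clusterfudge's
# actual quota semantics may differ.
@dataclass
class Quota:
    team: str
    gpu_limit: int
    gpus_in_use: int = 0

    def try_allocate(self, gpus: int) -> bool:
        """Admit the request only if it fits within the team's limit."""
        if self.gpus_in_use + gpus > self.gpu_limit:
            return False
        self.gpus_in_use += gpus
        return True

research = Quota(team="research", gpu_limit=64)
print(research.try_allocate(48))  # True: 48 fits within 64
print(research.try_allocate(32))  # False: 48 + 32 exceeds 64
```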

Pricing

Free for academia, open-source, or personal projects.

$0 / GPU / hour

Academia, open-source or personal projects.

Free

  • Fully Managed
  • Unlimited CPUs
  • Limited to 1000 GPUs
  • One-click notebook deployment
Get started

$0.04 / GPU / hour

Labs managing resources across multiple clusters and projects.

Teams

  • Everything in Free plus:
  • Unlimited GPUs
  • Multiple clusters
  • Teams and ACLs
  • Advanced Reporting
Get started

Contact Us

Bespoke Clusterfudge deployments for large organizations.

Enterprise

  • Everything in Teams plus:
  • Fully Managed or On-Prem
  • 24/7 Support
  • Custom SLAs
  • Custom Integrations
Get started

Frequently asked questions (FAQs)

If you have a question that is not answered here, please email us at [email protected].

    • Who are you?

      We're a venture-funded team who have previously built research compute platforms at places like Google DeepMind and Two Sigma. We work from a sunny office in Chancery Lane, London, UK.

    • Why is it called Clusterfudge?

      Clusterfudge was born out of a belief that AI researchers, and the GPUs they use, deserve better than the software currently available.

    • How does this compare to Slurm?

      Clusterfudge is an alternative to Slurm, and we believe it has a few advantages. (1) With Slurm, you're responsible for running both the compute nodes and the control plane. With Clusterfudge, we run the control plane. (2) We offer an easier launch experience. Rather than SSHing in and writing Slurm scripts, researchers specify their jobs in Python or via our web interface.

    • How does this compare to Kubernetes?

      Clusterfudge is an alternative to Kubernetes, and simpler to operate. Advanced scheduling features, like gang-scheduling, are built in. Clusterfudge maintains a history of jobs launched on the cluster, and doesn't require applications to run in containers.

    • Do you sell compute?

      Not currently. Our software is designed to run on any cloud or on-prem, so long as it can access our API. You're free to bring your own compute hardware from any provider. We've worked with several hardware vendors and would be happy to recommend them.

    • Is Clusterfudge generally available?

      Clusterfudge is currently in beta. We're actively adding users from our waitlist as we grow our capacity.