Bodo is Going Open Source – HPC Meets Python for data and AI

Date: January 27, 2025
Author: Ehsan Totoni

When we started Bodo, we had a clear mission: to make high-performance computing (HPC) as simple as possible. Along the way, we tackled some of the most challenging problems in parallel computing—achieving unprecedented performance and efficiency while maintaining ease of use.

Today, we are proud to take that mission to the next level: The Bodo compute engine is now open source, bringing HPC to all data and AI workloads.

By parallelizing and compiling Python workloads, Bodo delivers extraordinary performance, accelerates AI pipelines, and significantly reduces cloud costs. Now, we’re sharing that power with the community, enabling all of us to unlock even greater possibilities together.

Here’s why we’re going open source—and what it means for you.

Why Open Source?

We’ve cracked the code on HPC for Data and AI with Python—and now it’s time to set it free, empowering everyone to scale their Python workloads effortlessly. We believe we’ve built something transformative and want software developers everywhere, from startups to enterprises, to leverage our code.

Python’s strength lies not only in its versatility as a language but also in the community that drives it forward. From libraries like NumPy and Pandas to frameworks like TensorFlow and PyTorch, the tools that power Python’s success are born out of a spirit of openness and shared progress.

When in Rome … 

The Python ecosystem is one of the largest open-source ecosystems, and we are fully committed to being an integral part of it. That’s why we’ve decided to go all-in and open-source Bodo’s core compute engine under the Apache license. We’re giving back to the community that has inspired us from the beginning, ensuring everyone can benefit from what we’ve built—no strings attached.

Bodo’s journey has been deeply influenced by the open-source ecosystem. We’ve proudly supported projects like Numba, contributing to advancements in Python’s performance and capabilities. This experience has strengthened our belief in the power of collective effort to tackle complex challenges.

Whether you’re working on massive data pipelines, complex AI workloads, or something entirely new, Bodo can help you run faster and more efficiently, without unnecessary overhead—and now it’s free to use.

About the Bodo Compute Engine

Built on over a decade of HPC research, the Bodo Compute Engine transforms standard Python code into efficient, parallelized execution without requiring changes to the codebase. By eliminating the complexity of traditional HPC, Bodo empowers data engineers, data scientists, and AI/ML practitioners to accelerate data processing and model training, optimize analytics pipelines, and reduce cloud and energy costs. With support for familiar tools and libraries, Bodo makes it possible to achieve performance at scale, with no HPC expertise required.
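As a rough sketch of what that can look like in practice: Bodo's public documentation describes a `bodo.jit` decorator as the entry point to the compiler. That name is an assumption here, since this post does not spell it out. The function body stays ordinary Python. The loop below is deliberately dependency-free and only shows the pattern; the parallel speedups are aimed at pandas and NumPy style code, and the fallback decorator exists solely so the snippet still runs where Bodo is not installed.

```python
# Assumption: `bodo.jit` is the decorator described in Bodo's public docs.
try:
    import bodo

    jit = bodo.jit
except ImportError:
    # Fallback so this sketch still runs without Bodo: a no-op decorator.
    def jit(func):
        return func


@jit
def mean_of_squares(n):
    # Ordinary Python; the body is the same with or without the decorator.
    total = 0.0
    for i in range(n):
        total += i * i
    return total / n


if __name__ == "__main__":
    print(mean_of_squares(1_000_000))
```

The design point is that opting in happens at the function boundary, so existing logic does not need to be restructured around a new API.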

We believe you shouldn’t need to use multiple specialized tools or rewrite everything to get the best out of your code. At its core, Bodo combines two unique components:

  1. Auto-Parallelizing Python Compiler: Automatically transforms your Python code into efficient, parallelized execution without requiring HPC expertise.
  2. Vectorized SQL Engine: Optimizes structured workloads with advanced features like predicate pushdowns, partition pruning, and cost-based optimizations.

These work together to handle both structured (SQL) and unstructured (Python) workloads efficiently, without wasting resources or adding extra complexity, whether you're using Iceberg tables, S3, or other data sources.
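To make two of the optimizations named above more concrete, here is a toy illustration of partition pruning and predicate pushdown in plain Python. It is a conceptual sketch only, not how Bodo's SQL engine is implemented; the `Partition` class, the `scan` function, and the year-based layout are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """One file of a table partitioned by year (hypothetical layout)."""
    year: int
    amounts: tuple  # row values stored in this partition


def scan(partitions, min_year, min_amount):
    """Return matching amounts plus how many partitions and rows were read."""
    partitions_read = 0
    rows_read = 0
    result = []
    for part in partitions:
        # Partition pruning: the partition key alone rules this file out,
        # so none of its rows are ever touched.
        if part.year < min_year:
            continue
        partitions_read += 1
        for amount in part.amounts:
            rows_read += 1
            # Predicate pushdown: apply the filter during the scan instead
            # of materializing every row and filtering afterwards.
            if amount >= min_amount:
                result.append(amount)
    return result, partitions_read, rows_read


if __name__ == "__main__":
    table = [
        Partition(2022, (10, 500)),
        Partition(2023, (20, 700)),
        Partition(2024, (900,)),
    ]
    print(scan(table, min_year=2023, min_amount=100))  # ([700, 900], 2, 3)
```

The 2022 partition is skipped without being read, which is where the savings come from when a table has thousands of files rather than three.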

With its dual-language support and intelligent optimizations, Bodo empowers teams to handle large-scale, complex workloads more effectively while delivering exceptional speed, cost efficiency, and scalability.

  • Better Performance: High-granularity parallelism and reduced overhead enable up to 270x faster execution across diverse workloads.
  • Lower Costs and Energy Use: Efficient hardware utilization cuts operational costs and reduces energy consumption.
  • Focus on Critical Workloads: Directs computational power to priority tasks, ensuring timely and actionable insights.

How to Get Started with Bodo

1. Install Bodo.

Bodo can be installed using pip or Conda, whichever Python package manager you prefer.

Using pip:

pip install -U bodo

Using Conda:

conda create -n Bodo python=3.12 -c conda-forge
conda activate Bodo
conda install bodo -c bodo.ai -c conda-forge
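
After installing, a quick way to confirm the package is visible to your interpreter is a small check like the one below. This is a minimal sketch using only the Python standard library; `describe_install` is a hypothetical helper name, and it assumes the distribution and the import name are both `bodo`, as the install commands above suggest.

```python
from importlib import metadata, util


def describe_install(package="bodo"):
    """Report whether `package` is importable and, if known, its version."""
    # find_spec locates a top-level package without importing it.
    if util.find_spec(package) is None:
        return f"{package} is not installed in this environment"
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata under this name.
        version = "unknown"
    return f"{package} is installed (version {version})"


if __name__ == "__main__":
    print(describe_install())
```

Run it inside the same environment you installed into (for Conda, after `conda activate Bodo`), since each environment has its own set of packages.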

2. Read the Docs.

The package includes a Getting Started Guide. You’ll find:

  • Step-by-step instructions to set up and run your first Bodo project.
  • A pre-built example script that demonstrates Bodo in action.
  • Guidance on integrating Bodo with common data sources and libraries.

3. Join the Community.

We’re building an active and supportive community to help you make the most of Bodo. Whether you have technical questions, need help troubleshooting, or want to share your feedback, join our Slack community channel to connect with other users and the Bodo team for real-time support.

We can’t wait to see what you build. Let us know how we can help, and join us in shaping the future of Python and high-performance computing.
