Today I’m thrilled to announce our partnership with Snowflake, accompanied by their strategic investment in Bodo.
This partnership reflects our joint commitment to helping enterprise data engineers and data scientists achieve extreme-performance analytics and machine learning (ML) on the Snowflake data platform, especially with large datasets. Together we will deliver new computing approaches for Python on the Snowflake Data Cloud that dramatically increase compute speed compared to alternatives. Data professionals will benefit directly from simpler and faster ways to get results, opening up entirely new business opportunities.

Bodo users will also gain simpler access to large datasets on the Snowflake data platform, and the ability to manipulate Snowflake data with Python analytics (including Pandas, scikit-learn, NumPy, and others) at nearly unlimited compute scale. Ultimately, this move makes that computing power easier to reach by removing the need for add-on performance libraries. More data professionals working with large datasets will be able to prototype, move to lightning-fast production, and do so more quickly and easily than ever before.
Enterprise data engineers and data scientists typically handle large datasets by rewriting Python code for alternatives such as Spark or Dask. Unfortunately, these approaches are costly in time, labor, and learning curves. And while they may improve speed, they are ultimately distributed computing technologies that use infrastructure inefficiently (think: schedulers and wait states). Bodo solves the performance problem in an entirely different way, one that approaches the theoretical efficiency limit of parallel computing. With Bodo, regular Python code and syntax are parallelized into machine code, something only experts could do until now. Computer scientists have long considered this a “holy grail”: the simplicity of high-level programming combined with the performance of true machine-code-level parallelism.
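To make the idea concrete, here is a minimal sketch of the kind of workload this describes: a standard Pandas function that Bodo can compile to parallel machine code via its `@bodo.jit` decorator, with no rewrite of the Pandas logic itself. The decorator is left as a comment so the sketch runs with plain Pandas; the function, column names, and sample data are illustrative, not from the announcement.

```python
import pandas as pd

# With Bodo installed, uncommenting the @bodo.jit decorator below would
# compile this ordinary Pandas function to parallel machine code, with
# no other changes to the code. It is commented out here so the sketch
# runs with Pandas alone.
# @bodo.jit
def daily_revenue(df):
    # Plain Pandas: a vectorized column computation plus a groupby.
    df["revenue"] = df["price"] * df["quantity"]
    return df.groupby("day", as_index=False)["revenue"].sum()

# Small example input, standing in for a large table of orders.
orders = pd.DataFrame({
    "day": ["2021-06-01", "2021-06-01", "2021-06-02"],
    "price": [10.0, 20.0, 5.0],
    "quantity": [2, 1, 4],
})
result = daily_revenue(orders)
```

The point of the sketch is that nothing in the function body is Bodo-specific: the same code prototyped on a laptop is what gets parallelized, which is the simplicity-plus-performance combination described above.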
This partnership is part of a commitment to the overall Python ecosystem, and to bringing Bodo’s technology to the Snowpark developer community. Data engineers and data scientists can leverage the Snowflake platform to build even more scalable and optimized data pipelines and applications. This move will help accelerate the pace of innovation, using Python’s familiar syntax and technology ecosystem to explore and process data where it lives. Our joint expectation is that Snowflake users who prefer Python will benefit directly from these capabilities.
Looking to the future with Snowflake and our mutual customers, our mission is clear.
First and foremost, we want to help accelerate the pace of innovation for data engineers and data scientists using Snowflake. With Python’s familiar, popular syntax and huge ecosystem, we will help build even more scalable data pipelines and reach business outcomes faster.
Second, we want to democratize access to high-performance parallel computing. By closing the simplicity-vs-performance gap, we can make more powerful computing available to more data developers everywhere. We believe we’ve made a huge leap forward for computing, and hope to share it broadly.
Finally, expect to see Bodo and Snowflake make additional strides in our partnership in early 2022. In the meantime, we’ll work hard to deliver the performance, use cases, and simplicity that will delight our joint customers.