We are all still abuzz from the first-ever Open Source Data Summit, a live virtual event that attracted thousands of registrants from around the world. More than 30 speakers covered the past, present, and future of open source as a way to innovate with, and extract value from, the ever-growing tsunami of data that is sweeping across (and increasingly helping to define) our world, from run-of-the-mill business reporting to the latest innovations in generative AI.
As the founding co-chair of the Summit, we here at Onehouse believe that open source software will increasingly power the core infrastructure on which next-generation data services are built. The world in which data engineers, data scientists, data analysts, and many others do their work will be an increasingly open world, and we helped launch this new event to make that happen faster and better. As part of our shared commitment to this proposition, the Summit starred many open source leaders, committers, and contributors.
Don’t worry if you missed the conference live: recordings are coming soon at https://opensourcedatasummit.com! On the day, up to three breakout sessions ran at a time, so no one could possibly have seen everything live. The recordings will let you watch every session at your own pace.
Onehouse Founder and CEO Vinoth Chandar led off the day with an overview of the role of open source in data infrastructure. “As someone who fumbled his way through building Uber's early data infrastructure…,” he shared, and brought the audience with him through a description of the growing role of OSS. Visit Vinoth's keynote.
The Summit saw the first public discussion of OneTable, a recently open-sourced project that allows omni-directional interoperability across Apache Hudi, Apache Iceberg, and Delta Lake. (For more on lakehouse projects, visit our comparison blog post.) OneTable was built, and is currently co-owned, by a partnership of Onehouse, Microsoft, and Google. Click to view the OneTable panel discussion.
Journalists had the opportunity to interview senior leaders of these companies. Here is some of what they had to say, as quoted in the VentureBeat article:
Vinoth Chandar, CEO Onehouse
"Throughout this year, we’ve been working with our customers as well as with Google and Microsoft and a bunch of different folks to broaden the idea and bring more form and shape to it."
Raghu Ramakrishnan, CTO Azure Data
"Ultimately, my real hope here is that together, we can create an ecosystem where customers can go to whatever is the best solution without being shackled by the underlying data."
Gerrit Kazmaier, VP/GM Analytics Google
“There are free and open formats like Iceberg, but then there may be other workloads running that depend on a different format that is not your chosen primary file format. That’s where OneTable helps; it’s kind of like a Babelfish.”
The fan favorite live quote during the OSDS session was from Tim Brown at Onehouse:
“You don’t want to be left wondering what your life would have been if you had chosen the other format.”
The leadership panel assembled open source pioneers from – wait for it – Confluent, Google, LinkedIn, Microsoft, Onehouse, Starburst, and Uber. Chaired by Onehouse CEO Vinoth Chandar, the panel discussed “The Growing Role of Open Source Technology in Today’s Data Architectures.” Click to visit the leadership panel.
A few key quotes:
Near the end of the discussion, Raghu Ramakrishnan shared a conclusion: “We are evolving as an industry toward a risk architecture for a unified analytic portal.” We believe that people will be playing back this session on repeat for a long time to come.
Link to session coming soon.
Apache Hudi, launched at Uber in 2016, is growing up – soon to reach the 1.0 milestone. Bhavani Sudha Saktheeswaran (widely known as Sudha) and Sagar Sumit, software engineers at Onehouse, led a session titled “Apache Hudi 1.0 preview: A database experience on the data lake.”
As one attendee put it in the comments, “Some of these terms / features are only associated with databases and have never been heard of before in the data lakes / lakehouse ecosystem. Hudi… leading the pack in such innovations!” Click to visit the panel discussion.
Including the digital leaders in the opening panel, the mega-companies that were (well) spoken for by presenters and panelists are Amazon, Google, Intuit, Microsoft, Netflix, Tesla, Uber, and Walmart. Our stellar speaker line-up also included data folk from up-and-coming and established companies, including Acryl Data, Apna, Confluent, DataStax, Eastern Bank, InfluxData, JobTarget, Lyra, Quix, Robinhood, Starburst, Tecton, and Wayfair.
And special thanks to our fellow sponsors: Acryl Data, ClickHouse, DataStax, InfluxData, Starburst, and Tecton.
Several sessions focused on putting the lakehouse to work; click to view a session:
Several sessions concerned pushing technology, and the community, forward; click to view a session:
Creating and helping to put on an event like this is always a ton of work – and several tons of fun. The event organizers, Solution Monday, expertly walked all involved through the process. You can visit Open Source Data Summit, view Solution Monday's previous events, or reach out by email at email@example.com to participate in their future events.
Now that we've closed on this exciting day, we expect a raft of questions about our managed service offering, also called Onehouse. Onehouse is powered by Apache Hudi, and amplified by OneTable. If you’d like to know more, visit our website or contact us.