Bringing the Power of Google’s Infrastructure to your Apache Iceberg™ Lakehouse with BigQuery

May 21, 2025
Speaker
Victor Agababov
Software Engineer
Google

Apache Iceberg has become a popular table format for building data lakehouses, enabling multi-engine interoperability. This presentation explains how Google BigQuery leverages Google's planet-scale infrastructure to enhance Iceberg, delivering unparalleled performance, scalability, and resilience.

Transcript

AI-generated, accuracy is not 100% guaranteed.

Adam - 00:00:06  

All right, Victor, how you doing, man?  

Victor Agababov - 00:00:08  

I'm good. Good. How are you today?  

Adam - 00:00:10  

Where are you calling in from?  

Victor Agababov - 00:00:12  

I'm calling from Seattle.  

Adam - 00:00:13  

Seattle. Today we'll be talking about bringing the power of Google's infrastructure to Apache Iceberg. I'm very excited to hear your talk about BigQuery and how all these different pieces fit together. I will be back in 15 minutes, and we're gonna have an even longer session for Q and A.  

Victor Agababov - 00:00:35  

My name is Victor and I'm a software engineer at Google working on BigQuery. I work in a group that is tasked with BigLake and similar technologies, and Apache Iceberg is part of the technology that we want to talk about today. So I'm going to give a general overview of BigQuery, and then I'm going to discuss several existing integrations that we have that work with Apache Iceberg. Four years back, we didn't have anything; now we have several offerings, and we're hoping to have even more soon.  

In general, BigQuery, in case you don't know, is a fully managed and serverless system for data processing. It has completely decoupled storage and metadata layers, which scale independently and on demand. Basically, users and customers don't need to think about scaling, or especially about keeping resources running while there are no jobs or queries to execute.  

Victor Agababov - 00:01:51  

So this is different from traditional node-based data warehousing, where you may have to keep your nodes running at all times, even if they are not serving any workloads. This gives you great flexibility and cost control. The engine, known as Dremel, also scales extremely well horizontally and vertically, and provides a modern, SQL 2011-compliant SQL dialect, along with a CLI, a REST API, and other drivers that you can use to talk to BigQuery.  
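
For illustration, a minimal sketch of talking to BigQuery from the Python client library; the project name is a placeholder, the sample query runs against a public dataset, and credentials are assumed to come from the environment:

```python
# pip install google-cloud-bigquery
from google.cloud import bigquery

# Serverless: no cluster to provision or keep running;
# "my_project" is a placeholder.
client = bigquery.Client(project="my_project")

# Standard GoogleSQL against a public sample table.
query = """
    SELECT word, SUM(word_count) AS total
    FROM `bigquery-public-data.samples.shakespeare`
    GROUP BY word
    ORDER BY total DESC
    LIMIT 10
"""
for row in client.query(query).result():
    print(row.word, row.total)
```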

Now, the gamut of tables that BigQuery provides is pretty wide. On the left, there are self-managed Iceberg tables, or even self-managed external tables. There, the data is read-only: you manage all the data, you manage all the metadata, and BigQuery just provides querying. On the other side, you have the BigQuery-managed tables, which leverage all the capabilities that we have.  

Victor Agababov - 00:03:06  

The storage is managed, the metadata is managed, it's all fire and forget, and it gives you all the flexibility and control over the data and metadata. This flexibility and performance come at a cost: both the data and metadata are in a proprietary format, and unless you export or take out your data, you don't have direct access to it except through the methods that BigQuery provides, be that the query engine or the Read API. But it provides all the features that enterprises require: automatic scaling depending on load, compliance controls and features, disaster recovery, data residency requirements, column- and row-level security and whatnot, including materialized views and so on.  

Victor Agababov - 00:04:24  

And in the middle, we have our more recent offering, BigQuery tables for Apache Iceberg, which provides a middle ground. The metadata is still managed by BigQuery, but the data is in open-source Parquet, and the metadata can be exported from BigQuery in Iceberg format for use by external engines. We'll get to those details later.  
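
As a hedged sketch of what creating such a table looks like, following the DDL documented for BigQuery tables for Apache Iceberg (dataset, connection, and bucket names are placeholders):

```python
from google.cloud import bigquery

client = bigquery.Client(project="my_project")

# Metadata is managed by BigQuery; data lands as Parquet in your bucket.
ddl = """
CREATE TABLE my_dataset.orders (
  order_id INT64,
  order_ts TIMESTAMP
)
WITH CONNECTION `my_project.us.my_connection`
OPTIONS (
  file_format = 'PARQUET',
  table_format = 'ICEBERG',
  storage_uri = 'gs://my_bucket/orders'
)
"""
client.query(ddl).result()
```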

Victor Agababov - 00:04:50  

So, starting on the left, we have the self-managed Iceberg tables. They come in two flavors, and we'll go over the capabilities of each one. The first is external Iceberg tables. In that case, users basically have to manage the data and metadata themselves, which can be done with Spark or Flink engines. BigQuery basically receives a pointer to the metadata, and at that point it executes the standard Apache Iceberg querying process: it reads the manifest files, reads the Parquet files, and serves the query through the BigQuery execution engine that you are gaining.  
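
A minimal sketch of registering such a self-managed table, assuming the documented CREATE EXTERNAL TABLE syntax; the metadata JSON written by Spark or Flink is what BigQuery receives a pointer to (all names are placeholders):

```python
from google.cloud import bigquery

client = bigquery.Client(project="my_project")

# BigQuery only gets a pointer to the Iceberg metadata file;
# data and metadata remain fully user-managed, and the table is read-only.
ddl = """
CREATE EXTERNAL TABLE my_dataset.ext_orders
WITH CONNECTION `my_project.us.my_connection`
OPTIONS (
  format = 'ICEBERG',
  uris = ['gs://my_bucket/warehouse/orders/metadata/v3.metadata.json']
)
"""
client.query(ddl).result()
```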

Victor Agababov - 00:05:58  

On top of that, you gain controls that you might not otherwise have in open source, such as row- and column-level security, materialized views, and search indices. So in that case, for example, you can stream your data using Flink into a BigLake table managed as an Apache Iceberg table, and later curate that data by applying those security and visibility controls, with data masking and column-level security.  
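
For example, a row-level security policy on such a table might look like the following sketch, using BigQuery's documented row access policy DDL (table, policy, and group names are hypothetical):

```python
from google.cloud import bigquery

client = bigquery.Client(project="my_project")

# Only members of the granted group see rows where region = 'US'.
ddl = """
CREATE ROW ACCESS POLICY us_only
ON my_dataset.ext_orders
GRANT TO ('group:analysts@example.com')
FILTER USING (region = 'US')
"""
client.query(ddl).result()
```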

Victor Agababov - 00:06:30  

The next level up is the BigQuery metastore, our most recent offering, which has recently graduated to general availability. It runs as a standard Iceberg catalog plugin: there is a JAR file, and the code has now been merged into the upstream Iceberg repository, so it will be available in every release starting with 1.10. The code, that is, the library on the engine side, manages all the file interactions, the metadata and whatnot, but BigQuery acts as the ultimate catalog and the source of truth for the metadata. It stores the table information and the pointer to your metadata, ensures that all the data is correct, and leverages the same availability and disaster recovery features for the part that BigQuery stores.  
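
A sketch of wiring the plugin into Spark, based on the configuration keys in Google's documentation at the time of writing; the catalog implementation class and key names should be checked against current docs, and all project, location, and bucket values are placeholders:

```python
from pyspark.sql import SparkSession

# Assumes the Iceberg Spark runtime and the BigQuery metastore catalog
# JAR are already on the classpath (e.g. via --jars or --packages).
spark = (
    SparkSession.builder.appName("bq-metastore-demo")
    .config("spark.sql.catalog.bq", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.bq.catalog-impl",
            "org.apache.iceberg.gcp.bigquery.BigQueryMetastoreCatalog")
    .config("spark.sql.catalog.bq.gcp_project", "my_project")
    .config("spark.sql.catalog.bq.gcp_location", "us")
    .config("spark.sql.catalog.bq.warehouse", "gs://my_bucket/warehouse")
    .getOrCreate()
)

# Tables created here show up in BigQuery, which acts as the catalog
# and source of truth for the metadata.
spark.sql("CREATE NAMESPACE IF NOT EXISTS bq.demo")
spark.sql("CREATE TABLE IF NOT EXISTS bq.demo.t (id BIGINT) USING iceberg")
```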

Victor Agababov - 00:07:24  

In addition, if you query those tables through BigQuery rather than through your open-source engine, you receive the same exact features we just discussed for external Iceberg tables, including column-level security and the possibility of building materialized views and search indices over your data.  

Victor Agababov - 00:07:50  

In addition, because it is a library, you can still use it directly from your open-source engines, but it is also possible to run it from other GCP offerings such as Dataproc, which natively integrates with it and just works out of the box as another metastore. And as I just said, it leverages the same query engine, Dremel, and offers all the enterprise features that might be required.  

Victor Agababov - 00:08:20  

Now, moving to the managed Iceberg tables, which will soon be launching to general availability. In this case, BigQuery is the source of truth for both the metadata and the data, and the data is stored in Parquet format in the user's GCS bucket. Except for one case we'll discuss later, the data is available in the bucket immediately upon completion of your DML or load operations. The metadata is currently exported on demand: it is stored internally in our proprietary formats, but the Iceberg metadata can be exported whenever the user needs it, and soon it will be exported automatically as the data inside the table changes. So you will have fresh Iceberg metadata available in your GCS bucket for consumption by external engines almost immediately.  
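
The on-demand export is a single statement; here is a minimal sketch using the documented EXPORT TABLE METADATA syntax (the table name is a placeholder):

```python
from google.cloud import bigquery

client = bigquery.Client(project="my_project")

# Writes fresh Iceberg metadata next to the Parquet data in the
# table's GCS bucket, for consumption by external engines.
client.query("EXPORT TABLE METADATA FROM my_dataset.orders").result()
```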

Victor Agababov - 00:09:14  

Because the metadata is stored in our proprietary formats, we can apply the same optimizations as for our native tables, which means features like the enterprise controls we discussed, column- and row-level security, data residency and whatnot. In addition, it supports several more advanced features. One is high-throughput streaming using the Write API, which is one of the best offerings on the market. It allows you to stream several gigabytes of data per second across all your streams.  
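
A hedged sketch of appending rows over the Write API's default stream with the Python client; it assumes a compiled protobuf module `row_pb2` whose `OrderRow` message matches the table schema, and all project and table names are placeholders:

```python
from google.cloud import bigquery_storage_v1
from google.cloud.bigquery_storage_v1 import types, writer
from google.protobuf import descriptor_pb2

import row_pb2  # hypothetical compiled proto matching the table schema

client = bigquery_storage_v1.BigQueryWriteClient()
parent = client.table_path("my_project", "my_dataset", "orders")
stream_name = f"{parent}/streams/_default"

# The template request carries the stream name and the writer schema.
request_template = types.AppendRowsRequest()
request_template.write_stream = stream_name

proto_descriptor = descriptor_pb2.DescriptorProto()
row_pb2.OrderRow.DESCRIPTOR.CopyToProto(proto_descriptor)
proto_data = types.AppendRowsRequest.ProtoData()
proto_data.writer_schema = types.ProtoSchema(proto_descriptor=proto_descriptor)
request_template.proto_rows = proto_data

# AppendRowsStream manages the bidirectional gRPC connection.
append_stream = writer.AppendRowsStream(client, request_template)

# Serialize a batch of rows and send it; result() blocks until acknowledged.
proto_rows = types.ProtoRows()
proto_rows.serialized_rows.append(
    row_pb2.OrderRow(order_id=1).SerializeToString()
)
request = types.AppendRowsRequest()
request.proto_rows = types.AppendRowsRequest.ProtoData(rows=proto_rows)
append_stream.send(request).result()
append_stream.close()
```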

Victor Agababov - 00:09:50  

Here, the difference from the previous slide is that the data is stored temporarily in internal storage while it's being converted into Parquet format; we'll go over the architecture in the next slide. Another thing that BigQuery tables for Apache Iceberg provide is that, because we own all the metadata internally, we are able to apply the same data optimizations, ensuring the Parquet files are sized appropriately for optimal query performance. We apply our internal algorithms to determine the sizes, compute statistics to ensure proper performance and whatnot, recluster data, and run garbage collection. So as soon as data ages out of your time travel window for querying, it is garbage collected, and you don't have to pay your GCS bill unless you need the data for longer.  

Victor Agababov - 00:11:16  

So, going back to high-throughput streaming, I wanted to discuss a little bit of the architecture that powers it. The data arrives at a Write API frontend, which is supported by several libraries in various languages. At that point, the data can be partitioned across several streams, or it can be sent over one stream, and it is all consumed by our stream servers in the backend. Those stream servers collect all the data and store it in our internal proprietary formats for the time being. When the amount of data reaches the threshold that necessitates conversion and export into Parquet format and metadata, the internal system converts all the temporary data into Parquet format and exports it into your bucket, the metadata is merged into the rest of the table metadata, and then it can be exported and processed by external engines should the need arise.  

Victor Agababov - 00:12:30  

Interestingly though, if you want to access the temporarily stored data right now, you will need to use either the BigQuery engine or the Read API connector from external open-source engines. While the data is stored, it is obviously also subject to all the controls discussed above, including customer-managed encryption keys: if your table is encrypted with your custom key, then the temporarily stored data will be encrypted as well. You can always query all the data, the union of the recently streamed data and the data already committed to the table.  
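
A minimal sketch of reading through the Read API from Python (project, dataset, and table names are placeholders; the ARROW format requires pyarrow to be installed):

```python
from google.cloud import bigquery_storage_v1
from google.cloud.bigquery_storage_v1 import types

client = bigquery_storage_v1.BigQueryReadClient()

session = client.create_read_session(
    parent="projects/my_project",
    read_session=types.ReadSession(
        table="projects/my_project/datasets/my_dataset/tables/orders",
        data_format=types.DataFormat.ARROW,
    ),
    max_stream_count=1,
)

# Streams can be consumed in parallel; here we read the single stream.
# The rows returned include recently streamed data not yet in Parquet.
reader = client.read_rows(session.streams[0].name)
for row in reader.rows(session):
    print(row)
```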

Victor Agababov - 00:13:16  

The features that are coming soon with the BigQuery-managed tables: the product is going to be generally available soon, and it will include continuous Iceberg snapshots; time travel support both from BigQuery and from the OSS engines; and change data capture capability in the Write API and high-throughput streaming, where you can not only write data as append-only, but also send changes such as deletions or upserts, which is a highly requested feature.  

Victor Agababov - 00:13:50  

As has been announced, we are also working to support the Iceberg REST catalog interface, so that it will be possible to manage your Iceberg tables in BigQuery through a REST catalog rather than through plugin JARs or code running on your cluster side. And that's what I had planned for you today; I'm ready to answer any questions you have.  

Adam - 00:14:30  

Awesome, Victor, thank you very much for coming and sharing with us. Can you share your screen one more time? And can you go back to that system diagram?  

Victor Agababov - 00:14:41  

Certainly.  

Adam - 00:14:45  

Okay. All right, so I'm putting my question aside for a minute; I'm already seeing some questions show up. So first: what is the speed of high-throughput streaming?  

Victor Agababov - 00:14:57  

So, as I said, the ingestion capabilities are above several gigabytes per second across all streams. If you're streaming on only one stream, it's a bit restricted. But if you have several streams, the default limit is three gigabytes per second, and if you have a good use case, it can be increased even above that.  

Adam - 00:15:25  

Okay. John is asking, is there any cost to the customer for the internal temporary storage and the conversion to Parquet?  

Victor Agababov - 00:15:34  

No. So basically you pay for the Write API, whatever the price is; it's the same as for native tables. As for the Apache Iceberg tables, there is no additional billing for either the temporary storage or the conversion.  

Adam - 00:15:54  

Okay. We got another one from Vin Note. Are you able to comment on how the table types are distributed across BigQuery native format versus Iceberg or open table formats?  

Victor Agababov - 00:16:06  

You mean the number or proportion of tables, native versus Iceberg, I assume? I don't think I have that information, but currently the majority of tables in BigQuery are native tables.  

Adam - 00:16:23  

Yeah. Yeah, he clarifies, you know, the percent of tables in each category.  

Victor Agababov - 00:16:27  

No, unfortunately I do not have that information.  

Adam - 00:16:31  

We've got another one here. Do BigQuery compute slots differ, so limited to 2,000 per GCP project, when querying self-managed versus native versus BigLake tables?  

Victor Agababov - 00:16:46  

So the slots are basically shared across the compute engine. Dremel is the execution engine, and whatever queries run through Dremel are billed to your slots, regardless of the table type.  

Adam - 00:17:04  

Let's see. We have another one here. Surender is asking: Victor, can you explain about time travel in BigQuery Iceberg?  

Victor Agababov - 00:17:11  

In Iceberg? So right now, because the product was in the preview phase, there were some edge cases; we didn't want to release it to general availability if we weren't sure of the quality at the time. Those are all fixed and will be released, and it works in the standard way. If we're talking about internal metadata, it's supported out of the box. For Iceberg time travel, you need to store the appropriate snapshots and snapshot history in your metadata, and that will also be available, so you can query with time travel from your open-source engines as well.  
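
On the BigQuery side, time travel uses the standard FOR SYSTEM_TIME AS OF clause; a minimal sketch (the table name and interval are placeholders):

```python
from google.cloud import bigquery

client = bigquery.Client(project="my_project")

# Query the table as it looked one hour ago, within the travel window.
sql = """
SELECT *
FROM my_dataset.orders
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
"""
for row in client.query(sql).result():
    print(row)
```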

Adam - 00:17:55  

We got another one here from Taz. Does your system only support Parquet format or also supports other formats like ORC and others?  

Victor Agababov - 00:18:05  

It depends on which one. External Iceberg tables support both ORC and Parquet; the managed Iceberg tables support only Parquet.  

Adam - 00:18:22  

Okay. Let's see. I think, by the way, folks, you can feel free to just drop your questions in the Q and A tab so that others can upvote them. Otherwise I kind of have to fish them from the chat, and then I don't know which ones are most popular. We got another one from the chat: can you comment on BigQuery costs for slowly changing dimensions operations?  

Adam - 00:18:51  

Slowly changing dimensions operations?  

Victor Agababov - 00:18:54  

Yeah, I'm not sure I understand the question.  

Adam - 00:19:01  

Yeah, I don't know if, do you guys follow up?  

Victor Agababov - 00:19:07  

Yeah, I'm not sure I understand the question well, sorry.  

Adam - 00:19:16  

If you want to explain a little bit more: what do you mean by costs? Do you just mean the extent to which operations on slowly changing dimensions add up to increasing costs?  

Victor Agababov - 00:19:35  

Yeah. In general, managed storage is billed on how much storage you have for the Iceberg tables. Obviously, GCS bills on however much data you are putting into your bucket. So slowly changing or fast changing, it doesn't affect much.  

Adam - 00:19:58  

What about any plans for managed Delta Lake tables?  

Victor Agababov - 00:20:03  

No, there are currently none that I'm aware about.  

Adam - 00:20:09  

Okay.  

Victor Agababov - 00:20:10  

Delta Lake tables are supported the same way as the self-managed Iceberg tables: you are able to query them through BigQuery, but they are read-only.  

Adam - 00:20:20  

Okay, we got another one here. Is it efficient to use native BigQuery tables versus creating external tables in BigQuery with data in blob storage written in Hive or Iceberg?  

Victor Agababov - 00:20:36  

So, especially if you're going to query them from BigQuery, it's more efficient to create them as native tables, because we have all the necessary metadata internally, which is faster to load than going to the GCS bucket and loading it from there. And the format is internal and proprietary, which is generally faster for us to read than ORC or JSON.  

Adam - 00:21:04  

Okay. I think we're gonna keep it to the last one for this session. John Esperanza is asking, is there any configuration available for data compaction on Iceberg tables?  

Victor Agababov - 00:21:15  

So for the managed tables, those are all handled internally by our algorithms, which change over time; the thresholds and whatnot are not really public. Based on our history and query performance, we determine what the optimal sizes would be for our engine, and that's how we determine those.  

Adam - 00:21:44  

Yeah, that's fair. Especially for the managed solutions. Victor, thank you very much for coming and sharing with us and joining us for this session.