feat(writer): byte-based row group sizing in ParquetWriter by shanielh · Pull Request #2741 · apache/iceberg-rust

shanielh · 2026-06-29T14:17:17Z

Which issue does this PR close?

No tracking issue yet — small, self-contained enhancement; happy to file one if maintainers prefer it for changelog purposes.

What changes are included in this PR?

Adds ParquetWriterBuilder::with_row_group_size_bytes(bytes) — byte-based row-group sizing for ParquetWriter.

Today WriterProperties::max_row_group_size bounds a row group only by row count. The on-disk size of a row group therefore depends entirely on how well the data compresses: with a fixed row count, well-compressing data produces tiny row groups and poorly-compressing data produces large ones. A caller that wants ~N-MiB row groups has no choice but to guess a compression ratio to convert a byte target into a row count, and that guess is wrong whenever the ratio drifts (across columns, files, or within a file).

This is exactly the knob Java Iceberg exposes as write.parquet.row-group-size-bytes, which has no equivalent in iceberg-rust.

When with_row_group_size_bytes is set, after each write the writer checks parquet's AsyncArrowWriter::in_progress_size() — the anticipated encoded (post-compression) size of the open row group — and calls flush() to cut the group once it reaches the target. This is the size-driven flush pattern parquet-rs documents. Result: byte-uniform row groups with no ratio guess.

Notes:

The existing row-count cap still applies as an upper bound; whichever limit is hit first cuts the group.
The size is checked after each FileWriter::write, so a group can overshoot by at most one written batch (documented on the builder method). Streaming callers that write in modest batches get tight byte-uniform groups; a caller handing the writer one enormous batch gets coarser cuts.
Default behavior is unchanged: the field defaults to None, so existing writers cut by row count exactly as before.

Are these changes tested?

Yes — new unit test test_parquet_writer_row_group_size_bytes writes ~800 KB of (dictionary- and compression-disabled, so ~incompressible) data with a 64 KiB byte target and the row-count cap left at the parquet default, then reads the file back and asserts it contains multiple row groups — proving the size-based cut fired rather than emitting one default row group. cargo fmt, cargo clippy -p iceberg, and the existing parquet_writer tests all pass.

viirya

Nice addition — this is the analogue of Java's write.parquet.row-group-size-bytes, which we were missing.

The mechanism checks out against the parquet-rs API: AsyncArrowWriter::in_progress_size() is documented as "Anticipated encoded size of the in progress row group" (i.e. post-encoding/compression estimate, which is what you want for byte-uniform groups), and flush() cuts the buffered rows "into a new row group" rather than closing the file — so calling it mid-write is correct. Checking after each FileWriter::write and the one-batch-overshoot caveat are both documented honestly on the builder method.

Default stays None so existing writers are unchanged, and the row-count cap remains an upper bound — both good. The test disables dictionary/compression to get ~incompressible data and asserts multiple row groups at a 64 KiB target, which actually exercises the size-driven cut.

Pulled the branch: 15 parquet_writer tests pass (incl. the new one), clippy + rustfmt clean.

One optional thought, non-blocking: the PR mentions no tracking issue. Given Java exposes this as a table property (write.parquet.row-group-size-bytes), a natural follow-up would be wiring it from table properties too, but the builder-level knob is a fine standalone first step.

`WriterProperties::max_row_group_size` bounds a parquet row group only by row count, so on-disk row-group size swings with the data's compression ratio — uniform-row-count groups can vary by orders of magnitude in bytes, and a caller that wants N-MiB groups has to guess a ratio to convert the byte target into a row count. Add `ParquetWriterBuilder::with_row_group_size_bytes`: when set, the writer cuts a row group as soon as its anticipated encoded size (`AsyncArrowWriter::in_progress_size`) reaches the target, using parquet's recommended size-driven flush pattern. This yields byte-uniform row groups with no ratio guess — the Rust analogue of Java Iceberg's `write.parquet.row-group-size-bytes`. The row-count cap still applies as an upper bound; whichever fires first cuts the group. Checked after each write, so overshoot is bounded by one batch. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

shanielh · 2026-07-01T05:28:07Z

Cool @viirya , I've just rebased over main and generated the public api. Hope this get merged soon, after merging I'll PR wiring the table property to writer.

viirya · 2026-07-01T06:09:20Z

Thank you @shanielh

viirya · 2026-07-01T06:10:16Z

cc @kevinjqliu

shanielh force-pushed the feat/parquet-row-group-size-bytes branch from 2577533 to 8639746 Compare June 29, 2026 14:43

viirya approved these changes Jun 30, 2026

View reviewed changes

shanielh force-pushed the feat/parquet-row-group-size-bytes branch from 8639746 to b5386b9 Compare July 1, 2026 05:25

shanielh force-pushed the feat/parquet-row-group-size-bytes branch from b5386b9 to 1f31c3e Compare July 1, 2026 05:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(writer): byte-based row group sizing in ParquetWriter#2741

feat(writer): byte-based row group sizing in ParquetWriter#2741
shanielh wants to merge 1 commit into
apache:mainfrom
shanielh:feat/parquet-row-group-size-bytes

shanielh commented Jun 29, 2026

Uh oh!

viirya left a comment

Uh oh!

shanielh commented Jul 1, 2026

Uh oh!

viirya commented Jul 1, 2026

Uh oh!

viirya commented Jul 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

shanielh commented Jun 29, 2026

Which issue does this PR close?

What changes are included in this PR?

Are these changes tested?

Uh oh!

viirya left a comment

Choose a reason for hiding this comment

Uh oh!

shanielh commented Jul 1, 2026

Uh oh!

viirya commented Jul 1, 2026

Uh oh!

viirya commented Jul 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants