Arrow w/ arrow_output_version produces errors with VARINT, GEOMETRY and ENUMs #517

Description

@paultiq

What happens?

When arrow_output_version is set (documented here: https://duckdb.org/docs/current/configuration/overview), converting results that contain certain data types to Arrow fails with an "Expected 3 buffers" error.

See the reproducer below.

Output:

duckdb 1.5.4 | pyarrow 24.0.0

VARINT defaults (sv=false, aov=1.0) : OK
VARINT aov=1.4 : FAIL -> Expected 3 buffers for imported type extension<arrow.opaque[storage_type=binary, type_name=bignum, vendor_name=DuckDB]>, ArrowArray struct has 4
GEOMETRY aov=1.5 : FAIL -> Expected 3 buffers for imported type binary, ArrowArray struct has 4
ENUM sv=true + aov=1.5 : FAIL -> Expected 3 buffers for imported type string, ArrowArray struct has 4
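
For context on the "3 vs 4" numbers in the messages above, here is a minimal sketch of the buffer counts the Arrow C Data Interface defines for variable-size binary and string layouts. The helper names (`expected_n_buffers`, `consistent_formats`) are hypothetical and written only for this illustration; they are not DuckDB or PyArrow APIs. The sketch does not establish the root cause. It only shows that a count of 4 would be consistent with a view-style layout (one data buffer plus the trailing sizes buffer) being exported under an offset-based type such as binary or string, which is an assumption, not something confirmed in this report.

```python
# Arrow C Data Interface format strings for the layouts relevant here:
#   "z"/"Z"  binary / large binary      "u"/"U"  utf8 / large utf8
#   "vz"     binary view                "vu"     utf8 view
OFFSET_BASED = {"z", "Z", "u", "U"}
VIEW_BASED = {"vz", "vu"}


def expected_n_buffers(fmt: str, n_data_buffers: int = 0) -> int:
    """Return the buffer count an importer would expect for `fmt`."""
    if fmt in OFFSET_BASED:
        # validity bitmap, offsets, data
        return 3
    if fmt in VIEW_BASED:
        # validity bitmap, views, N variadic data buffers,
        # plus one trailing buffer holding the N data buffer sizes
        return 2 + n_data_buffers + 1
    raise ValueError(f"format not covered by this sketch: {fmt!r}")


def consistent_formats(actual_n_buffers: int) -> list[str]:
    """List formats from this sketch whose layout allows `actual_n_buffers`."""
    matches = [f for f in OFFSET_BASED if actual_n_buffers == 3]
    # A view array has at least 3 buffers (zero data buffers).
    matches += [f for f in VIEW_BASED if actual_n_buffers >= 3]
    return sorted(matches)
```

Under these definitions, `expected_n_buffers("z")` matches the "Expected 3 buffers" half of the message, and `consistent_formats(4)` returns only the two view formats.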

To Reproduce

#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.12"
# dependencies = [
# "duckdb==1.5.4",
# "pyarrow>=22",
# ]
# ///

import duckdb
import pyarrow as pa

print(f"duckdb {duckdb.__version__} | pyarrow {pa.__version__}\n")


# VARINT/BIGNUM: defaults are fine
con = duckdb.connect()
pa.table(con.sql("SELECT (2**100)::VARINT AS v"))
print("VARINT defaults (sv=false, aov=1.0): OK")


# VARINT/BIGNUM: arrow_output_version >= 1.4 -> FAIL
con = duckdb.connect()
con.execute("SET arrow_output_version='1.4'")
try:
    pa.table(con.sql("SELECT (2**100)::VARINT AS v"))
    print("VARINT aov=1.4: OK (unexpected)")
except Exception as e:
    print(f"VARINT aov=1.4: FAIL -> {e}")


# GEOMETRY: arrow_output_version = 1.5 -> FAIL
con = duckdb.connect()
con.execute("INSTALL spatial")
con.execute("LOAD spatial")
con.execute("SET arrow_output_version='1.5'")
try:
    pa.table(con.sql("SELECT ST_Point(1, 2) AS g"))
    print("GEOMETRY aov=1.5: OK (unexpected)")
except Exception as e:
    print(f"GEOMETRY aov=1.5: FAIL -> {e}")


# ENUM: needs BOTH produce_arrow_string_view=true AND arrow_output_version >= 1.5 -> FAIL
con = duckdb.connect()
con.execute("SET produce_arrow_string_view=true")
con.execute("SET arrow_output_version='1.5'")
try:
    pa.table(con.sql("SELECT 'happy'::ENUM('happy', 'sad') AS v"))
    print("ENUM sv=true + aov=1.5: OK (unexpected)")
except Exception as e:
    print(f"ENUM sv=true + aov=1.5: FAIL -> {e}")
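
The four cases in the reproducer can also be written as a small table-driven harness, which makes it easier to add further type and setting combinations (for example, GEOMETRY or ENUM with default settings, which the script above does not cover). This is only a sketch: `CASES`, `run_cases`, and `duckdb_runner` are hypothetical names introduced here, and the statements and queries are copied from the reproducer. The `duckdb` and `pyarrow` imports are kept inside `duckdb_runner` so that the harness itself has no third-party dependency.

```python
from typing import Callable

# (label, setup statements run on a fresh connection, query to convert)
Case = tuple[str, list[str], str]

CASES: list[Case] = [
    ("VARINT defaults (sv=false, aov=1.0)", [],
     "SELECT (2**100)::VARINT AS v"),
    ("VARINT aov=1.4", ["SET arrow_output_version='1.4'"],
     "SELECT (2**100)::VARINT AS v"),
    ("GEOMETRY aov=1.5",
     ["INSTALL spatial", "LOAD spatial", "SET arrow_output_version='1.5'"],
     "SELECT ST_Point(1, 2) AS g"),
    ("ENUM sv=true + aov=1.5",
     ["SET produce_arrow_string_view=true", "SET arrow_output_version='1.5'"],
     "SELECT 'happy'::ENUM('happy', 'sad') AS v"),
]


def duckdb_runner(setup: list[str], query: str) -> None:
    """Run one case the same way the reproducer does."""
    import duckdb
    import pyarrow as pa

    con = duckdb.connect()
    for stmt in setup:
        con.execute(stmt)
    # Raises if the Arrow import rejects the exported array.
    pa.table(con.sql(query))


def run_cases(cases: list[Case],
              runner: Callable[[list[str], str], None]) -> dict[str, str]:
    """Run every case and record 'OK' or 'FAIL -> <error>' per label."""
    results: dict[str, str] = {}
    for label, setup, query in cases:
        try:
            runner(setup, query)
            results[label] = "OK"
        except Exception as exc:
            results[label] = f"FAIL -> {exc}"
    return results
```

Calling `run_cases(CASES, duckdb_runner)` and printing each `label: outcome` pair should give a report in the same shape as the output above, assuming the same package versions.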

OS:

Linux

DuckDB Package Version:

1.5.4

Python Version:

3.14

Full Name:

paultiq

Affiliation:

Iqmo

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration to reproduce the issue?

  • Yes, I have
