What happens?
When arrow_output_version is set to 1.4 or higher (documented here: https://duckdb.org/docs/current/configuration/overview), converting a query result to a PyArrow table fails for certain data types with an "Expected 3 buffers" error.
See the reproducer below.
Output:
duckdb 1.5.4 | pyarrow 24.0.0
VARINT defaults (sv=false, aov=1.0) : OK
VARINT aov=1.4 : FAIL -> Expected 3 buffers for imported type extension<arrow.opaque[storage_type=binary, type_name=bignum, vendor_name=DuckDB]>, ArrowArray struct has 4
GEOMETRY aov=1.5 : FAIL -> Expected 3 buffers for imported type binary, ArrowArray struct has 4
ENUM sv=true + aov=1.5 : FAIL -> Expected 3 buffers for imported type string, ArrowArray struct has 4
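To compare the three failures, the buffer counts can be pulled out of each message. This is a minimal sketch that assumes the message wording shown in the output above; `parse_buffer_error` is a hypothetical helper for triage, not part of DuckDB or PyArrow.

```python
import re

# Pattern derived only from the three messages in the output above
# (an assumption: other versions may word the error differently).
_BUFFER_ERROR = re.compile(
    r"Expected (?P<expected>\d+) buffers for imported type (?P<type>.+), "
    r"ArrowArray struct has (?P<actual>\d+)"
)


def parse_buffer_error(message):
    """Return (arrow_type, expected, actual), or None if the message does not match."""
    match = _BUFFER_ERROR.search(message)
    if match is None:
        return None
    return match["type"], int(match["expected"]), int(match["actual"])


# The three error messages copied from the output above.
messages = [
    "Expected 3 buffers for imported type extension<arrow.opaque[storage_type=binary, "
    "type_name=bignum, vendor_name=DuckDB]>, ArrowArray struct has 4",
    "Expected 3 buffers for imported type binary, ArrowArray struct has 4",
    "Expected 3 buffers for imported type string, ArrowArray struct has 4",
]

for message in messages:
    print(parse_buffer_error(message))
```

In all three cases the importer expects 3 buffers and the exported array carries 4; only the imported type differs.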
To Reproduce
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.12"
# dependencies = [
# "duckdb==1.5.4",
# "pyarrow>=22",
# ]
# ///
import duckdb
import pyarrow as pa
print(f"duckdb {duckdb.__version__} | pyarrow {pa.__version__}\n")
# VARINT/BIGNUM: defaults are fine
con = duckdb.connect()
pa.table(con.sql("SELECT (2**100)::VARINT AS v"))
print("VARINT defaults (sv=false, aov=1.0): OK")
# VARINT/BIGNUM: arrow_output_version >= 1.4 -> FAIL
con = duckdb.connect()
con.execute("SET arrow_output_version='1.4'")
try:
    pa.table(con.sql("SELECT (2**100)::VARINT AS v"))
    print("VARINT aov=1.4: OK (unexpected)")
except Exception as e:
    print(f"VARINT aov=1.4: FAIL -> {e}")
# --- GEOMETRY: arrow_output_version = 1.5 -> FAIL
con = duckdb.connect()
con.execute("INSTALL spatial")
con.execute("LOAD spatial")
con.execute("SET arrow_output_version='1.5'")
try:
    pa.table(con.sql("SELECT ST_Point(1, 2) AS g"))
    print("GEOMETRY aov=1.5: OK (unexpected)")
except Exception as e:
    print(f"GEOMETRY aov=1.5: FAIL -> {e}")
# --- ENUM: needs BOTH produce_arrow_string_view=true AND aov>=1.5 -----------
con = duckdb.connect()
con.execute("SET produce_arrow_string_view=true")
con.execute("SET arrow_output_version='1.5'")
try:
    pa.table(con.sql("SELECT 'happy'::ENUM('happy', 'sad') AS v"))
    print("ENUM sv=true + aov=1.5: OK (unexpected)")
except Exception as e:
    print(f"ENUM sv=true + aov=1.5: FAIL -> {e}")
OS:
Linux
DuckDB Package Version:
1.5.4
Python Version:
3.14
Full Name:
paultiq
Affiliation:
Iqmo
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a stable release
Did you include all relevant data sets for reproducing the issue?
Yes
Did you include all code required to reproduce the issue?
Did you include all relevant configuration to reproduce the issue?