WIP: Adding support for contrib/intarray style fast-allocated arrays #2086
Faster arrays!
Now that Array<T> is stable, this PR ports in performance enhancements for creating PostgreSQL int[] arrays, based on the optimizations found in PostgreSQL's contrib/intarray extension (specifically new_intArrayType(int num)). The goal is to significantly speed up the conversion from &[i32] to a PostgreSQL ArrayType Datum.
Background
Currently, to return a PostgreSQL int[] from a Rust function, we typically return a Vec<i32>, which then gets converted to a Datum via array_datum_from_iter(..) (implementation here).
This approach uses:
pg_sys::initArrayResult
pg_sys::accumArrayResult
pg_sys::makeArrayResult
These APIs build a varlena array by accumulating values and repeatedly reallocating memory as the capacity grows. For larger arrays, this results in substantial performance overhead.
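To illustrate why growing a buffer incrementally costs more than sizing it once, here is a rough analogy in plain Rust. It uses Vec rather than the PostgreSQL APIs named above, and the function name count_reallocations is made up for this sketch. It only counts how often the backing buffer's capacity changes; it does not measure time.

```rust
/// Appends `values` one element at a time and counts how many times the
/// backing buffer's capacity changes (each change implies an allocation or
/// reallocation). This is an analogy only, not PostgreSQL's accumulator.
fn count_reallocations(values: &[i32], preallocate: bool) -> usize {
    let mut out: Vec<i32> = if preallocate {
        // Size the buffer once, up front.
        Vec::with_capacity(values.len())
    } else {
        // Start empty and let the buffer grow as elements arrive.
        Vec::new()
    };

    let mut reallocations = 0;
    let mut last_capacity = out.capacity();
    for &v in values {
        out.push(v);
        if out.capacity() != last_capacity {
            reallocations += 1;
            last_capacity = out.capacity();
        }
    }
    reallocations
}
```

With the buffer sized up front the count stays at zero, while the growing buffer is resized several times and copies its existing contents on each resize.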
Optimization
PostgreSQL's contrib/intarray takes a different approach for fixed-size datums: it pre-allocates the entire array up front and directly memcpys the data into ARR_DATA_PTR. This avoids the need for reallocations and is much faster.
What I've added
BoxRet for Array<'a, T> when T is a fixed-size numeric type (i8, i16, i32, f32, f64)
Array<T> directly as a return type from #[pg_extern] functions.
ArrayType
memcpy-like behavior for fast construction from slices
Array<T> from &[T]
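The pre-allocate-and-copy idea described under Optimization can be sketched in plain Rust as below. This is a simplified in-memory model, not the code in this PR and not pgrx API: the names build_int4_array and read_int4_data are hypothetical, the buffer is a Vec<u8> rather than palloc'd memory, the length word is written raw instead of through SET_VARSIZE, and the 24-byte header assumes an 8-byte MAXALIGN for a one-dimensional array with no null bitmap.

```rust
/// OID of PostgreSQL's int4 type.
const INT4OID: u32 = 23;
/// Assumed header size for a 1-D, null-free array: the 16-byte ArrayType
/// struct plus dims[1] and lbound[1] (4 bytes each), already 8-byte aligned.
const HEADER_LEN: usize = 24;

/// Builds a byte buffer laid out like a 1-D int4 array: one allocation sized
/// up front, then a single bulk copy of the element data.
/// Simplification: real PostgreSQL represents an empty array with ndim = 0.
fn build_int4_array(values: &[i32]) -> Vec<u8> {
    let data_len = values.len() * std::mem::size_of::<i32>();
    let total = HEADER_LEN + data_len;
    let mut buf = vec![0u8; total];

    // Header fields, in native byte order.
    buf[0..4].copy_from_slice(&(total as u32).to_ne_bytes()); // stand-in for SET_VARSIZE
    buf[4..8].copy_from_slice(&1i32.to_ne_bytes()); // ndim
    buf[8..12].copy_from_slice(&0i32.to_ne_bytes()); // dataoffset: 0 means no null bitmap
    buf[12..16].copy_from_slice(&INT4OID.to_ne_bytes()); // elemtype
    buf[16..20].copy_from_slice(&(values.len() as i32).to_ne_bytes()); // dims[0]
    buf[20..24].copy_from_slice(&1i32.to_ne_bytes()); // lbound[0]

    // Bulk copy of the element data, the equivalent of memcpy into ARR_DATA_PTR.
    // SAFETY: `values` holds `data_len` readable bytes, `buf` has `data_len`
    // writable bytes after the header, and the two regions do not overlap.
    unsafe {
        std::ptr::copy_nonoverlapping(
            values.as_ptr() as *const u8,
            buf.as_mut_ptr().add(HEADER_LEN),
            data_len,
        );
    }
    buf
}

/// Reads the element data back out of a buffer built by `build_int4_array`.
fn read_int4_data(buf: &[u8]) -> Vec<i32> {
    buf[HEADER_LEN..]
        .chunks_exact(4)
        .map(|c| i32::from_ne_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}
```

The copy works as a single block only because every element has the same fixed width, which is why the approach is limited to fixed-size numeric types.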
Benchmarks
I've benchmarked two approaches for turning Vec<i32> -> Array<i32>:
array_datum_from_iter(..)
Each test used 10,000 rows of randomly generated int[] in a temp table, with work_mem = '1GB'. I used Instant::now() to time only the microseconds needed for each conversion and accumulated all of those timings.
It's been a while since I contributed, so I'm a little rusty on matching the team's style and conventions, and as usual the lifetime differences between PG-allocated and Rust-allocated memory still escape me. I'm sure the ergonomics of the new Array functions could be improved. Am I doing anything the wrong way here?
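For reference, the timing method described under Benchmarks (wrap only the conversion in Instant::now() and accumulate the elapsed time) looks roughly like this. The convert function is a placeholder for whichever conversion is being measured, and time_conversions is a hypothetical name; neither is the actual benchmark code.

```rust
use std::time::{Duration, Instant};

/// Placeholder for the conversion under test (hypothetical; the real
/// benchmark converts a Vec<i32> into a PostgreSQL array Datum).
fn convert(row: &[i32]) -> Vec<i32> {
    row.to_vec()
}

/// Times only the conversion of each row and accumulates the elapsed time.
/// Also returns the total number of converted elements so the work cannot
/// be optimized away.
fn time_conversions(rows: &[Vec<i32>]) -> (Duration, usize) {
    let mut total = Duration::ZERO;
    let mut converted = 0usize;
    for row in rows {
        let start = Instant::now();
        let out = convert(row);
        total += start.elapsed(); // only the conversion is inside the timed span
        converted += std::hint::black_box(out).len();
    }
    (total, converted)
}
```

The accumulated total can be reported with total.as_micros() after the loop.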