Commit 5652af6

GH-46321: [C++][Doc] Better explain ArrayData IsValid and GetNullCount (#46332)
### What changes are included in this PR?

Address a comment that was not addressed in PR #46271.

### Are these changes tested?

No, only docstring changes.

### Are there any user-facing changes?

No, only docstring changes.

* GitHub Issue: #46321

Authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
1 parent c727b75 commit 5652af6

1 file changed, +19 -5 lines changed

cpp/src/arrow/array/data.h

Lines changed: 19 additions & 5 deletions
@@ -217,6 +217,7 @@ struct ARROW_EXPORT ArrayData {
   /// queried instead.
   /// For dictionary arrays, this reflects the validity of the dictionary
   /// index, but the corresponding dictionary value might still be null.
+  /// For null arrays, this always returns false.
   bool IsValid(int64_t i) const {
     if (buffers[0] != NULLPTR) {
       return bit_util::GetBit(buffers[0]->data(), i + offset);
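
To illustrate the `IsValid` behaviour documented in this hunk, here is a minimal, self-contained sketch. `ToyArrayData` and its `is_null_type` flag are hypothetical stand-ins and not Arrow's real class; the sketch only models the bitmap lookup with `offset` and the null-array case, and it assumes that any other array without a bitmap is fully valid (the real method handles more type-specific cases).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for ArrayData, for illustration only.
struct ToyArrayData {
  // An empty vector models a missing validity buffer (buffers[0] == NULLPTR).
  std::vector<uint8_t> validity;
  int64_t length = 0;
  int64_t offset = 0;
  // Assumption of this sketch: a flag that marks an array of the null type.
  bool is_null_type = false;

  // Least-significant-bit-first lookup, the bit order of Arrow validity bitmaps.
  static bool GetBit(const uint8_t* bits, int64_t i) {
    return ((bits[i / 8] >> (i % 8)) & 1) != 0;
  }

  bool IsValid(int64_t i) const {
    assert(i >= 0 && i < length);
    if (!validity.empty()) {
      // The logical index is shifted by the array's offset into the bitmap.
      return GetBit(validity.data(), i + offset);
    }
    // No bitmap: every slot of a null array is null; this sketch treats
    // every other array without a bitmap as entirely valid.
    return !is_null_type;
  }
};
```

With a bitmap byte of `0x05` (bits 0 and 2 set) and `offset = 0`, elements 0 and 2 are valid and element 1 is null; the same buffer viewed with `offset = 1` shifts the answers by one slot.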
@@ -358,8 +359,16 @@ struct ARROW_EXPORT ArrayData {
 
   /// \brief Return the physical null count
   ///
-  /// The null count is lazily computed from the array's validity bitmap,
-  /// if not already cached.
+  /// This method returns the number of array elements for which `IsValid` would
+  /// return false.
+  ///
+  /// A cached value is returned if already available, otherwise it is first
+  /// computed and stored.
+  /// How it is computed depends on the data type, see `IsValid` for details.
+  ///
+  /// Note that this method is typically much faster than calling `IsValid`
+  /// for all elements. Therefore, it helps avoid per-element validity bitmap
+  /// lookups in the common cases where the array contains zero or only nulls.
   int64_t GetNullCount() const;
 
   /// \brief Return true if the array may have nulls in its validity bitmap
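
The caching and fast-path behaviour described in the new `GetNullCount` docstring can be sketched as follows. `ToyNullCounter` and `SumValid` are hypothetical names and not Arrow APIs; the `-1` sentinel is an assumption of this sketch, and the plain `mutable` field is a single-threaded simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sentinel meaning "not computed yet" (an assumption of this sketch).
constexpr int64_t kUnknownNullCount = -1;

// Hypothetical stand-in for the validity side of an array.
struct ToyNullCounter {
  std::vector<bool> valid;  // one validity flag per element
  mutable int64_t cached_null_count = kUnknownNullCount;

  // Return the cached count if available, otherwise compute and store it.
  int64_t GetNullCount() const {
    if (cached_null_count == kUnknownNullCount) {
      int64_t nulls = 0;
      for (bool v : valid) nulls += v ? 0 : 1;
      cached_null_count = nulls;
    }
    return cached_null_count;
  }
};

// Sum the non-null values, skipping per-element validity checks
// when the array contains no nulls or only nulls.
int64_t SumValid(const ToyNullCounter& validity,
                 const std::vector<int64_t>& values) {
  assert(validity.valid.size() == values.size());
  const int64_t length = static_cast<int64_t>(values.size());
  const int64_t nulls = validity.GetNullCount();
  int64_t sum = 0;
  if (nulls == length) return sum;  // only nulls: nothing to add
  if (nulls == 0) {                 // no nulls: no validity lookups needed
    for (int64_t v : values) sum += v;
    return sum;
  }
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (validity.valid[i]) sum += values[i];
  }
  return sum;
}
```

The null count is computed once on the first call and reused afterwards, so a caller can branch on it cheaply before deciding whether a per-element loop is needed at all.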
@@ -492,9 +501,14 @@ struct ARROW_EXPORT BufferSpan {
   }
 };
 
-/// \brief EXPERIMENTAL: A non-owning ArrayData reference that is cheaply
-/// copyable and does not contain any shared_ptr objects. Do not use in public
-/// APIs aside from compute kernels for now
+/// \brief EXPERIMENTAL: A non-owning array data container
+///
+/// Unlike ArrayData, this class doesn't own its referenced data type nor data buffers.
+/// It is cheaply copyable and can therefore be suitable for use cases where
+/// shared_ptr overhead is not acceptable. However, care should be taken to
+/// keep alive the referenced objects and memory while the ArraySpan object is in use.
+/// For this reason, this should not be exposed in most public APIs (apart from
+/// compute kernel interfaces).
 struct ARROW_EXPORT ArraySpan {
   const DataType* type = NULLPTR;
   int64_t length = 0;
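
The ownership trade-off described in the new `ArraySpan` docstring can be illustrated with a small sketch. `OwningData`, `NonOwningSpan`, `MakeSpan` and `Sum` are hypothetical names and not Arrow's classes; the sketch only shows why a view without `shared_ptr` members is cheap to copy and why the owner must outlive it.

```cpp
#include <cstdint>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical owning container: shared ownership of the values.
struct OwningData {
  std::shared_ptr<std::vector<int32_t>> values;
};

// Hypothetical non-owning view: raw pointer and length, no reference counting.
struct NonOwningSpan {
  const int32_t* values = nullptr;
  int64_t length = 0;
};

// In this sketch, copying the view is a plain memberwise copy.
static_assert(std::is_trivially_copyable<NonOwningSpan>::value,
              "the toy view holds no owning members");

// Build a view; the caller must keep `data` alive while the view is in use.
NonOwningSpan MakeSpan(const OwningData& data) {
  NonOwningSpan span;
  span.values = data.values->data();
  span.length = static_cast<int64_t>(data.values->size());
  return span;
}

// Taking the view by value does not touch any reference count.
int64_t Sum(NonOwningSpan span) {
  int64_t total = 0;
  for (int64_t i = 0; i < span.length; ++i) total += span.values[i];
  return total;
}
```

If the `OwningData` object is destroyed or its vector is reallocated while a `NonOwningSpan` is still in use, the view dangles. That is the lifetime hazard the docstring warns about, and the reason such a type is a poor fit for most public APIs.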
