GH-48470: [Python] Construct UuidArray from list of UuidScalars#48746
GH-48470: [Python] Construct UuidArray from list of UuidScalars#48746rok merged 9 commits intoapache:mainfrom
Conversation
|
|
# Conflicts: # python/pyarrow/tests/test_extension_type.py
043c8a8 to
dbf1194
Compare
rok
left a comment
There was a problem hiding this comment.
Looks good. Some minor change and more extensive tests suggestions.
| // Helper function to unwrap extension scalar to its storage scalar | ||
| const Scalar& GetStorageScalar(const Scalar& scalar) { | ||
| if (scalar.type->id() == Type::EXTENSION) { | ||
| return *checked_cast<const ExtensionScalar&>(scalar).value; | ||
| } | ||
| return scalar; | ||
| } | ||
|
|
There was a problem hiding this comment.
How about using Result here?
| // Helper function to unwrap extension scalar to its storage scalar | |
| const Scalar& GetStorageScalar(const Scalar& scalar) { | |
| if (scalar.type->id() == Type::EXTENSION) { | |
| return *checked_cast<const ExtensionScalar&>(scalar).value; | |
| } | |
| return scalar; | |
| } | |
| // Helper function to unwrap extension scalar to its storage scalar | |
| Result<const Scalar*> GetStorageScalar(const Scalar& scalar) { | |
| if (scalar.type->id() != Type::EXTENSION) { | |
| return &scalar; | |
| } | |
| const auto& extension_scalar = checked_cast<const ExtensionScalar&>(scalar); | |
| return extension_scalar.value.get(); | |
| } |
There was a problem hiding this comment.
Perhaps Result is not needed here - if its kind of failure path is unlikely then const Scalar& might be better?
Looking at the line 757 check if (PyValue::IsNull(this->options_, value)) { ...
( For an (unlikely?) non-None extension scalar with internal storage .value of a null shared_ptr both approaches don't avoid it? )
There was a problem hiding this comment.
Result is probably not needed indeed, since no error can be returned here. As you prefer @tadeja.
Co-authored-by: Rok Mihevc <rok@mihevc.org>
a38cfda to
483e866
Compare
|
For the C++ part of changes, @pitrou would you have a chance to take a look? |
|
|
||
|
|
||
| def test_array_from_extension_scalars(): | ||
| import datetime |
There was a problem hiding this comment.
Can move this stdlib import to the top of file.
|
Thanks for sanity check @pitrou! Merging. |
|
After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit d08d5e6. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
Rationale for this change
Fixes #48470. Also fixes all extension types, not just UUID.
What changes are included in this PR?
An extension scalar is unwrapped to its storage type when building arrays.
Are these changes tested?
Yes, new
test_array_from_extension_scalarscovers builtin (uuid, bool8, json_, opaque) and custom types across all storage types (int, float, bool, string, binary, large string/binary, decimal, fixed-size binary, struct, timestamp, duration, date).Are there any user-facing changes?
Now user can run such an example to get the output below instead of
ArrowInvalidmessage.This now works for any extension type, not just UUID.
UuidArrayfrom list ofUuidScalars #48470