v0.7.4
What's Changed 🚀
💥 Breaking Changes
- refactor(arrow2)!: remaining arrow2 from daft-core @universalmind303 (#6284)
- refactor(arrow2)!: use arrow-rs casting for strftime function @universalmind303 (#6263)
- refactor(arrow2)!: migrate interval arithmetic to arrow-rs @rohitkulshreshtha (#6186)
✨ Features
- feat(ai): Add HTTP URL direct passthrough for images and videos in prompt function @huleilei (#6182)
- feat(sql): add DATE_TRUNC function support @desmondcheongzx (#6258)
- feat(swordfish): Streaming sources @colin-ho (#5978)
- feat(metrics): add metrics docs @cckellogg (#6253)
- feat(io/av): enhance time-interval sampling with comprehensive tests and improved @huleilei (#6088)
- feat: Add Tencent Cloud COS (Cloud Object Storage) support @XuQianJin-Stars (#6140)
- feat(metrics): consolidate naming and add node.type attribute @cckellogg (#6236)
- feat: add Flight shuffle to Flotilla @srilman (#6123)
- feat: Apache OpenDAL™ compatible backends @universalmind303 (#6177)
- feat(observability): Split duration into separate column in metrics DF @srilman (#6235)
- feat: add support for pyiceberg 0.11.0 @gweaverbiodev (#6200)
- feat: add support for SQL ORDER BY column position @Lucas61000 (#6211)
- feat: Supports running dashboard in daemon mode @plotor (#5993)
- feat: json_write support for time stamps @gpathak128 (#6214)
- feat: .as_T cast methods @aaron-ang (#6100)
🐛 Bug Fixes
- fix: handle case where join keys are different for sort-merge multi-partition join @gweaverbiodev (#6243)
- fix(arrow2): reinterpret physical array as logical type after cast fallback @desmondcheongzx (#6291)
- fix(sql): resolve GROUP BY ambiguous column names for derived expressions @desmondcheongzx (#6286)
- fix(flight): Add a check for flight_shuffle_dirs arg and change default @srilman (#6266)
- fix: Map Literal <-> Python Dict conversion @w2ais (#6084)
- fix: NaN-aware comparator for multi-column search_sorted and sort @desmondcheongzx (#6242)
- fix: add ignore_empty_and_null parameter to .explode() @singularityDLW (#6047)
- fix: Broadcast literal expressions in aggregations to match input length @desmondcheongzx (#6155)
- fix: Cleanup imports from dashboard daemon PR @srilman (#6222)
- fix: Fix compilation on main @srilman (#6221)
- fix: canonicalize negative NaN in multi-column sort comparator @ykdojo (#6215)
- fix: delta version parsing @aaron-ang (#6156)
- fix: Register extension type before reading from Lance @ykdojo (#6058)
- fix: use union instead of append_column in window agg to fix schema mismatch @ykdojo (#6178)
- fix: into_batches should not allow downstream shuffle elision @desmondcheongzx (#6170)
♻️ Refactor
- refactor(arrow2): refactor index_bitmap and time_unit @universalmind303 (#6287)
- refactor(arrow2)!: remaining arrow2 from daft-core @universalmind303 (#6284)
- refactor(arrow2): remove series::try_from(name, arrow2_arr) @universalmind303 (#6283)
- refactor(arrow2): switch to arrow-rs backed arrays @universalmind303 (#6280)
- refactor(arrow2): misc deprecation warnings @universalmind303 (#6270)
- refactor(arrow2): remove from_arrow2 @universalmind303 (#6267)
- refactor(arrow2): fully remove toarrow2 from arrays & series @universalmind303 (#6265)
- refactor(arrow2)!: use arrow-rs casting for strftime function @universalmind303 (#6263)
- refactor(arrow2): use arrow-rs for filtering on python arrays @universalmind303 (#6262)
- refactor(arrow2): migrate cast.rs from arrow2 to arrow-rs @desmondcheongzx (#6239)
- refactor(arrow2): migrate BooleanArray bitmap access to arrow-rs @desmondcheongzx (#6256)
- refactor(arrow2): migrate concat.rs to arrow-rs @desmondcheongzx (#6255)
- refactor(arrow2): remove misc arrow2 references in csv and json @universalmind303 (#6248)
- refactor(arrow2): remove more arrow2 usages in daft-core @universalmind303 (#6249)
- refactor(arrow2): remove arrow2 from daft-recordbatch @universalmind303 (#6231)
- refactor(arrow2): replace arrow2 based buffer with custom impl @universalmind303 (#6247)
- refactor(arrow2)!: migrate interval arithmetic to arrow-rs @rohitkulshreshtha (#6186)
- refactor(arrow2): remove daft_arrow from arrow_growable @universalmind303 (#6251)
- refactor(arrow2): migrate len.rs to arrow-rs @rohitkulshreshtha (#6171)
- refactor(arrow2): migrate image.rs to arrow-rs @rohitkulshreshtha (#6191)
- refactor(arrow2): migrate dyn_compare and probeable to arrow-rs @desmondcheongzx (#6227)
- refactor(arrow2): migrate if_else kernel from arrow2 to arrow-rs @desmondcheongzx (#6240)
- refactor(arrow2): migrate array serdes from arrow2 to arrow-rs @desmondcheongzx (#6238)
- refactor(arrow2): migrate growable internals from arrow2 to arrow-rs @desmondcheongzx (#6228)
- refactor(arrow-rs): Remove basic growable usages @srilman (#5779)
- refactor(arrow2): misc arrow2 cleanups in daft-core @universalmind303 (#6232)
- refactor(arrow2): remove daft_arrow from daft-functions-utf8 @universalmind303 (#6229)
- refactor(arrow2): fully remove daft-arrow from functions-list @universalmind303 (#6230)
- refactor(arrow2): migrates hashing kernel from arrow2 to arrow-rs @universalmind303 (#6166)
- refactor(arrow2): replace arrow2 iterators with custom daft-native iterators @universalmind303 (#6220)
- refactor(arrow2): list kernels @universalmind303 (#6219)
- refactor(arrow2): migrate groups to arrow-rs @cckellogg (#6185)
- refactor(arrow2): second attempt at offsets @universalmind303 (#6162)
- refactor(arrow2): update image_array to not use arrow2 @universalmind303 (#6217)
- refactor(arrow2): Migrate arithmetic kernels @desmondcheongzx (#6193)
- refactor(arrow2): Migrate concat agg @desmondcheongzx (#6190)
📖 Documentation
- docs: Fix broken links to the url modality and various datatypes @desmondcheongzx (#6212)
✅ Tests
- test(benchmarking): add some read_json python benchmarks @universalmind303 (#6288)
- test(parquet): add benchmarks for nested types, codecs, and filter pushdown @desmondcheongzx (#6285)
- test: Filter null bytes from generated column names in property-based tests @desmondcheongzx (#6213)
- test(postmerge): Minor fix to OpenAI integration tests @desmondcheongzx (#6209)
- test: Fix incomplete metrics migration in OpenAI integration tests @desmondcheongzx (#6204)
👷 CI
- ci: include dashboard assets in maturin sdist for manylinux builds @desmondcheongzx (#6281)
- ci: fix restore-mtime exit code when last file is deleted on macos runners @desmondcheongzx (#6268)
- ci: cache workspace crates and share rust caches across all workflows @desmondcheongzx (#6264)
- ci: cache workspace crate artifacts in integration build @desmondcheongzx (#6261)
- ci: share rust cache across branches and restore file mtimes @desmondcheongzx (#6246)
- ci: bump all timeouts from 30 -> 45 @universalmind303 (#6250)
- ci: increase integration-test-build timeout from 45 to 90 minutes @desmondcheongzx (#6244)
- ci: increase integration-test-build timeout from 30 to 45 minutes @desmondcheongzx (#6241)
- ci: increase integration-test-sql timeout from 30 to 45 minutes @desmondcheongzx (#6237)
- ci: Reduce xdist workers on macOS to fix Ray actor timeout @desmondcheongzx (#6210)
- ci: make repository secrets optional for CI workflows @jeevb (#6199)
- ci: Improve CI reliability via disk space reclamation and coverage optimization @desmondcheongzx (#6205)
🔧 Maintenance
- chore: Update codeowners @colin-ho (#6183)
- chore: Adding Issue requirement for PRs, Updating maintainers @madvart (#6196)
- chore(observability): Split dashboard cli into separate start / stop subcommands @srilman (#6234)
- chore(deps): Resolve Dependabot security alerts @desmondcheongzx (#6226)
- chore: address review feedback from #6208 @desmondcheongzx (#6225)
- chore(deps): Bump dependency group with conflict resolution @desmondcheongzx (#6208)
- chore: Replace Bun with Node / NPM @srilman (#6202)
- chore(arrow2): simplify deprecation markers @universalmind303 (#6218)
- chore: add .values method for utf8Array @universalmind303 (#6216)
- chore: add mypy-boto3-glue to aws optional dependencies @Killua7163 (#6080)
Full Changelog: v0.7.3...v0.7.4