chore(migration): Migrate code from googleapis/python-bigquery-dataframes into packages/bigframes #16493
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly: - [ ] Make sure to open an issue as a [bug/issue](https://github.com/googleapis/python-bigquery-dataframes/issues/new/choose) before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea - [ ] Ensure the tests and linter pass - [ ] Code coverage does not decrease (if any source code was changed) - [ ] Appropriate docs were updated (if necessary) Fixes #<issue_number_goes_here> 🦕
…when `connection_id` is not present (#2272) Fixes #460856043 🦕
…2258) Previously, when the total number of rows (`row_count`) was unknown (e.g., due to deferred computation or errors), it would incorrectly default to 0. This resulted in a confusing UI, such as displaying "Page 1 of 0", and allowed users to navigate to empty pages without automatically returning to valid data.

Current display strategy for the interactive table widget:

* When `row_count` is a positive number (e.g., 50):
  * Total Rows Display: Shows the exact count, like 50 total rows.
  * Pagination Display: Shows the page relative to the total rows, like Page 1 of 50.
  * Navigation: The "Next" button is disabled only on the final page.
* When `row_count` is `None` (unknown):
  * Total Rows Display: Shows Total rows unknown.
  * Pagination Display: Shows the page relative to an unknown total, like Page 1 of many.
  * Navigation: The "Next" button is always enabled, allowing you to page forward until the backend determines there is no more data.

Fixes #<428238610> 🦕
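The display strategy described in this commit can be sketched in Python roughly as follows. This is a minimal illustration, not the widget's actual code: the function name `pagination_labels`, its parameters, and the returned dictionary keys are all hypothetical. It also assumes the "of N" figure is a page count derived from a page size, whereas the example above writes the total as the row count.

```python
from typing import Optional


def pagination_labels(page: int, page_size: int, row_count: Optional[int]) -> dict:
    """Build display strings for a zero-based page index (hypothetical helper)."""
    if row_count is None:
        # Unknown total: never claim a page count, and keep "Next" enabled.
        return {
            "total_rows": "Total rows unknown",
            "page": f"Page {page + 1} of many",
            "next_enabled": True,
        }

    # Known total: ceiling division, with at least one page (assumption).
    total_pages = max(1, -(-row_count // page_size))
    return {
        "total_rows": f"{row_count} total rows",
        "page": f"Page {page + 1} of {total_pages}",
        # "Next" is disabled only on the final page.
        "next_enabled": page + 1 < total_pages,
    }
```

Keeping `None` distinct from `0` is the point of the change: a default of 0 is what produced "Page 1 of 0".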
Fixes internal issue 445774480 🦕

---------

Co-authored-by: Shenyang Cai <sycai@users.noreply.github.com>
Fixes b/447388852 🦕
Co-authored-by: Shenyang Cai <sycai@users.noreply.github.com>
#2287) This pull request addresses a pagination display bug in the `anywidget` table where a small DataFrame (e.g., 5 rows) would incorrectly show "Page 1 of 5" instead of "Page 1 of 1".

* **Fixed `table_widget.js` pagination logic:** Corrected the JavaScript to accurately calculate total pages, ensuring "Page 1 of 1" is displayed for datasets smaller than the page size.
* **Added comprehensive system test:** Enhanced `test_anywidget.py` by improving the `test_widget_with_few_rows_should_have_only_one_page` test. This test now explicitly asserts the correct `row_count` and verifies that page navigation is correctly clamped to the first page, thus confirming the backend conditions for the "Page 1 of 1" frontend display.

Fixes #<issue_number_goes_here> 🦕
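The page-count calculation and clamping described here might look like the following Python sketch. The real fix lives in `table_widget.js`. The helper names `total_pages` and `clamp_page` are hypothetical, and treating an empty table as one page is an assumption.

```python
def total_pages(row_count: int, page_size: int) -> int:
    """Number of pages needed for row_count rows (at least one)."""
    # Integer ceiling division; an empty table still shows a single page.
    return max(1, (row_count + page_size - 1) // page_size)


def clamp_page(requested_page: int, row_count: int, page_size: int) -> int:
    """Clamp a zero-based page index into the valid range."""
    last_page = total_pages(row_count, page_size) - 1
    return min(max(requested_page, 0), last_page)
```

With 5 rows and a page size of 10, `total_pages` yields 1 rather than 5, and any requested page is clamped back to the first page.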
Fixes internal issue 445774480 🦕
Fixes b/447388852 🦕
…#2289) Fixes internal issue 445774480 🦕
…2293) Also:

- include link to `bigframes.bigquery.ai` in README
- add partial ordering mode recommendation to starter sample
- remove 2.0 warning

Towards b/454350869 🦕
…2255) This PR introduces single-column sorting functionality to the interactive table widget.

1) **Three-State Sorting UI**
   1.1) The sort indicator dot (●) is now hidden by default and only appears when the user hovers the mouse over a column header.
   1.2) Implemented a sorting cycle: unsorted (●) → ascending (▲) → descending (▼) → unsorted (●).
   1.3) Visual indicators (●, ▲, ▼) are displayed in column headers to reflect the current sort state.
   1.4) Sorting controls are now only enabled for columns with orderable data types.

2) **Tests**
   2.1) Updated `paginated_pandas_df` fixture for better sorting test coverage.
   2.2) Added new system tests to verify ascending, descending, and multi-column sorting.

3) **Frontend Unit Tests**
   JavaScript-level unit tests have been added to validate the widget's frontend logic, specifically the new sorting functionality and UI interactions.

**How to Run Frontend Unit Tests**: To execute these tests from the project root directory:

```bash
cd tests/js
npm install  # Only needed if dependencies haven't been installed or have changed
npm test
```

Docs have been updated to document the new features. The main description now mentions column sorting and adjustable widths, and a new section has been added to explain how to use the column resizing feature. The sorting section was also updated to mention that the indicators are only visible on hover.

Fixes #<459835971> 🦕

---------

Co-authored-by: Tim Sweña (Swast) <swast@google.com>
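The three-state cycle from item 1.2 can be modelled as a small state machine. The sketch below is in Python for illustration only. The widget's own logic is JavaScript, and the names `SORT_CYCLE`, `INDICATORS`, and `next_sort_state` are hypothetical.

```python
# Order of states visited on successive clicks of a column header.
SORT_CYCLE = ["unsorted", "ascending", "descending"]

# Indicator shown in the header for each state.
INDICATORS = {"unsorted": "●", "ascending": "▲", "descending": "▼"}


def next_sort_state(state: str) -> str:
    """Return the state that follows `state`, wrapping back to unsorted."""
    index = SORT_CYCLE.index(state)
    return SORT_CYCLE[(index + 1) % len(SORT_CYCLE)]
```

Three clicks on the same header return the column to the unsorted state.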
Fixes b/447388852 🦕
This change aims to fix the `test_timestamp_series_diff_agg` test failing in #2248. Fixes internal issue 417774347 🦕
This change aims to fix some string-related tests failing in #2248. Fixes internal issue 417774347 🦕
The default maximum instances for cloud functions is 100, not 0. Updated the `expected_max_instances` in the `parametrize` decorator to 100 for the 'no-set' and 'set-None' test cases to accurately reflect the runtime behavior.

Fixes b/465212379 🦕
See instructions at https://pydata-sphinx-theme.readthedocs.io/en/latest/user_guide/analytics.html#google-analytics

Co-authored-by: Shuowei Li <shuowei@google.com>
…es (#2533)
Updates documentation and internal comments to use the term "ObjectRef column" instead of "Blob column", as per the official BigQuery documentation. Links to the documentation are included in user-facing docstrings. --- *PR created automatically by Jules for task [15739234298342142432](https://jules.google.com/task/15739234298342142432) started by @tswast* Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com> Co-authored-by: tswast <247555+tswast@users.noreply.github.com>
This will be the reference notebook to be used by the tech blog on AI functions in BigFrames
Fixes #<496320476> 🦕
SQL generator output fallbacks (SELECT 1 placeholder). Fixes #<452681068> 🦕
Fixes internal issue 497970577 🦕
…rames/main' into migration.python-bigquery-dataframes.migration.2026-03-31_18-48-47.migrate
Code Review
This pull request introduces the foundational codebase for the bigframes package, including core execution nodes, BigQuery operation compilers, and extensive CI/CD configurations. The review feedback identifies a bug in the exponentiation logic for negative bases and highlights opportunities to improve behavioral parity with Pandas by returning null instead of zero for division and modulo by zero operations. Additionally, a suggested fix addresses a potential infinite loop in the repository root detection script used during environment setup.
I am having trouble creating individual review comments. Click here to see my feedback.
packages/bigframes/bigframes/core/compile/ibis_compiler/scalar_op_registry.py (1548)
The odd_exponent calculation for negative bases is incorrect for negative odd integers. In BigQuery (and many other SQL dialects), the MOD operator returns a negative value for negative inputs (e.g., MOD(-3, 2) returns -1). As a result, the condition == _ibis_num(1) will fail for negative odd exponents, leading to an incorrect sign in the final result when overflow_cond is true. Using the absolute value of the exponent or checking for a non-zero remainder would fix this.
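To make the sign issue concrete, here is a small Python sketch that emulates a remainder taking the sign of the dividend, which is the behavior the comment describes for `MOD`. Python's own `%` operator rounds toward negative infinity, so it is not used directly. The helper names are hypothetical and this is not the compiler's code.

```python
def truncated_mod(x: int, y: int) -> int:
    """Remainder whose sign follows the dividend x (emulates the described MOD)."""
    remainder = abs(x) % abs(y)
    return -remainder if x < 0 else remainder


def is_odd_naive(exponent: int) -> bool:
    # Mirrors a check of the form `MOD(exponent, 2) == 1`.
    return truncated_mod(exponent, 2) == 1


def is_odd_with_abs(exponent: int) -> bool:
    # Mirrors the suggested fix: take the absolute value before the remainder.
    return truncated_mod(abs(exponent), 2) == 1
```

Under these assumptions, `is_odd_naive(-3)` is false because the remainder is -1, while `is_odd_with_abs(-3)` is true.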
odd_exponent = (x_val < _ZERO) & (y_val.cast(ibis_dtypes.int64).abs() % _ibis_num(2) == _ibis_num(1))
packages/bigframes/.kokoro/trampoline_v2.sh (232-238)
The repo_root function can enter an infinite loop if it is executed in a directory that is not part of a git repository. This happens because dirname "/" returns /, so the while loop condition [[ ! -d "${dir}/.git" ]] will never become false if /.git does not exist. Adding a check for the root directory or a maximum depth would make the script more robust.
function repo_root() {
    local dir="$1"
    while [[ ! -d "${dir}/.git" && "$dir" != "/" ]]; do
        dir="$(dirname "$dir")"
    done
    if [[ ! -d "${dir}/.git" ]]; then
        echo "Error: Could not find .git directory in any parent of $1" >&2
        exit 1
    fi
    echo "${dir}"
}
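For comparison, the same bounded upward search can be sketched in Python, where iterating over `Path.parents` ends at the filesystem root and so cannot loop forever. The function name `find_repo_root` is hypothetical, and returning `None` instead of exiting is a design choice of this sketch, not part of the suggested shell fix.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Return the nearest ancestor of `start` containing a .git directory."""
    current = start.resolve()
    # `parents` is a finite sequence ending at the root, so the loop terminates.
    for candidate in [current, *current.parents]:
        if (candidate / ".git").is_dir():
            return candidate
    # No .git directory found anywhere up to the filesystem root.
    return None
```

A caller can then decide how to report the missing repository, for example by printing an error and exiting with a non-zero status as the shell version does.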
packages/bigframes/bigframes/core/compile/ibis_compiler/scalar_op_registry.py (1634)
In floordiv_op, integer division by zero currently returns 0 (via _ZERO * x_numeric at line 1637). This is inconsistent with Pandas behavior for nullable integers (Int64), where division by zero should result in a null/NA value. Returning 0 can lead to silent errors in data processing. It is recommended to return null for the integer case.
zero_result = _INF if (x.type().is_floating() or y.type().is_floating()) else ibis.null().cast(x.type())
packages/bigframes/bigframes/core/compile/ibis_compiler/scalar_op_registry.py (1741-1744)
In _int_mod, the modulo operation with a zero divisor returns 0 (via _ZERO * x). To maintain consistency with Pandas and avoid mathematically incorrect results, this should return null instead. Returning 0 hides the division-by-zero error and produces an incorrect value.
.when(
    y == _ZERO,
    ibis.null().cast(x.type()),
)  # Return NULL for division by zero to match pandas behavior
…on.2026-03-31_18-48-47.migrate
See #15999.
This PR should be merged with a merge-commit, not a squash-commit, in order to preserve the git history.