-
Notifications
You must be signed in to change notification settings - Fork 116
feat: dataframe string formatter #1170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: dataframe string formatter #1170
Conversation
a9b9679
to
f04b973
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
Left some suggestions.
python/datafusion/dataframe.py
Outdated
@staticmethod | ||
def default_str_repr( | ||
batches: list[pa.RecordBatch], | ||
schema: pa.Schema, | ||
has_more: bool, | ||
table_uuid: str | None = None, | ||
) -> str: | ||
"""Return the default string representation of a DataFrame. | ||
|
||
This method is used by the default formatter and implemented in Rust for | ||
performance reasons. | ||
""" | ||
return DataFrameInternal.default_str_repr(batches, schema, has_more, table_uuid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default_str_repr is at the bottom of the class.
It might be clearer to place it alongside other repr/formatter methods (repr, repr_html) for cohesion.
@classmethod | ||
def is_styles_loaded(cls) -> bool: | ||
"""Check if HTML styles have been loaded in the current session. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought of a better idea to implement this
#1171
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent! Maybe we open a separate PR for this?
python/datafusion/html_formatter.py
Outdated
"The module 'html_formatter' is deprecated and will be removed in the next release." | ||
"Please use 'dataframe_formatter' instead.", | ||
DeprecationWarning, | ||
stacklevel=2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using stacklevel=2 may not point users to their import site. Consider stacklevel=3 or more, so the warning originates where they imported it.
Which issue does this PR close?
None
Rationale for this change
We recently added a html formatter, which is useful for generating html tables as a customer. This adds the similar feature to
__repr__
to allow for text based formatting.What changes are included in this PR?
Add method to formatter to format a dataframe as a string.
Renamed html_formatter to dataframe_formatter.
Are there any user-facing changes?
Yes, the
html_formatter
module has been renamed todataframe_formatter
. A deprecation warning exists if any users try the old method. This was only released a couple of weeks ago so I doubt there has been time for adoption yet.