Add warning to .groupby when null keys would be dropped due to default dropna #61351
_NULL_KEY_MESSAGE = (
    "`dropna` is not specified but grouper encountered null group keys. These keys "
    "will be dropped from the result by default. To keep null keys, set `dropna=False`, "
    "or to hide this warning and drop null keys, set `dropna=True`."
)
Is this a standard approach for a warning message that could be hit from two lines of code?
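For context, the behaviour under discussion can be sketched without pandas. This is only an illustration under stated assumptions: `_NO_DEFAULT` is a hypothetical stand-in for `lib.no_default`, `group_rows` is a made-up helper, `None` plays the role of a null key, and `UserWarning` is an arbitrary choice of category, not necessarily the one the PR uses.

```python
import warnings

# Hypothetical sentinel standing in for pandas' `lib.no_default`.
_NO_DEFAULT = object()


def group_rows(rows, dropna=_NO_DEFAULT):
    """Group (key, value) pairs by key; None plays the role of a null key."""
    explicit = dropna is not _NO_DEFAULT
    # When the caller did not specify dropna, fall back to dropping null keys.
    drop = dropna if explicit else True

    groups = {}
    saw_null = False
    for key, value in rows:
        if key is None:
            saw_null = True
            if drop:
                continue  # null key is excluded from the result
        groups.setdefault(key, []).append(value)

    # Warn only when null keys were hit AND the caller relied on the default.
    if saw_null and not explicit:
        warnings.warn(
            "null group keys were dropped; pass dropna explicitly",
            UserWarning,  # assumed category, for illustration only
            stacklevel=2,
        )
    return groups


rows = [("a", 1), (None, 2), ("a", 3)]
```

Calling `group_rows(rows)` drops the `None` key and warns, while `group_rows(rows, dropna=True)` and `group_rows(rows, dropna=False)` are silent and respectively drop or keep it.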
@property
def dropna(self) -> bool:
    if self._dropna is lib.no_default:
        return True
    return self._dropna
I know the implementation is trivial, but this is redundant with Grouper. I'm not sure we can get around it while still being a class property, but should the default value be referenced as a constant defined just once?
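One way the "constant defined just once" idea might look is sketched below. All names here (`_NO_DEFAULT`, `_DROPNA_DEFAULT`, `_resolve_dropna`, `ToyGrouper`, `ToyGrouping`) are hypothetical and are not taken from the PR or from pandas; the sketch only shows two classes keeping their own `dropna` property while sharing a single default.

```python
# Hypothetical stand-in for pandas' `lib.no_default` sentinel.
_NO_DEFAULT = object()

# The default lives in exactly one place; both classes below refer to it.
_DROPNA_DEFAULT = True


def _resolve_dropna(value):
    """Return the shared default when the caller did not specify dropna."""
    return _DROPNA_DEFAULT if value is _NO_DEFAULT else value


class ToyGrouper:
    def __init__(self, dropna=_NO_DEFAULT):
        self._dropna = dropna

    @property
    def dropna(self) -> bool:
        return _resolve_dropna(self._dropna)

    @property
    def dropna_specified(self) -> bool:
        # Lets callers decide whether a warning about the default is warranted.
        return self._dropna is not _NO_DEFAULT


class ToyGrouping:
    def __init__(self, dropna=_NO_DEFAULT):
        self._dropna = dropna

    @property
    def dropna(self) -> bool:
        return _resolve_dropna(self._dropna)
```

Each class still exposes `dropna` as a property, but changing the default later would mean editing `_DROPNA_DEFAULT` only.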
This will help with PDEP-11 (pandas-dev#53094) as an intermediate step to identify tests that will fail under the default value.
DataFrame.groupby drops NA keys #61339
doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
TODO:
codes check approaches (codes.min() was about 3x faster).
pivot_table/.stack/etc. (possibly in a follow-up PR?)
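The two kinds of `codes` check mentioned in the TODO can be sketched as follows, assuming null group keys are encoded as -1 in the integer codes (the convention used by `pandas.factorize`). Plain Python lists stand in for NumPy arrays and the helper names are hypothetical, so this sketch says nothing about the relative speed reported above.

```python
# Assumed convention: -1 marks a null group key in the codes.
NULL_CODE = -1


def has_null_key_min(codes):
    """Detect null keys via the minimum code (mirrors a `codes.min() < 0` check)."""
    return len(codes) > 0 and min(codes) < 0


def has_null_key_any(codes):
    """Detect null keys via an elementwise comparison against the null code."""
    return any(code == NULL_CODE for code in codes)


codes_with_null = [0, 1, -1, 0]
codes_without_null = [0, 1, 2, 0]
```

Both helpers agree on these inputs; the empty-input guard in `has_null_key_min` is needed because `min` raises on an empty sequence.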