DOC: document dropna kwarg of pd.factorize #35667
Comments
I think it's the other way around: Doc updates are certainly welcome! |
Thank you for the hint. It indeed works:
It seems to be the case that the During testing I also found that the kwarg does not seem to be available on the Series API:
|
I agree the name can be a bit confusing:
So the "dropping" is about dropping NA from the uniques, not from the original values / codes. I suppose we mainly added this for internal usage to implement groupby's |
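To make the distinction concrete, here is a minimal pure-Python sketch of the semantics described above. It is not pandas' implementation: `factorize_sketch` and `is_na` are hypothetical helpers, and the behaviour with `dropna=False` (NA appended as the last unique and given its own code) is an assumption made for illustration.

```python
import math


def is_na(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def factorize_sketch(values, na_sentinel=-1, dropna=True):
    """Hypothetical stand-in for the behaviour discussed in this thread.

    Returns (codes, uniques). The codes always have one entry per input
    value; `dropna` only controls whether NA appears in `uniques`.
    """
    uniques = []
    positions = {}  # non-missing value -> its index in `uniques`
    codes = []
    saw_na = False
    for value in values:
        if is_na(value):
            saw_na = True
            codes.append(na_sentinel)
            continue
        if value not in positions:
            positions[value] = len(uniques)
            uniques.append(value)
        codes.append(positions[value])

    if not dropna and saw_na:
        # Keep NA as a unique and point the missing entries at it.
        na_code = len(uniques)
        uniques.append(float("nan"))
        codes = [na_code if is_na(v) else c for v, c in zip(values, codes)]
    return codes, uniques


values = [1.0, 2.0, 1.0, float("nan")]
print(factorize_sketch(values))                # ([0, 1, 0, -1], [1.0, 2.0])
print(factorize_sketch(values, dropna=False))  # ([0, 1, 0, 2], [1.0, 2.0, nan])
```

In both calls the codes keep all four positions; only the uniques differ, which is why the keyword name can be read the wrong way.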
This issue is no longer an issue, as the confusion has been cleared by @jorisvandenbossche and #35667 should be closed now, right? Or am I misunderstanding something? |
The keyword still needs to be documented (and potentially its name discussed) |
emm, indeed, the I have been very busy recently, so I could try to wrap up a PR this weekend to add a docstring (and change the naming if needed) and add support for the Series case. But if anyone has time and is up for a PR, feel free to do so! |
does documentation include adding examples? Edit: |
Is this one still active? |
Since the keyword is not documented, could one assume that it is not yet part of the public API? If so, could we incorporate the functionality using the
on master
The
|
It actually is "working"; it's just that the False is cast to the dtype of the codes, which is int:
In [25]: pd.factorize(np.array([1, 2, 1, np.nan]), na_sentinel=False)  # interpreted as 0
Out[25]: (array([0, 1, 0, 0]), array([1., 2.]))
In [26]: pd.factorize(np.array([1, 2, 1, np.nan]), na_sentinel=True)  # interpreted as 1
Out[26]: (array([0, 1, 0, 1]), array([1., 2.]))
Now, since the codes can only be int, we can of course special case booleans. |
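One way such special-casing could look is sketched below. This is only an illustration under assumptions: `validate_na_sentinel` is a hypothetical helper, not a pandas function, and accepting plain integers only is a choice made for this sketch. The point is that `bool` is a subclass of `int` in Python, so a boolean has to be rejected before the integer check.

```python
def validate_na_sentinel(na_sentinel):
    """Hypothetical guard: accept plain ints, reject booleans explicitly."""
    # bool must be checked first, because isinstance(False, int) is True.
    if isinstance(na_sentinel, bool):
        raise TypeError("na_sentinel must be an integer, not a bool")
    if not isinstance(na_sentinel, int):
        raise TypeError("na_sentinel must be an integer")
    return na_sentinel


# Without a guard, booleans silently behave like the integers 0 and 1.
print(int(False), int(True))     # 0 1
print(validate_na_sentinel(-1))  # -1
```

With a guard like this, passing `False` would fail loudly instead of marking missing values with code 0, which collides with the code of the first unique.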
Hey guys |
Location of the documentation
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.factorize.html
Documentation problem
The docs show the existence of a kwarg "dropna" which does not exist
Suggested fix for documentation
Delete the kwarg "dropna"