Closed
Description
As with normal reductions (e.g. #30982), we should investigate adding masked-array-specific support to the groupby algorithms.
Currently, a nullable extension array is converted to a numpy array (e.g. integers with missing values are typically cast to float with NaN) before being passed to the cython algorithm.
Having support for passing a mask to the cython algos can improve the groupby support for nullable dtypes.
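To make the motivation concrete, here is a minimal pure-Python sketch of the two routes for a grouped sum. The helper names `group_sum_masked` and `group_sum_via_float` are hypothetical and are not the actual pandas cython kernels. The sketch also assumes that a group with no valid observations comes back masked.

```python
def group_sum_masked(values, mask, labels, ngroups):
    """Sum integers per group, skipping entries flagged in `mask`.

    Hypothetical helper for illustration only. Returns (out, out_mask);
    a group is masked in the result if it had no valid observations
    (an assumption of this sketch).
    """
    out = [0] * ngroups
    nobs = [0] * ngroups
    for val, is_na, lab in zip(values, mask, labels):
        if lab < 0 or is_na:  # negative label: row belongs to no group
            continue
        out[lab] += val
        nobs[lab] += 1
    out_mask = [n == 0 for n in nobs]
    return out, out_mask


def group_sum_via_float(values, mask, labels, ngroups):
    """Same sum, but after casting to float with NaN for missing values."""
    nan = float("nan")
    as_float = [nan if is_na else float(v) for v, is_na in zip(values, mask)]
    out = [0.0] * ngroups
    for val, lab in zip(as_float, labels):
        if lab < 0 or val != val:  # val != val is True only for NaN
            continue
        out[lab] += val
    return out


values = [2**53 + 1, 0, 1, 5]
mask = [False, True, False, True]  # True marks a missing value
labels = [0, 0, 0, 1]

exact, exact_mask = group_sum_masked(values, mask, labels, ngroups=2)
lossy = group_sum_via_float(values, mask, labels, ngroups=2)
```

In this example the masked route keeps the integer result exact (`2**53 + 2` for group 0), while the float route rounds it to `2**53`, because integers above `2**53` are not all representable in float64. The masked route can also report the all-missing group 1 as missing without a NaN sentinel.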
- any/all (correctly pass mask + add Kleene logic) -> ENH/BUG: Use Kleene logic for groupby any/all #40819
- cummin/cummax -> PERF/BUG: use masked algo in groupby cummin and cummax #40651
- cumsum/cumprod -> ENH: Support mask in GroupBy.cumsum #48070, ENH: Support mask in groupby cumprod #48138
- sum/prod/var/mean
- min/max -> BUG: Groupby min/max with nullable dtypes #42567
- median
- ohlc (ENH: Add support for groupby.ohlc for ea dtypes #48081)
- quantile (correctly pass mask)
- last/nth (last: PERF: support mask in group_last #46107)
- rank (ENH: support mask in libalgos.rank #46932)
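For the any/all item, the following pure-Python sketch shows what Kleene logic means once a mask is available. The function `group_any_all_kleene` is a hypothetical name for illustration, not the pandas implementation, and `None` stands in for `pd.NA` in the result.

```python
def group_any_all_kleene(values, mask, labels, ngroups, test="any"):
    """Grouped any/all under Kleene (three-valued) logic.

    Hypothetical helper for illustration only. A valid True decides
    `any`, a valid False decides `all`. If nothing decisive is seen but
    the group contains a missing value, the result is unknown (None).
    """
    decisive = test == "any"        # the value that settles the answer
    out = [not decisive] * ngroups  # answer when nothing decisive is seen
    out_mask = [False] * ngroups
    decided = [False] * ngroups
    for val, is_na, lab in zip(values, mask, labels):
        if lab < 0 or decided[lab]:
            continue
        if is_na:
            out_mask[lab] = True    # unknown unless decided later
        elif bool(val) == decisive:
            out[lab] = decisive
            out_mask[lab] = False   # a decisive value overrides any NA
            decided[lab] = True
    return [None if m else o for o, m in zip(out, out_mask)]


values = [True, False, False, False, True, True]
mask = [False, True, False, True, False, False]  # True marks a missing value
labels = [0, 0, 1, 1, 2, 2]

any_result = group_any_all_kleene(values, mask, labels, 3, test="any")
all_result = group_any_all_kleene(values, mask, labels, 3, test="all")
```

Group 0 holds `[True, NA]`, so `any` is `True` but `all` is unknown. Group 1 holds `[False, NA]`, so `all` is `False` but `any` is unknown. This corresponds to treating missing values as unknown rather than skipping them. Without a mask the kernel cannot distinguish these cases from ordinary values.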