ENH: Change default behavior of rolling.count to be consistent with others #31302

Closed
@fujiaxiang

Description

Following the discussion in #30923, we may want to change the default behavior of rolling.count with respect to its min_periods parameter, so that it is consistent with similar APIs such as rolling.mean and rolling.sum.

Code Sample

With the updates from #30923

>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series([1, 1, 1, np.nan, 1, 1, 1])
>>> s
0    1.0
1    1.0
2    1.0
3    NaN
4    1.0
5    1.0
6    1.0
dtype: float64

# rolling.mean and rolling.sum default min_periods to the window size (3 in this case)
# note that this requires not only at least 3 entries in the window, but also at least 3 valid (non-NaN) entries
>>> s.rolling(3).mean()
0    NaN
1    NaN
2    1.0
3    NaN
4    NaN
5    NaN
6    1.0
dtype: float64

>>> s.rolling(3).sum()
0    NaN
1    NaN
2    3.0
3    NaN
4    NaN
5    NaN
6    3.0
dtype: float64

# the default value of min_periods for rolling.count is 0
# we may want to change this behavior so it's consistent with other APIs
>>> s.rolling(3).count()
0    1.0
1    2.0
2    3.0
3    2.0
4    2.0
5    2.0
6    3.0
dtype: float64

# note that rolling.count only requires the window to contain at least min_periods entries to give a result
# the number of valid (non-NaN) entries plays no part in deciding whether it outputs NaN
# we should retain this behavior because the function is meant to count the valid entries
>>> s.rolling(3, min_periods=3).count()
0    NaN
1    NaN
2    3.0
3    2.0
4    2.0
5    2.0
6    3.0
dtype: float64
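
To make the two interpretations of min_periods concrete, here is a small pure-Python model of the behavior shown above. It is only an illustrative sketch, not how pandas implements rolling windows; the helper names `rolling_window`, `rolling_count` and `rolling_sum` are made up for this example.

```python
import math


def rolling_window(values, end, window):
    """Return the (possibly truncated) trailing window ending at index `end`."""
    start = max(0, end - window + 1)
    return values[start:end + 1]


def rolling_count(values, window, min_periods=0):
    """Count non-NaN entries; min_periods is compared with the window length."""
    out = []
    for i in range(len(values)):
        win = rolling_window(values, i, window)
        if len(win) < min_periods:
            # Not enough entries in the window yet, regardless of NaNs.
            out.append(math.nan)
        else:
            out.append(float(sum(not math.isnan(v) for v in win)))
    return out


def rolling_sum(values, window, min_periods=None):
    """Sum non-NaN entries; min_periods is compared with the number of valid entries."""
    if min_periods is None:
        min_periods = window  # the default used by rolling.sum / rolling.mean
    out = []
    for i in range(len(values)):
        valid = [v for v in rolling_window(values, i, window) if not math.isnan(v)]
        out.append(float(sum(valid)) if len(valid) >= min_periods else math.nan)
    return out


data = [1.0, 1.0, 1.0, math.nan, 1.0, 1.0, 1.0]
print(rolling_count(data, 3))                 # current default (min_periods=0)
print(rolling_count(data, 3, min_periods=3))  # proposed default
print(rolling_sum(data, 3))
```

In this model the NaN at index 3 never turns a count into NaN, while it does invalidate every sum whose window contains it. That is the distinction the proposal wants to keep while changing only the default.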

Expected Output

>>> s.rolling(3).count()
0    NaN
1    NaN
2    3.0
3    2.0
4    2.0
5    2.0
6    3.0
dtype: float64

Problem description

With the updates from #30923, the min_periods argument of rolling.count is now respected (it used to be ignored completely). However, its default value remains 0 for backward compatibility. In a future release we probably want to change this default so that it is consistent with other similar APIs.

@mroeschke previously mentioned that we would need to start with a DeprecationWarning to inform users of the upcoming change, and then probably make the actual change in the following release.
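
A rough sketch of what the deprecation step could look like, using only the standard library. The helper name `resolve_count_min_periods`, the message wording, and the use of `None` as the "not specified" sentinel are assumptions for illustration, not the actual pandas code.

```python
import warnings


def resolve_count_min_periods(min_periods, window):
    """Hypothetical helper: pick the effective min_periods for rolling.count
    during the deprecation period."""
    if min_periods is None:
        # The caller relied on the default: keep the old behavior (0) for now,
        # but announce that the default will become the window size.
        warnings.warn(
            "min_periods=None for rolling.count currently means 0; it will "
            f"default to the window size ({window}) in a future version. "
            "Pass min_periods=0 explicitly to keep the current behavior.",
            DeprecationWarning,
            stacklevel=2,
        )
        return 0
    # An explicit value is always respected and never warns.
    return min_periods


print(resolve_count_min_periods(3, window=3))
print(resolve_count_min_periods(0, window=3))
```

Once the deprecation period ends, the `None` branch would return `window` instead of 0 and the warning would be dropped.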

@jreback @WillAyd
Let me know what you guys think!


Labels: API - Consistency (Internal Consistency of API/Behavior), Deprecate (Functionality to remove in pandas), Window (rolling, ewma, expanding)
