Closed
Description
- [y] I have checked that this issue has not already been reported.
- [y] I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
This is a tiny example:
import numpy as np
import pandas as pd

tmp = pd.DataFrame({'id': ['1624477460271-3908654213', '1624477460271-3908654213'], 'dt': [np.nan, pd.to_timedelta('0:0:2')]})
tmp
Out[6]:
                         id              dt
0  1624477460271-3908654213             NaT
1  1624477460271-3908654213 0 days 00:00:02
Problem description
dataframe.groupby('some_column').timedelta.sum() returns a wrong result when the timedelta column contains NaT:
tmp.groupby('id').dt.sum()
Out[7]:
id
1624477460271-3908654213 -106752 days +00:12:45.145224192
Name: dt, dtype: timedelta64[ns]
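The wrong value above does not look random. As a rough check, the sketch below does plain integer arithmetic, assuming NaT is stored as the minimum int64 value (-2**63 nanoseconds). Adding 2 seconds to that sentinel and formatting the result reproduces the reported output, which suggests the grouped sum may be adding NaT as an ordinary integer instead of skipping it. The helper `format_ns` is hypothetical and only mimics the timedelta display format; it is not a pandas function.

```python
# Assumption: NaT is represented by the smallest int64 value, in nanoseconds.
NAT_SENTINEL_NS = -2**63
TWO_SECONDS_NS = 2 * 10**9


def format_ns(total_ns: int) -> str:
    """Hypothetical helper: render nanoseconds as 'D days +HH:MM:SS.nnnnnnnnn'."""
    # Floor division keeps the remainder non-negative, as in the timedelta repr.
    days, rem = divmod(total_ns, 86_400 * 10**9)
    hours, rem = divmod(rem, 3_600 * 10**9)
    minutes, rem = divmod(rem, 60 * 10**9)
    seconds, nanos = divmod(rem, 10**9)
    return f"{days} days +{hours:02d}:{minutes:02d}:{seconds:02d}.{nanos:09d}"


# Treat NaT as a plain integer and add the one valid value (2 seconds).
naive_sum = format_ns(NAT_SENTINEL_NS + TWO_SECONDS_NS)
print(naive_sum)  # -106752 days +00:12:45.145224192
```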
Expected Output
tmp.dt.sum()
Out[8]: Timedelta('0 days 00:00:02')
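Since the plain Series sum skips NaT, one possible workaround (a sketch, not a confirmed fix) is to apply that same Series-level sum to each group. The column is built here with pd.NaT rather than np.nan purely to make the dtype explicit; that choice and the variable name `per_group` are assumptions of this example.

```python
import pandas as pd

tmp = pd.DataFrame({
    'id': ['1624477460271-3908654213', '1624477460271-3908654213'],
    'dt': [pd.NaT, pd.Timedelta(seconds=2)],
})

# Workaround sketch: call Series.sum on each group, which skips NaT,
# instead of relying on the built-in groupby sum.
per_group = tmp.groupby('id')['dt'].apply(lambda s: s.sum())
print(per_group)
```

This goes through a Python-level function call per group, so it is likely slower than the built-in aggregation on large frames.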
Output of pd.show_versions()
[pandas 1.3.0]