import pandas as pd

dti = pd.date_range("2016-01-01", periods=3)
ser = pd.Series(dti)
ser[0] = pd.NaT
dense = ser._values
sparse = pd.core.arrays.SparseArray(ser.values)
>>> dense.astype("int64")
array([-9223372036854775808, 1451692800000000000, 1451779200000000000])
>>> sparse.astype("int64")
[...]
ValueError: Cannot convert NaT values to integer
>>> sparse.astype("Sparse[int64]")
[0, 1451692800000000000, 1451779200000000000]
Fill: 0
IntIndex
Indices: array([1, 2], dtype=int32)
The dense version goes through DatetimeArray.astype, for which .astype to int64 is basically a view (xref #45034). The Sparse version goes through astype_nansafe, which specifically checks for NaTs when going from dt64->int64. I expected this to match the non-sparse behavior.
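The difference between the two paths can be illustrated with plain NumPy. This is a minimal sketch, not pandas' actual implementation: `checked_dt64_to_int64` is a hypothetical helper that imitates a NaT-checking cast, while `.view("int64")` imitates the view-like dense path.

```python
import numpy as np

values = np.array(["NaT", "2016-01-02", "2016-01-03"], dtype="datetime64[ns]")

# View-like conversion: NaT is reinterpreted as its int64 sentinel,
# which is the minimum int64 value.
as_view = values.view("int64")


def checked_dt64_to_int64(arr):
    """Hypothetical NaT-checking cast (illustration only, not pandas' code)."""
    if np.isnat(arr).any():
        raise ValueError("Cannot convert NaT values to integer")
    return arr.view("int64")


# The checked cast refuses the same input that the view accepts.
try:
    checked_dt64_to_int64(values)
    raised = False
except ValueError:
    raised = True
```

Under these assumptions, `as_view` reproduces the dense output above, and `raised` ends up `True`, mirroring the `ValueError` from the sparse path.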
When converting to Sparse[int64], we only call astype_nansafe on the non-NaT elements, so it doesn't raise. But when converting the fill_value from NaT it somehow gets 0, whereas I expected that to raise.
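A rough NumPy sketch of why the stored values convert cleanly while the fill value is the part that needs care. It assumes a simplified model of a sparse array (a list of stored values plus a separate fill value). `convert_fill_value` is a hypothetical helper showing the strict behavior expected here, not what pandas currently does.

```python
import numpy as np

values = np.array(["NaT", "2016-01-02", "2016-01-03"], dtype="datetime64[ns]")
fill_value = np.datetime64("NaT", "ns")

# Only the non-fill (non-NaT) entries are stored, so a NaT check on them passes.
mask = ~np.isnat(values)
indices = np.flatnonzero(mask)
stored_as_int = values[mask].view("int64")


def convert_fill_value(fill):
    """Hypothetical strict fill_value conversion (illustration only)."""
    if np.isnat(fill):
        raise ValueError("Cannot convert NaT fill_value to integer")
    return int(fill.astype("int64"))


# The fill value is converted separately; a strict check surfaces the NaT here.
try:
    convert_fill_value(fill_value)
    fill_raised = False
except ValueError:
    fill_raised = True
```

In this model the stored positions are `[1, 2]`, matching the `Indices` shown above, and the strict fill-value conversion raises instead of silently producing 0.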
Side-notes:
ser.astype(pd.SparseDtype(ser.dtype)) raises, as does dense.astype("Sparse[int64]").