ENH/API: resolution inference in vectorized datetime parsing #55564

I'm in the process of implementing resolution inference for vectorized datetime parsing in array_to_datetime. This issue is to track and discuss design issues.

- Should we implement this as a breaking change in 3.0 or as a bugfix in 2.x? We changed the scalar behavior in 2.0.
- What should we do when mixed resolutions are detected, e.g. pd.to_datetime(["2016-01-01", "2016-01-01T02:03:04.050607"])? ATM in my branch I take the resolution from the first non-NaT entry and apply it everywhere.
  i) Are we OK with that value-dependent behavior?
  ii) What if we see np.datetime64("nat", "s"), i.e. a NaT with a resolution attached? Should we infer "s" from that?
- What resolution should we infer for "today" or "now"? ATM the Timestamp constructor gives "ns", but because it goes through the stdlib datetime.now, I'm thinking "us" might be more appropriate. (I also don't care that much.)
- ATM I only handle array_to_datetime; array_strptime also needs handling.
- We need to do the same for the timedelta64 paths.
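
To make the mixed-resolution question concrete, here is a rough stdlib-only sketch. It is not the code in the branch. The helpers infer_unit, unit_from_first and unit_from_finest are hypothetical names, and the per-string rule (no fractional part gives "s", up to 3 digits "ms", up to 6 "us", more "ns") is an assumption about how a single ISO-8601-like string would be classified. It compares the "first non-NaT entry" policy with a "finest unit seen" alternative.

```python
import re
from datetime import datetime, timedelta

# Units ordered from coarsest to finest.
UNITS = ["s", "ms", "us", "ns"]

# Fractional-seconds part of an ISO-8601-like string, e.g. ".050607".
_FRACTION = re.compile(r"\.(\d+)")


def infer_unit(value):
    """Hypothetical helper: guess a unit for one ISO-8601-like string.

    Returns None for missing values (None or the string "NaT").
    """
    if value is None or value == "NaT":
        return None
    match = _FRACTION.search(value)
    if match is None:
        return "s"  # assumption: no fractional seconds means "s"
    digits = len(match.group(1))
    if digits <= 3:
        return "ms"
    if digits <= 6:
        return "us"
    return "ns"


def unit_from_first(values):
    """Policy A: use the unit of the first non-missing entry."""
    for value in values:
        unit = infer_unit(value)
        if unit is not None:
            return unit
    return None  # all entries missing; the fallback is left open here


def unit_from_finest(values):
    """Policy B (alternative): use the finest unit seen in any entry."""
    units = [u for u in map(infer_unit, values) if u is not None]
    return max(units, key=UNITS.index) if units else None


mixed = ["2016-01-01", "2016-01-01T02:03:04.050607"]

print(unit_from_first(mixed))         # s
print(unit_from_first(mixed[::-1]))   # us (same values, different order)
print(unit_from_finest(mixed))        # us
print(unit_from_finest(mixed[::-1]))  # us

# The stdlib datetime type has microsecond resolution, which is the
# motivation for considering "us" for "now"/"today".
print(datetime.resolution == timedelta(microseconds=1))  # True
```

Under policy A the result depends on element order, which is the value-dependence asked about in (i). Policy B is order-independent but needs a full pass over the data before the output unit is known. Neither helper decides what a unit-carrying NaT should contribute, as raised in (ii).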

ATM 1455 tests are failing locally. Hopefully most of these fail only because they hard-code ns in "expected".

cc @MarcoGorelli

Issues that I think implementing this will address
