ENH: Add global option io.nullable_type="pandas"|"pyarrow" to control IO reader use_nullable_dtype #48957

Closed
@mroeschke

xref: #29752, #36712

From the above issues and read_parquet, it appears that generally a use_nullable_dtypes: bool option will be added to read_* functions to allow users to opt into pandas nullable types.

Additionally, in 1.5 with ArrowDtype, the nullable type returned could be backed by pyarrow instead of pandas' implementation. This could be advantageous for read_*(engine="pyarrow") capable readers, where an option like io.nullable_type="pyarrow" would just preserve the pyarrow object from the pyarrow parsing function instead of converting it to a numpy object.

The proposal is to add a new global option io.nullable_type="pandas"|"pyarrow" (default="pandas") such that when calling pd.read_*(..., use_nullable_dtypes=True), the nullable backing type is dictated by the global io.nullable_type setting.
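
To make the intended semantics concrete, here is a minimal, pandas-free sketch of how the lookup might behave. Everything in it is hypothetical: `_options`, `set_nullable_type`, and `resolve_backing` are illustrative stand-ins, not existing pandas APIs, and the real option registry and readers would look different.

```python
# Hypothetical sketch: none of these names are real pandas APIs.
_VALID_NULLABLE_TYPES = ("pandas", "pyarrow")

# Stand-in for a global options registry; "pandas" is the proposed default.
_options = {"io.nullable_type": "pandas"}


def set_nullable_type(value: str) -> None:
    """Set the (hypothetical) global io.nullable_type option."""
    if value not in _VALID_NULLABLE_TYPES:
        raise ValueError(
            f"io.nullable_type must be one of {_VALID_NULLABLE_TYPES}, got {value!r}"
        )
    _options["io.nullable_type"] = value


def resolve_backing(use_nullable_dtypes: bool = False) -> str:
    """Return which array implementation a reader would hand back."""
    if not use_nullable_dtypes:
        # Opt-in not requested: the global option has no effect.
        return "numpy"
    # Opt-in requested: the global option picks the nullable backing type.
    return _options["io.nullable_type"]


print(resolve_backing())                          # numpy
print(resolve_backing(use_nullable_dtypes=True))  # pandas
set_nullable_type("pyarrow")
print(resolve_backing(use_nullable_dtypes=True))  # pyarrow
```

The point of the sketch is that the global option only matters once a reader call opts in. Without the opt-in, the existing numpy-backed behavior is unchanged.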

Labels: Arrow, Enhancement, IO Data, NA - MaskedArrays