BUG: pd.Categorical(categorical, dtype="category") loses orderedness #49309

Closed
@jbrockmendel

Description
import pandas as pd

cat = pd.Categorical([3, 1], categories=[3, 2, 1], ordered=True)

res = pd.Categorical(cat, dtype="category")

>>> res.dtype
CategoricalDtype(categories=[3, 2, 1], ordered=False)
>>> cat.dtype
CategoricalDtype(categories=[3, 2, 1], ordered=True)

By contrast, cat.astype("category") does retain orderedness.
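A minimal sketch of the contrast described above, placing the astype call next to the constructor call. The variable names (via_astype, via_explicit, via_string) are illustrative and not part of the report. Passing cat.dtype explicitly is shown only as one way to avoid relying on what the constructor infers from the bare string "category"; it is not a fix proposed in the issue.

```python
import pandas as pd

cat = pd.Categorical([3, 1], categories=[3, 2, 1], ordered=True)

# astype with the bare string "category" keeps the existing dtype, per the report.
via_astype = cat.astype("category")
print(via_astype.dtype.ordered)

# Passing the full dtype object leaves nothing for the constructor to infer.
via_explicit = pd.Categorical(cat, dtype=cat.dtype)
print(via_explicit.dtype.ordered)

# The call from the report. What it prints depends on the installed pandas
# version, so no expected value is given here.
via_string = pd.Categorical(cat, dtype="category")
print(via_string.dtype.ordered)
```

The last print is the one to check on the pandas version at hand: if it differs from the first two, the constructor is dropping orderedness as reported.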

Found while refactoring sanitize_array. We effectively have two code paths:

if isinstance(data, ExtensionArray):
    if dtype is not None:
        return data.astype(dtype, copy=copy)
    elif copy:
        return data.copy()
    return data

if isinstance(dtype, ExtensionDtype):
    cls = dtype.construct_array_type()
    return cls._from_sequence(data, dtype=dtype, copy=copy)

Switching the order of these checks causes a few Categorical tests to break because of the Categorical constructor behavior.
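The two branches can be exercised side by side on the same input to see why their order matters. This is a rough sketch, not the pandas implementation: via_astype_path and via_from_sequence_path are hypothetical helper names, _from_sequence is a private pandas method, and resolving the string with pandas_dtype is an assumption about how the dtype reaches these branches.

```python
import pandas as pd
from pandas.api.extensions import ExtensionArray, ExtensionDtype
from pandas.api.types import pandas_dtype


def via_astype_path(data, dtype):
    # First branch: data is already an ExtensionArray, so it is cast in place.
    return data.astype(dtype)


def via_from_sequence_path(data, dtype):
    # Second branch: the dtype's array class rebuilds the result from the data.
    cls = dtype.construct_array_type()
    return cls._from_sequence(data, dtype=dtype)  # private pandas method


cat = pd.Categorical([3, 1], categories=[3, 2, 1], ordered=True)
dtype = pandas_dtype("category")  # assumption: the string is resolved first

# Both conditions from the issue hold for this input, so whichever check
# runs first decides which branch handles it.
print(isinstance(cat, ExtensionArray), isinstance(dtype, ExtensionDtype))

res_astype = via_astype_path(cat, dtype)
res_from_sequence = via_from_sequence_path(cat, dtype)

# Compare orderedness; the values depend on the installed pandas version.
print(res_astype.dtype.ordered, res_from_sequence.dtype.ordered)
```

If the two printed orderedness values differ, reordering the checks changes the result for a Categorical input, which is consistent with the test failures mentioned above.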

Labels

Bug, Categorical (Categorical Data Type), Constructors (Series/DataFrame/Index/pd.array Constructors)