SparseArray not in arrays module - inconsistent with IntegerArray, StringArray, etc. #30642

@Dr-Irv

Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd

In [2]: pd.__version__
Out[2]: '0.26.0.dev0+1563.g1feefc692'

In [3]: pd.SparseArray
Out[3]: pandas.core.arrays.sparse.array.SparseArray

In [4]: pd.arrays.SparseArray
Out[4]: pandas.core.arrays.sparse.array.SparseArray

In [5]: pd.arrays.IntegerArray
Out[5]: pandas.core.arrays.integer.IntegerArray

In [6]: pd.IntegerArray
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-6-12476104dd13> in <module>
----> 1 pd.IntegerArray

C:\Code\pandas_dev\pandas\pandas\__init__.py in __getattr__(name)
    246             return type(name, (), {})
    247
--> 248         raise AttributeError(f"module 'pandas' has no attribute '{name}'")
    249
    250

AttributeError: module 'pandas' has no attribute 'IntegerArray'

In [7]: pd.arrays.StringArray
Out[7]: pandas.core.arrays.string_.StringArray

In [8]: pd.StringArray
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-8-86553ff3c48c> in <module>
----> 1 pd.StringArray

C:\Code\pandas_dev\pandas\pandas\__init__.py in __getattr__(name)
    246             return type(name, (), {})
    247
--> 248         raise AttributeError(f"module 'pandas' has no attribute '{name}'")
    249
    250

AttributeError: module 'pandas' has no attribute 'StringArray'

Problem description

I discovered this while working on #30628. The docs for SparseArray are at the top level (https://dev.pandas.io/docs/reference/api/pandas.SparseArray.html), while the docs for IntegerArray (https://dev.pandas.io/docs/reference/api/pandas.arrays.IntegerArray.html), StringArray (https://dev.pandas.io/docs/reference/api/pandas.arrays.StringArray.html), etc. are at the pandas.arrays level.

In the code, SparseArray is available at both levels, but IntegerArray, StringArray, etc. are available only at the pandas.arrays level.
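
The asymmetry can be checked programmatically. The following is only a minimal sketch: `exposure_report` is a hypothetical helper, not part of pandas, and `fake_pandas` is a stand-in module built to mimic the layout shown in the session above, so the snippet runs without pandas installed.

```python
import types

ARRAY_NAMES = ["SparseArray", "IntegerArray", "StringArray"]


def exposure_report(top, sub, names):
    """Map each name to a (found_in_top, found_in_sub) pair of booleans."""
    return {name: (hasattr(top, name), hasattr(sub, name)) for name in names}


# Stand-in modules (hypothetical, not real pandas): every class lives in
# `arrays`, and only SparseArray is also re-exported at the top level.
fake_arrays = types.ModuleType("fake_pandas.arrays")
for name in ARRAY_NAMES:
    setattr(fake_arrays, name, type(name, (), {}))

fake_pandas = types.ModuleType("fake_pandas")
fake_pandas.arrays = fake_arrays
fake_pandas.SparseArray = fake_arrays.SparseArray

report = exposure_report(fake_pandas, fake_pandas.arrays, ARRAY_NAMES)
for name, (at_top, at_sub) in report.items():
    print(f"{name}: top-level={at_top}, arrays={at_sub}")
```

With pandas installed, calling `exposure_report(pd, pd.arrays, ARRAY_NAMES)` should run the same check against the real package; the result will depend on the pandas version.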

Expected Output

Unsure.

It seems that this should be consistent. Options are:

  1. Put all *Array classes at the top level, and document them that way (i.e., use the pattern currently used for SparseArray). That would involve code and documentation changes for all of the arrays except SparseArray.
  2. Put all *Array classes at both levels (like SparseArray), but document them at the pandas.arrays level (like IntegerArray and StringArray). That would involve code changes for all of the arrays except SparseArray, and doc changes for SparseArray.
  3. Put all *Array classes only at the pandas.arrays level and document them all that way. That would involve changing code and docs only for SparseArray and leaving the others alone.
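
For option 3, one possible migration path is to keep the top-level name working for a while but emit a warning. This is an assumption on my part, not something decided here. The traceback above shows that pandas already defines a module-level `__getattr__`, so a rough sketch might look like the following. The `fake_pandas` package, the `_module_getattr` function, and the warning text are all hypothetical.

```python
import types
import warnings

# Hypothetical stand-in package; the names mirror this issue, but this is
# not pandas code.
arrays = types.ModuleType("fake_pandas.arrays")
arrays.SparseArray = type("SparseArray", (), {})

pkg = types.ModuleType("fake_pandas")
pkg.arrays = arrays


def _module_getattr(name):
    """PEP 562 hook: called only when normal attribute lookup fails."""
    if name == "SparseArray":
        warnings.warn(
            "fake_pandas.SparseArray is deprecated; "
            "use fake_pandas.arrays.SparseArray instead",
            FutureWarning,
            stacklevel=2,
        )
        return arrays.SparseArray
    raise AttributeError(f"module 'fake_pandas' has no attribute '{name}'")


# Storing the hook in the module namespace enables it (Python 3.7+).
pkg.__getattr__ = _module_getattr

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cls = pkg.SparseArray  # still resolves, but records a FutureWarning
```

The old spelling keeps returning the same class, so existing user code would not break immediately, while `fake_pandas.arrays.SparseArray` becomes the single documented location.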

It's not clear to me which is preferred.

Output of pd.show_versions()

INSTALLED VERSIONS

commit : 1feefc6
python : 3.7.3.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 158 Stepping 13, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None

pandas : 0.26.0.dev0+1563.g1feefc692
numpy : 1.17.4
pytz : 2019.3
dateutil : 2.8.1
pip : 19.3.1
setuptools : 42.0.2.post20191203
Cython : 0.29.14
pytest : 5.3.2
hypothesis : 4.54.2
sphinx : 2.3.0
blosc : None
feather : None
xlsxwriter : 1.2.6
lxml.etree : 4.4.2
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.10.2
pandas_datareader: None
bs4 : 4.8.1
bottleneck : 1.3.1
fastparquet : None
gcsfs : None
lxml.etree : 4.4.2
matplotlib : 3.1.1
numexpr : 2.7.0
odfpy : None
openpyxl : 3.0.2
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.3.2
s3fs : None
scipy : 1.3.2
sqlalchemy : 1.3.11
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.6
numba : 0.46.0
