Description
Code Sample, a copy-pastable example if possible
In [1]: import pandas as pd
In [2]: pd.__version__
Out[2]: '0.26.0.dev0+1563.g1feefc692'
In [3]: pd.SparseArray
Out[3]: pandas.core.arrays.sparse.array.SparseArray
In [4]: pd.arrays.SparseArray
Out[4]: pandas.core.arrays.sparse.array.SparseArray
In [5]: pd.arrays.IntegerArray
Out[5]: pandas.core.arrays.integer.IntegerArray
In [6]: pd.IntegerArray
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-6-12476104dd13> in <module>
----> 1 pd.IntegerArray
C:\Code\pandas_dev\pandas\pandas\__init__.py in __getattr__(name)
246 return type(name, (), {})
247
--> 248 raise AttributeError(f"module 'pandas' has no attribute '{name}'")
249
250
AttributeError: module 'pandas' has no attribute 'IntegerArray'
In [7]: pd.arrays.StringArray
Out[7]: pandas.core.arrays.string_.StringArray
In [8]: pd.StringArray
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-8-86553ff3c48c> in <module>
----> 1 pd.StringArray
C:\Code\pandas_dev\pandas\pandas\__init__.py in __getattr__(name)
246 return type(name, (), {})
247
--> 248 raise AttributeError(f"module 'pandas' has no attribute '{name}'")
249
250
AttributeError: module 'pandas' has no attribute 'StringArray'
Problem description
I discovered this while working on #30628. The docs for SparseArray are at the top level (https://dev.pandas.io/docs/reference/api/pandas.SparseArray.html), while the docs for IntegerArray (https://dev.pandas.io/docs/reference/api/pandas.arrays.IntegerArray.html), StringArray (https://dev.pandas.io/docs/reference/api/pandas.arrays.StringArray.html), etc. are at the pandas.arrays level.
In the code, SparseArray is available at both levels, but IntegerArray, StringArray, etc. are only available at the pandas.arrays level.
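The check done by hand in the session above can also be scripted. This is only a minimal sketch: the helper names exposure and report_pandas are made up for illustration and are not part of pandas, and the result depends on the installed pandas version.

```python
import importlib


def exposure(top_name, sub_name, class_names):
    """Map each name to (found on top module, found on submodule)."""
    top = importlib.import_module(top_name)
    sub = importlib.import_module(sub_name)
    return {
        name: (hasattr(top, name), hasattr(sub, name))
        for name in class_names
    }


def report_pandas():
    """Print where the three classes discussed above are exposed."""
    names = ["SparseArray", "IntegerArray", "StringArray"]
    table = exposure("pandas", "pandas.arrays", names)
    for name, (at_top, in_arrays) in table.items():
        print(f"{name:<13} pd.{name}: {at_top!s:<5}  pd.arrays.{name}: {in_arrays}")
```

On the build used in the session above, calling report_pandas() should report SparseArray at both levels and the other two only under pandas.arrays. Other pandas versions may give a different table.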
Expected Output
Unsure.
It seems that this should be consistent. Options are:
- Put all *Array classes at the top level, and document them that way (i.e., use the pattern currently used for SparseArray). That would involve code and documentation changes for all of the arrays except SparseArray.
- Put all *Array classes at both levels (like SparseArray), but document them at the pandas.arrays level (like IntegerArray and StringArray). That would involve code changes for all of the arrays, and doc changes for SparseArray.
- Put all *Array classes only at the pandas.arrays level and document them all that way. That would involve only changing code and docs for SparseArray and leaving the others alone.
It's not clear to me which is preferred.
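If the third option were chosen, one way to phase out the top-level name without breaking users immediately would be a module-level __getattr__ (PEP 562), the same hook that appears in the traceback above. The following is a rough sketch on a stand-in module, not pandas' actual implementation: fakepkg, _MOVED and _module_getattr are hypothetical names, and the choice of FutureWarning is an assumption.

```python
import types
import warnings


class SparseArray:
    """Stand-in for a real array class (hypothetical)."""


# Build a tiny fake package with an "arrays" submodule.
arrays = types.ModuleType("fakepkg.arrays")
arrays.SparseArray = SparseArray

fakepkg = types.ModuleType("fakepkg")
fakepkg.arrays = arrays

# Names that used to live at the top level and now live only in fakepkg.arrays.
_MOVED = {"SparseArray"}


def _module_getattr(name):
    """Called only when normal attribute lookup on fakepkg fails."""
    if name in _MOVED:
        warnings.warn(
            f"fakepkg.{name} is deprecated; use fakepkg.arrays.{name} instead",
            FutureWarning,
            stacklevel=2,
        )
        return getattr(arrays, name)
    raise AttributeError(f"module 'fakepkg' has no attribute '{name}'")


# PEP 562: a __getattr__ entry in the module namespace handles missing names.
fakepkg.__getattr__ = _module_getattr
```

With this in place, fakepkg.SparseArray still resolves to the class but emits a warning, while fakepkg.IntegerArray raises AttributeError as in the session above.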
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 1feefc6
python : 3.7.3.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 158 Stepping 13, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None
pandas : 0.26.0.dev0+1563.g1feefc692
numpy : 1.17.4
pytz : 2019.3
dateutil : 2.8.1
pip : 19.3.1
setuptools : 42.0.2.post20191203
Cython : 0.29.14
pytest : 5.3.2
hypothesis : 4.54.2
sphinx : 2.3.0
blosc : None
feather : None
xlsxwriter : 1.2.6
lxml.etree : 4.4.2
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.10.2
pandas_datareader: None
bs4 : 4.8.1
bottleneck : 1.3.1
fastparquet : None
gcsfs : None
lxml.etree : 4.4.2
matplotlib : 3.1.1
numexpr : 2.7.0
odfpy : None
openpyxl : 3.0.2
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.3.2
s3fs : None
scipy : 1.3.2
sqlalchemy : 1.3.11
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.6
numba : 0.46.0