Skip to content

BUG: can't use .loc with boolean values in MultiIndex #47687

Closed
@kalekundert

Description

@kalekundert

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

>>> df = pd.DataFrame({'a': [True, True, False, False], 'b': [True, False, True, False], 'c': [1,2,3,4]})
>>> df
       a      b  c
0   True   True  1
1   True  False  2
2  False   True  3
3  False  False  4
>>> df.set_index(['a']).loc[True]
          b  c
a
True   True  1
True  False  2
>>> df.set_index(['a', 'b']).loc[True]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kale/hacking/forks/pandas/pandas/core/indexing.py", line 1071, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/home/kale/hacking/forks/pandas/pandas/core/indexing.py", line 1306, in _getitem_axis
    self._validate_key(key, axis)
  File "/home/kale/hacking/forks/pandas/pandas/core/indexing.py", line 1116, in _validate_key
    raise KeyError(
KeyError: 'True: boolean label can not be used without a boolean index'

Issue Description

The above snippet creates a data frame with two boolean columns. When just one of those columns is set as the index, .loc[] can select rows based on a boolean value. When both columns are set as the index, though, the same value causes .loc[] to raise a KeyError. The error message makes it seems like pandas doesn't regard a multi-index as a "boolean index", even if the columns making up the index are all boolean.

In pandas 1.4.3, I can work around this by indexing with 1 and 0 instead of True and False. But in 1.5.0 it seems to be an error to provide a integer value to a boolean index.

Expected Behavior

I would expect to be able to use boolean index values even when those values are part of a multi-index:

>>> df.set_index(['a', 'b']).loc[True]
       c
b
True   1
False  2

Installed Versions

INSTALLED VERSIONS ------------------ commit : 74f4e81 python : 3.10.0.final.0 python-bits : 64 OS : Linux OS-release : 5.18.10-arch1-1 Version : #1 SMP PREEMPT_DYNAMIC Thu, 07 Jul 2022 17:18:13 +0000 machine : x86_64 processor : byteorder : little LC_ALL : None LANG : en_US.UTF-8 LOCALE : en_US.UTF-8

pandas : 1.5.0.dev0+1134.g74f4e81ca8
numpy : 1.22.0
pytz : 2021.3
dateutil : 2.8.2
setuptools : 57.4.0
pip : 22.1.2
Cython : 0.29.30
pytest : 7.1.1
hypothesis : 6.39.4
sphinx : 4.3.2
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.8.0
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.0.3
IPython : 8.0.1
pandas_datareader: None
bs4 : 4.10.0
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.5.1
numba : None
numexpr : None
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.8.0
snappy : None
sqlalchemy : 1.4.31
tables : None
tabulate : 0.8.9
xarray : None
xlrd : None
xlwt : None
zstandard : None

Metadata

Metadata

Assignees

No one assigned

    Labels

    BugIndexingRelated to indexing on series/frames, not to indexes themselvesMultiIndex

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions