Description
df.stack(level=[0,1], dropna=False) appears to be equivalent to df.stack(level=0, dropna=False).stack(level=0, dropna=False). While this makes a certain amount of sense, it results in additional rows that in many cases are probably not expected/desired. In particular, df.stack(level=[0,1], dropna=False) is not equivalent to df.stack(level=[0,1], dropna=True) even when df contains no missing values, which seems counterintuitive.
I think that when stacking multiple levels, one may want them stacked in one go, rather than sequentially -- so that when there are no missing values, df.stack(level=[0,1], dropna=False) would produce the same result as df.stack(level=[0,1], dropna=True).
Here is an example of current behavior. Since df has no missing values, I would want [6] to produce the same results as [5], not [7].
Python 3.4.2 (v3.4.2:ab2c023a9432, Oct 6 2014, 22:16:31) [MSC v.1600 64 bit (AMD64)]
IPython 2.3.1 -- An enhanced Interactive Python.
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: df = pd.DataFrame(np.zeros((2,3)), columns=pd.MultiIndex.from_tuples([('A','x'), ('A','y'), ('B','z')], names=['Upper', 'Lower']))
In [4]: df
Out[4]:
Upper  A     B
Lower  x  y  z
0      0  0  0
1      0  0  0
In [5]: df.stack(level=[0,1], dropna=True)
Out[5]:
   Upper  Lower
0  A      x        0
          y        0
   B      z        0
1  A      x        0
          y        0
   B      z        0
dtype: float64
In [6]: df.stack(level=[0,1], dropna=False)
Out[6]:
   Upper  Lower
0  A      x          0
          y          0
          z        NaN
   B      x        NaN
          y        NaN
          z          0
1  A      x          0
          y          0
          z        NaN
   B      x        NaN
          y        NaN
          z          0
dtype: float64
In [7]: df.stack(level=0, dropna=False).stack(level=0, dropna=False)
Out[7]:
   Upper  Lower
0  A      x          0
          y          0
          z        NaN
   B      x        NaN
          y        NaN
          z          0
1  A      x          0
          y          0
          z        NaN
   B      x        NaN
          y        NaN
          z          0
dtype: float64
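To make the "stacked in one go" behavior I have in mind concrete, here is a minimal sketch. The function stack_existing_columns is a hypothetical helper of my own, not a pandas API, and it assumes a non-empty frame with MultiIndex columns. It emits one row per (row label, existing column) pair, so it keeps missing values that are really in the data but never invents rows for column combinations such as ('A', 'z') that df does not have.

```python
import numpy as np
import pandas as pd


def stack_existing_columns(frame):
    """Stack every column level at once (hypothetical helper, not a pandas API).

    Only (row, column) pairs that exist in `frame` are emitted, so no rows are
    invented for column combinations that are absent from frame.columns.
    Values that are already NaN in `frame` are kept.
    Assumes a non-empty frame whose columns are a MultiIndex.
    """
    n_rows, n_cols = frame.shape
    # Each row label repeats once per column; the column tuples cycle per row.
    row_labels = np.repeat(frame.index.to_numpy(), n_cols)
    col_tuples = list(frame.columns) * n_rows
    index = pd.MultiIndex.from_tuples(
        [(row,) + tuple(col) for row, col in zip(row_labels, col_tuples)],
        names=[frame.index.name] + list(frame.columns.names),
    )
    # Row-major flattening matches the (row, column) order built above.
    return pd.Series(frame.to_numpy().ravel(), index=index)


# Same frame as in the transcript, plus one genuinely missing value.
df = pd.DataFrame(
    np.zeros((2, 3)),
    columns=pd.MultiIndex.from_tuples(
        [("A", "x"), ("A", "y"), ("B", "z")], names=["Upper", "Lower"]
    ),
)
df.iloc[0, 0] = np.nan  # this NaN should survive; no other NaN should appear

result = stack_existing_columns(df)
print(result)
```

With the frame above, the result has six rows (two row labels times three existing columns), and the only NaN is the one that was set explicitly. This is only meant to illustrate the semantics I would expect from dropna=False; it is not a proposal for how stack should be implemented.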
In [13]: pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.4.2.final.0
python-bits: 64
OS: Windows
OS-release: 8
machine: AMD64
processor: Intel64 Family 6 Model 62 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
pandas: 0.15.1
nose: 1.3.4
Cython: 0.21.1
numpy: 1.9.1
scipy: 0.14.0
statsmodels: 0.6.0
IPython: 2.3.1
sphinx: None
patsy: 0.3.0
dateutil: 2.2
pytz: 2014.9
bottleneck: 0.8.0
tables: 3.1.1
numexpr: 2.4
matplotlib: 1.4.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: 0.6.3
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: 0.9.8
pymysql: None
psycopg2: None