Parsing a Multi-Index Excel File in Pandas
Posted by rhaskett on Stack Overflow
Published on 2014-06-10T17:09:50Z
I have a time series Excel file with a three-level column MultiIndex that I would like to parse. There are some answers on Stack Overflow on how to do this for a row index, but not for the columns, and the parse function has a header parameter that does not seem to accept a list of rows.
The Excel file looks like the following:
- Column A holds all the time series dates, starting at A4
- Column B has top_level1 (B1), mid_level1 (B2), low_level1 (B3), data (B4-B100+)
- Column C has null (C1), null (C2), low_level2 (C3), data (C4-C100+)
- Column D has null (D1), mid_level2 (D2), low_level1 (D3), data (D4-D100+)
- Column E has null (E1), null (E2), low_level2 (E3), data (E4-E100+)
- ...
So there are two low_level values, many mid_level values, and a few top_level values. The trick is that a top- or mid-level cell is null whenever its value is the same as the nearest non-null cell to its left. So, for instance, all the columns above would have top_level1 as the top MultiIndex value.
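To make the fill rule concrete, here is a minimal pure-Python sketch of it. The function name fill_header_levels and the in-memory header_rows list are hypothetical; the rows simply mirror cells B1:E3 as described above, with None standing in for the null cells.

```python
def fill_header_levels(header_rows):
    """Forward-fill blank cells in each header row, then zip the rows
    into one tuple per column (top, mid, low)."""
    filled = []
    for row in header_rows:
        last = None
        out = []
        for cell in row:
            # A null cell inherits the nearest non-null value to its left.
            if cell is not None:
                last = cell
            out.append(last)
        filled.append(out)
    # Transpose: one tuple of level values per spreadsheet column.
    return list(zip(*filled))


# Hypothetical stand-in for cells B1:E3 of the workbook.
header_rows = [
    ["top_level1", None, None, None],
    ["mid_level1", None, "mid_level2", None],
    ["low_level1", "low_level2", "low_level1", "low_level2"],
]
column_tuples = fill_header_levels(header_rows)
```

Tuples of this shape are what pandas.MultiIndex.from_tuples expects, so the result could be used as the column labels of the parsed data.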
My best idea so far is to use transpose, but then it fills Unnamed: # everywhere and doesn't seem to work. In pandas 0.13, read_csv seems to have a header parameter that can take a list, but this doesn't seem to work with parse.
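One possible workaround, sketched under assumptions rather than offered as a confirmed fix: read the sheet without any header handling, then build the column MultiIndex by hand. The frame named raw below is a hypothetical stand-in for what a header-less read of the sheet might return (three header rows followed by data, with the dates in the first column); the values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the sheet read with no header row applied.
raw = pd.DataFrame([
    [None, "top_level1", None, None, None],
    [None, "mid_level1", None, "mid_level2", None],
    [None, "low_level1", "low_level2", "low_level1", "low_level2"],
    ["2014-01-01", 1.0, 2.0, 3.0, 4.0],
    ["2014-01-02", 5.0, 6.0, 7.0, 8.0],
])

# Header block: first three rows, skipping the date column.
# Forward-fill along each row so nulls take the value to their left.
header = raw.iloc[:3, 1:].ffill(axis=1)

# Data block: remaining rows, indexed by the dates in the first column.
data = raw.iloc[3:, 1:].astype(float)
data.index = pd.to_datetime(raw.iloc[3:, 0].tolist())
data.columns = pd.MultiIndex.from_arrays(header.values.tolist())
```

With the columns rebuilt this way, a single series can be selected by its full key, for example data[("top_level1", "mid_level2", "low_level1")].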
© Stack Overflow or respective owner