Parsing a Multi-Index Excel File in Pandas

Posted by rhaskett on Stack Overflow
Published on 2014-06-10T17:09:50Z, indexed on 2014/06/11 3:25 UTC

I have a time series Excel file with a three-level column MultiIndex that I would like to parse. There are some answers on Stack Overflow about doing this for the row index, but not for the columns, and the parse function's header argument does not seem to accept a list of rows.

The Excel file looks like the following:

  • Column A is all the time series dates starting on A4
  • Column B has top_level1 (B1) mid_level1 (B2) low_level1 (B3) data (B4-B100+)
  • Column C has null (C1) null (C2) low_level2 (C3) data (C4-C100+)
  • Column D has null (D1) mid_level2 (D2) low_level1 (D3) data (D4-D100+)
  • Column E has null (E1) null (E2) low_level2 (E3) data (E4-E100+)
  • ...

So there are two low_level values, many mid_level values, and a few top_level values. The trick is that the blank (null) top and mid level cells are assumed to take the value of the nearest filled cell to their left. For instance, all the columns above would have top_level1 as the top MultiIndex value.
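To make the intended header logic concrete, here is a minimal sketch in plain Python (no pandas) of the "blank means same as the cell to the left" rule. The function name fill_header_levels and the sample rows are only illustrative, with None standing in for an empty cell.

```python
def fill_header_levels(header_rows):
    """Forward-fill blank cells left to right within each header row,
    then combine the rows into one (top, mid, low) tuple per column."""
    filled = []
    for row in header_rows:
        last = None
        out = []
        for cell in row:
            if cell is not None:
                last = cell  # remember the most recent non-blank label
            out.append(last)
        filled.append(out)
    # zip(*rows) turns three header rows into one tuple per column
    return list(zip(*filled))


# Illustrative header rows for columns B..E (rows 1-3 of the sheet)
header_rows = [
    ["top_level1", None, None, None],
    ["mid_level1", None, "mid_level2", None],
    ["low_level1", "low_level2", "low_level1", "low_level2"],
]
columns = fill_header_levels(header_rows)
```

With the layout described above, column C would come out as (top_level1, mid_level1, low_level2) and column D as (top_level1, mid_level2, low_level1). Note that this simple rule assumes every new top_level also starts a new mid_level in the same column; otherwise a mid_level value would leak across the boundary.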

My best idea so far is to transpose the sheet, but then it fills in Unnamed: # everywhere and doesn't seem to work. In pandas 0.13, read_csv has a header parameter that can take a list, but this doesn't seem to work with parse.
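One possible workaround, sketched here as an assumption rather than a confirmed answer, is to read the sheet with no header at all, forward-fill the three header rows across the columns, and build the MultiIndex by hand. The file name timeseries.xlsx, the level names, and the in-memory frame standing in for the sheet are all hypothetical. More recent pandas releases also document read_excel's header argument as accepting a list of row positions, although blank cells may then still show up as Unnamed labels.

```python
import numpy as np
import pandas as pd

# Stand-in for something like (hypothetical file name):
#   raw = pd.read_excel("timeseries.xlsx", header=None)
raw = pd.DataFrame([
    [np.nan, "top_level1", np.nan, np.nan, np.nan],
    [np.nan, "mid_level1", np.nan, "mid_level2", np.nan],
    [np.nan, "low_level1", "low_level2", "low_level1", "low_level2"],
    ["2014-01-01", 1.0, 2.0, 3.0, 4.0],
    ["2014-01-02", 5.0, 6.0, 7.0, 8.0],
])

# Rows 0-2 hold the header levels; fill blanks from the cell to the left
header = raw.iloc[:3, 1:].ffill(axis=1)

# Remaining rows are the data; the first column holds the dates
data = raw.iloc[3:, 1:].astype(float)
data.index = pd.to_datetime(raw.iloc[3:, 0].tolist())
data.index.name = "date"

# One array per level, in top/mid/low order (level names are illustrative)
data.columns = pd.MultiIndex.from_arrays(
    header.values.tolist(), names=["top", "mid", "low"]
)
```

After this, data["top_level1"]["mid_level2"] selects the two low_level columns under that branch, and data.xs("low_level1", axis=1, level="low") pulls every low_level1 series across all branches.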

© Stack Overflow or respective owner