How to read a file with variable multi-row data in Python
- by dr.bunsen
I have a file that is about 100Mb that looks like this:
#meta data 1
skadjflaskdjfasljdfalskdjfl
sdkfjhasdlkgjhsdlkjghlaskdj
asdhfk
#meta data 2
jflaksdjflaksjdflkjasdlfjas
ldaksjflkdsajlkdfj
#meta data 3
alsdkjflasdjkfglalaskdjf
This file contains one row of meta data that corresponds to several, variable length data containing only alpha-numeric characters. What is the best way to read this data into a simple list like this:
data = [[#meta data 1, skadjflaskdjfasljdfalskdjflsdkfjhasdlkgjhsdlkjghlaskdjasdhfk],
[#meta data 2, jflaksdjflaksjdflkjasdlfjasldaksjflkdsajlkdfj],
[#meta data 3, alsdkjflasdjkfglalaskdjf]]
My initial idea was to use the read() method to read the whole file into memory and then use regular expressions to parse the data into the desired format. Is there a better more pythonic way? All metadata lines start with an octothorpe and all data lines are all alpha-numeric. Thanks!