[Python] Best strategy for dealing with incomplete lines of data from a file.
- by adoran
I use the following block of code to read lines out of a file 'f' into a nested list:
for data in f:
clean_data = data.rstrip()
data = clean_data.split('\t')
t += [data[0]]
strmat += [data[1:]]
Sometimes, however, the data is incomplete and a row may look like this:
['955.159', '62.8168', '', '', '', '', '', '', '', '', '', '', '', '', '', '29', '30', '0', '0']
It puts a spanner in the works because I would like Python to implicitly cast my list as floats but the empty fields '' cause it to be cast as an array of strings (dtype: s12).
I could start a second 'if' statement and convert all empty fields into NULL (since 0 is wrong in this instance) but I was unsure whether this was best.
Is this the best strategy of dealing with incomplete data?
Should I edit the stream or do it post-hoc?