Hi guys,
I am trying to read in an Excel file using xlrd, and I am wondering if there is a way to ignore the cell formatting used in Excel file, and just import all data as text?
Here is the code I am using for far:
import xlrd
xls_file = 'xltest.xls'
xls_workbook = xlrd.open_workbook(xls_file)
xls_sheet = xls_workbook.sheet_by_index(0)
raw_data = [['']*xls_sheet.ncols for _ in range(xls_sheet.nrows)]
raw_str = ''
feild_delim = ','
text_delim = '"'
for rnum in range(xls_sheet.nrows):
for cnum in range(xls_sheet.ncols):
raw_data[rnum][cnum] = str(xls_sheet.cell(rnum,cnum).value)
for rnum in range(len(raw_data)):
for cnum in range(len(raw_data[rnum])):
if (cnum == len(raw_data[rnum]) - 1):
feild_delim = '\n'
else:
feild_delim = ','
raw_str += text_delim + raw_data[rnum][cnum] + text_delim + feild_delim
final_csv = open('FINAL.csv', 'w')
final_csv.write(raw_str)
final_csv.close()
This code is functional, but there are certain fields, such as a zip code, that are imported as numbers, so they have the decimal zero suffix. For example, is there is a zip code of '79854' in the Excel file, it will be imported as '79854.0'.
I have tried finding a solution in this xlrd spec, but was unsuccessful.