Convert Excel File 'xls' to CSV, CAUTION: Bumps Ahead
- by faizanahmad
The task was to provide users with an interface where they can upload the 'csv' files, these files were to be processed and loaded to Database by a Console application. The code in Console application could not handle the 'xls' files so we thought, OK, lets convert 'xls' to 'csv' in the code, Seemed like fun. The idea was to convert it right after uploading within 'csv' file.
As Microsoft does not recommend using the Excel objects in ASP.NET, we decided to use the Jet engine to open xls. (Ace driver is used for xlsx)
The code was pretty straight, can be found on following links:
http://www.c-sharpcorner.com/uploadfile/yuanwang200409/102242008174401pm/1.aspx
http://www.devasp.net/net/articles/display/141.html
FIRST BUMP 'OleDbException (0x80004005): Unspecified error' ( Impersonation ): The ablove code ran fine in my test web site and test console application, but it gave an 'OleDbException (0x80004005): Unspecified error' in main web site, turns out imperonation was set to True and as soon as I changed it to False, it did work. on My XP box, web site was running under user
'ASPNET' with imperosnation set to FALSE
'IUSR_*' i.e IIS guest user with impersonation set to TRUE
The weired part was that both users had same rights on the folders I was saving files to and on Excel app in DCOM Config.
We decided to give it a try on Windows Server 2003 with web site set to windows authentication ( impersonation = true ) and yes it did work.
SECOND BUMP 'External table not in correct format': I got this error with some files and it appeared that the file from client has some metadata issues ( when I opened the file in Excel and try to save it ,excel would give me this error saying File can not be saved in current format ) and the error was caused by that. Some people were able to reslove the error by using "Extended Properties=HTML Import;" in connection string. But it did not work for me.
We decided to detour from here and use Excel object :( as we had no control on client setting the meta deta of Excel files.
Before third bump there were a ouple of small thingies like 'Retrieving the COM class factory for component with CLSID {00024500-0000-0000-C000-000000000046} failed due to the following error: 80070005'
Fix can be found at
http://blog.crowe.co.nz/archive/2006/03/02/589.aspx
THIRD BUMP ( Could not get rid of the EXCEL process ): I has all the code in place to 'Quiet' the excel, but, it just did not work. work around was done to Kill the process as we knew no other application on server was using EXCEL. The normal steps to quite the excel application worked just fine in console application though.
FOURTH BUMP: Code worked with one file 1 on my machine and with the other file 2 code will break. and the same code will work perfectly fine with file 2 on some other machine . We moved it to QA ( Windows Server 2003 )and worked with every file just perfect. But , then there was another problem: one user can upload it and second cant, permissions on folder and DCOM Conifg checked.
Another Detour: Uplooad the xls as it is and convert in Console application.
Lesson Learnt: If its 'xlsx' use 'ACE Driver' or read xml within excel as recommneded by MS. If xls and you know its always going to be properly formatted 'jet Engine'
Code:
Imports Microsoft.Office.Interop
Private Function ConvertFile(ByVal SourceFolder As String, ByVal FileName As String, ByVal FileExtension As String)As Boolean
Dim appExcel As New Excel.Application
Dim workBooks As Excel.Workbooks = appExcel.Workbooks
Dim objWorkbook As Excel.Workbook
Try
objWorkbook = workBooks.Open(CompleteFilePath )
objWorkbook.SaveAs(Filename:=CObj(SourceFolder & FileName & ".csv"), FileFormat:=Excel.XlFileFormat.xlCSV)
Catch ex As Exception
GenerateAlert(ex.Message().Replace("'", "") & " Error Converting File to CSV.")
LogError(ex )
Return False
Finally
If Not(objWorkbook is Nothing) then
objWorkbook.Close(SaveChanges:=CObj(False))
End If
ReleaseObj(objWorkbook)
ReleaseObj(workBooks)
appExcel.Quit()
ReleaseObj(appExcel)
Dim proc As System.Diagnostics.Process
For Each proc In System.Diagnostics.Process.GetProcessesByName("EXCEL")
proc.Kill()
Next
DeleteSourceFile(SourceFolder & FileName & FileExtension)
End Try
Return True
End Function
Private Sub ReleaseObj(ByVal o As Object)
Try
System.Runtime.InteropServices.Marshal.ReleaseComObject(o)
Catch ex As Exception
LogError(ex )
Finally
o = Nothing
End Try
End Sub
Protected Sub DeleteSourceFile(Byval CompleteFilePath As string)
Try
Dim MyFile As FileInfo = New FileInfo(CompleteFilePath)
If MyFile.Exists Then
File.Delete(CompleteFilePath)
Else
Throw New FileNotFoundException()
End If
Catch ex As Exception
GenerateAlert( " Source File could not be deleted.")
LogError(ex)
End Try
End Sub
The code to kill the process ( Avoid it if you can ):
Dim proc As System.Diagnostics.Process
For Each proc In System.Diagnostics.Process.GetProcessesByName("EXCEL")
proc.Kill()
Next