SSIS – Delete all files except for the most recent one

Posted by jorg on SQL Blog See other posts from SQL Blog or by jorg
Published on Fri, 28 May 2010 11:53:57 GMT Indexed on 2010/05/28 12:03 UTC
Read the original article Hit count: 687

Filed under:

Quite often one or more sources for a data warehouse consist of flat files. Most of the times these files are delivered as a zip file with a date in the file name, for example FinanceDataExport_20100528.zip

Currently I work at a project that does a full load into the data warehouse every night. A zip file with some flat files in it is dropped in a directory on a daily basis. Sometimes there are multiple zip files in the directory, this can happen because the ETL failed or somebody puts a new zip file in the directory manually. Because the ETL isn’t incremental only the most recent file needs to be loaded. To implement this I used the simple code below; it checks which file is the most recent and deletes all other files.

Note: In a previous blog post I wrote about unzipping zip files within SSIS, you might also find this useful: SSIS – Unpack a ZIP file with the Script Task

Public Sub Main()

        'Use this piece of code to loop through a set of files in a directory
        'and delete all files except for the most recent one based on a date in the filename.

        'File name example:
        'DataExport_20100413.zip

        Dim rootDirectory As New DirectoryInfo(Dts.Variables("DirectoryFromSsisVariable").Value.ToString)
        Dim mostRecentFile As String = ""
        Dim currentFileDate As Integer
        Dim mostRecentFileDate As Integer = 0

        'Check which file is the most recent
        For Each fi As FileInfo In rootDirectory.GetFiles("*.zip")

            currentFileDate = CInt(Left(Right(fi.Name, 12), 8)) 'Get date from current filename (based on a file that ends with: YYYYMMDD.zip)

            If currentFileDate > mostRecentFileDate Then

                mostRecentFileDate = currentFileDate
                mostRecentFile = fi.Name

            End If

        Next


        'Delete all files except the most recent one
        For Each fi As FileInfo In rootDirectory.GetFiles("*.zip")

            If fi.Name <> mostRecentFile Then

                File.Delete(rootDirectory.ToString + "\" + fi.Name)

            End If

        Next

        Dts.TaskResult = ScriptResults.Success
    End Sub

© SQL Blog or respective owner

Related posts about ssis