Designing a Content-Based ETL Process with .NET and SFDC

Posted by Patrick on Programmers See other posts from Programmers or by Patrick
Published on 2012-07-04T20:13:51Z Indexed on 2012/07/05 3:22 UTC
Read the original article Hit count: 441

Filed under:

As my firm makes the transition to using SFDC as our main operational system, we've spun together a couple of SFDC portals where we can post customer-specific documents to be viewed at will. As such, we've had the need for pseudo-ETL applications to be implemented that are able to extract metadata from the documents our analysts generate internally (most are industry-standard PDFs, XML, or MS Office formats) and place in networked "queue" folders. From there, our applications scoop of the queued documents and upload them to the appropriate SFDC CRM Content Library along with some select pieces of metadata. I've mostly used DbAmp to broker communication with SFDC (DbAmp is a Linked Server provider that allows you to use SQL conventions to interact with your SFDC Org data).

I've been able to create [console] applications in C# that work pretty well, and they're usually structured something like this:

static void Main()
{
    // Load parameters from app.config.
    // Get documents from queue.
    var files = someInterface.GetFiles(someFilterOrRegexPattern);

    foreach (var file in files)
    {
        // Extract metadata from the file.
        // Validate some attributes of the file; add any validation errors to an in-memory          
        // structure (e.g. List<ValidationErrors>).
        if (isValid) 
        {
            var fileData = File.ReadAllBytes(file);
            // Upload using some wrapper for an ORM or DAL
            someInterface.Upload(fileData, meta.Param1, meta.Param2, ...);
        }
        else
        {
            // Bounce the file
        }
    }

    // Report any validation errors (via message bus or SMTP or some such).
}

And that's pretty much it. Most of the time I wrap all these operations in a "Worker" class that takes the needed interfaces as constructor parameters.

This approach has worked reasonably well, but I just get this feeling in my gut that there's something awful about it and would love some feedback. Is writing an ETL process as a C# Console app a bad idea? I'm also wondering if there are some design patterns that would be useful in this scenario that I'm clearly overlooking.

Thanks in advance!

Developer IT

Designing a Content-Based ETL Process with .NET and SFDC - Developer IT

Designing a Content-Based ETL Process with .NET and SFDC

c#

.NET

architecture

application-design

etl

Related posts about c#

.NET WebRequest.PreAuthenticate not quite what it sounds like

HttpWebRequest and Ignoring SSL Certificate Errors

The dynamic Type in C# Simplifies COM Member Access from Visual FoxPro

Dynamic Type to do away with Reflection

Finding a Relative Path in .NET

Related posts about .NET

Apt-Get Update: failure to fetch; can't connect to any sources

12.04: Apt-Get Update: failure to fetch; can't connect to any sources

What's New in ASP.NET 4

.NET Reflector 6, .NET Reflector Pro, TestDriven.NET, .NET 4.0 and Mono

Redmine on Apache2 with Passenger issue

Categories cloud