Git-based storage and publishing, infrastructure advice
- by Joel Martinez
I wanted to get some advice on moving a system to "the cloud". Specifically, I'm looking to move to some of Windows Azure's managed services, since right now I'm managing a VM myself. The system operates on data stored in a Git repository hosted on GitHub. Here's the current architecture:
Current system (all hosted on a single server):
GitHub - configured with a webhook pointing at ...
ASP.NET MVC application - accepts the webhook from GitHub and pushes a message onto ...
Azure Service Bus queue - which is drained by ...
Windows Service - pulls the message from the queue and ...
Fetches the latest data from the git repository (using LibGit2Sharp) onto the local disk and finally ...
Operates on the data in git to produce a static HTML website hosted/served by IIS.
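The flow above can be sketched roughly as follows. This is only an illustrative outline in Python, not the actual implementation (which is an ASP.NET MVC app plus a Windows Service). An in-process `queue.Queue` stands in for the Service Bus queue, and `handle_webhook`, `drain`, `fetch`, and `build` are hypothetical names made up for this sketch.

```python
import queue
from typing import Callable


def handle_webhook(payload: dict, jobs: queue.Queue) -> None:
    """Web front end: turn a GitHub push notification into a queued job."""
    # "after" is the new head commit SHA in GitHub's push event payload.
    jobs.put(payload["after"])


def drain(jobs: queue.Queue,
          fetch: Callable[[], None],
          build: Callable[[str], None]) -> int:
    """Worker: process queued jobs until the queue is empty.

    fetch() is assumed to update the local repository, and build(commit)
    is assumed to regenerate the static HTML. Returns the number of jobs
    handled.
    """
    handled = 0
    while True:
        try:
            commit = jobs.get_nowait()
        except queue.Empty:
            return handled
        fetch()        # bring the on-disk repository up to date
        build(commit)  # produce the static site from the repository
        handled += 1
```

The point of the queue is that the web front end and the worker only share messages, so the two halves could in principle live in separate roles. The local repository is the one piece of state the worker needs to keep between jobs.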
The system actually works really well, but I would like to get out of the business of managing the VM and move to some combination of Azure web and worker roles. Because the system relies so heavily on the git repository being on the local filesystem, I'm finding it difficult to figure out how to architect it in the cloud. I know you can get file system access, so in theory I could just clone the repository whenever there's nothing on disk. But the performance/responsiveness of the system depends on the repository already being available, so that each update only has to fetch diffs, which is relatively quick. What I want to avoid is periodically having to fetch the entire (somewhat large) git repository because the web or worker role was recycled, or something.
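To make the "clone only if there's nothing on disk, otherwise just fetch the new commits" idea concrete, here is a minimal sketch. It is written in Python and shells out to the `git` command line instead of using the library from the real service. The names `plan_sync` and `sync`, and whatever directory and URL you pass in, are assumptions for illustration only.

```python
import subprocess
from pathlib import Path


def plan_sync(repo_dir: Path, remote_url: str) -> list[str]:
    """Return the git command needed to bring repo_dir up to date."""
    if (repo_dir / ".git").is_dir():
        # Repository already on disk: only the new objects are transferred.
        return ["git", "-C", str(repo_dir), "pull", "--ff-only"]
    # Nothing on disk (for example after a role recycle): full clone.
    return ["git", "clone", remote_url, str(repo_dir)]


def sync(repo_dir: Path, remote_url: str) -> None:
    """Run the planned command, raising if git exits with an error."""
    subprocess.run(plan_sync(repo_dir, remote_url), check=True)
```

The slow path (the full clone) only runs when the directory is missing, so how often it happens depends entirely on how durable the disk behind `repo_dir` is.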
So I would love some advice on how you would architect such a system :) Ultimately, the only real requirement is to be able to serve HTML content that's been produced from the contents of a git repository (in a relatively responsive manner, from a publishing perspective). Please feel free to ask any clarifying questions if there's something I omitted. Thanks!