Large scale file replication with an option to "unsubscribe" from a replicated file on a given machine
Posted by Alexander Gladysh on Server Fault
Published on 2013-06-29T21:38:30Z
I have 100+ GB of files per day arriving on one machine. (The size of individual files is arbitrary and can be adjusted as needed.)
I have several other machines that do some work on these files.
I need to reliably deliver each incoming file to the worker machines. A worker machine should be able to free its HDD from a file once it is done working with it.
Preferably, a file would be uploaded to the worker only once, then processed in place and deleted, without being copied anywhere else, to minimize the already high HDD load. (The worker itself requires quite a bit of disk bandwidth.)
Please advise a solution that is not based on Java. None of the existing replication solutions that I've seen can free the HDD from a file once it has been processed, but maybe I'm missing something...
The preferred solution should work with plain files from the POV of our business logic code, and should not require the business logic to connect to a queue or anything similar. (Internally the solution may use whatever technology it needs, except Java.)
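To make the desired worker-side behaviour concrete, here is a minimal sketch of the file-only contract described above. Everything in it is an assumption for illustration: the spool directory, the `.part` suffix for files still being delivered, and the names `deliver` and `process_spool` are hypothetical, and `deliver` merely stands in for whatever replication layer would actually transfer the file.

```python
import os
from pathlib import Path
from typing import Callable, List


def deliver(data: bytes, spool: Path, name: str) -> Path:
    """Hypothetical stand-in for the replication layer: write under a
    temporary name, then rename, so a worker never sees a partial file."""
    tmp = spool / (name + ".part")
    tmp.write_bytes(data)
    final = spool / name
    os.replace(tmp, final)  # atomic when both paths are on one filesystem
    return final


def process_spool(spool: Path, handle: Callable[[Path], None]) -> List[str]:
    """Process every fully delivered file in place, then delete it."""
    done = []
    for path in sorted(spool.iterdir()):
        if not path.is_file() or path.suffix == ".part":
            continue  # skip directories and files still being delivered
        handle(path)   # business logic reads the file where it lies
        path.unlink()  # free the disk only after processing succeeded
        done.append(path.name)
    return done
```

The business logic here only ever touches ordinary files: it is handed a path, reads it, and the file is removed afterwards, with no extra copy on the worker's disk. If `handle` raises, the file is left in place and can be retried.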
© Server Fault or respective owner