I've inherited a .NET project that has close to 2 thousand clients out in the field that need to push data periodically up to a central repository. The clients wake up and attempt to push the data up via a series of WCF webservices where they are passing each entity framework entity as parameter. Once the service receives this object, it preforms some business logic on the data, and then turns around and sticks it in it's own database that mirrors the database on the client machines.
The trick is, is that this data is being transmitted over a metered connection, which is very expensive. So optimizing the data is a serious priority. Now, we are using a custom encoder that compresses the data (and decompresses it on the other end) while it is being transmitted, and this is reducing the data footprint. However, the amount of data that the clients are using, seem ridiculously large, given the amount of information that is actually being transmitted.
It seems me that entity framework itself may be to blame. I'm suspecting that the objects are very large when serialized to be sent over wire, with a lot context information and who knows what else, when what we really need is just the 'new' inserts.
Is using the entity framework and WCF services as we have done so far the correct way, architecturally, of approaching this n-tiered, asynchronous, push only problem? Or is there a different approach, that could optimize the data use?