Should I use a binary or a text file for storing protobuf messages?

Posted by nbolton on Stack Overflow
Published on 2009-12-07T10:51:11Z Indexed on 2010/04/20 18:13 UTC

Filed under: protocol-buffers

Using Google protobuf, I am saving my serialized message data to a file, and each file contains several messages. We have both C++ and Python versions of the code, so I need to use protobuf functions that are available in both languages. I have experimented with SerializeToArray and SerializeAsString, and each seems to have an unfortunate drawback:

SerializeToArray: As suggested in one answer, the best way to use this is to prefix each message with its data size. This would work great for C++, but in Python it doesn't look like this is possible. Am I wrong?
SerializeAsString: This generates a serialized string equivalent to its binary counterpart, which I can save to a file. But what happens if one of the characters in the serialization result is \n? How do we find line endings, or the end of each message for that matter?

Update:

Please allow me to rephrase slightly. As I understand it, I cannot write binary data in C++ because our Python application would then be unable to read it, since it can only parse string-serialized messages. Should I instead use SerializeAsString in both C++ and Python? If so, is it best practice to store such data in a text file rather than a binary file? My gut feeling is binary, but as you can see, that doesn't look like an option.
