Should I use a binary or a text file for storing protobuf messages?
Posted by nbolton on Stack Overflow
Published on 2009-12-07T10:51:11Z
protocol-buffers
Using Google protobuf, I am saving my serialized message data to a file - each file contains several messages. We have both C++ and Python versions of the code, so I need to use protobuf functions that are available in both languages. I have experimented with SerializeToArray and SerializeAsString, and each seems to have an unfortunate drawback:
SerializeToArray: As suggested in one answer, the best way to use this is to prefix each message with its data size. This would work great for C++, but in Python it doesn't look like this is possible - am I wrong?
SerializeAsString: This generates a serialized string equivalent to its binary counterpart, which I can save to a file. But what happens if one of the characters in the serialization result is \n? How do we find line endings, or the end of each message for that matter?
Update:
Please allow me to rephrase slightly. As I understand it, I cannot write binary data in C++, because then our Python application cannot read the data, since it can only parse string-serialized messages. Should I then instead use SerializeAsString in both C++ and Python? If yes, is it best practice to store such data in a text file rather than a binary file? My gut feeling is binary, but as you can see, this doesn't look like an option.
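For reference, the size-prefix idea mentioned above can be sketched in Python using only the standard library. This is a minimal sketch under assumptions, not an established protobuf file format: the 4-byte little-endian prefix and the helper names `write_framed` and `read_framed` are made up for illustration, and plain byte strings stand in for the output of a message's `SerializeToString()`. It also shows why line-based splitting is unreliable: the first payload is a valid encoding of field 1 with the varint value 10, and its second byte is `\n`.

```python
import io
import struct

# Assumed framing convention (hypothetical, not defined by protobuf itself):
# every message is preceded by its size as a 4-byte little-endian unsigned int.
_LENGTH_PREFIX = struct.Struct("<I")


def write_framed(stream, payload: bytes) -> None:
    """Write one serialized message, preceded by its length."""
    stream.write(_LENGTH_PREFIX.pack(len(payload)))
    stream.write(payload)


def read_framed(stream):
    """Yield each length-prefixed payload until the stream is exhausted."""
    while True:
        header = stream.read(_LENGTH_PREFIX.size)
        if not header:
            return  # clean end of file
        if len(header) < _LENGTH_PREFIX.size:
            raise EOFError("truncated length prefix")
        (size,) = _LENGTH_PREFIX.unpack(header)
        payload = stream.read(size)
        if len(payload) < size:
            raise EOFError("truncated message")
        yield payload


# Stand-ins for serialized messages. In real code these would come from
# msg.SerializeToString() and be decoded with msg.ParseFromString(payload).
payloads = [
    b"\x08\x0a",      # field 1 = 10 as a varint; the value byte is '\n'
    b"line\nbreak",   # arbitrary bytes with an embedded newline
    b"",              # an empty message serializes to zero bytes
]

buffer = io.BytesIO()  # a real file would be opened with mode "wb" / "rb"
for payload in payloads:
    write_framed(buffer, payload)

buffer.seek(0)
decoded = list(read_framed(buffer))
```

Because the reader relies on the stored length rather than on a delimiter, embedded newlines and empty messages round-trip unchanged. A C++ writer would presumably emit the same 4-byte prefix followed by the bytes returned by SerializeAsString, with the file opened in binary mode on both sides.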
© Stack Overflow or respective owner