Recently I ran into a bug while using Google protobuf to serialize and deserialize messages.
I had a code segment like this:

In the code, there is a weird usage of std::string::data() and the MyMessage::ParseFromString() API. It’s a mistake due to my negligence. In the beginning, I used the MyMessage::ParseFromArray(const void*, int) API, and msg_buffer was a std::vector, so I used std::vector::data() to get a pointer of type const char*. When I changed the parsing method from ParseFromArray to ParseFromString, I didn’t think much about the argument that the new API expects.

This code compiles without any warning, even with the compiler flag -Wall, so I didn’t realize that I had misused the API. Then the program crashed in later code that assumed the msg object had been populated with specific data.

In fact, the API MyMessage::ParseFromString() never accepts an argument of type const char*. Since std::string has a non-explicit constructor basic_string(const CharT*, const Allocator&), the argument is implicitly converted to a std::string and passed into the function. However, the string constructed in this way is not what we want. Let’s see a simple example.

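Compile and run the code above and you will get the output 3, even though the string actually has 12 bytes. The constructor that takes only a const char* has no length information, so it scans for the first terminating character ‘\0’. Since the C-style string (pointed to by the const char*) contains a ‘\0’ in the middle, the newly constructed std::string ends after “I’m” and does not hold the data we expected.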

A bug Caused by Implicit Construction of std::string
