C++ std::string_view for better performance: An example use case

The std::string_view offers the benefits of std::string's interface without the cost of constructing an std::string object.

Tutorial | Oct 9, 2019 | jmiller 

Overview

The std::string_view, from the C++17 standard, is a read-only, non-owning reference to a char sequence. The motivation behind std::string_view is that functions quite commonly require a read-only reference to an std::string-like object where the exact type of the object does not matter. The drawback of using const std::string& in those situations is that an std::string object must exist: when the caller only has a string literal or a char buffer, a temporary std::string has to be constructed. Here is a simple case in point:

//foo requires a std::string-like object
void foo(const std::string& s) {
 if(s.length() >= 6 && s[2] == 'V') {
    // extract a part of string
    auto d = s.substr(2,4);
    // d is a std::string
    // ...
 }
 //...
}

// A std::string is constructed
// Readable, easy to use, but expensive
foo("A Very Long String");

Constructing an std::string object can be expensive because it usually (but not always) requires dynamic memory allocation. Where that cost is a concern, readability and ease of use are frequently sacrificed by passing a const char* and a length instead:

/* Better performance, but have to give up 
   benefits of std::string interface. */
void foo(const char* str, size_t length) {
 // written to work with const char* and length
 //.....
}

const char* str = "A Very Long String";
foo(str, strlen(str));

What makes std::string_view better than const std::string& is that it eliminates the need to have an std::string object in the first place. Usually, an std::string_view is composed of two members: a const char* that points to the start of the char array, and the size. Our simple example with std::string_view:

// foo requires a std::string-like object 
void foo(std::string_view s) {
 if(s.length() >= 6 && s[2] == 'V') {
    // extract a part of string
    auto d = s.substr(2,4);
    // d is a std::string_view
    //...
 }
 //...
}

// A std::string_view is constructed.
// Better performance, readable, and easy to use.
foo("A Very Long String");
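To make the call-site flexibility concrete, here is a minimal sketch. The names extractPart and callSiteDemo are hypothetical and not part of the original example; extractPart is a variant of foo that returns the extracted part so that the result can be checked.

```cpp
#include <string>
#include <string_view>

// Hypothetical variant of foo: returns the extracted part,
// or an empty view when the input does not match.
std::string_view extractPart(std::string_view s) {
  if (s.length() >= 6 && s[2] == 'V') {
    return s.substr(2, 4);  // a view into the caller's characters
  }
  return {};
}

// Each call to extractPart only builds a std::string_view
// from its argument; no std::string is constructed for the call.
inline bool callSiteDemo() {
  const std::string owned = "A Very Long String";  // an existing std::string
  const char* raw = "A Very Long String";          // a plain C string

  return extractPart("A Very Long String") == "Very"       // string literal
      && extractPart(owned) == "Very"                       // std::string
      && extractPart(std::string_view(raw, 6)) == "Very";  // pointer + length
}
```

Note that the returned view refers to the caller's characters, so it must not be kept after those characters are gone.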

Note that the substr method of an std::string_view returns an std::string_view. The following illustration shows how std::string_view objects conceptually refer to char sequences:


[Figure: std::string_view conceptual illustration]
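The same idea can be checked in code. The following is a small sketch with hypothetical names (ViewInfo, describeSubstr): it reports where a substring view starts relative to the original view, which shows that substr only adjusts a pointer and a size and copies no characters.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper type: where a sub-view starts and how long it is.
struct ViewInfo {
  std::ptrdiff_t offset;  // distance from the start of the original view
  std::size_t size;       // number of chars in the sub-view
};

// Takes a substring view and describes it relative to `whole`.
// Like std::string_view::substr, this throws std::out_of_range
// if pos > whole.size().
ViewInfo describeSubstr(std::string_view whole,
                        std::size_t pos, std::size_t count) {
  const std::string_view part = whole.substr(pos, count);
  // Both pointers refer to the same char sequence, so the
  // subtraction is well defined.
  return {part.data() - whole.data(), part.size()};
}
```

For "A Very Long String", describeSubstr(s, 2, 4) yields offset 2 and size 4: the sub-view "Very" points into the original characters.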


Applications that construct and copy a large number of string objects can benefit significantly, in both performance and code readability, from using std::string_view.

An Example: OSI Symbols of Option Contracts

Assume a hypothetical trading system application that uses a large number of option contract OSI symbols (e.g., "AAPL  131101C00470000"). An OSI symbol is a 21-character identifier that encodes various attributes of an option contract. The application loads a delimited list of all the symbols from a file into a buffer. The symbols in the buffer are then split and stored in an std::unordered_set of std::string objects:

// A type alias for symbol 
using Symbol = std::string;

/* A routine to split and load the 
   symbols in a collection */
template<typename C>
void loadSymbols(const char* source, size_t len, 
                    char delim, C& coll) {

 const char* first = source;
 const char* last = source+len;

 while(true) {
  // find delimiter location
  auto delimPos = std::find(first, last, delim);

  // check if delimiter found
  if(delimPos == last)
   break; // no more delimiter

  // Insert the Symbol in coll
  coll.insert(Symbol(first, delimPos - first));

  // advance the first pointer for next token
  first = delimPos + 1;
 }
}

// Somewhere else 

/*buffer holding '|' delimited symbol list. 
   Could be a std::vector<char> */
std::string buf = "SPX   191115C02820000|"
                  "SPX   191115P02820000|"
                  /*many more symbols...*/; 

// the symbols collection
std::unordered_set<Symbol> symbols;

 // load symbols 
loadSymbols(buf.c_str(), buf.size(), '|', symbols);
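As an aside on how such a symbol encodes its attributes, here is a hedged sketch of splitting one symbol into fields with std::string_view. The layout assumed here is the usual OSI convention (6-char space-padded root, 6-char YYMMDD expiry, 'C' or 'P', 8-digit strike in thousandths); the names OsiFields and parseOsi are hypothetical and not part of the application described above.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical field breakdown; the layout is an assumption (see above).
struct OsiFields {
  std::string_view root;    // 6 chars, space padded
  std::string_view expiry;  // 6 chars, YYMMDD
  char type;                // 'C' (call) or 'P' (put)
  long strikeThousandths;   // strike price multiplied by 1000
};

std::optional<OsiFields> parseOsi(std::string_view sym) {
  if (sym.size() != 21) return std::nullopt;

  const char type = sym[12];
  if (type != 'C' && type != 'P') return std::nullopt;

  // std::from_chars works on a [first, last) range, so the
  // sub-view does not need to be null-terminated.
  const std::string_view strike = sym.substr(13, 8);
  const char* strikeEnd = strike.data() + strike.size();
  long value = 0;
  const auto [ptr, ec] = std::from_chars(strike.data(), strikeEnd, value);
  if (ec != std::errc{} || ptr != strikeEnd) return std::nullopt;

  // The field views point into the caller's buffer; nothing is copied.
  return OsiFields{sym.substr(0, 6), sym.substr(6, 6), type, value};
}
```

Under these assumptions, parsing "SPX   191115C02820000" gives root "SPX   ", expiry "191115", type 'C', and a strike of 2820000 thousandths.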

At various places in the application, the symbols are looked up in the symbols collection, copied when necessary, and stored in other STL containers when needed. The symbols are never modified, however. Having a vast number of std::string objects is costly in terms of performance and memory usage, especially when dynamic memory allocation is involved. To minimize dynamic allocation, a typical implementation of std::string stores a small string within the object itself in a char array; this is called short/small string optimization (SSO). However, the maximum size of such a small string is implementation-dependent and may well be below 21 chars (libstdc++ and MSVC, for example, store at most 15 chars inline).
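Whether a 21-char symbol fits in the small-string buffer can be probed on a given platform. The sketch below uses a hypothetical helper, storedInline, that checks whether the characters of a std::string live inside the string object itself. It is a diagnostic trick for common implementations, not something the standard guarantees.

```cpp
#include <functional>
#include <string>

// Hypothetical diagnostic: true if the chars of `s` are stored inside
// the std::string object itself (SSO in effect), false if they live
// in a separately allocated buffer.
bool storedInline(const std::string& s) {
  const char* objBegin = reinterpret_cast<const char*>(&s);
  const char* objEnd = objBegin + sizeof(std::string);
  const char* data = s.data();

  // std::less gives a total order even for unrelated pointers,
  // unlike the built-in < operator.
  std::less<const char*> before;
  return !before(data, objBegin) && before(data, objEnd);
}
```

Calling storedInline on a std::string holding a 21-char symbol is expected to return false where the inline capacity is 15 chars; the result is implementation-dependent.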

This application can undoubtedly benefit from using std::string_view. The first thing to do is change the Symbol type alias from std::string to std::string_view:

// A type alias for symbol 
using Symbol = std::string_view;  

That is the only change needed for the example code above to work. A real-world application would likely require more changes, however. For one, the buffer that holds the symbol list must outlive every Symbol that refers to it, which here effectively means the lifetime of the application, and it must not be modified or reallocated in the meantime; otherwise, all the symbol std::string_views would be invalidated.
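One way to make the lifetime requirement harder to violate is to keep the buffer and the views together in one object. The following is a minimal sketch under that assumption; the class SymbolTable and its members are hypothetical and not part of the original application.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_set>
#include <utility>

// Hypothetical owner of both the buffer and the views into it.
class SymbolTable {
public:
  explicit SymbolTable(std::string buffer, char delim = '|')
      : buffer_(std::move(buffer)) {
    const char* first = buffer_.data();
    const char* last = first + buffer_.size();
    while (true) {
      const char* delimPos = std::find(first, last, delim);
      if (delimPos == last) break;  // no more delimiters
      // Store a view of [first, delimPos); no chars are copied.
      symbols_.emplace(first, static_cast<std::size_t>(delimPos - first));
      first = delimPos + 1;
    }
  }

  // Copying (or moving) could leave views pointing at another
  // object's buffer, so both are disabled in this sketch.
  SymbolTable(const SymbolTable&) = delete;
  SymbolTable& operator=(const SymbolTable&) = delete;

  bool contains(std::string_view sym) const {
    return symbols_.count(sym) != 0;
  }
  std::size_t size() const { return symbols_.size(); }

private:
  std::string buffer_;  // declared first: constructed first, destroyed last
  std::unordered_set<std::string_view> symbols_;
};
```

Because buffer_ is never modified after construction and is destroyed after symbols_, the stored views stay valid for as long as the table exists. Views handed out to other parts of the application still must not outlive the table.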

Another change could come from the fact that std::string_view does not have a c_str() interface that returns a null-terminated string. We would have to convert the std::string_view to an std::string wherever a null-terminated string is required:

// required for the 'sv' suffix
using namespace std::literals::string_view_literals;

auto v = "This is a view"sv; // 'sv' is a string_view literal
// to get null-terminated string
std::cout << std::string(v).c_str() << "\n";
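The conversion matters most for views that cover only part of a buffer. As a small sketch (the names CLengths and nullTerminationDemo are hypothetical), the code below takes a view of the first symbol in a delimited list and compares what a C function sees through data() with what it sees through a converted std::string.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical result type for the demonstration.
struct CLengths {
  std::size_t viaData;  // strlen on the view's data() pointer
  std::size_t viaCopy;  // strlen on a std::string copy of the view
};

CLengths nullTerminationDemo() {
  // A string literal, so the underlying buffer IS null-terminated
  // at its very end; that is what makes the strlen call below safe.
  const std::string_view whole =
      "SPX   191115C02820000|SPX   191115P02820000|";
  const std::string_view first = whole.substr(0, 21);

  // data() points into the middle of a longer sequence: a C API
  // keeps reading past the end of the view.
  const std::size_t viaData = std::strlen(first.data());

  // The copy holds exactly first.size() chars plus a terminator.
  const std::string owned(first);
  const std::size_t viaCopy = std::strlen(owned.c_str());

  return {viaData, viaCopy};
}
```

Here viaData is the length of the whole list (44), while viaCopy is the length of the symbol (21). With a buffer that has no terminator at all, reading through data() would be undefined behavior.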

Closing Words

The std::string_view is an excellent utility for good performance and readability where only an std::string-like interface is required. Care must be taken, however, to ensure that an std::string_view does not outlive the char sequence it refers to.

Further Reading

string_view: a non-owning reference to a string

std::basic_string_view: cppreference