std::variant as a map key and std::hash

Tutorial | Dec 4, 2020 | nwheeler 

c++ tutorial ir-md-block-image ir-md-block-image-w-50

TL;DR

A std::variant can hold a value of any of its alternative types, which are well known at compile time. So, it is possible to compare two instances of the same std::variant<T...> class with a comparison operator (e.g., ==, >, <) if all the alternatives are comparable:

/** Basic examples **/
std::variant<int, std::string> var1{1};
std::variant<int, std::string> var2{1};

std::cout << std::boolalpha
          << (var1 == var2); //true
std::cout << std::boolalpha
          << (var1 < var2); //false

var2 = "Hello";  //Change held alternative
std::cout << std::boolalpha
          << (var1 == var2); //false

//         ---

/** Not same types **/
std::variant<std::string, int> var3{1};
std::cout << std::boolalpha
          << (var1 == var3); //Error! not same types

//         ---

/** All alternatives must be comparable  **/
struct X { int x; }; 
std::variant<int, X> var4{0};
std::variant<int, X> var5{0};
std::cout << std::boolalpha
        << (var4 == var5); //Error! Xs cannot be compared.

Therefore, a std::variant type can serve as a std::map (or std::set) key. However, an unordered container (e.g., std::unordered_map) requires a hash function also. Fortunately, the default hash function std::hash is specialized for std::variant and computes the hash from the active value and index. So, std::hash works for a std::variant as long as it is specialized for all the alternative types. Some examples:

using Key = std::variant<int, std::string>;
using Value = std::string;

std::unordered_map<Key, Value> kvs;

kvs[Key{1}] = "One";
kvs[Key{"1"}] = "One";

std::cout << kvs[Key{1}] << " "
        << kvs[Key{"1"}] << "\n"; //One One

Finally, it should be emphasized that a comparison operator and std::hash consider both the active value and the index — because a std::variant can have a repeated alternative type (e.g., std::variant<std::string, std::string>):

std::variant<int, int> v1, v2;

std::hash<std::variant<int, int>> vHash;

//Set same value at different index in v1 and v2
v1.emplace<0>(1);
v2.emplace<1>(1);

//They are not equal
std::cout << std::boolalpha
           << (v1 == v2) << " "
           << (vHash(v1) == vHash(v2)); //false false

//Set same value at same index 
v2.emplace<0>(1);

//Now they are equal
std::cout << std::boolalpha
          << (v1 == v2) << " "
          << (vHash(v1) == vHash(v2)); //true true

In the next section, we will look at a realistic example of using std::variant as an unordered_map key.

An Example

A hypothetical banking application needs to cache financial instruments (aka securities) against various instrument identifier keys. Depending on the country of origin and asset type, a financial instrument can have one or more identifiers associated with it, e.g., CUSIP, SEDOL, CIK, and more. These identifiers can be alphanumeric strings or numbers.

An instrument record, shown below, can be looked up by any of its identifiers:

struct Instrument {
 std::string description;
 std::optional<std::string> cusip;
 std::optional<std::string> sedol;
 std::optional<int> cik;
 //..more..
};

There are various strategies to cache these records. One approach is to use a cache with multiple maps, one for each of the identifier types. But that approach has a drawback as it requires changing the cache every time another identifier is to be supported and makes the cache very specific. Implementing a generic cache that could support various features — like Time-to-live (TTL) and Least-recently-used (LRU) — becomes very painful with that approach.

Another approach is to create a custom unified key type (e.g., a string) that encodes the identifier and its kind. That way, our cache implementation requires only one map and can stay generic. But this approach requires an extra transformation of identifiers.

We can avoid the penalty of the extra transformation of identifiers if we use std::variant. The identifier types can be alternatives of a std::variant, as shown:

using InstrumentId = std::variant<std::string,  //CUSIP
                                  std::string,  //SEDOL
                                  int>;         //CIK

To add clarity and avoid mistakes, we can have helper functions to create specific identifiers:

template<typename... T>
constexpr auto make_cusip(T&&... args) {
 return 
 InstrumentId(std::in_place_index<0>, 
              std::forward<T>(args)...);
}

template<typename... T>
constexpr auto make_sedol(T&&... args) {
 return 
 InstrumentId(std::in_place_index<1>, 
              std::forward<T>(args)...);
}

template<typename... T>
constexpr auto make_cik(T&&... args) {
 return 
 InstrumentId(std::in_place_index<2>, 
              std::forward<T>(args)...);
}

Here is a basic cache implementation:

template<typename K, typename V>
class Store {
public:
 void put(const K& k, const V& v) {
    /*
     .... possibily more code here
     */
    items[k] = v;
 }

 std::optional<V> get(const K& k) {
  auto itr = items.find(k);
  if(itr == items.end())
   return std::nullopt;

   return itr->second;
 }

 //... more functions ...
private:
 //....more members ....
 std::unordered_map<K, V> items;
};

And, the following code shows how we can add and retrieve the instruments from the cache:

using namespace std;
//Store contains InstrumentId => shared_ptr<Instrument>
Store<InstrumentId, shared_ptr<Instrument>> iStore;

//An IBM common stock instrument contains CUSIP and CIK, but no SEDOL
auto ibm = make_shared<Instrument>(Instrument{"IBM Common", 
                                               "459200101", 
                                               nullopt, 
                                               51143});

//Add the instrument against cusip and cik
iStore.put(make_cusip(ibm->cusip.value()), ibm);
iStore.put(make_cik(ibm->cik.value()), ibm);

//A BP bond contains SEDOL, but no CUSIP and CIK
auto bpBond = make_shared<Instrument>(Instrument{"BP 1.177% 12aug2023", 
                                                  nullopt, 
                                                  "BDGPG89"});

//Add the instrument against SEDOL
iStore.put(make_sedol(bpBond->sedol.value()), bpBond);

//-- Retrieve --

//Retrieve from CUSIP
if(auto ins = iStore.get(make_cusip("459200101"s))) {
 cout << (*ins)->description << "\n";
}

//Retrieve from SEDOL
if(auto ins = iStore.get(make_sedol("BDGPG89"s))) {
 cout << (*ins)->description << "\n";
}

Conclusion

This article shows the use of std::variant as a std::map and std::unordered_map key. The comparison operators and the default hash function (std::hash) on std::variant work either just out of the box if all alternative types are primitives or they can be explicitly supported.