Starting with C++20, some very useful search functions have been added to several standard containers, such as std::map, std::set, and std::string. These have been needed for a long time, and it's good to see that the committee finally agreed upon their value. I hope this is the beginning of some wonderful additions.
Maps and sets
A typical operation when working with maps is to check if a given key exists. How do you do this in C++17? Well, it’s simple:
std::map<int, std::string> m{ {1, "one"}, {2, "two"}, {3, "three"} };
if (m.find(1) != m.end())
{
   std::cout << "key found!\n";
}
Although it may be simple, it's not user-friendly at all. For this reason, many have written their own contains() function that takes a map and a key and returns a boolean indicating whether the map contains the key. This is no longer necessary in C++20, where std::map has a contains() method.
std::map<int, std::string> m{ {1, "one"}, {2, "two"}, {3, "three"} };
if (m.contains(1))
{
   std::cout << "key found!\n";
}
The same is true for std::set.
std::set<int> s{ 1, 2, 3 };
if (s.contains(1))
{
   std::cout << "key found!\n";
}
In fact, a contains() function has been added to the following types in C++20:
- std::map
- std::multimap
- std::unordered_map
- std::unordered_multimap
- std::set
- std::multiset
- std::unordered_set
- std::unordered_multiset
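For code that still has to build as C++17, the hand-written helper mentioned earlier could look something like the sketch below. The name contains_key is made up for this example and is not part of the standard library; it only assumes that the container exposes find() and end(), which all eight types listed above do.

```cpp
#include <map>
#include <set>
#include <string>
#include <unordered_map>

// Hypothetical pre-C++20 helper (not a standard function): returns true
// when the associative container holds at least one element with the key.
template <typename Container, typename Key>
bool contains_key(Container const & c, Key const & key)
{
   return c.find(key) != c.end();
}
```

It would be called as contains_key(m, 1) for a map or contains_key(s, 1) for a set, and can be dropped once the code base moves to C++20.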
Strings
A similar problem concerns strings. Often, we need to know if a string contains another string. This is how you do it in C++17:
std::string text{"The quick brown fox jumps over the lazy dog"};
if (text.find("fox") != std::string::npos)
{
   std::cout << "fox found!\n";
}
A particular case related to strings is finding a substring at the beginning or at the end of the string. Searching at the beginning is relatively simple:
if (text.find("The quick") == 0)
{
   std::cout << "right start\n";
}
But searching at the end requires a helper function. A possible implementation is this:
bool ends_with(std::string const & text, std::string const & substr)
{
   if (substr.size() > text.size())
      return false;

   return std::equal(text.begin() + text.size() - substr.size(), text.end(), substr.begin());
}
Which can be used as follows:
if (ends_with(text, "lazy dog"))
{
   std::cout << "right end\n";
}
(Note: You can find alternative implementations for such a function here.)
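One such alternative, as a rough sketch, is to build both checks on compare(), which only looks at the characters that matter, whereas find() == 0 keeps scanning the rest of the string when the prefix does not match. The names starts_with_17 and ends_with_17 are hypothetical, and taking std::string_view parameters (C++17) is my assumption, so that string literals can be passed without creating temporary strings.

```cpp
#include <string_view>

// Hypothetical C++17 helper: true when text begins with prefix.
bool starts_with_17(std::string_view text, std::string_view prefix)
{
   return text.size() >= prefix.size() &&
          text.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical C++17 helper: true when text ends with suffix.
bool ends_with_17(std::string_view text, std::string_view suffix)
{
   return text.size() >= suffix.size() &&
          text.compare(text.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

Both accept a std::string, a string literal, or a std::string_view, and an empty prefix or suffix always matches.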
This has been greatly simplified in C++20, where std::basic_string and std::basic_string_view have two more methods: starts_with() and ends_with().
if (text.starts_with("The quick"))
{
   std::cout << "right start\n";
}

if (text.ends_with("lazy dog"))
{
   std::cout << "right end\n";
}
However, there is a huge miss in C++20: a function for checking whether a string contains a substring. During the last ISO C++ committee meeting, such a method was added to C++23 (P1679). This will enable us to write the following:
if (text.contains("fox"))
{
   std::cout << "fox found!\n";
}
And that is how we always wanted to write code.
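Until C++23 support is available everywhere, a small wrapper can pick the new member when the library provides it and fall back to find() otherwise. The sketch below relies on the standard feature-test macro __cpp_lib_string_contains, which <string> and <string_view> define on implementations that ship contains(); the wrapper name has_substring is invented for illustration.

```cpp
#include <string>
#include <string_view>

// Hypothetical wrapper: uses contains() when the standard library has it,
// and the classic find() comparison on older toolchains.
inline bool has_substring(std::string_view text, std::string_view needle)
{
#if defined(__cpp_lib_string_contains)
   return text.contains(needle);
#else
   return text.find(needle) != std::string_view::npos;
#endif
}
```

Both branches are case sensitive and treat an empty needle as always found, so the behavior should not change when the toolchain is upgraded.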
But…
You should keep in mind that these new string functions are case sensitive. They do not take a predicate that would allow you to customize the way the search is done. Therefore, if you need to perform case-insensitive searching, you still need to implement that yourself. A possible implementation for contains(), starts_with(), and ends_with() that performs a case-insensitive search is shown here:
#include <algorithm>
#include <cctype>
#include <string>

// Compares two characters ignoring case. The cast to unsigned char avoids
// undefined behavior when std::toupper receives a negative char value.
bool equal_ci(char ch1, char ch2)
{
   return std::toupper(static_cast<unsigned char>(ch1)) ==
          std::toupper(static_cast<unsigned char>(ch2));
}

bool contains_ci(std::string const & text, std::string const & substr)
{
   if (substr.length() > text.length())
      return false;

   auto it = std::search(
      text.begin(), text.end(),
      substr.begin(), substr.end(),
      equal_ci);

   // An empty substring always matches, as with the standard contains().
   return substr.empty() || it != text.end();
}

bool starts_with_ci(std::string const & text, std::string const & substr)
{
   if (substr.length() > text.length())
      return false;

   // Search only the first substr.length() characters of text.
   auto it = std::search(
      text.begin(), text.begin() + substr.length(),
      substr.begin(), substr.end(),
      equal_ci);

   return it == text.begin();
}

bool ends_with_ci(std::string const & text, std::string const & substr)
{
   if (substr.length() > text.length())
      return false;

   // Same idea as above, walking both strings backwards.
   auto it = std::search(
      text.rbegin(), text.rbegin() + substr.length(),
      substr.rbegin(), substr.rend(),
      equal_ci);

   return it == text.rbegin();
}
And these can be used as follows:
if (contains_ci(text, "FOX"))
{
   std::cout << "fox found!\n";
}

if (starts_with_ci(text, "THE QUICK"))
{
   std::cout << "right start\n";
}

if (ends_with_ci(text, "LAZY DOG"))
{
   std::cout << "right end\n";
}