Starting with C++20, several very useful search functions have been added to standard containers such as std::map and std::set, as well as to std::string. These have been needed for a long time and it’s good to see that the committee finally agreed upon their value. I hope this is the beginning of some wonderful additions.
Maps and sets
A typical operation when working with maps is to check if a given key exists. How do you do this in C++17? Well, it’s simple:
    std::map<int, std::string> m{ {1, "one"}, {2, "two"}, {3, "three"} };

    if (m.find(1) != m.end())
    {
       std::cout << "key found!\n";
    }
Although it may be simple, it’s not user-friendly at all. For this reason, many have written their own contains() function that takes a map and a key and returns a boolean indicating whether the map contains the key. This is no longer necessary in C++20, where std::map has a contains() method.
    std::map<int, std::string> m{ {1, "one"}, {2, "two"}, {3, "three"} };

    if (m.contains(1))
    {
       std::cout << "key found!\n";
    }
The same is true for std::set too.
    std::set<int> s{ 1, 2, 3 };
    if (s.contains(1))
    {
       std::cout << "key found!\n";
    }
In fact, a contains() function has been added to the following types in C++20:
- std::map
- std::multimap
- std::unordered_map
- std::unordered_multimap
- std::set
- std::multiset
- std::unordered_set
- std::unordered_multiset
Strings
A similar problem concerns strings. Often, we need to know if a string contains another string. This is how you do it in C++17:
    std::string text{"The quick brown fox jumps over the lazy dog"};

    if (text.find("fox") != std::string::npos)
    {
       std::cout << "fox found!\n";
    }
A particular case related to strings is finding a substring at the beginning or end of the string. Searching at the beginning is relatively simple:
    if (text.find("The quick") == 0)
    {
       std::cout << "right start\n";
    }
But searching at the end requires a helper function. A possible implementation is this:
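    #include <algorithm>
    #include <string>

    bool ends_with(std::string const & text, std::string const & substr)
    {
       // A suffix cannot be longer than the text itself.
       if (substr.size() > text.size()) return false;
       // Compare the last substr.size() characters of text with substr.
       return std::equal(text.begin() + text.size() - substr.size(), text.end(), substr.begin());
    }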
Which can be used as follows:
    if (ends_with(text, "lazy dog"))
    {
       std::cout << "right end\n";
    }
(Note: You can find alternative implementations for such a function here.)
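One caveat about the find(...) == 0 check for a prefix: when the prefix is not at the start, find() keeps scanning the rest of the string before the result is compared with 0. A possible pre-C++20 alternative, sketched here under the hypothetical name starts_with_cpp17(), uses rfind() with a start position of 0, which only considers a match beginning at index 0.

```cpp
#include <cassert>
#include <string>

// Hypothetical pre-C++20 helper: true if `text` begins with `prefix`.
// rfind(prefix, 0) only considers a match that starts at index 0, so the
// rest of the string is not scanned when the prefix is absent.
bool starts_with_cpp17(std::string const& text, std::string const& prefix)
{
    return text.rfind(prefix, 0) == 0;
}

void starts_with_demo()
{
    std::string text{"The quick brown fox jumps over the lazy dog"};
    assert(starts_with_cpp17(text, "The quick"));
    assert(!starts_with_cpp17(text, "quick"));      // present, but not at the start
    assert(!starts_with_cpp17("The", "The quick")); // prefix longer than the text
}
```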
This has been greatly simplified in C++20, where std::basic_string and std::basic_string_view have two more methods: starts_with() and ends_with().
    if (text.starts_with("The quick"))
    {
       std::cout << "right start\n";
    }

    if (text.ends_with("lazy dog"))
    {
       std::cout << "right end\n";
    }
However, there is a glaring omission in C++20: a function for checking if a string contains a substring. During the last C++ ISO committee meeting, such a method was added to C++23 (P1679). This will enable us to write the following:
    if (text.contains("fox"))
    {
       std::cout << "fox found!\n";
    }
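Until a compiler and standard library with C++23 support are available to you, one option is to hide the difference behind a small wrapper. The sketch below assumes the library feature-test macro __cpp_lib_string_contains is defined when the member function exists; the wrapper name str_contains() is made up for this example.

```cpp
#include <cassert>
#include <string_view>
#include <version>

// Hypothetical wrapper: calls the member contains() when the standard
// library advertises it, and falls back to find() otherwise.
bool str_contains(std::string_view text, std::string_view needle)
{
#if defined(__cpp_lib_string_contains)
    return text.contains(needle);
#else
    return text.find(needle) != std::string_view::npos;
#endif
}

void str_contains_demo()
{
    std::string_view text{"The quick brown fox jumps over the lazy dog"};
    assert(str_contains(text, "fox"));
    assert(!str_contains(text, "FOX")); // the search is case sensitive
    assert(!str_contains(text, "cat"));
}
```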
And that is how we always wanted to write code.
But…
You should keep in mind that these new string functions are case sensitive. They do not take a predicate to allow you to customize the way the search is done. Therefore, if you need to perform case-insensitive searching, then you still need to implement that yourself. A possible implementation for contains(), starts_with(), and ends_with() that performs case-insensitive search is shown here:
    #include <algorithm>
    #include <cctype>
    #include <string>

    // Note: the lambdas take unsigned char because passing a negative char
    // value to std::toupper is undefined behavior.

    bool contains_ci(std::string const & text, std::string const & substr)
    {
       if (substr.length() > text.length()) return false;

       // Search the whole text for substr, comparing characters ignoring case.
       auto it = std::search(
          text.begin(), text.end(),
          substr.begin(), substr.end(),
          [](unsigned char ch1, unsigned char ch2) {
             return std::toupper(ch1) == std::toupper(ch2); });

       return it != text.end();
    }

    bool starts_with_ci(std::string const& text, std::string const& substr)
    {
       if (substr.length() > text.length()) return false;

       // Only the first substr.length() characters of text are examined.
       auto it = std::search(
          text.begin(), text.begin() + substr.length(),
          substr.begin(), substr.end(),
          [](unsigned char ch1, unsigned char ch2) {
             return std::toupper(ch1) == std::toupper(ch2); });

       return it == text.begin();
    }

    bool ends_with_ci(std::string const& text, std::string const& substr)
    {
       if (substr.length() > text.length()) return false;

       // Same idea as starts_with_ci, but walking both strings backwards.
       auto it = std::search(
          text.rbegin(), text.rbegin() + substr.length(),
          substr.rbegin(), substr.rend(),
          [](unsigned char ch1, unsigned char ch2) {
             return std::toupper(ch1) == std::toupper(ch2); });

       return it == text.rbegin();
    }
And these can be used as follows:
    if (contains_ci(text, "FOX"))
    {
       std::cout << "fox found!\n";
    }

    if (starts_with_ci(text, "THE QUICK"))
    {
       std::cout << "right start\n";
    }

    if (ends_with_ci(text, "LAZY DOG"))
    {
       std::cout << "right end\n";
    }
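Since the standard functions do not accept a predicate, another option is to write the search once and let the caller supply the character comparison. The following is only a sketch of that idea; contains_if() and equal_ci() are hypothetical names, and the comparison assumes single-byte characters, so it is not suitable for UTF-8 or locale-aware matching.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string_view>

// Hypothetical generic helper: like contains(), but the caller decides
// when two characters count as equal.
template <typename Pred>
bool contains_if(std::string_view text, std::string_view needle, Pred pred)
{
    // An empty needle is treated as always found, matching std::string::find.
    if (needle.empty()) return true;
    return std::search(text.begin(), text.end(),
                       needle.begin(), needle.end(), pred) != text.end();
}

// Case-insensitive comparison for single-byte characters. The casts keep
// std::toupper well-defined for characters with negative char values.
bool equal_ci(char a, char b)
{
    return std::toupper(static_cast<unsigned char>(a)) ==
           std::toupper(static_cast<unsigned char>(b));
}

void contains_if_demo()
{
    std::string_view text{"The quick brown fox jumps over the lazy dog"};
    assert(contains_if(text, "FOX", equal_ci));
    assert(!contains_if(text, "CAT", equal_ci));

    // Any other predicate works too, e.g. an exact, case-sensitive match.
    auto exact = [](char a, char b) { return a == b; };
    assert(!contains_if(text, "FOX", exact));
}
```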