C++23 is the current working version of the C++ standard. No major feature has been included so far, but a series of smaller ones, as well as many defect reports, have already made it into the standard. You can check the current status as well as the compiler support for the new features here. Many of these new features are small improvements or things you probably wouldn’t use on a regular basis. However, I want to point out three C++23 features that, in my opinion, stand out among the others as likely to be used more often.
Literal suffixes for size_t and ptrdiff_t
std::size_t is an unsigned data type (of at least 16 bits) that can hold the maximum size of an object of any type. It can safely store the index of an array on any platform. It is the type returned by the sizeof, sizeof..., and alignof operators.
std::ptrdiff_t is a signed data type (of at least 17 bits) that represents the type of the result of subtracting two pointers.
In C++23, these have their own integer literal suffixes.
Literal suffix | Deduced type | Example |
---|---|---|
uz or uZ or Uz or UZ | std::size_t | auto a = 42uz; |
z or Z | the signed type corresponding to std::size_t (typically std::ptrdiff_t) | auto b = -42z; |
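To make the table concrete, here is a minimal sketch that checks the deduced types at compile time. The function name `suffixes_have_expected_types` is made up for this illustration, and the `#else` branch is only a stand-in so that the snippet still builds on compilers that do not yet define the `__cpp_size_t_suffix` feature-test macro.

```cpp
#include <cstddef>
#include <type_traits>

// True when the two literals have the types listed in the table.
// The #else branch is a pre-C++23 stand-in (an assumption of this sketch),
// used when the compiler does not define __cpp_size_t_suffix.
constexpr bool suffixes_have_expected_types()
{
    using signed_size_t = std::make_signed_t<std::size_t>;
#if defined(__cpp_size_t_suffix)
    auto a = 42uz;   // C++23: std::size_t
    auto b = -42z;   // C++23: signed counterpart of std::size_t
#else
    auto a = static_cast<std::size_t>(42);
    auto b = static_cast<signed_size_t>(-42);
#endif
    return std::is_same_v<decltype(a), std::size_t>
        && std::is_same_v<decltype(b), signed_size_t>
        && a == 42u && b == -42;
}

static_assert(suffixes_have_expected_types(), "unexpected literal type");
```

Note that `-42z` is the literal `42z` with a unary minus applied to it, so the sign does not change the deduced type.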
Let’s see how this is useful. In C++20, we could write the following:
std::vector<int> v {1, 1, 2, 3, 5, 8};
for(auto i = 0u; i < v.size(); ++i)
{
   std::cout << v[i] << '\n';
}
The deduced type of the variable i is unsigned int. This works fine on 32-bit platforms, where both unsigned int and size_t, which is the return type of the size() member function, are 32-bit. But on 64-bit platforms you may get a warning, because unsigned int is still 32-bit while size_t is 64-bit, so the index cannot represent every possible size.
On the other hand, we can have the following:
std::vector<int> v {1, 1, 2, 3, 5, 8}; auto m = std::max(42, std::ssize(v)); // compiles on 32-bit but fails on 64-bit
std::vector<int> v {1, 1, 2, 3, 5, 8}; auto m = std::max(42ll, std::ssize(v)); // compiles on 64-bit but fails on 32-bit
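Neither of these two versions works on both 32-bit and 64-bit platforms. (The second one compiles only where std::ptrdiff_t is long long, as on 64-bit Windows; where it is long, as on 64-bit Linux, it fails too.)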
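The failure comes from std::max deducing a single type from both arguments. Before C++23, one portable way around it was to name that type explicitly. The sketch below shows the idea; the helper name `at_least_42` is hypothetical, and the cast is a stand-in for std::ssize() so that the snippet also builds before C++20.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: the larger of 42 and the element count, as a signed value.
std::ptrdiff_t at_least_42(std::vector<int> const& v)
{
    // Stand-in for std::ssize(v), written so it also builds before C++20.
    auto const count = static_cast<std::ptrdiff_t>(v.size());

    // Naming the template argument converts the int literal 42 to
    // std::ptrdiff_t, so both arguments have the same type on every platform.
    return std::max<std::ptrdiff_t>(42, count);
}
```

This works everywhere, but the type has to be spelled out at every call site.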
This is where the new literal suffixes help:
std::vector<int> v {1, 1, 2, 3, 5, 8};
for(auto i = 0uz; i < v.size(); ++i)
{
   std::cout << v[i] << '\n';
}

auto m = std::max(42z, std::ssize(v));
This code works the same on all platforms.
Multidimensional subscript operator
Sometimes we need to work with multidimensional containers (or views). Accessing elements in a one-dimensional container can be done with the subscript operator (such as arr[0] or v[i]). But for a multidimensional type, the subscript operator does not work nicely. You cannot say arr[0, 1, 2] and have it mean a three-index access: before C++23 the commas form a comma expression, so this is the same as arr[2]. The alternatives are:
- define an access function, such as at(), with any number of parameters (so you could say c.at(0, 1, 2))
- overload the call operator (so you could say c(0, 1, 2))
- overload the subscript operator with a brace-enclosed list (so you could say c[{1,2,3}])
- chain single-argument array access operators (so you could say c[0][1][2]) which is probably leading to the least desirable APIs and usage
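The brace-enclosed variant is probably the least familiar of these, so here is a minimal sketch of it. The type `grid2x3` and its fixed 2x3 shape are invented for this illustration; the point is only that the operator takes a single parameter (an array of indices), which is why it is valid before C++23.

```cpp
#include <array>
#include <cstddef>

// Hypothetical 2x3 grid of ints, indexed with a brace-enclosed pair: g[{row, col}].
struct grid2x3
{
    // The single parameter is an array of two indices, so this overload is
    // valid before C++23: the braces at the call site initialize the array.
    int& operator[](std::array<std::size_t, 2> const idx) noexcept
    {
        return cells_[idx[0] * 3 + idx[1]];
    }

    int const& operator[](std::array<std::size_t, 2> const idx) const noexcept
    {
        return cells_[idx[0] * 3 + idx[1]];
    }

private:
    std::array<int, 6> cells_{};  // row-major storage, zero-initialized
};
```

With this in place, `g[{1, 2}] = 7;` writes the element in row 1, column 2. It works, but the extra braces are easy to forget.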
To demonstrate the point, let’s consider a matrix class (that represents a two-dimensional array). A simplistic implementation and usage is as follows:
#include <array>
#include <cstddef>
#include <iostream>

template <typename T, std::size_t R, std::size_t C>
struct matrix
{
   // element access through the call operator; storage is row-major
   T& operator()(std::size_t const r, std::size_t const c) noexcept
   {
      return data_[r * C + c];
   }

   T const & operator()(std::size_t const r, std::size_t const c) const noexcept
   {
      return data_[r * C + c];
   }

   static constexpr std::size_t Rows = R;
   static constexpr std::size_t Columns = C;

private:
   std::array<T, R * C> data_;
};

int main()
{
   matrix<int, 2, 3> m;

   // fill the matrix with 1, 2, 3, ... in row-major order
   for (std::size_t i = 0; i < m.Rows; ++i)
   {
      for (std::size_t j = 0; j < m.Columns; ++j)
      {
         m(i, j) = i * m.Columns + (j + 1);
      }
   }

   // print one row per line
   for (std::size_t i = 0; i < m.Rows; ++i)
   {
      for (std::size_t j = 0; j < m.Columns; ++j)
      {
         std::cout << m(i, j) << ' ';
      }
      std::cout << '\n';
   }
}
I never liked the m(i, j) syntax, but this was the best we could do until C++23, IMO. Now, we can overload the subscript operator with multiple parameters:
T& operator[](std::size_t const r, std::size_t const c) noexcept
{
   return data_[r * C + c];
}

T const & operator[](std::size_t const r, std::size_t const c) const noexcept
{
   return data_[r * C + c];
}
We can now use the new matrix implementation as follows:
int main()
{
   matrix<int, 3, 2> m;

   for (std::size_t i = 0; i < m.Rows; ++i)
   {
      for (std::size_t j = 0; j < m.Columns; ++j)
      {
         m[i, j] = i * m.Columns + (j + 1);
      }
   }

   for (std::size_t i = 0; i < m.Rows; ++i)
   {
      for (std::size_t j = 0; j < m.Columns; ++j)
      {
         std::cout << m[i, j] << ' ';
      }
      std::cout << '\n';
   }
}
I just wish we’d had this twenty years ago!
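If the same code has to build with compilers that do not support the feature yet, one option is to guard the new overload with the `__cpp_multidimensional_subscript` feature-test macro and keep the call operator as the common denominator. This is only a sketch of that idea; `portable_matrix` and `write_then_read` are names made up for the example.

```cpp
#include <array>
#include <cstddef>

// Sketch of a fixed-size matrix that offers m[r, c] only where the compiler
// reports support for multidimensional subscripts, and m(r, c) everywhere.
template <typename T, std::size_t R, std::size_t C>
struct portable_matrix
{
    // Call-operator access: available in every language version.
    T& operator()(std::size_t const r, std::size_t const c) noexcept
    {
        return data_[r * C + c];
    }

#if defined(__cpp_multidimensional_subscript)
    // C++23 access: forwards to the call operator above.
    T& operator[](std::size_t const r, std::size_t const c) noexcept
    {
        return (*this)(r, c);
    }
#endif

private:
    std::array<T, R * C> data_{};  // row-major storage, value-initialized
};

// Hypothetical helper: writes with m[r, c] when it exists (m(r, c) otherwise)
// and reads the same element back through the call operator.
inline int write_then_read()
{
    portable_matrix<int, 2, 3> m;
#if defined(__cpp_multidimensional_subscript)
    m[1, 2] = 42;
#else
    m(1, 2) = 42;
#endif
    return m(1, 2);
}
```

Both access paths reach the same element, so callers can migrate to the new syntax gradually.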
contains() member function for string/string_view
C++20 added the starts_with() and ends_with() member functions to std::basic_string and std::basic_string_view. These enable us to check whether a string starts with a given prefix or ends with a given suffix.
#include <iostream>
#include <string>

int main()
{
   std::string text = "lorem ipsum dolor sit amet";

   std::cout << std::boolalpha;

   std::cout << text.starts_with("lorem") << '\n'; // true
   std::cout << text.starts_with("ipsum") << '\n'; // false
   std::cout << text.ends_with("dolor") << '\n';   // false
   std::cout << text.ends_with("amet") << '\n';    // true
}
Unfortunately, these don’t help us check whether a string contains a given substring. Of course, this is possible with the find() function. But this returns the position of the first character of the found substring, or npos otherwise, so we need to do a check as follows:
std::cout << (text.find("dolor") != std::string::npos) << '\n';
I find this cumbersome and ugly when you just want to know if a string contains a particular substring or character.
In C++23, the circle is complete, as the same feature is available with the new contains() member function. This function enables us to check whether a substring or a single character is present anywhere in the string. This is basically the same as find(x) != npos. But the syntax is nicer and in line with starts_with() and ends_with().
std::cout << text.contains("dolor") << '\n';
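For code that must also compile before C++23, the equivalence with find() can be wrapped in a small helper. This is a minimal sketch, assuming the standard `__cpp_lib_string_contains` feature-test macro is used to pick the branch; the helper name `has_substring` is hypothetical.

```cpp
#include <string_view>  // also defines __cpp_lib_string_contains when available

// Hypothetical helper: true when `needle` occurs anywhere in `haystack`.
constexpr bool has_substring(std::string_view haystack,
                             std::string_view needle) noexcept
{
#if defined(__cpp_lib_string_contains)
    return haystack.contains(needle);                         // C++23
#else
    return haystack.find(needle) != std::string_view::npos;   // earlier standards
#endif
}
```

Both branches do a case-sensitive comparison, so `has_substring(text, "Dolor")` is false for the sample text used above.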
So, no case insensitive flag? Another disappointment from C++…
Could you explain why “chain single-argument array access operators (so you could say c[0][1][2]) which is probably leading to the least desirable APIs and usage”?
It seems to be the most common and wide-spread method there is.