Three C++23 features for common use

C++23 is the current working version of the C++ standard. No major feature has been included so far, but a series of smaller ones, as well as many defect reports, have already made it into the standard. The current status and the compiler support for the new features are tracked online. Many of these new features are small improvements or things you probably wouldn’t use on a regular basis. However, I want to point out three C++23 features that, in my opinion, stand out as more likely to be used often.

Literal suffixes for size_t and ptrdiff_t

std::size_t is an unsigned data type (of at least 16 bits) that can hold the maximum size of an object of any type. It can safely store the index of an array on any platform. It is the type returned by the sizeof, sizeof..., and alignof operators.

std::ptrdiff_t is a signed data type (of at least 17 bits) that represents the type of the result of subtracting two pointers.

In C++23, these have their own integer literal suffixes.

Literal suffix       | Deduced type                        | Example
uz or uZ or Uz or UZ | std::size_t                         | auto a = 42uz;
z or Z               | signed std::size_t (std::ptrdiff_t) | auto b = -42z;
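
To make the table concrete, here is a minimal sketch that checks the deduced types at compile time. It assumes a compiler running in C++23 mode (for example -std=c++23 on a recent GCC or Clang), and the variable names are mine, chosen only for illustration.

```cpp
#include <cstddef>
#include <type_traits>

// Requires C++23: the uz and z suffixes do not exist in earlier standards.
constexpr auto unsigned_value = 42uz;  // deduced as std::size_t
constexpr auto signed_value = -42z;    // deduced as the signed counterpart of std::size_t

// constexpr variables are const, so strip the qualifier before comparing types.
static_assert(std::is_same_v<std::remove_const_t<decltype(unsigned_value)>,
                             std::size_t>);
static_assert(std::is_same_v<std::remove_const_t<decltype(signed_value)>,
                             std::make_signed_t<std::size_t>>);
```

On most platforms the signed counterpart is the same type as std::ptrdiff_t, although the standard words it as the signed integer type corresponding to std::size_t.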

Let’s see how this is useful. In C++20, we could write the following:

std::vector<int> v {1, 1, 2, 3, 5, 8};
for(auto i = 0u; i < v.size(); ++i)
{
   std::cout << v[i] << '\n';
}

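The deduced type of the variable i is unsigned int. This works fine on 32-bit platforms, where both unsigned int and size_t (the return type of the size() member function) are 32 bits wide. On 64-bit platforms, however, unsigned int is typically still 32 bits while size_t is 64 bits. You may get a compiler warning, and i cannot represent every possible index: for a container with more than 2^32 - 1 elements, it would wrap around to zero before ever reaching v.size().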

On the other hand, we can have the following:

std::vector<int> v {1, 1, 2, 3, 5, 8};
auto m = std::max(42, std::ssize(v)); // compiles on 32-bit but fails on 64-bit

std::vector<int> v {1, 1, 2, 3, 5, 8};
auto m = std::max(42ll, std::ssize(v)); // compiles where ptrdiff_t is long long (e.g. 64-bit Windows) but fails on 32-bit

Neither of these two versions works on both 32-bit and 64-bit platforms.

This is where the new literal suffixes help:

std::vector<int> v {1, 1, 2, 3, 5, 8};
for(auto i = 0uz; i < v.size(); ++i)
{
   std::cout << v[i] << '\n';
}

auto m = std::max(42z, std::ssize(v));

This code works the same on all platforms.


Multidimensional subscript operator

Sometimes we need to work with multidimensional containers (or views). Accessing elements in a one-dimensional container can be done with the subscript operator (such as arr[0] or v[i]). But for a multidimensional type, the subscript operator does not work nicely. Before C++23, you could not say arr[0, 1, 2] and get a three-index access, because the subscript operator took exactly one argument and the commas were parsed as the comma operator. The alternatives are:

  • define an access function, such as at(), with any number of parameters (so you could say c.at(0, 1, 2))
  • overload the call operator (so you could say c(0, 1, 2))
  • overload the subscript operator to take a brace-enclosed list (so you could say c[{1,2,3}])
  • chain single-argument array access operators (so you could say c[0][1][2]), which probably leads to the least desirable APIs and usage
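
As an illustration of the third alternative, here is a minimal sketch of a subscript operator that takes a brace-enclosed {row, column} pair. The grid type is hypothetical and the row-major layout is my own assumption; this version does not need C++23.

```cpp
#include <array>
#include <cstddef>

// Hypothetical two-dimensional container, for illustration only.
template <typename T, std::size_t R, std::size_t C>
struct grid
{
   // A single parameter that can be initialized from a braced list: g[{r, c}]
   T& operator[](std::array<std::size_t, 2> const idx) noexcept
   {
      return data_[idx[0] * C + idx[1]];
   }

   T const& operator[](std::array<std::size_t, 2> const idx) const noexcept
   {
      return data_[idx[0] * C + idx[1]];
   }

private:
   std::array<T, R * C> data_{};  // value-initialized, so all elements start at zero
};
```

With this in place, g[{1, 2}] = 7; writes to row 1, column 2. It works, but the extra braces are easy to forget and the syntax does not resemble built-in arrays.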

To demonstrate the point, let’s consider a matrix class (that represents a two dimensional array). A simplistic implementation and usage is as follows:

#include <array>
#include <cstddef>
#include <iostream>

template <typename T, size_t R, size_t C>
struct matrix
{
   T& operator()(size_t const r, size_t const c) noexcept
   {
      return data_[r * C + c];
   }

   T const & operator()(size_t const r, size_t const c) const noexcept
   {
      return data_[r * C + c];
   }

   static constexpr size_t Rows = R;
   static constexpr size_t Columns = C;
private:
   std::array<T, R * C> data_;
};

int main()
{
   matrix<int, 2, 3> m;
   for (size_t i = 0; i < m.Rows; ++i)
   {
      for (size_t j = 0; j < m.Columns; ++j)
      {
         m(i, j) = i * m.Columns + (j + 1);
      }
   }

   for (size_t i = 0; i < m.Rows; ++i)
   {
      for (size_t j = 0; j < m.Columns; ++j)
      {
         std::cout << m(i, j) << ' ';
      }

      std::cout << '\n';
   }
}

I never liked the m(i, j) syntax, but this was the best we could do until C++23, IMO. Now, we can overload the subscript operator with multiple parameters:

T& operator[](size_t const r, size_t const c) noexcept
{
   return data_[r * C + c];
}

T const & operator[](size_t const r, size_t const c) const noexcept
{
   return data_[r * C + c];
}

We can now use the new matrix implementation as follows:

int main()
{
   matrix<int, 3, 2> m;
   for (size_t i = 0; i < m.Rows; ++i)
   {
      for (size_t j = 0; j < m.Columns; ++j)
      {
         m[i, j] = i * m.Columns + (j + 1);
      }
   }
    
   for (size_t i = 0; i < m.Rows; ++i)
   {
      for (size_t j = 0; j < m.Columns; ++j)
      {
         std::cout << m[i, j] << ' ';
      }
       
      std::cout << '\n';
   }    
}

I just wish we had this twenty years ago!


contains() member function for string/string_view

C++20 added the starts_with() and ends_with() member functions to std::basic_string and std::basic_string_view. These enable us to check whether a string starts with a given prefix or ends with a given suffix.

int main()
{
   std::string text = "lorem ipsum dolor sit amet";

   std::cout << std::boolalpha;

   std::cout << text.starts_with("lorem") << '\n'; // true
   std::cout << text.starts_with("ipsum") << '\n'; // false

   std::cout << text.ends_with("dolor") << '\n';   // false
   std::cout << text.ends_with("amet") << '\n';    // true
}

Unfortunately, these don’t help us check whether a string contains a given substring. Of course, this is possible with the find() function. But this returns the position of the first character of the found substring, or npos otherwise, so we need to do a check as follows:

std::cout << (text.find("dolor") != std::string::npos) << '\n';

I find this cumbersome and ugly when you just want to know if a string contains a particular substring or character.

In C++23, the circle is complete, as the same feature is available with the new contains() member function. This function enables us to check whether a substring or a single character is present anywhere in the string. This is basically the same as find(x) != npos. But the syntax is nicer and in line with starts_with() and ends_with().

std::cout << text.contains("dolor") << '\n';

