A comparison of two std::transform alternatives revisited

In the previous post, I compared two alternative ways of transforming a string to upper case, both using std::transform: one that modifies the existing string in place and one that builds a new string by appending characters through std::back_inserter. For the second alternative, I presented two implementations: one that performs an initial reservation for the newly created string and one that does not.
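For reference, the in-place variant (v1) looked roughly like this; this is a sketch reconstructed from the description above, so the original may differ in details. The headers listed here are the ones all the snippets in this post rely on.

#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

inline std::string to_upper_v1(std::string text)
{
   // v1: transform in place, reading and writing the same buffer
   std::transform(std::begin(text), std::end(text), std::begin(text), toupper);
   return text;
}

The two back_inserter variants were: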

inline std::string to_upper_v2(std::string const & text)
{
   // v2: start from an empty string and let back_inserter grow it as needed
   auto uppertext = std::string{};
   std::transform(std::begin(text), std::end(text), std::back_inserter(uppertext), toupper);
   return uppertext;
}

inline std::string to_upper_v3(std::string const & text)
{
   // v3: identical, but reserve the final size up front to avoid reallocations
   auto uppertext = std::string{};
   uppertext.reserve(text.size());
   std::transform(std::begin(text), std::end(text), std::back_inserter(uppertext), toupper);
   return uppertext;
}
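As an aside, passing toupper straight to std::transform is fragile: it only compiles when a single matching overload is visible, and calling it with a plain char whose value is negative is undefined behavior. A safer spelling, should you need it, wraps the call in a lambda (to_upper_safe is my own name, not part of the original code):

inline std::string to_upper_safe(std::string const & text)
{
   auto uppertext = std::string{};
   uppertext.reserve(text.size());
   std::transform(std::begin(text), std::end(text), std::back_inserter(uppertext),
                  [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
   return uppertext;
}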

The curious conclusion of the tests was that the version with reserve was actually slower than the one that did not perform an initial reservation.

The solution was built with Visual Studio 2015 Update 2. As was later pointed out in the comments, the actual cause is the small string optimization in Microsoft's std::string implementation: strings of up to 16 characters are stored in an internal array inside the string object, and memory is dynamically allocated only for larger strings. Since all the test strings had a length between 3 and 12 characters, this optimization applied to every one of them. As a result, reserve() dynamically allocated memory that was never used, and its execution time only added to the overall time.
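You can observe this behavior directly by inspecting capacity(). A minimal sketch, assuming MSVC's implementation (the exact values are implementation-specific and will differ with other standard libraries):

#include <iostream>
#include <string>

int main()
{
   auto s = std::string{};
   // with the small string optimization, a default-constructed string already
   // has room for short texts inside the object itself, with no heap allocation
   std::cout << "default capacity: " << s.capacity() << '\n'; // 15 with this MSVC version

   s.reserve(12);  // fits in the internal buffer: no allocation takes place
   std::cout << "after reserve(12): " << s.capacity() << '\n';

   s.reserve(20);  // exceeds the buffer: memory is now allocated dynamically
   std::cout << "after reserve(20): " << s.capacity() << '\n';
}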

To actually test the performance of these two implementations with VC++, the strings have to be longer than 16 characters. So I changed the code to generate strings between 17 and 25 characters long.

auto dist_len = std::uniform_int_distribution<>{ 17, 25 };
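The post does not show the full generator, so here is a hypothetical sketch of how test strings can be produced with this distribution (make_random_string and the engine parameter are my own names, not the original code):

#include <random>
#include <string>

std::string make_random_string(std::mt19937 & engine)
{
   auto dist_len = std::uniform_int_distribution<>{ 17, 25 };
   auto dist_chr = std::uniform_int_distribution<>{ 'a', 'z' };

   // draw the length first, then fill the string with random lowercase letters
   auto text = std::string(dist_len(engine), '\0');
   for (auto & ch : text)
      ch = static_cast<char>(dist_chr(engine));
   return text;
}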

The results this time were totally different. The third version, with the initial reservation, was more performant than the one without it. It can also be noticed that the more strings need to be transformed, the closer the times of all the versions get.

No of strings     time v1     time v2     time v3   slowdown v2 vs v1 (%)   slowdown v3 vs v1 (%)
1000                  122         219         205                    79.5                    68.0
10000                1202        2178        2055                    81.2                    71.0
100000              13563       22758       21431                    67.8                    58.0
1000000            136160      225669      214149                    65.7                    57.3
10000000          1368034     2268991     2155969                    65.9                    57.6
100000000        23090172    27997658    27322888                    21.3                    18.3
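For context, durations like the ones above can be collected with a harness along the following lines. This is an assumed sketch, not the original benchmark code; the measure helper is my own name, and microseconds as the unit is my assumption.

#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

template <typename F>
long long measure(std::vector<std::string> const & strings, F && transform_fn)
{
   auto const start = std::chrono::high_resolution_clock::now();

   auto total = std::size_t{ 0 };
   for (auto const & s : strings)
      total += transform_fn(s).size(); // consume the result so the work is not optimized away

   auto const end = std::chrono::high_resolution_clock::now();
   return std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
}

Called as measure(strings, to_upper_v2) and measure(strings, to_upper_v3) for each batch size, such a helper yields one row of the table at a time.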

In the chart below, blue represents the results for version 2 and orange the results for version 3 (with the initial reservation).
[chart: transformation times for v2 (blue) and v3 (orange)]

Note: Generating 100 million strings between 17 and 25 characters long requires a lot of memory; in my tests, usage peaked at 13 GB. Keep this in mind if you want to run the code.

1 Reply to “A comparison of two std::transform alternatives revisited”

  1. Compiling the code with gcc version 11.2.0 (Rev5, Built by MSYS2 project) on Windows 10, my results are:

    for dist_len {3, 12}:

    duration v1 (1000) = 0
    duration v2 (1000) = 0
    duration v3 (1000) = 0
    duration v1 (10000) = 1001
    duration v2 (10000) = 0
    duration v3 (10000) = 0
    duration v1 (100000) = 3000
    duration v2 (100000) = 1953
    duration v3 (100000) = 2003
    duration v1 (1000000) = 29952
    duration v2 (1000000) = 27020
    duration v3 (1000000) = 24999
    duration v1 (10000000) = 276953
    duration v2 (10000000) = 257000
    duration v3 (10000000) = 235970

    for dist_len {17, 25}:

    duration v1 (1000) = 0
    duration v2 (1000) = 0
    duration v3 (1000) = 984
    duration v1 (10000) = 982
    duration v2 (10000) = 999
    duration v3 (10000) = 1006
    duration v1 (100000) = 12000
    duration v2 (100000) = 14010
    duration v3 (100000) = 14000
    duration v1 (1000000) = 108033
    duration v2 (1000000) = 116949
    duration v3 (1000000) = 101997
    duration v1 (10000000) = 1042996
    duration v2 (10000000) = 1190016
    duration v3 (10000000) = 1025000

    I didn’t run the test for 100 million strings.
