I have decided to compare file I/O operations across Win32, CRT, STL and MFC.
For all four libraries/APIs I have done the profiling in the following way:
- open the file
- allocate the buffer used for reading
- start the timer
- read/write from/to the file
- stop the timer
- close the file
- release the memory
This way, the profiling covers only the read or write operations, not other tasks such as opening and closing files or allocating and releasing memory.
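The steps above can be sketched for the CRT case as follows. This is only a minimal illustration, not the benchmark's actual code: the function name `time_crt_read` is hypothetical, and `std::chrono::steady_clock` (C++11) is an assumption, since the article does not say which timer was used.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

// Times only the fread loop; opening, allocating, closing and releasing
// all happen outside the timed region.
// Returns the elapsed seconds, or -1.0 if the file cannot be opened.
// bytes_read (optional) receives the total number of bytes read.
double time_crt_read(const char* path, std::size_t buffer_size,
                     std::size_t* bytes_read = nullptr)
{
    assert(buffer_size > 0);

    std::FILE* file = std::fopen(path, "rb");              // open the file
    if (!file) return -1.0;

    std::vector<char> buffer(buffer_size);                 // allocate the buffer

    std::size_t total = 0;
    const auto start = std::chrono::steady_clock::now();   // start the timer
    std::size_t n;
    while ((n = std::fread(buffer.data(), 1, buffer.size(), file)) > 0)
        total += n;                                        // read from the file
    const auto stop = std::chrono::steady_clock::now();    // stop the timer

    std::fclose(file);                                     // close the file
    if (bytes_read) *bytes_read = total;
    return std::chrono::duration<double>(stop - start).count();
}   // the vector releases the memory when the function returns
```

Calling such a function several times per buffer size and averaging the returned values would mirror the averaged figures reported in the tables.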
What I’ve used:
- Win32: functions CreateFile, ReadFile, WriteFile and CloseHandle
- CRT: type FILE and functions fopen, fread, fwrite and fclose
- STL: for reading, class std::ifstream and methods open(), read() and close(); for writing, class std::ofstream and methods open(), write() and close()
- MFC: class CFile and methods Open(), Read(), Write() and Close()
I have performed the reading with different buffer sizes: 32, 64, 128, 256 and 512 bytes, and 1KB, 2KB, 4KB, 8KB, 16KB and 32KB, as well as with a buffer accommodating the entire file. The same buffer sizes were used for writing. For testing the write operation I also wrote the whole file at once. In all cases, I generated a 16MB file.
To decide which one is better overall, I have associated a score with each result. For each buffer size, the fastest got 4 points, the next 3, then 2 and 1. The higher the total, the better the overall performance.
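The scoring rule can be sketched as below. This is an illustration under assumptions: the function name `score_row` is hypothetical, the column order (CRT, Win32, STL, MFC) follows the tables, and ties are broken by column position, which the article does not specify.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>

// Ranks the four timings measured for one buffer size: the smallest time
// gets 4 points, the next 3, then 2, and the largest gets 1.
std::array<int, 4> score_row(const std::array<double, 4>& times)
{
    // order[k] is the index of the k-th fastest library.
    std::array<std::size_t, 4> order{};
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&times](std::size_t a, std::size_t b) {
                         return times[a] < times[b];
                     });

    std::array<int, 4> scores{};
    for (std::size_t rank = 0; rank < order.size(); ++rank)
        scores[order[rank]] = 4 - static_cast<int>(rank);

    // Every row hands out 4 + 3 + 2 + 1 points in total.
    assert(scores[0] + scores[1] + scores[2] + scores[3] == 10);
    return scores;
}
```

Summing the returned scores per column over all buffer sizes gives the totals shown at the bottom of each table.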
I have run the program on two files for reading, on an Intel(R) Pentium(R) 4 CPU at 3.20GHz with 1 GB RAM, running Windows XP SP2. The results, each representing an average of 15 runs, are shown below:
File 1: size 2,131,287 bytes
Buffer Size | CRT time | Win32 time | STL time | MFC time | CRT score | Win32 score | STL score | MFC score |
32 | 0.01917630 | 0.063093700 | 0.02123180 | 0.064283700 | 4 | 2 | 3 | 1 |
64 | 0.01474360 | 0.031909200 | 0.01460960 | 0.032482700 | 3 | 2 | 4 | 1 |
128 | 0.01118370 | 0.016183700 | 0.01164060 | 0.016426700 | 4 | 2 | 3 | 1 |
256 | 0.00929148 | 0.008573490 | 0.01063090 | 0.008840810 | 2 | 4 | 1 | 3 |
512 | 0.01071420 | 0.004684040 | 0.00985086 | 0.004745970 | 1 | 4 | 2 | 3 |
1024 | 0.00883909 | 0.002584480 | 0.00907385 | 0.002486950 | 2 | 3 | 1 | 4 |
2048 | 0.00847502 | 0.001531440 | 0.00894887 | 0.001477660 | 2 | 3 | 1 | 4 |
4096 | 0.00776395 | 0.000981391 | 0.00891128 | 0.001009350 | 2 | 4 | 1 | 3 |
8192 | 0.00740465 | 0.000744340 | 0.00913489 | 0.000749145 | 2 | 4 | 1 | 3 |
16384 | 0.00740928 | 0.000604900 | 0.00936410 | 0.000673978 | 2 | 4 | 1 | 3 |
32768 | 0.00736531 | 0.000657141 | 0.00837419 | 0.000610040 | 2 | 3 | 1 | 4 |
file size | 0.00955846 | 0.002496180 | 0.00981464 | 0.002428280 | 2 | 3 | 1 | 4 |
Total score | | | | | 28 | 38 | 20 | 34 |
File 2: size 110,999,662 bytes
Buffer Size | CRT time | Win32 time | STL time | MFC time | CRT score | Win32 score | STL score | MFC score |
32 | 1.011360 | 3.3216500 | 2.47695 | 3.2822700 | 4 | 1 | 3 | 2 |
64 | 0.742683 | 1.6815600 | 0.804563 | 1.6836300 | 4 | 2 | 3 | 1 |
128 | 0.600344 | 0.8697840 | 0.639113 | 0.8750610 | 4 | 2 | 3 | 1 |
256 | 0.521233 | 0.4661430 | 0.586376 | 0.4751340 | 2 | 4 | 1 | 3 |
512 | 0.501420 | 0.2734540 | 0.532212 | 0.2653010 | 2 | 3 | 1 | 4 |
1024 | 0.474670 | 0.1532950 | 0.510266 | 0.1587330 | 2 | 4 | 1 | 3 |
2048 | 0.458538 | 0.1012430 | 0.479981 | 0.1067980 | 2 | 4 | 1 | 3 |
4096 | 0.432552 | 0.0715536 | 0.488251 | 0.0774886 | 2 | 4 | 1 | 3 |
8192 | 0.417481 | 0.0607284 | 0.467426 | 0.0674372 | 2 | 4 | 1 | 3 |
16384 | 0.400320 | 0.0510897 | 0.458111 | 0.0602826 | 2 | 4 | 1 | 3 |
32768 | 0.406497 | 0.0503835 | 0.461796 | 0.0572124 | 2 | 4 | 1 | 3 |
file size | 0.523950 | 0.1867240 | 0.583327 | 0.1828440 | 2 | 3 | 1 | 4 |
Total score | | | | | 30 | 39 | 18 | 33 |
The first conclusion is that Win32 is the fastest overall, followed by MFC, then by CRT, with STL being the slowest.
The second conclusion is that CRT is the fastest with buffer sizes smaller than 256 bytes; from 256 bytes up, Win32 and MFC are faster.
The results for writing were quite similar. Of course, repeated runs can produce slight variations in the results (both for read and write).
File 3: size 16,809,984 bytes
Buffer Size | CRT time | Win32 time | STL time | MFC time | CRT score | Win32 score | STL score | MFC score |
32 | 0.273796 | 0.890973 | 0.335245 | 0.877301 | 4 | 1 | 3 | 2 |
64 | 0.219715 | 0.465254 | 0.259597 | 0.450076 | 4 | 1 | 3 | 2 |
128 | 0.181927 | 0.24715 | 0.201949 | 0.245169 | 4 | 1 | 3 | 2 |
256 | 0.178976 | 0.141146 | 0.189154 | 0.143666 | 2 | 4 | 1 | 3 |
512 | 0.153816 | 0.0872411 | 0.172239 | 0.0851424 | 2 | 3 | 1 | 4 |
1024 | 0.148846 | 0.0608282 | 0.159186 | 0.0601419 | 2 | 3 | 1 | 4 |
2048 | 0.139997 | 0.0493811 | 0.150503 | 0.0496117 | 2 | 4 | 1 | 3 |
4096 | 0.125797 | 0.0705146 | 0.15275 | 0.0508061 | 2 | 3 | 1 | 4 |
8192 | 0.126708 | 0.15708 | 0.1459 | 0.0655567 | 3 | 1 | 2 | 4 |
16384 | 0.121919 | 0.0282886 | 0.14662 | 0.158024 | 3 | 4 | 2 | 1 |
32768 | 0.124429 | 0.0247259 | 0.145496 | 0.0267301 | 2 | 4 | 1 | 3 |
16809984 | 0.148424 | 0.47066 | 0.146321 | 0.513205 | 3 | 2 | 4 | 1 |
Total score | | | | | 33 | 31 | 23 | 33 |
You can download the project I used for the benchmark from here.
I think you should include the time consumed in creating the buffer: even if larger chunks take more time to read, they might save some time in allocation. Also, it might be interesting to see how the speeds vary when you keep appending the read data at each iteration instead of destroying the buffer, for example by using a growing array (such as a vector).
Nice article/program.
Two suggestions:
1. If you open both ifstream and fopen using the BINARY flag, you will get a much better comparison.
2. I have written tons of benchmark programs, and one key lesson is to always verify that the code works before measuring it. I recommend computing a checksum of the data read, which should be identical for all implementations for the comparison to be fair and accurate.
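A minimal sketch of both suggestions, assuming a 64-bit FNV-1a hash as the checksum (any checksum would do) and hypothetical helper names `fnv1a_update`, `checksum_crt` and `checksum_stl`. Both readers open the file in binary mode, so the two results should match for the same file regardless of buffer size:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <vector>

const std::uint64_t kFnvOffsetBasis = 0xcbf29ce484222325ULL;

// Folds a chunk of bytes into a running 64-bit FNV-1a hash.
std::uint64_t fnv1a_update(std::uint64_t hash, const char* data, std::size_t size)
{
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= static_cast<unsigned char>(data[i]);
        hash *= 0x100000001b3ULL;   // FNV-1a 64-bit prime
    }
    return hash;
}

// Checksum of the file contents as read through the CRT (binary mode).
// Returns the offset basis if the file cannot be opened.
std::uint64_t checksum_crt(const char* path, std::size_t buffer_size)
{
    assert(buffer_size > 0);
    std::uint64_t hash = kFnvOffsetBasis;
    std::FILE* file = std::fopen(path, "rb");
    if (!file) return hash;
    std::vector<char> buffer(buffer_size);
    std::size_t n;
    while ((n = std::fread(buffer.data(), 1, buffer.size(), file)) > 0)
        hash = fnv1a_update(hash, buffer.data(), n);
    std::fclose(file);
    return hash;
}

// Checksum of the same file as read through std::ifstream (binary mode).
std::uint64_t checksum_stl(const char* path, std::size_t buffer_size)
{
    assert(buffer_size > 0);
    std::uint64_t hash = kFnvOffsetBasis;
    std::ifstream file(path, std::ios::binary);
    std::vector<char> buffer(buffer_size);
    while (file) {
        file.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        // A short final read sets failbit, so gcount() tells how much arrived.
        const std::streamsize got = file.gcount();
        if (got > 0)
            hash = fnv1a_update(hash, buffer.data(), static_cast<std::size_t>(got));
    }
    return hash;
}
```

If `checksum_crt(path, size)` and `checksum_stl(path, size)` disagree for some file, the two read loops are not processing the same bytes and their timings should not be compared.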
I ran your write test with a target file larger than my virtual space (16 GB); STL and CRT are identical and maintain an 80 MB/sec write transfer rate. MFC and Win32 start out much slower and only hit 80 MB/s with a record size of 512 and greater. The lesson learned is as expected: buffered I/O (fstream and fopen) is much faster on small transactions than unbuffered I/O (CFile, CreateFile), and all four are identical when the transaction exceeds their native buffers, because the call is then a pass-through. So always use fstream or fopen, because they were designed to be the best general-purpose I/O API.
Please review:
http://home.comcast.net/~lang.dennis/code/fileio/fileio.html