std::ifstream buffer caching

Posted by ledokol on Stack Overflow
Published on 2010-12-29T21:35:59Z Indexed on 2010/12/30 19:54 UTC

Hello everybody,

In my application I'm trying to merge sorted files (keeping the result sorted, of course), so I have to iterate through each element in both files and write the smaller one to a third file. This is quite slow on big files. Since I don't see any way around the iteration itself, I'm trying to optimize the file reading instead. I have some RAM to spare that I can use for buffering: instead of reading 4 bytes from each file every time, I could read something like 100 MB at once, work from that buffer until it is exhausted, and then refill it. But I guess ifstream already buffers internally. Would my own buffer give me any more performance, and if so, why? And if ifstream does buffer, can I change the size of its buffer?

Added:

My current code looks like this (pseudocode):

// this is done in a loop
int i1 = input1.read_integer();
int i2 = input2.read_integer();
if (!input1.eof() && !input2.eof())
{
   if (i1 < i2)
   {
      output.write(i1);
      input2.seek_back(sizeof(int));   // i2 was not used, so un-read it
   } else
   {
      output.write(i2);
      input1.seek_back(sizeof(int));   // i1 was not used, so un-read it
   }
} else {
   // at least one input is exhausted; write only a value that was really read
   if (!input2.eof())
      output.write(i2);
   else if (!input1.eof())
      output.write(i1);
}

What I don't like here is

  • seek_back - I have to seek back to the previous position because there is no way to peek at the next 4 bytes
  • too many reads from the files
  • if one of the streams is at EOF, the loop still keeps checking that stream instead of copying the rest of the other stream directly to the output; this is not a big issue, though, because the chunk sizes are almost always equal.

Can you suggest any improvements?

Thanks.
