A question about memory allocation in Linux

JQ

I have a programming question and hope to get your suggestions. I
have a C++ program that reads files. First, I allocate a large piece
of memory. Then, after reading each line, I fill the data from that
line into the memory.

The whole program may use up to 1 GB of memory. However, the program
is very slow while reading the file and filling in the data. The
machine has 4 GB of memory. It runs CentOS, and the program is
written in C++ and compiled with g++.

I tried reading the file only, which is very fast.

It seems that allocating memory is slow. Do you have any suggestions?
 
Jerry Coffin

I have a programming question and hope to get your suggestions. I
have a C++ program that reads files. First, I allocate a large piece
of memory. Then, after reading each line, I fill the data from that
line into the memory.

[ ... ]
It seems that allocating memory is slow. Do you have any
suggestions?

The usual suggestion for Linux (or anything very similar) would be to
memory map the file. Then you'd (probably) want to create an
"index" -- an array of pointers to the beginnings of the lines.

Unfortunately, doing that will take some system-specific code that's
not entirely topical here...
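
Something along these lines, as a sketch only (assuming POSIX mmap on
Linux; error handling is omitted, and build_line_index is just an
illustrative name):

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <vector>

// Map the whole file read-only and collect a pointer to the start of
// each line. Sketch only: no error checking on open/fstat/mmap.
std::vector<const char*>
build_line_index(const char* path, const char*& data, size_t& size)
{
    int fd = open(path, O_RDONLY);
    struct stat sb;
    fstat(fd, &sb);
    size = sb.st_size;
    data = static_cast<const char*>(
        mmap(0, size, PROT_READ, MAP_PRIVATE, fd, 0));
    close(fd);                       // the mapping survives the close

    std::vector<const char*> index;
    if (size != 0)
        index.push_back(data);
    for (size_t i = 0; i + 1 < size; ++i)
        if (data[i] == '\n')
            index.push_back(data + i + 1);
    return index;
}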
 
Victor Bazarov

JQ said:
I have a programming question and hope to get your suggestions. I
have a C++ program that reads files. First, I allocate a large piece
of memory. Then, after reading each line, I fill the data from that
line into the memory.

You should probably look at 'std::getline'...
The whole program may use up to 1 GB of memory. However, the program
is very slow while reading the file and filling in the data. The
machine has 4 GB of memory. It runs CentOS, and the program is
written in C++ and compiled with g++.

I tried reading the file only, which is very fast.

It seems that allocating memory is slow. Do you have any suggestions?

Yes, well, look at 'std::getline' first. Then don't allocate too much
memory. Often, especially if reading the file is fast, read the file
while counting bytes until you see the boundary (like \n or whatever),
then allocate exactly as much as you need, then go back in the file (see
the 'seek' family of functions) and read it again, but into the newly
allocated buffer. Then profile to find out what exactly is not right. Of
course, having a 1 GB memory footprint is pushing it. Perhaps you can
reorganize your algorithm to use less?... Hard to say without knowing
what you do and why.
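
A rough sketch of that two-pass idea, simplified here to size the
whole file with seekg/tellg rather than counting to a line boundary
(assumptions only; no error handling):

#include <fstream>
#include <vector>

// Pass 1: find out how big the file is. Pass 2: allocate exactly
// that much, once, and read into it. Sketch only; no error handling.
std::vector<char> read_exact(const char* path)
{
    std::ifstream in(path, std::ios::binary);

    in.seekg(0, std::ios::end);           // 'seek' family: jump to the end
    std::streamsize size = in.tellg();    // how many bytes we need
    in.seekg(0, std::ios::beg);           // go back to the start

    std::vector<char> buffer(size);       // allocate exactly once
    in.read(&buffer[0], size);            // fill the buffer
    return buffer;
}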

V
 
Juha Nieminen

JQ said:
It seems that allocating memory is slow. Do you have any suggestions?

If you know the size of the file (or line) which you are going to
read, you can allocate one single chunk of memory to hold it and read
the entire file with a single std::fread(). The single memory
allocation will be fast, as will the single std::fread().
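
A minimal sketch of that, assuming the size is already known (e.g.
from stat()); error handling omitted:

#include <cstdio>
#include <vector>

// One allocation, one std::fread(). Sketch only; no error handling.
std::vector<char> slurp(const char* path, std::size_t size)
{
    std::vector<char> buffer(size);       // the single allocation
    std::FILE* f = std::fopen(path, "rb");
    std::fread(&buffer[0], 1, size, f);   // the single read
    std::fclose(f);
    return buffer;
}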

Of course, if the file is way too large to be read at once into memory,
and you have to read it one line at a time, and you have no way of
knowing in advance the length of the lines, then you'll just have to use
std::getline() with a std::string at the outer scope (so that it will
not be allocated and destroyed with each line, but reused for each line
read), and hope that your compiler has an efficient implementation of
std::string (i.e. similar to std::vector).

In other words, something like this:

std::string line;
while(true)
{
    std::getline(instream, line);
    if(!instream) break;
    // parse line here
}
 
Rune Allnor

   I have a programming question and hope to get your suggestions. I
have a C++ program that reads files. First, I allocate a large piece
of memory. Then, after reading each line, I fill the data from that
line into the memory.

   The whole program may use up to 1 GB of memory. However, the
program is very slow while reading the file and filling in the data.
The machine has 4 GB of memory. It runs CentOS, and the program is
written in C++ and compiled with g++.

   I tried reading the file only, which is very fast.

   It seems that allocating memory is slow. Do you have any suggestions?

What kind of data is this? Numeric data in ASCII format?
If so, expect 40-50 seconds per 100 MB of data to convert
from ASCII to binary format.

Rune
 
Jerry Coffin

[ ... ]
If you know the size of the file (or line) which you are going to
read, you can allocate one single chunk of memory to hold it and read
the entire file with a single std::fread(). The single memory
allocation will be fast, as will the single std::fread().

Of course, if the file is way too large to be read at once into memory,
and you have to read it one line at a time, and you have no way of
knowing in advance the length of the lines, then you'll just have to use
std::getline() with a std::string at the outer scope (so that it will
not be allocated and destroyed with each line, but reused for each line
read), and hope that your compiler has an efficient implementation of
std::string (i.e. similar to std::vector).

There is a possibility of a middle ground. If, for example, you're
scanning through the file linearly, you can quite reasonably read a
block that's (normally) much larger than a single line, but still
considerably smaller than the entire file. For example, as I recall
the OP said the file was around a gigabyte -- rather on the large
side for a single memory allocation (not so huge that it's completely
unreasonable, but might still be larger than you'd really want to use
all at once).

In such a case I'd consider reading a block of something like a
megabyte at a time, and parsing that into lines for processing. Once
you've processed all but the last partial line, you read another
block. Typically you either copy that final partial line to the
beginning of the buffer, and start the next read after it, or else
you alternate between using two blocks of memory, with the
"beginning" of a line starting at the end of one block, and extending
into the beginning of the other block. Note, however, that this can
fail completely if the input contains a "line" that's more than two
blocks long (not common with normal text files, but quite possible
with machine generated "text", especially if the generator contains a
bug or is written with malice aforethought).
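
A rough sketch of the first variant (copy the partial line to the
front of the buffer); the block size and the process_line callback are
illustrative assumptions, and error handling is omitted:

#include <cstdio>
#include <cstring>
#include <vector>

// Read the file a block at a time, hand complete lines to
// process_line, and carry the trailing partial line over to the
// front of the buffer for the next read.
void read_in_blocks(const char* path,
                    void (*process_line)(const char*, std::size_t))
{
    const std::size_t BLOCK = 1 << 20;    // roughly one megabyte
    std::vector<char> buf(BLOCK);
    std::size_t carried = 0;              // bytes of the unfinished last line

    std::FILE* f = std::fopen(path, "rb");
    std::size_t got;
    while ((got = std::fread(&buf[carried], 1, BLOCK - carried, f)) > 0) {
        std::size_t end = carried + got;
        std::size_t line_start = 0;
        for (std::size_t i = 0; i < end; ++i) {
            if (buf[i] == '\n') {
                process_line(&buf[line_start], i - line_start);
                line_start = i + 1;
            }
        }
        carried = end - line_start;       // partial line goes to the front
        std::memmove(&buf[0], &buf[line_start], carried);
        // Note: a "line" longer than BLOCK bytes defeats this scheme,
        // which is exactly the failure mode described above.
    }
    if (carried > 0)                      // last line with no trailing '\n'
        process_line(&buf[0], carried);
    std::fclose(f);
}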
 
JQ

First, thank you all for the suggestions.

Actually, from a preprocessing program, I know the size of the file
and how much memory is needed.

Step 1: I read a small configuration file and allocate the memory by
calling new.
Step 2: I read the data file (the file I mentioned in the initial
post), process the data and write it to the memory. The total memory
is around 800 MB since it is a large network.

I tried reading the file only, doing simple operations and not writing
the data into memory. That is fast. So it seems the memory allocation
is slow?

Thanks,

JQ
 
Ismo Salonen

JQ said:
First, thank you all for the suggestions.

Actually, from a preprocessing program, I know the size of the file
and how much memory is needed.

Step 1: I read a small configuration file and allocate the memory by
calling new.
Step 2: I read the data file (the file I mentioned in the initial
post), process the data and write it to the memory. The total memory
is around 800 MB since it is a large network.

I tried reading the file only, doing simple operations and not writing
the data into memory. That is fast. So it seems the memory allocation
is slow?

Thanks,

JQ

The memory allocation itself is quite fast, but your usage pattern
probably sucks. Show us some code. Doing realloc many times is not the
smartest thing to do; it has to move memory around quite often.

ismo
 
Fred Zwarts

JQ said:
I have a programming question and hope to get your suggestions. I
have a C++ program that reads files. First, I allocate a large piece
of memory. Then, after reading each line, I fill the data from that
line into the memory.

The whole program may use up to 1 GB of memory. However, the program
is very slow while reading the file and filling in the data. The
machine has 4 GB of memory. It runs CentOS, and the program is
written in C++ and compiled with g++.

I tried reading the file only, which is very fast.

It seems that allocating memory is slow. Do you have any suggestions?

Maybe don't copy the file to memory. Most operating systems today,
including Linux, already cache the file data in memory. Why duplicate it?
 
James Kanze

[ ... ]
It seems that allocating memory is slow. Do you have any
suggestions?
The usual suggestion for Linux (or anything very similar)
would be to memory map the file.

In this case, anything very similar would include Windows, no?
Windows also supports memory mapping (although obviously with a
different interface---can't make things too simple for the
programmer).

Alternatively, using a system level read to put the data
directly into the memory probably isn't significantly slower
either.

But he didn't really describe what he was doing in detail. If
his only allocation is one large piece of memory up front, it's
not the allocation of this memory which is costing the time.
Similarly, if he's using something like:

ptr = &buffer[ 0 ] ;
std::string line ;
while ( std::getline( source, line ) ) {
    ptr = std::copy( line.begin(), line.end(), ptr ) ;
    *ptr ++ = '\n' ;
}

(with a bit more error handling, of course), then in a good
implementation, line should soon have the capacity for the
longest line, and there should be no more allocations, either.
(Regrettably, the standard doesn't guarantee this, but most
implementations use the same memory management strategy for
std::string as for std::vector---of the four library
implementations I have at hand, only Sun CC's will ever reduce
the capacity in something like the above.)

On the other hand, any time he's using a buffer of 1 GB, he's
risking page faults, and the copying in the above will be
expensive. On my system, with a file slightly over 400 MB, and
doing an xor over the data after the read (so that I was sure
that everything would be paged in with mmap), using mmap took
about half a second, using a system read directly into the buffer took
one second, and using something like the above (with the xor on
the buffer after) took a little over 2 seconds. (That's elapsed
time, so it's not very precise---there are other things going on
on the system. Also, I'm pretty sure that by the time I got to
measuring, the entire file was already in system cache---the
system has 3 GB of main memory.) FWIW, the simplest utility (wc)
takes almost five seconds on the file, so we can probably
conclude that if there is any serious processing of the data
after it has been read, the differences in time will likely be
negligible.
 
James Kanze

First, thank you all for the suggestions.
Actually, from a preprocessing program, I know the size of the
file and how much memory is needed.
Step 1: I read a small configuration file and allocate the memory by
calling new.
Step 2: I read the data file (the file I mentioned in the initial
post), process the data and write it to the memory. The total memory
is around 800 MB since it is a large network.
I tried reading the file only, doing simple operations and not
writing the data into memory. That is fast. So it seems the memory
allocation is slow?

If I understand you correctly, you're only doing one memory
allocation, which should be close to negligible. Could you post
the code for the loop where you're reading the data and copying
it into the buffer---even with the most naïve implementation,
using std::getline into an std::string, then std::copy into the
buffer, I get reasonable performance on my system.

FWIW: allocating the buffer and reading the entire file into it
in one low-level system request (read under Unix, ReadFile under
Windows) should be very fast, and is fairly simple to do. Under
Unix, mmap isn't that much more complex, and is a little bit
faster; IIRC, memory mapping under Windows is a little bit more
complicated (but that might only be because I'm less familiar
with Windows).
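
For reference, the single low-level read under Unix might look roughly
like this (sketch only; a robust version would check return values and
loop if read() returns less than requested):

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <vector>

// Allocate the buffer once, then fill it with one read() system call.
std::vector<char> read_whole_file(const char* path)
{
    int fd = open(path, O_RDONLY);
    struct stat sb;
    fstat(fd, &sb);

    std::vector<char> buffer(sb.st_size);  // one allocation
    read(fd, &buffer[0], sb.st_size);      // one system call
    close(fd);
    return buffer;
}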
 
Jerry Coffin

On Aug 25, 8:51 pm, Jerry Coffin <[email protected]> wrote:

[ ... ]
In this case, anything very similar would include Windows, no?

It certainly could, but he specifically mentioned a Linux variant, so
that's all I tried to address.
Windows also supports memory mapping (although obviously with a
different interface---can't make things too simple for the
programmer).

Alternatively, using a system level read to put the data
directly into the memory probably isn't significantly slower
either.

If he has the physical memory to support holding it all in physical
memory at once, yes. If he doesn't (and for a 1 GiB file, he might
easily not) we're looking at exactly why memory mapped files were
invented in the first place: a system level read ends up doing little
more than copying the data from his original file into the paging
file, creating descriptors along the way to allow reading pages back
in when needed. Under these circumstances, a memory mapped file will
generally be a LOT faster, especially during that startup phase,
because it avoids reading and writing the entire 1 GiB file from and
back out to disk before it even gets started on doing the real work.
But he didn't really describe what he was doing in detail. If
his only allocation is one large piece of memory up front, it's
not the allocation of this memory which is costing the time.

That depends -- if he has the physical memory, but it's nearly all in
use (and dirty), allocating it for other purposes could involve
flushing 1 GiB out of physical memory, which really will be slow
(usually, anyway).

Of course, I'm left with a few questions, like what he's doing that
requires holding an entire ~1 GiB file in memory at once anyway.
There can be legitimate reasons to do that, but I don't think it's a
given that it's the most efficient way to accomplish what he wants.

In the end, I agree that there's something fishy though: unless he's
running into (something bordering on) disk thrashing, allocating a
block of memory is normally a lot faster than reading that amount of
data from disk into the memory.
 
Jerry Coffin

[ ... ]
IIRC, memory mapping under Windows is a little bit more
complicated (but that might only be because I'm less familiar
with Windows).

It is marginally more complex under Windows. POSIX memory mapping is
a two-stage process: you open the file, then map that file to memory.
Under Win32, it's a three-stage process: you open the file, you
create the memory mapping, then you map a view of the file. The third
stage under Windows allows you to map only part of a file to memory
at once, if you don't care about the whole thing. This allows (for
one example) memory mapping (part of) a file that's larger than the
available address space.
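
A minimal sketch of those three stages, mapping an entire file
read-only (no error handling, and closing the handles is left out):

#include <windows.h>

// Stage 1: open the file. Stage 2: create the mapping object.
// Stage 3: map a view of it. Passing 0 for the sizes maps the whole
// file; the offset and length arguments of MapViewOfFile are what
// allow mapping only part of it.
const char* map_whole_file(const char* path)
{
    HANDLE file = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ,
                              NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL,
                              NULL);
    HANDLE mapping = CreateFileMappingA(file, NULL, PAGE_READONLY,
                                        0, 0, NULL);
    return static_cast<const char*>(
        MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0));
}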
 
James Kanze

[ ... ]
IIRC, memory mapping under Windows is a little bit more
complicated (but that might only be because I'm less familiar
with Windows).
It is marginally more complex under Windows. POSIX memory
mapping is a two-stage process: you open the file, then map
that file to memory. Under Win32, it's a three-stage process:
you open the file, you create the memory mapping, then you map
a view of the file. The third stage under Windows allows you
to map only part of a file to memory at once, if you don't
care about the whole thing. This allows (for one example)
memory mapping (part of) a file that's larger than the
available address space.

Just a nit (and totally off topic), but the Unix mmap also
allows mapping part of a file. The mmap function merges the
last two steps in Windows---it has an offset and a length
argument. (But the added complexity may have been largely in my
mind. It took me about ten minutes to write the Unix version,
and easily two hours for the Windows version. But most of that
two hours was finding my way around in a documentation I'm not
familiar with---I didn't even know the name of the Windows
function I'd need when I started.)
 
Jorgen Grahn

I have a programming question and hope to get your suggestions. I
have a C++ program that reads files. First, I allocate a large piece
of memory. Then, after reading each line, I fill the data from that
line into the memory.

The whole program may use up to 1 GB of memory. However, the program
is very slow while reading the file and filling in the data. The
machine has 4 GB of memory. It runs CentOS, and the program is
written in C++ and compiled with g++.

I tried reading the file only, which is very fast.

It seems that allocating memory is slow. Do you have any suggestions?

Yes: learn to use a profiler (GNU gprof, in your case). It is unlikely
to be the 'new char[1e9]' which takes all the time.

/Jorgen
 
