[gfortran] Patch for potential bug in i/o library

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

[gfortran] Patch for potential bug in i/o library

Janne Blomqvist
Hello,

I spent some time this weekend wondering about the performance of the
i/o library (more on that later), and I think I found a simple bug in
libgfortran/io/unix.c. Namely it seems to assume that all non-mmapped
files are not seekable. This might or might not be true. Anyway, we
have a convenient function for figuring this out, so why not use it?

ChangeLog:

        * unix.c (stream_at_bof): Don't assume that all non-mmapped files
        are non-seekable. (stream_at_eof): Likewise.

The patch itself is attached.


About the performance issues, in some cases gfortran (and g77 for that
matter) seem to be clobbered pretty badly (10X) by ifort (and
presumably some other commercial compilers as well). See e.g. PR
16339. I figured that it might be because gfortran uses mmap and ifort
doesn't, and also because ifort uses rather big writes in 256 kB
chunks whereas gfortran uses 8 kB. I tried changing libgfortran to not
use mmap, and the execution time of the test in PR 16339 dropped, but
not by much (from ~1.6 s to ~1.4 s). Also, increasing the buffer size
didn't change things much.

Looking at the execution profile using gprof it seems that the code
spends a large amount of time in the *_alloc_w_at functions, in total
about 45 %. In fact, alloc_w_at is called for every one of the 10e6
elements in the array. So, regardless of the relative merits of mmap
vs. (f)read/write, I guess the key to improving i/o performance would
be to make the library interface more coarse grained, e.g. enabling
transfer of contigous array sections in one go instead of looping over
all the individual elements?

--
Janne Blomqvist
Reply | Threaded
Open this post in threaded view
|

Re: [gfortran] Patch for potential bug in i/o library

Thomas Koenig
Janne Blomqvist wrote:

> I guess the key to improving i/o performance would
> be to make the library interface more coarse grained, e.g. enabling
> transfer of contigous array sections in one go instead of looping over
> all the individual elements?

Definitely.  For arrays, it would be straightforward to pass an
array descriptor to the i/o library, and have it handled there.
Lots of m4 to be written :-)  Implied do loops will be harder,
but not as important.  If these can be converted to an
array descriptor (sort of, using temporaries), that would be ideal.
Reply | Threaded
Open this post in threaded view
|

Re: [gfortran] Patch for potential bug in i/o library

bud davis
Thomas Koenig wrote:

>Janne Blomqvist wrote:
>
>  
>
>>I guess the key to improving i/o performance would
>>be to make the library interface more coarse grained, e.g. enabling
>>transfer of contigous array sections in one go instead of looping over
>>all the individual elements?
>>    
>>
>
>Definitely.  For arrays, it would be straightforward to pass an
>array descriptor to the i/o library, and have it handled there.
>Lots of m4 to be written :-)  Implied do loops will be harder,
>but not as important.  If these can be converted to an
>array descriptor (sort of, using temporaries), that would be ideal.
>
>  
>
if someone grabs this idea and implement it, please also keep in mind
the need to know how many bytes will go into an unformatted sequential
record before it is written....

the seeking back into the file to write the start marker will always be
a big performance loss (esp for NFS mounted files).


--bud

Reply | Threaded
Open this post in threaded view
|

Re: [gfortran] Patch for potential bug in i/o library

bud davis
In reply to this post by Janne Blomqvist
Janne Blomqvist wrote:

>Hello,
>
>I spent some time this weekend wondering about the performance of the
>i/o library (more on that later), and I think I found a simple bug in
>libgfortran/io/unix.c. Namely it seems to assume that all non-mmapped
>files are not seekable. This might or might not be true. Anyway, we
>have a convenient function for figuring this out, so why not use it?
>
>ChangeLog:
>
> * unix.c (stream_at_bof): Don't assume that all non-mmapped files
> are non-seekable. (stream_at_eof): Likewise.
>
>The patch itself is attached.
>
>
>About the performance issues, in some cases gfortran (and g77 for that
>matter) seem to be clobbered pretty badly (10X) by ifort (and
>presumably some other commercial compilers as well). See e.g. PR
>16339. I figured that it might be because gfortran uses mmap and ifort
>doesn't, and also because ifort uses rather big writes in 256 kB
>chunks whereas gfortran uses 8 kB. I tried changing libgfortran to not
>use mmap, and the execution time of the test in PR 16339 dropped, but
>not by much (from ~1.6 s to ~1.4 s). Also, increasing the buffer size
>didn't change things much.
>
>Looking at the execution profile using gprof it seems that the code
>spends a large amount of time in the *_alloc_w_at functions, in total
>about 45 %. In fact, alloc_w_at is called for every one of the 10e6
>elements in the array. So, regardless of the relative merits of mmap
>vs. (f)read/write, I guess the key to improving i/o performance would
>be to make the library interface more coarse grained, e.g. enabling
>transfer of contigous array sections in one go instead of looping over
>all the individual elements?
>
>  
>
you are onto the big performance problem in the I/O library today.  
until we deal with arrays as contiguos blocks of memory the performance
will be dismal.

we get that fixed, then the buffer size and mmap yes / no will become
significant......


--bud

 
Reply | Threaded
Open this post in threaded view
|

Re: [gfortran] Patch for potential bug in i/o library

Janne Blomqvist
In reply to this post by Janne Blomqvist
On Mon, Jun 06, 2005 at 12:15:28AM +0300, Janne Blomqvist wrote:
> The patch itself is attached.

As a couple of perceptive people noticed, the patch itself actually
wasn't attached. Thanks for pointing it out. Here the patch really is.


--
Janne Blomqvist

seekable.patch (850 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [gfortran] Patch for potential bug in i/o library

Janne Blomqvist
In reply to this post by bud davis
[Dropped gcc-patches]

On Sun, Jun 05, 2005 at 08:50:33PM -0500, bud davis wrote:

> Thomas Koenig wrote:
>
> >Janne Blomqvist wrote:
> >
> >
> >
> >>I guess the key to improving i/o performance would
> >>be to make the library interface more coarse grained, e.g. enabling
> >>transfer of contigous array sections in one go instead of looping over
> >>all the individual elements?
> >>  
> >>
> >
> >Definitely.  For arrays, it would be straightforward to pass an
> >array descriptor to the i/o library, and have it handled there.
> >Lots of m4 to be written :-)  Implied do loops will be harder,
> >but not as important.  If these can be converted to an
> >array descriptor (sort of, using temporaries), that would be ideal.
> >
> >
> >
> if someone grabs this idea and implement it, please also keep in mind
> the need to know how many bytes will go into an unformatted sequential
> record before it is written....
>
> the seeking back into the file to write the start marker will always be
> a big performance loss (esp for NFS mounted files).

Ah yes, that's a big one. Even without implementing any of the
"transfer arrays in one go" stuff I think this could help a lot even
in the case of one huge record as in PR 16339 since if the length of
the transfer would be known before any actual transfer takes place,
one could presumably do one big alloc_* via st_read/write for the
entire record instead of allocating separately for each element. Would
just have to make sure that big allocs and buffering play well
together, as currently an alloc bigger than the buffer size never
really occurs.

And additionally, once Buds "pluggable record markers" patch is
included, knowing the record size beforehand would allow an efficient
implementation of hp record markers.

Yet additionally, the IOLENGTH form of the INQUIRE statement could be
simplified quite a bit (perhaps it wouldn't even need to enter the
library at all if the record size computation would be done at the
compiler side).


--
Janne Blomqvist
Reply | Threaded
Open this post in threaded view
|

Re: [gfortran] Patch for potential bug in i/o library

Toon Moene
In reply to this post by Janne Blomqvist
Thomas Koenig wrote:

> Definitely.  For arrays, it would be straightforward to pass an
> array descriptor to the i/o library, and have it handled there.
> Lots of m4 to be written :-)  Implied do loops will be harder,
> but not as important.

Uh, oh.  I can recall a discussion at a previous employer (some 15 years
ago), where we (the support people of the University Supercomputer
Facility) had to explain to the user that, yes, indeed, despite the
prowess of the TLG (Three Letter Giant) that offered the hard-/software
for our supercomputing business, that whole array I/O was much, much
more efficient than exactly the same I/O written as an implied do loop.

Be forewarned.

--
Toon Moene - e-mail: [hidden email] - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
A maintainer of GNU Fortran 95: http://gcc.gnu.org/fortran/
Looking for a job: Work from home or at a customer site; HPC, (GNU)
Fortran & C
Reply | Threaded
Open this post in threaded view
|

Re: [gfortran] Patch for potential bug in i/o library

Tobias Schlüter
In reply to this post by Janne Blomqvist
Janne Blomqvist wrote:
> * unix.c (stream_at_bof): Don't assume that all non-mmapped files
> are non-seekable. (stream_at_eof): Likewise.

This is ok.

- Tobi
Reply | Threaded
Open this post in threaded view
|

Re: [gfortran] Patch for potential bug in i/o library

Janne Blomqvist
On Sat, Jun 11, 2005 at 02:51:09PM +0200, Tobias Schl?ter wrote:
> Janne Blomqvist wrote:
> > * unix.c (stream_at_bof): Don't assume that all non-mmapped files
> > are non-seekable. (stream_at_eof): Likewise.
>
> This is ok.

Thanks.

I don't have commit privileges, so could you or someone commit it?

--
Janne Blomqvist
Reply | Threaded
Open this post in threaded view
|

Re: [gfortran] Patch for potential bug in i/o library

Janne Blomqvist
PING

On Sat, Jun 11, 2005 at 07:01:58PM +0300, Janne Blomqvist wrote:

> On Sat, Jun 11, 2005 at 02:51:09PM +0200, Tobias Schl?ter wrote:
> > Janne Blomqvist wrote:
> > > * unix.c (stream_at_bof): Don't assume that all non-mmapped files
> > > are non-seekable. (stream_at_eof): Likewise.
> >
> > This is ok.
>
> Thanks.
>
> I don't have commit privileges, so could you or someone commit it?
>
> --
> Janne Blomqvist

--
Janne Blomqvist
Reply | Threaded
Open this post in threaded view
|

Re: [gfortran] Patch for potential bug in i/o library

FX Coudert
>>>> * unix.c (stream_at_bof): Don't assume that all non-mmapped files
>>>> are non-seekable. (stream_at_eof): Likewise.
>>
>>I don't have commit privileges, so could you or someone commit it?

Doing it now.

FX