[patch, libfortran] Adjust block size for libgfortran for unformatted reads

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

[patch, libfortran] Adjust block size for libgfortran for unformatted reads

Thomas Koenig-6
Hello world,

the attached patch sets the I/O block size for unformatted files to
2**17 and makes this, and the block size for formatted files,
adjustable via environment variables.

The main reason is that -fconvert=big-endian was quite slow on
some HPC systems. A bigger buffer should eliminate that.  Also,
People who use unformatted files are likely to write large amounts
of data, so this seems like a good fit.  Finally, some benchmarking
showed that 131072 seemed like a good value to use. Thanks to Jerry
for support.

I didn't change the value for formatted files because, frankly, we are
using a lot of CPU for converting numbers there, so any gain
negligible (unless somebody comes up with a benchmark which says
otherwise).

As this is a change in behavior / new feature, I don't think that
backporting is indicated, but if somebody feels otherwise, please
speak up.

Regression-tested. OK for trunk?

Regards

        Thomas

2019-07-07  Thomas König  <[hidden email]>

        PR libfortran/91030
        * gfortran.texi (GFORTRAN_BUFFER_SIZE_FORMATTED): Document
        (GFORTRAN_BUFFER_SIZE_FORMATTED): Likewise.

2019-07-07  Thomas König  <[hidden email]>

        PR libfortran/91030
        * io/unix.c (BUFFER_SIZE): Delete.
        (BUFFER_SIZE_FORMATTED_DEFAULT): New variable.
        (BUFFER_SIZE_UNFORMATTED_DEFAULT): New variable.
        (unix_stream): Add buffer_size.
        (buf_read): Use s->buffer_size instead of BUFFER_SIZE.
        (buf_write): Likewise.
        (buf_init): Add argument unformatted.  Handle block sizes
        for unformatted vs. formatted, using defaults if provided.
        (fd_to_stream): Add argument unformatted in call to buf_init.
        * libgfortran.h (options_t): Add buffer_size_formatted and
        buffer_size_unformatted.
        * runtime/environ.c (variable_table): Add
        GFORTRAN_BUFFER_SIZE_UNFORMATTED and GFORTRAN_BUFFER_SIZE_FORMATTED.


p2.diff (6K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads

Manfred Schwarb
Am 07.07.19 um 22:13 schrieb Thomas Koenig:

> Hello world,
>
> the attached patch sets the I/O block size for unformatted files to
> 2**17 and makes this, and the block size for formatted files,
> adjustable via environment variables.
>
> The main reason is that -fconvert=big-endian was quite slow on
> some HPC systems. A bigger buffer should eliminate that.  Also,
> People who use unformatted files are likely to write large amounts
> of data, so this seems like a good fit.  Finally, some benchmarking
> showed that 131072 seemed like a good value to use. Thanks to Jerry
> for support.
>
> I didn't change the value for formatted files because, frankly, we are
> using a lot of CPU for converting numbers there, so any gain
> negligible (unless somebody comes up with a benchmark which says
> otherwise).

formatted write: Did you try writing to an USB stick or similar? I guess
for flash based devices anything below 64k will slow down operation.
Computers like Raspberry Pi and the like often have flash based storage,
and it is not extremely unrealistic to run fortran programs on them.


Cheers.
Manfred

>
> As this is a change in behavior / new feature, I don't think that
> backporting is indicated, but if somebody feels otherwise, please
> speak up.
>
> Regression-tested. OK for trunk?
>
> Regards
>
>     Thomas
>
> 2019-07-07  Thomas König  <[hidden email]>
>
>     PR libfortran/91030
>     * gfortran.texi (GFORTRAN_BUFFER_SIZE_FORMATTED): Document
>     (GFORTRAN_BUFFER_SIZE_FORMATTED): Likewise.
>
> 2019-07-07  Thomas König  <[hidden email]>
>
>     PR libfortran/91030
>     * io/unix.c (BUFFER_SIZE): Delete.
>     (BUFFER_SIZE_FORMATTED_DEFAULT): New variable.
>     (BUFFER_SIZE_UNFORMATTED_DEFAULT): New variable.
>     (unix_stream): Add buffer_size.
>     (buf_read): Use s->buffer_size instead of BUFFER_SIZE.
>     (buf_write): Likewise.
>     (buf_init): Add argument unformatted.  Handle block sizes
>     for unformatted vs. formatted, using defaults if provided.
>     (fd_to_stream): Add argument unformatted in call to buf_init.
>     * libgfortran.h (options_t): Add buffer_size_formatted and
>     buffer_size_unformatted.
>     * runtime/environ.c (variable_table): Add
>     GFORTRAN_BUFFER_SIZE_UNFORMATTED and GFORTRAN_BUFFER_SIZE_FORMATTED.
>

Reply | Threaded
Open this post in threaded view
|

Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads

Janne Blomqvist-3
On Mon, Jul 8, 2019 at 11:18 AM Manfred Schwarb <[hidden email]> wrote:

>
> Am 07.07.19 um 22:13 schrieb Thomas Koenig:
> > Hello world,
> >
> > the attached patch sets the I/O block size for unformatted files to
> > 2**17 and makes this, and the block size for formatted files,
> > adjustable via environment variables.
> >
> > The main reason is that -fconvert=big-endian was quite slow on
> > some HPC systems. A bigger buffer should eliminate that.  Also,
> > People who use unformatted files are likely to write large amounts
> > of data, so this seems like a good fit.  Finally, some benchmarking
> > showed that 131072 seemed like a good value to use. Thanks to Jerry
> > for support.
> >
> > I didn't change the value for formatted files because, frankly, we are
> > using a lot of CPU for converting numbers there, so any gain
> > negligible (unless somebody comes up with a benchmark which says
> > otherwise).
>
> formatted write: Did you try writing to an USB stick or similar? I guess
> for flash based devices anything below 64k will slow down operation.
> Computers like Raspberry Pi and the like often have flash based storage,
> and it is not extremely unrealistic to run fortran programs on them.

Good point. If you happen to have a USB stick handy, can you try the
simple C benchmark program at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030#c38 ?

(the kernel will coalesce IO's by itself, so the granularity of IO
syscalls is not necessarily the same as the actual IO to devices.
Network filesystems like NFS/Lustre/GPFS may have less latitude here
due to coherency requirements etc.)


--
Janne Blomqvist
Reply | Threaded
Open this post in threaded view
|

Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads

Janne Blomqvist-3
In reply to this post by Thomas Koenig-6
On Sun, Jul 7, 2019 at 11:13 PM Thomas Koenig <[hidden email]> wrote:
>
> Hello world,
>
> the attached patch sets the I/O block size for unformatted files to
> 2**17 and makes this, and the block size for formatted files,
> adjustable via environment variables.

Should the default be affected by the page size
(sysconf(_SC_PAGESIZE)) and/or the IO blocksize (we already fstat() a
file when we open it, so we could get st_blksize for free)?

Though one could of course argue those aren't particularly useful;
except for power, everything uses a 4k page size (?), and the IO
blocksize varies from too small (4k, ext4) to too large (1 MB, NFS (or
whatever rsize/wsize is set to), or 4 MB for Lustre).

> 2019-07-07  Thomas König  <[hidden email]>
>
>         PR libfortran/91030
>         * gfortran.texi (GFORTRAN_BUFFER_SIZE_FORMATTED): Document
>         (GFORTRAN_BUFFER_SIZE_FORMATTED): Likewise.

One of these should be _UNFORMATTED?

--
Janne Blomqvist
Reply | Threaded
Open this post in threaded view
|

Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads

Steve Kargl
In reply to this post by Janne Blomqvist-3
On Mon, Jul 08, 2019 at 04:02:17PM +0300, Janne Blomqvist wrote:

>
> Good point. If you happen to have a USB stick handy, can you try the
> simple C benchmark program at
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030#c38 ?
>
> (the kernel will coalesce IO's by itself, so the granularity of IO
> syscalls is not necessarily the same as the actual IO to devices.
> Network filesystems like NFS/Lustre/GPFS may have less latitude here
> due to coherency requirements etc.)
>

Writing to USB is constrained be the speed of the bus.

kernel: ugen6.3: <Kingston DataTraveler 3.0> at usbus6
kernel: umass1 on uhub0
kernel: umass1: <Kingston DataTraveler 3.0, class 0/0, rev 2.10/0.01, addr 3> on usbus6
kernel: da1 at umass-sim1 bus 1 scbus5 target 0 lun 0
kernel: da1: <Kingston DataTraveler 3.0 > Removable Direct Access SPC-4 SCSI device
kernel: da1: Serial Number 0C9D927F0A77F330A89D1210
kernel: da1: 40.000MB/s transfers
kernel: da1: 29510MB (60437492 512 byte sectors)
kernel: da1: quirks=0x2<NO_6_BYTE>

Mounting the thumb drive as a MSDOSFS on FreeBSD.

Test using 100000000 bytes
Block size of file system: 16384
bs =       1024, 8.86 MiB/s
bs =       2048, 7.52 MiB/s
bs =       4096, 7.32 MiB/s
bs =       8192, 7.32 MiB/s
bs =      16384, 7.40 MiB/s
bs =      32768, 7.40 MiB/s
bs =      65536, 5.57 MiB/s
bs =     131072, 7.43 MiB/s
bs =     262144, 5.55 MiB/s
bs =     524288, 7.43 MiB/s
bs =    1048576, 5.56 MiB/s
bs =    2097152, 7.39 MiB/s
bs =    4194304, 7.62 MiB/s
bs =    8388608, 5.58 MiB/s
bs =   16777216, 7.42 MiB/s
bs =   33554432, 5.59 MiB/s
bs =   67108864, 4.77 MiB/s

For the same binary, writing to a UFS2 on a SATAII SSD

Test using 100000000 bytes
Block size of file system: 32768
bs =       1024, 123.09 MiB/s
bs =       2048, 210.30 MiB/s
bs =       4096, 184.75 MiB/s
bs =       8192, 237.28 MiB/s
bs =      16384, 244.42 MiB/s
bs =      32768, 243.20 MiB/s
bs =      65536, 253.77 MiB/s
bs =     131072, 243.32 MiB/s
bs =     262144, 238.81 MiB/s
bs =     524288, 243.87 MiB/s
bs =    1048576, 246.55 MiB/s
bs =    2097152, 242.39 MiB/s
bs =    4194304, 243.68 MiB/s
bs =    8388608, 243.88 MiB/s
bs =   16777216, 242.21 MiB/s
bs =   33554432, 253.19 MiB/s
bs =   67108864, 178.29 MiB/s

--
Steve
Reply | Threaded
Open this post in threaded view
|

Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads

Thomas Koenig-6
... of course, better with the actual patch.


p3.diff (6K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads

Steve Kargl
In reply to this post by Steve Kargl
On Sun, Jul 14, 2019 at 12:07:58PM +0200, Thomas Koenig wrote:

> OK, so here is a new version.
>
> I think the discussion has shown that enlaring the buffer makes sense,
> and that the buffer size for unformatted seems to be too bad.
>
> I've reversed the names of the environment variables according to
> Behnard's suggestion.
>
> So, OK for trunk?
>
> Also, what should we do about gcc-9?  I have now come to think
> that we should add the environment variables to set the buffer lengths,
> but leave the old default (8192).
>
> What do you think?
>

If you are inclined to back port a portion of the patch to 9-branch,
then bumping up the old default would seem to be the most important
part.  As dje noted, users seem to have an aversion to reading the
documentation, so finding the environment variables may not happen.

Isn't 8192 an internal implementation detail for libgfortran?  Can
bumping it to larger value in 9-branch cause an issue for a normal
user?

--
Steve
Reply | Threaded
Open this post in threaded view
|

Re: [patch, libfortran] Adjust block size for libgfortran for unformatted reads

Thomas Koenig-6
Hi Steve,

> On Sun, Jul 14, 2019 at 12:07:58PM +0200, Thomas Koenig wrote:
>> OK, so here is a new version.
>>
>> I think the discussion has shown that enlaring the buffer makes sense,
>> and that the buffer size for unformatted seems to be too bad.
>>
>> I've reversed the names of the environment variables according to
>> Behnard's suggestion.
>>
>> So, OK for trunk?
>>
>> Also, what should we do about gcc-9?  I have now come to think
>> that we should add the environment variables to set the buffer lengths,
>> but leave the old default (8192).
>>
>> What do you think?
>>
>
> If you are inclined to back port a portion of the patch to 9-branch,
> then bumping up the old default would seem to be the most important
> part.  As dje noted, users seem to have an aversion to reading the
> documentation, so finding the environment variables may not happen.
>
> Isn't 8192 an internal implementation detail for libgfortran?  Can
> bumping it to larger value in 9-branch cause an issue for a normal
> user?

Well, it allocates a bigger memory block, that's all.

Upon reconsideration, I think your point about people not reading
the docs is valid :-|

So, I will commit the patch to trunk over the weekend and to 9.2
a few days afterwards, unless somebody objects.

Regards

        Thomas