Indirect memory addresses vs. lra

John Darrington

I'm trying to write a back-end for an architecture (s12z - the ISA you can
download from [1]).  This arch accepts indirect memory addresses.   That is to
say, those of the form (mem (mem (...)))  and although my TARGET_LEGITIMATE_ADDRESS
function returns true for such addresses, LRA insists on reloading them out of
existence.

For example, when compiling a code fragment:

  volatile unsigned char *led = 0x2F2;
  *led = 1;

the ira dump file shows:

(insn 7 6 8 2 (set (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8])
        (const_int 754 [0x2f2])) "/home/jmd/MemMem/memmem.c":15:27 96 {movpsi}
     (nil))
(insn 8 7 14 2 (set (mem/v:QI (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8]) [0 *led_7+0 S1 A8])
        (const_int 1 [0x1])) "/home/jmd/MemMem/memmem.c":16:8 98 {movqi}
     (nil))

which is a perfectly valid insn, and the most efficient assembler for it is:
mov.p #0x2f2, y
mov.b #1, [0,y]

However the reload dump shows this has been changed to:

(insn 7 6 22 2 (set (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8])
        (const_int 754 [0x2f2])) "/home/jmd/MemMem/memmem.c":15:27 96 {movpsi}
     (nil))
(insn 22 7 8 2 (set (reg:PSI 8 x [22])
        (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8])) "/home/jmd/MemMem/memmem.c":16:8 96 {movpsi}
     (nil))
(insn 8 22 14 2 (set (mem/v:QI (reg:PSI 8 x [22]) [0 *led_7+0 S1 A8])
        (const_int 1 [0x1])) "/home/jmd/MemMem/memmem.c":16:8 98 {movqi}
     (nil))

and ends up as:

mov.p #0x2f2, y
mov.p (0,y) x
mov.b #1, (0,x)

So this wastes a register (which leads to other issues which I don't want to go
into in this email).

After a lot of debugging I tracked down the part of lra which is doing this
reload to the function process_addr_reg at lra-constraints.c:1378

  if (! REG_P (reg))
    {
      if (check_only_p)
        return true;
      /* Always reload memory in an address even if the target supports
         such addresses.  */
      new_reg = lra_create_new_reg_with_unique_value (mode, reg, cl, "address");
      before_p = true;
    }

Changing this to

  if (! REG_P (reg))
    {
      if (check_only_p)
        return true;
      return false;
    }

solves my immediate problem.  However I imagine there was a reason for doing
this reload, and presumably a better way of avoiding it.

Can someone explain the reason for this reload, and how I can best ensure that
indirect memory operands are left in the compiled code?



[1] https://www.nxp.com/docs/en/reference-manual/S12ZCPU_RM_V1.pdf

--
Avoid eavesdropping.  Send strong encrypted email.
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.



Re: Indirect memory addresses vs. lra

Vladimir Makarov

On 2019-08-04 3:18 p.m., John Darrington wrote:

> I'm trying to write a back-end for an architecture (s12z - the ISA you can
> download from [1]).  This arch accepts indirect memory addresses.   That is to
> say, those of the form (mem (mem (...)))  and although my TARGET_LEGITIMATE_ADDRESS
> function returns true for such addresses, LRA insists on reloading them out of
> existence.
>
> For example, when compiling a code fragment:
>
>    volatile unsigned char *led = 0x2F2;
>    *led = 1;
>
> the ira dump file shows:
>
> (insn 7 6 8 2 (set (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8])
>          (const_int 754 [0x2f2])) "/home/jmd/MemMem/memmem.c":15:27 96 {movpsi}
>       (nil))
> (insn 8 7 14 2 (set (mem/v:QI (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8]) [0 *led_7+0 S1 A8])
>          (const_int 1 [0x1])) "/home/jmd/MemMem/memmem.c":16:8 98 {movqi}
>       (nil))
>
> which is a perfectly valid insn, and the most efficient assembler for it is:
> mov.p #0x2f2, y
> mov.b #1, [0,y]
>
> However the reload dump shows this has been changed to:
>
> (insn 7 6 22 2 (set (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8])
>          (const_int 754 [0x2f2])) "/home/jmd/MemMem/memmem.c":15:27 96 {movpsi}
>       (nil))
> (insn 22 7 8 2 (set (reg:PSI 8 x [22])
>          (mem/f/c:PSI (reg/f:PSI 9 y) [3 led+0 S4 A8])) "/home/jmd/MemMem/memmem.c":16:8 96 {movpsi}
>       (nil))
> (insn 8 22 14 2 (set (mem/v:QI (reg:PSI 8 x [22]) [0 *led_7+0 S1 A8])
>          (const_int 1 [0x1])) "/home/jmd/MemMem/memmem.c":16:8 98 {movqi}
>       (nil))
>
> and ends up as:
>
> mov.p #0x2f2, y
> mov.p (0,y) x
> mov.b #1, (0,x)
>
> So this wastes a register (which leads to other issues which I don't want to go
> into in this email).
>
> After a lot of debugging I tracked down the part of lra which is doing this
> reload to the function process_addr_reg at lra-constraints.c:1378
>
>   if (! REG_P (reg))
>      {
>        if (check_only_p)
>          return true;
>        /* Always reload memory in an address even if the target supports such addresses.  */
>        new_reg = lra_create_new_reg_with_unique_value (mode, reg, cl, "address");
>        before_p = true;
>      }
>
> Changing this to
>
>   if (! REG_P (reg))
>      {
>        if (check_only_p)
>          return true;
>        return false;
>      }
>
> solves my immediate problem.  However I imagine there was a reason for doing
> this reload, and presumably a better way of avoiding it.
>
> Can someone explain the reason for this reload, and how I can best ensure that
> indirect memory operands are left in the compiled code?
>
The old reload (reload[1].c) supports such addressing.  As modern
mainstream architectures do not have this kind of addressing, it was not
implemented in LRA.

I don't think the above simple change will work fully.  For example, you
need to constrain memory nesting.  The constraints should be described;
maybe some hooks should be implemented (or maybe not, and
TARGET_LEGITIMATE_ADDRESS will be enough); maybe additional address
analysis and transformations should be implemented in LRA, etc.  But
implementing this may not be hard either.
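
A minimal sketch of what "constrain memory nesting" could look like, written
against a toy stand-in for RTL rather than GCC's real rtx type.  Every name
here (toy_rtx, toy_simple_address_p, toy_legitimate_address_p, max_indirect)
is hypothetical and only illustrates the idea of bounding the nesting depth;
it is not code from LRA or from any back-end.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for RTL expressions; hypothetical, not GCC's rtx.  */
enum toy_code { TOY_REG, TOY_CONST, TOY_PLUS, TOY_MEM };

struct toy_rtx {
    enum toy_code code;
    const struct toy_rtx *op0;  /* MEM: the address; PLUS: first operand */
    const struct toy_rtx *op1;  /* PLUS: second operand; otherwise NULL */
};

/* A plain base address: (reg) or (plus (reg) (const)).  */
static bool toy_simple_address_p(const struct toy_rtx *x)
{
    if (x == NULL)
        return false;
    if (x->code == TOY_REG)
        return true;
    return x->code == TOY_PLUS
        && x->op0 != NULL && x->op0->code == TOY_REG
        && x->op1 != NULL && x->op1->code == TOY_CONST;
}

/* Accept ADDR if it is a plain base address wrapped in at most
   MAX_INDIRECT levels of (mem ...).  Deeper nesting is rejected.  */
static bool toy_legitimate_address_p(const struct toy_rtx *addr,
                                     int max_indirect)
{
    while (addr != NULL && addr->code == TOY_MEM) {
        if (max_indirect-- <= 0)
            return false;       /* too many levels of indirection */
        addr = addr->op0;       /* step into the inner address */
    }
    return toy_simple_address_p(addr);
}
```

For the store in insn 8 the address operand is (mem (reg y)), so a limit of 1
would accept it, while (mem (mem (reg y))) would be rejected.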

It is also difficult for me to say whether it is worth doing.  Removing such
addressing helps to remove redundant memory reads.  On the other hand,
using it can decrease the number of insns, save registers for better RA, and
utilize hardware whose design took a lot of effort.
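
The trade-off can be put into numbers with a toy count.  This is only a
sketch under simplifying assumptions: the pointer lives in memory, it is not
modified between accesses, and every access is one insn.  The struct and
function names are hypothetical, and real costs depend on the target.

```c
/* Rough per-sequence counts; hypothetical model, not a GCC cost hook.  */
struct toy_cost {
    int insns;          /* instructions emitted */
    int pointer_loads;  /* times the pointer itself is read from memory */
    int extra_regs;     /* address registers tied up for the pointer */
};

/* Every access uses a memory-indirect operand: the pointer is re-read
   on each access, but no register is needed to hold it.  */
static struct toy_cost toy_cost_indirect(int n_accesses)
{
    struct toy_cost c = { n_accesses, n_accesses, 0 };
    return c;
}

/* The pointer is loaded once into a register and reused: one extra
   insn and one extra register, but a single read of the pointer.  */
static struct toy_cost toy_cost_reloaded(int n_accesses)
{
    struct toy_cost c = { 0, 0, 0 };
    if (n_accesses > 0) {
        c.insns = n_accesses + 1;
        c.pointer_loads = 1;
        c.extra_regs = 1;
    }
    return c;
}
```

With a single access, as in the led example, the indirect form is one insn
and no extra register against two insns and one register.  With several
accesses through the same pointer, the reloaded form reads the pointer once
instead of every time.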

In any case, if somebody implements this, it can be included in LRA.

>
> [1] https://www.nxp.com/docs/en/reference-manual/S12ZCPU_RM_V1.pdf
>

Re: Indirect memory addresses vs. lra

Paul Koning-6


> On Aug 8, 2019, at 12:25 PM, Vladimir Makarov <[hidden email]> wrote:
>
>
> On 2019-08-04 3:18 p.m., John Darrington wrote:
>> I'm trying to write a back-end for an architecture (s12z - the ISA you can
>> download from [1]).  This arch accepts indirect memory addresses.   That is to
>> say, those of the form (mem (mem (...)))  and although my TARGET_LEGITIMATE_ADDRESS
>> function returns true for such addresses, LRA insists on reloading them out of
>> existence.
>> ...
> The old reload (reload[1].c) supports such addressing.  As modern mainstream architectures have no this kind of addressing, it was not implemented in LRA.

Is LRA only intended for "modern mainstream architectures"?

If yes, why is the old reload being deprecated?  You can't have it both ways.  Unless you want to obsolete all "not modern mainstream architectures" in GCC, it doesn't make sense to get rid of core functionality used by those architectures.

Indirect addressing is a key feature in size-optimized code.

        paul


Re: Indirect memory addresses vs. lra

Segher Boessenkool
On Thu, Aug 08, 2019 at 12:43:52PM -0400, Paul Koning wrote:
> > On Aug 8, 2019, at 12:25 PM, Vladimir Makarov <[hidden email]> wrote:
> > The old reload (reload[1].c) supports such addressing.  As modern mainstream architectures have no this kind of addressing, it was not implemented in LRA.
>
> Is LRA only intended for "modern mainstream architectures"?

I sure hope not!  But it has only been *used* and *tested* much on such,
so far.  Things are designed to work well for modern archs.

> If yes, why is the old reload being deprecated?  You can't have it both ways.  Unless you want to obsolete all "not modern mainstream architectures" in GCC, it doesn't make sense to get rid of core functionality used by those architectures.
>
> Indirect addressing is a key feature in size-optimized code.

That doesn't mean that LRA has to support it, btw, not necessarily; it
may well be possible to do a good job of this in the later passes?
Maybe postreload, maybe some peepholes, etc.?


Segher

Re: Indirect memory addresses vs. lra

Paul Koning-6


> On Aug 8, 2019, at 1:21 PM, Segher Boessenkool <[hidden email]> wrote:
>
> On Thu, Aug 08, 2019 at 12:43:52PM -0400, Paul Koning wrote:
>>> On Aug 8, 2019, at 12:25 PM, Vladimir Makarov <[hidden email]> wrote:
>>> The old reload (reload[1].c) supports such addressing.  As modern mainstream architectures have no this kind of addressing, it was not implemented in LRA.
>>
>> Is LRA only intended for "modern mainstream architectures"?
>
> I sure hope not!  But it has only been *used* and *tested* much on such,
> so far.  Things are designed to work well for modern archs.
>
>> If yes, why is the old reload being deprecated?  You can't have it both ways.  Unless you want to obsolete all "not modern mainstream architectures" in GCC, it doesn't make sense to get rid of core functionality used by those architectures.
>>
>> Indirect addressing is a key feature in size-optimized code.
>
> That doesn't mean that LRA has to support it, btw, not necessarily; it
> may well be possible to do a good job of this in the later passes?
> Maybe postreload, maybe some peepholes, etc.?

Possibly.  But as Vladimir points out, indirect addressing affects register allocation (reducing register pressure).  In older architectures that implement indirect addressing, that is one of the key ways in which the feature reduces code size.  While I can see how peephole optimization can convert an address load plus a register indirect into a memory indirect instruction, does that help the register become available for other uses or is post-LRA too late for that?  My impression is that it is too late, since at this point we're dealing with hard registers and making one free via peephole helps no one else.

        paul



Re: Indirect memory addresses vs. lra

Paul Koning-6
In reply to this post by Segher Boessenkool


> On Aug 8, 2019, at 1:21 PM, Segher Boessenkool <[hidden email]> wrote:
>
> On Thu, Aug 08, 2019 at 12:43:52PM -0400, Paul Koning wrote:
>>> On Aug 8, 2019, at 12:25 PM, Vladimir Makarov <[hidden email]> wrote:
>>> The old reload (reload[1].c) supports such addressing.  As modern mainstream architectures have no this kind of addressing, it was not implemented in LRA.
>>
>> Is LRA only intended for "modern mainstream architectures"?
>
> I sure hope not!  But it has only been *used* and *tested* much on such,
> so far.

That's not entirely accurate.  At the prodding of people pushing for the removal of CC0 and reload, I've added LRA support to pdp11 in the V9 cycle.  And it works pretty well, in the sense of passing the compile tests.  But I haven't yet examined the code quality vs. the old one in any detail.

        paul


Re: Indirect memory addresses vs. lra

Vladimir Makarov
In reply to this post by Paul Koning-6

On 2019-08-08 12:43 p.m., Paul Koning wrote:

>
>> On Aug 8, 2019, at 12:25 PM, Vladimir Makarov <[hidden email]> wrote:
>>
>>
>> On 2019-08-04 3:18 p.m., John Darrington wrote:
>>> I'm trying to write a back-end for an architecture (s12z - the ISA you can
>>> download from [1]).  This arch accepts indirect memory addresses.   That is to
>>> say, those of the form (mem (mem (...)))  and although my TARGET_LEGITIMATE_ADDRESS
>>> function returns true for such addresses, LRA insists on reloading them out of
>>> existence.
>>> ...
>> The old reload (reload[1].c) supports such addressing.  As modern mainstream architectures have no this kind of addressing, it was not implemented in LRA.
> Is LRA only intended for "modern mainstream architectures"?


No.  As I wrote, patches implementing indirect addressing are welcome.
It is hard for one person to implement everything at once.


> If yes, why is the old reload being deprecated?
>    You can't have it both ways.  Unless you want to obsolete all "not modern mainstream architectures" in GCC, it doesn't make sense to get rid of core functionality used by those architectures.
>
> Indirect addressing is a key feature in size-optimized code.
>
>

Re: Indirect memory addresses vs. lra

Segher Boessenkool
In reply to this post by Paul Koning-6
On Thu, Aug 08, 2019 at 01:25:27PM -0400, Paul Koning wrote:

> > On Aug 8, 2019, at 1:21 PM, Segher Boessenkool <[hidden email]> wrote:
> > On Thu, Aug 08, 2019 at 12:43:52PM -0400, Paul Koning wrote:
> >> Indirect addressing is a key feature in size-optimized code.
> >
> > That doesn't mean that LRA has to support it, btw, not necessarily; it
> > may well be possible to do a good job of this in the later passes?
> > Maybe postreload, maybe some peepholes, etc.?
>
> Possibly.  But as Vladimir points out, indirect addressing affects
> register allocation (reducing register pressure).

Yeah, good point, esp. if you have only one or two registers that you
can use for addressing at all.  So it will have to happen during (or
before?) RA, alright.


Segher

Re: Indirect memory addresses vs. lra

Segher Boessenkool
In reply to this post by Paul Koning-6
On Thu, Aug 08, 2019 at 01:30:41PM -0400, Paul Koning wrote:

>
>
> > On Aug 8, 2019, at 1:21 PM, Segher Boessenkool <[hidden email]> wrote:
> >
> > On Thu, Aug 08, 2019 at 12:43:52PM -0400, Paul Koning wrote:
> >>> On Aug 8, 2019, at 12:25 PM, Vladimir Makarov <[hidden email]> wrote:
> >>> The old reload (reload[1].c) supports such addressing.  As modern mainstream architectures have no this kind of addressing, it was not implemented in LRA.
> >>
> >> Is LRA only intended for "modern mainstream architectures"?
> >
> > I sure hope not!  But it has only been *used* and *tested* much on such,
> > so far.
>
> That's not entirely accurate.  At the prodding of people pushing for
> the removal of CC0 and reload, I've added LRA support to pdp11 in the
> V9 cycle.

I said "much" :-)

Pretty much all design input so far has been from "modern mainstream
architectures", as far as I can make out.  Now one of those has the
most "interesting" (for RA) features that many less mainstream archs
have (a not-so-very-flat register file), so it should still work pretty
well hopefully.

> And it works pretty well, in the sense of passing the
> compile tests.  But I haven't yet examined the code quality vs. the
> old one in any detail.

That would be quite interesting to see, also for the other ports that
still need conversion: how much (if any) degradation should you expect
from a straight-up conversion of a port to LRA, without any retuning?


Segher

Re: Indirect memory addresses vs. lra

Jeff Law
On 8/8/19 1:19 PM, Segher Boessenkool wrote:

> On Thu, Aug 08, 2019 at 01:30:41PM -0400, Paul Koning wrote:
>>
>>
>>> On Aug 8, 2019, at 1:21 PM, Segher Boessenkool <[hidden email]> wrote:
>>>
>>> On Thu, Aug 08, 2019 at 12:43:52PM -0400, Paul Koning wrote:
>>>>> On Aug 8, 2019, at 12:25 PM, Vladimir Makarov <[hidden email]> wrote:
>>>>> The old reload (reload[1].c) supports such addressing.  As modern mainstream architectures have no this kind of addressing, it was not implemented in LRA.
>>>>
>>>> Is LRA only intended for "modern mainstream architectures"?
>>>
>>> I sure hope not!  But it has only been *used* and *tested* much on such,
>>> so far.
>>
>> That's not entirely accurate.  At the prodding of people pushing for
>> the removal of CC0 and reload, I've added LRA support to pdp11 in the
>> V9 cycle.
>
> I said "much" :-)
>
> Pretty much all design input so far has been from "modern mainstream
> architectures", as far as I can make out.  Now one of those has the
> most "interesting" (for RA) features that many less mainstream archs
> have (a not-so-very-flat register file), so it should still work pretty
> well hopefully.
Yea, it's certainly designed with the more mainstream architectures in
mind.  The double-indirect case that's being talked about here is well
out of the mainstream and not a feature of anything LRA has targeted to
date.  So I'm not surprised it's not working.

My suggestion would be to ignore the double-indirect aspect of the
architecture right now, get the port working, then come back and try to
make double-indirect addressing modes work.

>
>> And it works pretty well, in the sense of passing the
>> compile tests.  But I haven't yet examined the code quality vs. the
>> old one in any detail.
>
> That would be quite interesting to see, also for the other ports that
> still need conversion: how much (if any) degradation should you expect
> from a straight-up conversion of a port to LRA, without any retuning?
I did the v850 last year where it was a wash or perhaps a slight
improvement for codesize, which is a reasonable approximation for
performance on that target.

I was working a bit on converting the H8 away from cc0 with an eye
towards LRA as well.  Given how registers overlap on the H8, the most
straightforward port should end up with properties much like 32bit x86.
  I suspect the independent addressing of the high/low register parts
might be better handled by LRA, but I wasn't going to do anything beyond
the "just make it work".

jeff

Re: Indirect memory addresses vs. lra

John Darrington
On Thu, Aug 08, 2019 at 01:57:41PM -0600, Jeff Law wrote:

     Yea, it's certainly designed with the more mainstream architectures in
     mind.  THe double-indirect case that's being talked about here is well
     out of the mainstream and not a feature of anything LRA has targetted to
     date.  So I'm not surprised it's not working.
     
     My suggestion would be to ignore the double-indirect aspect of the
     architecture right now, get the port working, then come back and try to
     make double-indirect addressing modes work.
     
This sounds like sensible advice.  However I wonder if this issue is
related to the other major outstanding problem I have, viz: the large
number of test failures which report "Unable to find a register to
spill".  So far, nobody has been able to explain how to solve that
issue, and even the people who appear to be more knowledgeable have
expressed surprise that it is even happening at all.

Even if it should turn out not to be related, the message I've been
receiving in this thread is lra should not be expected to work for
non "mainstream" backends.  So perhaps there is another, yet to be
discovered, restriction which prevents my backend from ever working?

On the other hand, given my lack of experience with gcc,  it could be
that lra is working perfectly, and I have simply done something
incorrectly.    But the uncertainty voiced in this thread means that it
is hard to be sure that I'm not trying to do something which is
currently unsupported.

J'



Re: Indirect memory addresses vs. lra

Segher Boessenkool
Hi!

On Fri, Aug 09, 2019 at 10:14:39AM +0200, John Darrington wrote:

> On Thu, Aug 08, 2019 at 01:57:41PM -0600, Jeff Law wrote:
>
>      Yea, it's certainly designed with the more mainstream architectures in
>      mind.  THe double-indirect case that's being talked about here is well
>      out of the mainstream and not a feature of anything LRA has targetted to
>      date.  So I'm not surprised it's not working.
>      
>      My suggestion would be to ignore the double-indirect aspect of the
>      architecture right now, get the port working, then come back and try to
>      make double-indirect addressing modes work.
>      
> This sounds like sensible advice.  However I wonder if this issue is
> related to the other major outstanding problem I have, viz: the large
> number of test failures which report "Unable to find a register to
> spill" - So far, nobody has been able to explain how to solve that
> issue and even the people who appear to be more knowlegeable have
> expressed suprise that it is even happening at all.

No one is surprised.  It is just the funny way that LRA says "whoops I
am going in circles, there is no progress and there will never be, I'd
better stop that".  Everyone doing new ports / new conversions to LRA
sees that error all the time.

The error could be pretty much *anywhere* in your port.  You have to
look at what LRA did, and why, and why that is wrong, and fix that.

> Even if it should turn out not to be related, the message I've been
> receiving in this thread is lra should not be expected to work for
> non "mainstream" backends.

LRA is more likely to have problems in situations where it has not been
tested before.  You can replace LRA by anything else, and this isn't
limited to GCC (or software, or human endeavours, or humanity even).

> So perhaps there is another, yet to be
> discovered, restriction which prevents my backend from ever working?

From ever?  Nah, we can patch.  Also, Occam's razor says there likely
is an error in your backend you haven't found yet.

> On the other hand, given my lack of experience with gcc,  it could be
> that lra is working perfectly, and I have simply done something
> incorrectly.    But the uncertainty voiced in this thread means that it
> is hard to be sure that I'm not trying to do something which is
> currently unsupported.

Is your code in some branch in our git?  Or in some other public git?
Do you have a representative testcase?


Segher

Re: Indirect memory addresses vs. lra

Paul Koning-6


> On Aug 9, 2019, at 10:16 AM, Segher Boessenkool <[hidden email]> wrote:
>
> Hi!
>
> On Fri, Aug 09, 2019 at 10:14:39AM +0200, John Darrington wrote:
>> On Thu, Aug 08, 2019 at 01:57:41PM -0600, Jeff Law wrote:
>>
>>  ...  However I wonder if this issue is
>> related to the other major outstanding problem I have, viz: the large
>> number of test failures which report "Unable to find a register to
>> spill" - So far, nobody has been able to explain how to solve that
>> issue and even the people who appear to be more knowlegeable have
>> expressed suprise that it is even happening at all.
>
> No one is surprised.  It is just the funny way that LRA says "whoops I
> am going in circles, there is no progress and there will never be, I'd
> better stop that".  Everyone doing new ports / new conversions to LRA
> sees that error all the time.
>
> The error could be pretty much *anywhere* in your port.  You have to
> look at what LRA did, and why, and why that is wrong, and fix that.

I've run into this a number of times.  The difficulty is that, for someone who understands the back end and the documented rules but not the internals of LRA, it tends to be hard to figure out what the problem is.  And since the causes tend to be obscure and undocumented, I find myself having to relearn the analysis from time to time.

It has been stated that LRA is more dependent on correct back end definitions than Reload is, but unfortunately the precise definition of "correct" can be less than obvious to a back end maintainer.

        paul



Re: Indirect memory addresses vs. lra

Jeff Law
In reply to this post by John Darrington
On 8/9/19 2:14 AM, John Darrington wrote:

> On Thu, Aug 08, 2019 at 01:57:41PM -0600, Jeff Law wrote:
>
>      Yea, it's certainly designed with the more mainstream architectures in
>      mind.  THe double-indirect case that's being talked about here is well
>      out of the mainstream and not a feature of anything LRA has targetted to
>      date.  So I'm not surprised it's not working.
>      
>      My suggestion would be to ignore the double-indirect aspect of the
>      architecture right now, get the port working, then come back and try to
>      make double-indirect addressing modes work.
>      
> This sounds like sensible advice.  However I wonder if this issue is
> related to the other major outstanding problem I have, viz: the large
> number of test failures which report "Unable to find a register to
> spill" - So far, nobody has been able to explain how to solve that
> issue and even the people who appear to be more knowlegeable have
> expressed suprise that it is even happening at all.
You're going to have to debug what LRA is doing and why.  There are really
no shortcuts here.  We can't really do it for you.  Even if you weren't
using LRA you'd be doing the same process, just on an even more difficult
to understand codebase.

>
> Even if it should turn out not to be related, the message I've been
> receiving in this thread is lra should not be expected to work for
> non "mainstream" backends.  So perhaps there is another, yet to be
> discovered, restriction which prevents my backend from ever working?
It's possible.  But that's not really any different than reload.
There's certainly various aspects of architectures that reload can't
handle as well -- even on architectures that were mainstream processors
when reload was under active development and maintenance.  There's even
a good chance reload won't handle double-indirect addressing modes well
-- they were far from mainstream and as a result the code which does
purport to handle double-indirect addressing modes hasn't been
used/tested all that much over the last 25+ years.

>
> On the other hand, given my lack of experience with gcc,  it could be
> that lra is working perfectly, and I have simply done something
> incorrectly.    But the uncertainty voiced in this thread means that it
> is hard to be sure that I'm not trying to do something which is
> currently unsupported.
My recommendation is to continue with the LRA path.

jeff

Re: Indirect memory addresses vs. lra

Vladimir Makarov
In reply to this post by John Darrington

On 2019-08-09 4:14 a.m., John Darrington wrote:

> On Thu, Aug 08, 2019 at 01:57:41PM -0600, Jeff Law wrote:
>
>       Yea, it's certainly designed with the more mainstream architectures in
>       mind.  THe double-indirect case that's being talked about here is well
>       out of the mainstream and not a feature of anything LRA has targetted to
>       date.  So I'm not surprised it's not working.
>      
>       My suggestion would be to ignore the double-indirect aspect of the
>       architecture right now, get the port working, then come back and try to
>       make double-indirect addressing modes work.
>      
> This sounds like sensible advice.  However I wonder if this issue is
> related to the other major outstanding problem I have, viz: the large
> number of test failures which report "Unable to find a register to
> spill" - So far, nobody has been able to explain how to solve that
> issue and even the people who appear to be more knowlegeable have
> expressed suprise that it is even happening at all.

Basically, LRA behaves here like the older reload.  If an RTL insn needs hard
regs and there are no free regs, LRA/reload puts pseudos that are assigned to
hard regs and live through the insn into memory.  So it is very hard to run
into the problem "unable to find a register to spill" if the insn needs
fewer regs than the architecture provides.  That is why people are surprised.
Still, it can happen, as one RTL insn can be implemented by a few machine
insns.  The most frequent case here is GCC asm insns requiring a lot of
input, output, and clobbered regs/operands.
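
The rule can be sketched with a toy model of one register class.  It assumes
registers already claimed for the current insn cannot be spilled again; the
enum, the function name and that assumption are illustrative only and do not
reflect how LRA is actually structured.

```c
#include <stddef.h>

/* State of one hard register in the class; hypothetical model.  */
enum toy_reg_state {
    TOY_FREE,          /* nothing lives in the register */
    TOY_HOLDS_PSEUDO,  /* an ordinary pseudo lives here and may be spilled */
    TOY_HOLDS_RELOAD   /* claimed for the current insn; not spillable */
};

/* Try to claim NEEDED registers for one insn.  Free registers are taken
   first, then pseudos living through the insn are spilled to memory.
   Returns the number of pseudos spilled, or -1 when the class cannot
   supply enough registers (the "unable to find a register to spill"
   situation).  On failure REGS may have been partially updated.  */
static int toy_claim_regs(enum toy_reg_state *regs, size_t n_regs, int needed)
{
    int spills = 0;
    size_t i;

    for (i = 0; i < n_regs && needed > 0; i++) {
        if (regs[i] == TOY_FREE) {
            regs[i] = TOY_HOLDS_RELOAD;
            needed--;
        }
    }
    for (i = 0; i < n_regs && needed > 0; i++) {
        if (regs[i] == TOY_HOLDS_PSEUDO) {
            regs[i] = TOY_HOLDS_RELOAD;   /* pseudo goes to memory */
            needed--;
            spills++;
        }
    }
    return needed == 0 ? spills : -1;
}
```

In this model a class of two registers always satisfies an insn that needs
two, at the cost of spills.  It fails only when the insn needs more registers
than the class can still give up.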

If you provide an LRA dump for such a test (it is better to use
-fira-verbose=15 to output full RA info to stderr), I could probably
say more.

The fewer regs the architecture has, the easier it is to run into such an
error message if something is described wrongly in the back-end.  I see your
architecture is a 16-bit micro-controller with only 8 regs, some of which
are specialized.  So your architecture is really register constrained.

> Even if it should turn out not to be related, the message I've been
> receiving in this thread is lra should not be expected to work for
> non "mainstream" backends.  So perhaps there is another, yet to be
> discovered, restriction which prevents my backend from ever working?
>
> On the other hand, given my lack of experience with gcc,  it could be
> that lra is working perfectly, and I have simply done something
> incorrectly.    But the uncertainty voiced in this thread means that it
> is hard to be sure that I'm not trying to do something which is
> currently unsupported.

LRA/reload is the most machine-dependent machine-independent pass in
GCC.  It is connected to machine-dependent code in numerous ways.  A big
part of making a new backend is getting LRA/reload and the
machine-dependent code to communicate in the right way.

Sometimes it is hard to decide who is responsible for RA related bugs:
RA or back-end.  Sometimes an innocent change in RA solving one problem
for a particular target might result in numerous new bugs for other
targets.  Therefore it is very difficult to say whether your small change
to permit indirect memory addressing will work in the general case.


Re: Indirect memory addresses vs. lra

John Darrington
On Fri, Aug 09, 2019 at 01:34:36PM -0400, Vladimir Makarov wrote:
     
     If you provide LRA dump for such test (it is better to use
     -fira-verbose=15 to output full RA info into stderr), I probably could
     say more.

I've attached such a dump (generated from gcc/testsuite/gcc.c-torture/compile/pr53410-2.c).
     
     The less regs the architecture has, the easier to run into such error
     message if something described wrong in the back-end.  I see your
     architecture is 16-bit micro-controller with only 8 regs, some of them is
     specialized.  So your architecture is really register constrained.

That's not quite correct.  It is a 24-bit micro-controller (the address
space is 24 bits wide).  There are 2 address registers (plus stack
pointer and program counter) and there are 8 general purpose data
registers (of differing sizes).
     

J'




Re: Indirect memory addresses vs. lra

John Darrington
In reply to this post by Segher Boessenkool
On Fri, Aug 09, 2019 at 09:16:44AM -0500, Segher Boessenkool wrote:

     Is your code in some branch in our git?  

No.  But it could be pushed there if people think it would be
appropriate to do so, and if I'm given the permissions to do so.
     
     Or in some other public git?

It's in my repo on gcc135 ~jmd/gcc-s12z (branch s12z)


     Do you have a representative testcase?

I think gcc/testsuite/gcc.c-torture/compile/pr53410-2.c is as
representative as any.
     

J'

     




Re: Indirect memory addresses vs. lra

Segher Boessenkool
In reply to this post by John Darrington
Hi!

On Sat, Aug 10, 2019 at 08:05:53AM +0200, John Darrington wrote:
> Choosing alt 5 in insn 14:  (0) m  (1) m {*movsi}
>    14: [r40:PSI+0x20]=[r41:PSI]
>     Inserting insn reload before:
>    48: r40:PSI=r34:PSI
>    49: r41:PSI=[y:PSI+0x2f]

insn 14 is a mem-to-mem move (another feature not many more modern /
more RISCy CPUs have).  That requires both of your address registers.
So far, so good.  The reloads (insn 48 and 49) require address
registers themselves; that isn't necessarily a problem either.  But
this requires careful juggling.  Maybe you will need some backend code
for this, or to optimise this (although right now you just want it to
*work* :-) )

For some reason LRA didn't manage.  Register inheritance seems to be
implicated (but that might be a red herring).  Vladimir will probably
find out more, and/or correct me :-)


Segher

Re: Indirect memory addresses vs. lra

Segher Boessenkool
In reply to this post by John Darrington
On Sat, Aug 10, 2019 at 08:10:27AM +0200, John Darrington wrote:

> On Fri, Aug 09, 2019 at 09:16:44AM -0500, Segher Boessenkool wrote:
>
>      Is your code in some branch in our git?  
>
> No.  But it could be pushed there if people think it would be
> appropriate to do so, and if I'm given the permissions to do so.
>      
>      Or in some other public git?
>
> It's in my repo on gcc135 ~jmd/gcc-s12z (branch s12z)

That will work fine, for me at least.

>      Do you have a representative testcase?
>
> I think gcc/testsuite/gcc.c-torture/compile/pr53410-2.c is as
> representative as any.

Okido, thanks!


Segher

Re: Indirect memory addresses vs. lra

John Darrington
In reply to this post by Segher Boessenkool
On Sat, Aug 10, 2019 at 11:12:18AM -0500, Segher Boessenkool wrote:
     Hi!
     
     On Sat, Aug 10, 2019 at 08:05:53AM +0200, John Darrington wrote:
     > Choosing alt 5 in insn 14:  (0) m  (1) m {*movsi}
     >    14: [r40:PSI+0x20]=[r41:PSI]
     >     Inserting insn reload before:
     >    48: r40:PSI=r34:PSI
     >    49: r41:PSI=[y:PSI+0x2f]
     
     insn 14 is a mem-to-mem move (another feature not many more modern /
     more RISCy CPUs have).  That requires both of your address registers.
     So far, so good.  The reloads (insn 48 and 49) require address
     registers themselves; that isn't necessarily a problem either.

So far as I can see, insn 48 is completely redundant.  It's copying a
pseudo reg (74) into another pseudo reg (40).
This is pointless and a waste, since insn 14 does not modify 74.
I don't understand why lra feels the need to do it.

If lra knew about (mem (mem ...)) style addressing, then insn 49 would
also be redundant (which is why I raised the topic).

In summary, what we have is:

(insn 48 84 49 2 (set (reg/f:PSI 40 [34])
        (reg/f:PSI 74 [34]))
     (nil))
(insn 49 48 14 2 (set (reg:PSI 41)
        (mem/f/c:PSI (plus:PSI (reg/f:PSI 9 y)
                (const_int 47 [0x2f])) [3 p+0 S4 A8]))
     (nil))
(insn 14 49 15 2 (set (mem:SI (plus:PSI (reg/f:PSI 40 [34])
                (const_int 32 [0x20])) [2  S4 A64])
        (mem:SI (reg:PSI 41) [2 *p_5(D)+0 S4 A8]))

where, like you say, insns 48 and 49 are reloads.  But these two reloads
are unnecessary and cause the machine to run out of PSImode registers.
The above could be done more easily and more efficiently as:

(insn 14 11 15 2 (set
        (mem:SI (plus:PSI (reg/f:PSI 74 [34]) (const_int 32 [0x20])) [2  S4 A64])
        (mem/f/c:PSI (mem:PSI (plus:PSI (reg/f:PSI 9 y)
                (const_int 47 [0x2f])) [3 p+0 S4 A8])))


This is exactly what we had before lra messed with things.  It can be
represented in the ISA with one assembler instruction:
  mov.p (32, x), [47, y]
and if I'm not mistaken, alternative 5 of my "movpsi" pattern should do
this just fine.
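
To make the point concrete, here is a minimal sketch in plain C of the kind of address check I have in mind.  It is a toy model only: the toy_rtx type and all toy_* names are hypothetical stand-ins and not GCC's real rtx or hook API.  It accepts (reg) and (plus reg const_int) addresses and, when allow_indirect is set, one level of (mem ...) around them.  It also counts how many inner mems would have to be loaded into a scratch register when indirection is not accepted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for RTL expressions.  NOT GCC's real rtx type. */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS, TOY_MEM };

struct toy_rtx {
  enum toy_code code;
  const struct toy_rtx *op0;  /* PLUS: base register; MEM: address */
  const struct toy_rtx *op1;  /* PLUS: constant offset */
  long value;                 /* REG: register number; CONST_INT: value */
};

/* True for (reg) or (plus (reg) (const_int)). */
static bool toy_base_plus_offset_p(const struct toy_rtx *x)
{
  if (x == NULL)
    return false;
  if (x->code == TOY_REG)
    return true;
  return x->code == TOY_PLUS
         && x->op0 != NULL && x->op0->code == TOY_REG
         && x->op1 != NULL && x->op1->code == TOY_CONST_INT;
}

/* Accept direct addresses and, if allow_indirect is set, one level of
   (mem ADDR) where ADDR is itself a direct address. */
static bool toy_legitimate_address_p(const struct toy_rtx *addr,
                                     bool allow_indirect)
{
  if (toy_base_plus_offset_p(addr))
    return true;
  if (allow_indirect && addr != NULL && addr->code == TOY_MEM)
    return toy_base_plus_offset_p(addr->op0);
  return false;
}

/* Count the inner MEMs that would need to be loaded into a scratch
   register before ADDR becomes acceptable.  Only MEM nesting is counted;
   other illegitimate shapes are ignored in this sketch. */
static int toy_count_address_reloads(const struct toy_rtx *addr,
                                     bool allow_indirect)
{
  if (addr == NULL || toy_legitimate_address_p(addr, allow_indirect))
    return 0;
  if (addr->code == TOY_MEM)
    return 1 + toy_count_address_reloads(addr->op0, allow_indirect);
  return 0;
}

/* The source address of insn 14 above: (mem (plus (reg y) (const_int 47))). */
static const struct toy_rtx toy_reg_y     = { TOY_REG, NULL, NULL, 9 };
static const struct toy_rtx toy_off_47    = { TOY_CONST_INT, NULL, NULL, 47 };
static const struct toy_rtx toy_y_plus_47 = { TOY_PLUS, &toy_reg_y, &toy_off_47, 0 };
static const struct toy_rtx toy_indirect  = { TOY_MEM, &toy_y_plus_47, NULL, 0 };
```

In this model, toy_count_address_reloads (&toy_indirect, false) yields one reload, which corresponds to insn 49, whereas with allow_indirect set the address is accepted as it stands and no extra PSImode register is needed.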


     But
     this requires careful juggling.  Maybe you will need some backend code

Could you give a hint as to which set of hooks/constraints/predicates
this backend code should go in?
     

