[Bug target/24073] New: (vector float){a, b, 0, 0} code gen is not good


cvs-commit at gcc dot gnu.org
Take the following example:
#define vector __attribute__((vector_size(16)))

float a; float b;
vector float f(void) { return (vector float){ a, b, 0.0, 0.0}; }
---
Currently we get:
        subl    $12, %esp
        movss   _b, %xmm0
        movss   _a, %xmm1
        unpcklps        %xmm0, %xmm1
        movaps  %xmm1, %xmm0
        xorl    %eax, %eax
        xorl    %edx, %edx
        movl    %eax, (%esp)
        movl    %edx, 4(%esp)
        xorps   %xmm1, %xmm1
        movlhps %xmm1, %xmm0
        addl    $12, %esp

------
We should be able to produce:
        movss   _b, %xmm0
        movss   _a, %xmm1
        shufps  $60, %xmm1, %xmm0       /* [0, 3, 3, 0]: _a, 0, 0, _b */
        shufps  $201, %xmm0, %xmm0      /* [3, 0, 2, 1]: _a, _b, 0, 0 */

This is from Nathan Begeman.
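
To make the two shuffle immediates easier to check, here is a minimal sketch in plain C that models the lane selection of shufps as documented for SSE (low two result lanes taken from the destination, high two from the source). The names xmm_t, movss_load, shufps and proposed_sequence are made up for this illustration and are not GCC code. The model stores lane 0 first, so the comments in the proposed sequence match when the lanes are read from lane 3 down to lane 0.

```c
#include <stdint.h>

/* Four 32-bit float lanes; lane[0] is the least significant lane. */
typedef struct { float lane[4]; } xmm_t;

/* Model of a movss load from memory: lane 0 gets the value,
   lanes 1-3 are zeroed. */
static xmm_t movss_load(float x)
{
    xmm_t r = { { x, 0.0f, 0.0f, 0.0f } };
    return r;
}

/* Model of AT&T "shufps $imm, src, dst"; returns the new dst.
   Result lanes 0-1 are picked from the old dst, lanes 2-3 from src,
   each by a 2-bit field of the immediate. */
static xmm_t shufps(uint8_t imm, xmm_t src, xmm_t dst)
{
    xmm_t r;
    r.lane[0] = dst.lane[imm & 3];
    r.lane[1] = dst.lane[(imm >> 2) & 3];
    r.lane[2] = src.lane[(imm >> 4) & 3];
    r.lane[3] = src.lane[(imm >> 6) & 3];
    return r;
}

/* Trace of the proposed four-instruction sequence (illustrative only). */
static xmm_t proposed_sequence(float a, float b)
{
    xmm_t xmm0 = movss_load(b);
    xmm_t xmm1 = movss_load(a);
    xmm0 = shufps(60, xmm1, xmm0);   /* lanes 3..0: a, 0, 0, b */
    xmm0 = shufps(201, xmm0, xmm0);  /* lanes 3..0: a, b, 0, 0 */
    return xmm0;
}
```

Calling proposed_sequence(1.0f, 2.0f) and inspecting lane[3] down to lane[0] is one way to compare the model against the comments above.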

--
           Summary: (vector float){a, b, 0, 0} code gen is not good
           Product: gcc
           Version: 4.1.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: minor
          Priority: P2
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: pinskia at gcc dot gnu dot org
                CC: gcc-bugs at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24073

[Bug target/24073] (vector float){a, b, 0, 0} code gen is not good

cvs-commit at gcc dot gnu.org


--
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ssemmx


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24073

[Bug target/24073] (vector float){a, b, 0, 0} code gen is not good

cvs-commit at gcc dot gnu.org
In reply to this post by cvs-commit at gcc dot gnu.org


--
           What    |Removed                     |Added
----------------------------------------------------------------------------
 GCC target triplet|                            |i786-pc-darwin7.9


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24073

[Bug target/24073] (vector float){a, b, 0, 0} code gen is not good

cvs-commit at gcc dot gnu.org
In reply to this post by cvs-commit at gcc dot gnu.org

------- Additional Comments From pinskia at gcc dot gnu dot org  2005-09-27 05:07 -------
The issue is in ix86_expand_vector_init.

--


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24073

[Bug target/24073] (vector float){a, b, 0, 0} code gen is not good

cvs-commit at gcc dot gnu.org
In reply to this post by cvs-commit at gcc dot gnu.org

------- Additional Comments From belyshev at depni dot sinp dot msu dot ru  2005-09-27 05:51 -------
Confirmed.

--
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
   Last reconfirmed|0000-00-00 00:00:00         |2005-09-27 05:51:20
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24073

[Bug target/24073] (vector float){a, b, 0, 0} code gen is not good

cvs-commit at gcc dot gnu.org
In reply to this post by cvs-commit at gcc dot gnu.org

------- Additional Comments From uros at kss-loka dot si  2005-09-27 11:19 -------
With a mainline gcc that is a couple of months old (20050716), the following asm
is produced (-O2 -msse2 -fomit-frame-pointer):

        subl $12, %esp
        movss b, %xmm0
        movss a, %xmm1
        unpcklps %xmm0, %xmm1
        movaps %xmm1, %xmm0
        xorl %eax, %eax
        xorl %edx, %edx
        movl %eax, (%esp)
        movl %edx, 4(%esp)
>>> movlps (%esp), %xmm1
        addl $12, %esp
        movlhps %xmm1, %xmm0
        ret

This explains where all those xors and moves come from. It looks like newer
compilers somehow fix the damage by using xorps, a bit late in the game, IMO.

This part of the bug depends on PR target/22076.

Other than that, the problem is that the V4SF vector initialization is decomposed
into two V2SF initializations (these are MMX insns, and this further confuses
the x87/MMX switching patch) that are later concatenated into a V4SF.
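
The "two halves, then concatenate" shape can be seen by tracing the asm above lane by lane. The following is only a sketch in plain C, assuming the documented SSE semantics of movss (load form), unpcklps, movlps and movlhps; xmm_t and all function names are invented for the illustration and do not correspond to anything in GCC. In this model the value is already {a, b, 0, 0} right after unpcklps, because a movss load from memory zeroes the upper lanes; the zero pair built on the stack and the final movlhps do not change it.

```c
/* Four 32-bit float lanes; lane[0] is the least significant lane. */
typedef struct { float lane[4]; } xmm_t;

/* movss from memory: lane 0 gets the value, lanes 1-3 are zeroed. */
static xmm_t movss_load(float x)
{
    xmm_t r = { { x, 0.0f, 0.0f, 0.0f } };
    return r;
}

/* AT&T "unpcklps src, dst": interleave the low two lanes of dst and src. */
static xmm_t unpcklps(xmm_t src, xmm_t dst)
{
    xmm_t r = { { dst.lane[0], src.lane[0], dst.lane[1], src.lane[1] } };
    return r;
}

/* "movlps mem, dst": replace lanes 0-1 of dst, keep lanes 2-3. */
static xmm_t movlps_load(const float mem[2], xmm_t dst)
{
    dst.lane[0] = mem[0];
    dst.lane[1] = mem[1];
    return dst;
}

/* "movlhps src, dst": copy lanes 0-1 of src into lanes 2-3 of dst. */
static xmm_t movlhps(xmm_t src, xmm_t dst)
{
    dst.lane[2] = src.lane[0];
    dst.lane[3] = src.lane[1];
    return dst;
}

/* Trace of the 20050716 sequence; *after_unpck receives the register
   contents right after unpcklps (hypothetical helper for inspection). */
static xmm_t traced_sequence(float a, float b, xmm_t *after_unpck)
{
    float stack[2] = { 0.0f, 0.0f };     /* the two xorl/movl stores */
    xmm_t xmm0 = movss_load(b);
    xmm_t xmm1 = movss_load(a);
    xmm1 = unpcklps(xmm0, xmm1);         /* first pair: a, b */
    xmm0 = xmm1;                         /* movaps %xmm1, %xmm0 */
    *after_unpck = xmm1;
    xmm1 = movlps_load(stack, xmm1);     /* second pair: 0, 0 */
    xmm0 = movlhps(xmm1, xmm0);          /* concatenate the two pairs */
    return xmm0;
}
```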

--
           What    |Removed                     |Added
----------------------------------------------------------------------------
  BugsThisDependsOn|                            |22076


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24073

[Bug target/24073] (vector float){a, b, 0, 0} code gen is not good

cvs-commit at gcc dot gnu.org
In reply to this post by cvs-commit at gcc dot gnu.org

------- Additional Comments From uros at kss-loka dot si  2005-09-27 11:41 -------
I think that the following example wins the contest:

vector float f(void) { return (vector float){ a, a, b, b}; }

gcc -O2 -msse -fomit-frame-pointer

        subl $28, %esp
        movss a, %xmm0
        movss %xmm0, 4(%esp)
        movss b, %xmm0
        movd 4(%esp), %mm0
        punpckldq %mm0, %mm0
        movss %xmm0, 4(%esp)
        movq %mm0, 16(%esp)
        movd 4(%esp), %mm0
        punpckldq %mm0, %mm0
        movq %mm0, 8(%esp)
        movlps 16(%esp), %xmm1
        movhps 8(%esp), %xmm1
        addl $28, %esp
        movaps %xmm1, %xmm0
        ret

Note the usage of MMX registers.

--


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24073

[Bug target/24073] (vector float){a, b, 0, 0} code gen is not good

cvs-commit at gcc dot gnu.org
In reply to this post by cvs-commit at gcc dot gnu.org

------- Additional Comments From pinskia at gcc dot gnu dot org  2005-09-27 14:33 -------
(In reply to comment #4)
> I think that following example wins the contest:
>
> vector float f(void) { return (vector float){ a, a, b, b}; }

For this, it is a different bug.  The issue with the above is that the check for mmx_ok in
ix86_expand_vector_init_duplicate is wrong.
Currently, we have
      if (!mmx_ok && !TARGET_SSE)
but if I change it to:
      if (!mmx_ok)
we get:
        movss   _a, %xmm0
        movss   _b, %xmm1
        unpcklps        %xmm0, %xmm0
        unpcklps        %xmm1, %xmm1
        movlhps %xmm1, %xmm0
Which looks OK to me.  That testcase should be filed as a separate bug, as the code currently generated for it is obviously wrong.
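
As a cross-check of the five-instruction sequence above, here is a small lane-level sketch in plain C. It assumes the documented SSE behaviour of movss (load form), unpcklps and movlhps; xmm_t and the function names are hypothetical and exist only for this illustration.

```c
/* Four 32-bit float lanes; lane[0] is the least significant lane. */
typedef struct { float lane[4]; } xmm_t;

/* movss from memory: lane 0 gets the value, lanes 1-3 are zeroed. */
static xmm_t movss_load(float x)
{
    xmm_t r = { { x, 0.0f, 0.0f, 0.0f } };
    return r;
}

/* AT&T "unpcklps src, dst": interleave the low two lanes of dst and src. */
static xmm_t unpcklps(xmm_t src, xmm_t dst)
{
    xmm_t r = { { dst.lane[0], src.lane[0], dst.lane[1], src.lane[1] } };
    return r;
}

/* "movlhps src, dst": copy lanes 0-1 of src into lanes 2-3 of dst. */
static xmm_t movlhps(xmm_t src, xmm_t dst)
{
    dst.lane[2] = src.lane[0];
    dst.lane[3] = src.lane[1];
    return dst;
}

/* Trace of the sequence obtained with the changed mmx_ok check. */
static xmm_t dup_pairs(float a, float b)
{
    xmm_t xmm0 = movss_load(a);
    xmm_t xmm1 = movss_load(b);
    xmm0 = unpcklps(xmm0, xmm0);   /* lanes 0..3: a, a, 0, 0 */
    xmm1 = unpcklps(xmm1, xmm1);   /* lanes 0..3: b, b, 0, 0 */
    xmm0 = movlhps(xmm1, xmm0);    /* lanes 0..3: a, a, b, b */
    return xmm0;
}
```

In this model lane 0 corresponds to the first element of the initializer, so the result lines up with (vector float){ a, a, b, b } without touching any MMX register.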

--


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24073