IEC559 totalOrder

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

IEC559 totalOrder

Matthias Kretz-2
Here's an IEC559 totalOrder implementation that works with clang and GCC, for
float, double, long double, and __float128 (at least the non-clang branch
should work with float16, too):
https://godbolt.org/z/QzGWdB

Note that I simplified the order reversal of negative values slightly, and
producing the bitmask for the sign is now "simpler".

As Jakub noted on IRC, 32-bit x86's long double is a problem. Since it has
sizeof == 12 there's no *constexpr* way to cast it to an integer.
std::bit_cast also wouldn't help. It seems there's no way around a builtin for
this function. (Besides that, I ignored the presence of unnormal values in
long double - you still get a totalOrder, but I'm not sure IEC559 requests a
different behavior.)

Cheers,
  Matthias

--

#include <type_traits>
#include <limits>
#include <climits>

template <class T>
  constexpr bool totalOrder(T xx, T yy)
{
  //static_assert(std::numeric_limits<T>::is_iec559); // fails for __float128
  constexpr int Bytes = sizeof(T) == 12 ? 16 : sizeof(T);
  using II = std::conditional_t<Bytes == sizeof(signed char), signed char,
             std::conditional_t<Bytes == sizeof(signed short), signed short,
             std::conditional_t<Bytes == sizeof(signed int), signed int,
             std::conditional_t<Bytes == sizeof(signed long), signed long,
             std::conditional_t<Bytes == sizeof(signed long long), signed long
long,
                                __int128>>>>>;
#ifdef __clang__
  constexpr int unused_bits = sizeof(T) <= 8 ? 0 :
                            (Bytes - 8) * CHAR_BIT - 1 -
                            __builtin_ctzll((__builtin_bit_cast(II, T(-0.)) >>
8 * CHAR_BIT));
  const II xi = __builtin_bit_cast(II, xx) << unused_bits;
  const II yi = __builtin_bit_cast(II, yy) << unused_bits;
  constexpr II signbit = __builtin_bit_cast(II, T(-0.)) << unused_bits;
  const auto x = xi < 0 ? ~(xi ^ signbit) : xi;
  const auto y = yi < 0 ? ~(yi ^ signbit) : yi;
  return x <= y;
#else
  using F [[gnu::vector_size(Bytes)]] = T;
  using I [[gnu::vector_size(Bytes)]] = II;
  constexpr int unused_bits = sizeof(T) <= 8 ? 0 :
                            (Bytes - 8) * CHAR_BIT - 1 -
                            __builtin_ctzll((reinterpret_cast<I>(F{T(-0.)}) >>
8 * CHAR_BIT)[0]);
  const I xi = reinterpret_cast<I>(F{xx}) << unused_bits;
  const I yi = reinterpret_cast<I>(F{yy}) << unused_bits;
  constexpr I signbit = reinterpret_cast<I>(F{T(-0.)}) << unused_bits;
  const auto x = xi[0] < 0 ? (~(xi ^ signbit))[0] : xi[0];
  const auto y = yi[0] < 0 ? (~(yi ^ signbit))[0] : yi[0];
  return x <= y;
#endif
}

template <typename T> struct Test
{
  // numeric_limits<__float128> produces bogus values
  static constexpr auto qnan = std::numeric_limits<double>::quiet_NaN();
  static constexpr auto snan = std::numeric_limits<double>::signaling_NaN();
  static constexpr auto inf  = std::numeric_limits<double>::infinity();
 
  static_assert( totalOrder<T>( 0.0,  0.0));
  static_assert( totalOrder<T>( 1.0,  1.0));
  static_assert( totalOrder<T>(-1.0, -1.0));
  static_assert( totalOrder<T>(-0.0, -0.0));
  static_assert( totalOrder<T>(qnan, qnan));
  static_assert( totalOrder<T>(snan, snan));
  static_assert( totalOrder<T>( inf,  inf));
  static_assert( totalOrder<T>(-0.0,  0.0));
  static_assert( totalOrder<T>(-0.0, +1.0));
  static_assert( totalOrder<T>(-qnan, +0.0));
  static_assert( totalOrder<T>(-qnan, -inf));
  static_assert( totalOrder<T>(-3., -2.));
  static_assert( totalOrder<T>(2., 3.));
  static_assert( totalOrder<T>(2., inf));
  static_assert( totalOrder<T>(0., qnan));
  static_assert( totalOrder<T>(0., 2.));
  static_assert( totalOrder<T>(2., qnan));
  static_assert( totalOrder<T>(inf, qnan));
  static_assert( totalOrder<T>(snan, qnan));
  static_assert(!totalOrder<T>(snan, -qnan));
  static_assert(!totalOrder<T>(0.0, -0.0));
  static_assert(!totalOrder<T>(0.0, -qnan));
  static_assert(!totalOrder<T>(0.0, -1.0));
  static_assert(!totalOrder<T>(1.0, -1.0));
  static_assert(!totalOrder<T>(qnan, -inf));
};

Test<float> a;
Test<double> b;
Test<long double> c;
Test<__float128> d;

--
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──────────────────────────────────────────────────────────────────────────



Reply | Threaded
Open this post in threaded view
|

Re: IEC559 totalOrder

Jonathan Wakely-3
On 09/11/19 08:40 +0100, Matthias Kretz wrote:

>Here's an IEC559 totalOrder implementation that works with clang and GCC, for
>float, double, long double, and __float128 (at least the non-clang branch
>should work with float16, too):
>https://godbolt.org/z/QzGWdB
>
>Note that I simplified the order reversal of negative values slightly, and
>producing the bitmask for the sign is now "simpler".
>
>As Jakub noted on IRC, 32-bit x86's long double is a problem. Since it has
>sizeof == 12 there's no *constexpr* way to cast it to an integer.
>std::bit_cast also wouldn't help. It seems there's no way around a builtin for
>this function. (Besides that, I ignored the presence of unnormal values in
>long double - you still get a totalOrder, but I'm not sure IEC559 requests a
>different behavior.)
This is indistinguishable from magic, thanks!

I've reworked your totalOrder as follows:

  template<floating_point _Fp> requires is_iec559<_Fp>
    constexpr strong_ordering
    __fp_strong_order(_Fp __e, _Fp __f)
    {
      struct _Selector : __make_unsigned_selector_base
      {
        using _Ints = _List<signed char, signed short, signed int,
              signed long, signed long long
#ifdef __GLIBCXX_TYPE_INT_N_0
                , signed __GLIBCXX_TYPE_INT_N_0
#endif
                >;
        using type = typename __select<sizeof(_Fp), _Ints>::__type;
      };
      using _Si = _Selector::type;

      auto __bit_cast = [](_Fp __x) -> _Si {
#ifdef __clang__
          return __builtin_bit_cast(_Si, __x);
#else
          using _Fp_v [[__gnu__::__vector_size__(sizeof(_Fp))]] = _Fp;
          using _Si_v [[__gnu__::__vector_size__(sizeof(_Fp))]] = _Si;
          return reinterpret_cast<_Si_v>(_Fp_v{__x})[0];
#endif
      };

      constexpr int __unused_bits = sizeof(_Fp) <= 8
        ? 0
        : (sizeof(_Fp) - 8) * __CHAR_BIT__ - 1
          - __builtin_ctzll(_Si(__bit_cast(_Fp(-0.)) >> 8 * __CHAR_BIT__));
      constexpr _Si __signbit = __bit_cast(_Fp(-0.)) << __unused_bits;
      const _Si __ei = __bit_cast(__e) << __unused_bits;
      const _Si __fi = __bit_cast(__f) << __unused_bits;
      const _Si __e_cmp = __ei < 0 ? (~_Si(__ei ^ __signbit)) : __ei;
      const _Si __f_cmp = __fi < 0 ? (~_Si(__fi ^ __signbit)) : __fi;
      return __e_cmp <=> __f_cmp;
    }

And then for the testsuite we can define totalOrder as:

  template<typename T>
    constexpr bool totalOrder(T e, T f)
    {
      return is_lteq(std::strong_order(e, f));
    }

I *think* I've got casts in the right places in __fp_strong_order to
avoid problems with integer promotions when sizeof(_Fp) < sizeof(int),
so that it's safe to immediately do [0] on the vector type to convert
back to a scalar.

Does that look OK? Have I broken it somehow?

I've also defined this, for the non-IEEE 60559 case:

    template<typename _Tp>
      concept is_iec559 = floating_point<_Tp>
        && numeric_limits<_Tp>::is_iec559
        // Exclude x86 12-byte long double:
        && ((sizeof(_Tp) & (sizeof(_Tp) - 1)) == 0);

    template<floating_point _Fp> requires (!is_iec559<_Fp>)
      constexpr strong_ordering
      __fp_strong_order(_Fp __e, _Fp __f)
      {
        auto __pord = __e <=> __f;
        if (is_lt(__pord))
          return strong_ordering::less;
        else if (is_gt(__pord))
          return strong_ordering::greater;
        else if (is_eq(__pord))
          return strong_ordering::equal;
        else
          return strong_ordering::equivalent;
      }

A patch against current trunk is attached, although it fails some
__float128 tests, because is_iec559 is false for that type:

.../testsuite/18_support/comparisons/algorithms/strong_order.cc: In instantiation of 'struct Test<__float128>':
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:128:   required from here
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:107: error: non-constant condition for static assertion
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:107:   in 'constexpr' expansion of 'totalOrder<__float128>(((__float128)(- Test<__float128>::qnan)), ((__float128)(+0.0)))'
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:88:   in 'constexpr' expansion of 'std::__cmp_alg::strong_order.std::__cmp_cust::_Strong_order::operator()<__float128&, __float128&>(e, f)'
.../libsupc++/compare:731:   in 'constexpr' expansion of 'std::__cmp_cust::__fp_strong_order<__float128>(__e, __f)'
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:107: error: '(-QNaNf128 < 0.0f128)' is not a constant expression
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:108: error: non-constant condition for static assertion
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:108:   in 'constexpr' expansion of 'totalOrder<__float128>(((__float128)(- Test<__float128>::qnan)), ((__float128)(- Test<__float128>::inf)))'
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:88:   in 'constexpr' expansion of 'std::__cmp_alg::strong_order.std::__cmp_cust::_Strong_order::operator()<__float128&, __float128&>(e, f)'
.../libsupc++/compare:731:   in 'constexpr' expansion of 'std::__cmp_cust::__fp_strong_order<__float128>(__e, __f)'
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:108: error: '(-QNaNf128 < -Inff128)' is not a constant expression
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:117: error: static assertion failed
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:118: error: static assertion failed
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:119: error: static assertion failed
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:122: error: non-constant condition for static assertion
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:122:   in 'constexpr' expansion of 'totalOrder<__float128>(((__float128)((double)Test<__float128>::qnan)), ((__float128)(- Test<__float128>::inf)))'
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:88:   in 'constexpr' expansion of 'std::__cmp_alg::strong_order.std::__cmp_cust::_Strong_order::operator()<__float128&, __float128&>(e, f)'
.../libsupc++/compare:731:   in 'constexpr' expansion of 'std::__cmp_cust::__fp_strong_order<__float128>(__e, __f)'
.../testsuite/18_support/comparisons/algorithms/strong_order.cc:122: error: '(+QNaNf128 < -Inff128)' is not a constant expression

The same cases fail for long double on 32-bit x86, because that uses
the !is_iec559 overload as well, which needs a built-in as you said.

I think we still need a better implementation of the !is_iec559
overload for the __float128 case. Alternatively, if the built-in
handles all FP types (including non-IEEE ones) then we don't need a
second overload anyway.


patch.txt (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: IEC559 totalOrder

Jonathan Wakely-3
On 13/11/19 22:03 +0000, Jonathan Wakely wrote:

>On 09/11/19 08:40 +0100, Matthias Kretz wrote:
>>Here's an IEC559 totalOrder implementation that works with clang and GCC, for
>>float, double, long double, and __float128 (at least the non-clang branch
>>should work with float16, too):
>>https://godbolt.org/z/QzGWdB
>>
>>Note that I simplified the order reversal of negative values slightly, and
>>producing the bitmask for the sign is now "simpler".
>>
>>As Jakub noted on IRC, 32-bit x86's long double is a problem. Since it has
>>sizeof == 12 there's no *constexpr* way to cast it to an integer.
>>std::bit_cast also wouldn't help. It seems there's no way around a builtin for
>>this function. (Besides that, I ignored the presence of unnormal values in
>>long double - you still get a totalOrder, but I'm not sure IEC559 requests a
>>different behavior.)
>
>This is indistinguishable from magic, thanks!
>
>I've reworked your totalOrder as follows:
>
> template<floating_point _Fp> requires is_iec559<_Fp>
>   constexpr strong_ordering
>   __fp_strong_order(_Fp __e, _Fp __f)
>   {
>     struct _Selector : __make_unsigned_selector_base
>     {
>       using _Ints = _List<signed char, signed short, signed int,
>             signed long, signed long long
>#ifdef __GLIBCXX_TYPE_INT_N_0
>               , signed __GLIBCXX_TYPE_INT_N_0
>#endif
>               >;
>       using type = typename __select<sizeof(_Fp), _Ints>::__type;
>     };
>     using _Si = _Selector::type;
>
>     auto __bit_cast = [](_Fp __x) -> _Si {
>#ifdef __clang__
>         return __builtin_bit_cast(_Si, __x);
>#else
>         using _Fp_v [[__gnu__::__vector_size__(sizeof(_Fp))]] = _Fp;
>         using _Si_v [[__gnu__::__vector_size__(sizeof(_Fp))]] = _Si;
>         return reinterpret_cast<_Si_v>(_Fp_v{__x})[0];
>#endif
>     };
>
>     constexpr int __unused_bits = sizeof(_Fp) <= 8
>       ? 0
>       : (sizeof(_Fp) - 8) * __CHAR_BIT__ - 1
>         - __builtin_ctzll(_Si(__bit_cast(_Fp(-0.)) >> 8 * __CHAR_BIT__));
>     constexpr _Si __signbit = __bit_cast(_Fp(-0.)) << __unused_bits;
>     const _Si __ei = __bit_cast(__e) << __unused_bits;
>     const _Si __fi = __bit_cast(__f) << __unused_bits;
>     const _Si __e_cmp = __ei < 0 ? (~_Si(__ei ^ __signbit)) : __ei;
>     const _Si __f_cmp = __fi < 0 ? (~_Si(__fi ^ __signbit)) : __fi;
>     return __e_cmp <=> __f_cmp;
>   }
>
>And then for the testsuite we can define totalOrder as:
>
> template<typename T>
>   constexpr bool totalOrder(T e, T f)
>   {
>     return is_lteq(std::strong_order(e, f));
>   }
>
>I *think* I've got casts in the right places in __fp_strong_order to
>avoid problems with integer promotions when sizeof(_Fp) < sizeof(int),
>so that it's safe to immediately do [0] on the vector type to convert
>back to a scalar.
>
>Does that look OK? Have I broken it somehow?
>
>I've also defined this, for the non-IEEE 60559 case:
>
>   template<typename _Tp>
>     concept is_iec559 = floating_point<_Tp>
> && numeric_limits<_Tp>::is_iec559
> // Exclude x86 12-byte long double:
> && ((sizeof(_Tp) & (sizeof(_Tp) - 1)) == 0);
>
>   template<floating_point _Fp> requires (!is_iec559<_Fp>)
>     constexpr strong_ordering
>     __fp_strong_order(_Fp __e, _Fp __f)
>     {
> auto __pord = __e <=> __f;
> if (is_lt(__pord))
>  return strong_ordering::less;
> else if (is_gt(__pord))
>  return strong_ordering::greater;
> else if (is_eq(__pord))
>  return strong_ordering::equal;
> else
>  return strong_ordering::equivalent;
>     }
>
>A patch against current trunk is attached, although it fails some
>__float128 tests, because is_iec559 is false for that type:
>
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc: In instantiation of 'struct Test<__float128>':
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:128:   required from here
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:107: error: non-constant condition for static assertion
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:107:   in 'constexpr' expansion of 'totalOrder<__float128>(((__float128)(- Test<__float128>::qnan)), ((__float128)(+0.0)))'
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:88:   in 'constexpr' expansion of 'std::__cmp_alg::strong_order.std::__cmp_cust::_Strong_order::operator()<__float128&, __float128&>(e, f)'
>.../libsupc++/compare:731:   in 'constexpr' expansion of 'std::__cmp_cust::__fp_strong_order<__float128>(__e, __f)'
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:107: error: '(-QNaNf128 < 0.0f128)' is not a constant expression
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:108: error: non-constant condition for static assertion
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:108:   in 'constexpr' expansion of 'totalOrder<__float128>(((__float128)(- Test<__float128>::qnan)), ((__float128)(- Test<__float128>::inf)))'
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:88:   in 'constexpr' expansion of 'std::__cmp_alg::strong_order.std::__cmp_cust::_Strong_order::operator()<__float128&, __float128&>(e, f)'
>.../libsupc++/compare:731:   in 'constexpr' expansion of 'std::__cmp_cust::__fp_strong_order<__float128>(__e, __f)'
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:108: error: '(-QNaNf128 < -Inff128)' is not a constant expression
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:117: error: static assertion failed
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:118: error: static assertion failed
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:119: error: static assertion failed
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:122: error: non-constant condition for static assertion
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:122:   in 'constexpr' expansion of 'totalOrder<__float128>(((__float128)((double)Test<__float128>::qnan)), ((__float128)(- Test<__float128>::inf)))'
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:88:   in 'constexpr' expansion of 'std::__cmp_alg::strong_order.std::__cmp_cust::_Strong_order::operator()<__float128&, __float128&>(e, f)'
>.../libsupc++/compare:731:   in 'constexpr' expansion of 'std::__cmp_cust::__fp_strong_order<__float128>(__e, __f)'
>.../testsuite/18_support/comparisons/algorithms/strong_order.cc:122: error: '(+QNaNf128 < -Inff128)' is not a constant expression
>
>The same cases fail for long double on 32-bit x86, because that uses
>the !is_iec559 overload as well, which needs a built-in as you said.
>
>I think we still need a better implementation of the !is_iec559
>overload for the __float128 case. Alternatively, if the built-in
>handles all FP types (including non-IEEE ones) then we don't need a
>second overload anyway.
Here's another version of that patch that doesn't include unrelated
edits to __fp_weak_order, so is a bit easier to read.



patch.txt (8K) Download Attachment