Opened 7 years ago

Closed 5 years ago

#6790 closed defect (wontfix)

CPL_IS_DOUBLE_A_INT cleanup needed

Reported by: Kurt Schwehr Owned by: Even Rouault
Priority: normal Milestone: closed_because_of_github_migration
Component: default Version: unspecified
Severity: normal Keywords: asan UndefinedBehavior
Cc:

Description

CPL_IS_DOUBLE_A_INT is a bit wonky as added in r29955 with undefined behavior waiting to cause issues. Why not just do the defined thing all the time? GDAL is getting built on non-x86 platforms (arm, power, sparc). A couple extra cycles to be defined behavior seems to be worth it.

e.g.

#if defined(MAKE_SANITIZE_HAPPY) || !(defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64))
/*! @cond Doxygen_Suppress */
#define CPL_CPU_REQUIRES_ALIGNED_ACCESS
/*! @endcond */
/** Returns whether a double fits on a int */
#define CPL_IS_DOUBLE_A_INT(d)  ( (d) >= INT_MIN && (d) <= INT_MAX && (double)(int)(d) == (d) )
#else
/* This is technically unspecified behaviour if the double is out of range, but works OK on x86 */
/** Returns whether a double fits on a int */
#define CPL_IS_DOUBLE_A_INT(d)  ( (double)(int)(d) == (d) )
#endif

Could we not just have this (with A -> AN grammar fix)? I tried to be as obvious as possible, typed and let the optimizer make it go fast rather than return d >= INT_MIN && d <= INT_MAX && (double)(int)d == d;

#ifdef __cplusplus

#include <limits>  // Move to the top of the file.

inline
bool CPLIsDoubleAnInt(double d)
{
  if (d > std::numeric_limits<int>::max()) return false;
  if (d < std::numeric_limits<int>::min()) return false;
  return d == static_cast<double>(static_cast<int>(d));
}
#endif  // __cplusplus

Or better yet... use modf as used here:

http://stackoverflow.com/questions/1521607/check-double-variable-if-it-contains-an-integer-and-not-floating-point

#include <cmath>  // Move to the top of the file.
#include <limits>  // Move to the top of the file.

inline
bool CPL_IS_DOUBLE_A_INT(double d)
{
  if (d > std::numeric_limits<int>::max()) return false;
  if (d < std::numeric_limits<int>::min()) return false;
  double dfTmp = 0.0;
  return modf(d, &dfTmp) == 0.0;
}

Change History (7)

comment:1 by Even Rouault, 7 years ago

Could you try to benchmark the performance effect of your proposed change on exporting to WKT a huge geometry (I think this must be the most demanding user of this macro) ?

comment:2 by Kurt Schwehr, 7 years ago

I did a quick look with micorbenchmark.

TL/DR: the unsafe macro and the check and cast in an inline are about the same speed when optimized. So I think we should go with this and drop the use of undefined behavior:

inline bool CPLIsDoubleAnInt2(double d) {
  if (d > std::numeric_limits<int>::max()) return false;
  if (d < std::numeric_limits<int>::min()) return false;
  return d == static_cast<double>(static_cast<int>(d));
}

This was on my desktop with Xeon(R) CPU E5-1650 0 @ 3.20GHz that's about 2.5 years old with a fairly newish clang bulid env for opt and gcc for debug.

Using an inline with limit check and cast is pretty fast. Both about 2.3 ns. In most of my tests, the cast inline function was actually faster than the unsafe macro. With gcc debug mode, it's all pretty slow. Adding const and ref (&) to the cast case didn't help on performance in the optimized case and hurt in the debug case.

It looks like a difference between 2 ns and 25 ns per check for modf. So, for 1G points it'll add about ~20 seconds. So modf seems like a bad idea.

Benchmark              Time(ns)        CPU(ns)     Iterations
-------------------------------------------------------------
BM_MacroUnsafe                2.26           2.26   310165815  
BM_Macro                     14.6           14.7     48352200  
BM_MacroUnsafeInline          2.30           2.30   305550706  
BM_MacroInline               14.7           14.7     44510538  
BM_Cast                       2.29           2.29   310908966  
BM_Modf                      40.2           40.2     17409677  
BM_ModfConstRef              25.9           25.9     26201663

DEBUG: Benchmark              Time(ns)        CPU(ns)     Iterations
--------------------------------------------------------------------
DEBUG: BM_MacroUnsafe               96.3           96.3      7073333  
DEBUG: BM_Macro                    115            115        6342469  
DEBUG: BM_MacroUnsafeInline        118            118        6082977  
DEBUG: BM_MacroInline              138            138        4955087  
DEBUG: BM_Cast                     177            177        3886193  
DEBUG: BM_Modf                     204            204        3461143  
DEBUG: BM_ModfConstRef             158            158        4573156  

Benchmark        Time(ns)        CPU(ns)     Iterations
-------------------------------------------------------
BM_MacroUnsafe          2.37           2.37   297256205  
BM_Cast                 2.28           2.28   308528102  

The test code is unexciting...

#define CPL_IS_DOUBLE_AN_INT_UNSAFE(d) ((double)(int)(d) == (d))
#define CPL_IS_DOUBLE_AN_INT(d) \
  ((d) >= INT_MIN && (d) <= INT_MAX && (double)(int)(d) == (d))

inline bool CPLIsDoubleAnInt0(double d) {
  return CPL_IS_DOUBLE_AN_INT_UNSAFE(d);
}

inline bool CPLIsDoubleAnInt1(double d) { return CPL_IS_DOUBLE_AN_INT(d); }

inline bool CPLIsDoubleAnInt2(double d) {
  if (d > std::numeric_limits<int>::max()) return false;
  if (d < std::numeric_limits<int>::min()) return false;
  return d == static_cast<double>(static_cast<int>(d));
}

inline bool CPLIsDoubleAnInt3(double d) {
  if (d > std::numeric_limits<int>::max()) return false;
  if (d < std::numeric_limits<int>::min()) return false;
  double dfTmp = 0.0;
  return modf(d, &dfTmp) == 0.0;
}

inline bool CPLIsDoubleAnInt4(const double& d) {
  if (d > std::numeric_limits<int>::max()) return false;
  if (d < std::numeric_limits<int>::min()) return false;
  double dfTmp = 0.0;
  return modf(d, &dfTmp) == 0.0;
}

const auto kValuesYes = {-1.0, 0.0, 1.0, -1000001.0, 1000001.0};
const auto kValuesNo = {-1.1,  1e-19,  1.00001,   -1000001.999, 1000001.999,
                        1e100, -1e100, 1.234e300, -1.234e300};

static void BM_MacroUnsafe(benchmark::State& state) {
  while (state.KeepRunning()) {
    for (const auto& v : kValuesYes) {
      CHECK(CPL_IS_DOUBLE_AN_INT_UNSAFE(v)) << v;
    }
    for (const auto& v : kValuesNo) {
      CHECK(!CPL_IS_DOUBLE_AN_INT_UNSAFE(v)) << v;
    }
  }
}
BENCHMARK(BM_MacroUnsafe);

static void BM_Macro(benchmark::State& state) {
  while (state.KeepRunning()) {
    for (const auto& v : kValuesYes) {
      CHECK(CPL_IS_DOUBLE_AN_INT(v)) << v;
    }
    for (const auto& v : kValuesNo) {
      CHECK(!CPL_IS_DOUBLE_AN_INT(v)) << v;
    }
  }
}
BENCHMARK(BM_Macro);
// ...

comment:3 by Even Rouault, 7 years ago

ok, so the unsafe way is ~2.3 ns and the safe way ~14.6 ns ? That doesn't really tell the effect in a real world case like OGRMakeWktCoordinate(), but their relative cost compared to all the rest of the work done in the function might be neglectable.

OK just go ahead.

comment:4 by Kurt Schwehr, 7 years ago

Sorry, I should have lead with the boiled down summary. The fastest safe method is about the same speed as the unsafe method in this test. But yes, this is an artificial test.

Benchmark        Time(ns)        CPU(ns)     Iterations
-------------------------------------------------------
BM_MacroUnsafe          2.37           2.37   297256205  
BM_Cast                 2.28           2.28   308528102 

comment:5 by Even Rouault, 7 years ago

Ah the difference is really neglectable. Let's adopt the safe way unconditionaly

comment:6 by Kurt Schwehr, 7 years ago

r37201 I wasn't able to keep the inline in the cpl_port.h header. grib has some weirdness that I didn't figure out.

comment:7 by Even Rouault, 5 years ago

Milestone: closed_because_of_github_migration
Resolution: wontfix
Status: newclosed

This ticket has been automatically closed because Trac is no longer used for GDAL bug tracking, since the project has migrated to GitHub. If you believe this ticket is still valid, you may file it to https://github.com/OSGeo/gdal/issues if it is not already reported there.

Note: See TracTickets for help on using tickets.