Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#4543 closed enhancement (fixed)

Use Ryū to output floating point numbers

Reported by: Algunenano
Owned by: Algunenano
Priority: medium
Milestone: PostGIS 3.1.0
Component: liblwgeom
Version: master
Keywords:
Cc:

Description

PG12 introduced an implementation of Ryū (https://dl.acm.org/citation.cfm?id=3192369) to speed up the conversion of floating-point numbers to strings, as it can be 10x faster than a straight sprintf(str, "%f", double).

It seemed to me that multiple PostGIS functions could benefit from a similar improvement, so I've tested a nasty hack that uses Postgres' Ryū implementation for lwprint_double (this is a hack, so it doesn't take into account the desired precision or the space left in the buffer):

Before:

explain analyze Select ST_AsText(the_geom) from benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020;
                                                                            QUERY PLAN                                                                            
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020  (cost=0.00..33.63 rows=13 width=32) (actual time=547.194..5313.276 rows=13 loops=1)
 Planning Time: 0.144 ms
 Execution Time: 5313.322 ms
(3 rows)

After:

 Seq Scan on benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020  (cost=0.00..33.63 rows=13 width=32) (actual time=64.062..478.339 rows=13 loops=1)
 Planning Time: 0.628 ms
 Execution Time: 478.373 ms
(3 rows)

The hack calls a PG function from liblwgeom, so it's a no-go from the start, but it's useful for comparing performance and seeing what kind of improvement we could expect if we decided to go that way:

diff --git a/liblwgeom/lwprint.c b/liblwgeom/lwprint.c
index af56c4c27..1a0dc886d 100644
--- a/liblwgeom/lwprint.c
+++ b/liblwgeom/lwprint.c
@@ -486,27 +486,13 @@ trim_trailing_zeros(char* str)
  * truncated and misses a terminating NULL.
  *
  */
+/* This is also provided by snprintf.c */
+extern int double_to_shortest_decimal_bufn(double f, char *result);
+
 int
 lwprint_double(double d, int maxdd, char* buf, size_t bufsize)
 {
-  double ad = fabs(d);
-  int ndd;
-  int length = 0;
-  if (ad <= FP_TOLERANCE)
-  {
-      d = 0;
-      ad = 0;
-  }
-  if (ad < OUT_MAX_DOUBLE)
-  {
-      ndd = ad < 1 ? 0 : floor(log10(ad)) + 1; /* non-decimal digits */
-      if (maxdd > (OUT_MAX_DOUBLE_PRECISION - ndd)) maxdd -= ndd;
-      length = snprintf(buf, bufsize, "%.*f", maxdd, d);
-  }
-  else
-  {
-      length = snprintf(buf, bufsize, "%g", d);
-  }
-  trim_trailing_zeros(buf);
-  return length;
+    int b = double_to_shortest_decimal_bufn(d, buf);
+    buf[b] = 0;
+    return b;
 }
\ No newline at end of file
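
As noted above, the hack ignores the space left in the buffer. A minimal sketch of how a bounded wrapper could look is shown below. The names format_double_unbounded and print_double_bounded are hypothetical, and the formatter is only a stand-in: it uses "%.17g", which round-trips but is not the shortest representation that Ryū would produce.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a shortest-representation formatter such as Ryu.
 * "%.17g" needs at most 24 characters plus the terminator. */
static int
format_double_unbounded(double d, char *out)
{
	return sprintf(out, "%.17g", d);
}

/* Format d into buf without ever writing more than bufsize bytes.
 * Returns the number of characters stored, excluding the terminator. */
static int
print_double_bounded(double d, char *buf, size_t bufsize)
{
	char scratch[32]; /* large enough for any "%.17g" output */
	int len;

	assert(buf != NULL);
	if (bufsize == 0)
		return 0;

	len = format_double_unbounded(d, scratch);
	if ((size_t)len >= bufsize)
		len = (int)(bufsize - 1); /* truncate rather than overflow */

	memcpy(buf, scratch, (size_t)len);
	buf[len] = '\0';
	return len;
}
```

Truncating a number silently changes its value, so a real implementation would more likely reduce the precision or report an error. The sketch only shows that the write can be kept inside the caller's buffer.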

Change History (8)

comment:1 by Algunenano, 5 years ago

I've made another integration, a simpler hack that uses the upstream Ryu printf implementation. I get far fewer changes (there are some differences in the exponent notation that I still need to look at) and good performance:

Before:

cartodb_dev_user_3e4a6fc6-4137-4c59-bc63-066f80efb90e_db=# explain analyze Select ST_AsText(the_geom) from benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020;
                                                                            QUERY PLAN                                                                            
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020  (cost=0.00..33.63 rows=13 width=32) (actual time=529.342..5680.107 rows=13 loops=1)
 Planning Time: 0.039 ms
 Execution Time: 5680.130 ms
(3 rows)

cartodb_dev_user_3e4a6fc6-4137-4c59-bc63-066f80efb90e_db=# explain analyze Select ST_AsText(the_geom) from benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020;
                                                                            QUERY PLAN                                                                            
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020  (cost=0.00..33.63 rows=13 width=32) (actual time=574.798..5470.276 rows=13 loops=1)
 Planning Time: 0.055 ms
 Execution Time: 5470.301 ms
(3 rows)

cartodb_dev_user_3e4a6fc6-4137-4c59-bc63-066f80efb90e_db=# explain analyze Select ST_AsText(the_geom) from benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020;
                                                                            QUERY PLAN                                                                            
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020  (cost=0.00..33.63 rows=13 width=32) (actual time=533.664..5223.448 rows=13 loops=1)
 Planning Time: 0.078 ms
 Execution Time: 5223.478 ms
(3 rows)

After:

cartodb_dev_user_3e4a6fc6-4137-4c59-bc63-066f80efb90e_db=# explain analyze Select ST_AsText(the_geom) from benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020;
                                                                           QUERY PLAN                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020  (cost=0.00..33.63 rows=13 width=32) (actual time=91.176..846.130 rows=13 loops=1)
 Planning Time: 0.077 ms
 Execution Time: 846.159 ms
(3 rows)

cartodb_dev_user_3e4a6fc6-4137-4c59-bc63-066f80efb90e_db=# explain analyze Select ST_AsText(the_geom) from benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020;
                                                                           QUERY PLAN                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020  (cost=0.00..33.63 rows=13 width=32) (actual time=82.195..840.579 rows=13 loops=1)
 Planning Time: 0.052 ms
 Execution Time: 840.601 ms
(3 rows)

cartodb_dev_user_3e4a6fc6-4137-4c59-bc63-066f80efb90e_db=# explain analyze Select ST_AsText(the_geom) from benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020;
                                                                           QUERY PLAN                                                                           
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020  (cost=0.00..33.63 rows=13 width=32) (actual time=87.366..845.805 rows=13 loops=1)
 Planning Time: 0.068 ms
 Execution Time: 845.830 ms
(3 rows)

This new version is slower than the original hack but I've yet to investigate why. One possibility is that the original hack never used the exponential output, but that isn't an option for us AFAIK, right?

comment:2 by Algunenano, 5 years ago

Some updates:

  • I have ryu now integrated inside postgis (under deps) so it builds and links without the need of anything external.
  • I've found some inconsistencies between what lwprint_double says it does with maxdd and what it actually does. Changing this breaks some tests but I think it's ok.
  • Ryu's scientific notation output doesn't trim extra zeros, so it might output 1.00000000e+100 instead of 1e+100. I'm not sure whether I want to try to fix it or just keep using snprintf (as it is now), since numbers that big are rare in GIS.
  • I've started working on improving other parts of the print stack to keep improving the performance. I'm not sure if I'll continue down this path or move into fixing the broken tests (either by accepting the output or by changing the code).
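
For the scientific notation point above, one possible fix is to trim the mantissa after formatting. The following is a minimal sketch and not the code from the PR; the function name trim_mantissa_zeros is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Remove trailing zeros from the mantissa of a string in scientific
 * notation, in place: "1.00000000e+100" becomes "1e+100" and
 * "1.25000e-7" becomes "1.25e-7". Strings without an 'e' or without a
 * decimal point before it are left untouched. Returns the new length. */
static size_t
trim_mantissa_zeros(char *str)
{
	char *exp, *dot, *end;

	assert(str != NULL);
	exp = strchr(str, 'e');
	dot = strchr(str, '.');
	if (exp == NULL || dot == NULL || dot > exp)
		return strlen(str);

	end = exp; /* one past the last mantissa character */
	while (end > dot + 1 && end[-1] == '0')
		end--;
	if (end == dot + 1)
		end = dot; /* only zeros after the point: drop the point too */

	memmove(end, exp, strlen(exp) + 1); /* shift exponent, including NUL */
	return strlen(str);
}
```

The extra pass costs one scan of a short string, which should be negligible next to the formatting itself, but that has not been measured here.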

Comparison of the current status:

  • ST_AsText with big geometries:
    explain analyze Select ST_AsText(the_geom) from benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020;
    
    • Before: 5166.606 / 5220.705 / 5218.330
    • After: 715.381 / 713.122 / 713.993
  • ST_AsGeoJson with big geometries:
    explain analyze Select ST_AsGeoJson(the_geom) from benchmark_4c7214d90a79aa6760367a084a4d4a2f61fbe1c6cc4f7f9e76020;
    
    • Before: 4738.816 / 4729.487 / 4810.050
    • After: 1057.564 / 1048.442 / 1062.282
  • ST_AsText with points (3 + 1 workers):
    explain analyze Select ST_AsText(the_geom) from yellow_tripdata_2015_07_1m;
    
    • Before: 610.948 / 606.455 / 602.095
    • After: 274.195 / 273.759 / 279.207
  • ST_AsGeoJson with points (3 + 1 workers):
    explain analyze Select ST_AsGeoJson(the_geom) from yellow_tripdata_2015_07_1m;
    
    • Before: 581.969 / 580.237 / 582.805
    • After: 320.685 / 316.013 / 320.374
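
To turn the timings above into speedup factors, the three runs of each case can be averaged and divided. A small sketch of that arithmetic follows; the helper names mean_ms and speedup are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n timings, in milliseconds. */
static double
mean_ms(const double *runs, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(runs != NULL && n > 0);
	for (i = 0; i < n; i++)
		sum += runs[i];
	return sum / (double)n;
}

/* Speedup factor: mean time before divided by mean time after. */
static double
speedup(const double *before, const double *after, size_t n)
{
	return mean_ms(before, n) / mean_ms(after, n);
}
```

Applied to the numbers listed above, this gives roughly 7.3x for ST_AsText and 4.5x for ST_AsGeoJson with big geometries, and roughly 2.2x and 1.8x for the point tables.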

comment:3 by Algunenano, 5 years ago

Working PR with minimal output changes: https://github.com/postgis/postgis/pull/523

comment:4 by mwtoews, 5 years ago

Thanks for introducing Ryu; I hadn't seen it before. It appears to be a possible successor to Google's double-conversion.

Another good resource on the same topic is from Ryan Juckett, with an implementation called Dragon4, which is now used by numpy to format floats to positional or scientific notation strings. It's a shame there is no direct comparison between Dragon4 and Ryu, although I will say the publication for Ryu presents itself well.

comment:5 by Raúl Marín <git@…>, 5 years ago

Resolution: fixed
Status: assigned → closed

In 04f93f1/git:

Introduce ryu to print doubles

This massively (x10) speeds up lwprint_double and will be followed up
by changes to improve other output functions that rely on it.

There have been some complications in adapting the function itself since
the output used by Postgis is by no means standard, choosing the number
of decimal digits based on the integer digits, user input and size
of the input buffer (although it should always be big enough)

Closes #4543
Closes https://github.com/postgis/postgis/pull/523
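
The commit message mentions that the number of decimal digits depends on the integer digits and on user input. A rough sketch of one such rule is shown below. It is not the rule implemented in the commit: the function names are hypothetical and the limit on significant digits is an assumed parameter.

```c
#include <assert.h>

/* Count the digits before the decimal point of a non-negative value.
 * The loop is capped so that infinity cannot spin forever. */
static int
integer_digits(double ad)
{
	int n = 0;

	while (ad >= 1.0 && n < 309)
	{
		ad /= 10.0;
		n++;
	}
	return n;
}

/* Clamp the requested number of decimal digits so that integer digits plus
 * decimal digits never exceed max_significant (an assumed limit). */
static int
clamp_decimal_digits(double d, int requested, int max_significant)
{
	double ad = d < 0 ? -d : d;
	int available;

	assert(max_significant >= 0);
	available = max_significant - integer_digits(ad);
	if (available < 0)
		available = 0;
	if (requested < 0)
		requested = 0;
	return requested < available ? requested : available;
}
```

With an assumed limit of 15 significant digits, 123.456 has 3 integer digits and leaves room for at most 12 decimals, while a value like 1e20 leaves room for none.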

comment:6 by Algunenano, 5 years ago

Ryu introduced in 04f93f1, I'll add other PR's / commits with other improvements (#4614, #4615…)

comment:7 by Raúl Marín <git@…>, 5 years ago

In 2ee8e58e/git:

Ryu Makefile: Include shell variable

Libtool uses the default shell, and for dronie this seems to be
different than bash so it fails to understand the syntax.

References #4543

comment:8 by Raúl Marín <git@…>, 5 years ago

In 0752db5/git:

lwprint: Avoid using snprintf

It can be simplified with simple pointer writes

Closes https://github.com/postgis/postgis/pull/539
References #4543
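
As an illustration of what "simple pointer writes" can mean in general, the sketch below writes an unsigned integer into a buffer one character at a time instead of calling snprintf. It is not the code from the commit; write_uint is a hypothetical helper and it assumes a 64-bit unsigned long long.

```c
#include <assert.h>
#include <stddef.h>

/* Write the decimal digits of value at dst and NUL-terminate the result.
 * dst must have room for 21 bytes (20 digits plus the terminator, assuming
 * a 64-bit value). Returns the number of digits written. */
static size_t
write_uint(unsigned long long value, char *dst)
{
	char tmp[20];
	size_t n = 0, i;

	assert(dst != NULL);
	do
	{
		tmp[n++] = (char)('0' + value % 10); /* least significant first */
		value /= 10;
	} while (value != 0);

	for (i = 0; i < n; i++)
		dst[i] = tmp[n - 1 - i]; /* reverse into the destination */
	dst[n] = '\0';
	return n;
}
```

Skipping the format-string parsing is the usual reason this kind of code is faster than snprintf for short outputs.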
