Opened 8 years ago
Closed 8 years ago
#3742 closed defect (fixed)
mvt regress failure on debbie and winnie
Reported by: | robe | Owned by: | Björn Harrtell |
---|---|---|---|
Priority: | blocker | Milestone: | PostGIS 2.4.0 |
Component: | postgis | Version: | master |
Keywords: | Cc: |
Description
Both debbie and winnie are showing this regress error on mvt regress
mvt .. failed (diff expected obtained: /projects/postgis/tmp/2.4.0dev_pg9.5_geos3.6.1_gdal2.1.3w64/test_120_diff) ----------------------------------------------------------------------------- --- mvt_expected 2017-04-21 13:42:19 -0400 +++ /projects/postgis/tmp/2.4.0dev_pg9.5_geos3.6.1_gdal2.1.3w64/test_120_out 2017-04-21 15:56:31 -0400 @@ -25,8 +25,8 @@ TA6|GisKBHRlc3QSDhIEAAABARgBIgQJMt4/GgJjMRoCYzIiAigBIgIwASiAIHgC TA7|Gk4KBHRlc3QSDBICAAAYASIECTTcPxIMEgIAARgBIgQJMt4/EgwSAgABGAEiBAk03D8aAmMxIgsK CW90aGVydGVzdCIGCgR0ZXN0KIAgeAI= -TA8|GkUKBHRlc3QSDBICAAAYASIECTLePxIMEgIAARgBIgQJNNw/EgwSAgACGAEiBAk03D8aAmMxIgIo -ASICKAEiAigCKIAgeAI= +TA8|GkEKBHRlc3QSDBICAAAYASIECTLePxIMEgIAABgBIgQJNNw/EgwSAgABGAEiBAk03D8aAmMxIgIo +ASICKAIogCB4Ag== TU2 ERROR: pgis_asmvt_transfn: parameter row cannot be other than a rowtype TU3 -----------------------------------------------------------------------------
Seems to have started happening as result of r15358 commit
Commit c11837ee6a26f6e42c33a96fe8ae019da93986cc by bjorn Expand tests to verify redundant value encoding fix Closes #3741 git-svn-id: http://svn.osgeo.org/postgis/trunk@15359 b70326c6-7e19-0410-871a-916f4a2858ee
Change History (19)
comment:1 by , 8 years ago
comment:2 by , 8 years ago
okay just ran without 9.3 debbie and all passed. So for debbie it's just 9.3. For winnie 9.6 is okay but then 9.5 and 9.4 fail.
comment:5 by , 8 years ago
well I think debbie's 9.6 and 9.5 I read from head weekly. winnie I'm a bit lazy about updating her PostgreSQL. It could possibly be an issue with micro fixes. I'll upgrade debbie's 9.3 to latest micro and see if the issue goes away.
comment:6 by , 8 years ago
Sadly issue did not go away after the upgrade:
For the record before I did 9.3 upgrade debbie was running 9.3.10 (dronie is running PostgreSQL 9.3.14 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4, 64-bit and seemed to breeze thru mvt without a problem).
Now she's running:
PostgreSQL 9.3.16 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 5.2.1-22) 5.2.1 20151010, 64-bit
and still get
----------------------------------------------------------------------------- --- mvt_expected 2017-04-21 17:41:38.547468171 +0000 +++ /var/lib/jenkins/workspace/postgis/tmp/2_4_pg9.3w64/test_113_out 2017-04-23 16:28:57.204235634 +0000 @@ -25,8 +25,8 @@ TA6|GisKBHRlc3QSDhIEAAABARgBIgQJMt4/GgJjMRoCYzIiAigBIgIwASiAIHgC TA7|Gk4KBHRlc3QSDBICAAAYASIECTTcPxIMEgIAARgBIgQJMt4/EgwSAgABGAEiBAk03D8aAmMxIgsK CW90aGVydGVzdCIGCgR0ZXN0KIAgeAI= -TA8|GkUKBHRlc3QSDBICAAAYASIECTLePxIMEgIAARgBIgQJNNw/EgwSAgACGAEiBAk03D8aAmMxIgIo -ASICKAEiAigCKIAgeAI= +TA8|GkEKBHRlc3QSDBICAAAYASIECTLePxIMEgIAABgBIgQJNNw/EgwSAgABGAEiBAk03D8aAmMxIgIo +ASICKAIogCB4Ag== TU2 ERROR: pgis_asmvt_transfn: parameter row cannot be other than a rowtype TU3 -----------------------------------------------------------------------------
seems like cassert isn't the issue since I would expect 9.4, 9.5, and 9.6 to fail too. This will be very very interesting.
You see anything special about the output that might give a clue.
comment:7 by , 8 years ago
The good news is mvt is doing okay against debbie's PostgreSQL 10 runs too
comment:8 by , 8 years ago
okay I upgraded debbie's 9.4 as well which was at 9.4.5 before and was passing. still passes after the upgrade to
PostgreSQL 9.4.11 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 5.2.1-22) 5.2.1 20151010, 64-bit
since I used the same script to compile both the new 9.3 and 9.4, I think that rules out how I'm compiling and all that seems to also rule out microversion.
But here's another interesting twist. I just compiled postgis on my windows 7 64-bit against 9.6, and it fails mvt test as well same difference.
Creating database 'postgis_reg' Loading PostGIS into 'postgis_reg' Loading SFCGAL into 'postgis_reg' PostgreSQL 9.6.1, compiled by Visual C++ build 1800, 64-bit Postgis 2.4.0dev - r15360 - 2017-04-23 17:41:51 scripts 2.4.0dev r15360 GEOS: 3.6.1-CAPI-1.10.1 r4317 PROJ: Rel. 4.9.1, 04 March 2015 SFCGAL: 1.3.0 +++ /projects/postgis/tmp/2.4_pg9.6w64/test_120_out 2017-04-23 13:45:34.273093800 -0400 @@ -25,8 +25,8 @@ TA6|GisKBHRlc3QSDhIEAAABARgBIgQJMt4/GgJjMRoCYzIiAigBIgIwASiAIHgC TA7|Gk4KBHRlc3QSDBICAAAYASIECTTcPxIMEgIAARgBIgQJMt4/EgwSAgABGAEiBAk03D8aAmMxIgsK CW90aGVydGVzdCIGCgR0ZXN0KIAgeAI= -TA8|GkUKBHRlc3QSDBICAAAYASIECTLePxIMEgIAARgBIgQJNNw/EgwSAgACGAEiBAk03D8aAmMxIgIo -ASICKAEiAigCKIAgeAI= +TA8|GkEKBHRlc3QSDBICAAAYASIECTLePxIMEgIAABgBIgQJNNw/EgwSAgABGAEiBAk03D8aAmMxIgIo +ASICKAIogCB4Ag== TU2 ERROR: pgis_asmvt_transfn: parameter row cannot be other than a rowtype TU3
winnie's 9.6 64-bit run:
PostgreSQL 9.6.1, compiled by Visual C++ build 1800, 64-bit Postgis 2.4.0dev - r15360 - 2017-04-23 17:53:51 scripts 2.4.0dev r15360 GEOS: 3.6.1-CAPI-1.10.1 r4317 PROJ: Rel. 4.9.1, 04 March 2015 SFCGAL: 1.3.0 Running tests : mvt .. ok geobuf .. ok
So clearly PostgreSQL version (even micro) does not seem to be issue. Why winnie's 9.6 passes and mine fails — when we are setup mostly the same is very puzzling.
I'm suspecting mischievous spirits haunting things
comment:9 by , 8 years ago
Thanks for investigating, very valuable as I still haven't mustered up the guts to make a move on Debbie (or Winnie, I'm not that picky).
Decoding the output vector tile and compare the expected to the failing in detail will perhaps give some more info. As of yet I've only done naive inspection of decoded vector tiles using vector-tile-js but I've just found a more detailed way to inspect using protoc-c —decode.
The failing test case is:
SELECT 'TA8', encode(ST_AsMVT('test', 4096, 'geom', q), 'base64') FROM ( SELECT 1 AS c1, ST_AsMVTGeom(ST_GeomFromText('POINT(25 17)'), ST_MakeBox2D(ST_Point(0, 0), ST_Point(4096, 4096)), 4096, 0, false) AS geom UNION SELECT 1 AS c1, ST_AsMVTGeom(ST_GeomFromText('POINT(26 18)'), ST_MakeBox2D(ST_Point(0, 0), ST_Point(4096, 4096)), 4096, 0, false) AS geom UNION SELECT 2 AS c1, ST_AsMVTGeom(ST_GeomFromText('POINT(26 18)'), ST_MakeBox2D(ST_Point(0, 0), ST_Point(4096, 4096)), 4096, 0, false) AS geom) AS q;
It creates a tile with three point features. Duplicate values for c1 should only be encoded once and reused.
Decode of the expected binary:
layers { name: "test" features { tags: 0 tags: 0 type: POINT geometry: 9 geometry: 50 geometry: 8158 } features { tags: 0 tags: 1 type: POINT geometry: 9 geometry: 52 geometry: 8156 } features { tags: 0 tags: 2 type: POINT geometry: 9 geometry: 52 geometry: 8156 } keys: "c1" values { uint_value: 1 } values { uint_value: 1 } values { uint_value: 2 } extent: 4096 version: 2 }
Decode of the unexpected binary:
layers { name: "test" features { tags: 0 tags: 0 type: POINT geometry: 9 geometry: 50 geometry: 8158 } features { tags: 0 tags: 0 type: POINT geometry: 9 geometry: 52 geometry: 8156 } features { tags: 0 tags: 1 type: POINT geometry: 9 geometry: 52 geometry: 8156 } keys: "c1" values { uint_value: 1 } values { uint_value: 2 } extent: 4096 version: 2 }
On first glance it seems that the expected output is incorrectly containing redundant encoding of the c1 value. Both are valid tiles but the unexpected encode is more correct. I think I might have a lead on the problem here. I fixed it for string values recently but it appears I have similar issues with numeric values. Will look for a solution when time permits.
comment:10 by , 8 years ago
So does that mean the failing ones are the right behavior, not the wrong behavior?
comment:11 by , 8 years ago
I took out 9.3 testing for 2.4 for now on debbie until we resolve this. I'll keep winnie broken so that she catches when it's fixed.
comment:12 by , 8 years ago
Both outputs are valid tiles and will render the same (which is why I did not see anything wrong at first) but the unexpected output is the most correct encoding.
It must be related to the use of uthash (1) where I must be doing something wrong. Still can't explain at all though why it's affected by environment details.
comment:13 by , 8 years ago
When I run the TA8 test case manually in pgadmin I get: "GkEKBHRlc3QSDBICAAAYASIECTLePxIMEgIAABgBIgQJNNw/EgwSAgABGAEiBAk03D8aAmMxIgIo ASICKAIogCB4Ag==" (which is the correct encoding)
When I run the test case using run_test.pl or make check I get: "GkUKBHRlc3QSDBICAAAYASIECTLePxIMEgIAARgBIgQJNNw/EgwSAgACGAEiBAk03D8aAmMxIgIo ASICKAEiAigCKIAgeAI=" (which is the incorrect encoding)
It does not matter if I reorder the tests so that it tries this case before any other.
comment:14 by , 8 years ago
There be dragons in this one. Running the TA8 test case in pgadmin at my home computer (previous run was on work laptop, both systems at Debian Stretch) results in… "GkUKBHRlc3QSDBICAAAYASIECTLePxIMEgIAARgBIgQJNNw/EgwSAgACGAEiBAk03D8aAmMxIgIo ASICKAEiAigCKIAgeAI=", the incorrect encoding.
comment:17 by , 8 years ago
If I set debug message levels to the following:
- SET client_min_messages TO DEBUG2; - I get incorrect encoding
- SET client_min_messages TO DEBUG3; - I get correct encoding
comment:18 by , 8 years ago
I think you can use valgrind with postgres, it's just a tad noisier so I hear.
On closer inspection doesn't seem to be for all versions of PostgreSQL.
On debbie - seems like only 9.3 failed. I'm going to run again just to confirm: