Opened 9 years ago

Closed 9 years ago

#958 closed defect (fixed)

[raster] Regress test failures on 32 bit

Reported by: strk
Owned by: jorgearevalo
Priority: blocker
Milestone: PostGIS 2.0.0
Component: raster
Version: master
Keywords:
Cc:

Description

Many tests fail on my 32-bit system; pretty much all of them look like precision issues. Maybe you could pass the final output through round()?

I'm attaching the full regress run dir
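As a rough illustration of the round() idea, the same normalization could be done on the harness side instead of in SQL. The sketch below is only an assumption about how that might look: `normalize`, `same_output` and the default of 10 significant digits are hypothetical names and values, not part of the PostGIS regress scripts.

```python
def normalize(line, digits=10):
    """Round a numeric output line to `digits` significant digits.

    Non-numeric lines (error messages, labels) are returned stripped
    but otherwise unchanged.
    """
    text = line.strip()
    try:
        return format(float(text), f".{digits}g")
    except ValueError:
        return text


def same_output(expected_lines, actual_lines, digits=10):
    """Compare two regress outputs after rounding every numeric line."""
    expected = [normalize(line, digits) for line in expected_lines]
    actual = [normalize(line, digits) for line in actual_lines]
    return expected == actual
```

With this kind of comparison, a last-digit difference between platforms no longer counts as a failure, while a crash message or a wildly different value still does.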

Attachments (4)

pgis_reg_13871.tgz (2.7 KB) - added by strk 9 years ago.
regress results on 32bit as of r7171
st_band_valgrind.log (25.6 KB) - added by jorgearevalo 9 years ago.
Valgrind report
valgrind_server_stband.log (47.1 KB) - added by jorgearevalo 9 years ago.
Valgrind output of postgres executing query that caused a crash
pgis_reg_27067.tar.gz (1.7 KB) - added by jorgearevalo 9 years ago.

Change History (25)

Changed 9 years ago by strk

Attachment: pgis_reg_13871.tgz added

regress results on 32bit as of r7171

comment:1 Changed 9 years ago by pracine

I get the same. This is because test #4 in rt_band.sql makes the server crash:

123.4567
1234.567
1234.567
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
connection to server was lost

comment:2 Changed 9 years ago by pracine

The following query makes the server crash:

SELECT	ST_AddBand(
		ST_AddBand(
			ST_AddBand(
				ST_MakeEmptyRaster(200, 200, 10, 10, 2, 2, 0, 0,-1)
				, 1, '64BF', 1234.5678, NULL
			)
			, '64BF', 987.654321, NULL
		)
		, '64BF', 9876.54321, NULL
	)

If I remove one level of ST_AddBand nesting, there is no problem.

comment:3 Changed 9 years ago by pracine

If I update everything to revision 7147, which was the first one committed by Bborie, rt_band.sql fails but the server does not crash:

*** rt_band_expected	Tue May 17 16:49:22 2011
--- /tmp/rtpgis_reg_6160/test_9_out	Tue May 17 16:51:50 2011
***************
*** 1,17 ****
  123.4567
  1234.567
  1234.567
! 1234.5678
! 987.654321
  9876.54321
! 1234.5678
! 987.654321
  9876.54321
! 1234.5678
! 1234.5678
  9876.54321
! 987.654321
! 1234.5678
  4
  3
  2
--- 1,17 ----
  123.4567
  1234.567
  1234.567
! 4.47593815953616e-91
! 4.47593815953616e-91
  9876.54321
! 4.47593815953616e-91
! 4.47593815953616e-91
  9876.54321
! 4.47593815953616e-91
! 4.47593815953616e-91
  9876.54321
! 4.47593815953616e-91
! 4.47593815953616e-91
  4
  3
  2
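The mismatches in this diff are not rounding differences: the failing lines are off by more than 90 orders of magnitude. A minimal sketch of that check, using the values copied from the diff; the `classify` helper and its 1e-6 tolerance are hypothetical and chosen only for illustration.

```python
import math

# Values copied from the diff: expected output vs. the 32-bit run.
expected = [1234.5678, 987.654321, 9876.54321]
actual = [4.47593815953616e-91, 4.47593815953616e-91, 9876.54321]


def classify(exp, act, rel_tol=1e-6):
    """Label a pair as 'match', 'precision' (tiny relative error) or 'garbage'."""
    if exp == act:
        return "match"
    if math.isclose(exp, act, rel_tol=rel_tol):
        return "precision"
    return "garbage"


labels = [classify(e, a) for e, a in zip(expected, actual)]
```

A result of 'garbage' rather than 'precision' suggests looking at memory handling, not at output formatting.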

comment:4 Changed 9 years ago by pracine

Apparently this query also crashes with the preceding revision (r7107). It's just that this three-level ST_AddBand query was only introduced by Bborie in r7147. I'll keep tracking it down...

comment:5 Changed 9 years ago by pracine

I can confirm that this bug was introduced in r7106. The preceding query works fine with r7105 and not with r7106...

comment:6 Changed 9 years ago by strk

Priority: changed from medium to blocker

I confirm the crash.

comment:7 Changed 9 years ago by Bborie Park

Do either of you experience this crash in 64-bit? Or only in 32-bit?

comment:8 Changed 9 years ago by pracine

I am on Windows 7 64-bit.

comment:9 Changed 9 years ago by Bborie Park

32-bit or 64-bit PostgreSQL?

comment:10 Changed 9 years ago by pracine

32 bit.

comment:11 Changed 9 years ago by Bborie Park

I also confirm this in 32-bit PostgreSQL with r7201 in Slackware Linux 13.37. Time to dust off gdb and find out what is going on.

comment:12 Changed 9 years ago by Bborie Park

A backtrace of the crash caused by the query.

#0  0xb74f10e4 in memcpy () from /lib/libc.so.6
#1  0xb4f440e3 in rt_raster_serialize (raster=0x8559760) at rt_api.c:4632
#2  0xb4f37617 in RASTER_addband () from /usr/local/pgsql/lib/rtpostgis-2.0.so
#3  0x0818c800 in ExecMakeFunctionResult ()
#4  0x0818ed73 in ExecProject ()
#5  0x0819d5b1 in ExecResult ()
#6  0x08187d39 in ExecProcNode ()
#7  0x081852d2 in standard_ExecutorRun ()
#8  0x081923f5 in fmgr_sql ()
#9  0x0818c800 in ExecMakeFunctionResult ()
#10 0x0818d779 in ExecEvalExprSwitchContext ()
#11 0x081f4647 in evaluate_expr ()
#12 0x081f6551 in simplify_function ()
#13 0x081f6b32 in eval_const_expressions_mutator ()
#14 0x081f6af6 in eval_const_expressions_mutator ()
#15 0x081f6af6 in eval_const_expressions_mutator ()
#16 0x081b699d in expression_tree_mutator ()
#17 0x081f69bb in eval_const_expressions_mutator ()
#18 0x081b6c9c in expression_tree_mutator ()
#19 0x081f69bb in eval_const_expressions_mutator ()
#20 0x081f848b in eval_const_expressions ()
#21 0x081e80c8 in preprocess_expression ()
#22 0x081eaef3 in subquery_planner ()
#23 0x081eb700 in standard_planner ()
#24 0x0823cb80 in pg_plan_query ()
#25 0x0823cc5d in pg_plan_queries ()
#26 0x0823d656 in PostgresMain ()
#27 0x0820c442 in BackendStartup ()
#28 0x0820ca19 in ServerLoop ()
#29 0x0820d3ec in PostmasterMain ()
#30 0x081b5056 in main ()

comment:13 Changed 9 years ago by strk

A valgrind report would be more useful, as it would tell you the point at which what looks like a short allocation was made.

comment:14 Changed 9 years ago by jorgearevalo

Owner: changed from pracine to jorgearevalo
Status: changed from new to assigned

Attached a valgrind report. I'm not sure about the meaning of the libcrypto-related errors. There are no other errors. Did I forget any parameter in the valgrind call? My command line was:

valgrind --leak-check=full --track-origins=yes -v --trace-children=yes --log-file=st_band_valgrind.log perl /usr/bin/psql -d testing2 -c "select st_value(st_band(st_addband(st_addband(st_addband(st_makeemptyraster(200, 200, 10, 10, 2, 2, 0, 0, -1), 1, '64BF', 1234.5678, NULL), '64BF', 987.654321, NULL), '64BF', 9876.54321, NULL), ARRAY[1]), 3, 3)"

Changed 9 years ago by jorgearevalo

Attachment: st_band_valgrind.log added

Valgrind report

comment:15 Changed 9 years ago by strk

That call checks a run of perl ??

Even if you drop the perl part, you're testing the client, not the server. Checking the server requires running the backend in single-user mode and passing it the offending query.

comment:16 Changed 9 years ago by jorgearevalo

Oh, what I've done... Of course. Absolutely stupid on my side. This is what happens when you do things quickly and without thinking. Attached the right log file. And I executed:

echo "select st_value(st_band(st_addband(st_addband(st_addband(st_makeemptyraster(200, 200, 10, 10, 2, 2, 0, 0, -1), 1, '64BF', 1234.5678, NULL), '64BF', 987.654321, NULL), '64BF', 9876.54321, NULL), ARRAY[1]), 3, 3)" | valgrind --leak-check=full --log-file=valgrind_stband.log --track-origins=yes -v --trace-children=yes /usr/lib/postgresql/8.4/bin/postgres --single -D /var/lib/postgresql/8.4/main/ -v 8.4 -c config-file=/etc/postgresql/8.4/main/postgresql.conf testing2

Changed 9 years ago by jorgearevalo

Attachment: valgrind_server_stband.log added

Valgrind output of postgres executing query that caused a crash

comment:17 Changed 9 years ago by jorgearevalo

Ok, one missed PG_FREE_IF_COPY call in line 2034 was causing the memory issue. Now I have problems with the regress tests rt_asgdalraster, rt_asjpeg and rt_aspng. Working on them. Attached the diff files.

Changed 9 years ago by jorgearevalo

Attachment: pgis_reg_27067.tar.gz added

comment:18 Changed 9 years ago by strk

A missed PG_FREE can cause memory leaks, not segfaults, which is what this ticket is about now, right ?

comment:19 Changed 9 years ago by Bborie Park

The ticket for the regression failures in rt_asgdalraster, rt_asjpeg, rt_aspng is #957. I'm working on it.

comment:20 Changed 9 years ago by jorgearevalo

I'm getting the regress results in pgis_reg_27067 on a 32-bit Ubuntu 9.10 machine with r7202. No memory leaks or segfaults. Only the failures of ticket #957.

Could you please give it a try?

comment:21 Changed 9 years ago by strk

Resolution: fixed
Status: changed from assigned to closed

Ok, no more segfault, but in addition to the rt_asgdalraster failure (#957) I also have failures in rt_asjpeg and rt_aspng. Let's close this one and use #957 for them.
