Opened 11 years ago

Closed 11 years ago

Last modified 10 years ago

#1008 closed defect (fixed)

Too many columns in select crashes pgsql2shp

Reported by: joshjordan Owned by: mcayland
Priority: low Milestone: PostGIS 2.0.0
Component: loader/dumper Version: master
Keywords: pgsql2shp Cc:

Description

This select statement works if I take out half of the columns in it. It doesn't matter which half so it is not a specific column that causes the crash.

Versions:

2.0 USE_GEOS=1 USE_PROJ=1 USE_STATS=1 PostgreSQL 8.4.8 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.5.real (Ubuntu/Linaro? 4.5.2-8ubuntu4) 4.5.2, 64-bit

pgsql2shp -f /var/www/currentmap/joshj/nationwide_county -u postgres -P..... -p 5432 smartmaps "select distinct a.geom,a.gid, a.id, a.shporder, a.area, a.county, a.fullname, a.name, a.state, a.type, a.county_sea, a.population, a.male, a.female, a.age_5, a.age_5_to_9, a.age_10_to_, a.age_15_to_, a.age_20_to_, a.age_25_to_, a.age_35_to_, a.age_45_to_, a.age_55_to_, a.age_60_to_, a.age_65_to_, a.age_75_to_, a.age_85, a.median_age, a.age_18, a.male_18, a.female_18, a.age_21, a.age_62, a.age_65, a.male_65, a.female_65, a.f1_race, a.white, a.black, a.amindian, a.asian, a.asn_asian_, a.asn_chines, a.asn_filipi, a.asn_japane, a.asn_korean, a.asn_vietna, a.asn_other, a.hawaiian, a.hwn_native, a.hwn_guaman, a.hwn_samoan, a.hwn_other_, a.other_race, a.f2_races, a.ap_white, a.ap_black, a.ap_amindia, a.ap_asian, a.ap_hawaiia, a.ap_other, a.ap_hispani, a.h_ap_mexic, a.h_ap_puert, a.h_ap_cuban, a.h_ap_other, a.not_hispan, a.nh_white, a.in_househo, a.inhh_house, a.inhh_spous, a.inhh_child, a.inhh_own, a.inhh_other, a.inhh_othe1, a.inhh_nonre, a.inhh_unmar, a.in_group, a.ingrp_inst, a.ingrp_noni, a.households, a.hh_family, a.hh_family_, a.hh_family1, a.hh_fam_mar, a.hh_female_, a.hh_female1, a.hh_nonfami, a.hh_non_liv, a.hh_non_alo, a.hh_people_, a.hh_people1, a.average_hh, a.average_fa, a.housing_un, a.hu_occupie, a.hu_vacant, a.vachu_for_, a.owner_vaca, a.rental_vac, a.occhu_owne, a.occhu_rent, a.occhu_own_, a.occhu_ren1, a.in_school_, a.in_sch_nur, a.in_sch_kin, a.in_sch_ele, a.in_sch_hig, a.in_sch_col, a.populatio1, a.f25_9th_gr, a.f25_9th_to, a.f25_high_s, a.f25_some_c, a.f25_associ, a.f25_bachel, a.f25_grad_o, a.f25hs_gr, a.f25bache, a.populatio2, a.f15_never_, a.f15_now_ma, a.f15_separa, a.f15_widowe, a.f15_widow1, a.f15_divorc, a.f15_divor1, a.grandparen, a.grandpare1, a.civilian, a.civilian2, a.populatio3, a.f520_wdisa, a.populatio4, a.f2164_wdis, a.f2164_wdi1, a.f2164_no_d, a.f2164_no_1, a.populatio5, a.f65_wdisab, a.populatio6, a.f5_same_ho, a.f5_differe, a.f5_diff_hs, a.f5_diff_h1, a.f5_diff_h2, a.f5_diff_h3, a.f5_elsewhe, a.native, a.native_bor, a.native_bo1, a.native_bo2, a.native_bo3, a.foreign_en, a.foreign_na, a.foreign_no, a.foreign_bo, a.foreign_eu, a.foreign_as, a.foreign_af, a.foreign_oc, a.foreign_la, a.foreign_n1, a.f5_english, a.f5_other_l, a.f5_other_1, a.f5_other_2, a.f5_other_3, a.f5_other_i, a.f5_other_4, a.f5_other_5, a.f5_other_6, a.total_ance, a.ancestry_a, a.ancestry_c, a.ancestry_d, a.ancestry_1, a.ancestry_e, a.ancestry_f, a.ancestry_2, a.ancestry_g, a.ancestry_3, a.ancestry_h, a.ancestry_i, a.ancestry_4, a.ancestry_l, a.ancestry_n, a.ancestry_p, a.ancestry_5, a.ancestry_r, a.ancestry_s, a.ancestry_6, a.ancestry_7, a.ancestry_8, a.ancestry_9, a.ancestry10, a.ancestry_u, a.ancestry11, a.ancestry_w, a.ancestry12, a.ancestry_o, a.populatio7, a.f16_in_lab, a.f16_in_lf_, a.f16_in_lf1, a.f16_in_lf2, a.f16_in_lf3, a.f16_in_lf4, a.f16_not_in, a.females_16, a.fem_16_in_, a.fem_16_in1, a.fem_16_in2, a.own_childr, a.own_child6, a.workers_16, a.f16_mode_c, a.f16_mode_1, a.f16_mode_p, a.f16_mode_w, a.f16_mode_o, a.f16_mode_2, a.f16_mean_t, a.employed_c, a.ec_16_occ_, a.ec_16_occ1, a.ec_16_occ2, a.ec_16_occ3, a.ec_16_occ4, a.ec_16_occ5, a.ec_16_ind_, a.ec_16_ind1, a.ec_16_ind2, a.ec_16_ind3, a.ec_16_ind4, a.ec_16_ind5, a.ec_16_ind6, a.ec_16_ind7, a.ec_16_ind8, a.ec_16_ind9, a.ec_16_in10, a.ec_16_in11, a.ec_16_in12, a.ec_16_work, a.ec_16_wor1, a.ec_16_wor2, a.ec_16_wor3, a.household1, a.hh_income_, a.clon, a.clat, a.tv_hhs, a.cable_hhs, a.akhi_clat, a.akhi_clon from nationwide_county a where a.gid in (select distinct layergid from nationwide_county_matches where extent = 'Nationwide_DMA' and extentgid = 5)"

Attachments (2)

countybug.txt (12.3 KB) - added by joshjordan 11 years ago.
error log from command
postgis-pgsql2shp-many-columns.patch (1.4 KB) - added by mcayland 11 years ago.

Download all attachments as: .zip

Change History (13)

Changed 11 years ago by joshjordan

Attachment: countybug.txt added

error log from command

comment:1 Changed 11 years ago by strk

The code uses a lot of 'strcat' where 'strncat' should be used instead. Also, the current limit should be raised for this case to be supported.

#define MAX_QUERY_LEN 2048

comment:2 Changed 11 years ago by aperi2007

I guess also the limit of dbf file could be the reason of the crash. The dbf has a limit in the size of a record. I don't remenber if 2048 byte or 4096 byte for records.

comment:3 Changed 11 years ago by joshjordan

The query string length caused the crash. Its true the dbf spec has a limit of 255 columns and 255 chars in each column. I have never heard of or run into a limit on record length. Most tools ignore the 255 column limit including mapserver, openoffice, and postgis.

comment:4 Changed 11 years ago by aperi2007

I can partially confirm it.

the specs of dbf wrote:

*************************************************************************
The data file header structure for dBASE IV 2.0 table file.
**************************************************************************

File Structure:
===============

Byte     Contents       Meaning
-------  ----------     -------------------------------------------------
0        1byte          Valid dBASE IV file; bits 0-2 indicate version
                        number, bit 3 the presence of a dBASE IV memo
                        file, bits 4-6 the presence of an SQL table, bit
                        7 the presence of any memo file (either dBASE III
                        PLUS or dBASE IV).
1-3      3 bytes        Date of last update; formattted as YYMMDD.
4-7      32-bit number  Number of records in the file.
8-9      16-bit number  Number of bytes in the header.
10-11    16-bit number  Number of bytes in the record.
12-13    2 bytes        Reserved; fill with 0.
14       1 byte         Flag indicating incomplete transaction.
15       1 byte         Encryption flag.
16-27    12 bytes       Reserved for dBASE IV in a multi-user environment.
28       1 bytes        Production MDX file flag; 01H if there is an MDX,
                        00H if not.
29       1 byte         Language driver ID.
30-31    2 bytes        Reserved; fill with 0.
32-n*    32 bytes each  Field descriptor array (see below).
n + 1    1 byte         0DH as the field terminator.

The max number of bytes for a record are 216 byte = 65kbyte

So the max is not 2048 (wrong remenber) but 65 kbyte. (I don't know if in the DBIII version the size can be less.)

So if the query size is more than 65Kbyte ....

comment:5 Changed 11 years ago by joshjordan

Exporting a dbf that is out of spec may well cause a crash, but this crash is caused by query string length, not dbf record length. It will crash even when the exported dbf meets dbf spec when the select statement is over 2048 chars. I changed my buffer to 8192 and no problems since then.

comment:6 Changed 11 years ago by mcayland

Rather than keep MAX_QUERY_LEN static, it would make more sense to be a bit more intelligent about allocating space for the main query. Does the attached patch solve your problem?

Changed 11 years ago by mcayland

comment:7 Changed 11 years ago by joshjordan

That is a lot of added complexity to add a restriction of 254 fields. The dbf file format has only a total record length limit, you may have over 255 fields as long as all of the field lengths add up to less than 216. There is no other point in the postgis/mapserver stack where 255 fields is the limit, so this change will break things for some people. If you want to make this dynamically allocated, make the buffer the size of the input string. We are just talking about 8kb that is only temporarily used and then taken out of scope.

This issue can be put to rest by changing the 2kb static buffer to 8kb.

comment:8 Changed 11 years ago by mcayland

I think you may have misread the patch. The patch does not add a restriction of 254 fields, it simply sums the length of all the input fieldnames, then adds 1024 characters extra to cover any extra query prefix/suffix when allocating the main query string.

comment:9 Changed 11 years ago by joshjordan

Should be good then. I'll have to find time to get the latest postgis up.

comment:10 Changed 11 years ago by mcayland

Resolution: fixed
Status: newclosed

Now committed as r7566. I'll close this for now as it passes my tests here, but feel free to re-open if you find any further problems.

comment:11 in reply to:  9 Changed 10 years ago by laskew

Priority: mediumlow

Replying to joshjordan:

Should be good then. I'll have to find time to get the latest postgis up.

Hi Josh,

are you still around? I work for Smart Media Group and was wondering if I could ask you some questions about the Smart Maps application that you built for us. I will watch this thread to see if we can set up a time to talk.

Thanks.

Note: See TracTickets for help on using tickets.