Opened 14 years ago

Closed 14 years ago

## #873 closed enhancement (fixed)

# r.univar: allow percentile= to be doubles

Reported by: | hamish | Owned by: | |
---|---|---|---|

Priority: | major | Milestone: | 6.5.0 |

Component: | Raster | Version: | svn-develbranch6 |

Keywords: | r.univar, stats | Cc: | |

CPU: | All | Platform: | All |

## Description

Hi,

for `r.univar`

it is useful to have sub-integer percentile output, e.g. for calculating confidence intervals/sigma at 68.2689%, 95.45%, and 99.73%. Currently `r.univar`

only accepts whole integers for the percentile=. Actually it's worse: it accepts floats on the command line but they quietly get truncated into integers.(!)

see http://article.gmane.org/gmane.comp.gis.grass.user/20744

currently `r.colors.stddev`

is hit by this and probably anyone doing stats on map values.

the attached patch (vs. 6.5svn) attempts to fix this. please review. R-bridge users please compare with R's answers.

#spearfish g.region rast=elevation.10m r.univar elevation.10m perc=68,68.2689,95,95.45,99,99.73 -e

Hamish

### Attachments (1)

### Change History (5)

### by , 14 years ago

Attachment: |
runiv_perc_dbl.diff
added |
---|

### follow-up: 2 comment:1 by , 14 years ago

Replying to hamish:

for

`r.univar`

it is useful to have sub-integer percentile output, e.g. for calculating confidence intervals/sigma at 68.2689%, 95.45%, and 99.73%. Currently`r.univar`

only accepts whole integers for the percentile=. Actually it's worse: it accepts floats on the command line but they quietly get truncated into integers.(!)

See http://article.gmane.org/gmane.comp.gis.grass.user/20744

Currently

`r.colors.stddev`

is hit by this and probably anyone doing stats on map values.

Note that `r.quantile`

supports non-integer percentiles, works with large maps, and is significantly more efficient.

### comment:2 by , 14 years ago

Replying to glynn:

Note that

`r.quantile`

supports non-integer percentiles, works with large maps, and is significantly more efficient.

IMHO, it would still be nice if `r.univar`

supports floating point percentiles. The `r.univar`

manual could mention that `r.quantile`

is the better choice if only quantiles and/or percentiles are requested, e.g. under NOTES where it already says that `r.univar`

can use large amounts of system memory for extended statistics. I would suggest to also add `r.quantile`

to the SEE ALSO section.

Markus M

### comment:3 by , 14 years ago

patch committed to 6.5svn with r40939. (will port to trunk soon)

R-bridge users please compare with R's answers.

I'd still like verification before considering if this should go in 6.4 or not. I'd settle for the list of commands to test it myself.

mmetz:

The r.univar manual could mention that r.quantile is the better choice if only quantiles and/or percentiles are requested

added in the patch.

Hamish

### comment:4 by , 14 years ago

Resolution: | → fixed |
---|---|

Status: | new → closed |

backported to relbr64 with r41377.

Tested in 6.5svn against R's quantile(). mean residuals less than 3mm for Spearfish's elevation.10m.

test method follows.

GRASS 6.5svn:

#spearfish g.region rast=elevation.10m Q=`seq -s, 0 5 100` r.univar -e elevation.10m percentile=$Q -g n=2654802 null_cells=0 cells=2654802 min=1061.064087 max=1846.743408 range=785.679321 mean=1348.37144603811 mean_of_abs=1348.37144603811 stddev=175.494309916948 variance=30798.2528132259 coeff_var=13.0152793158443 sum=3579659211.6848597527 first_quartile=1196.8 median=1309.37 third_quartile=1480.29 percentile_0=1061.06 percentile_5=1123.09 percentile_10=1153.04 percentile_15=1171.19 percentile_20=1184.32 percentile_25=1196.8 percentile_30=1210.82 percentile_35=1229.45 percentile_40=1251.94 percentile_45=1280.39 percentile_50=1309.37 percentile_55=1346.4 percentile_60=1378.71 percentile_65=1410.04 percentile_70=1440.23 percentile_75=1480.29 percentile_80=1525.85 percentile_85=1568.51 percentile_90=1613.6 percentile_95=1671.23 percentile_100=1846.74 Q=`seq -s, 0 0.1 1` r.univar -e elevation.10m percentile=$Q -g n=2654802 null_cells=0 cells=2654802 min=1061.064087 max=1846.743408 range=785.679321 mean=1348.37144603811 mean_of_abs=1348.37144603811 stddev=175.494309916948 variance=30798.2528132259 coeff_var=13.0152793158443 sum=3579659211.6848597527 first_quartile=1196.8 median=1309.37 third_quartile=1480.29 percentile_0= 1061.06 percentile_0_1=1073.82 percentile_0_2=1078.85 percentile_0_3=1083.27 percentile_0_4=1085.54 percentile_0_5=1087.8 percentile_0_6=1090.37 percentile_0_7=1092.09 percentile_0_8=1093.77 percentile_0_9=1095.2 percentile_1= 1096.48

R 2.7.1:

library(spgrass6) G <- gmeta6() spear <- rast.get6("elevation.10m", cat = FALSE, ignore.stderr = TRUE) summary(spear) Q5 <- quantile(spear$elevation.10m, probs=seq(0,1,0.05), na.rm=TRUE, names=TRUE, type=8) cat(Q5, sep="\n") 1061.064 1123.092 1153.037 1171.192 1184.320 1196.800 1210.821 1229.449 1251.937 1280.389 1309.368 1346.401 1378.709 1410.040 1440.232 1480.286 1525.850 1568.510 1613.603 1671.230 1846.743 Q1 <- quantile(spear$elevation.10m, probs=seq(0,0.01,0.001), na.rm=TRUE, names=TRUE, type=8) cat(Q1, sep="\n") 1061.064 1073.824 1078.849 1083.274 1085.538 1087.803 1090.373 1092.091 1093.775 1095.205 1096.479

Q5_G = [ ... 1061.06 1123.09 1153.04 1171.19 1184.32 1196.8 1210.82 1229.45 1251.94 1280.39 1309.37 1346.4 1378.71 1410.04 1440.23 1480.29 1525.85 1568.51 1613.6 1671.23 1846.74 ]; Q5_R = [ ... 1061.064 1123.092 1153.037 1171.192 1184.320 1196.800 1210.821 1229.449 1251.937 1280.389 1309.368 1346.401 1378.709 1410.040 1440.232 1480.286 1525.850 1568.510 1613.603 1671.230 1846.743 ]; Q1_G = [ ... 1061.06 1073.82 1078.85 1083.27 1085.54 1087.8 1090.37 1092.09 1093.77 1095.2 1096.48 ]; Q1_R = [ ... 1061.064 1073.824 1078.849 1083.274 1085.538 1087.803 1090.373 1092.091 1093.775 1095.205 1096.479 ]; %summarize residuals >> mean ( abs(Q5_G - Q5_R) ) ans = 0.001 meter >> std( abs(Q5_G - Q5_R) ) ans = 0.0014 >> mean ( abs(Q1_G - Q1_R) ) ans = 0.003 meter >> std( abs(Q1_G - Q1_R) ) ans = 0.0015

Hamish

**Note:**See TracTickets for help on using tickets.

patch to allow r.univar to use percentile=double