Opened 17 years ago

Closed 12 years ago

#37 closed enhancement (fixed)

r.in.xyz increase region based on input data

Reported by: marisn Owned by: hamish
Priority: minor Milestone: 6.5.0
Component: Raster Version: unspecified
Keywords: r.in.xyz Cc: grass-dev@…
CPU: Unspecified Platform: Unspecified

Description

It would be nice, if r.in.xyz would support increasing region based on input data like "v.in.ogr -e" does.

Change History (14)

comment:1 by hamish, 17 years ago

Cc: grass-dev@… added
Owner: changed from grass-dev@… to hamish
Status: newassigned

Hi,

While I do see the merit of the convenience, I am not especially inclined to have that feature added. IMHO the first r.in.xyz run with -s should be followed by "g.region n= s= e= w= res= -a" followed by a "r.in.xyz method=n" and "r.univar map=n" to tune the region extent and resolution to something suitable before proceeding. aka the "Eat your vegetables" response.

Besides failing the "do one thing well" test, I think there was some goal to have only the g.region module be able to change the mapset's region. (other modules may change it internally for the duration of execution but that should not affect other modules or the $MAPSET/WIND file)

Hamish

comment:2 by epatton, 17 years ago

I wrote a shell script wrapper around 'r.in.xyz -s' specifically to set set the region bounds to the maximum extent of the input data. The script also lets you enter in the desired resolution for the g.region call. Anyone here could probably write a similar one in 5 minutes, which is why I never posted it before, but I could post it if anyone wants. Where I work with so much xyz data on a regular basis, it does save me a lot of "g.region n= s=", etc.

~ Eric.

comment:3 by glynn, 17 years ago

r.in.xyz shouldn't change the current region.

What it probably *should* do is to create a map whose size is based upon the input data, rather than using the current region. IOW, the way that other r.in.* commands work; unlike most r.* commands, the output from r.in.* commands isn't normally based upon the current region.

This would require two passes, which won't work with input=- if stdin is a pipe. But then the same is true of a script which runs r.in.xyz -s, g.region, r.in.xyz.

A script approach should definitely use either $WIND_OVERRIDE or $GRASS_REGION rather than modifying WIND.

On a related note, r.in.xyz should probably have some way to explicitly specify the bounds and resolution of the created map, either separate n=/s=/e=/w=/res= options and/or a region= option to use a named region.

comment:4 by hamish, 17 years ago

Glynn:

What it probably *should* do is to create a map whose size is based upon the input data, rather than using the current region.

..

On a related note, r.in.xyz should probably have some way to explicitly specify the bounds and resolution of the created map, either separate n=/s=/e=/w=/res= options and/or a region= option to use a named region.

I hear your point re. standardizing r.in.* modules, but r.in.xyz is not like other import modules and I do not think it should work that way. It is as much a binning filter as a data importer; the XYZ data is not a finished product to be loaded into the GIS without data loss. The module is by design a data aggregator, ie lossy. It is not loading the data raw & intact, it is processing it into something new.

here are some screenshots of a recent use:

http://bambi.otago.ac.nz/hamish/grass/r.in.xyz/saunders_track.jpg http://bambi.otago.ac.nz/hamish/grass/r.in.xyz/saunders_bathy.png (with thanks to Bob Covill for the idea)

XYZ from the ship's combined NMEA lat/lon+depth sounder is processed for an area of interest. In the image with the chart you can see the ship's track (red) continues away many miles from the survey site back to port. Loading in at rough resolution then using d.zoom to interactively set the bounds is way easier than trying to use d.where + manually set 'n= s= w= e='. And 'res=' is another thing altogether, but being able to use 'g.region res= -a' really helps in a way which would be a pain to calculate manually.

With a LIDAR study with 40GB of raw x,y,z ASCII results is probably more important to not try and load the entire spatial coverage in at once.

So I would argue that not using the full spatial extent of the XYZ data is the typical mode of operation for this module, and using d.zoom & 'g.region res= -a' after getting the rough coordinates with 'r.in.xyz -s' and then doing a rough first pass is the recommended workflow.

In summary, it's a data aggregator not just a data importer and I'm leaning to "won't fix".

Hamish

comment:5 by hamish, 17 years ago

Resolution: wontfix
Status: assignedclosed

On grass-dev Glynn suggested that perhaps it shouldn't be called "r.in.*" then as it acts differently to other import modules.

Well it does import data, it just aggregates it as it does that, so it's a hybrid. Thus I can argue it both ways depending on context ;)

Closing this wish as "wontfix" as the base module should not permanently change the region settings.

As doing that can be a logical part of the workflow, it would be nice if Eric uploaded his script to the wiki add-ons section or if it's just a few-liner job perhaps a new page about using r.in.xyz in the wiki somewhere or a new example in the help page?

Hamish

comment:6 by glynn, 17 years ago

Resolution: wontfix
Status: closedreopened

Re-opening. An option to create a map based upon the bounds of the data rather than the current region is (IMHO) a reasonable enhancement to r.in.xyz. It shouldn't need a separate script.

For enhancements, wontfix should be reserved for "this is a bad idea" (e.g. bloat) rather than simply "won't be implemented immediately".

comment:7 by hamish, 17 years ago

Glynn:

An option to create a map based upon the bounds of the data rather than the current region is (IMHO) a reasonable enhancement to r.in.xyz. It shouldn't need a separate script.

It would be possible to automatically set the bounds for the new map after scanning the data, but as there is no associated resolution in the data you have to use whatever the current region settings are, and then code in a 'g.region -a' approach or not. At which point before running the module for the final product you have to scan the data and run 'g.region res= -a' and spend a little time thinking about the region size anyway. Or if you don't want to code in -a (it's problematic enough in g.region already), be content with it automatically using some haywire resolution setting, which I am not. The bin size is a critical part of what the module does and fundamental to the meaning of the results.

Hamish

in reply to:  7 comment:8 by glynn, 17 years ago

Replying to hamish:

It would be possible to automatically set the bounds for the new map after scanning the data, but as there is no associated resolution in the data you have to use whatever the current region settings are, and then code in a 'g.region -a' approach or not.

That just means aligning the bounds to the existing grid (resolution and offset).

At which point before running the module for the final product you have to scan the data and run 'g.region res= -a' and spend a little time thinking about the region size anyway. Or if you don't want to code in -a (it's problematic enough in g.region already),

The only things that are problematic about g.region -a are:

  1. The existing behaviour is counterintuitive. That mostly derives

from the fact that you *might* be changing the resolution at the same time, in which case you have to choose the offset arbitrarily.

  1. Any change to the eixting behaviour creates an incompatibility with

previous versions.

An enhancement to r.in.xyz wouldn't have either of these problems. There is no reason to change the resolution, and there is no existing behaviour to remain compatible with.

comment:9 by neteler, 17 years ago

I post again to trac:

Real world example which will a problem for the average user:

GRASS 6.3.0svn (pat): > r.in.xyz d325095655/p325099657.txt
out=`basename d325095655/p325099657.txt` fs=space
Scanning data ...
Writing to map ...
 100%
r.in.xyz complete. 0 points found in region.

(yes, there are points in the file)

All GRASS import programs actually import the map without the need that the user defines the boundary. Here, the user has to preset the region (extra step) and then s/he can import.

Since we cannot change to much in GRASS 6, a flag to avoid this extra step would be nice, something like:

-e Scan data file and import complete map

I am asking because I have to process *many* maps. Sure, I can easily write a shell script for that but other users maybe not. Also for the sake of consistency... or rename the module which doesn't make much sense.

Markus

comment:10 by hamish, 17 years ago

I will write a demonstration r.in.xyz.auto script providing the desired functionality wrt bounds so we can test how other methods could work.

However, for this module picking your resolution (thus average number of input points per cell bin) is *absolutely fundamental* to results and must be chosen with care and supervision. This probably means a few trial 'r.in.xyz method=n' + 'r.univar n_map' steps to get it right. This is unknowable information from the data which must be supplied by the user, and the workflow is multi-step.

Solutions:

  • We could use the current region resolution with bounds taken from scanning step. In this case the user is 90% likely not to have considered the resolution first and will get bad results or a out-of-memory error as the resolution will be badly wrong. Also it means using a g.region step first anyway, so we're not really that much better off.
  • We could add a res= option for use with an optional auto-bounds setting flag. I'm not very excited about adding new options and flags, but it's a possibility. 'g.region -a' like code would need into the module and offset region would be impossible (s=25 n=75 res=50).
  • We could choose a default resolution that made the output like a 1000x1000 map. Lame.

I think that all these solution are to some extent poor and would be little more consistent with other r.in.* modules than the current way. In any case import from auto-scanned bounds should not be changed to be the default, and custom subregion and resolution must remain possible. In practice I have found that a cropping step is generally wanted, so it is useful to combine it with this first import step.

Re consistency we just have to note that the module works on the current region like other r. modules and not like other r.in. modules.

I would also note that this is all well discussed in the man page, with an example.

still unconvinced, Hamish

comment:12 by martinl, 15 years ago

Component: defaultRaster
CPU: Unspecified
Keywords: r.in.xyz added
Platform: Unspecified

comment:13 by martinl, 15 years ago

Milestone: 6.4.06.5.0

comment:14 by hamish, 12 years ago

Resolution: fixed
Status: reopenedclosed

script in addons for years.

Note: See TracTickets for help on using tickets.