Opened 6 years ago

Last modified 6 years ago

#3632 new enhancement

Add a function to read parameters from a file to the parser

Reported by: wenzeslaus
Owned by: grass-dev@…
Priority: normal
Milestone: 8.0.0
Component: Parser
Version: unspecified
Keywords: g.parser options parameters file long CLI
Cc:
CPU: Unspecified
Platform: Unspecified

Description

This is a suggestion to enhance the command line syntax parser to read the parameters from a file if specified.

Benefits:

  • A common approach for models with many parameters. (Many models outside of GRASS GIS parse a "config file" rather than using command line parameters and shell scripts.)
  • A universal solution for long command lines, including extremely long ones, which removes the need for file options such as the one in G7:r.series.

Questions and challenges:

  • Which format to use? (JSON, YAML, GRASS eval-friendly key-values, what g.parser uses, white-space separated generic command line style, ...)
  • Should we support multiple formats and decide based on file extension or content?
  • Are external libraries OK? We may need to add dependencies for both C and Python.
  • Is editing in the GUI needed? Is a "Save Parameters to a File" button enough, or is a Load button needed too?
  • Should the format be able to embed another file? (For example, including a color table. Kind of like what the GUI direct text input does.)
  • Can the file be incomplete and supplemented by the command line, or vice versa? What takes precedence if both a file and command line parameters are present? (Often the command line takes precedence over the file.)
  • Should some extra things be added to the file? Region? Location and mapset? (Probably out of scope for this ticket.)
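
One possible answer to the precedence question can be sketched in a few lines. This is only an illustration under stated assumptions, not a proposal for the actual parser code: the helper names split_args and merge_parameters are hypothetical, options are key=value strings, flags are single letters after one dash, and the command line overrides the file.

```python
def split_args(args):
    """Split arguments into an options dict and a set of single-letter flags."""
    options, flags = {}, set()
    for arg in args:
        if "=" in arg:
            # Split on the first '=' only, so values may contain '='.
            key, value = arg.split("=", 1)
            options[key] = value
        elif arg.startswith("-") and not arg.startswith("--"):
            # '-ng' contributes the flags 'n' and 'g'.
            flags.update(arg[1:])
    return options, flags


def merge_parameters(file_args, cli_args):
    """Combine file and command line arguments; the command line wins."""
    file_options, file_flags = split_args(file_args)
    cli_options, cli_flags = split_args(cli_args)
    # Later entries override earlier ones, so command line options replace
    # options of the same name that came from the file.
    options = {**file_options, **cli_options}
    # Flags are simply combined (assumption: a flag cannot be unset).
    return options, file_flags | cli_flags


file_args = ["elevation=dtm", "slope=dtm_slope", "-n"]
cli_args = ["slope=other_slope", "aspect=dtm_aspect"]
options, flags = merge_parameters(file_args, cli_args)
print(options)
print(sorted(flags))
```

In this sketch the file may be incomplete (aspect comes only from the command line) and the command line may override it (slope). Whether a flag set in the file should be removable from the command line is left open.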

The usual way:

> r.slope.aspect elevation=dtm slope=dtm_slope aspect=dtm_aspect -n

Using a file:

> cat params.txt
elevation=dtm
slope=dtm_slope
aspect=dtm_aspect
-n
> r.slope.aspect --parameter-file=params.txt
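
To make the idea concrete, here is a minimal sketch of how such a file could be turned into the same arguments the parser receives today. It is written in Python for brevity and rests on assumptions that this ticket does not settle: the function name read_parameter_file is hypothetical, each line holds one key=value pair or one flag, blank lines are skipped, and lines starting with # are comments.

```python
def read_parameter_file(lines):
    """Turn parameter-file lines into a list of command line style arguments.

    Assumptions (not decided in this ticket): one key=value pair or one
    flag per line, blank lines are skipped, and lines starting with '#'
    are treated as comments.
    """
    args = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        args.append(line)
    return args


example = """\
# parameters for r.slope.aspect
elevation=dtm
slope=dtm_slope
aspect=dtm_aspect
-n
"""
args = read_parameter_file(example.splitlines())
print(["r.slope.aspect"] + args)
```

The result is the argument list of the usual command line call, so the rest of the parser would not need to know where the parameters came from.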

Interesting? Useful? Terrible? ... Let me know.

Change History (5)

comment:1 by mlennert, 6 years ago

As much as I've found the file option for r.series et al very useful, I cannot say that I have been confronted with situations where I felt a need for your proposed approach. I definitely would not want it to replace the file option.

If we go for such a parameter file, I would suggest to keep it very simple, i.e. one format, with my preference going to the one used in your example with a simple parameter=value pair per line, maybe with a special treatment of flags to create something like in the python parser, i.e. with the possibility to cite several flags at once:

[...]
flags=ng

instead of

[...]
-n
-g

in reply to:  1 ; comment:2 by mmetz, 6 years ago

Replying to mlennert:

As much as I've found the file option for r.series et al very useful, I cannot say that I have been confronted with situations where I felt a need for your proposed approach.

+1

I definitely would not want it to replace the file option.

+1

If we go for such a parameter file, I would suggest to keep it very simple, i.e. one format, with my preference going to the one used in your example with a simple parameter=value pair per line, maybe with a special treatment of flags to create something like in the python parser, i.e. with the possibility to cite several flags at once:

[...]
flags=ng

instead of

[...]
-n
-g

or simply:

-ng

the parser already handles something like -ng
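
The three spellings of flags mentioned here could all be reduced to the same result when reading a parameter file. The following is a rough sketch only: collect_flags is a hypothetical helper, not an existing parser function, and it assumes single-letter flags.

```python
def collect_flags(lines):
    """Collect single-letter flags from parameter-file lines.

    Accepts 'flags=ng', '-ng', and separate '-n' / '-g' lines.
    Long flags such as '--overwrite' and ordinary key=value options
    are ignored by this helper.
    """
    flags = []
    for line in lines:
        line = line.strip()
        if line.startswith("flags="):
            letters = line[len("flags="):]
        elif line.startswith("-") and not line.startswith("--") and "=" not in line:
            letters = line[1:]
        else:
            continue
        for letter in letters:
            # Keep the first occurrence of each flag, in order.
            if letter not in flags:
                flags.append(letter)
    return flags


print(collect_flags(["flags=ng"]))
print(collect_flags(["-n", "-g"]))
print(collect_flags(["-ng"]))
```

All three calls yield the same list of flags, so the choice between the spellings is mostly a matter of taste for the file format.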

Using the example

r.slope.aspect elevation=dtm slope=dtm_slope aspect=dtm_aspect -n

what is the advantage of the proposed approach over a file that contains exactly this line and executing this file? This is already working and handled by the OS.

in reply to:  2 ; comment:3 by wenzeslaus, 6 years ago

Replying to mmetz:

Using the example

r.slope.aspect elevation=dtm slope=dtm_slope aspect=dtm_aspect -n

what is the advantage of the proposed approach over a file that contains exactly this line and executing this file? This is already working and handled by the OS.

I'm thinking about these three points:

1) There is no command line length limitation as the line or lines are processed directly by the parser.

When I have a long r.series input:

> r.series in=map1900,map1901,...,map2100 out=slope meth=slope

I need to switch to file instead of input.

> cat input.txt
map1900
map1901
...
map2100
> r.series file=input.txt out=slope meth=slope

That, of course, assumes that the file option was implemented for the module. With a parameter file, I would switch to that instead, while still using input.

> cat params.txt
in=map1900,map1901,...,map2100
out=slope
meth=slope
> r.series --parameter-file=params.txt

2) The scripting is replaced with configuration. This "command line scripting" is OS-dependent. You want a short line, but backslashes won't work in the GUI Console (and on MS Win?). Escaping is done in different ways. In other words, the format is actually not well defined, so unless you already know "command line scripting" on your OS, this will be cumbersome. Another syntax-related issue is comments, which again could have a clearly defined syntax (# in Bash versus REM in CMD).

So the following way of storing the parameters as a command with indentation and a backslash, which works well in a unix-like command line but not necessarily elsewhere (which is something we need to explain to the user),

r.series input=map2001,map2002,dummy,dummy,map2005,map2006,dummy,map2008 \
         output=res_slope,res_offset,res_coeff method=slope,offset,detcoeff

would become, e.g., the following YAML file:

input:
  - map2001
  - map2002
  - dummy
  - dummy
  - map2005
  - map2006
  - dummy
  - map2008
output:
  - res_slope
  - res_offset
  - res_coeff
method:
  - slope
  - offset
  - detcoeff
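
Going from such a structure back to what the parser expects is straightforward. The sketch below assumes the YAML has already been loaded into a Python mapping (a YAML library such as PyYAML with yaml.safe_load could do that; the dict is written by hand here to avoid the dependency), and to_module_args is a hypothetical helper name.

```python
def to_module_args(params):
    """Convert a mapping (as a YAML loader would produce) to key=value arguments.

    Lists become comma-separated values, which is how multiple values
    are written on the command line.
    """
    args = []
    for key, value in params.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(str(item) for item in value)
        args.append(f"{key}={value}")
    return args


# Hand-written equivalent of the YAML file above.
params = {
    "input": ["map2001", "map2002", "dummy", "dummy",
              "map2005", "map2006", "dummy", "map2008"],
    "output": ["res_slope", "res_offset", "res_coeff"],
    "method": ["slope", "offset", "detcoeff"],
}
command = " ".join(["r.series"] + to_module_args(params))
print(command)
```

The printed line is the same r.series call as the backslash version, just on one line.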

Here the advantage is for modules which are implementing some model/simulation which usually have a lot of parameters, e.g. G7:r.sim.water (which has over 20 parameters) or G7:r.topmodel (which actually requires a "parameters file").

3) This untangles the module from its parameters (splits the "command" into module name and parameters). This brings additional questions such as: Should we extend the format by adding a module name or multiple modules and then read that using the grass command, effectively creating a new API (similar to, e.g., PDAL JSON pipelines)? However, what I'm thinking about now is the advantage of storing the parameters separately from the command itself and then reusing them repeatedly (in an interactive command line or a script) while being able to override or complete the parameters when needed. You can do the same with enough Python, but this would be native.

in reply to:  1 comment:4 by wenzeslaus, 6 years ago

Replying to mlennert:

I definitely would not want it to replace the file option.

I don't know whether to replace it or not. The thing is that we are adding the file option to more and more modules. If the parameter file is available, then not only all modules but also all their options have a solution for the length limit of command line arguments. That doesn't rule out the file option for convenience and alternative syntax, but you don't need to implement it just because you need it once.

in reply to:  3 comment:5 by mmetz, 6 years ago

Replying to wenzeslaus:

Replying to mmetz:

Using the example

r.slope.aspect elevation=dtm slope=dtm_slope aspect=dtm_aspect -n

what is the advantage of the proposed approach over a file that contains exactly this line and executing this file? This is already working and handled by the OS.

I'm thinking about these three points:

1) There is no command line length limitation as the line or lines are processed directly by the parser.

When I have a long r.series input:

> r.series in=map1900,map1901,...,map2100 out=slope meth=slope

I need to switch to file instead of input.

> cat input.txt
map1900
map1901
...
map2100
> r.series file=input.txt out=slope meth=slope

That, of course, assumes that the file option was implemented for the module. With a parameter file, I would switch to that instead, while still using input.

You need to switch in any case. It is easier to create a list of input names (g.list ... output=mylist) than to create a parameter file.

2) The scripting is replaced with configuration. This "command line scripting" is OS-dependent. You want a short line, but backslashes won't work in the GUI Console (and on MS Win?). Escaping is done in different ways.

I am referring to your example

r.slope.aspect elevation=dtm slope=dtm_slope aspect=dtm_aspect -n

which works on any OS.

In other words, the format is actually not well defined, so unless you already know "command line scripting" on your OS, this will be cumbersome. Another syntax-related issue is comments, which again could have a clearly defined syntax (# in Bash versus REM in CMD).

Scripting is, of course, OS-dependent and not related to the proposed parameter file option.

So the following way of storing the parameters as a command with indentation and a backslash, which works well in a unix-like command line but not necessarily elsewhere (which is something we need to explain to the user),

I am referring to your example

r.slope.aspect elevation=dtm slope=dtm_slope aspect=dtm_aspect -n

no indentation or backslash

3) This untangles the module from its parameters (splits the "command" into module name and parameters).

The history would no longer make sense because the parameters of the called modules are no longer recorded in history. In the meantime, the parameter files could be altered, moved, or deleted.

In this context, calling a module and creating a script for a specific OS must not be mixed.
