Opened 6 years ago
Last modified 6 years ago
#3632 new enhancement
Add a function to read parameters from a file to the parser
Reported by: | wenzeslaus | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | 8.0.0 |
Component: | Parser | Version: | unspecified |
Keywords: | g.parser options parameters file long CLI | Cc: | |
CPU: | Unspecified | Platform: | Unspecified |
Description
This is a suggestion to enhance the command line syntax parser to read the parameters from a file if specified.
Benefits:
- A common way for models with many parameters. (Many models (outside of GRASS GIS) parse a "config file" rather than using command line parameters and shell scripts.)
- Universal solution for long command lines, including the extremely long ones, removing the need for the file options such as the one in G7:r.series.
Questions and challenges:
- Which format to use? (JSON, YAML, GRASS eval-friendly key-values, what g.parser uses, white-space separated generic command line style, ...)
- Should we support multiple formats and decide based on file extension or content?
- Are external libraries OK? We may need to add dependencies for both C and Python.
- Is editing in GUI needed? Is "Save Parameters to a File" button enough? Load button needed too?
- Should the format be able to embed another file? (For example, including color table. Kind of like what GUI direct text input does.)
- Can the file be incomplete and supplied in the command line or vice versa? What takes precedence if both file and command line present? (Often it's the command line over file.)
- Should some extra things be added to the file? Region? Location and mapset? (Probably out of scope for this ticket.)
The usual way:
> r.slope.aspect elevation=dtm slope=dtm_slope aspect=dtm_aspect -n
Using a file:
> cat params.txt
elevation=dtm
slope=dtm_slope
aspect=dtm_aspect
-n
> r.slope.aspect --parameter-file=params.txt
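As a rough illustration of the file-based call above, the following Python sketch turns the lines of such a parameter file into the argument list the parser would otherwise receive on the command line. The function names, the `#` comment syntax, and the skipping of blank lines are assumptions made for this example; `--parameter-file` is only the option name proposed in this ticket, not an existing parser feature.

```python
def parse_parameter_lines(lines):
    """Return command line arguments for the lines of a (hypothetical) parameter file."""
    args = []
    for raw in lines:
        line = raw.strip()
        # Assumption: blank lines and lines starting with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        # Each remaining line is one argument: either key=value or a flag such as -n.
        args.append(line)
    return args


def build_command(module, lines):
    """Prepend the module name to the arguments read from the parameter file."""
    return [module] + parse_parameter_lines(lines)


example = """\
# parameters for r.slope.aspect
elevation=dtm
slope=dtm_slope
aspect=dtm_aspect
-n
"""

command = build_command("r.slope.aspect", example.splitlines())
print(" ".join(command))
```

Because the lines are read directly rather than passed through a shell, nothing in this sketch depends on the operating system's command line length limit or quoting rules.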
Interesting? Useful? Terrible? ... Let me know.
Change History (5)
follow-ups: 2 4 comment:1 by , 6 years ago
follow-up: 3 comment:2 by , 6 years ago
Replying to mlennert:
As much as I've found the file option for r.series et al very useful, I cannot say that I have been confronted with situations where I felt a need for your proposed approach.
+1
I definitely would not want it to replace the file option.
+1
If we go for such a parameter file, I would suggest to keep it very simple, i.e. one format, with my preference going to the one used in your example with a simple parameter=value pair per line, maybe with a special treatment of flags to create something like in the python parser, i.e. with the possibility to cite several flags at once:
[...] flags=ng
instead of
[...] -n -g
or simply:
-ng
the parser already handles something like -ng
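To make the three spellings of flags discussed here concrete, the following Python sketch reduces `flags=ng`, `-n -g`, and `-ng` to the same set of flag letters. It is only an illustration: the function name is hypothetical, and it does not reproduce how the actual parser treats flags.

```python
def collect_flags(tokens):
    """Collect single-letter flags from tokens written as flags=ng, -n -g, or -ng."""
    flags = []
    for token in tokens:
        if token.startswith("flags="):
            # Python-parser-like form: flags=ng
            letters = token[len("flags="):]
        elif token.startswith("-") and not token.startswith("--"):
            # Command-line form: -n, -g, or combined -ng
            letters = token[1:]
        else:
            # key=value options and long (--) flags are not handled in this sketch.
            continue
        for letter in letters:
            if letter not in flags:
                flags.append(letter)
    return "".join(flags)


print(collect_flags(["flags=ng"]))
print(collect_flags(["-n", "-g"]))
print(collect_flags(["elevation=dtm", "-ng"]))
```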
Using the example
r.slope.aspect elevation=dtm slope=dtm_slope aspect=dtm_aspect -n
what is the advantage of the proposed approach over a file that contains exactly this line and executing this file? This is already working and handled by the OS.
follow-up: 5 comment:3 by , 6 years ago
Replying to mmetz:
Using the example
r.slope.aspect elevation=dtm slope=dtm_slope aspect=dtm_aspect -n
what is the advantage of the proposed approach over a file that contains exactly this line and executing this file? This is already working and handled by the OS.
I'm thinking about these three points:
1) There is no command line length limitation as the line or lines are processed directly by the parser.
When I have a long r.series input:
> r.series in=map1900,map1901,...,map2100 out=slope meth=slope
I need to switch to file instead of input.
> cat input.txt
map1900
map1901
...
map2100
> r.series file=input.txt out=slope meth=slope
That of course assumes that file was implemented. In case of having the parameter file, I need to switch to that while still using input.
> cat params.txt
in=map1900,map1901,...,map2100
out=slope
meth=slope
> r.series --parameter-file=params.txt
2) The scripting is replaced with configuration. This "command line scripting" is OS-dependent. You want a short line, but backslashes won't work in GUI Console (and on MS Win?). Escaping is done in different ways. In other words, the format is actually not well defined, so unless you already know "command line scripting" on your OS, this will be cumbersome. Another syntax-related issue is comments, which again could have a clearly defined syntax (# in Bash versus REM in CMD).
So the following way of storing the parameters as a command with indentation and a backslash, which works well in a unix-like command line but not necessarily elsewhere (which is something we need to explain to the user),
r.series input=map2001,map2002,dummy,dummy,map2005,map2006,dummy,map2008 \
    output=res_slope,res_offset,res_coeff method=slope,offset,detcoeff
would become, e.g., the following YAML file:
input:
  - map2001
  - map2002
  - dummy
  - dummy
  - map2005
  - map2006
  - dummy
  - map2008
output:
  - res_slope
  - res_offset
  - res_coeff
method:
  - slope
  - offset
  - detcoeff
Here the advantage is for modules which are implementing some model/simulation which usually have a lot of parameters, e.g. G7:r.sim.water (which has over 20 parameters) or G7:r.topmodel (which actually requires a "parameters file").
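The relation between the YAML example and the usual key=value syntax can be sketched in Python as follows. To stay free of external dependencies, the parsed file is represented by a plain dictionary rather than read with a YAML library; the function name is hypothetical, and joining list values with commas is an assumption based on the r.series command shown above.

```python
def mapping_to_arguments(params):
    """Convert a parsed configuration mapping into key=value arguments."""
    args = []
    for key, value in params.items():
        # Lists become comma-separated values, as in input=map2001,map2002,...
        if isinstance(value, (list, tuple)):
            value = ",".join(str(item) for item in value)
        args.append(f"{key}={value}")
    return args


# The content of the YAML example, written as the dictionary a YAML reader would produce.
params = {
    "input": ["map2001", "map2002", "dummy", "dummy",
              "map2005", "map2006", "dummy", "map2008"],
    "output": ["res_slope", "res_offset", "res_coeff"],
    "method": ["slope", "offset", "detcoeff"],
}

arguments = mapping_to_arguments(params)
print(" ".join(["r.series"] + arguments))
```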
3) This untangles the module from its parameters (splits the "command" into module name and parameters). This brings additional questions such as: Should we extend the format by adding a module name or multiple modules and then read that using the grass command, effectively creating a new API (similarly to e.g. PDAL JSON pipelines)? However, what I'm thinking about now is the advantage of storing the parameters separately from the command itself and then reusing them repeatedly (in an interactive command line or a script) while being able to override or complete the parameters when needed. You can do the same with enough Python, but this would be native.
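A minimal Python sketch of the "override or complete" idea, assuming the precedence floated in the ticket description (command line over file): options given on the command line replace those from the stored parameters, new ones are added, and flags from both sources are combined. The function names and the merge rules are assumptions for illustration, not a description of existing parser behavior.

```python
def split_arguments(args):
    """Split arguments into a dict of key=value options and a set of flag letters."""
    options = {}
    flags = set()
    for arg in args:
        if arg.startswith("--"):
            # Long flags and options are ignored in this sketch.
            continue
        if "=" in arg:
            key, value = arg.split("=", 1)
            options[key] = value
        elif arg.startswith("-"):
            flags.update(arg[1:])
    return options, flags


def merge_arguments(file_args, cli_args):
    """Merge stored and command line arguments; the command line takes precedence."""
    file_options, file_flags = split_arguments(file_args)
    cli_options, cli_flags = split_arguments(cli_args)
    options = {**file_options, **cli_options}
    flags = file_flags | cli_flags
    merged = [f"{key}={value}" for key, value in options.items()]
    if flags:
        merged.append("-" + "".join(sorted(flags)))
    return merged


stored = ["elevation=dtm", "slope=dtm_slope", "aspect=dtm_aspect", "-n"]
merged = merge_arguments(stored, ["slope=dtm_slope_v2", "pcurvature=dtm_pcurv"])
print(" ".join(["r.slope.aspect"] + merged))
```

Here the stored `slope` value is overridden and `pcurvature` completes the set, while the `-n` flag from the stored parameters is kept.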
comment:4 by , 6 years ago
Replying to mlennert:
I definitely would not want it to replace the file option.
I don't know whether to replace it or not. The thing is that we are adding the file option to more and more modules. If the parameter file is available, then not only all modules but all their options too have a solution for the length limit of the command line arguments. That doesn't rule out the file option for convenience and alternative syntax, but you don't need to implement it just because you need it once.
comment:5 by , 6 years ago
Replying to wenzeslaus:
Replying to mmetz:
Using the example
r.slope.aspect elevation=dtm slope=dtm_slope aspect=dtm_aspect -n
what is the advantage of the proposed approach over a file that contains exactly this line and executing this file? This is already working and handled by the OS.
I'm thinking about these three points:
1) There is no command line length limitation as the line or lines are processed directly by the parser.
When I have a long r.series input:
> r.series in=map1900,map1901,...,map2100 out=slope meth=slope
I need to switch to file instead of input.
> cat input.txt
map1900
map1901
...
map2100
> r.series file=input.txt out=slope meth=slope
That of course assumes that file was implemented. In case of having the parameter file, I need to switch to that while still using input.
You need to switch in any case. It is easier to create a list of input names (g.list ... output=mylist) than to create a parameter file.
2) The scripting is replaced with configuration. This "command line scripting" is OS-dependent. You want a short line, but backslashes won't work in GUI Console (and on MS Win?). Escaping is done in different ways.
I am referring to your example
r.slope.aspect elevation=dtm slope=dtm_slope aspect=dtm_aspect -n
which works on any OS.
In other words, the format is actually not well defined, so unless you already know "command line scripting" on your OS, this will be cumbersome. Another syntax-related issue is comments, which again could have a clearly defined syntax (# in Bash versus REM in CMD).
Scripting is of course OS-dependent and not related to the proposed parameter file option.
So the following way of storing the parameters as a command with indentation and a backslash, which works well in a unix-like command line but not necessarily elsewhere (which is something we need to explain to the user),
I am referring to your example
r.slope.aspect elevation=dtm slope=dtm_slope aspect=dtm_aspect -n
no indentation or backslash
3) This untangles the module from its parameters (splits the "command" into module name and parameters).
The history would no longer make sense because the parameters of the called modules are no longer recorded in history. In the meantime, the parameter files could be altered, moved, or deleted.
In this context, calling a module and creating a script for a specific OS must not be mixed.
As much as I've found the file option for r.series et al very useful, I cannot say that I have been confronted with situations where I felt a need for your proposed approach. I definitely would not want it to replace the file option.
If we go for such a parameter file, I would suggest to keep it very simple, i.e. one format, with my preference going to the one used in your example with a simple parameter=value pair per line, maybe with a special treatment of flags to create something like in the python parser, i.e. with the possibility to cite several flags at once:
[...] flags=ng
instead of
[...] -n -g