Opened 6 years ago

Last modified 5 years ago

#3639 new enhancement

Write table like output directly to database

Reported by: sbl Owned by: grass-dev@…
Priority: normal Milestone: 8.0.0
Component: Database Version: svn-trunk
Keywords: table, database, write Cc:
CPU: Unspecified Platform: Unspecified

Description

It would be useful to be able to write table-like output from modules like v.distance, r.stats, r.category, r.what, directly into a table in the database...

(see also: #3638)

Change History (2)

comment:1 by mlennert, 6 years ago

In v.distance this is already available. I don't know whether that solution is sufficiently general, or whether the outputs generated by the modules are sufficiently similar, to treat v.distance's code as boilerplate that can be integrated elsewhere. One issue is that column types need to be defined correctly. For the raster modules, those will have to be determined from the input raster maps.
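As a rough illustration of the column type question for raster modules, here is a minimal sketch that maps the three GRASS raster data types (CELL, FCELL, DCELL, as reported for example by `r.info -g`) to SQL column types. The helper names and the chosen SQL types are assumptions made for illustration only; they are not taken from v.distance or from the GRASS libraries.

```python
# Hypothetical mapping, for illustration only: GRASS raster data type
# name -> SQL column type. The SQL types are an assumption and may need
# adjusting for the database backend in use.
RASTER_TO_SQL_TYPE = {
    "CELL": "INTEGER",            # integer raster
    "FCELL": "REAL",              # single precision floating point raster
    "DCELL": "DOUBLE PRECISION",  # double precision floating point raster
}


def sql_column_type(datatype):
    """Return an SQL column type for a GRASS raster data type name."""
    try:
        return RASTER_TO_SQL_TYPE[datatype.upper()]
    except KeyError:
        raise ValueError("Unknown raster data type: {}".format(datatype))


def column_definitions(raster_types):
    """Build a column definition string from {column name: raster data type}."""
    return ", ".join(
        "{} {}".format(name, sql_column_type(datatype))
        for name, datatype in raster_types.items()
    )
```

For example, `column_definitions({"elevation": "FCELL", "landuse": "CELL"})` gives a string that could be used in a CREATE TABLE statement.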

Moritz

comment:2 by sbl, 5 years ago

I am considering adding a function to the Python scripting library for parsing table-like stdout from GRASS modules into NumPy arrays. Nothing complicated, but probably a convenient wrapper function!?

See:

def stdout2numpy(stdout=None, sep=',', names=None, null_value=None,
                 fill_value=None, comments='#', usecols=None):
    """Read table-like output from GRASS modules as a NumPy array;
    format instructions are handed down to NumPy's genfromtxt function

    :param str|bytes stdout: tabular stdout from GRASS GIS module call
    :param str sep: Separator delimiting columns
    :param list names: List of strings with names for columns
    :param str null_value: Characters representing the no-data value
    :param str fill_value: Value to fill no-data with
    :param str comments: Character that marks the start of a comment
    :param list usecols: List of columns to import
    """
    import numpy as np
    from io import BytesIO
    import grass.script as gscript

    # genfromtxt reads from a file-like object, so wrap the output as bytes
    if isinstance(stdout, str):
        stdout = gscript.encode(stdout)
    elif not isinstance(stdout, bytes):
        gscript.fatal(_('Unsupported data type'))
    # dtype=None lets genfromtxt determine the type of each column
    np_array = np.genfromtxt(BytesIO(stdout),
                             missing_values=null_value,
                             filling_values=fill_value,
                             usecols=usecols,
                             names=names,
                             comments=comments,
                             dtype=None, delimiter=sep)
    return np_array

or alternatively as np_parse_command (equivalent to parse_command()):

def np_parse_command(*args, **kwargs):
    """Passes all arguments to read_command, then parses the output
    using NumPy's genfromtxt() function to generate a NumPy array from
    table-like output.

    Parsing options for NumPy's genfromtxt() function can
    optionally be given in a *parse* dictionary, e.g.

    ::

        np_parse_command(..., parse={'delimiter': '|'})

    As far as possible, at least standard parser options in GRASS
    commands are handed down and translated to genfromtxt()
    automatically, e.g. separator (GRASS) -> delimiter (NumPy).
    """
    import numpy as np
    from io import BytesIO
    import grass.script as gscript

    # Take the parser options out of kwargs before the call, so that
    # they are not passed on to the GRASS module as a parameter
    parse = dict(kwargs.pop('parse', None) or {})
    stdout = gscript.read_command(*args, **kwargs)
    if isinstance(stdout, str):
        stdout = gscript.encode(stdout)
    elif not isinstance(stdout, bytes):
        gscript.fatal(_('Unsupported data type'))

    # Defaults for options not given in the parse dictionary
    parse.setdefault('filling_values', np.nan)
    parse.setdefault('usecols', None)
    parse.setdefault('names', None)
    parse.setdefault('dtype', None)

    # Check if separator is specified in GRASS command (module parameters
    # are only available as keyword arguments). Separator names such as
    # 'pipe' or 'comma' would still need translating to a character.
    if 'delimiter' not in parse:
        parse['delimiter'] = kwargs.get('separator', ',')

    # Check if null_value is specified in GRASS command
    if 'missing_values' not in parse:
        parse['missing_values'] = kwargs.get('null_value', '*')

    np_array = np.genfromtxt(BytesIO(stdout), **parse)
    return np_array

That would probably be a first step for a Python function that can write output to DB...
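To make the idea of writing to a database more concrete, here is a minimal sketch that uses only the Python standard library (csv and sqlite3) rather than NumPy or the GRASS DB drivers. The function name `table_text_to_sqlite`, its parameters, and the choice of SQLite are all assumptions for illustration; column types are passed in explicitly instead of being detected.

```python
import csv
import io
import sqlite3


def table_text_to_sqlite(text, connection, table, columns,
                         sep="|", null_value="*"):
    """Write separator-delimited text into a new SQLite table.

    Hypothetical helper, not part of the GRASS scripting library.
    columns is a list of (column name, SQL type) tuples.
    Returns the number of rows written.
    """
    definition = ", ".join(
        '"{}" {}'.format(name, sql_type) for name, sql_type in columns
    )
    connection.execute('CREATE TABLE "{}" ({})'.format(table, definition))

    placeholders = ", ".join("?" for _column in columns)
    insert = 'INSERT INTO "{}" VALUES ({})'.format(table, placeholders)

    reader = csv.reader(io.StringIO(text), delimiter=sep)
    # Replace the no-data marker with None (SQL NULL) and skip empty lines
    rows = [
        [None if value == null_value else value for value in row]
        for row in reader
        if row
    ]
    connection.executemany(insert, rows)
    connection.commit()
    return len(rows)
```

Called with the text output of a module, an open `sqlite3` connection, and a column list such as `[("cat", "INTEGER"), ("value", "REAL")]`, this creates the table and inserts one row per output line. The values are inserted as text and converted by SQLite according to the declared column types.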

Any thoughts/opinions?
