Opened 14 years ago

Closed 12 years ago

Last modified 11 years ago

#1109 closed enhancement (fixed)

g.mlist functionality extension: return e.g. $prefix.001 to $prefix.031 out of $prefix.365 maps

Reported by: nikos
Owned by: grass-dev@…
Priority: minor
Milestone:
Component: Default
Version: unspecified
Keywords: g.mlist, pattern,
Cc:
CPU: Unspecified
Platform: Unspecified

Description

A problem:

365 maps were produced with an r.sun script, all sharing a common prefix. The goal is to extract average values per month [i.e. (1+2+...+31)/31, (32+33+...+59)/28, etc.]. r.series can do this, but typing 31, 28, ... map names by hand to feed "input=" should, naturally, be avoided.

Figuring out a wildcard pattern or regular expression that picks e.g. only maps 001 to 031 out of 365 (all with the same prefix, of course) is not easy (I think). The usual trick of a "[0-something]" range per character also returns maps >031. My understanding is that g.mlist is not "smart" enough to solve this quickly either.
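
To see why the per-character range trick overshoots, here is a minimal sketch that counts how many of the zero-padded suffixes 001 to 365 match the glob 0[0-3][0-9]. It uses only the shell's own pattern matching, not g.mlist, and the pattern itself is an illustrative assumption about what "the usual trick" would look like.

```shell
#!/bin/sh
# Count how many of the suffixes 001..365 a per-character glob matches.
count=0
day=1
while [ "$day" -le 365 ]; do
    name=$(printf '%03d' "$day")
    case "$name" in
        0[0-3][0-9]) count=$((count + 1)) ;;   # intended to select 001..031
    esac
    day=$((day + 1))
done
echo "$count"
```

The pattern accepts 001 through 039, so 39 names come back instead of the 31 wanted for January.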

A wish:

How about adding such functionality? Given many maps with a common prefix, the user could set a $BEGIN and an $END number, and g.mlist would return those maps (counting from $BEGIN and adding every existing numbered map up to $END), joined with the separator the user has set.

Change History (13)

comment:1 by hamish, 14 years ago

Try adding seq to your script. It's a really handy little command.

MAPS=""
for i in `seq 32 59` ; do
  i_str=`echo $i | awk '{printf("%03d", $1)}'`
  if [ -n "$MAPS" ] ; then
     MAPS="$MAPS,radmap.$i_str"
  else
     MAPS="radmap.$i_str"
  fi
done

You can add another loop around that, with expr and "31 28 31...", to set the $BEGIN and $END values for seq automatically.

Hamish
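
A minimal sketch of such an outer loop, assuming a non-leap year and using POSIX shell arithmetic in place of expr. The variable names begin, end and ranges are illustrative only and do not come from the comment above.

```shell
#!/bin/sh
# Derive the first and last day-of-year of each month from the month lengths.
begin=1
ranges=""
for len in 31 28 31 30 31 30 31 31 30 31 30 31; do
    end=$((begin + len - 1))
    # ${ranges:+$ranges } adds a space only when ranges is non-empty
    ranges="${ranges:+$ranges }$begin-$end"
    begin=$((end + 1))
done
echo "$ranges"
```

Each begin/end pair could then be handed to seq in the inner loop.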

comment:2 by hamish, 14 years ago

Even easier, you can avoid the awk call for formatting:

  seq -f '%03g' 32 59

comment:3 by hamish, 14 years ago

... thus:

   seq -f '%03g' 32 59 | sed -e 's/^/diffuse./' | tr '\n' ','
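
Note that the pipeline above leaves a trailing comma after the last map name. One possible variation, sketched here on the assumption that the POSIX paste utility is available, joins the lines with commas and adds no trailing separator.

```shell
#!/bin/sh
# Same name generation as above; paste -s joins all lines with the delimiter
# given by -d and does not append one after the last line.
list=$(seq -f '%03g' 32 59 | sed -e 's/^/diffuse./' | paste -s -d, -)
echo "$list"
```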

comment:4 by nikos, 14 years ago

Thank you Hamish. I'll have to try that one. I found my way around as follows:

# day-of-year ranges per month (non-leap year)
# PREFIX, SEP and STEP are assumed to be set earlier in the script
for DAYS in "1 31 jan" "32 59 feb" "60 90 mar" \
             "91 120 apr" "121 151 may" "152 181 jun" \
            "182 212 jul" "213 243 aug" "244 273 sep" \
            "274 304 oct" "305 334 nov" "335 365 dec"

    # parse "${DAYS}" and set positional parameters
    do
     set -- $DAYS ; echo $1 $2 $3

    # loop over asked period of days
    for DAY in `seq $1 $STEP $2` ; do

        # print leading 0's
        DAY_STR=`echo $DAY | awk '{printf("%.03d", $1)}'`
        echo "${DAY_STR}"

        # separate map names of interest with $SEP
        DAY_STR2=`g.mlist rast pat="$PREFIX${DAY_STR}" sep="$SEP"`
        echo "${DAY_STR2}"
        
        # store in a list
        echo "${DAY_STR2}" | tr '\n' "," >> "${PREFIX}${3}.TEMP.LIST"

    done

    # remove last "orphan" separator
    cat "${PREFIX}${3}.TEMP.LIST" | \
        sed 's/.$//' >> "${PREFIX}${3}.LIST"

    # remove temp file
    rm "${PREFIX}${3}.TEMP.LIST"


    # average per month
    r.series input=`cat "${PREFIX}${3}.LIST"` \
             out="${PREFIX}${3}" method=average

    # remove lists
    rm "${PREFIX}${3}.LIST"

done

(At first sight: what do you think about the above? Any drawbacks?)
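
For comparison, a possible variation without temporary files, offered only as a sketch. It assumes that every daily map exists (it does not query g.mlist as the script above does), uses a hypothetical prefix "radmap.", and prints the r.series command as a dry run rather than executing it. The helper name month_list is made up for this example.

```shell
#!/bin/sh
# Hypothetical prefix; replace with the prefix of your own r.sun maps.
PREFIX="radmap."

# month_list BEGIN END: print PREFIX<BEGIN>..PREFIX<END>, zero-padded to
# three digits and separated by commas.
month_list() {
    day=$1
    list=""
    while [ "$day" -le "$2" ]; do
        # ${list:+$list,} adds a comma only when the list is non-empty
        list="${list:+$list,}$(printf '%s%03d' "$PREFIX" "$day")"
        day=$((day + 1))
    done
    printf '%s\n' "$list"
}

for DAYS in "1 31 jan" "32 59 feb" "60 90 mar"; do
    set -- $DAYS
    # Dry run: print the command instead of running it.
    echo "r.series input=$(month_list "$1" "$2") output=${PREFIX}$3 method=average"
done
```

Once the printed commands look right, the echo can be dropped inside a GRASS session.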

comment:5 by nikos, 14 years ago

(The multiple echo calls were added for checking, while it was still not working.)

comment:6 by hamish, 14 years ago

As long as it works and is understandable by you, I wouldn't be too concerned with how elegant the outer loops are. That's not the computationally expensive part.

If your map names are longer than ~10 chars in total, beware that the maximum command line length may be 4096 chars on some OSs. So it may be better, e.g., to sum months to get the full year instead of summing up 365 maps directly. (Although, if you can, it might be interesting to compare the results of doing it both ways; in the past I got two slightly different answers doing that, which confused me.)

Hamish
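
The arithmetic behind the "~10 chars" figure can be checked with a short sketch. The prefix "radmap." is hypothetical (7 characters plus a 3-digit suffix gives 10-character names), and the 4096 limit is simply the number quoted in the comment above, not a value queried from the system.

```shell
#!/bin/sh
# Build a comma-separated list of 365 ten-character map names.
list=""
day=1
while [ "$day" -le 365 ]; do
    list="${list:+$list,}$(printf 'radmap.%03d' "$day")"
    day=$((day + 1))
done
# 365 names x 10 chars + 364 commas
echo "list length: ${#list} characters"
```

That comes to 4014 characters for the list alone, so the rest of the r.series command line leaves very little room under a 4096-character limit.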

comment:7 by jef, 14 years ago (in reply to: comment:3)

Replying to hamish:

or even shorter:

seq -s, -f 'diffuse.%03g' 32 59

comment:8 by nikos, 14 years ago (in reply to: comment:6)

Replying to hamish:

> As long as it works and is understandable by you, I wouldn't be too concerned with how elegant the outer loops are. That's not the computationally expensive part.

I think I understand it (+ have the illusion that I am 100% sure I know what I am doing :-p)

> If your map names are longer than ~10 chars in total, beware that the maximum command line length may be 4096 chars on some OSs. So it may be better, e.g., to sum months to get the full year instead of summing up 365 maps directly.

Actually that is what the script is doing: it creates a list with the days of jan, feb, march, etc., which is then fed as input to "r.series method=average".

The results (per-month mean, units as derived from r.sun) over a study area in Greece seem reasonable to me (e.g. r.info for January: min=535.582112958354, max=5788.08932396673 and for July: min=1695.6559625441, max=8480.92313508065).

> (Although, if you can, it might be interesting to compare the results of doing it both ways; in the past I got two slightly different answers doing that, which confused me.)

That is strange. It's only summing and dividing, right? No time currently (= added to my ToDo list).

(I still consider the enhancement wish valid, especially when working with (r.)series that follow a naming pattern.)

Nikos

comment:9 by nikos, 14 years ago (in reply to: comment:8)

> > If your map names are longer than ~10 chars in total, beware that the maximum command line length may be 4096 chars on some OSs. So it may be better, e.g., to sum months to get the full year instead of summing up 365 maps directly.

> Actually that is what the script is doing: it creates a list with the days of jan, feb, march, etc., which is then fed as input to "r.series method=average".

I meant: it creates and uses the list(s) in a loop, so not all-in-one.

comment:10 by nikos, 14 years ago (in reply to: comment:7)

Replying to jef:

> or even shorter:
>
> seq -s, -f 'diffuse.%03g' 32 59

*nix Rocks!

comment:11 by hamish, 12 years ago

How about a g.name.sequence wrapper module for this, with mandatory min=, max= and basename= options and an optional padding(?)= option (default=3)?

g.name.sequence min=32 max=59 pad=4 base="day."

would print day.0032,day.0033,...,day.0059 to stdout.

Then you wouldn't have to be a UNIX power-tools or Python expert to make it happen.

Hamish
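
The proposed behaviour can be approximated with a small shell function. This is a sketch only: name_sequence and its positional arguments are hypothetical and are not the code of the g.name.sequence addon.

```shell
#!/bin/sh
# name_sequence MIN MAX PAD BASE
# Print BASE followed by each number from MIN to MAX, zero-padded to PAD
# digits and separated by commas. Hypothetical helper, not the addon itself.
name_sequence() {
    fmt="%s%0${3}d"          # e.g. "%s%04d" when PAD is 4
    i=$1
    out=""
    while [ "$i" -le "$2" ]; do
        out="${out:+$out,}$(printf "$fmt" "$4" "$i")"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

name_sequence 32 35 4 "day."
# prints: day.0032,day.0033,day.0034,day.0035
```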

comment:12 by hamish, 12 years ago (in reply to: comment:11)

Resolution: fixed
Status: new → closed

Replying to hamish:

> How about a g.name.sequence wrapper module

Done for the GRASS 6 addons in r55850.

http://grasswiki.osgeo.org/wiki/AddOns/GRASS_6#g.name.sequence

enjoy, Hamish

comment:13 by nikos, 11 years ago

Cool! Very nice :-)
