Opened 5 years ago

Closed 5 years ago

#3756 closed defect (fixed)

r.watershed: flow accumulation wrong with edge contamination

Reported by: itati01 Owned by: grass-dev@…
Priority: normal Milestone:
Component: Raster Version: svn-releasebranch76
Keywords: r.watershed, flow accumulation, edge contamination Cc:
CPU: Unspecified Platform: Unspecified

Description

Dear developers,

I compared the results of the flow accumulation (=facc) in the independent Python module pysheds to r.watershed with the flag -a. The input of pysheds was the flow direction (=fdr, D8) created by r.watershed, so the resulting rasters should be identical.

The difference in facc (=pysheds - r.watershed) is zero for most of the raster, except for raster cells affected by edge contamination where pysheds overestimates facc compared to r.watershed. At first, I suspected an error in the much younger pysheds, however r.watershed seems to be the culprit.

Note: The catchment I use is crossed by a channel, which in turn is crossed by streams. I removed this channel by setting the DEM to nodata except for the crossings + some buffer. So, I cannot exclude edge-contaminated cells.

Small differences are along the majority of the nodata values and they increase downstream. The source of the problem could be a) a mis-interpretation of negative fdr or b) wrong facc in case of edge contamination (or unexpected behaviour). For pysheds, I assumed that the absolute value indicates the direction and the negative sign indicates edge contamination. I was puzzled about the output of d.rast.arrow type=drainage: the arrows are shifted, e.g. -2 was identical to +1 not +2. Also tested it with pysheds to no avail.

(By the way, the output of d.rast.arrow looks wrong, e.g. the red rectangles in figure 1. If someone confirms this error, I can open a separate ticket for this issue. The error in d.rast.arrow would be here:

if (...map_type == 5) {
    ...
    aspect_c = (int)(aspect_f + 0.5);
}

).

Most likely it is about the cumulation of negative values. Take the rectangle in figure 2 as an example. It contains raster cells 1-9, I am counting row-wise starting with the top left cell. One route is 8 + 9 -> 6 + 1 -> 2 -> Channel in blue. Pyshed produces what I expect: facc == 3 for cell6 and facc == 5 for cell2, while r.watershed calculates for all cells +1, or -1 without the flag -a.

The reason is presumably: cells 8, 9 and 1 are edge contaminated (i.e. facc is turned to -1). So, for cell 6: facc = -1 + -1 + 1 = -1 (1 is the default weight for all cells). The calculation for cell 2 is identical.

I suggest to subtract not to add the weights in case of edge-contaminated upslope cells.

Attachments (2)

d_rast_error.png (103.8 KB ) - added by itati01 5 years ago.
Figure 1: Output of d.rast.arrow with the flow direction as input, the blue rectangle similar to Figure 2, red rectangles indicate wrong interpretation of negative flow directions
error_facc_grass.png (36.5 KB ) - added by itati01 5 years ago.
Figure 2: Unexpected flow accumulation, r.watershed returns 1 for all cells within the rectangle, nodata in blue

Download all attachments as: .zip

Change History (7)

by itati01, 5 years ago

Attachment: d_rast_error.png added

Figure 1: Output of d.rast.arrow with the flow direction as input, the blue rectangle similar to Figure 2, red rectangles indicate wrong interpretation of negative flow directions

by itati01, 5 years ago

Attachment: error_facc_grass.png added

Figure 2: Unexpected flow accumulation, r.watershed returns 1 for all cells within the rectangle, nodata in blue

comment:1 by mmetz, 5 years ago

Regarding edge cells (cells bordering a nodata cell), flow can not be distributed and thus not accumulated because of these neighbor cells with nodata. Flow distribution can not be determined if any neighbor is nodata. Consequently, flow accumulation for cells where any neighbor is nodata is set to -1. Exceptions are cells with a nodata neighbor that receive surface flow from cells that do not have a nodata neighbor.

in reply to:  1 ; comment:2 by itati01, 5 years ago

Dear Markus,

Thank you for your comment.

I understand the problem with nodata cells in flow accumulation and would normally drop edge-contaminated cells. However, this is an exceptional use case, with nodata cells within the catchment. So, I used the flag -a hoping to obtain the number of upslope cells. I would like to ignore the nodata cells, and can actually do so because they form a channel (which I burned into the DEM to avoid artifical inflow from outside the watershed). So, there should really be no flow out of these nodata cells.

According to your explanation, my flow accumulation of -/+1 for cells 2 and 6 within the rectangle in Figure 2 must be wrong. Cell 6, i.e. the 3rd cell in the 2nd row, only receives input from the 2 cells in row 3 which were correctly - as neighbors of nodata cells - set to -1. So, the weight of cell 6 should also be -1 (only flow from cells with nodata as neighbor) and the flow accumulation should be -/+3 - what I hoped to obtain.

However, even with "regular" inflow, it would be nice to have an option to ignore edge contamination (feature request?). Just assume, we have 4 upslope cells, of which 2 are edge contaminated. So, currently, we would get a flow accumulation of 1, not 5?

in reply to:  2 ; comment:3 by mmetz, 5 years ago

Replying to itati01:

Dear Markus,

Thank you for your comment.

I understand the problem with nodata cells in flow accumulation and would normally drop edge-contaminated cells. However, this is an exceptional use case, with nodata cells within the catchment.

The same rule applies here too: any cell bordering a nodata cell does not distribute flow because it can not be determined where flow should be distributed to.

So, I used the flag -a hoping to obtain the number of upslope cells.

The -a flag simply converts negative flow accumulation (receiving an unknown amount of flow) to positive flow accumulation.

I would like to ignore the nodata cells, and can actually do so because they form a channel (which I burned into the DEM to avoid artifical inflow from outside the watershed). So, there should really be no flow out of these nodata cells.

This is correct, there will be no flow out of nodata cells. But there is an unknown amount of flow from nodata cells.

According to your explanation, my flow accumulation of -/+1 for cells 2 and 6 within the rectangle in Figure 2 must be wrong. Cell 6, i.e. the 3rd cell in the 2nd row, only receives input from the 2 cells in row 3 which were correctly - as neighbors of nodata cells - set to -1. So, the weight of cell 6 should also be -1 (only flow from cells with nodata as neighbor) and the flow accumulation should be -/+3 - what I hoped to obtain.

The other cells that might contribute to cell 6 are cells 8 and 9, both are edge-contaminated, thus cells 8 and 9 can not distribute flow, and thus flow accumulation of cell 6 is -1.

However, even with "regular" inflow, it would be nice to have an option to ignore edge contamination (feature request?).

This would cause flow accumulation to follow edge-contaminated cells which is wrong. This could be easily demonstrated be setting the current region to a small subregion, calculate flow accumulation, then enlarge the region, calculate flow accumulation again, and compare the two results for flow accumulation. The result for the subregion would be obviously wrong.

Just assume, we have 4 upslope cells, of which 2 are edge contaminated. So, currently, we would get a flow accumulation of 1, not 5?

No, if the other 2 upslope cells are not edge-contaminated, we would get a flow accumulation of 3 because the not edge-contaminated cells are allowed to distribute flow, also towards edge-contaminated cells.

Regarding edge-contaminated cells, flow accumulation and flow direction are treated differently. Edge-contaminated cells can not distribute flow because this can cause flow to incorrectly follow edge-contaminated cells. Flow direction for edge-contaminated cells is set to another not-nodata cell if possible: the other cell is downstream, i.e. has a lower elevation value. The reason is that edge-contaminated cells most likely belong to the same catchment like a neighboring lower lying cell. Otherwise all edge-contaminated cells that do not receive flow accumulation from not edge-contaminated cells would form isolated single-cell catchments.

in reply to:  3 comment:4 by itati01, 5 years ago

Your first answer explained already everything. Thank you very much for this exhaustive explanation. I had assumed that negative values are also cumulated. So, I will close this ticket and have to find some other way without setting channel cells to nodata.

comment:5 by itati01, 5 years ago

Resolution: fixed
Status: newclosed
Note: See TracTickets for help on using tickets.