| 719 | |
| 720 | ---- |
| 721 | {{{ |
| 722 | #!html |
| 723 | <div style='float: right; width:100px;'> |
| 724 | <a href="#PostGISRasterSoCIdea2012-DistanceAnalysisToolsforPostGISRaster"> <img src="https://lh3.googleusercontent.com/-YR7gUJULQrM/T8nEPQ-AtzI/AAAAAAAAA1M/T_6vRjE8E4Q/s40/scroll_to_top_icon_40x40.png"><br>back to top</a> |
| 725 | </div> |
| 726 | }}} |
| 727 | |
| 728 | == Week 8 Report == |
| 729 | |
| 730 | '''What did you get done this week?''' |
| 731 | * Did research on the scanning algorithm used by GDAL and GRASS |
| 732 | |
| 733 | * Second revised document of "Creating an Euclidean Distance Raster in PostGIS" is here: |
| 734 | [http://trac.osgeo.org/postgis/wiki/PostGIS_Raster_SoC_Idea_2012/Distance_Analysis_Tools/document] |
| 735 | [[BR]][[BR]] |
| 736 | '''"gdalproximity" for euclidean distance in GDAL and the "r.grow.distance" in GRASS''' |
| 737 | |
| 738 | * Both of them require source features in raster format |
| 739 | * Both of them create distance raster (and temporary file during computation) from the source raster, which dosenot allow source features to fall outside of the resulted distance raster |
| 740 | * Both of them use linescan to iterate among pixels to find the nearest target to the current pixel and the shortest distance |
| 741 | * "gdalproximity" in GDAL allocates buffer for two scanlines of distance, one loop from top to bottom of the raster, the other loop from bottom to top. For each current rows, it scans colums from left to right, and right to left. Each scanline also iterate among the source points, and replace current distance value with the shorter one (from above,below,left,right) if that applies. (Although ArcGIS doesn't reveal their scanning algorithm, from what was briefly described in their documentation, it is very likely that they are using a similar way of scanning method) |
| 742 | * "r.grow.distance" in GRASS dose it in another way. It computes the distance to the source points from current pixel at the same time of creating the output distance raster. It iterates among rows and then columns in each scan row, scanning left, right, and then top-left,above,top-right, replacing distance value while shorter one found. And then does another scanning while writing the output distance raster, in a reversed way of scanning (first top-left,above, top-right, and then left and right). |
| 743 | * Both methods could be very inefficient if the target points (features) come in a very large size. |
| 744 | |
| 745 | '''Source code:''' |
| 746 | |
| 747 | * "gdalproximity" (both c++ and java implementation): |
| 748 | [http://code.metager.de/source/xref/gdal/alg/gdalproximity.cpp] |
| 749 | [http://code.metager.de/source/xref/gdal/swig/java/apps/GDALProximity.java] |
| 750 | |
| 751 | * grass/r.grow.distance (C implementation) |
| 752 | [http://trac.osgeo.org/grass/browser/grass/branches/develbranch_6/raster/r.grow.distance/main.c] |
| 753 | |
| 754 | {{{ |
| 755 | '''Comments from Mentor''' |
| 756 | >> Do those algorithms have known names? |
| 757 | |
| 758 | > They are called "sequential algorithms" in distance mapping, |
| 759 | with which the distance surface will ideally be created across the |
| 760 | entire image(raster) in one scan. ArcGIS also uses sequential |
| 761 | algorithms in the scanning process. |
| 762 | |
| 763 | > The distance for the current pixel |
| 764 | under scanning is computed using recently computed values from the |
| 765 | present scan in the neighborhood. For example, in one row scan, |
| 766 | Dist_row(col) is determined by comparing current Dist_row(col) with |
| 767 | the distance value assigned in its neighbor (say, Dist_row(col-1)) |
| 768 | plus the distance from current pixel to its neighbor. If the computed |
| 769 | distance is greater than Dist_row(col), then do nothing; otherwise |
| 770 | Dist_row(col) will be replaced with the newly computed distance. |
| 771 | |
| 772 | > GRASS's "r.grow.distance" is computing octagonal distance, while GDAL |
| 773 | "gdalproximity" is doing a chessboard scanning manner. GRASS |
| 774 | "r.grow.distance" creates the distance surface in a "growing" manner |
| 775 | from the source points, computing distance from it's neighbor to the |
| 776 | left/right/above/top-left/top-right. GDAL computes from the neighbor |
| 777 | to the left/right, above/below, topright/bottomleft. |
| 778 | |
| 779 | >> What is important here is "how many times each algorithm scan the whole |
| 780 | >> raster?". Is one more efficient than the other one? |
| 781 | |
| 782 | > In terms of scanline, both approaches do only one scan for the whole |
| 783 | raster, row by row. |
| 784 | In terms of pixels, |
| 785 | GDAL "gdalproximity" scan all the pixel once. It has two scanlines |
| 786 | from top to bottom and bottom |
| 787 | to top, then for each row, it also has two scanlines from right to |
| 788 | left and left to right. |
| 789 | (ArcGIS says they are using a "two-scan sequential process", however, |
| 790 | they didn't explain it in details in the document, so I would assume |
| 791 | they are using similar algorithms as GDAL.) |
| 792 | |
| 793 | > GRASS actually scans columns in each row 4 times. The first time is to |
| 794 | assign distance value of "0" to source pixels. Then 3 times for the |
| 795 | neighbor to the left, right, and topleft/above/topright. |
| 796 | |
| 797 | > So in terms of scanning times, it looks like GDAL is more efficient. |
| 798 | |
| 799 | >> To which one of those two algorithms our approach is similar (or |
| 800 | >> comparable)? |
| 801 | |
| 802 | > Our approach will scan the whole |
| 803 | raster only once. So we could use a similar scanning algorithm as GDAL |
| 804 | in terms of utilizing scanlines to do multiple scans simultaneously. |
| 805 | |
| 806 | >> Could you describe, in two short sentences, how each of them decide which |
| 807 | >> source is the nearest for each pixel? |
| 808 | |
| 809 | > I think both approaches don't have a specific |
| 810 | process to actually determine the nearest source pixel to the current |
| 811 | one, but use this sequential scanning to replace distance with shorter |
| 812 | one from their neighbors. So it's like an accumulative method and |
| 813 | scanning in a "growing" manner. I actually think this algorithms could |
| 814 | be reusable for cost-weighted distance computation. |
| 815 | |
| 816 | >> Are you still confident in our approach now that you understand better |
| 817 | >> those two algorithms? Why? |
| 818 | |
| 819 | > Yes, I think our approach utilizing KNN indexing will show its |
| 820 | efficiency while dealing with vary large source dataset and very high |
| 821 | resolution resulted distance raster. Because both methods in GDAL and |
| 822 | GRASS require the source data come in a raster format, and create |
| 823 | resulted raster from the input source raster, and also will create |
| 824 | temporary files during the computing process which will be very memory |
| 825 | intensive when dealing with large dataset. Also, they don't provide |
| 826 | the scalability for large tiled coverage and flexibility for source |
| 827 | data being outside of the resulted distance extent. |
| 828 | }}} |
| 829 | |
| 830 | |
| 831 | [[BR]][[BR]] |
| 832 | ''' What do you plan on doing next week?''' |
| 833 | |
| 834 | * Continue to work on implementing a plpgsql prototype for Approach 3 in the document |