Opened 13 years ago

Closed 13 years ago

Last modified 13 years ago

#4075 closed enhancement (wontfix)

getPoints() for all Geometry classes

Reported by: Marvin Owned by: warmerdam
Priority: normal Milestone:
Component: default Version: unspecified
Severity: critical Keywords: cost getPoint() getPoints() java bindings multi-part single-part speed
Cc:

Description

Hello,

could you please implement a method getPoints() for several sub-classes of the OGRGeometry class? I am talking of the following classes in particular:

OGRPoint OGRMultipoint OGRLineString OGRMultiLineString OGRPolygon OGRMultiPolygon

For all these classes, a getPoints() method could return a data structure of type double.

In case of OGRMultiLineString and OGRMultiPolygon it could be double[][][], with the first dimension indicating the sub-geometry of a multipart geometry, the second dimension indicating the point of that sub-geometry and the third dimension the coordinates.

In all other cases mentioned, it could return a double[][], with the first dimension indicating the point of the geometry and the third dimension the coordinates.

I know getPoints() already exists for OGRLineString. But such a method is clearly missing for the other geometry classes mentioned. Would be very nice to have them.

The background: I am trying to lay the foundations for a Geometry.getPoints() method in the Java bindings that can work with just a single call to the native GDAL library. Currently, one would have to call getPoint(int i) of the native GDAL library repeatedly for each single point of a geometry at an enormous cost. Solving the problem within the original GDAL implementation first, i.e. at the root of the problem, would make the speed enhancement of the bindings easy, since they could rely on existing convenience methods in the native GDAL libraries.

Thanks Marvin

Change History (8)

comment:1 by Even Rouault, 13 years ago

Resolution: wontfix
Status: newclosed

Geometry.GetPoints() has been recently added in the bindings for GDAL 1.9.0. See http://trac.osgeo.org/gdal/changeset/22044/trunk/gdal/swig and http://gdal.org/java/org/gdal/ogr/Geometry.html#GetPoints()

In case of a polygon or a multigeometry, you can iterate over the parts of the geometry with Geometry.GetGeometryRef() and then call Geometry.GetPoints() on them. I fail to see how GetPoints() could return a different signature (double[][] or double[][][] or double[][][][] in Java, but how would that work in C/C++ ???) depending on the geometry type. There is perhaps a way of doing this, but I am afraid it would be quite complicated (and especially to map this to SWIG). So I don't think there is a real benefice in going on this road, as GetPoints() is now a workable solution and should solve the main overhead in going from Java to native code. Hopefully the number of parts of a geometry is generally far less than its number of points.

in reply to:  1 comment:2 by Marvin, 13 years ago

Replying to rouault:

Geometry.GetPoints() has been recently added in the bindings for GDAL 1.9.0. See http://trac.osgeo.org/gdal/changeset/22044/trunk/gdal/swig and http://gdal.org/java/org/gdal/ogr/Geometry.html#GetPoints()

I know, but its only for OGRLineString and OGRPoint.

In case of a polygon or a multigeometry, you can iterate over the parts of the geometry with Geometry.GetGeometryRef() and then call Geometry.GetPoints() on them.

So I don't think there is a real benefice in going on this road, as GetPoints() is now a workable solution and should solve the main overhead in going from Java to native code. Hopefully the number of parts of a geometry is generally far less than its number of points.

Okay, I see now the Geometry.GetGeometryRef() returns an OGRLinearRing for polygons, which is an OGRLineString and has GetPoints(). Obviously, this is done with OGRPolygon::getExteriorRing() and getInteriorRing().

But for OGRMultiPolygon we would still have to call GetGeometryRef() for every part. After each such call another call of GetPoints() must be made. Just think of Canada with its dozens of islands. Its that number of islands times 2 that equals the number of single calls to the native GDAL libraries. Same story for OGRMultiLineString, if you think of the complex delta of rivers for example.

I fail to see how GetPoints() could return a different signature (double[][] or double[][][] or double[][][][] in Java, but how would that work in C/C++ ???) depending on the geometry type. There is perhaps a way of doing this, but I am afraid it would be quite complicated (and especially to map this to SWIG).

It's not really that complicated. My dream is to simplify the whole structure of geometry types from the perspective of the library user. Not having to worry whether the geometry is single- or multi-part. Not having to make recursive calls for some types and no recursive calls for others to get all points in the geometry. Treating every geometry as mulit-part.

It is clear that there is no point structure common to all geometry types. However, one could group types as follows:

All point and multi-point geometries belong to one common type. They can be subsumed under the concept of "point geometries". For them, the GetPoints() should return a double[][], with the first dimension used as the point index of the multi-point and the second for the coordinates. If the first dimension is of length 1, it simply means that it is a single-point geometry.

All linestring and multi-linestring geometries belong to the common type of "line geometries". For them, the GetPoints() should return a double[][][], with the first dimension used as the linestring index of the multi-linestring, the second as the the point index of that linestring, and the third for the coordinates of that point. If the first dimension is of length 1, it simply means that it is a single-linestring geometry.

All polygon and multi-polygon geometries belong to the common type of "polygon geometries". For them, the GetPoints() should return a double[][][][], with the first dimension used as the polygon index of the multi-polygon, the second as the linear ring index of that polygon, the third as the point index of that linear ring and the fourth for the coordinates of that point. If the first dimension is of length 1, it simply means that it is a single-polygon geometry. If the second dimension is of length 1, it simply means that it is not a doughnut.

So how does this fit into GDAL? Well: "point geometries" pertain to OGRPoint and OGRMultiPoint objects. "line geometries" pertain to OGRLineString and OGRMultiLineString objects. "polygon geometries" pertain to OGRPolygon and OGRMultiPolygon objects.

This requires an interface "OGRGeneralPoints" in GDAL that serves as a generalization of OGRPoint and OGRMultiPoint and has a GetPoint() method. This method should return a double[][] as described above. In analogy, there also have to be "OGRGeneralLineStrings" and "OGRGeneralPolygons" interfaces with their GetPoint() methods (returning double[][][] and double[][][][]).

If all this is implemented, a client could then obtain the geometry with OGRFeature::GetGeometry(). I assume that in C++ there is an equivalent way of what "instanceof" does in Java. The client could then find out whether the OGRGeometry returned is of any of the types OGRGeneralPoints, OGRGeneralLines or OGRGeneralPolygons. If so, the client could then cast the OGRGemetry to that type and call the GetPoints() method. Knowing the type in advance, the client could create arrays of type double with the appropriate dimensions to read into.

It would be great if someone could enhance GDAL this way. Otherwise - since I am not primarily a GDAL developer but need the GDAL Java bindings to work efficiently for my own application - I will have to implement the three versions of GetPoints() in some utility class for Java and content myself with somehow exploiting the limited Geometry.GetPoints() and the Geometry.GetGeometryType() for that purpose.

So if anyone likes to, you may reopen this ticket. Otherwise you can leave it closed.

Thank you ;-)

comment:3 by Marvin, 13 years ago

Can anyone at this point already give an outlook on whether this might be implemented or not? I would like to know whether it is worth starting my own solution or not.

Thanks Marvin

comment:4 by Even Rouault, 13 years ago

Personally, I'm not convinced it is really worth the pain. I'm not sure why you introduce new classes for the different geometry types, they already exist in C++ side. So basically your plan still sums up as a GetPoints() method that return different types, which is not allowed in C++. You could of course return a void*, but that wouldn't help to map it to SWIG. The ratio speed improvement/effort is likely low. But this is just my personal opinion and I'm biased since I don't have a crying need for sheer speed. If you wish, you can bring the issue on gdal-dev mailing list, which has more visibility than a bug tracker only scrutinized by a few pair of eyes. Anyway, someone would have to code that...

An easier approach would be to implement (or use an existing one) a wkb decoder on Java side, but I'm not sure of the impact of the extra wkb encoding/decoding.

comment:5 by warmerdam, 13 years ago

I agree with Even. Fetching WKB is the route to go if you really want the whole geometry in one call. How common multi-part polygons are will depend a lot on the particular dataset.

in reply to:  4 comment:6 by Marvin, 13 years ago

Replying to rouault:

Personally, I'm not convinced it is really worth the pain. I'm not sure why you introduce new classes for the different geometry types, they already exist in C++ side.

Yes, they all exist, but the class hierarchy does not allow you to identify their membership to a particular "general type" (point, line or polygon) by means of appropriate common father classes. For example, in GDAL there is OGRPoint and OGRMultiPoint. But they are treated as entirely distinct types, with no common interface connecting them except the very general OGRGeometry (which already subsumes ALL types whatsoever). No way of telling that they are both general point geometries. OGRGeneralPoint could be their father and do the trick, so you then know their GetPoint() would return a double[][].

But anyway, adding such classes may not be that important after all, you could also implement GetPoints() at top level (OGRGeometry) and return double[][][][] for all three general types and throw an Exception for the others. Dimensions not needed could simply be set to 1 in length. This would mean the following:

OGRPoint returns double[1][1][1]?

OGRLineString returns double[1][1]??

OGRPolygon returns double[1]???

OGRMultiPoint returns double[1][1]??

OGRMultiLineString returns double[1]???

OGRMultiPolygon returns double????

At the same time, one could introduce new wkb constants (wkbGeneralPoint, wkbGeneralLine, wkbGeneralPolygon, wkbNoGeneralType), so the GetGeometryType() can then help you interpret the result of GetPoints().

So basically your plan still sums up as a GetPoints() method that return different types, which is not allowed in C++. You could of course return a void*, but that wouldn't help to map it to SWIG.

I thought you can return double* in C++, leaving it for the implementation to decide how many doubles are allocated. But anyway, as I am now saying, one could also just create a common method at top level that returns a double[][][][] for every case, just like I just described now.

The ratio speed improvement/effort is likely low. But this is just my personal opinion and I'm biased since I don't have a crying need for sheer speed. If you wish, you can bring the issue on gdal-dev mailing list, which has more visibility than a bug tracker only scrutinized by a few pair of eyes. Anyway, someone would have to code that...

Hmm...I would agree that your inclusion of getPoints() mapped to OGRLineString::getPoints() already solves the problem significantly. What remains is simply the question - as I said - of countries with dozens of "splinter islands" or river deltas etc. Multipoint features should be the least problem (rare, dot-density-maps do not reflect actual geometry, but are generated on-the-fly).

But what I really mean is not merely performance speed, but also programming speed, i.e. convenience. And convenience improves significantly with an easy-to-use interface that hides the complexity of type-dependent methods of point extraction (from potentially nested geometries).

An easier approach would be to implement (or use an existing one) a wkb decoder on Java side, but I'm not sure of the impact of the extra wkb encoding/decoding.

I am not sure whether I got what you mean by a "wkb decoder". I assume its what I call "getGeneralType()" in my now implemented Java-side solution to my convenience problem:

package utils.gdal;

import org.gdal.ogr.Geometry;
import org.gdal.ogr.ogrConstants;

public class GeometryUtils {

	public static final byte wkbGeneralPoint = 0;
	
	public static final byte wkbGeneralLine = 1;
	
	public static final byte wkbGeneralPolygon = 2;
	
	public static final byte wkbNoGeneralType = 3;
	
	public static final byte wkbNoType = 4;
	
	public static byte getGeneralType(Geometry geometry) {
		
		if (geometry == null) 
			return wkbNoType;
		
		switch (geometry.GetGeometryType()) {
		
		case ogrConstants.wkbPoint :
		case ogrConstants.wkbPoint25D : 
		case ogrConstants.wkbMultiPoint :
		case ogrConstants.wkbMultiPoint25D :
			return wkbGeneralPoint;
		
		case ogrConstants.wkbLineString :
		case ogrConstants.wkbLineString25D :
		case ogrConstants.wkbMultiLineString :
		case ogrConstants.wkbMultiLineString25D :
			return wkbGeneralLine;
		
		case ogrConstants.wkbPolygon :
		case ogrConstants.wkbPolygon25D :
		case ogrConstants.wkbMultiPolygon :
		case ogrConstants.wkbMultiPolygon25D :
			return wkbGeneralPolygon;
		
		default:
			return wkbNoGeneralType;
		
		}
	
	}
	
	public static double[][][][] getPoints(Geometry geometry) throws Exception {
		
		if (geometry == null) throw new IllegalArgumentException("The basic geometry type must be either point, line or polygon.");
		
		switch (geometry.GetGeometryType()) {
		
		case ogrConstants.wkbPoint :
		case ogrConstants.wkbPoint25D : 
		case ogrConstants.wkbLineString :
		case ogrConstants.wkbLineString25D :
			return new double[][][][]{{geometry.GetPoints()}};
		
		case ogrConstants.wkbPolygon :
		case ogrConstants.wkbPolygon25D :
			double [][][] plgnpnts = new double[geometry.GetGeometryCount()][][];
			for (int i = 0; i < geometry.GetGeometryCount(); i++) {
				plgnpnts[i] = geometry.GetGeometryRef(i).GetPoints();
			}
			return new double[][][][]{plgnpnts};
		
		case ogrConstants.wkbMultiPoint :
		case ogrConstants.wkbMultiPoint25D :
			double[][] mltpnts = new double[geometry.GetGeometryCount()][];
			for (int i = 0; i < geometry.GetGeometryCount(); i++) 
				mltpnts[i] = geometry.GetGeometryRef(i).GetPoint(0);
			return new double[][][][]{{mltpnts}};
		
		case ogrConstants.wkbMultiLineString :
		case ogrConstants.wkbMultiLineString25D :
			double[][][] mltlnpts = new double[geometry.GetGeometryCount()][][];
			for (int i = 0; i < geometry.GetGeometryCount(); i++) 
				mltlnpts[i] = geometry.GetGeometryRef(i).GetPoints();
			return new double[][][][]{mltlnpts};
		
		case ogrConstants.wkbMultiPolygon :
		case ogrConstants.wkbMultiPolygon25D :
			double[][][][] mltpgpts =  new double[geometry.GetGeometryCount()][][][];
			for (int i = 0; i < geometry.GetGeometryCount(); i++) {
				Geometry plgn = geometry.GetGeometryRef(i);
				for (int j = 0; j < plgn.GetGeometryCount(); j++)
					mltpgpts[i][j] = plgn.GetGeometryRef(j).GetPoints();
			}
			return mltpgpts;
		
		default :
			throw new IllegalArgumentException("The basic geometry type must be either point, line or polygon.");
		}
		
	}

}

I haven't tested any of this yet and there might be some errors, but generally I believe this should work. Hopefully, it becomes much clearer now what my intention is: You can obtain all points of a point, line or polygon geometry - whether single- or multi-part - with one single call of getPoints() and obtain the general type with another single call of getGeneralType() if necessary. With both the coordinates and the general type meta-information, you can then run loops for systematically drawing the geometries (fill and outline) into a map. Never ever worrying about any single- vs multi-part distinction in geometry types. Thereby reducing the number of needed drawing loop types (point, line and polygon) from 6 to 3. And never ever worrying about traversing hierarchies of method invocations to obtain the coordinates.

I think this is much more convenient and thus much more fun to use.

You are free to use this for the C++ version, should you wish. No warranty, of course, at this point ;-)

comment:7 by warmerdam, 13 years ago

I'm not sure I followed the whole discussion, but I'd like to stress that selection of the OGC simple features geometry model for OGR is strategic, and I'd wish to complicate OGR with an alternate geometry class hierarchy at odds with the OGC Simple Features geometry model.

comment:8 by Even Rouault, 13 years ago

You probably meant "...I would *not* wish to complicate..." ? I also agree. Marvin is completely free on using his implementation based on double[][][][] on Java side if it's more convenient for him, but at that point, there's no compelling reason to include it in OGR.

Note: See TracTickets for help on using tickets.