Changes between Version 75 and Version 76 of rfc24_progressive_data_support
- Timestamp: Feb 2, 2010, 12:13:27 PM (14 years ago)
rfc24_progressive_data_support
--- rfc24_progressive_data_support (v75)
+++ rfc24_progressive_data_support (v76)
 = RFC 24: GDAL Progressive Data Support =

-Author: Norman Barker [[BR]]
-Contact: nbarker@ittvis.com [[BR]]
+Author: Norman Barker, Frank Warmerdam[[BR]]
+Contact: nbarker@ittvis.com, warmerdam@pobox.com[[BR]]
 Status: Development

 == Summary ==

-Provide an interface for data streaming support in GDAL. The RFC focuses on JPIP, but should be generic enough to apply to other streaming / progressive formats.
+Provide an interface for asynchronous/streaming data access in GDAL. The initial implementation is for JPIP, but should be generic enough to apply to other streaming / progressive approaches. Background on the JPIP (Kakadu) implementation can be found in [wiki:rfc24_jpipkak].

-== Definitions ==
+== Interfaces ==

-'''JPIP''': JPEG 2000 Interactive Protocol
+=== GDALAsyncRasterIO ===

-== Objective ==
+This new class is intended to represent an active asynchronous raster imagery request. The request includes information on a source window on the dataset, a target buffer size (implies level of decimation or replication), the buffer type, buffer interleaving, data buffer and bands being requested. Essentially the same sort of information that is passed in a GDALDataset::!RasterIO() request.

-To provide an interface to a streaming data source in a convenient manner for RasterIO operations within GDAL and to expose this interface via swig.
+The !GetNextUpdatedRegion() method can be used to wait for an update to the imagery buffer, and to find out what area was updated. The !LockBuffer() and !UnlockBuffer() methods can be used to temporarily disable updates to the buffer while application code accesses the buffer.

-== JPIPKAK - JPIP Streaming ==
+While an implementation of the simple accessors is provided as part of the class, it is intended that the class be subclassed as part of implementation of a particular driver, and custom implementations of !GetNextUpdatedRegion(), !LockBuffer() and !UnlockBuffer() provided.

-JPEG 2000 Interactive Protocol (JPIP) flexibility with respect to random access, code stream reordering and incremental decoding is highly exploitable in a networked environment allowing access to remote large files using limited bandwidth connections or high contention networks.
+{{{
+class CPL_DLL GDALAsyncRasterIO
+{
+  protected:
+    GDALDataset* poDS;
+    int          xOff;
+    int          yOff;
+    int          xSize;
+    int          ySize;
+    void *       pBuf;
+    int          bufXSize;
+    int          bufYSize;
+    GDALDataType bufType;
+    int          nBandCount;
+    int*         pBandMap;
+    int          nPixelSpace;
+    int          nLineSpace;
+    int          nBandSpace;
+    long         nDataRead;
+  public:
+    GDALAsyncRasterIO(GDALDataset* poDS = NULL);
+    virtual ~GDALAsyncRasterIO();

-=== JPIPKAK - JPIP Overview ===
+    GDALDataset* GetGDALDataset(){return poDS;}
+    int GetXOffset(){return xOff;}
+    int GetYOffset(){return yOff;}
+    int GetXSize(){return xSize;}
+    int GetYSize(){return ySize;}
+    void * GetBuffer(){return pBuf;}
+    int GetBufferXSize(){return bufXSize;}
+    int GetBufferYSize(){return bufYSize;}
+    GDALDataType GetBufferType(){return bufType;}
+    int GetBandCount(){return nBandCount;}
+    int* GetBandMap(){return pBandMap;}
+    int GetPixelSpace(){return nPixelSpace;}
+    int GetLineSpace(){return nLineSpace;}
+    int GetBandSpace(){return nBandSpace;}
+    int GetNDataRead(){return nDataRead;}
+
+    /* Returns GARIO_UPDATE, GARIO_NO_MESSAGE (if pending==false and nothing in the queue or if pending==true && timeout != 0 and nothing in the queue at the end of the timeout), GARIO_COMPLETE, GARIO_ERROR */
+    virtual GDALAsyncStatusType GetNextUpdatedRegion(bool wait, int timeout,
+                                                     int* pnxbufoff,
+                                                     int* pnybufoff,
+                                                     int* pnxbufsize,
+                                                     int* pnybufsize) = 0;
+    /* if pending = true, we wait forever if timeout=0, for the timeout time otherwise */
+    /* if pending = false, we return immediately */
+    /* the int* are output values */

-A brief overview of the JPIP event sequence is presented in this section, more information can be found at [http://www.jpeg.org/jpeg2000/j2kpart9.html JPEG 2000 Interactive Protocol (Part 9 – JPIP)] and the specification can (and should) be purchased from [http://www.iso.org/ ISO].
+    // lock a whole buffer.
+    virtual void LockBuffer() = 0;

-An earlier version of JPEG 2000 Part 9 is available here [http://www.jpeg.org/public/fcd15444-9v2.pdf], noting the ISO copyright, diagrams are not replicated in this documentation.
+    // lock only a block
+    // the caller must relax a previous lock before asking for a new one
+    virtual void LockBuffer(int xbufoff, int ybufoff, int xbufsize, int ybufsize) = 0;
+    virtual void UnlockBuffer() = 0;
+
+    friend class GDALDataset;
+};
+}}}

-The JPIP protocol has been abstracted in this format driver, requests are made at the 1:1 resolution level.
+The async status list is as follows, and will be declared in gdal.h.

-[[Image(sequence.png)]]
+{{{
+typedef enum
+{
+    GARIO_PENDING = 0,
+    GARIO_UPDATE = 1,
+    GARIO_ERROR = 2,
+    GARIO_COMPLETE = 3,
+    GARIO_TypeCount = 4
+} GDALAsyncStatusType;
+}}}

-1. Initial JPIP request for a target image, a target id, a session over http, data to be returned as a jpp-stream are requested and a maximum length is put on the response. In this case no initial window is requested, though it can be. Server responds with a target identifier that can be used to identify the image on the server and a JPIP-cnew response header which includes the path to the JPIP server which will handle all future requests and a cid session identifier. A session is required so that the server can model the state of the client connection, only sending the data that is required.
-1. Client requests particular view windows on the target image with a maximum response length and includes the session identifier established in the previous communication. 'fsiz' is used to identify the resolution associated with the requested view-window. The values 'fx' and 'fy' specify the dimensions of the desired image resolution. 'roff' is used to identify the upper left hand corner of the spatial region associated with the requested view-window. 'rsiz' is used to identify the horizontal and vertical extents of the spatial region associated with the requested view-window.
+=== GDALDataset ===

-=== JPIPKAK - approach ===
+The GDALDataset class is extended with methods to create an asynchronous reader, and to cleanup the asynchronous reader. It is intended that these methods would be subclassed by drivers implementing asynchronous data access.

-The JPIPKAK driver uses an approach that was first demonstrated here, [http://www.drc-dev.ohiolink.edu/browser/J2KViewer J2KViewer], by Juan Pablo Garcia Ortiz of separating the communication layer (socket / http) from the Kakadu kdu_cache object. Separating the communication layer from the data object is desirable since it allows the use of optimized http client libraries such as libcurl, Apache HttpClient (note that jportiz used a plain Java socket) and allows SSL communication between the client and server.
+{{{
+    virtual GDALAsyncRasterIO*
+        BeginAsyncRasterIO(int xOff, int yOff,
+                           int xSize, int ySize,
+                           void *pBuf,
+                           int bufXSize, int bufYSize,
+                           GDALDataType bufType,
+                           int nBandCount, int* bandMap,
+                           int nPixelSpace, int nLineSpace,
+                           int nBandSpace,
+                           char **papszOptions);
+    virtual void EndAsyncRasterIO(GDALAsyncRasterIO *);
+}}}

-Kakadu's implementation of client communication with a JPIP server uses a socket, and this socket connection holds the state for this client session. A client session with Kakadu can be recreated using the JPIP cache operations between client and server, but no use of traditional HTTP cookies is supported since JPIP is neutral to the transport layer.
+It is expected that as part of gdal/gcore a default !GDALAsyncRasterIO implementation will be provided that just uses GDALDataset::!RasterIO() to perform the request as a single blocking request. However, this default implementation will ensure that applications can use the asynchronous interface without worrying whether a particular format will actually operate asynchronously.

-The JPIPKAK driver is written using a HTTP client library with the Kakadu cache object and supports optimized communication with a JPIP server (which may or may not support HTTP sessions) and the high performance of the kakadu kdu_region_decompressor.
+=== GDALDriver ===

-[[Image(components.PNG)]]
+In order to provide a hint to applications whether particular formats support asynchronous IO, we will add a new metadata item on the GDALDriver of implementing formats. The metadata item will be "DCAP_ASYNCIO" (macro GDAL_DCAP_ASYNCIO) and will have the value "YES" if asynchronous IO is available.

-=== JPIPKAK - implementation ===
+Implementing drivers will do something like this in their driver setup code:

-The implementation supports the GDAL C++ and C API, and provides an initial SWIG wrapper for this driver with a Java ImageIO and Python example using SWIG.
+{{{
+    poDriver->SetMetadataItem( GDAL_DCAP_ASYNCIO, "YES" );
+}}}

-[[Image(demoviewer.PNG)]]
+=== GDALRasterBand ===

-The driver uses a simple threading model to support requesting reads of the data and remote fetching. This threading model supports two separate client windows, with just one connection to the server. Requests to the server are multiplexed to utilize available bandwidth efficiently. The client identifies these windows by using “0” (low) or “1” (high) values to a “PRIORITY” metadata request option.
+There are no changes to the GDALRasterBand interface for asynchronous raster IO. Asynchronous IO requests can only be made at the dataset level, not the band.

-''Note: SSL support''
+=== CPLHTTPFetch() ===

-''If the client is built with support for SSL, then driver determines whether to use SSL if the request is a jpips:// protocol as opposed to jpip://.''
-''Note that the driver does not verify server certificates using the Curl certificate bundle and is currently set to accept all SSL server certificates.''
+The initial JPIPKAK implementation of asynchronous IO requests will use !CPLHTTPFetch() for the JPIP network transport of requests. This will require two improvements to the implementation without changing the call sequence.

-''Note: libCurl''
+ 1. A new boolean option can be passed in the options list called "PERSISTENT". When it is true (i.e. value "YES") a persistent connection handle will be used.
+ 2. A new option named "HEADERS" can be used to send an additional header in the HTTP request. The JPIPKAK driver will pass accept headers this way.

-''JPIP sets client/server values using HTTP headers, modifications have been made to the GDAL HTTP portability library to support this.''
-
-[[Image(gdalsequence.PNG)]]
-
-1. GDALGetDatasetDriver
-
-Fetch the driver to which this dataset relates.
-
-2. Open
-
-If the filename contained in the `GDALOpenInfo` object has a case insensitive URI scheme of jpip or jpips the `JPIPKAKDataset` is created and initialised, otherwise NULL is returned.
-
-3. Initialize
-
-Initialisation involves making an initial connection to the JPIP Server to establish a session and to retrieve the initial metadata about the image (ref. JPIP Sequence Diagram).
-
-If the connection fails, the function returns false and the Open function returns NULL indicating that opening the dataset with this driver failed.
-
-If the connection is successful, then subsequent requests to the JPIP server are made to retrieve all the available metadata about the image. Metadata items are set using the `GDALMajorObject->SetMetadataItem` in the "JPIP" domain.
-
-If the metadata returned from the server includes GeoJP2 UUID box, or a GMLJP2 XML box then this metadata is parsed and sets the geographic metadata of this dataset.
-
-4. GDALGetMetadata
-
-C API to `JPIPKAKDataset->GetMetadata`
-
-5. !GetMetadata
-
-returns metadata for the "JPIP" domain, keys are "JPIP_NQUALITYLAYERS", "JPIP_NRESOLUTIONLEVELS", "JPIP_NCOMPS" and "JPIP_SPRECISION"
-
-6. GDALEndAsyncRasterIO
-
-If the asynchronous raster IO is active and not required, the C API calls `JPIPKAKDataset->EndAsyncRasterIO`
-
-7. EndAsyncRasterIO
-
-The JPIPKAKAsyncRasterIO object is deleted
-
-8. delete
-
-9. GDALBeginAsyncRasterIO
-
-C API to `JPIPKAKDataset->BeginAsyncRasterIO`
-
-10. BeginAsyncRasterIO
-
-The client has set the requested view window at 1:1 and has optionally set the discard level, quality layers and thread priority metadata items.
-
-11. Create
-
-Creates a JPIPKAKAsyncRasterIO Object
-
-12. Start
-
-Configures the kakadu machinery and starts a background thread (if not already running) to communicate to the server the current view window request. The background thread results in the kdu_cache object being updated until the JPIP server sends an "End Of Response" (EOR) message for the current view window request.
-
-13. GDALLockBuffer
-
-C API to !LockBuffer
-
-14. !LockBuffer
-
-Not implemented in `JPIPKAKAsyncRasterIO`, a lock is acquired in `JPIPKAKAsyncRasterIO->GetNextUpdatedRegion`
-
-15. GDALGetNextUpdatedRegion
-
-C API to !GetNextUpdatedRegion
-
-16. !GetNextUpdatedRegion
-
-The function decompresses the available data to generate an image (according to the dataset buffer type set in `JPIPKAKDataset->BeginAsyncRasterIO`). The window width, height (at the requested discard level) decompressed is returned in the region pointer and can be rendered by the client. The status of the rendering operation is one of `GARIO_PENDING`, `GARIO_UPDATE`, `GARIO_ERROR`, `GARIO_COMPLETE` from the `GDALAsyncStatusType` structure. `GARIO_UPDATE`, `GARIO_PENDING` require more reads of `GetNextUpdatedRegion` to get the full image data, this is the progressive rendering of JPIP. `GARIO_COMPLETE` indicates the window is complete.
-
-`GDALAsyncStatusType` is a structure used by `GetNextUpdatedRegion` to indicate whether the function should be called again when either kakadu has more data in its cache to decompress, or the server has not sent an End Of Response (EOR) message to indicate the request window is complete.
-
-The region passed into this function is passed by reference, and the caller can read this region when the result returns to find the region that has been decompressed. The image data is packed into the buffer, e.g. RGB if the region requested has 3 components.
-
-17. GDALUnlockBuffer
-
-C API to !UnlockBuffer
-
-18. !UnlockBuffer
-
-Not implemented in `JPIPKAKAsyncRasterIO`, a lock is acquired in `JPIPKAKAsyncRasterIO->GetNextUpdatedRegion`
-
-19. Draw
-
-Client renders image data
-
-20. GDALLockBuffer
-
-21. !LockBuffer
-
-22. GDALGetNextUpdatedRegion
-
-23. !GetNextUpdatedRegion
-
-24. GDALUnlockBuffer
-
-25. !UnlockBuffer
-
-26. Draw
-
-=== JPIPKAK - installation requirements ===
-
-* [http://curl.haxx.se/ Libcurl 7.9.4]
-* [http://www.openssl.org/ OpenSSL 0.9.8K] (if SSL is required, a JPIPS connection)
-* [http://www.kakadusoftware.com/ Kakadu] (tested with v5.2.6 and v6)
-
-Currently only a Windows makefile is provided, however this should compile on Linux as well as there are no Windows dependencies.
-
-See Also:
-
-* [http://www.jpeg.org/jpeg2000/j2kpart9.html JPEG 2000 Interactive Protocol (Part 9 – JPIP)]
-* [http://www.opengeospatial.org/standards/gmljp2 http://www.opengeospatial.org/standards/gmljp2]
-* [http://www.kakadusoftware.com/ Kakadu Software ]
-* [http://iasdemo.ittvis.com/ IAS demo (example JPIP(S) streams)]