Changes between Version 5 and Version 6 of rfc45_virtualmem
- Timestamp: Jan 8, 2014, 4:44:41 AM (11 years ago)
rfc45_virtualmem
  v5   v6
  62   62  persistent storage.
  63   63
       64  We also offer an alternative way of creating a CPLVirtualMem object, by using
       65  memory file mapping mechanisms. This may be used by "raw" datasets (EHdr driver
       66  for example) where the organization of data on disk directly matches the
       67  organization of an in-memory array.
       68
  64   69  ==== High-level usage ====
  65   70
   …    …
  84   89  * GDALRasterBandGetTiledVirtualMem(): equivalent of GDALDatasetGetTiledVirtualMem() that operates on a raster band object rather than a dataset object.
  85   90
       91  * GDALGetVirtualMemAuto(): simplified version of GDALRasterBandGetVirtualMem() where
       92  the user only specifies the access mode. The pixel spacing and line spacing are
       93  returned by the function. This is implemented as a virtual method at the GDALRasterBand
       94  level, so that drivers have a chance of overriding the base implementation. The
       95  base implementation just uses GDALRasterBandGetVirtualMem(). Overridden implementations
       96  may use the memory file mapping mechanism instead. Such implementations will be done
       97  in the RawRasterBand object and in the GeoTIFF driver.
       98
  86   99  == Details of new API ==
  87  100
   …    …
  89  102
  90  103  {{{
  91
  92  104  /**
  93  105   * \file cpl_virtualmem.h
   …    …
 103  115   * This exploits low-level mechanisms of the operating system (virtual memory
 104  116   * allocation, page protection and handler of virtual memory exceptions).
      117   *
      118   * It is also possible to create a virtual memory mapping from a file or part
      119   * of a file.
 105  120   *
 106  121   * The current implementation is Linux only.
   …    …
 142  157          void* pUserData);
 143  158
      159  /** Callback triggered when a virtual memory mapping is destroyed.
      160   * @param pUserData user data that was passed to CPLVirtualMemNew().
      161   */
 144  162  typedef void (*CPLVirtualMemFreeUserData)(void* pUserData);
 145  163
   …    …
 221  239          void *pCbkUserData);
 222  240
      241
      242  /** Return whether virtual memory mapping of a file is available.
      243   *
      244   * @return TRUE if virtual memory mapping of a file is available.
      245   * @since GDAL 2.0
      246   */
      247  int CPL_DLL CPLIsVirtualMemFileMapAvailable(void);
      248
      249  /** Create a new virtual memory mapping from a file.
      250   *
      251   * The file must be a "real" file recognized by the operating system, and not
      252   * a VSI extended virtual file.
      253   *
      254   * In VIRTUALMEM_READWRITE mode, updates to the memory mapping will be written
      255   * in the file.
      256   *
      257   * On Linux AMD64 platforms, the maximum value for nLength is 128 TB.
      258   * On Linux x86 platforms, the maximum value for nLength is 2 GB.
      259   *
      260   * Only supported on Linux for now.
      261   *
      262   * @param fp Virtual file handle.
      263   * @param nOffset Offset in the file to start the mapping from.
      264   * @param nLength Length of the portion of the file to map into memory.
      265   * @param eAccessMode Permission to use for the virtual memory mapping. This must
      266   * be consistent with how the file has been opened.
      267   * @param pfnFreeUserData callback that is called when the object is destroyed.
      268   * @param pCbkUserData user data passed to pfnFreeUserData.
      269   * @return a virtual memory object that must be freed by CPLVirtualMemFree(),
      270   * or NULL in case of failure.
      271   *
      272   * @since GDAL 2.0
      273   */
      274  CPLVirtualMem CPL_DLL *CPLVirtualMemFileMapNew( VSILFILE* fp,
      275          vsi_l_offset nOffset,
      276          vsi_l_offset nLength,
      277          CPLVirtualMemAccessMode eAccessMode,
      278          CPLVirtualMemFreeUserData pfnFreeUserData,
      279          void *pCbkUserData );
      280
      281  /** Create a new virtual memory mapping derived from another virtual memory
      282   * mapping.
      283   *
      284   * This may be useful when creating a mapping for pixel-interleaved data.
      285   *
      286   * The new mapping takes a reference on the base mapping.
      287   *
      288   * @param pVMemBase Base virtual memory mapping
      289   * @param nOffset Offset in the base virtual memory mapping from which to start
      290   * the new mapping.
      291   * @param nSize Size of the base virtual memory mapping to expose in the
      292   * new mapping.
      293   * @param pfnFreeUserData callback that is called when the object is destroyed.
      294   * @param pCbkUserData user data passed to pfnFreeUserData.
      295   * @return a virtual memory object that must be freed by CPLVirtualMemFree(),
      296   * or NULL in case of failure.
      297   *
      298   * @since GDAL 2.0
      299   */
      300  CPLVirtualMem CPL_DLL *CPLVirtualMemDerivedNew(CPLVirtualMem* pVMemBase,
      301          vsi_l_offset nOffset,
      302          vsi_l_offset nSize,
      303          CPLVirtualMemFreeUserData pfnFreeUserData,
      304          void *pCbkUserData);
      305
 223  306  /** Free a virtual memory mapping.
 224  307   *
   …    …
 260  343  size_t CPL_DLL CPLVirtualMemGetSize(CPLVirtualMem* ctxt);
 261  344
      345  /** Return whether the virtual memory mapping is a direct file mapping.
      346   *
      347   * @param ctxt context returned by CPLVirtualMemNew().
      348   * @return TRUE if the virtual memory mapping is a direct file mapping.
      349   *
      350   * @since GDAL 2.0
      351   */
      352  int CPL_DLL CPLVirtualMemIsFileMapping(CPLVirtualMem* ctxt);
      353
      354  /** Return the access mode of the virtual memory mapping.
      355   *
      356   * @param ctxt context returned by CPLVirtualMemNew().
      357   * @return the access mode of the virtual memory mapping.
      358   *
      359   * @since GDAL 2.0
      360   */
      361  CPLVirtualMemAccessMode CPL_DLL CPLVirtualMemGetAccessMode(CPLVirtualMem* ctxt);
      362
 262  363  /** Return the page size associated with a virtual memory mapping.
 263  364   *
   …    …
 353  454
 354  455  {{{
 355
 356  456
 357  457  /** Create a CPLVirtualMem object from a GDAL dataset object.
   …    …
 367  467   * The pointer to access the virtual memory object is obtained with
 368  468   * CPLVirtualMemGetAddr(). It remains valid until CPLVirtualMemFree() is called.
      469   * CPLVirtualMemFree() must be called before the dataset object is destroyed.
 369  470   *
 370  471   * If p is such a pointer and base_type the C type matching eBufType, for default
   …    …
 481  582          char **papszOptions );
 482  583
 483
 484
 485       /** Create a CPLVirtualMem object from a GDAL dataset object.
      584  /** Create a CPLVirtualMem object from a GDAL raster band object.
 486  585   *
 487  586   * Only supported on Linux for now.
   …    …
 495  594   * The pointer to access the virtual memory object is obtained with
 496  595   * CPLVirtualMemGetAddr(). It remains valid until CPLVirtualMemFree() is called.
      596   * CPLVirtualMemFree() must be called before the raster band object is destroyed.
 497  597   *
 498  598   * If p is such a pointer and base_type the C type matching eBufType, for default
   …    …
 578  678   * @since GDAL 2.0
 579  679   */
 580
 581       typedef enum
 582       {
 583           /*! Tile Interleaved by Pixel: tile (0,0) with internal band interleaved
 584               by pixel organization, tile (1, 0), ... */
 585           GTO_TIP,
 586           /*! Band Interleaved by Tile : tile (0,0) of first band, tile (0,0) of second
 587               band, ... tile (1,0) of first band, tile (1,0) of second band, ... */
 588           GTO_BIT,
 589           /*! Band SeQuential : all the tiles of first band, all the tiles of following band... */
 590           GTO_BSQ
 591       } GDALTileOrganization;
 592  680
 593  681  CPLVirtualMem CPL_DLL* GDALRasterBandGetVirtualMem( GDALRasterBandH hBand,
   …    …
 604  692          char **papszOptions );
 605  693
      694  typedef enum
      695  {
      696      /*! Tile Interleaved by Pixel: tile (0,0) with internal band interleaved
      697          by pixel organization, tile (1, 0), ... */
      698      GTO_TIP,
      699      /*! Band Interleaved by Tile : tile (0,0) of first band, tile (0,0) of second
      700          band, ... tile (1,0) of first band, tile (1,0) of second band, ... */
      701      GTO_BIT,
      702      /*! Band SeQuential : all the tiles of first band, all the tiles of following band... */
      703      GTO_BSQ
      704  } GDALTileOrganization;
 606  705
 607  706  /** Create a CPLVirtualMem object from a GDAL dataset object, with tiling
   …    …
 627  726   * The pointer to access the virtual memory object is obtained with
 628  727   * CPLVirtualMemGetAddr(). It remains valid until CPLVirtualMemFree() is called.
      728   * CPLVirtualMemFree() must be called before the dataset object is destroyed.
 629  729   *
 630  730   * If p is such a pointer and base_type the C type matching eBufType, for default
   …    …
 718  818          char **papszOptions );
 719  819
 720
 721  820  /** Create a CPLVirtualMem object from a GDAL rasterband object, with tiling
 722  821   * organization
   …    …
 740  839   * The pointer to access the virtual memory object is obtained with
 741  840   * CPLVirtualMemGetAddr(). It remains valid until CPLVirtualMemFree() is called.
      841   * CPLVirtualMemFree() must be called before the raster band object is destroyed.
 742  842   *
 743  843   * If p is such a pointer and base_type the C type matching eBufType, for default
   …    …
 814  914          size_t nCacheSize,
 815  915          int bSingleThreadUsage,
      916          char **papszOptions );
      917
      918  }}}
      919
      920  === Implemented by gdalrasterband.cpp ===
      921
      922  {{{
      923
      924  /** \brief Create a CPLVirtualMem object from a GDAL raster band object.
      925   *
      926   * Only supported on Linux for now.
      927   *
      928   * This method allows creating a virtual memory object for a GDALRasterBand,
      929   * that exposes the whole image data as a virtual array.
      930   *
      931   * The default implementation relies on GDALRasterBandGetVirtualMem(), but specialized
      932   * implementations, such as for raw files, may also directly use mechanisms of the
      933   * operating system to create a view of the underlying file into virtual memory
      934   * ( CPLVirtualMemFileMapNew() )
      935   *
      936   * At the time of writing, the GeoTIFF driver and "raw" drivers (EHdr, ...) offer
      937   * a specialized implementation with direct file mapping, provided that some
      938   * requirements are met :
      939   * - for all drivers, the dataset must be backed by a "real" file in the file
      940   * system, and the byte ordering of multi-byte datatypes (Int16, etc.)
      941   * must match the native ordering of the CPU.
      942   * - in addition, for the GeoTIFF driver, the GeoTIFF file must be uncompressed, scanline
      943   * oriented (i.e. not tiled). Strips must be organized in the file in sequential
      944   * order, and be equally spaced (which is generally the case). Only power-of-two
      945   * bit depths are supported (8 for GDT_Byte, 16 for GDT_Int16/GDT_UInt16,
      946   * 32 for GDT_Float32 and 64 for GDT_Float64)
      947   *
      948   * The pointer returned remains valid until CPLVirtualMemFree() is called.
      949   * CPLVirtualMemFree() must be called before the raster band object is destroyed.
      950   *
      951   * If p is such a pointer and base_type the type matching GDALGetRasterDataType(),
      952   * the element of image coordinates (x, y) can be accessed with
      953   * *(base_type*) ((GByte*)p + x * *pnPixelSpace + y * *pnLineSpace)
      954   *
      955   * This method is the same as the C GDALGetVirtualMemAuto() function.
      956   *
      957   * @param eRWFlag Either GF_Read to read the band, or GF_Write to
      958   * read/write the band.
      959   *
      960   * @param pnPixelSpace Output parameter giving the byte offset from the start of one pixel value in
      961   * the buffer to the start of the next pixel value within a scanline.
      962   *
      963   * @param pnLineSpace Output parameter giving the byte offset from the start of one scanline in
      964   * the buffer to the start of the next.
      965   *
      966   * @param papszOptions NULL terminated list of options.
      967   * If a specialized implementation exists, defining USE_DEFAULT_IMPLEMENTATION=YES
      968   * will cause the default implementation to be used.
      969   * When requiring or falling back to the default implementation, the following
      970   * options are available : CACHE_SIZE (in bytes, defaults to 40 MB),
      971   * PAGE_SIZE_HINT (in bytes),
      972   * SINGLE_THREAD ("FALSE" / "TRUE", defaults to FALSE)
      973   *
      974   * @return a virtual memory object that must be unreferenced by CPLVirtualMemFree(),
      975   * or NULL in case of failure.
      976   *
      977   * @since GDAL 2.0
      978   */
      979
      980  CPLVirtualMem *GDALRasterBand::GetVirtualMemAuto( GDALRWFlag eRWFlag,
      981          int *pnPixelSpace,
      982          GIntBig *pnLineSpace,
      983          char **papszOptions ):
      984
      985  CPLVirtualMem CPL_DLL* GDALGetVirtualMemAuto( GDALRasterBandH hBand,
      986          GDALRWFlag eRWFlag,
      987          int *pnPixelSpace,
      988          GIntBig *pnLineSpace,
 816  989          char **papszOptions );
 817  990  }}}
   …    …
 850 1023  CPLVirtualMemIsAccessThreadSafe() has been introduced for that purpose.
 851 1024
     1025  As far as CPLVirtualMemFileMapNew() is concerned, memory file mapping on POSIX
     1026  systems with mmap() should be portable. Windows has the CreateFileMapping() and
     1027  MapViewOfFile() APIs, which have capabilities similar to mmap().
     1028
 852 1029  == Performance ==
 853 1030
   …    …
 866 1043  dealt with by 2 different threads, but one after the other one.
 867 1044
     1045  The overhead of virtual memory objects returned by GetVirtualMemAuto(), when
     1046  using the memory file mapping, should be lower than that of the manual management of
     1047  page faults. However, GDAL has no control of the strategy used by the operating
     1048  system to cache pages.
     1049
 868 1050  == Limitations ==
 869 1051
   …    …
 884 1066  == Related thoughts ==
 885 1067
 886       With an uncompressed GeoTIFF file (where strips or tiles are sequentially written
 887       to disk), GDALDatasetGetVirtualMem() and GDALDatasetGetTiledVirtualMem(), with
 888       appropriate input parameters, could potentially just mmap() the file itself, which
 889       would save any GDAL overhead. It is not clear however how old accessed pages can
 890       be evicted from RAM since Linux does not seem to discard them, which tends to cause
 891       undesirable disk swapping when the memory mapping is bigger than RAM.
 892
 893 1068  Some issues with system calls such as read() or write(), or easier multi-threading
 894 1069  could potentially be solved by making a FUSE (File system in USEr space) driver that
 895       would expose a GDAL dataset as a file, and then mmap()'ing the file itself. The
 896       issue raised in the previous paragraph would still apply. Plus the fact that
     1070  would expose a GDAL dataset as a file, and then mmap()'ing the file itself. However
 897 1071  FUSE drivers are only available on POSIX OS, and need root privilege to be
 898 1072  mounted (a FUSE filesystem does not need root privilege to run, but the mounting
   …    …
 919 1093              xsize=None, ysize=None, bufxsize=None, bufysize=None,
 920 1094              datatype = None, band_list = None, band_sequential = True,
 921                   cache_size = 10 * 1024 * 1024, page_size_hint = 0 ):
     1095              cache_size = 10 * 1024 * 1024, page_size_hint = 0, options = None):
 922 1096          """Return a NumPy array for the dataset, seen as a virtual memory mapping.
 923 1097          If there are several bands and band_sequential = True, an element is
   …    …
 926 1100          accessed with array[y][x][band].
 927 1101          If there is only one band, an element is accessed with array[y][x].
     1102          Any reference to the array must be dropped before the last reference to the
     1103          related dataset is also dropped.
 928 1104          """
 929 1105  }}}
   …    …
 935 1111              xsize=None, ysize=None, tilexsize=256, tileysize=256,
 936 1112              datatype = None, band_list = None, tile_organization = gdalconst.GTO_BSQ,
 937                   cache_size = 10 * 1024 * 1024 ):
     1113              cache_size = 10 * 1024 * 1024, options = None):
 938 1114          """Return a NumPy array for the dataset, seen as a virtual memory mapping with
 939 1115          a tile organization.
   …    …
 945 1121          accessed with array[band][tiley][tilex][y][x].
 946 1122          If there is only one band, an element is accessed with array[tiley][tilex][y][x].
     1123          Any reference to the array must be dropped before the last reference to the
     1124          related dataset is also dropped.
 947 1125          """
 948 1126  }}}
 949 1127
 950       And the Band object has the following 2 methods :
     1128  And the Band object has the following 3 methods :
 951 1129
 952 1130  {{{
   …    …
 954 1132              xsize=None, ysize=None, bufxsize=None, bufysize=None,
 955 1133              datatype = None,
 956                   cache_size = 10 * 1024 * 1024, page_size_hint = 0 ):
     1134              cache_size = 10 * 1024 * 1024, page_size_hint = 0, options = None):
 957 1135          """Return a NumPy array for the band, seen as a virtual memory mapping.
 958 1136          An element is accessed with array[y][x].
     1137          Any reference to the array must be dropped before the last reference to the
     1138          related dataset is also dropped.
 959 1139          """
     1140
     1141      def GetVirtualMemAutoArray(self, eAccess = gdalconst.GF_Read, options = None):
     1142          """Return a NumPy array for the band, seen as a virtual memory mapping.
     1143          An element is accessed with array[y][x]."""
 960 1144
 961 1145      def GetTiledVirtualMemArray(self, eAccess = gdalconst.GF_Read, xoff=0, yoff=0,
 962 1146              xsize=None, ysize=None, tilexsize=256, tileysize=256,
 963 1147              datatype = None,
 964                   cache_size = 10 * 1024 * 1024 ):
     1148              cache_size = 10 * 1024 * 1024, options = None):
 965 1149          """Return a NumPy array for the band, seen as a virtual memory mapping with
 966 1150          a tile organization.
 967 1151          An element is accessed with array[tiley][tilex][y][x].
     1152          Any reference to the array must be dropped before the last reference to the
     1153          related dataset is also dropped.
 968 1154          """
 969 1155  }}}
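
The GetVirtualMemAuto() documentation in this change describes reaching the element at image coordinates (x, y) through a byte offset of x * pixel spacing + y * line spacing into a file-backed mapping. The following is a minimal sketch of that addressing rule using only Python's standard mmap and struct modules. It does not call GDAL and is not how GDAL implements it. The raw file layout (little-endian UInt16, no header) and the names read_pixel and demo are assumptions made for illustration.

```python
import mmap
import os
import struct
import tempfile


def read_pixel(buf, x, y, pixel_space, line_space, fmt="<H"):
    """Read the sample at image coordinates (x, y) from a byte buffer.

    Mirrors the documented addressing rule:
    offset = x * pixel_space + y * line_space (both in bytes).
    """
    offset = x * pixel_space + y * line_space
    return struct.unpack_from(fmt, buf, offset)[0]


def demo():
    """Write a tiny raw raster, map it read/write, read and update pixels."""
    width, height = 4, 3
    fmt = "<H"                           # assumed sample type: little-endian UInt16
    pixel_space = struct.calcsize(fmt)   # bytes between two pixels of a scanline
    line_space = width * pixel_space     # bytes between two scanlines

    fd, path = tempfile.mkstemp(suffix=".raw")
    try:
        # Pixel (x, y) holds the value y * 100 + x.
        with os.fdopen(fd, "wb") as f:
            for y in range(height):
                for x in range(width):
                    f.write(struct.pack(fmt, y * 100 + x))

        # Map the whole file; writes to the mapping go back to the file,
        # as described for VIRTUALMEM_READWRITE.
        with open(path, "r+b") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as m:
                before = read_pixel(m, 2, 1, pixel_space, line_space)
                struct.pack_into(fmt, m, 3 * pixel_space + 2 * line_space, 999)
                m.flush()

        # Re-read pixel (3, 2) with plain file I/O to confirm the update.
        with open(path, "rb") as f:
            f.seek(3 * pixel_space + 2 * line_space)
            after = struct.unpack(fmt, f.read(pixel_space))[0]
        return before, after
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

With the real bindings, the array returned by GetVirtualMemAutoArray() hides this arithmetic behind array[y][x]. The sketch only makes explicit what the pixel and line spacing values mean for a raw file whose on-disk layout matches the in-memory array.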