wiki:USGS_PDF_Topo

Summary

  • Some users may prefer to have the Topo map as a geotif rather than PDF
  • Here are some tips to convert PDFs (specifically USGS Topo maps) to geotif

Details

GDAL must be built with Geospatial PDF support, see the above link for details. Confirm with:

$ gdalinfo --formats
Supported Formats:
...
  HF2 (rwv): HF2/HFZ heightfield raster
  PDF (rov): Geospatial PDF
  OZI (rov): OziExplorer Image File
...

python script

This cutline.py python script was recommended on the email list.

from osgeo import gdal
import os
import sys

ds = gdal.Open(sys.argv[1])
neatline_wkt = ds.GetMetadataItem("NEATLINE")
ds = None

f = open('cutline.csv', 'wt')
f.write('id,WKT\n')
f.write('1,"%s"\n' % neatline_wkt)
f.close()

os.system('gdalwarp %s %s.tif ' % (sys.argv[1], sys.argv[1]) +
          '-crop_to_cutline -cutline cutline.csv -overwrite')

Usage: python cutline.py your.pdf

Manual method

If you for some reasons want to do things more manually, here is how:

  1. Use gdalinfo to get NEATLINE:
    $gdalinfo OR_Newport_North_20110824_TM_geo.pdf 
    Driver: PDF/Geospatial PDF
    Files: OR_Newport_North_20110824_TM_geo.pdf
    ...
    Metadata:
      NEATLINE=POLYGON ((420904.431944419047795 4955726.310806447640061,420721.94461186096305 4941719.078548463992774,410694.503490555507597 4941849.71684651542455,410876.990823113534134 4955856.949104498140514,420904.431944419047795 4955726.310806447640061))
    Corner Coordinates:
    ...
    
  1. Make an OGR datasource of the NEATLINE to use as a cutline in gdalwarp. This can be two files, a .vrt and .csv

wkt_cutline_file.vrt:

<OGRVRTDataSource>
             <OGRVRTLayer name="NEATLINE">
                <SrcDataSource >NEATLINE.csv</SrcDataSource>
                <GeometryType>wkbPolygon</GeometryType>
                <FID>Record_Id</FID>
                <GeometryField encoding="WKT" field="wkb_Polygon"/>
                <LayerSRS>EPSG:26910</LayerSRS>
            </OGRVRTLayer>
</OGRVRTDataSource> 

NEATLINE.csv:

"Record_Id","wkb_Polygon"
"1","POLYGON ((420904.431944419047795 4955726.310806447640061,420721.94461186096305 4941719.078548463992774,410694.503490555507597 4941849.71684651542455,410876.990823113534134 4955856.949104498140514,420904.431944419047795 4955726.310806447640061))"
  1. Use gdal_translate or gdalwarp to convert to geotif and if using gdalwarp, -cutline to the NEATLINE.
    gdalwarp -cutline wkt_cutline_file.vrt -cl NEATLINE -crop_to_cutline OR_Newport_North_20110824_TM_geo.pdf OR_Newport_North_20110824_TM_geo.tif
    

Notes

  • There is probably a more elegant way to grab the NEATLINE for use. (see this thread for some ideas, GeoPDF thread on nabble). The above python script does this.
  • This could be combined with gdalinfo, wget, or curl to determine which file to download. (This is not the same file listed above but a nearby historical Topo)
    wget "http://usgs-catalog4.srv.mst.edu/cgi-bin/gda?SERVICE=WMS&VERSION=1.1.1&REQUEST=GetFeatureInfo&BBOX=-125.317789,43.823383,-123.427332,45.547829&SRS=EPSG:4326&WIDTH=912&HEIGHT=831&LAYERS=gdagfigeopdfs&STYLES=&FORMAT=image/jpeg&QUERY_LAYERS=gdagfigeopdfs&INFO_FORMAT=text/plain&X=629&Y=335"
    
curl -s "http://usgs-catalog4.srv.mst.edu/cgi-bin/gda?SERVICE=WMS&VERSION=1.1.1&REQUEST=GetFeatureInfo&BBOX=-125.317789,43.823383,-123.427332,45.547829&SRS=EPSG:4326&WIDTH=912&HEIGHT=831&LAYERS=gdagfigeopdfs&STYLES=&FORMAT=image/jpeg&QUERY_LAYERS=gdagfigeopdfs&INFO_FORMAT=text/plain&X=629&Y=335" | grep DOWNLOAD

Not sure how to GetFeatureInfo? with gdalinfo, here is a general GetCapabilities?:

gdalinfo "WMS:http://usgs-catalog4.srv.mst.edu/cgi-bin/gda?SERVICE=WMS&VERSION=1.1.1&REQUEST=GetFeatureInfo&BBOX=-125.317789,43.823383,-123.427332,45.547829&SRS=EPSG:4326&LAYERS=gdagfigeopdfs&STYLES=&QUERY_LAYERS=gdagfigeopdfs&INFO_FORMAT=text/plain&X=629&Y=335"
Last modified 7 years ago Last modified on Jan 28, 2013 11:34:04 AM