GSoC 2017 | GRASS GIS Locations Created from Public Data

Student Name: Zechariah Krautwurst
Student Institution: North Carolina State University
GSoC Organization: OSGeo - Open Source Geospatial Foundation
GSoC Mentors: Anna Petrasova, Vaclav Petras
Project Title: GRASS GIS Locations Created from Public Data
Project Proposal: view proposal
Project Repository: GitHub

Abstract

This project focuses on creating scripts that convert widely-used open data sets into standardized formats for a given location or coordinate projection. The scripts, modules, project documentation, and data sets generated from the project are available as a service and development framework to GRASS GIS users.

Goal

Novice GRASS users often have difficulty with the complexities of re-formatting data, solving map projection issues, and working with centralized data organization. Many existing solutions require users to write custom scripts geared toward a specific use case, and such scripts can be difficult to adapt to other data sources or coordinate systems.

With the scripts created for this project, users will be able to download multiple data sets and format the data to match a user-defined location. Automating this formatting step lets users put the data to work more efficiently. Furthermore, the underlying framework generated by the project allows users to develop their own scripts for a given service and/or location.

Timeline

Community Bonding Period
MAY 4 - MAY 26 Complete preliminary GSoC, OSGeo, mentor, and proposal requirements
Phase 01
MAY 30 - JUNE 2 Explore existing efforts, primary community needs, and potential project difficulties
JUNE 5 - JUNE 9 Identify data sources and formatting conventions
JUNE 12 - 16 Pseudocode scripts
JUNE 19 - 23 Draft Python scripts
JUNE 26 - 30 Phase 1 formatting scripts completed
JUNE 30 PHASE 01 EVALUATION DEADLINE
Phase 02
JULY 3 - 7 Integrate existing APIs into scripts
JULY 10 - 14 Unavailable
JULY 17 - 21 Develop data storage, web services, and cross-platform compatibility
JULY 24 - 28 Create use-case documentation and GUI mock-up (wxPython)
JULY 28 PHASE 02 EVALUATION DEADLINE
Phase 03
JULY 31 - AUGUST 4 Develop formatting scripts that format multiple data-types and coordinate systems
AUGUST 7 - 11 Further develop data storage, web services, and cross-platform compatibility
AUGUST 14 - 18 Finalize each of the data formatting pipelines, GUI, services, and processes
AUGUST 21 - 25 Finalize each of the data formatting pipelines, GUI, services, and processes
AUGUST 28 - 29 FINAL WEEK: Create documentation and submit completed work
SEPTEMBER 5 FINAL EVALUATION SUBMITTED

Requirements

GRASS 7.3

Development

Weekly reports

COMMUNITY BONDING: MAY 04 - MAY 29
ACCOMPLISHED
  • Meetings with Anna and Vashek
  • OSGeo Wiki and Trac pages
  • GSoC Weekly reports
  • GRASS Development pipeline
  • GRASS, Linux, Python resources
  • Compiled and installed GRASS SVN on Linux virtual machine.
  • Reviewed tutorials and resources
  • Created Trac and Wiki pages.
  • Reviewed coordinate systems and projection issues
  • Researched existing data sources and download methods

WEEK 01: MAY 30 - JUNE 02
ACCOMPLISHED
  • 05/30 Meeting with Anna, Vashek, and Paul
  • Refined project scope
  • Selected target data sources and scale
  • Discussed coordinate projection issues and solutions
  • Researched USGS NED data download process and web services
  • Reviewed GRASS Location Wizard functionality
  • Created user scenarios for various software possibilities
  • Set up local development tools and coding environment
  • Reviewed existing methods and Python scripts for data download into GIS formats
NEXT WEEK
  • Create initial data download script for NED data
  • Establish user defined boundaries for import data
  • Reconcile coordinate projection difficulties
  • Review wxPython
  • Reach out to GRASS community for thoughts and resources
BLOCKS None

WEEK 02: JUNE 05 - JUNE 09
ACCOMPLISHED
  • Reviewed Python scripting for GRASS documentation and existing GRASS data import modules
  • Wrote SRTM 30 download script using Python Elevation library and GRASS output
  • Researched USGS data download methods, REST APIs, and the HTTP protocol
  • Reviewed The National Map (TNM) API documentation and tested coordinate output methods from g.region as input to TNM API for SRTM ⅓ arc sec NED tiles
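
The idea of feeding region bounds into a TNM API request can be sketched as follows. This is a minimal illustration, not the project's code: the endpoint address and the `datasets`, `bbox`, and `prodFormats` parameter names are assumptions that should be checked against the current USGS documentation, `build_tnm_query` is a hypothetical helper, and the bounds are assumed to be latitude/longitude values such as those reported by g.region. It is written for Python 3.

```python
from urllib.parse import urlencode

# Assumed endpoint; the TNM API address has changed over time, so verify
# it against the current USGS documentation before relying on it.
TNM_PRODUCTS_URL = "https://tnmaccess.nationalmap.gov/api/v1/products"


def build_tnm_query(region, dataset, prod_format="IMG",
                    base_url=TNM_PRODUCTS_URL):
    """Return a product query URL for a latitude/longitude bounding box.

    `region` is a dict with 'w', 's', 'e', 'n' keys in decimal degrees
    (hypothetical input shape, e.g. derived from g.region output).
    """
    # Bounding box order assumed by this sketch: west, south, east, north.
    bbox = ",".join(str(region[key]) for key in ("w", "s", "e", "n"))
    params = {"datasets": dataset, "bbox": bbox, "prodFormats": prod_format}
    return base_url + "?" + urlencode(params)
```

Keeping the URL construction separate from the download step makes it possible to inspect the request before any network traffic occurs.
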
NEXT WEEK
  • Determine how to import and process subsets of IMG tiles into a single layer with the TNM API through GRASS
  • Review wxPython
  • Figure out basemap service for user region selection
BLOCKS None

WEEK 03: JUNE 11 - JUNE 17
ACCOMPLISHED
  • Worked through different methods of importing and processing NED data with GDAL's vsizip/vsicurl, the TNM API, the Python requests library, and urllib2
  • Wrote DRAFT version of r.in.usgsned Python script that downloads 1/3 arc-sec NED tiles from the USGS TNM API based on GRASS computational region
  • Updated git repo
NEXT WEEK
  • Continue working on r.in.usgsned module script
  • Develop GUI and GRASS integration
  • Finish basic script functionality to import, patch downloaded tiles
  • Get feedback on script functionality, formatting, and implementation
CHALLENGES
  • Thinking through broader implementations of this module
  • What are the most useful and stable data repositories to access?
  • How to address the initial problem of coordinate projections and accuracy across multiple SRSs

WEEK 04: JUNE 19 - JUNE 23
ACCOMPLISHED
  • Developed GRASS GUI through script flags and options
  • Revised script to integrate dynamic paths and user input
  • Improved SRS translation
  • Began creating format for further USGS data selection
NEXT WEEK
  • Complete r.in.usgsned
  • Begin work on r.in.usgs for other datasets
  • Test accuracy of SRS conversion for elevation
  • Get feedback on script functionality, formatting, and implementation
CHALLENGES
  • Improving workflow

WEEK 05: JUNE 26 - JUNE 30
ACCOMPLISHED
  • Refined GUI options and flags
  • Added GRASS messages, exits, and error handling
  • Created "i" flag information report for module
  • Replaced the 'requests' module and its JSON handling with the built-in urllib2 and json libraries
  • Developed data dictionary format for NED data type
  • Improved git workflow, issues, and added r.in.usgsned project tasks
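
A standard-library replacement for 'requests' might look like the sketch below. It is written for Python 3, where `urllib.request` takes the place of urllib2. The response layout (an `items` list whose entries carry `title`, `downloadURL`, and `sizeInBytes`) is an assumption about the API, and both function names are hypothetical.

```python
import json
from urllib.request import urlopen


def parse_tnm_response(payload):
    """Decode a JSON payload (bytes) into (title, url, size) tuples.

    The 'items', 'title', 'downloadURL', and 'sizeInBytes' field names
    are assumed here, not confirmed by the report.
    """
    data = json.loads(payload.decode("utf-8"))
    return [
        (item.get("title"), item.get("downloadURL"), item.get("sizeInBytes"))
        for item in data.get("items", [])
    ]


def fetch_tiles(url, timeout=30):
    """Request `url` and return the parsed tile list."""
    with urlopen(url, timeout=timeout) as response:
        return parse_tnm_response(response.read())
```

Splitting parsing from fetching means the parsing logic can be exercised with a canned payload, without network access.
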
NEXT WEEK
  • Integrate clean-up function for temp files from hardware or network interruptions
  • Develop automated reprojection handling
  • Refine zip archive indexing and download method
  • Test for conditions not caught by current error handling
CHALLENGES
  • None

WEEK 06: JULY 03 - JULY 07
ACCOMPLISHED
  • Added functionality to handle unique formatting across multiple NED source dates. [1]
  • Created dynamic data dictionary for all USGS data sources from TNM API GET request [not pushed to repo]
  • Created universal method of indexing and downloading .img file from USGS zip archive[1]
  • Refined error handling and GRASS messages for single vs multiple tiles [1]
  • Discussed clean-up functionality and project development with mentors
  • [1] Most recent git commit
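
One way to locate and extract an .img file from a zip archive regardless of its internal layout is sketched below. This is an illustration using Python's standard `zipfile` module, not the method used in the module; `extract_img` is a hypothetical helper, and taking the first matching member is an assumption.

```python
import zipfile


def extract_img(zip_path, dest_dir):
    """Extract the first .img member of a zip archive; return its path."""
    with zipfile.ZipFile(zip_path) as archive:
        # Search every member so the archive's folder layout does not matter.
        for name in archive.namelist():
            if name.lower().endswith(".img"):
                return archive.extract(name, dest_dir)
    raise ValueError("no .img file found in {}".format(zip_path))
```

Searching by extension rather than by a fixed member name avoids depending on naming conventions that differ between source dates.
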
NEXT WEEK
  • Finalize stable version of r.in.usgsned
  • Create hard-coded data dictionaries for r.in.usgs data products
  • Alter clean-up function to remove .img tiles, keep zip
  • Integrate best interpolation methods for data types during r.import
CHALLENGES
  • NOTE: Most recent version of script is not functioning due to incomplete changes. Will make sure from now on to always push a universally functional version of the script.
  • NOTE 2: I will be unavailable 07/12 - 07/21. I will submit next week's report EOD 07/11 and will not submit a report for the following week.

WEEK 07: JULY 10 - JULY 14
ACCOMPLISHED
  • Made significant changes to error handling to improve clean-up functionality and prevent script crashes, along with several other structural changes[1]
  • Implemented new method of pulling .img files from zip archive[1]
  • Code now checks if complete zip files already exist and does not download files that are already available[1]
  • Refined error handling and GRASS messages for incomplete or partially downloaded files[1]
  • Created clean-up function
  • Changed the 'r' flag to a 'k' flag that keeps zip archives if checked; all temp files are removed by default

  • [1] Most recent git commit
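
The download check and clean-up behavior described above can be sketched as follows. Both helpers are hypothetical, and treating a matching file size as proof of a complete download is an assumption of this sketch rather than the module's documented test.

```python
import os


def needs_download(path, expected_size):
    """Return True unless a file of the expected size already exists."""
    return not (os.path.isfile(path) and os.path.getsize(path) == expected_size)


def clean_up(paths, keep_zip=False):
    """Remove temporary files; keep .zip archives when keep_zip is set.

    Returns the list of paths that were actually removed.
    """
    removed = []
    for path in paths:
        if keep_zip and path.lower().endswith(".zip"):
            continue
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed
```

With `keep_zip=True` the archives survive for reuse on the next run, which mirrors the 'k' flag; the default removes everything listed.
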
NEXT WEEK
  • UNAVAILABLE
CHALLENGES
  • None

WEEK 08: JULY 17 - JULY 21
ACCOMPLISHED
  • UNAVAILABLE
NEXT WEEK
  • Review and complete r.in.usgsned
  • Begin converting r.in.usgsned into r.in.usgs
  • Create hardcoded USGS data dictionaries
  • Continue working on dynamic dictionaries
  • Refine clean-up, cache, and zip detection code
  • Finalize module with mentors
CHALLENGES
  • None

WEEK 09: JULY 24 - JULY 28
ACCOMPLISHED
  • Refined r.in.usgsned clean-up, cache, and zip detection code [1]
  • Reviewed and completed r.in.usgsned module [2]
  • Began converting r.in.usgsned into r.in.usgs [3]
  • Began creating hardcoded USGS data dictionaries
  • Discussed possibilities of dynamic GUI data lists
NEXT WEEK
  • Create hardcoded data dictionaries for each USGS product
  • Identify differences for download formats and storage capabilities
  • Rewrite module to accommodate USGS product variations
  • Begin implementing full NLCD capabilities
  • Create GRASS import and handling script for NLCD
  • Plan further module capabilities and project scope
CHALLENGES
  • None

WEEK 10: JULY 31 - AUGUST 04
ACCOMPLISHED
  • Planned r.in.usgs scope and functionality with mentors
  • Created dynamic data-dictionary generator from API query for available USGS data products[1]
  • Began working on NLCD download configuration and formatting[1][2]
  • Integrated data-dictionary formatting into script[1]
  • Began expanding module syntax to include multiple data types and formats[2]
NEXT WEEK
  • Focus on NED, NLCD, NAIP datasets for r.in.usgs
  • Identify and integrate differences for download formats and storage capabilities
  • Continue implementing full NLCD capabilities
  • Solve download issues with formatting and different source locations
  • Begin finalizing scripting process
CHALLENGES
  • None

WEEK 11: AUGUST 07 - AUGUST 11
ACCOMPLISHED
  • Module completely functional for NED, NLCD data [3]
  • Created hard-coded data dictionaries for NED, NLCD, NAIP, UStopo datasets [1]
  • Implemented download syntax and clean-up function for individual uncompressed tiles [2]
  • Module renames downloaded files by hard-coded separators [2]
  • Worked through NAIP import and projection issues
  • Tested and improved module error capturing for partially downloaded files [3]
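
One use for a hard-coded data dictionary is choosing a resampling method per product at import time. The sketch below assumes elevation is treated as continuous data and land cover as categorical data; the dictionary layout, keys, and the `resample_method` helper are all hypothetical and do not reproduce the module's actual dictionaries.

```python
# Hypothetical product table; keys and layout are illustrative only.
PRODUCT_TYPES = {"ned": "continuous", "nlcd": "categorical"}

# Continuous surfaces can be interpolated; category codes must not be
# blended, so nearest-neighbor resampling is used for them.
RESAMPLE_BY_TYPE = {"continuous": "bilinear", "categorical": "nearest"}


def resample_method(product):
    """Return the resampling method name for a product key."""
    try:
        return RESAMPLE_BY_TYPE[PRODUCT_TYPES[product.lower()]]
    except KeyError:
        raise ValueError("unsupported product: {}".format(product))
```

Products that are not listed raise an error instead of silently falling back to a default, which keeps unsupported datasets visible.
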
NEXT WEEK
  • Troubleshoot NAIP, UStopo projection issues
  • Continue to finalize module, comments, and project documentation
  • Begin implementing Natural Earth Data basemap
CHALLENGES
  • None

WEEK 12 - FINAL REPORT: AUGUST 14 - AUGUST 18
Project Title GRASS GIS Locations from Public Data

Organization Google Summer of Code 2017
Open Source Geospatial Foundation (OSGeo)
GRASS GIS

Abstract r.in.usgs is an add-on module for GRASS GIS that greatly simplifies the process of downloading and using USGS raster datasets.

Pre-GSoC Before r.in.usgs was created, USGS raster imagery was selected through a web-based interface, manually downloaded, and manually imported into GRASS GIS through a multi-step process. The process requires prior knowledge of USGS dataset parameters, spatial reference systems, coordinate reprojection, computational regions, and the appropriate GRASS GIS tools and methods.

Added Value r.in.usgs provides a GRASS GIS GUI that suggests appropriate default parameters, as well as provides advanced options for downloading available USGS datasets. The module assembles user-input information with the required GRASS GIS parameters and tools to automatically download, import, reproject, and patch complex USGS raster data in a single process.
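
The import, reproject, and patch steps could be chained roughly as in the sketch below. This is an illustration under stated assumptions, not the module's implementation: `import_and_patch` and the intermediate map names are hypothetical, and the sketch relies on the GRASS modules r.import, r.patch, and g.rename called through `grass.script.run_command`. The `run` argument exists only so the logic can be exercised outside a GRASS session.

```python
def import_and_patch(tile_paths, output, resample="bilinear", run=None):
    """Import each tile with r.import, then combine them into `output`."""
    if run is None:
        # Imported lazily so the function can be read and tested
        # without a GRASS installation.
        import grass.script as gs
        run = gs.run_command
    imported = []
    for index, path in enumerate(tile_paths):
        name = "{}_tile_{}".format(output, index)
        # r.import reprojects on the fly; extent="region" limits the
        # import to the current computational region.
        run("r.import", input=path, output=name,
            resample=resample, extent="region", overwrite=True)
        imported.append(name)
    if len(imported) == 1:
        # A single tile needs no patching, only its final name.
        run("g.rename", raster="{},{}".format(imported[0], output))
    else:
        run("r.patch", input=",".join(imported), output=output)
    return imported
```

Inside a GRASS session, a call such as `import_and_patch(["a.img", "b.img"], "elevation")` would leave a single raster map named `elevation`; the file names here are placeholders.
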

Continued Work r.in.usgs currently handles all three products from the USGS National Elevation Dataset (NED) as well as all three products from the National Land Cover Database (NLCD). Several other USGS datasets are available for download, but each requires custom formatting and further modifications to the r.in.usgs script.

Further development of the module should include continued incorporation of USGS datasets, as well as creating accessible tools for sources of international data. Ultimately, creating a module that allows GRASS GIS users to contribute to a centralized, automated repository of properly formatted publicly available datasets would provide a huge service to the open source GIS community.

r.in.usgs will be moved into the official GRASS GIS add-ons repository in the coming week.

Links and Documentation OSGeo Project Wiki
Git Repository
Raw Code
Raw HTML Documentation
Google Docs Version of HTML Documentation


Last modified on Sep 7, 2017, 10:17:07 AM