
Google Summer of Code 2021

Introduction

So you are interested in becoming a Google Summer of Code student. What should you do to improve your chances of being selected? We recommend reading

Improving your chances

For most projects involving PostGIS you will eventually need the following:

  • Know how to install PostgreSQL
  • Know how to install PostGIS in PostgreSQL
  • Know how to compile PostgreSQL code
  • Know how to compile PostGIS code and run tests
  • Some basic knowledge of git -- at least how to do a git clone, git push and git pull, and how to open pull requests

While you can learn these things and ask questions along the way, we would prefer that students already know them before starting on a PostGIS project.

Idea 1: Augment PostGIS 3.2 with the GiST support added in PostgreSQL 14

Expected outcome: Speed up GiST index building in PostGIS

Skills required: C (or willingness to learn it), ability to compile PostgreSQL code, ability to compile PostGIS code; some familiarity with PostGIS / PostgreSQL is preferable

Mentors: Giuseppe Broccolo, Regina Obe

Difficulty: Medium

Student Test:

  1. git clone the PostGIS code from one of the Git repos
  2. git clone the PostgreSQL code from the PostgreSQL Git repo (master branch)
  3. Compile both and install the PostGIS 3.2 (master branch) extension into a PostgreSQL 14 dev database
  4. Set up a public fork of the PostGIS repo for your work

Additional detail:

A recent patch <https://commitfest.postgresql.org/29/2276/> that adds more infrastructure to GiST has been included in PostgreSQL 14. It should speed up the build of a GiST index by first applying a fast pre-sort to the data to be indexed. Tests with PostgreSQL's built-in point type (which uses a Z-order sort of the points as the fast pre-sort) showed that the build is up to 5 times faster.
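To make the pre-sorting idea concrete, here is a minimal sketch of a Z-order (Morton) key in C. It assumes the point coordinates have already been quantized to a 16-bit integer grid, and the names GridPoint, morton2d and sort_points_zorder are hypothetical. It is not the code from the PostgreSQL patch, which works on the point type's floating-point coordinates; it only illustrates how interleaving coordinate bits yields a single sortable key that keeps nearby points close together.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical point type: coordinates already quantized to a 16-bit grid. */
typedef struct { uint16_t x, y; } GridPoint;

/* Spread the low 16 bits of v so they occupy the even bit positions. */
uint32_t spread_bits16(uint32_t v)
{
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

/* Z-order key: bits of x go to even positions, bits of y to odd positions. */
uint32_t morton2d(uint16_t x, uint16_t y)
{
    return spread_bits16(x) | (spread_bits16(y) << 1);
}

/* qsort comparator that orders points by their Z-order key. */
int cmp_morton(const void *a, const void *b)
{
    const GridPoint *pa = a;
    const GridPoint *pb = b;
    uint32_t ka = morton2d(pa->x, pa->y);
    uint32_t kb = morton2d(pb->x, pb->y);
    return (ka > kb) - (ka < kb);
}

/* Pre-sort an array of points in Z-order before handing it to an index builder. */
void sort_points_zorder(GridPoint *pts, size_t n)
{
    qsort(pts, n, sizeof *pts, cmp_morton);
}
```

Sorting the four corners of a 2x2 grid with sort_points_zorder visits them in the characteristic "Z" pattern: (0,0), (1,0), (0,1), (1,1).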

We need to find a suitable implementation for the PostGIS data types as well, which means finding the best algorithm for pre-sorting the geometries before the geospatial index is built. In essence, this requires adding this support function <https://github.com/glukhovn/postgres/blob/225a49161fae9388651373d4beb8dcba99059339/src/include/access/gist.h#L37> and this other one <https://github.com/glukhovn/postgres/blob/225a49161fae9388651373d4beb8dcba99059339/src/include/access/gist.h#L38> to the operator classes that use the GiST infrastructure (e.g. this one <https://github.com/postgis/postgis/blob/8b13c3e2f8366d902dbf516ec17de09ae84361f4/postgis/postgis.sql.in#L781>).
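As one candidate ordering for geometries, the sketch below reduces each geometry to its bounding box, quantizes the box center onto a grid, and sorts by position along a Hilbert curve. Everything here is an assumption made for illustration: the Box2D and KeyedBox types, the 65536-cell grid, the use of the box center, and the function names are all hypothetical, and nothing in this sketch uses the PostgreSQL or PostGIS APIs. Whether a Hilbert curve, a Z-order curve, or something else gives the best index is exactly what the project would need to measure.

```c
#include <stdint.h>
#include <stdlib.h>

#define GRID_SIDE 65536u /* assumed grid resolution per axis; must be a power of two */

/* Hypothetical stand-ins for a geometry's bounding box and its sort key. */
typedef struct { double xmin, ymin, xmax, ymax; } Box2D;
typedef struct { uint32_t key; Box2D box; } KeyedBox;

/* Distance along a Hilbert curve covering an n x n grid (n a power of two). */
uint32_t hilbert_index(uint32_t n, uint32_t x, uint32_t y)
{
    uint32_t d = 0;
    for (uint32_t s = n / 2; s > 0; s /= 2) {
        uint32_t rx = (x & s) ? 1u : 0u;
        uint32_t ry = (y & s) ? 1u : 0u;
        d += s * s * ((3u * rx) ^ ry);
        if (ry == 0) {
            /* Rotate/flip the quadrant so the sub-curve is oriented correctly. */
            if (rx == 1) { x = n - 1 - x; y = n - 1 - y; }
            uint32_t t = x; x = y; y = t;
        }
    }
    return d;
}

/* Map v in [lo, hi] to a grid cell in [0, GRID_SIDE - 1], clamping outliers. */
uint32_t quantize(double v, double lo, double hi)
{
    if (v <= lo) return 0;
    if (v >= hi) return GRID_SIDE - 1;
    double cell = (v - lo) / (hi - lo) * GRID_SIDE;
    return cell >= GRID_SIDE ? GRID_SIDE - 1 : (uint32_t)cell;
}

/* Sort key for a box: Hilbert index of its center within the data extent. */
uint32_t box_sort_key(const Box2D *b, const Box2D *extent)
{
    uint32_t cx = quantize((b->xmin + b->xmax) / 2.0, extent->xmin, extent->xmax);
    uint32_t cy = quantize((b->ymin + b->ymax) / 2.0, extent->ymin, extent->ymax);
    return hilbert_index(GRID_SIDE, cx, cy);
}

int cmp_keyed(const void *a, const void *b)
{
    uint32_t ka = ((const KeyedBox *)a)->key;
    uint32_t kb = ((const KeyedBox *)b)->key;
    return (ka > kb) - (ka < kb);
}

/* Compute each key once, then sort on the cheap integer comparison. */
void presort_boxes(KeyedBox *items, size_t n, const Box2D *extent)
{
    for (size_t i = 0; i < n; i++)
        items[i].key = box_sort_key(&items[i].box, extent);
    qsort(items, n, sizeof *items, cmp_keyed);
}
```

Computing the key once per item and comparing plain integers keeps the comparison cheap, which matters because the pre-sort is only worthwhile if it is fast relative to the index build itself.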

Tests are needed to quantify the performance improvement in index builds across the different geometries.

Last modified on Mar 8, 2021, 8:38:32 AM