Opened 6 years ago

Closed 6 years ago

#3722 closed task (fixed)

migrate grass svn repositories to git

Reported by: martinl
Owned by: grass-dev@…
Priority: major
Milestone:
Component: Default
Version: unspecified
Keywords: svn, git, migration
Cc:
CPU: Unspecified
Platform: Unspecified

Description

This ticket summarizes current efforts of migrating GRASS svn repositories (https://svn.osgeo.org/grass/) to git.

Pilot git repository (of main grass svn repo) is available for testing at

http://geo102.fsv.cvut.cz:8090/grass/grass

List of branches:

https://svn.osgeo.org/grass/grass/branches/ compare with http://geo102.fsv.cvut.cz:8090/grass/grass/branches

List of tags:

https://svn.osgeo.org/grass/grass/tags/ compare with http://geo102.fsv.cvut.cz:8090/grass/grass/tags

Change History (25)

comment:1 by martinl, 6 years ago

Open issues:

  • choice of git repository manager (master)
      1. join Github OSGeo group, https://github.com/OSGeo
      2. Gitlab (or Bitbucket, ...): no OSGeo group exists
      3. https://git.osgeo.org/gitea
  • choice of issue tracker
      1. stay with trac
      2. use native git repository manager issue tracker

in reply to:  1 ; comment:2 by martinl, 6 years ago

Replying to martinl:

  • choice of issue tracker
  1. stay with trac

see https://trac.osgeo.org/osgeo/ticket/2233

comment:4 by mankoff, 6 years ago

I'm not a developer, but I'm interested in helping more. My preference: GitLab, with issues in GitLab. I think issues should live next to the source because these systems all support nice integration: pull requests, patches, comments and bugs can all easily reference each other, as well as files and lines in files, via #, %, ~ and similar codes.

in reply to:  2 comment:5 by martinl, 6 years ago

Replying to martinl:

Replying to martinl:

  • choice of issue tracker
  1. stay with trac

see https://trac.osgeo.org/osgeo/ticket/2233

first test with trac and git at

http://geo102.fsv.cvut.cz:8091/grass

more specific

http://geo102.fsv.cvut.cz:8091/grass/browser/grass

in reply to:  6 comment:7 by martinl, 6 years ago

Replying to martinl:

Unfortunately issue lacks link to related commit, any idea what could be wrong?

probably something related to https://trac.edgewall.org/wiki/CommitTicketUpdater#Checkpermissions (?)

comment:8 by pmav99, 6 years ago

Hello Martin, If you don't mind a couple of questions.

  1. How did you convert the svn repo to a git one?
  2. The git repo you posted says that it has 36743 commits. Is that the whole repo or just trunk?

in reply to:  8 comment:9 by martinl, 6 years ago

Replying to pmav99:

  1. How did you convert the svn repo to a git one?

https://trac.osgeo.org/grass/browser/grass-addons/tools/svn2git

  2. The git repo you posted says that it has 36743 commits. Is that the whole repo or just trunk?

http://geo102.fsv.cvut.cz:8090/grass/grass/branches

comment:10 by sbl, 6 years ago

Probably useful migration tool to look at (for gitlab): https://github.com/tracboat/tracboat

Also this: https://docs.gitlab.com/ce/user/project/import/svn.html

comment:11 by pmav99, 6 years ago

I don't have any experience with SVN to GIT conversions, but shouldn't there be 70k+ commits?

I did run a couple of tests with subgit. It seems a solid project. They also offer Syncing between GIT and SVN which might be worth checking out, but root access to the SVN repo is needed.

Anyway, if someone wants to try it out, the commands for converting only trunk are (this takes about 6 hours):

subgit configure https://svn.osgeo.org/grass/grass/trunk --layout directory grass_trunk.git
subgit install grass_trunk.git

The result is a repo with 36800+ commits which is comparable to 36743. That's why I asked, if that was trunk only.

I also tried to convert the whole SVN repo:

subgit configure https://svn.osgeo.org/grass/grass --layout std grass_full.git
subgit install grass_full.git

but there is a problem with revision 68772. The logs suggest that it has to do with an old SVN bug: https://subversion.apache.org/docs/issue4129 . This is the log:

SubGit version 3.3.5 ('Bobique') build #4042

Translating Subversion revisions to Git commits...

INSTALLATION FAILED

error: The Subversion repository is corrupted at revision r68772 because of Subversion issue http://subversion.apache.org/docs/issue4129 .
error: To prevent this error in the futureupdate your Subversion server version to 1.7.5 or to 1.6.18.
error: To recover your repository perform svnadmin dump/load procedure:
error: $ svnadmin dump path/to/svn/repository > repo.dump
error: $ svnadmin create path/for/recovered/svn/repository
error: $ svnadmin load path/for/recovered/svn/repository < repo.dump
error: After that please re-clone the repository.
error: svn: E204900: Checksum mismatch in branches/releasebranch_7_2/lib/db/sqlp/sql.html: expected 7245ea62da02566d2edcab901919d4e9 but found 0061e8e51c3b551ac6123c88a66940cc
error: Checksum mismatch in branches/releasebranch_7_2/lib/db/sqlp/sql.html: expected 7245ea62da02566d2edcab901919d4e9 but found 0061e8e51c3b551ac6123c88a66940cc
error: Unexpected error has occurred; please report along with the logs ('/home/username/Prog/svn/subgit-install-20190128-125015.zip')
error:   to http://issues.tmatesoft.com/, thank you!

in reply to:  11 ; comment:12 by martinl, 6 years ago

Replying to pmav99:

I don't have any experience with SVN to GIT conversions, but shouldn't there be 70k+ commits?

SVN works differently compared to Git. Not all commits went into trunk in SVN.

$ git branch 
* master
$ git log | grep -c ^commit
36743

$ git checkout releasebranch_7_6 
Branch releasebranch_7_6 set up to track remote branch releasebranch_7_6 from origin.
Switched to a new branch 'releasebranch_7_6'
$ git log | grep -c ^commit
36620

$ git checkout releasebranch_6_4 
Checking out files: 100% (9643/9643), done.
Branch releasebranch_6_4 set up to track remote branch releasebranch_6_4 from origin.
Switched to a new branch 'releasebranch_6_4'
$ git log | grep -c ^commit
22575
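
The counts above come from matching `commit` header lines in the `git log` output. A minimal Python sketch of the same count, assuming the default `git log` format; the sample log text and the function name `count_commits` are made up for illustration:

```python
import re

# Header lines in default `git log` output look like "commit <40 hex chars>".
# Message lines are indented, so they never match at the start of a line.
COMMIT_HEADER = re.compile(r"^commit [0-9a-f]{40}\b", re.MULTILINE)


def count_commits(git_log_text: str) -> int:
    """Count commit header lines, like `git log | grep -c ^commit`."""
    return len(COMMIT_HEADER.findall(git_log_text))


# Hypothetical two-commit log, for illustration only (not real GRASS history).
sample_log = """\
commit 1111111111111111111111111111111111111111
Author: Example Dev <dev@example.org>
Date:   Mon Jan 28 12:00:00 2019 +0100

    commit message lines are indented, so this one is not counted

commit 2222222222222222222222222222222222222222
Author: Example Dev <dev@example.org>
Date:   Sun Jan 27 12:00:00 2019 +0100

    initial import
"""

print(count_commits(sample_log))
```

On a real checkout, `git rev-list --count <branch>` reports the number directly without parsing the log.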

in reply to:  12 comment:13 by martinl, 6 years ago

Replying to martinl:

$ git branch 
* master
$ git log | grep -c ^commit
36743

In SVN:

$ svn info
URL: https://svn.osgeo.org/grass/grass/trunk
Relative URL: ^/grass/trunk
$ svn log | grep -c '^r[0-9]'
36900

The number is slightly different because the experimental git repo is not up to date. Either way, it is nowhere near 7e4 commits.
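
To illustrate why the trunk count is so much lower than the highest revision number: SVN revision numbers are shared by everything under the repository root (trunk, branches, tags and the sibling projects), so only some of them touch trunk. A rough sketch with an invented revision table; the paths, revision numbers and the helper `revisions_touching` are hypothetical, and unlike `svn log` it does not follow copy history:

```python
from typing import Dict, List

# Hypothetical revision -> changed paths table; not real GRASS history.
CHANGED_PATHS: Dict[int, List[str]] = {
    101: ["/grass/trunk/lib/gis/parser.c"],
    102: ["/grass/branches/releasebranch_7_6/lib/gis/parser.c"],
    103: ["/grass-addons/grass7/raster/r.example/main.c"],
    104: [
        "/grass/trunk/raster/r.slope.aspect/main.c",
        "/grass/trunk/raster/r.slope.aspect/r.slope.aspect.html",
    ],
    105: ["/grass/tags/grass_7_6_0/VERSION"],
}


def revisions_touching(prefix: str, changed: Dict[int, List[str]]) -> List[int]:
    """Return the revisions in which at least one changed path lies under prefix."""
    return sorted(
        rev
        for rev, paths in changed.items()
        if any(path.startswith(prefix) for path in paths)
    )


trunk_revs = revisions_touching("/grass/trunk/", CHANGED_PATHS)
print(len(CHANGED_PATHS), "revisions in the repository,", len(trunk_revs), "on trunk")
```

In this toy table the repository has five revisions but trunk history contains only two of them, which is the same effect as 36900 trunk revisions in a repository whose revision numbers run far higher.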

in reply to:  14 comment:15 by martinl, 6 years ago

Replying to martinl:

  • SVN revisions should be rewritten to full URL, eg.

Solved by local modification of PHP code, see

https://github.com/landam/grass-gis-git-migration-test/issues/50#issuecomment-460298867

source: https://trac.osgeo.org/grass/ticket/42#comment:4
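
The actual fix was made in the PHP code linked above. As a rough Python illustration of the idea only, the sketch below expands bare `rNNNN` references into full URLs. The changeset URL layout (`<base>/changeset/<revision>`) is an assumption based on how Trac usually serves changesets, and the names `TRAC_CHANGESET_URL` and `expand_revision_refs` are invented here:

```python
import re

# Assumption: Trac serves changesets at <base>/changeset/<revision>.
TRAC_CHANGESET_URL = "https://trac.osgeo.org/grass/changeset/{rev}"

# Match "r" followed by digits as a standalone token, not inside a word
# (e.g. "parser2") and not directly after a "/" in a path or URL.
REVISION_REF = re.compile(r"(?<![\w/])r(\d+)\b")


def expand_revision_refs(text: str) -> str:
    """Replace bare SVN revision references such as r68772 with full URLs."""
    return REVISION_REF.sub(
        lambda match: TRAC_CHANGESET_URL.format(rev=match.group(1)), text
    )


print(expand_revision_refs("fixed in r68772, see also r123"))
```

Once the tickets live outside Trac, a bare `r68772` no longer resolves to anything, which is why a full URL is needed in the migrated text.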


comment:16 by neteler, 6 years ago

Here is the PHP snippet for translating the components to a reduced set of git labels:

   'Compiling' => 'core',
   'Database' => 'core',
   'Default' => 'core',
   'Display' => 'core',
   'Docs' => 'docs',
   'Imagery' => 'core',
   'Installation' => 'packaging',
   'LibGIS' => 'core',
   'LibOpenGL' => 'core',
   'LibRaster' => 'core',
   'LibVector' => 'core',
   'License' => 'core',
   'Packaging' => 'packaging',
   'Parser' => 'core',
   'Projections/Datums' => 'core',
   'Ps.map' => 'core',
   'PyGRASS' => 'python',
   'Python' => 'python',
   'Python ctypes' => 'python',
   'Raster' => 'core',
   'Raster3D' => 'core',
   'Startup' => 'core',
   'Tcl/Tk' => 'core',
   'Tcl/Tk NVIZ' => 'core',
   'Temporal' => 'core',
   'Tests' => 'unittests',
   'Translations' => 'translations',
   'Vector' => 'core',
   'wxGUI' => 'core',
   'Addons' => NULL,
   'Datasets' => NULL,
   'Shell Scripts' => NULL,
   'Website' => NULL

Hence the following are not needed (addons are in a separate git repo, shell scripts are no longer there, and datasets and the CMS are handled differently):

#   'Addons' => NULL,
#   'Datasets' => NULL,
#   'Shell Scripts' => NULL,
#   'Website' => NULL

So, the target git labels are now:

  • core
  • docs
  • python
  • packaging
  • translations
  • unittests
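
For readers who want to check the mapping, here is the same table as a small Python sketch (the migration itself uses the PHP array above; `COMPONENT_TO_LABEL` and `label_for` are names invented for this example). It derives the set of target labels from the table, treating `None` as "no label"; returning `None` for components missing from the table is an assumption of this sketch:

```python
from typing import Dict, Optional

# Trac component -> git label, copied from the PHP array above.
COMPONENT_TO_LABEL: Dict[str, Optional[str]] = {
    "Compiling": "core",
    "Database": "core",
    "Default": "core",
    "Display": "core",
    "Docs": "docs",
    "Imagery": "core",
    "Installation": "packaging",
    "LibGIS": "core",
    "LibOpenGL": "core",
    "LibRaster": "core",
    "LibVector": "core",
    "License": "core",
    "Packaging": "packaging",
    "Parser": "core",
    "Projections/Datums": "core",
    "Ps.map": "core",
    "PyGRASS": "python",
    "Python": "python",
    "Python ctypes": "python",
    "Raster": "core",
    "Raster3D": "core",
    "Startup": "core",
    "Tcl/Tk": "core",
    "Tcl/Tk NVIZ": "core",
    "Temporal": "core",
    "Tests": "unittests",
    "Translations": "translations",
    "Vector": "core",
    "wxGUI": "core",
    "Addons": None,
    "Datasets": None,
    "Shell Scripts": None,
    "Website": None,
}


def label_for(component: str) -> Optional[str]:
    """Return the git label for a Trac component, or None if it gets no label."""
    return COMPONENT_TO_LABEL.get(component)


target_labels = sorted(
    {label for label in COMPONENT_TO_LABEL.values() if label is not None}
)
print(target_labels)
```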

in reply to:  14 comment:17 by martinl, 6 years ago

Replying to martinl:

BTW, I did some tests with trac2github, https://github.com/trustmaster/trac2github

For the record, my fork, which includes GRASS-related changes, is available at https://github.com/landam/trac2github/tree/grass

comment:18 by pmav99, 6 years ago

AFAI can tell, all the test git repos are currently down. Is there a link to a working one?

in reply to:  18 comment:19 by martinl, 6 years ago

Replying to pmav99:

AFAI can tell, all the test git repos are currently down. Is there a link to a working one?

Right. Since most respondents to the GRASS git migration survey (https://docs.google.com/forms/d/1BoTFyZRNebqVX98A3rh5GpUS2gKFfmuim78gbradDjc/viewanalytics) prefer GitHub as the target platform, I have disabled the gitlab/trac service.

comment:20 by martinl, 6 years ago

First part of github migration is available for review.

grass repo containing only GRASS7+ development is available at https://github.com/grass-svn2git/grass:

branches: https://github.com/grass-svn2git/grass/branches/all

tags: https://github.com/grass-svn2git/grass/tags

Log messages have been rewritten by source:grass-addons/tools/svn2git/rewrite.py

ALL LOGS ARE AVAILABLE FOR REVIEW AT

http://geo102.fsv.cvut.cz/~landa/grass-svn2git/

PLEASE CONTRIBUTE! HELP WITH REVIEWING.

Basically the rewrite script should:

I will push the other repos (grass-legacy, grass-addons and grass-promo) once the log messages have been reviewed and corrected.

comment:21 by pmav99, 6 years ago

Thank you Martin, Awesome work

comment:22 by neteler, 6 years ago

Thank you for your huge efforts to write the converter scripts, Martin!

Just a small glitch (maybe we can happily ignore it):

The commit message

comment:23 by neteler, 6 years ago

News: I have migrated legacy GRASS GIS versions to github (thanks to Antonio Galea for teaching me special git tricks). The migration script is here:

https://trac.osgeo.org/grass/browser/grass-addons/tools/svn2git/git_processing_legacy_branches.sh

Legacy code repository of GRASS GIS versions 3.2, 4.0, 4.1, 4.2, 4.3 (1987-1999):

The versions GRASS GIS 5.x and 6.x can now be merged on top of that.

Note: For now the legacy GRASS GIS versions are stored under the temporary grass-svn2git owner (in the end we want to move it to the "OSGeo" organization).

Wish: If anyone has an idea to do further commit message rewriting in grass-legacy (e.g. extracting the first author from the various main.c etc files and update GITAUTHOR with that), please speak up!
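
One possible starting point for that wish, as a hedged Python sketch: pull the first author name out of a source file header. It assumes a header line of the form `AUTHOR(S): Name, Affiliation`, which the legacy files may well not follow consistently; the regex, the sample header and the function name `first_author` are all illustrative, not taken from the migration scripts:

```python
import re
from typing import Optional

# Assumed header convention: a comment line such as
#   " * AUTHOR(S): Name, Affiliation"
# Legacy sources may use other wording, so this is only a starting point.
AUTHOR_LINE = re.compile(r"AUTHOR(?:\(S\)|S)?\s*:\s*(.+)", re.IGNORECASE)


def first_author(source_text: str) -> Optional[str]:
    """Return the first author named in a file header, or None if not found."""
    match = AUTHOR_LINE.search(source_text)
    if match is None:
        return None
    # Keep only the first entry when several are listed, separated by "," or ";".
    first = re.split(r"[,;]", match.group(1))[0]
    # Drop comment decoration and surrounding whitespace.
    return first.strip(" */\t") or None


# Hypothetical header, for illustration only.
sample_header = """\
/****************************************************
 * MODULE:    r.example
 * AUTHOR(S): Jane Doe, Example Lab
 *            John Roe
 ****************************************************/
"""

print(first_author(sample_header))
```

The extracted name could then be supplied through git's `GIT_AUTHOR_NAME` environment variable during a history rewrite; how to map it to an e-mail address is left open here.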

comment:25 by martinl, 6 years ago

Resolution: fixed
Status: new → closed