Wednesday, 16 September 2015

Migrating from SVN to Git

Our project team had used SVN as its source control management tool for 8 months. All development was done on trunk, and releases were created from a release branch. When the project (a new responsive website on a new multi-channel CMS platform) went live, we continued developing on trunk; hot-fixes were made on the release branch and then merged into trunk. Merges were intentionally kept small and frequent, and we avoided feature branches altogether.

Strategically, the business wanted to adopt a number of new tools, including Slack (team messaging and collaboration), Asana (project and task management) and Git, so I was tasked with migrating our project from SVN to Git. The repository would be hosted on GitHub, which minimises maintenance: there is no server to manage, no disk space to monitor, and no SCM tool to upgrade to a later version.

This post describes the process.

The Atlassian tutorial [1] provides 5 simple steps for migrating from SVN to Git:

  1. Preparing the environment
  2. Converting the SVN repository to a local Git repository
  3. Synchronizing the local Git repository when the SVN repository changes
  4. Sharing the Git repository with developers
  5. Migrating development effort from SVN to Git

1. Preparing the Environment

After downloading and running the migration script, I found that the versions of Git and SVN available with the Linux distribution [2] did not meet the script's version requirements. It was therefore necessary to install the latest version of Git (2.5.2) from source [3].

After downloading the Git source, I found that the necessary build dependencies, such as the gcc compiler and the make tool, were also missing. These were installed using the yum "Development Tools" package group.

The Atlassian migration script was still complaining about the SVN version, so the latest version (1.8.14) was also installed by configuring an additional yum repository [4].

The final piece of the jigsaw was to install the correct Subversion Perl bindings for the updated SVN by executing yum install subversion-perl.

Finally, the svn-migration-scripts verify command ran successfully.

[root@devapp1 git-migration]# java -jar svn-migration-scripts.jar verify
svn-migration-scripts: using version 0.1.56bbc7f
Git: using version 2.5.2
Subversion: using version 1.8.14
git-svn: using version 2.5.2
[root@devapp1 git-migration]#
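
The preparation steps above can be sketched as a small shell script. This is only an illustration under stated assumptions, not the exact commands that were run: the source directory name (git-2.5.2), the /usr/local install prefix and the run helper are hypothetical, and the additional yum repository from [4] is assumed to be configured already. By default the script only prints the commands; set DRY_RUN=0 to actually execute them as root.

```shell
# Dry-run helper (hypothetical): prints each command instead of running it,
# unless DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_environment() {
    # gcc, make and the other tools needed to build Git from source
    run yum groupinstall -y "Development Tools"

    # Build and install Git from an unpacked source tree.
    # GIT_SRC and the /usr/local prefix are assumptions for this sketch.
    run make -C "${GIT_SRC:-git-2.5.2}" prefix=/usr/local all
    run make -C "${GIT_SRC:-git-2.5.2}" prefix=/usr/local install

    # Newer Subversion (from the extra yum repository) plus its Perl bindings
    run yum install -y subversion subversion-perl

    # Re-check the tool versions with the migration scripts
    run java -jar svn-migration-scripts.jar verify
}

prepare_environment
```

Printing the plan first makes it easy to review what would change on the server before committing to it. Depending on the system, building Git may need further development libraries beyond the ones mentioned here.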

2. Converting the SVN repository to a local Git repository

The tutorial mentions running:
git svn clone --stdlayout --authors-file=authors.txt <svn-repo>/<project> <git-repo-name>

This fails with the following error if the SVN repository requires authentication.

Password for 'myuser': Can't locate Term/ in @INC (@INC contains: /usr/lib/perl/site_perl/5.14
/usr/lib/perl5/site_perl/5.14/x86_64-cygwin-threads /usr/lib/perl5/vendor_perl
/5.14/x86_64-cygwin-threads /usr/lib/perl5/vendor_perl/5.14 /usr/lib/perl5/5.14/x86_64-cygwin-
threads /usr/lib/perl5/5.14 .) at /usr/lib/perl5/vendor_perl/5.14/ line 565.

It is therefore necessary to first run the svn checkout command so that SVN caches the credentials, and then to install the Perl Term::ReadKey package from the CPAN shell:
perl -MCPAN -e shell

Once the shell has started, install the ReadKey package:
install Term::ReadKey

It should now be possible to create a local Git repository from the SVN repository.
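
As a rough sketch of the sequence described in this step, the script below first sanity-checks the authors file and then caches the SVN credentials before cloning. The function names, the authors.txt location and the throwaway checkout directory are assumptions made for illustration. The check only confirms that each line has the "svn_user = Full Name <email>" shape that git svn expects; it does not verify that every SVN committer is listed.

```shell
# Return 0 if the authors file exists and every non-empty line looks like
#   svn_user = Full Name <email>
check_authors_file() {
    [ -r "$1" ] || return 1
    # grep -v prints the lines that do NOT match the expected shape;
    # the second grep succeeds if any such line is non-blank.
    ! grep -vE '^[^=]+=[^<>]+<[^<>]*>[[:space:]]*$' "$1" | grep -q '[^[:space:]]'
}

# Hypothetical wrapper: $1 is the SVN project URL, $2 the new Git directory.
clone_from_svn() {
    svn_url="$1"
    target_dir="$2"

    check_authors_file authors.txt || {
        echo "authors.txt is missing or has malformed lines" >&2
        return 1
    }

    # A throwaway, empty checkout prompts for the password once so that
    # svn can cache the credentials for the git svn run that follows.
    svn checkout --depth empty "$svn_url" "$(mktemp -d)" || return 1

    git svn clone --stdlayout --authors-file=authors.txt "$svn_url" "$target_dir"
}
```

Nothing runs until clone_from_svn is called with the repository URL and a target directory. Whether svn actually stores the password depends on the local Subversion configuration.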

Cloning the SVN repository creates the SVN branches and tags as remote branches. The clean-git script included in svn-migration-scripts.jar kept deleting the branches and tags that were being created, so git branch only showed the master branch.

Executing "java -Dfile.encoding=utf-8 -jar ../svn-migration-scripts.jar clean-git --force --no-delete" allowed us to create the local git branches including all the obsolete and deleted ones.

Executing git branch now showed the local git branches, but git tag was still not returning any tags.

The following script, which changes the path to the tags, was used to create the Git tags [5].

Create Git Tags
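
The script itself is not reproduced here, so the following is only a minimal sketch of one common way to turn the remote tag references left by git svn into real Git tags; it is not the script from [5]. It assumes the SVN tags were imported under refs/remotes/origin/tags (pass a different prefix as the first argument if your clone uses another layout), and the function name is hypothetical.

```shell
# Create a lightweight Git tag for every SVN tag ref under the given prefix.
create_tags_from_svn_refs() {
    prefix="${1:-refs/remotes/origin/tags}"   # assumed location of SVN tags

    git for-each-ref --format='%(refname)' "$prefix" |
    while read -r ref; do
        # Strip the prefix so refs/remotes/origin/tags/v1.0 becomes v1.0
        tag="${ref#"$prefix"/}"

        # Skip refs such as "v1.0@123", which git svn can leave behind
        # when a tag was deleted and recreated in SVN.
        case "$tag" in
            *@*) continue ;;
        esac

        git tag "$tag" "$ref"
    done
}
```

Run it from inside the converted repository; afterwards git tag should list one tag per SVN tag. The remote refs are left untouched, so the step can be inspected before anything is cleaned up.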

3. Synchronizing the local Git repository when SVN changes

Synchronizing the local Git repository with SVN changes works as described in the documentation: fetch the latest changes, rebase the Git repository, and then clean up.

git svn fetch

java -Dfile.encoding=utf-8 -jar ~/svn-migration-scripts.jar sync-rebase

java -Dfile.encoding=utf-8 -jar ~/svn-migration-scripts.jar clean-git --force

4. Sharing the Git repository

This step involves pushing the local Git repository to a public Git repository server and then allowing developers to pull commits from, and push commits to, the public repository.

To facilitate pushing changes to a remote repository, Git uses a 'remote' reference as an alias for the remote repository URL. The remote reference can be named anything one chooses, but if the remote repository is going to serve as the official codebase for the project, it is conventionally called 'origin':
git remote add origin https://\\/\.git

Once the remote has been set up, the changes can be pushed to the remote repository.

git push -u origin --all
git push --tags

Instead of Bitbucket, our project team used GitHub to host our remote repository. Uploading the Git repository failed because the history contained files that had at some point been larger than 100MB, which GitHub doesn't allow. The offending file contained the binary data of images that were pruned once development matured (the images were moved out of the codebase to be managed in the production environment). However, to allow the Git repository to be uploaded, it was necessary to remove the offending file from the history. This is where the BFG utility came to the rescue [6].
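
Before rewriting history, it helps to know which objects are over the limit. The sketch below is one way to list them and is not part of the original migration; the function name is hypothetical, and the default threshold of 104857600 bytes (100 MiB) is an assumption about how the hosting limit is measured.

```shell
# List blobs anywhere in the repository history that exceed a size limit.
# Prints "<size-in-bytes> <path>" for each offending blob.
list_large_blobs() {
    limit_bytes="${1:-104857600}"   # assumed limit: 100 MiB

    # rev-list emits "<sha> <path>" for every object reachable from any ref;
    # cat-file adds the object type and size, keeping the path as %(rest).
    git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
    awk -v limit="$limit_bytes" '$1 == "blob" && $3 > limit { print $3, $4 }'
}
```

Run list_large_blobs inside the converted repository, optionally with a smaller limit in bytes to see the next-largest candidates. Paths containing spaces are cut at the first space in this simple version.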

To Be Continued....


[2] Red Hat Enterprise Linux Server release 6.6 (Santiago)