Migrating Specific Folders from Subversion to Git

First, a bit of background: I’m currently a sophomore studying Computer Science (CS) at the University of Illinois at Urbana-Champaign (UIUC). In all of our CS classes, we use Subversion to submit our labs and programming assignments (referred to as MPs). For the convenience of the professors and teaching assistants, each student has a separate folder within the class repository for submitting work. Effectively, that folder is our individual repository: we check it out once and then continue to update and check in files throughout the semester. For my personal use I prefer Git, so it makes sense to convert that folder, history included, to Git before archiving it.

Fortunately, this has gotten rather easy to do in the last couple of years, but there is a step that I always forget because I want to migrate a specific folder instead of a whole repository. I’m documenting this procedure for myself, so it may not work exactly as written with your specific Subversion setup.

Credits: I’d like to thank John Albin and SleePy for providing me with the various pieces I needed to get this working.

  • First, we need to get the list of all committers from Subversion’s logs. This can be achieved with John Albin’s handy little awk one-liner, run from inside a checked-out working copy:

    svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors-transform.txt

That will grab the log entries (the -q flag leaves out the commit messages), pluck out the usernames, sort them, eliminate any duplicates and place the result into an “authors-transform.txt” file.
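To see what that pipeline does without touching a real server, here is a small sketch that feeds it a made-up sample of `svn log -q` output. The two usernames, the dates and the authors-sample.txt filename are invented for illustration; only the awk and sort stages are the ones from the command above.

```shell
# Hypothetical stand-in for `svn log -q`; the revisions and usernames are made up.
sample_log() {
cat <<'EOF'
------------------------------------------------------------------------
r3 | rkapoor | 2014-02-01 10:15:00 -0600 (Sat, 01 Feb 2014)
------------------------------------------------------------------------
r2 | jdoe | 2014-01-28 09:00:00 -0600 (Tue, 28 Jan 2014)
------------------------------------------------------------------------
r1 | rkapoor | 2014-01-21 18:30:00 -0600 (Tue, 21 Jan 2014)
------------------------------------------------------------------------
EOF
}

# Same awk and sort stages as above: take the second |-separated field of each
# revision line, trim one space from each side, and print "user = user <user>".
sample_log \
  | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' \
  | sort -u > authors-sample.txt

cat authors-sample.txt
# authors-sample.txt now contains:
#   jdoe = jdoe <jdoe>
#   rkapoor = rkapoor <rkapoor>
```

Note that rkapoor committed twice in the sample but appears only once in the file, which is the job of `sort -u`.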

  • We need to edit each line in that file to add the full name and email address that Git expects from its committers. For example, convert:

    rkapoor = rkapoor <rkapoor>

    into:

    rkapoor = Rohan Kapoor <[email protected]>
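If every committer happens to have an address of the form username@some-domain, the email half of that edit can be filled in mechanically. This is only a sketch under that assumption: example.edu is a placeholder domain, the two file names are invented so that the real authors-transform.txt is not touched, and the display names still have to be fixed by hand.

```shell
# Assumption: each committer's address is <username>@example.edu (placeholder domain).
domain="example.edu"

# Stand-in for the generated authors file so the snippet runs on its own.
printf '%s\n' 'jdoe = jdoe <jdoe>' 'rkapoor = rkapoor <rkapoor>' > authors-demo.txt

# Append @domain inside the trailing <...> of each line.
# Lines whose <...> already contains an @ do not match and are left alone.
sed -E "s/<([^<>@]+)>\$/<\1@${domain}>/" authors-demo.txt > authors-demo-filled.txt

cat authors-demo-filled.txt
# authors-demo-filled.txt now contains:
#   jdoe = jdoe <jdoe@example.edu>
#   rkapoor = rkapoor <rkapoor@example.edu>
```

Writing to a second file instead of editing in place keeps the original around in case the pattern does not fit your usernames.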

  • Next, we use git svn to clone the specific directory from the repository.

    git svn clone http://example.com/svn/project/folder --no-minimize-url --no-metadata -A authors-transform.txt folder

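    The --no-minimize-url flag makes sure that git svn only clones the specific directory without trying to connect to the root of the repository. The -A flag points git svn at the authors file from the previous step, and --no-metadata stops it from appending a git-svn-id line to every commit message, which is fine for a one-time migration.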
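Once the clone finishes, a quick sanity check is to list the distinct authors Git recorded and confirm that each one shows a real name and address instead of a bare Subversion username. A minimal sketch follows; the function name list_authors is made up here, and "folder" is assumed to be the clone directory from the command above.

```shell
# Print each distinct "Name <email>" pair found in a repository's history.
# list_authors is a hypothetical helper name used only in this sketch.
list_authors() {
    git -C "$1" log --all --format='%an <%ae>' | sort -u
}

# "folder" is the directory created by the clone; do nothing if it is absent.
if [ -d folder/.git ]; then
    list_authors folder
fi
```

Any line that still looks like a plain username is a sign that the matching entry in authors-transform.txt was not edited before cloning.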

  • Finally (optional), we can add a remote to the git repository and push it out to a remote provider like GitHub or Bitbucket.

    git remote add origin ssh://[email protected]/path/to/your/repo

    git push -u origin HEAD
