Clean Room Run, trial #4

Done

  • rsync fresh copy of m-c (completed in 8m)

  • find & remember packs:

    $ ls -1 mozilla-cvs-history/objects/pack/
    pack-0f19f543bfed791cc99f3168547e28cd86598c3e.idx
    pack-0f19f543bfed791cc99f3168547e28cd86598c3e.pack
    $ pack_id=pack-0f19f543bfed791cc99f3168547e28cd86598c3e
  • allow access to CVS:

    $ pushd mozilla-central-git/.git/objects/pack
    $ ln -s ../../../../mozilla-cvs-history/objects/pack/$pack_id.pack .
    $ ln -s ../../../../mozilla-cvs-history/objects/pack/$pack_id.idx .
    $ popd
  • define graft points. Note that there is overlap (double commits) for some period after the tarball was created, and we don’t want those commits in the final repo. So look back in the cvs history to find the first commit that also appears in hg, and use its parent as the last cvs commit:

    • in mozilla-cvs-history, find first commit in hg:

      $ pushd mozilla-cvs-history
      $ git log --oneline --grep 374866
      e230b03 Bug 374866. Reftests for text-transform. r=dbaron
    • find its parent:

      $ git log -n 1 e230b03^
      commit 3ec464b55782fb94dbbb9b5784aac141f3e3ac01
      Author: ccooper%deadsquid.com <ccooper%deadsquid.com>
      Date:   Thu Mar 22 21:53:05 2007 +0000
      
          - update headers
          - remove pango workarounds
          - update wget files paths
    • remember that:

      $ last_cvs=3ec464b55782fb94dbbb9b5784aac141f3e3ac01
      $ popd
  • pick a first hg commit. 2 choices: database load (r1), or first real commit (r2):

    • first real commit (r2), located via the bug number in the commit comment at http://hg.mozilla.org/mozilla-central/rev/2:

      $ cd mozilla-central-git
      $ git log --grep 374866
      commit 4b3fd916f29286af4e257f17226063a0435df54a
      Author: roc+@cs.cmu.edu <roc+@cs.cmu.edu>
      Date:   Thu Mar 22 16:01:14 2007 -0700
      
          Bug 374866. Reftests for text-transform. r=dbaron
    • first load (r1), most easily as parent of r2:

      $ git log -n1 4b3fd916f29286af4e257f17226063a0435df54a^
      commit 2514a423aca5d1273a842918589e44038d046a51
      Author: hg@mozilla.com <hg@mozilla.com>
      Date:   Thu Mar 22 10:30:00 2007 -0700
      
          Free the (distributed) Lizard! Automatic merge from CVS:
          Module mozilla: tag HG_REPO_INITIAL_IMPORT at

    We seem to get the best results using r1, even though that produced poorer results in current repos. The key difference is that we are choosing a different point in the CVS history as “last cvs commit”.

    $ first_hg=2514a423aca5d1273a842918589e44038d046a51

    With this choice, the following is observed:

    • the diff for the Bug 374866 commit matches that in bugzilla.
    • the “Free the lizard” commit consists solely of deletions of files that exist in cvs but not in hg. (There are 5 lines of additions, but these are all in files being deleted. We believe these come from double commits between the “true last cvs commit” and the “last cvs commit that matters to mozilla-central” as computed above.)
    • no “git blame” anomalies are observed when using 3ec464b5 as the last cvs commit. (These can be seen in the prior version by using “git blame client.mk” on, e.g. line 82.)
  • graft:

    $ pushd mozilla-central-git/.git
    $ echo $first_hg $last_cvs > info/grafts
    $ popd
  • run the filter-branch on branch master using the modified git-filter-branch script:

    $ pushd mozilla-central-git
    $ type -a git-filter-branch-keep-rewrites
    git-filter-branch-keep-rewrites is /opt/vcs2vcs/bin/git-filter-branch-keep-rewrites
    $ git filter-branch-keep-rewrites -- $last_cvs..HEAD
    # ^^ completed successfully ; started at: 2012-11-10T18:36+0000
    $ mv .git-rewrite ../mc-git-rewrite
    $ popd
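
The graft step above writes a single line of the form "<commit> <parent>" into info/grafts. Because `$first_hg` and `$last_cvs` are set in separate steps, an unset variable would silently yield a line with no parent, which would turn the commit into a root. A minimal sketch of a guard, assuming a hypothetical helper `is_full_sha` that is not part of the original run:

```shell
# Hypothetical helper (not in the original log): succeed only if the
# argument is a full 40-character lowercase hex object id.
is_full_sha() {
    case "$1" in
        *[!0-9a-f]*) return 1 ;;   # contains a non-hex character
    esac
    [ "${#1}" -eq 40 ]
}

# Values recorded earlier in this log.
first_hg=2514a423aca5d1273a842918589e44038d046a51
last_cvs=3ec464b55782fb94dbbb9b5784aac141f3e3ac01

if is_full_sha "$first_hg" && is_full_sha "$last_cvs"; then
    # Same content the log writes to info/grafts: "<commit> <parent>".
    graft_line="$first_hg $last_cvs"
    echo "$graft_line"
else
    echo "refusing to write graft: bad or empty commit id" >&2
fi
```

The echoed line could then be redirected to info/grafts exactly as in the graft step.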

To Do

  • update the git-mapfile
  • run filter-branch for the remaining branches in mozilla-central

Misc

  • output from current github repo, showing the first hg commit being double committed (one from the hg conversion, one from the cvs conversion):

    $ git log --oneline --grep 374866
    0054412 Bug 374866. Reftests for text-transform. r=dbaron
    6c45cfc Bug 374866. Reftests for text-transform. r=dbaron
    $ git describe --contains 0054412 6c45cfc
    MOZILLA_1_9_a4_BASE~843
    MOZILLA_1_9_a4_BASE~28617

    This shows the commit appearing twice: 843 commits before the MOZILLA_1_9_a4_BASE tag, and 28,617 commits before the tag. The intervening 27,774 commits constitute the overlap the prior repos contained.

    With the older approach of grafting using a “last cvs” as previously computed, problems could appear with “git blame” when run on any file modified in those intervening 27,774 commits. One file that shows this behavior is “client.mk” (first occurrence at line 82). Note that not all 27,774 commits had anything to do with Firefox, as the cvs repository contained all the repositories at that time.

  • the first filter-branch run deleted the .git-rewrite dir (unknown why, but it’s unix, so either PATH or MODE), so we renamed our copy of the filter script to git-filter-branch-keep-rewrites and restarted
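
The overlap count quoted in the first item above can be rechecked from the two `git describe --contains` results. A small sketch, with variable names chosen here purely for illustration:

```shell
# The two descriptions reported by `git describe --contains` above.
near=MOZILLA_1_9_a4_BASE~843
far=MOZILLA_1_9_a4_BASE~28617

# Strip everything up to the last "~" to keep only the commit distance.
near_n=${near##*\~}
far_n=${far##*\~}

# Commits lying between the two copies of the Bug 374866 commit.
overlap=$((far_n - near_n))
echo "$overlap"
```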
