Skip to content
This repository has been archived by the owner on Sep 11, 2020. It is now read-only.

blame fails with "contents and commits have different length" #725

Closed
krylovsk opened this issue Jan 20, 2018 · 2 comments
Closed

blame fails with "contents and commits have different length" #725

krylovsk opened this issue Jan 20, 2018 · 2 comments
Labels

Comments

@krylovsk
Copy link
Contributor

krylovsk commented Jan 20, 2018

Calling blame fails on some files and seems to be easily repositories on any non-trivial repository, an example attached.

The problem seems to be in taking not the final blame graph slice (hence number of lines mismatches the number of commits per line). More specifically, the graph seems to be sorted in wrong order and the sorting currently relies on commit timestamp (https://github.com/src-d/go-git/blob/master/references.go#L50).

Any ideas why this could go wrong?

Here is an example on this (src-d/go-git) repository:

package main

import (
	"fmt"
	"os"

	"gopkg.in/src-d/go-billy.v4/osfs"
	git "gopkg.in/src-d/go-git.v4"
	"gopkg.in/src-d/go-git.v4/plumbing/object"
	"gopkg.in/src-d/go-git.v4/storage/filesystem"
)

func main() {
	var err error
	fsStorage := osfs.New(os.TempDir() + "/go-git/")
	gitStorage, err := filesystem.NewStorage(fsStorage)
	checkErr(err)

	// open repository
	gitRepo, err := git.Clone(gitStorage, nil, &git.CloneOptions{
		URL:      "https://github.com/src-d/go-git",
		Progress: os.Stdout,
	})
	checkErr(err)

	// iterate over commits
	cIter, err := gitRepo.CommitObjects()
	checkErr(err)
	err = cIter.ForEach(func(c *object.Commit) error {
		fmt.Println("==========================")
		fmt.Println("commit: ", c.Hash)
		// iterate over files
		cTree, err := c.Tree()
		checkErr(err)
		fIter := cTree.Files()
		err = fIter.ForEach(func(fileC *object.File) error {
			fmt.Println("file: ", fileC.Name)
			// get blame
			_, err := git.Blame(c, fileC.Name) // <---- returns "contents and commits have different length"
			checkErr(err)
			return nil
		})
		checkErr(err)
		return nil
	})
	checkErr(err)
}

func checkErr(err error) {
	if err != nil {
		panic(err)
	}
}

Output:

$ go run blame.go
Counting objects: 8402, done.
Compressing objects: 100% (6/6), done.
Total 8402 (delta 0), reused 3 (delta 0), pack-reused 8396
==========================
commit:  4d43799bf111a66b204312c79f1d0dd0d96108b1
file:  .gitignore
file:  .travis.yml
....
file:  plumbing/format/config/fixtures_test.go
file:  plumbing/format/config/option.go
panic: contents and commits have different length

goroutine 1 [running]:
main.checkErr(0x1626bc0, 0xc420252f80)
	/tmp/blame.go:53 +0x4a
main.main.func1.1(0xc4201629b0, 0xc4201629b0, 0x0)
	/tmp/blame.go:42 +0xeb
gopkg.in/src-d/go-git.v4/plumbing/object.(*FileIter).ForEach(0xc422cdfd50, 0xc422cdfcc0, 0x0, 0x0)
	/Users/krylovsk/src/gopkg.in/src-d/go-git.v4/plumbing/object/file.go:125 +0x91
main.main.func1(0xc423d160f0, 0xc420062420, 0x162f600)
	/tmp/blame.go:37 +0x32c
gopkg.in/src-d/go-git.v4/plumbing/object.(*storerCommitIter).ForEach.func1(0x162f600, 0xc422737580, 0xc422737580, 0x0)
	/Users/krylovsk/src/gopkg.in/src-d/go-git.v4/plumbing/object/commit.go:401 +0x8e
gopkg.in/src-d/go-git.v4/plumbing/storer.ForEachIterator(0x162a980, 0xc42016ed00, 0xc42016ed40, 0x0, 0x0)
	/Users/krylovsk/src/gopkg.in/src-d/go-git.v4/plumbing/storer/object.go:280 +0xa5
gopkg.in/src-d/go-git.v4/plumbing/storer.(*MultiEncodedObjectIter).ForEach(0xc42016ed00, 0xc42016ed40, 0xc420062420, 0x162c900)
	/Users/krylovsk/src/gopkg.in/src-d/go-git.v4/plumbing/storer/object.go:251 +0x41
gopkg.in/src-d/go-git.v4/plumbing/object.(*storerCommitIter).ForEach(0xc42016ed20, 0x144edf0, 0xc42016ed20, 0x0)
	/Users/krylovsk/src/gopkg.in/src-d/go-git.v4/plumbing/object/commit.go:395 +0x7c
main.main()
	/tmp/blame.go:29 +0x239
exit status 2
@krylovsk
Copy link
Contributor Author

Submitted a pull request that addresses the edge case described above where committer timestamps (Comitter.When) are equal. Comparing the author timestamps (Author.When) in this case allows to sort the commits correctly(?).

I seem to be facing another issue with blame though where the order of revisions is apparently correct and yet the file in the tree does not match the final revision of it returned by the blame graph algorithm (number of lines mismatch). Not sure it's related to this (sorting) or not, will do more tests.

@mcuadros mcuadros added the bug label Jan 25, 2018
@zkry
Copy link
Contributor

zkry commented Feb 26, 2018

I was looking into this as well. In the blame_test.go file, if you add the test for file {"https://github.com/spinnaker/spinnaker.git", "f39d86f59a0781f130e8de6b2115329c1fbe9545", "README.adoc", concat( it will return this error.

To resolve this I tried to add the merge commits to the list of revisions and this did fix the error from occuring but it produces the wrong list of authors when doing the test. I'm pretty sure this is because the algorithm looks at the merge commit and attributes all of the changes that the merge introduced to the author of the merge, when in reality, it should look at both parents and see which parts came from which parent.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

3 participants