Extension CommitsLOC sometimes counts wrong. #120
Comments
An update: Any ideas how we could solve this?
I have been concerned with this issue for a while as well and have created a similar query (albeit a bit more crude perhaps) to check the reliability of the hunks data. The query is:
While most results are identical, on a small sample (produced with a slightly outdated version of cvsanaly) it reports around 30% more results, some of them due to deleted files. Unfortunately, I have no specific idea how to fix this at the moment, but I'd be very interested to see how this gets resolved as well.
I asked on Stack Overflow and got an answer. This may work: http://stackoverflow.com/questions/7122833/how-to-tell-git-log-numshortstat-to-count-empty-lines
I'll have a closer look at that on Monday.
I just wrote an extension. Using the following SQL query, you can see how many mismatches Hunks has, down to the file level:
Querying voldemort reveals 8 mismatches. I'll look into it in more detail over the next few days.
I wrote a SQL query to better understand the data quality of the extension Hunks. For this, I sum up the added and removed lines of all hunks per commit and compare the result with the output of the extension CommitsLOC (which parses `git log --shortstat`). This is the query:
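For illustration only, a per-commit comparison of this kind can be sketched as follows. This is not the query from the issue: the table and column names (`hunks.added`, `hunks.removed`, `commits_lines`) are a simplified, hypothetical schema, the real cvsanaly tables may look different, and the rows are made-up toy data.

```python
import sqlite3

# Hypothetical, simplified schema (assumption, not the real cvsanaly layout).
SCHEMA = """
CREATE TABLE hunks (
    id INTEGER PRIMARY KEY,
    commit_id INTEGER,
    file_id INTEGER,
    added INTEGER,    -- lines added by this hunk (assumed column)
    removed INTEGER   -- lines removed by this hunk (assumed column)
);
CREATE TABLE commits_lines (
    commit_id INTEGER PRIMARY KEY,
    added INTEGER,    -- per-commit total as reported by the LOC extension
    removed INTEGER
);
"""

# Sum the hunks per commit and keep only commits whose totals disagree.
MISMATCH_QUERY = """
SELECT cl.commit_id,
       cl.added, cl.removed,
       COALESCE(h.added, 0), COALESCE(h.removed, 0)
FROM commits_lines cl
LEFT JOIN (
    SELECT commit_id, SUM(added) AS added, SUM(removed) AS removed
    FROM hunks
    GROUP BY commit_id
) h ON h.commit_id = cl.commit_id
WHERE cl.added <> COALESCE(h.added, 0)
   OR cl.removed <> COALESCE(h.removed, 0)
ORDER BY cl.commit_id
"""


def find_mismatches(conn):
    """Return (commit_id, loc_added, loc_removed, hunk_added, hunk_removed) rows."""
    return conn.execute(MISMATCH_QUERY).fetchall()


def demo():
    """Build an in-memory database with toy rows and run the comparison."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO hunks (commit_id, file_id, added, removed) VALUES (?, ?, ?, ?)",
        [(1, 10, 3, 1), (1, 11, 2, 0), (2, 10, 30, 27)],
    )
    conn.executemany(
        "INSERT INTO commits_lines (commit_id, added, removed) VALUES (?, ?, ?)",
        [(1, 5, 1), (2, 25, 22)],
    )
    return find_mismatches(conn)
```

In the toy data, commit 1 adds up and commit 2 does not, so only commit 2 is returned. The `LEFT JOIN` is a design choice so that commits with no hunks at all also show up as mismatches.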
While investigating why some commits don't add up, I have already published some patches to increase the data quality:
One thing that is really annoying is that CommitsLOC is sometimes off by up to 5 lines. I investigated the issue and found that this is a bug in git itself. I have already sent a bug report to the git mailing list, but so far there has been no answer.
Here is what I observed with the repo https://github.com/voldemort/voldemort.git :
The command `git log --numstat c21ad764` shows, for the commit c21ad764 and the file `.../readonly/mr/HadoopStoreBuilderReducer.java`, 25 lines added and 22 lines removed. But the patch of HadoopStoreBuilderReducer.java that I get with `git show c21ad764 -- contrib/hadoop-store-builder/src/java/voldemort/store/readonly/mr/HadoopStoreBuilderReducer.java` adds 30 lines and removes 27. So 5 added and 5 removed lines are missing with `git log --shortstat`!
More commits where I observed this problem on the same repository:
Maybe someone has an idea or C skills to build a working patch for git.
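To check other commits for the same discrepancy, the manual comparison above can be scripted. The following is a minimal sketch, not part of cvsanaly: the helper names (`parse_numstat`, `count_patch_lines`, `compare`) are hypothetical, and it assumes a local clone, a `git` binary on the PATH, and a non-merge commit that does not rename the file.

```python
import subprocess


def parse_numstat(text):
    """Map path -> (added, removed) from numstat output (tab-separated lines)."""
    stats = {}
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        added, removed, path = parts
        if added == "-" or removed == "-":  # binary files are reported as "-"
            continue
        stats[path] = (int(added), int(removed))
    return stats


def count_patch_lines(patch):
    """Count '+' and '-' lines that occur inside the hunks of a unified diff."""
    added = removed = 0
    in_hunk = False
    for line in patch.splitlines():
        if line.startswith("diff --git"):
            in_hunk = False  # a new file header starts; skip until the next hunk
        elif line.startswith("@@"):
            in_hunk = True
        elif in_hunk:
            if line.startswith("+"):
                added += 1
            elif line.startswith("-"):
                removed += 1
    return added, removed


def git(repo, *args):
    """Run git in the given repository and return its stdout as text."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True,
        encoding="utf-8", errors="replace",
    )
    return result.stdout


def compare(repo, commit, path):
    """Return (numstat_counts, patch_counts) for one file in one commit."""
    numstat = parse_numstat(
        git(repo, "show", "--numstat", "--format=", commit, "--", path))
    patch = count_patch_lines(
        git(repo, "show", "--format=", commit, "--", path))
    return numstat.get(path), patch
```

Calling `compare("/path/to/voldemort", "c21ad764", "contrib/hadoop-store-builder/src/java/voldemort/store/readonly/mr/HadoopStoreBuilderReducer.java")` returns both pairs of counts, so a mismatch shows up as two different tuples. The hunk tracking in `count_patch_lines` is there so that the `---` and `+++` file headers are not counted as changed lines.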