Problem if # commits == 100 or 1000 #139
Comments
Hello Mark. Greetings
Hi Alex,
Sorry for the vagueness. It was the end of a long day.
I think there are two ways to recreate this. The first is how I found it: in repositoryhandler/backends/git.py, in GitRepository.log(), hack cmd to include an argument to git log that limits the number of extracted commits to 100 (that is, have GitRepository.log() call git log -100). This is useful if you need to test stuff in the context of a big code base, but don't want to wait for lots of commits to be processed while testing. The second would be to develop a repository with exactly 100 commits. Ugh.
Anyway, once you do that, if you run cvsanaly2, you will get an error in DBContentHandler:

    def __get_person(self, person):
        # <snip>
        name = to_utf8(person.name)

Here person.name has the value None, which to_utf8 doesn't like. This only occurs for the last (100th) commit being processed.
If you follow the chain of calls backwards, the list of commits (from which person originates) is populated in DBTempLog. Within DBTempLog, the information about the last commit is missing (see the dictionary above that is mostly filled with values of None). My guess is that the last lines of the commit log aren't making it (1) to the parser and/or (2) out of the parser. So, the information doesn't get into DBTempLog.
I'm currently working around the problem by not using 100 or 1000 as my commit limit. But I'll be happy to help with bug finding, confirming, and squashing.
Best,
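For anyone trying to narrow down where the data goes missing, a check along these lines could be run over the parsed commits before they reach DBContentHandler. This is only a sketch: the `Person` and `Commit` classes and the `find_incomplete_commits` helper are hypothetical stand-ins, not the real CVSAnalY types, and the field names are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    # Hypothetical stand-in; only the .name attribute matters for this check.
    name: Optional[str] = None
    email: Optional[str] = None


@dataclass
class Commit:
    # Hypothetical stand-in for a parsed commit record.
    revision: str
    author: Optional[Person] = None
    message: Optional[str] = None


def find_incomplete_commits(commits: List[Commit]) -> List[str]:
    """Return the revisions of commits whose author name or message is missing."""
    incomplete = []
    for commit in commits:
        author_missing = commit.author is None or commit.author.name is None
        if author_missing or commit.message is None:
            incomplete.append(commit.revision)
    return incomplete
```

If the symptom matches the description above, only the revision of the last (oldest) commit should come back from such a check.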
So, if I do a little black magic (edit repositoryhandler.git.log to request only 100 commits, implying git log -100 for a "limited history"), the last commit does not get properly inserted into DBTempLog in __writer. In fact, it comes in ill-formed: the revision field is filled in, but there is no "data" (author, log message, etc.).
Adding a little debugging code to __writer shows the issue:
Gives the following (for the oldest commit ... sort of HEAD~100):
I receive the same sort of error with a history limited to 100 and to 1000. With a history of 10, 50, 105, 200, or 500, it works just fine. I'm guessing that one of the AsyncQueues (maybe in the parser?) isn't flushing out its lines properly and is leaving one at the tail end.
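The kind of failure being guessed at can be illustrated with a toy parser. This is not the CVSAnalY parser: it assumes a simplified git-log-like format, and `make_log` and `parse_log` are made-up names. It only shows the general class of bug, where a parser that emits a commit when it sees the next `commit` header has to flush the pending one at end of input. It does not explain why only 100 and 1000 would trigger the problem.

```python
def make_log(n):
    """Build a synthetic, simplified git-log-style listing with n commits."""
    lines = []
    for i in range(n):
        lines.append("commit %040x" % i)
        lines.append("Author: Dev %d <dev%d@example.com>" % (i, i))
        lines.append("")
        lines.append("    message %d" % i)
        lines.append("")
    return lines


def parse_log(lines, flush_at_end=True):
    """Group lines into commit dicts; a commit is emitted when the next header appears."""
    commits = []
    current = None
    for line in lines:
        if line.startswith("commit "):
            if current is not None:
                commits.append(current)
            current = {"revision": line.split()[1], "author": None, "message": []}
        elif current is None:
            continue  # ignore anything before the first commit header
        elif line.startswith("Author: "):
            current["author"] = line[len("Author: "):]
        elif line.strip():
            current["message"].append(line.strip())
    # Without this final flush, the last commit in the input never comes out.
    if flush_at_end and current is not None:
        commits.append(current)
    return commits
```

Comparing `len(parse_log(make_log(100)))` with `len(parse_log(make_log(100), flush_at_end=False))` shows the tail commit going missing when the flush is skipped.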
Input is very welcome.
Best,
Mark