Incorrect path used to obtain last modified time #12
Comments
With Klondike, you can monitor the logs in My best guess is that the network share may be causing problems, either by misreporting the last modified date of packages or just by general slowness. On one example setup we use, with over 13,000 packages on a local disk, a full synchronize takes under 30 seconds to complete and does not reindex packages which have not changed. If possible, you can keep everything in sync more effectively by pushing packages over HTTP to Klondike so the file watcher isn't needed. Additionally, keeping the packages on a local disk will improve I/O dramatically. If the logs indicate that old packages are being reindexed despite not being modified, there is either a bug in how NuGet.Lucene determines the last modified date of package files, or the network share is lying about the last modified dates. |
When the app stops and starts you'll see messages like this in the log:
|
I see a few shutdown messages, but not as often as I'd expect. I do see this however:
Which is way more packages updating than I should expect. I could try moving the storage to local to see if that resolves the timestamp issue. Clock drift has been pretty bad within our organization, but not enough to account for that many updates. I would expect it to be zero... |
Perhaps the file watcher isn't working correctly. I'm not sure how reliable file watchers are when observing network shares. |
This is definitely still an issue for me when running against a set of packages locally. |
Seeing all packages as always updated. Might be worthwhile to add an option to allow skipping of updated packages, as we never actually have updates once a package is published. If I added a PR for that, would it be an option worth having for others? |
I would prefer to understand why the code is misbehaving before adding a workaround to address it. When you manually inspect one of your package files, do the timestamps look how you think they should (for date created, last modified)? The code in question is in IndexDifferenceCalculator. It seems that if the existing package does not have a PublishDate, or if that date is off by more than one second from the last modified date of the file, the package will be reindexed. There could be a timezone bug here or some other problem. What locale is your server configured to use? |
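To make the rule described above concrete, here is a minimal sketch of the reindex decision as it is stated in this comment. It is written in Python for illustration only and is not the actual IndexDifferenceCalculator code; the function name `needs_reindex` and the `TOLERANCE` constant are hypothetical, and it assumes both timestamps are timezone-aware.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption taken from the comment above: timestamps that differ by more
# than one second are treated as "the file changed".
TOLERANCE = timedelta(seconds=1)


def needs_reindex(publish_date: Optional[datetime], last_modified: datetime) -> bool:
    """Hypothetical helper, not part of NuGet.Lucene.

    Both datetimes are assumed to be timezone-aware, so the subtraction
    compares absolute instants rather than wall-clock readings.
    """
    if publish_date is None:
        # No recorded publish date: nothing to compare against, so reindex.
        return True
    return abs(publish_date - last_modified) > TOLERANCE


if __name__ == "__main__":
    indexed = datetime(2014, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    on_disk = indexed + timedelta(seconds=2)
    print(needs_reindex(indexed, on_disk))
```

Under this rule, any consistent offset between the indexed date and the date read from disk (for example, one side reading a different timestamp than the other) would mark every package as modified on every synchronize, which matches the symptom reported in this thread.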
Still digging into this. The PhysicalFileSystem lib from Nuget is reporting a modified timestamp less than that of the creation/publish time that Lucene sees. Not sure where this would come from. It's not a timesync issue between the publishing server and the nuget server either. Moreover the OS thinks that Lucene is correct: A quick sanity check using scriptcs to test the std File lib shows that the way NuGet is getting the timestamp is consistent with the NuGet PhysicalFileSystem behavior: Any idea what the difference is between how Lucene/the freaking OS interpret the modified date vs. .NET's File lib? |
Both servers report the same locale (EST or UTC-05:00) and the time servers are both the same. Both servers report the same current time. |
LucenePackageRepository also has a workaround for when the filesystem reports a wacky last modified time and could be using the create date instead of last modified date in these cases. Maybe step into that code to see which one it is using? Even if the dates are wrong, once a package is indexed they should agree. |
Have you tried completely deleting your Lucene index and rebuilding it from scratch? Probably won't help but would add some evidence to the situation. Also, just to make sure: there are no background jobs outside of Klondike/NuGet.Lucene that are doing anything to the package files right? Except adding new ones as part of your build process? |
I have rebuilt multiple times during this process. As far as I know, there are no other processes interacting with the packages other than to read them. There was originally the thought that backups of the system might be touching the packages but I've since moved the packages to a separate system and the files continue to always report all packages as modified. |
I did debug and it appears that when using LucenePackageRepository it is always using the created time. While this shouldn't be a problem, it also appears that there is a modified date (even though it's not being reported via the FileSystem call) which may be where the mismatch is coming from. |
http://msdn.microsoft.com/en-us/library/system.io.file.getlastwritetime(v=vs.110).aspx says that it may return the date I wonder if the permissions on your files and directories are preventing the process from having access to read attributes on files. Perhaps resetting permissions and using something more lax would help? |
Actually, looking at it again, the path being given to GetLastModified is incorrect, so it can't find it on the first pass. Looking to see where that comes from now. |
yeah, so in GetLastModified for the Repo, it gets an odd path (ProjectName\PackageName.nupkg) whereas in the DiffCalculator it gets the filepath relative to the |
This appears to be where it loses contact with the actual FS path: https://github.com/themotleyfool/NuGet.Lucene/blob/master/source/NuGet.Lucene/LucenePackageRepository.cs#L319 |
Ah, this is making sense now. Some assumptions are made about a package location that won't always be true when packages are added to the file system by an external process (or if settings like GroupPackageFilesById are changed). This sounds like a bug in LucenePackageRepository. |
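As a rough illustration of the failure mode described above, the following sketch shows how a timestamp lookup can miss a file when the path is derived from an assumed layout rather than from where the file actually sits. It is Python for illustration only; the `<root>/<id>/<id>.<version>.nupkg` convention, the package name `Example`, and the helpers `assumed_path`, `last_modified`, and `demo` are all hypothetical and do not reflect NuGet.Lucene's real layout or API.

```python
import os
import tempfile
from typing import Optional, Tuple


def assumed_path(root: str, package_id: str, version: str) -> str:
    # Hypothetical convention: <root>/<id>/<id>.<version>.nupkg
    return os.path.join(root, package_id, f"{package_id}.{version}.nupkg")


def last_modified(path: str) -> Optional[float]:
    # Return the modification time, or None when no file exists at this path.
    try:
        return os.stat(path).st_mtime
    except FileNotFoundError:
        return None


def demo() -> Tuple[Optional[float], Optional[float]]:
    with tempfile.TemporaryDirectory() as root:
        # An external process (a build server, say) drops the package in a
        # flat layout instead of the layout the index assumes.
        actual = os.path.join(root, "Example.1.0.0.nupkg")
        with open(actual, "wb") as f:
            f.write(b"placeholder")

        guessed = assumed_path(root, "Example", "1.0.0")
        # The guessed path misses the file; the real path finds it.
        return last_modified(guessed), last_modified(actual)


if __name__ == "__main__":
    print(demo())
```

When the lookup at the guessed path fails, any fallback value used in its place will disagree with the timestamp read from the real path, so the two code paths never converge on the same date.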
In the short term, the best workaround is to organize your package files to be consistent with how NuGet.Lucene would place them if you were adding them that way. |
I can't change that for now, due to requirements of the TFS build system and the retention policy system. I can wait a bit for a fix though. |
This will be fixed in NuGet.Lucene 4.6 and the issue will be closed after releasing to nuget.org. |
🤘 Thanks for the fix. I looked at it but didn't really have enough knowledge of the system to figure out where that would have been. I'll probably run a test build from source in the meantime just to test that it works for our setup. |
Klondike 1.3.3 has been released and includes the fix for this issue. |
We're using OctoDeploy for site deployment along with Klondike to serve our NuGet site packages. We have upwards of 9 sites we're deploying, each churning out about 5-8 builds a day. This all adds up to about 150 packages within NuGet on average.
We're currently using the filesystem package watcher to keep the package listing up to date (as we publish the packages to a shared FS). The indexing takes 30-60 minutes every time it needs to be indexed. I've tried to keep IIS from shutting the app pool down so that the indexer stays on, but it seems to require a re-index every few hours, causing our ability to deploy the latest packages to grind to a halt while we wait for the index to update.