Skip to content
This repository has been archived by the owner on Sep 13, 2023. It is now read-only.

Incorrect path used to obtain last modified time #12

Closed
plukevdh opened this issue Jan 28, 2014 · 25 comments
Closed

Incorrect path used to obtain last modified time #12

plukevdh opened this issue Jan 28, 2014 · 25 comments

Comments

@plukevdh
Copy link

We're using OctoDeploy for site deployment along with Klondike to serve our NuGet site packages. We have upwards of 9 sites we're deploying, each churning out about 5-8 builds a day. This all adds up to about 150 packages within NuGet on average.

We're currently using the filesystem package watcher to keep the package listing up to date (as we publish the packages to a share FS). The indexing takes 30-60 minutes everytime it needs to be indexed. I've tried to keep IIS from shutting the app pool down so that the indexer stays on, but it seems to require a re-index every few hours, causing our ability to deploy latest packages to grind to a halt while we wait for the index to update.

  1. Is there a better way to do this?
  2. The indexer appears to reindex ALL packages. The way I read the code, it seems it should skip any packages that are not modified (we never modify a package after it's been pushed to the FS) so the majority of the packages should get skipped on a reindex, but this does not appear to be the case. Am I incorrect?
  3. Thanks for this project, it's a great idea, I'm just concerned about the perf we're seeing here.
@chriseldredge
Copy link

With Klondike, you can monitor the logs in ~/App_Data/Logs/Klondike.log which should give you some additional insight into when the app is starting up or shutting down. If you are manually synchronizing the index, the log should also say how many packages it detected as modified, new or deleted.

My best guess is that the network share may be causing problems, either by misreported the last modified date of packages and just by general slowness.

On one example set up we use with over 13,000 packages on a local disk, a full synchronize takes under 30 seconds to complete and does not reindex packages which have not changed.

If possible, you can keep everything in sync more effectively by pushing packages over http to Klondike so the file watcher isn't needed. Additionally, keeping the packages on a local disk will improve I/O dramatically.

If the logs indicate that old packages are being reindexed despite not being modified, there is either a bug in how NuGet.Lucene determines the last modified date of package files, or the network share is lying about the last modified dates.

@chriseldredge
Copy link

When the app stops and starts you'll see messages like this in the log:

2014-01-28 11:20:19,290 INFO   225 LuceneRepositoryConfigurator Dispose Stopping Lucene indexing services.
2014-01-28 11:20:19,352 INFO   71 PackageIndexer IndexUpdateLoop Update task shutting down.
2014-01-28 11:20:28,805 INFO   1 LucenePackageRepository Initialize Refreshing index. Package count: 13461

@plukevdh
Copy link
Author

I see a few shutdown messages, but not as often as I'd expect. I do see this however:

2014-01-24 10:24:20,646 INFO   6 LucenePackageRepository <RegisterCacheWarmingCallback>b__1b Refreshing index. Package count: 79
2014-01-24 13:14:45,398 INFO   49 PackageIndexer SynchronizeIndexWithFileSystem Updates to process: 47 packages added, 78 packages updated, 1 packages removed.
2014-01-24 13:14:45,400 INFO   6 PackageIndexer ApplyUpdates Processing 1 updates.
2014-01-24 13:14:45,415 INFO   6 LucenePackageRepository <RegisterCacheWarmingCallback>b__1b Refreshing index. Package count: 78
2014-01-24 13:14:50,255 INFO   6 PackageIndexer ApplyUpdates Processing 1 updates.
2014-01-24 13:14:50,348 INFO   6 LucenePackageRepository <RegisterCacheWarmingCallback>b__1b Refreshing index. Package count: 78
2014-01-24 13:14:53,898 INFO   6 PackageIndexer ApplyUpdates Processing 1 updates.
2014-01-24 13:14:53,947 INFO   6 LucenePackageRepository <RegisterCacheWarmingCallback>b__1b Refreshing index. Package count: 78
2014-01-24 13:14:57,591 INFO   6 PackageIndexer ApplyUpdates Processing 1 updates.
2014-01-24 13:14:57,648 INFO   6 LucenePackageRepository <RegisterCacheWarmingCallback>b__1b Refreshing index. Package count: 78
2014-01-24 13:15:00,740 INFO   6 PackageIndexer ApplyUpdates Processing 1 updates.
2014-01-24 13:15:00,773 INFO   6 LucenePackageRepository <RegisterCacheWarmingCallback>b__1b Refreshing index. Package count: 78
2014-01-24 13:15:11,232 INFO   6 PackageIndexer ApplyUpdates Processing 1 updates.
2014-01-24 13:15:11,292 INFO   6 LucenePackageRepository <RegisterCacheWarmingCallback>b__1b Refreshing index. Package count: 78
2014-01-24 13:15:15,319 INFO   6 PackageIndexer ApplyUpdates Processing 1 updates.
2014-01-24 13:15:15,639 INFO   6 LucenePackageRepository <RegisterCacheWarmingCallback>b__1b Refreshing index. Package count: 78
2014-01-24 13:15:17,878 INFO   6 PackageIndexer ApplyUpdates Processing 1 updates.
2014-01-24 13:15:17,929 INFO   6 LucenePackageRepository <RegisterCacheWarmingCallback>b__1b Refreshing index. Package count: 78
2014-01-24 13:15:21,161 INFO   6 PackageIndexer ApplyUpdates Processing 1 updates.
2014-01-24 13:15:21,217 INFO   6 LucenePackageRepository <RegisterCacheWarmingCallback>b__1b Refreshing index. Package count: 78
2014-01-24 13:15:25,008 INFO   6 PackageIndexer ApplyUpdates Processing 1 updates.
2014-01-24 13:15:25,065 INFO   6 LucenePackageRepository <RegisterCacheWarmingCallback>b__1b Refreshing index. Package count: 78
2014-01-24 13:15:38,713 INFO   6 PackageIndexer ApplyUpdates Processing 1 updates.

Which is way more packages updating than I should expect. I could try moving the storage to local to see if that resolves the timestamp issue. Clock drift has been pretty bad within our organization, but not enough to account for that many updates. I would expect it to be zero...

@chriseldredge
Copy link

Perhaps the file watcher isn't working correctly. I'm not sure how reliable file watchers are when observing network shares.

@plukevdh
Copy link
Author

plukevdh commented Feb 2, 2014

This is definitely still an issue for me when running against a set of packages locally.

@plukevdh
Copy link
Author

plukevdh commented Feb 2, 2014

Seeing all packages as always updated. Might be worthwhile to add an option to allow skipping of updated packages, as we never actually have updates once a package is published. If I added a PR for that, would it be an option worth having for others?

@chriseldredge
Copy link

I would prefer to understand why the code is misbehaving before adding a work around to address it. When you manually inspect one of your file packages, do the timestamps look how you think they should (for date created, last modified).

The code in question is in IndexDifferenceCalculator.

It seems that if the existing package does not have a PublishDate, or if that date is off by more than one second from the last modified date of the file, the package will be reindexed.

There could be a timezone bug here or some other problem. What locale is your server configured to use?

@plukevdh
Copy link
Author

plukevdh commented Feb 3, 2014

Still digging into this. The PhysicalFileSystem lib from Nuget is reporting a modified timestamp less than that of the creation/publish time that Lucene sees.

klondike debugging - 2014-02-03 11-40-07

Not sure where this would come from. It's not a timesync issue between the publishing server and the nuget server either. Moreover the OS thinks that Lucene is correct:

sup 2014-02-03 11-40-58

A quick sanity check using scriptcs to test the std File lib shows that the way NuGet is getting the timestamp is consistent with the NuGet PhysicalFileSystem behavior:

scriptcs 111-f39gi 2014-02-03 11-46-55

Any idea what the difference is between how Lucene/the freaking OS interpret the modified date vs. .NET's File lib?

@plukevdh
Copy link
Author

plukevdh commented Feb 3, 2014

Both servers report the same locale (EST or UTC-05:00) and the time servers are both the same. Both servers report the same current time.

@chriseldredge
Copy link

LucenePackageRepository also has a workaround for when the filesystem reports a wacky last modified time and could be using the create date instead of last modified date in these cases. Maybe step into that code to see which one it is using?

Even if the dates are wrong, once a package is indexed they should agree.

@chriseldredge
Copy link

Have you tried completely deleting your Lucene index and rebuilding it from scratch? Probably won't help but would add some evidence to the situation.

Also, just to make sure: there are no background jobs outside of Klondike/NuGet.Lucene that are doing anything to the package files right? Except adding new ones as part of your build process?

@plukevdh
Copy link
Author

plukevdh commented Feb 3, 2014

I have rebuilt multiple times during this process. As far as I know, there are no other processes interacting with the packages other than to read them. There was originally the thought that backups of the system might be touching the packages but I've since moved the packages to a separate system and the files continue to always report all packages as modified.

@plukevdh
Copy link
Author

plukevdh commented Feb 3, 2014

I did debug and it appears that when using LucenePackageRepository it is always using the created time. While this shouldn't be a problem, it also appears that there is a modified date (even though it's not being reported via the FileSystem call) which may be where the mismatch is coming from.

@chriseldredge
Copy link

http://msdn.microsoft.com/en-us/library/system.io.file.getlastwritetime(v=vs.110).aspx says that it may return the date 12:00 midnight, January 1, 1601 A.D. when the path does not exist.

I wonder if the permissions on your files and directories are preventing the process from having access to read attributes on files. Perhaps resetting permissions and using something more lax would help?

@plukevdh
Copy link
Author

plukevdh commented Feb 3, 2014

Actually, looking at it again, the path being given to GetLastModified is incorrect, so it can't find it on the first pass. Looking to see where that comes from now.

@plukevdh
Copy link
Author

plukevdh commented Feb 3, 2014

yeah, so in GetLastModified for the Repo, it gets an odd path (ProjectName\PackageName.nupkg) whereas in the DiffCalculator it gets the filepath relative to the NuGet.Lucene.Web:PackagesPath set in the webconfig. Looks like the former is based off of the GroupPackageFilesById setting but does not provide the full path. Just the filename. So the Repo is not accounting for packages that might be nested within a directory structure deeper than one level.

@plukevdh
Copy link
Author

plukevdh commented Feb 3, 2014

This appears to be where it loses contact with the actual FS path: https://github.com/themotleyfool/NuGet.Lucene/blob/master/source/NuGet.Lucene/LucenePackageRepository.cs#L319

@chriseldredge
Copy link

Ah, this is making sense now. Some assumptions are made about a package location that won't always be true when packages are added to the file system by an external process (or if settings like GroupPackageFilesById are changed).

This sounds like a bug in LucenePackageRepository.

@chriseldredge
Copy link

In the short term, the best work around is to organize you package files to be consistent with how NuGet.Lucene would place them if you were adding them that way.

@plukevdh
Copy link
Author

plukevdh commented Feb 3, 2014

I can't change that for now, due to requirements of the TFS build system and the retention policy system. I can wait a bit for a fix though.

@chriseldredge
Copy link

Found two issues that were causing synchronization to work incorrectly when detecting modified package files: 056e7c2 and 2e36fbc.

@chriseldredge
Copy link

This will be fixed in NuGet.Lucene 4.6 and the issue will be closed after releasing to nuget.org.

@plukevdh
Copy link
Author

plukevdh commented Feb 7, 2014

🤘 Thanks for the fix. I looked at it but didn't really have enough knowledge of the system to figure out where that would have been. I'll probably run a test build from source in the mean time just to test it works for our setup.

@chriseldredge
Copy link

@chriseldredge
Copy link

Klondike 1.3.3 has been released and includes the fix for this issue.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Development

No branches or pull requests

2 participants