This repository has been archived by the owner on Jun 8, 2022. It is now read-only.

[Feature Request] OSX / FSEvents #54

Open
robfig opened this issue Jul 22, 2013 · 30 comments


@robfig
Contributor

robfig commented Jul 22, 2013

Hi,

One chap submitted a pull request to watch the entire GOPATH worth of source:
https://github.com/robfig/revel/pull/153

That seems to work in inotify (and I believe windows as well) -- however, kqueue chokes on it due to the number of file descriptors required.

This article suggests that FSEvents would be the mechanism to use for directory-level monitoring:
http://stackoverflow.com/questions/1772209/file-level-filesystem-change-notification-in-mac-os-x

Have you thought about integrating this sort of functionality into your package? I think it would be valuable (for revel in particular!).

Thanks again!
Rob

@DisposaBoy

FWIW, I believe the inotify limit is per user which means it's shared by all other programs run by the user.

[ `sysctl fs.inotify.max_user_watches` | done: 48.58157ms ]
    fs.inotify.max_user_watches = 8192

that's on Ubuntu and I think that's the default

@howeyc
Owner

howeyc commented Jul 30, 2013

Unfortunately I can't add FSEvents functionality as I don't have an OSX machine and from what I can tell BSD doesn't have FSEvents.

@pmonks

pmonks commented Aug 15, 2013

I have an OSX machine and would be happy to help test. I'm an utter go n00b however, so probably won't be much use in writing the code (yet).

@paulhammond
Contributor

I just spent a few minutes looking into this. I think there are two main challenges here (aside from your lack of an OS X machine):

Firstly, the FSEvents API is designed to support easily monitoring an entire directory tree, including subdirectories, not just a single directory. This is good; it's the reason FSEvents would work for watching an entire GOPATH, but it doesn't map exactly to the existing fsnotify API. Implementing FSEvents without watching subdirectories isn't going to be much more useful than the existing code. #56 is possibly relevant here.

Secondly, FSEvents is not a set of syscalls like kqueue; it's only available as a C/Objective-C framework. Behind the scenes, as with many things on the Mac, it's managed by a separate fseventsd process. That in turn listens on /dev/fsevents, which you could in theory read from, but the Mac kernel developers have said that the non-framework API is unsupported and likely to cause problems. But using the framework means that any code that uses fsnotify on a Mac will have to be compiled under cgo, which might cause other problems elsewhere.

At this point I gave up - hopefully someone can find a way to make this work...

@robfig
Contributor Author

robfig commented Aug 28, 2013

I definitely agree with #56 that a recursive watch is super common and should be supported.

To your second point, is there a problem that it requires a framework, or just that it may be difficult to make it work? This project has supposedly done it successfully:
https://github.com/samjacobson/go.fsevents

Using cgo doesn't seem like a problem, especially since it would only be used for the platform that requires/uses it.

@paulhammond
Contributor

Good point - I usually avoid cgo based code because it makes cross compilation much harder, if not impossible. That's not a good reason to not offer FSEvents when available.

That said, I'm not going to be building this (working through the issues helped me find an alternate solution to the specific problem I was hoping to solve with FSEvents)

@nathany
Contributor

nathany commented Sep 3, 2013

I have OS X and would be willing to help out once the recursive watcher is in place. If @samjacobson adds an open source license to his project, it could be a good basis for adding FSEvents.

@samjacobson

I was contacted by @nathany regarding this issue. I can't add an open source license to go.fsevents because it's not mine; it's a fork of Steven Degutis' code. I made the fork when I was working on a project that required this functionality. I ended up implementing my own version, which I would be happy to provide under a suitable license.

@nathany
Contributor

nathany commented Sep 8, 2013

@samjacobson That would be great! Fsnotify is under a 3-clause BSD license, if that works for you.

@samjacobson

I've forked fsnotify, and put together a rough implementation. See https://github.com/samjacobson/fsnotify. This is based on my "EventStream" code, and is not perfectly matched to fsnotify semantics yet. At present the Watcher requires a list of paths to watch when it is created. In order to support Watch and RemoveWatch the (internal) FSEventStream will need to be recreated. I'll send a pull request when it's closer to ready.

I changed fsnotify_bsd.go to be used on darwin if cgo is not available. fsnotify_osx will be used on darwin if cgo is available. Any comments / assistance would be appreciated. I understand the FSEventStream (OSX API) well, and of course my original cgo interface, but I don't fully understand the fsnotify API yet.
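The selection rule described above would really be applied at compile time through build constraints on the source files. As a rough illustration only, the same rule can be written as a plain function; `adapterFor` and the adapter names it returns are hypothetical labels, not identifiers from the fork.

```go
package main

import "fmt"

// adapterFor mirrors the file-selection rule from this comment: on darwin the
// FSEvents code is used when cgo is available, and the kqueue (BSD) code
// otherwise. The returned names are illustrative labels only.
func adapterFor(goos string, cgoEnabled bool) string {
	switch goos {
	case "darwin":
		if cgoEnabled {
			return "fsevents"
		}
		return "kqueue"
	case "freebsd", "openbsd", "netbsd":
		return "kqueue"
	case "linux":
		return "inotify"
	}
	return "other"
}

func main() {
	for _, cgo := range []bool{true, false} {
		fmt.Printf("darwin, cgo=%v -> %s\n", cgo, adapterFor("darwin", cgo))
	}
}
```

Exposing something like this to applications would let them report which mechanism is actually in use.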

@howeyc
Owner

howeyc commented Sep 8, 2013

@samjacobson That looks very promising, thank you!

@nathany
Contributor

nathany commented Sep 9, 2013

@samjacobson This is a great start. Thanks for pitching in.

So there will be some overhead each time a Watch is added to the Watcher?

It may be worth having an internal API that can watch multiple paths at once for adding a recursive watch. Here's some "pseudo-code" of what I have in mind so far, though I haven't worked out the details of internal state (watches, fsnFlags) across subdirectories, and there may be a better OS-specific way to do this.

@samjacobson

There are a couple of places where the FSEvents API is not particularly compatible with fsnotify. Firstly, if you monitor a directory with FSEvents then the monitor is recursive; there is no non-recursive option. Secondly, when you set up an FSEventStream you specify the paths to watch when the stream is created (the stream is subsequently added to a runloop and started).

I had a look at @nathany's proposed API changes, and the new WatchPath method with options looks like a good fit for OSX. To support this fully I think it would need to work like so:

  • For each call to WatchPath a new FSEventStream would be created (they should all be able to be scheduled on the same runloop). A map from path -> stream should be kept so that RemoveWatch can just stop and invalidate the FSEventStream.
  • For non-recursive watches the callback handler (fsevtCallback) should filter the returned events, dropping those that come from subdirectories.

I haven't taken the time to understand fsnFlags yet; hopefully that doesn't invalidate the above.

@nathany
Contributor

nathany commented Sep 9, 2013

@samjacobson That makes perfect sense to me. This is a 2-day old "rough sketch" of a pipeline to which we could add a step to filter events for non-recursive watches and only enable it on OS X.

From what I can tell, fsnFlags are indexed by ev.Name (filename) and therefore copied around for every watch. Besides renaming them to triggers, my thought is to replace fsnFlags map[string]uint32 with a map from ev.Name -> *pipeline, where the pipeline struct stores the options and intermediary data that the pipeline steps need.
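A minimal sketch of that idea, assuming a pipeline is just an ordered list of steps plus the options they need. All names here (`pipeline`, `step`, `triggerStep`, the event kinds) are hypothetical and only illustrate the shape of the data; they are not the fsnotify API.

```go
package main

import "fmt"

// Hypothetical event kinds, expressed as bit flags like fsnFlags.
const (
	Create uint32 = 1 << iota
	Modify
	Delete
	Rename
)

type event struct {
	Name string
	Kind uint32
}

// step inspects an event and returns false to drop it.
type step func(p *pipeline, ev event) bool

// pipeline stores per-watch options and the steps that use them.
type pipeline struct {
	triggers uint32 // which event kinds the user asked for
	steps    []step
}

// process runs the steps in order and stops at the first one that drops
// the event.
func (p *pipeline) process(ev event) bool {
	for _, s := range p.steps {
		if !s(p, ev) {
			return false
		}
	}
	return true
}

// triggerStep keeps only the event kinds listed in p.triggers.
func triggerStep(p *pipeline, ev event) bool {
	return ev.Kind&p.triggers != 0
}

// watcher replaces a map[string]uint32 of flags with a map of pipelines.
type watcher struct {
	pipelines map[string]*pipeline
}

// dispatch reports whether ev should be forwarded to the user.
func (w *watcher) dispatch(ev event) bool {
	p, ok := w.pipelines[ev.Name]
	return ok && p.process(ev)
}

func main() {
	w := &watcher{pipelines: map[string]*pipeline{
		"a.go": {triggers: Create | Modify, steps: []step{triggerStep}},
	}}
	fmt.Println(w.dispatch(event{Name: "a.go", Kind: Modify}))
	fmt.Println(w.dispatch(event{Name: "a.go", Kind: Delete}))
}
```

Further steps (throttling, recursive filtering) would slot into `steps` without changing the map.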

I will do a pull request (in the next few days) that just adds a single pipeline step (triggers/fsnFlags) and the Event interface. That way we have some infrastructure in place while working on the other pipeline steps and public API.

@samjacobson

@nathany I should note that above, one of the reasons I suggested a separate FSEventStream per watch was to support a separate throttle for each watch (as supported by your API). The FSEvents API has a latency parameter which is used to throttle the delivery of events. It allows "compression" of events, ie two modify events of the same path could become one if they happen within 'latency' seconds.

@nathany
Contributor

nathany commented Sep 9, 2013

Oh. That is one of the parameters that I thought might make more sense as global to the Watcher. Perhaps w.SetDuration(1 * time.Second); w.WatchPath(...? Please do suggest changes to that API. We may be stuck with it for a while, so it's important to get right.

@samjacobson

If the duration was global to the watcher then there would be more flexibility in the implementation of the FSEvents interface, ie it could use either one FSEventStream or many.

Wrt the API in general: my only concerns (being pedantic) are:

  • I've not played with kqueue before, but it looks like you watch a file descriptor? If so, then moving the directory that you are watching will probably not affect the watch (ie the fd is still valid, just at a different path)? This means that you watch a directory rather than a path. FSEvents watches a path instead of a directory; for this reason it has a "watch root" option (kFSEventStreamCreateFlagWatchRoot) which will trigger events if the path (or any one of its parents) is altered in such a way as to remove/change the watch.
  • FSEvents will watch a path recursively, except that the recursion will not extend across filesystems (think find -x).

The above 2 issues could be hidden by OSX specific code, of course, but there is a cost involved (in runtime, code size, development time, etc). I think it's worth considering whether a few alterations to fit OSX better are worthwhile.

Could recursive always be one filesystem (-x), or could it be optional with a note saying that one filesystem is optimal on OSX?

How should "watch root" be handled to get similar behaviour between platforms? Bearing in mind that if watch root is enabled, and the watched path is moved, then to transparently act like kqueue the fseventstream would need to be recreated pointing at the new path. I'm not entirely sure about the "cost" of this, or if there are any icky corner cases, which is why I wonder whether it would be better to expose it to be handled by the application (the user of fsnotify).

@nathany
Contributor

nathany commented Sep 9, 2013

Lovely leaky abstractions. @howeyc will be able to answer better than I, but I do think it's fine to have some OS-specific caveats (one file system on OS X, file descriptor limits with kqueue).

If we do fall back to kqueue on OS X without cgo, I think apps will need to be very clear about which adapter is being used. Otherwise the differences could be very confusing.

@samjacobson

Good point regarding fallback.

@howeyc
Owner

howeyc commented Sep 9, 2013

I'd really like to keep the behavior consistent across platforms, keeping in mind that BSD watches based on file descriptors and movement of the folder being watched does not change the file descriptor. There will likely be some testing required to ensure consistency of what happens when the watched folder is moved and the behavior is the same for all OSs and the various methods of kqueue/FSEvents on OSX. If this can't be the same, we would certainly have to document it so users know.

I am intrigued by the FSEvents API that throttles events and "compresses" based on the same path. If this is per watch, perhaps this is the level we replicate the throttling for other OSs?

As for FsnFlags, those are so the user can only see certain types of events they are interested in and other events are discarded internally before being passed on.

@samjacobson

@howeyc FYI From the FSEvents help:

latency
The number of seconds the service should wait after hearing about an event from the kernel before passing it along to the client via its callback. Specifying a larger value may result in more effective temporal coalescing, resulting in fewer callbacks and greater overall efficiency.

kFSEventStreamCreateFlagNoDefer
Affects the meaning of the latency parameter. If you specify this flag and more than latency seconds have elapsed since the last event, your app will receive the event immediately. The delivery of the event resets the latency timer and any further events will be delivered after latency seconds have elapsed. This flag is useful for apps that are interactive and want to react immediately to changes but avoid getting swamped by notifications when changes are occurring in rapid succession. If you do not specify this flag, then when an event occurs after a period of no events, the latency timer is started. Any events that occur during the next latency seconds will be delivered as one group (including that first event). The delivery of the group of events resets the latency timer and any further events will be delivered after latency seconds. This is the default behavior and is more appropriate for background, daemon or batch processing apps.

To answer your question: it's per FSEventStream. Based on my current implementation; this would be per watch.

@nathany
Contributor

nathany commented Sep 11, 2013

@samjacobson In other words, with kFSEventStreamCreateFlagNoDefer = YES, the event is forwarded on the leading edge, whereas the default is to forward the event on the trailing edge of latency seconds. (#62)

PrettyAutoTest forwards an event on the leading edge and then swallows events for a second.

It may be worth implementing both cases and adding another Option.

@howeyc As much consistency as possible, at least. :-)

We may need to use the new WatchPath API as an excuse to change past behaviours to get the consistency we want across all OSes. If so, it may be a little cumbersome to keep Watch and WatchFlags behaving as is, but at least that can be cleaned up eventually with a major release.

@samjacobson
Copy link

@nathany yes, that's the behaviour of kFSEventStreamCreateFlagNoDefer. The advantage of using the trailing edge is that close events can be coalesced. If the event is delivered on the leading edge then latency is best, but double events are almost guaranteed.

@nathany
Contributor

nathany commented Sep 13, 2013

I'm not sure why there would be double events in the leading edge case? It still waits latency seconds before delivering an event. Do FSEvents still deliver all the events during that period? It sounded like it didn't.

For the tools I'm building, I would prefer the leading edge. But perhaps others will find the trailing edge useful, requiring that we implement both.

I sent a pull request for the bare bones pipeline code. Maybe we should start a branch to collaborate on before merging to master?

@robfig
Contributor Author

robfig commented Sep 13, 2013

(It certainly seems like being able to specify platform-specific flags to a watch call would be useful and not inappropriate.)

@samjacobson

@nathany my understanding is that in the leading edge case: evt1 happens and is delivered; the latency timer is started; evt2 happens and is not delivered (yet); if evt3 happens it may be coalesced with evt2; latency timer expires and evt2 is delivered.

In the trailing edge case: evt1 happens and is not delivered (yet); latency timer is started; evt2 happens and is coalesced; if evt3 happens it'd be coalesced too; latency timer expires and coalesced events are delivered.

FSEventStream can't really drop evt2 on the floor in the leading edge case because that would mean that change1 happens, and is handled; change2 happens (within latency seconds), and is never handled.

I'm curious whether people's required use cases are tolerant to a small delay (e.g. 1 second) or not? By the sounds of it both will need to be supported.

@nathany
Contributor

nathany commented Sep 13, 2013

@samjacobson The leading edge case in PrettyAutoTest is slightly different. If FSEvents is the only one with direct support, then I suppose we emulate that behaviour on other systems for consistency.

@nathany
Contributor

nathany commented Sep 21, 2013

@samjacobson Today I'm working on #65 to have a way to enable different pipeline steps. Rather than have OS-specific code call in to "enable" something like filtering out recursive directories, I'm thinking of calling the OS-specific code from the addWatch/newPipeline code.

Eg. hey FSEventStream code, the user requested throttling with latency x on the trailing edge, can you take care of that for me? If no, then it enables the pipeline step to manually handle throttling.

Recursive and non-recursive will both have to be a thing.

Eg. hey FSEventStream code, the user requested a non-recursive watch, what to do? Your code says it doesn't handle it, so filterRecursive kicks in. Every other implementation says it can handle it.

Does that sound like it will work?

@nathany
Contributor

nathany commented Sep 22, 2013

@samjacobson I gave you the commit bit to my fork of fsnotify and created an fsevents branch over there. It pulls in your code, merges in all the event pipeline stuff I've done so far, and has a few fixes to get it building again. Hopefully that will make it easier for us to collaborate. I'm not familiar with FSEventStream yet, but let me know what I can do to help (email, twitter, whatever).

Tip: I'm doing all my coding under a github.com/howeyc/fsnotify directory so that I can test it from other libraries that import it (see contributing).

@nathany
Contributor

nathany commented Oct 28, 2013

Just added a Wiki page to list any reference materials that would be helpful. Please help fill it in. Thanks!


7 participants