Skip to content

feat: rework PathsMut to be more consistent #705

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

riberk
Copy link
Contributor

@riberk riberk commented Jul 18, 2025

The feature to add / remove paths to watch in batch was introduced in #692 but there was a problem with the implementation: if you drop the instance without commit it may leave the watcher in an inconsistent state (especially fsevents).

This commit changes behavior and api and introduces update_watches method of the Watcher trait, which do the same stuff but in more consistent way. The paths_mut method was kept, but returning struct is just a builder for the method update_watches that will be called by PathsMut::commit

related #653
closes #704

The feature to add / remove paths to watch in batch was introduced in notify-rs#692
There was a problem with the implementation: if you drop the instance without `commit` it may leave the watcher in an inconsistent state (especially fsevents).

This commit changes behavior and api and introduces `update_watches` method of the `Watcher` trait, which do the same stuff but in more consistent way.
The `paths_mut` method was kept, but returning struct is just a builder for the method `update_watches` that will be called by `PathsMut::commit`

related notify-rs#653
closes notify-rs#704
@riberk riberk force-pushed the paths-mut-rework branch from 95476f1 to aaaac45 Compare July 18, 2025 12:22
@dfaust
Copy link
Member

dfaust commented Jul 18, 2025

if you drop the instance without commit it may leave the watcher in an inconsistent state (especially fsevents).

So if I'm getting this right: When the FsEventPathsMut is used to add/remove paths and then dropped without calling commit, it leaves the watcher in an inconsistent state because the paths have been changed, but the watcher is still watching the old paths. It looks like the paths are not used again (eg. to build a full path when an event is emitted). So this should not cause any errors. But it is confusing and should therefore be fixed.

The paths_mut method was kept, but returning struct is just a builder for the method update_watches that will be called by PathsMut::commit

I would like to keep the API lean. Therefore I would deprecate paths_mut.

@branchseer, @JohnTitor Do you have an opinion on this?

@riberk
Copy link
Contributor Author

riberk commented Jul 18, 2025

it leaves the watcher in an inconsistent state because the paths have been changed

The main problem is not having paths changed but is having watcher stopped if we don't call commit

struct FsEventPathsMut<'a>(&'a mut FsEventWatcher);
impl<'a> FsEventPathsMut<'a> {
    fn new(watcher: &'a mut FsEventWatcher) -> Self {
        watcher.stop(); // <----- If we doesn't call commit the watcher is stopped and the only way to start it is call watch or unwatch
        // watch and unwatch runs the watcher whether encounters with error or not

        Self(watcher)
    }
}
impl PathsMut for FsEventPathsMut<'_> {
    fn add(&mut self, path: &Path, recursive_mode: RecursiveMode) -> Result<()> {
        self.0.append_path(path, recursive_mode)
    }

    fn remove(&mut self, path: &Path) -> Result<()> {
        self.0.remove_path(path)
    }

    fn commit(self: Box<Self>) -> Result<()> {
        // ignore return error: may be empty path list
        let _ = self.0.run();
        Ok(())
    }
}

@riberk
Copy link
Contributor Author

riberk commented Jul 18, 2025

I would like to keep the API lean

I agree. User can do it by themself if they want.

Ok, I'll delete it

@dfaust
Copy link
Member

dfaust commented Jul 18, 2025

Ok, I'll delete it

By deprecating it, I meant marking it with the deprecated attribute:

#[deprecated(since = "9.0.0", note = "paths_mut has been replaced with update_watches")]

Btw. since we are using the term path in other places, I would rename the function to update_paths.

@dfaust
Copy link
Member

dfaust commented Jul 18, 2025

Also, if the PathsMut compatibility API is the same as the current one, then we can release it as a feature update (v. 8.3.0).

I would wait a day and give the others a chance to add their opinions. Then I can cut a release tomorrow.

@riberk
Copy link
Contributor Author

riberk commented Jul 18, 2025

Also, if the PathsMut compatibility API is the same as the current one, then we can release it as a feature update (v. 8.3.0).

It's possible, i'll try to do that

`WatchPathConfig` provides a way to configure single watch.
Now it's just a wrapper around a `RecursiveMode` but it'll help us
to add some functionality, like filter watched paths or specify a mask
for events without breaking changes
@riberk
Copy link
Contributor Author

riberk commented Jul 18, 2025

Ok, in addition to the renaming and saving API compatibility I changed PathOp argument to us own struct - WatchPathConfig. It'll help us to keep compatibility when new watch-related features are added. It's like a Config but for a single watch

@@ -345,40 +372,53 @@ pub trait Watcher {
/// fails.
fn unwatch(&mut self, path: &Path) -> Result<()>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In rolldown we add paths to watch from a FxDashSet<ArcStr>: https://github.com/rolldown/rolldown/blob/d227c81088b30e2357af602881716f3e3a189b3e/crates/rolldown/src/watch/watcher_task.rs#L122-L147

Could the API be designed in a way that doesn't require allocating a Vec with the same size of paths? (There are might be tens of thousands of paths)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now implementations are taking Path by ref and canonicalize them, sometimes with to_path_buf like there. Eventually, any &Path will be turned into PathBuf and will be stored into HashMap.

IMO a design with allocated Vec is more predictable, because we don't need to have a watcher in inconsistent state between some calls, only inside an impl of update_paths

I have measured some things on mac M1 with fsevents

  • A single watch (with stop / append_path / start) is 4ms
  • A single unwatch is the same
  • 10 000 allocations on mac M1 is 0.3ms.
  • 10 000 pushes with no additional allocations is 0.1ms .
  • 10 000 with stop / all append_path / run - like paths_mut does - 300ms
  • 10 000 with update_watches with all allocations 280ms
#[test]
fn test() {
    use std::time::Instant;
    let dir = tempfile::tempdir().unwrap();
    let paths: Vec<PathBuf> = (0..10000).map(|v| dir.path().join(v.to_string())).collect();

    paths.iter().for_each(|v| std::fs::create_dir(v).unwrap());

    let mut watcher = FsEventWatcher::new(|_| {}, Default::default()).unwrap();
    let instant = Instant::now();
    for i in 0..1000 {
        watcher.watch(&paths[i], RecursiveMode::Recursive).unwrap();
    }
    println!("watches: {:?}", instant.elapsed()); // 4392 ms

    let instant = Instant::now();
    for i in 0..1000 {
        watcher.unwatch(&paths[i]).unwrap();
    }
    println!("unwatch: {:?}", instant.elapsed()); // 4388 ms

    let instant = Instant::now();
    let mut vec = Vec::new();
    for _ in 0..10_000 {
        vec.push(PathBuf::from("/path/to/smth"));
    }
    println!("allocs: {:?}", instant.elapsed()); // 0.324 ms

    let mut vec_alloc = Vec::new();
    let instant = Instant::now();
    for _ in 0..10_000 {
        vec_alloc.push(Path::new("/path/to/smth"));
    }
    println!("resize: {:?}", instant.elapsed()); // 0.112 ms

    let instant = Instant::now();
    watcher.stop();
    for i in 0..10_000 {
        watcher
            .append_path(&paths[i], RecursiveMode::Recursive)
            .unwrap();
    }
    watcher.run().unwrap();
    println!("by paths: {:?}", instant.elapsed()); // 300 ms

    let instant = Instant::now();
    watcher.stop();
    for i in 0..10_000 {
        watcher.remove_path(&paths[i]).unwrap();
    }
    watcher.run().unwrap_err();
    println!("remove paths: {:?}", instant.elapsed()); // 3353 ms

    let instant = Instant::now();
    watcher
        .update_paths(
            paths
                .iter()
                .map(|v| v.to_path_buf())
                .map(|v| crate::PathOp::Watch(v, crate::WatchPathConfig::new(RecursiveMode::Recursive)))
                .collect(),
        )
        .unwrap();
    println!("update_watches watch: {:?}", instant.elapsed()); // 290ms


    let instant = Instant::now();
    watcher
        .update_paths(
            paths
                .iter()
                .map(|v| v.to_path_buf())
                .map(|v| crate::PathOp::Unwatch(v))
                .collect(),
        )
        .unwrap();
    println!("update_watches unwatch: {:?}", instant.elapsed()); // 3339ms
}

Allocations there doesn't matter, because the majority time is spent to internal fsevents stuff and the allocated memory just will free on drop

Comment on lines -286 to -290
fn commit(self: Box<Self>) -> Result<()> {
// ignore return error: may be empty path list
let _ = self.0.run();
Ok(())
}
Copy link
Contributor

@branchseer branchseer Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added commit to give future implementations a chance to return errors at the time of commit. It was an intentional design choice with its own downside. Now on second thought, inconsistent states are indeed more harmful.

How about just replacing commit with drop?

impl Drop for FsEventPathsMut<'_>  {
  fn drop(&mut self) {
        // ignore return error: may be empty path list
        let _ = self.0.run();
  }
}

Would this change solve the inconsistent state?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would, but it seems a bit confusing, isn't it?

I tried to rework it to be more clear and less verbose.

Compare

For known number of paths

let mut paths = watcher.paths_mut();
paths.add(path1, recursive)?;
paths.add(path2, recursive)?;
paths.add(path3, recursive)?;
patch.commit()?;

and

watcher.update_paths(vec![
    PathOp::watch_recursive(path1),
    PathOp::watch_recursive(path2),
    PathOp::watch_recursive(path3),
])?;

For unknown number of paths

let mut paths = watcher.paths_mut();
for path in my_paths {
    paths.add(path, recursive)?;
}
paths.commit()?;

and

watcher.update_paths(my_paths.iter().map(PathOp::watch_recursive).collect());

Copy link
Contributor

@branchseer branchseer Jul 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that commit is move to drop, there would be no commit call:

let mut paths = watcher.paths_mut();
for path in my_paths {
    paths.add(path, recursive)?;
}

It seems to me the commit-less version of paths_mut is now just a lower-level version of update_paths. It's now a problem of verbosity rather than inconsistency.

I appreciate your effort to make the API more friendly. But personally, I'd always prefer to expose a lower-level API than a friendly API in a rust library. You can always create a update_paths function externally based on paths_mut if you feel paths_mut is too verbose.

I know the performance is not really relevant here, but as a Rust developer, it just feels weird to see an API requiring a Vec, whereas paths_mut is a commonly known "builder" pattern (example: bindgen::Builder).

If you still strongly feel update_paths is essential in notify, please at least consider keeping paths_mut as well for people like me who prefer a builder-style API.

cc @dfaust

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Builder api is a good choice there and it can be made without lib-support, you can just build a Vec with some UpdatePathsBuilder you made. I think example proves it - they take any Into<String> into builder method header (for example) and make Box<str> that is added into the Vec<Box<str>>. Watcher just can't take generics because of dyn compability but the effect is the same as in your example - build some in-memory stuff to call the main method (build, update_paths, etc).

To be honest, PathsMut just pretends to be a builder, but it is not, because it doesn't build anything new, it mutates the watcher

You can always create a update_paths function externally based on paths_mut

Yes, but I just try to explain my opinion, that an API that allows user to have notify objects in a state that can't be come in any other ways is not a good choice. The verbosity reduction is just a side effect.

If we remove commit this state is still there - after paths_mut call. There are some implicit things that may be really confusing, for me it would be "Could I call long-time running code between add calls?". As a user I wouldn't expect, that I don't get any events after paths_mut call. This argument may work with my API choice either, but at least update_paths doesn't allow user to do anything while service is being stopped and forces user to wait the call finishes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@branchseer I think about it in a descriptive manner: how I explain that thing to a user.

PathsMut

  • Call paths_mut method and store the object you got. The method may fail in cases, that depend on the watcher you use (but the cases are not the same as in watch / unwatch calls)
  • Call add and / or remove methods on the given object. They may fail some cases, that similar to the watch unwatch but not the same
  • Drop the given object to apply the changes. It may be noop or do some things like running the macos fsevents service

update_paths

  • Make the changes you want to apply as a Vec of PathOp (PathOp::Watch or PathOp::Unwatch)
  • Call the update_paths method with the Vec. The method will fail in the exactly the same cases as watch and unwatch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something like update_watch_paths(paths: impl Iterator<Item = (impl AsRef<str>, WatchOptions)>)?

It would be the best way if it kept the Watcher dyn safe, but having generics in a trait restrict the trait from using in cases like &dyn Watcher

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rolldown doesn't currenly remove paths but yeah dyn safely is a major constraint. How about accepting a callback: update_watch_paths(&mut self, update_fn: Box<dyn FnOnce(&dyn PathsMut) -> Result<()>>) -> Result<()>? It will be used like:

watcher.update_watch_paths(Box::new(|mut paths_mut| {
   paths_mut.add(...)?;
   paths_mut.remove(...)?;
})?;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an interesting approach - it does avoid the main issue of inconsistent watcher state externally.

That said, the resulting API feels a bit unusual.

And the allocation is still present - a Box is required either way.

Do you think avoiding allocations here justifies making the API substantially less clear? I was under the impression that, compared to filesystem calls, memory allocation was considered negligible.

Copy link
Contributor

@branchseer branchseer Jul 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the callback style looks very natural to me. Here's a similar API from std: std::thread::scope.

Besides the builder parameter, I like every method returning the same type of plain error that can be easily handled using ?. To me UpdatePathsError is a high cognitive overhead.

This is just expressing my personal state, not to undermine your work :) Really appreciate you putting time and thoughts into this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a similar API from std: std::thread::scope

Yes, but thread::scope takes a generic parameter and you don't have to box it

Besides the builder parameter, I like every method returning the same type of plain error that can be easily handled using ?.

Good point. I've added impl From<UpdatePathsError> for Error

To me UpdatePathsError is a high cognitive overhead

Yeh, it is, but we do need a way to handle paths, that weren’t successfully added. In some cases if you don't care about errors you can call update_paths in a loop

let mut paths_to_add = ...;
while !paths_to_add.is_empty() {
    if let Err(e) = watcher.update_paths(std::mem::take(&mut paths_to_add)) {
        paths_to_add = e.remaining
    }
}

with Box<dyn FnMut(PathsMut) -> Result<(), Error>> we have to do something like that:

let mut paths_to_add = ...;
loop {
    let _ = watcher.update_paths(Box::new(|mut paths_mut| {
        let mut iter = std::mem::take(paths_to_add).drain();
        while let Some(operation) = iter.next() {
            if paths_mut.add(operation).is_err() {
                paths_to_add.extend(iter);
                return Ok(())
            }
        }
        Ok(())
    }));
}

This is just expressing my personal state, not to undermine your work :) Really appreciate you putting time and thoughts into this.

No worries at all - I didn’t take it that way.

That said, I feel like in our efforts to improve the PathsMut API, we might be losing some of the broader context. Specifically, beyond just adjusting how we handle paths, we also need to account for other inputs — like supporting not just RecursiveMode, but also more general configuration via WatchConfig.

Changing just PathsMut would be a breaking change, so we’ll need to evolve the API structure anyway. As I mentioned earlier, I see a few possible directions to address this.

Another option would be:

  • Adding a method to the PathsMut trait, e.g. watch(&Path, WatchConfig). Since it's not really user-implementable, we can technically do this. But this would still suffer from the same core issue - inconsistency: either methods that never fail but return Result, or strange pre-commit state. In this case, update_paths becomes unnecessary, but to be honest, I'm strongly against this direction - it feels like an awkward and messy API.
  • Keeping PathsMut as an internal "mutation builder", and using something like Box<dyn FnMut(&mut PathsMut) -> Result<(), Error>>. Personally, I find this approach less clear overall. It also wouldn’t solve the inconsistency between calling paths_mut() and commit() means we’d still need to deprecate Watcher::paths_mut eventually.
  • Deprecating paths_mut and PathsMut entirely, update_paths takes a Vec<PathOp>. This is what the current PR does. I like this approach best - which is why I implemented it.

@riberk
Copy link
Contributor Author

riberk commented Jul 18, 2025

@branchseer my main idea wasn't to remake it just because I want to, but to make the code more ready to #632 . I want this feature, because it allows inotify-watcher to skip big directories that a user doesn't care about and it will cause drastically increasing performance in some cases (inotify spent ALL the time to scan directories, not to syscalls).

@riberk riberk force-pushed the paths-mut-rework branch from 41e9bf1 to cc1bc73 Compare July 21, 2025 13:24
@riberk
Copy link
Contributor Author

riberk commented Jul 27, 2025

@dfaust

Hi! Sorry to bother you — would you mind taking another look at the PR when you get a chance? I'd really appreciate your thoughts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PathsMut breaking changes discussion
3 participants