-
Notifications
You must be signed in to change notification settings - Fork 245
feat: rework PathsMut to be more consistent #705
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
The feature to add / remove paths to watch in batch was introduced in notify-rs#692 There was a problem with the implementation: if you drop the instance without `commit` it may leave the watcher in an inconsistent state (especially fsevents). This commit changes behavior and api and introduces `update_watches` method of the `Watcher` trait, which do the same stuff but in more consistent way. The `paths_mut` method was kept, but returning struct is just a builder for the method `update_watches` that will be called by `PathsMut::commit` related notify-rs#653 closes notify-rs#704
So if I'm getting this right: When the
I would like to keep the API lean. Therefore I would deprecate @branchseer, @JohnTitor Do you have an opinion on this? |
The main problem is not having paths changed but is having watcher stopped if we don't call struct FsEventPathsMut<'a>(&'a mut FsEventWatcher);
impl<'a> FsEventPathsMut<'a> {
fn new(watcher: &'a mut FsEventWatcher) -> Self {
watcher.stop(); // <----- If we doesn't call commit the watcher is stopped and the only way to start it is call watch or unwatch
// watch and unwatch runs the watcher whether encounters with error or not
Self(watcher)
}
}
impl PathsMut for FsEventPathsMut<'_> {
fn add(&mut self, path: &Path, recursive_mode: RecursiveMode) -> Result<()> {
self.0.append_path(path, recursive_mode)
}
fn remove(&mut self, path: &Path) -> Result<()> {
self.0.remove_path(path)
}
fn commit(self: Box<Self>) -> Result<()> {
// ignore return error: may be empty path list
let _ = self.0.run();
Ok(())
}
} |
I agree. User can do it by themself if they want. Ok, I'll delete it |
By deprecating it, I meant marking it with the #[deprecated(since = "9.0.0", note = "paths_mut has been replaced with update_watches")] Btw. since we are using the term |
Also, if the PathsMut compatibility API is the same as the current one, then we can release it as a feature update (v. 8.3.0). I would wait a day and give the others a chance to add their opinions. Then I can cut a release tomorrow. |
It's possible, i'll try to do that |
`WatchPathConfig` provides a way to configure single watch. Now it's just a wrapper around a `RecursiveMode` but it'll help us to add some functionality, like filter watched paths or specify a mask for events without breaking changes
Ok, in addition to the renaming and saving API compatibility I changed |
@@ -345,40 +372,53 @@ pub trait Watcher { | |||
/// fails. | |||
fn unwatch(&mut self, path: &Path) -> Result<()>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In rolldown we add paths to watch from a FxDashSet<ArcStr>
: https://github.com/rolldown/rolldown/blob/d227c81088b30e2357af602881716f3e3a189b3e/crates/rolldown/src/watch/watcher_task.rs#L122-L147
Could the API be designed in a way that doesn't require allocating a Vec
with the same size of paths? (There are might be tens of thousands of paths)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now implementations are taking Path
by ref and canonicalize
them, sometimes with to_path_buf
like there. Eventually, any &Path
will be turned into PathBuf and will be stored into HashMap
.
IMO a design with allocated Vec
is more predictable, because we don't need to have a watcher in inconsistent state between some calls, only inside an impl of update_paths
I have measured some things on mac M1 with fsevents
- A single watch (with stop / append_path / start) is 4ms
- A single unwatch is the same
- 10 000 allocations on mac M1 is 0.3ms.
- 10 000 pushes with no additional allocations is 0.1ms .
- 10 000 with
stop
/ allappend_path
/run
- likepaths_mut
does - 300ms - 10 000 with
update_watches
with all allocations 280ms
#[test]
fn test() {
use std::time::Instant;
let dir = tempfile::tempdir().unwrap();
let paths: Vec<PathBuf> = (0..10000).map(|v| dir.path().join(v.to_string())).collect();
paths.iter().for_each(|v| std::fs::create_dir(v).unwrap());
let mut watcher = FsEventWatcher::new(|_| {}, Default::default()).unwrap();
let instant = Instant::now();
for i in 0..1000 {
watcher.watch(&paths[i], RecursiveMode::Recursive).unwrap();
}
println!("watches: {:?}", instant.elapsed()); // 4392 ms
let instant = Instant::now();
for i in 0..1000 {
watcher.unwatch(&paths[i]).unwrap();
}
println!("unwatch: {:?}", instant.elapsed()); // 4388 ms
let instant = Instant::now();
let mut vec = Vec::new();
for _ in 0..10_000 {
vec.push(PathBuf::from("/path/to/smth"));
}
println!("allocs: {:?}", instant.elapsed()); // 0.324 ms
let mut vec_alloc = Vec::new();
let instant = Instant::now();
for _ in 0..10_000 {
vec_alloc.push(Path::new("/path/to/smth"));
}
println!("resize: {:?}", instant.elapsed()); // 0.112 ms
let instant = Instant::now();
watcher.stop();
for i in 0..10_000 {
watcher
.append_path(&paths[i], RecursiveMode::Recursive)
.unwrap();
}
watcher.run().unwrap();
println!("by paths: {:?}", instant.elapsed()); // 300 ms
let instant = Instant::now();
watcher.stop();
for i in 0..10_000 {
watcher.remove_path(&paths[i]).unwrap();
}
watcher.run().unwrap_err();
println!("remove paths: {:?}", instant.elapsed()); // 3353 ms
let instant = Instant::now();
watcher
.update_paths(
paths
.iter()
.map(|v| v.to_path_buf())
.map(|v| crate::PathOp::Watch(v, crate::WatchPathConfig::new(RecursiveMode::Recursive)))
.collect(),
)
.unwrap();
println!("update_watches watch: {:?}", instant.elapsed()); // 290ms
let instant = Instant::now();
watcher
.update_paths(
paths
.iter()
.map(|v| v.to_path_buf())
.map(|v| crate::PathOp::Unwatch(v))
.collect(),
)
.unwrap();
println!("update_watches unwatch: {:?}", instant.elapsed()); // 3339ms
}
Allocations there doesn't matter, because the majority time is spent to internal fsevents
stuff and the allocated memory just will free on drop
fn commit(self: Box<Self>) -> Result<()> { | ||
// ignore return error: may be empty path list | ||
let _ = self.0.run(); | ||
Ok(()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added commit
to give future implementations a chance to return errors at the time of commit. It was an intentional design choice with its own downside. Now on second thought, inconsistent states are indeed more harmful.
How about just replacing commit
with drop
?
impl Drop for FsEventPathsMut<'_> {
fn drop(&mut self) {
// ignore return error: may be empty path list
let _ = self.0.run();
}
}
Would this change solve the inconsistent state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would, but it seems a bit confusing, isn't it?
I tried to rework it to be more clear and less verbose.
Compare
For known number of paths
let mut paths = watcher.paths_mut();
paths.add(path1, recursive)?;
paths.add(path2, recursive)?;
paths.add(path3, recursive)?;
patch.commit()?;
and
watcher.update_paths(vec![
PathOp::watch_recursive(path1),
PathOp::watch_recursive(path2),
PathOp::watch_recursive(path3),
])?;
For unknown number of paths
let mut paths = watcher.paths_mut();
for path in my_paths {
paths.add(path, recursive)?;
}
paths.commit()?;
and
watcher.update_paths(my_paths.iter().map(PathOp::watch_recursive).collect());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that commit
is move to drop
, there would be no commit
call:
let mut paths = watcher.paths_mut();
for path in my_paths {
paths.add(path, recursive)?;
}
It seems to me the commit
-less version of paths_mut
is now just a lower-level version of update_paths
. It's now a problem of verbosity rather than inconsistency.
I appreciate your effort to make the API more friendly. But personally, I'd always prefer to expose a lower-level API than a friendly API in a rust library. You can always create a update_paths
function externally based on paths_mut
if you feel paths_mut
is too verbose.
I know the performance is not really relevant here, but as a Rust developer, it just feels weird to see an API requiring a Vec
, whereas paths_mut
is a commonly known "builder" pattern (example: bindgen::Builder
).
If you still strongly feel update_paths
is essential in notify
, please at least consider keeping paths_mut
as well for people like me who prefer a builder-style API.
cc @dfaust
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Builder api is a good choice there and it can be made without lib-support, you can just build a Vec
with some UpdatePathsBuilder
you made. I think example proves it - they take any Into<String>
into builder method header
(for example) and make Box<str>
that is added into the Vec<Box<str>>
. Watcher
just can't take generics because of dyn compability but the effect is the same as in your example - build some in-memory stuff to call the main method (build
, update_paths
, etc).
To be honest, PathsMut
just pretends to be a builder, but it is not, because it doesn't build anything new, it mutates the watcher
You can always create a update_paths function externally based on paths_mut
Yes, but I just try to explain my opinion, that an API that allows user to have notify objects in a state that can't be come in any other ways is not a good choice. The verbosity reduction is just a side effect.
If we remove commit
this state is still there - after paths_mut
call. There are some implicit things that may be really confusing, for me it would be "Could I call long-time running code between add
calls?". As a user I wouldn't expect, that I don't get any events after paths_mut
call. This argument may work with my API choice either, but at least update_paths
doesn't allow user to do anything while service is being stopped and forces user to wait the call finishes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@branchseer I think about it in a descriptive manner: how I explain that thing to a user.
PathsMut
- Call
paths_mut
method and store the object you got. The method may fail in cases, that depend on the watcher you use (but the cases are not the same as inwatch
/unwatch
calls) - Call
add
and / orremove
methods on the given object. They may fail some cases, that similar to thewatch
unwatch
but not the same - Drop the given object to apply the changes. It may be noop or do some things like running the macos fsevents service
update_paths
- Make the changes you want to apply as a
Vec
ofPathOp
(PathOp::Watch
orPathOp::Unwatch
) - Call the
update_paths
method with theVec
. The method will fail in the exactly the same cases aswatch
andunwatch
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about something like
update_watch_paths(paths: impl Iterator<Item = (impl AsRef<str>, WatchOptions)>)
?
It would be the best way if it kept the Watcher
dyn safe, but having generics in a trait restrict the trait from using in cases like &dyn Watcher
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rolldown doesn't currenly remove paths but yeah dyn safely is a major constraint. How about accepting a callback: update_watch_paths(&mut self, update_fn: Box<dyn FnOnce(&dyn PathsMut) -> Result<()>>) -> Result<()>
? It will be used like:
watcher.update_watch_paths(Box::new(|mut paths_mut| {
paths_mut.add(...)?;
paths_mut.remove(...)?;
})?;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's an interesting approach - it does avoid the main issue of inconsistent watcher state externally.
That said, the resulting API feels a bit unusual.
And the allocation is still present - a Box
is required either way.
Do you think avoiding allocations here justifies making the API substantially less clear? I was under the impression that, compared to filesystem calls, memory allocation was considered negligible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well the callback style looks very natural to me. Here's a similar API from std: std::thread::scope
.
Besides the builder parameter, I like every method returning the same type of plain error that can be easily handled using ?
. To me UpdatePathsError
is a high cognitive overhead.
This is just expressing my personal state, not to undermine your work :) Really appreciate you putting time and thoughts into this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a similar API from std: std::thread::scope
Yes, but thread::scope
takes a generic parameter and you don't have to box it
Besides the builder parameter, I like every method returning the same type of plain error that can be easily handled using
?
.
Good point. I've added impl From<UpdatePathsError> for Error
To me
UpdatePathsError
is a high cognitive overhead
Yeh, it is, but we do need a way to handle paths, that weren’t successfully added. In some cases if you don't care about errors you can call update_paths
in a loop
let mut paths_to_add = ...;
while !paths_to_add.is_empty() {
if let Err(e) = watcher.update_paths(std::mem::take(&mut paths_to_add)) {
paths_to_add = e.remaining
}
}
with Box<dyn FnMut(PathsMut) -> Result<(), Error>>
we have to do something like that:
let mut paths_to_add = ...;
loop {
let _ = watcher.update_paths(Box::new(|mut paths_mut| {
let mut iter = std::mem::take(paths_to_add).drain();
while let Some(operation) = iter.next() {
if paths_mut.add(operation).is_err() {
paths_to_add.extend(iter);
return Ok(())
}
}
Ok(())
}));
}
This is just expressing my personal state, not to undermine your work :) Really appreciate you putting time and thoughts into this.
No worries at all - I didn’t take it that way.
That said, I feel like in our efforts to improve the PathsMut
API, we might be losing some of the broader context. Specifically, beyond just adjusting how we handle paths, we also need to account for other inputs — like supporting not just RecursiveMode
, but also more general configuration via WatchConfig
.
Changing just PathsMut
would be a breaking change, so we’ll need to evolve the API structure anyway. As I mentioned earlier, I see a few possible directions to address this.
Another option would be:
- Adding a method to the
PathsMut
trait, e.g.watch(&Path, WatchConfig)
. Since it's not really user-implementable, we can technically do this. But this would still suffer from the same core issue - inconsistency: either methods that never fail but returnResult
, or strange pre-commit state. In this case,update_paths
becomes unnecessary, but to be honest, I'm strongly against this direction - it feels like an awkward and messy API. - Keeping PathsMut as an internal "mutation builder", and using something like
Box<dyn FnMut(&mut PathsMut) -> Result<(), Error>>
. Personally, I find this approach less clear overall. It also wouldn’t solve the inconsistency between callingpaths_mut()
andcommit()
means we’d still need to deprecateWatcher::paths_mut
eventually. - Deprecating
paths_mut
andPathsMut
entirely,update_paths
takes aVec<PathOp>
. This is what the current PR does. I like this approach best - which is why I implemented it.
@branchseer my main idea wasn't to remake it just because I want to, but to make the code more ready to #632 . I want this feature, because it allows |
Hi! Sorry to bother you — would you mind taking another look at the PR when you get a chance? I'd really appreciate your thoughts. |
The feature to add / remove paths to watch in batch was introduced in #692 but there was a problem with the implementation: if you drop the instance without
commit
it may leave the watcher in an inconsistent state (especially fsevents).This commit changes behavior and api and introduces
update_watches
method of theWatcher
trait, which do the same stuff but in more consistent way. Thepaths_mut
method was kept, but returning struct is just a builder for the methodupdate_watches
that will be called byPathsMut::commit
related #653
closes #704