Skip to content
This repository was archived by the owner on Jul 17, 2024. It is now read-only.

fix: Use daemon threads for SolverManager #69

Conversation

Christopher-Chianelli
Copy link
Collaborator

Before, if a SolverManager creates a Solver, it creates
a non-daemon thread, which can prevent the Python process from
EVER exiting unless forced.

Now, the SolverManager spawns only daemon threads, which allows the
Python process to exit. This also allows us to remove some test
configuration code that was used to force the JVM to exit.

Additionally, SolverManager can take SolverConfig directly and may
take a SolutionManagerConfig too.

Tests were flaky before, since the problem fact change may affect
the input problem if the change was applied before solving was
started.

Before, if a SolverManager creates a Solver, it creates
a non-daemon thread, which can prevent the Python process from
EVER exiting unless forced.

Now, the SolverManager spawns only daemon threads, which allows the
Python process to exit. This also allows us to remove some test
configuration code that was used to force the JVM to exit.

Additionaly, SolverManager can take SolverConfig directly and may
take a SolutionManagerConfig too.
- Tests were flaky before, since the problem fact change may affect
  the input problem if the change was applied before solving was
  started.
Copy link
Contributor

@triceo triceo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure about the daemon threads.

Why should the process terminate if the solver jobs haven't? The jobs are important payload. To me, this doesn't sound like the least surprising behavior.

More importantly - what does the Java solver do? Does it also create deamon threads? If not, then for consistency, neither should the Python solver.

@zepfred
Copy link
Contributor

zepfred commented Jun 13, 2024

More importantly - what does the Java solver do? Does it also create deamon threads? If not, then for consistency, neither should the Python solver.

I understand the Python SolverManager::create now configures the thread factory class with DaemonThreadFactory. The new thread factory only creates daemon threads, and this would apply to the Java side.

        java_solver_manager_config = solver_manager_config._to_java_solver_manager_config()  # noqa
        java_solver_manager_config.setThreadFactoryClass(DaemonThreadFactory.class_)

@triceo
Copy link
Contributor

triceo commented Jun 13, 2024

Yes, but I'm asking something else. Forget Python Solver. What's the default behavior of the Java solver, without any fancy configuration? My argument is that the Python solver's default behavior should be exactly the same.

(In other words: if the Java solver by default doesn't do daemon threads, neither should the Python solver.)

@Christopher-Chianelli
Copy link
Collaborator Author

I'm not entirely sure about the daemon threads.

Why should the process terminate if the solver jobs haven't? The jobs are important payload. To me, this doesn't sound like the least surprising behavior.

More importantly - what does the Java solver do? Does it also create deamon threads? If not, then for consistency, neither should the Python solver.

If you control-C the Java solver and don't configure a signal handler, Java terminates without daemon threads.
If you control-C the Python solver, even AFTER solving has finished, and deamon threads are not used, it hangs forever, since Python is waiting for Java to shutdown, but Python holds a reference to SolverManager (and thus its threadpool, which keep threads alive until it is garbage collected, never shutdown).

@triceo
Copy link
Contributor

triceo commented Jun 13, 2024

If you control-C the Python solver, even AFTER solving has finished, and deamon threads are not used, it hangs forever, since Python is waiting for Java to shutdown, but Python holds a reference to SolverManager (and thus its threadpool, which keep threads alive until it is garbage collected, never shutdown).

Which is a problem that should be solved by properly shutting down the executor when no longer needed, no?

@Christopher-Chianelli
Copy link
Collaborator Author

If you control-C the Python solver, even AFTER solving has finished, and deamon threads are not used, it hangs forever, since Python is waiting for Java to shutdown, but Python holds a reference to SolverManager (and thus its threadpool, which keep threads alive until it is garbage collected, never shutdown).

Which is a problem that should be solved by properly shutting down the executor when no longer needed, no?

Nope; that was done in tests, and tests also hang and needed specially handling code

@triceo
Copy link
Contributor

triceo commented Jun 13, 2024

In my opinion, the key here is to figure out why does Java terminate in one case, and not in the other. Solving that through daemon threads is IMOM a hack, and not a solution - there will be a root cause there somewhere.

@Christopher-Chianelli
Copy link
Collaborator Author

In my opinion, the key here is to figure out why does Java terminate in one case, and not in the other. Solving that through daemon threads is IMOM a hack, and not a solution - there will be a root cause there somewhere.

Because Python hold a reference and a ThreadPoolExecutor keeps it threads alive once spawn, creating a catch-22:

  • In order for Python to terminate, it needs JVM to shutdown, after which it free globals/locals.
  • In order for the JVM to shutdown, it needs to have no more references to the Executor, which requires Python to terminate.

@triceo
Copy link
Contributor

triceo commented Jun 13, 2024

And Python cannot shut down the executor and drop the reference? In Java, you'd set the field to null and problem solved.

@Christopher-Chianelli
Copy link
Collaborator Author

And Python cannot shut down the executor and drop the reference? In Java, you'd set the field to null and problem solved.

Nope; consider user experience; "Why isn't my program terminating? Why is my test hanging?" That is terrible UX.
This is the code users WANT to write

solver_manger = SolverManager.create(solver_config)

@app.get("/root/solve")
async def solve(timetable: Timetable) :
    solver_manager.solve('ID', timetable, ...)

And users do not want that code to hang on Control-C

@triceo
Copy link
Contributor

triceo commented Jun 13, 2024

I fully agree with that. It just seems logical that a framework that does this (see @app.get) also has some kind of a destructor method that we can hook up into to properly free resources.

Daemon threads are a solution to the problem which we may need to accept, but the real solution would be to somehow free the resources we've taken and allow a clean shutdown on its own.

@Christopher-Chianelli
Copy link
Collaborator Author

I fully agree with that. It just seems logical that a framework that does this (see @app.get) also has some kind of a destructor method that we can hook up into to properly free resources.

Daemon threads are a solution to the problem which we may need to accept, but the real solution would be to somehow free the resources we've taken and allow a clean shutdown on its own.

That the responsibility of JPype, which adds its own shutdown hook to Python, which hangs forever w/o daemon threads.

Copy link

Copy link
Contributor

@zepfred zepfred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@Christopher-Chianelli Christopher-Chianelli merged commit 57ac260 into TimefoldAI:main Jun 13, 2024
5 checks passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants