Description
Is your feature request related to a problem? Please describe.
During the writing step of the build process, even while using -j auto, there are some time-consuming operations that are performed serially. It might be possible to parallelize them and further improve performance.
Describe the solution you'd like
Last night I went down a rabbit hole and tried to optimize _write_parallel:
sphinx/sphinx/builders/__init__.py, lines 571 to 607 in 8c4865c:
def _write_parallel(self, docnames: Sequence[str], nproc: int) -> None:
    def write_process(docs: List[Tuple[str, nodes.document]]) -> None:
        self.app.phase = BuildPhase.WRITING
        for docname, doctree in docs:
            self.write_doc(docname, doctree)

    # warm up caches/compile templates using the first document
    firstname, docnames = docnames[0], docnames[1:]
    self.app.phase = BuildPhase.RESOLVING
    doctree = self.env.get_and_resolve_doctree(firstname, self)
    self.app.phase = BuildPhase.WRITING
    self.write_doc_serialized(firstname, doctree)
    self.write_doc(firstname, doctree)

    tasks = ParallelTasks(nproc)
    chunks = make_chunks(docnames, nproc)

    # create a status_iterator to step progressbar after writing a document
    # (see: ``on_chunk_done()`` function)
    progress = status_iterator(chunks, __('writing output... '), "darkgreen",
                               len(chunks), self.app.verbosity)

    def on_chunk_done(args: List[Tuple[str, NoneType]], result: NoneType) -> None:
        next(progress)

    self.app.phase = BuildPhase.RESOLVING
    for chunk in chunks:
        arg = []
        for docname in chunk:
            doctree = self.env.get_and_resolve_doctree(docname, self)
            self.write_doc_serialized(docname, doctree)
            arg.append((docname, doctree))
        tasks.add_task(write_process, arg, on_chunk_done)

    # make sure all threads have finished
    tasks.join()
    logger.info('')
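For context, ParallelTasks and make_chunks come from sphinx.util.parallel, and the detail the rest of this issue hinges on is: the task function passed to add_task runs in a forked worker process, while the result callback runs back in the parent with the task's (pickled) return value. A minimal toy example of that API as I understand it (my own sketch, not Sphinx source; Unix-only since workers are forked):

from sphinx.util.parallel import ParallelTasks, make_chunks

def work(chunk):
    # runs in a worker process; the return value is sent back to the parent
    return [len(docname) for docname in chunk]

def collect(chunk, result):
    # runs in the parent process once the worker handling `chunk` is done
    print(chunk, '->', result)

if __name__ == '__main__':
    docnames = ['index', 'usage', 'api', 'changelog']
    tasks = ParallelTasks(2)
    for chunk in make_chunks(docnames, 2):
        tasks.add_task(work, chunk, collect)
    tasks.join()  # wait for all workers and run any remaining callbacks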
I tried to naively apply the following changes:
diff --git a/sphinx/builders/__init__.py b/sphinx/builders/__init__.py
index 2aede5c24..d27df5fdd 100644
--- a/sphinx/builders/__init__.py
+++ b/sphinx/builders/__init__.py
@@ -569,9 +569,11 @@ class Builder:
             self.write_doc(docname, doctree)
 
     def _write_parallel(self, docnames: Sequence[str], nproc: int) -> None:
-        def write_process(docs: List[Tuple[str, nodes.document]]) -> None:
-            self.app.phase = BuildPhase.WRITING
-            for docname, doctree in docs:
+        def write_process(docs: List[str]) -> None:
+            for docname in docs:
+                doctree = self.env.get_and_resolve_doctree(docname, self)
+                self.app.phase = BuildPhase.WRITING
+                self.write_doc_serialized(docname, doctree)
                 self.write_doc(docname, doctree)
 
         # warm up caches/compile templates using the first document
@@ -595,12 +597,7 @@ class Builder:
         self.app.phase = BuildPhase.RESOLVING
 
         for chunk in chunks:
-            arg = []
-            for docname in chunk:
-                doctree = self.env.get_and_resolve_doctree(docname, self)
-                self.write_doc_serialized(docname, doctree)
-                arg.append((docname, doctree))
-            tasks.add_task(write_process, arg, on_chunk_done)
+            tasks.add_task(write_process, chunk, on_chunk_done)
 
         # make sure all threads have finished
         tasks.join()
This resulted in a build time reduction of ~25%.
Without patch, using Sphinx v4.5.0 to build the python/cpython docs:
real 2m13,386s
user 4m35,046s
sys 0m7,205s
With patch:
real 1m36,804s
user 4m42,314s
sys 0m5,909s
Despite the performance improvements, this solution is wrong for (at least) a few reasons:
- while it mostly works, some things (e.g. images) break, presumably because the data that write_doc_serialized records on the builder (images to copy, search-index entries) is now collected inside the worker processes and never makes it back to the parent;
- write_doc_serialized was explicitly created for executing code that can't be parallelized, so it shouldn't be called in a process executed in parallel.

However, write_doc_serialized was added ~10 years ago in 5cd0841 to "fix parallel build globals problems". At the time the parallelization also relied on threading, whereas now it seems entirely based on multiprocessing. In builders/__init__.py it is defined as an empty method to be overridden, and the html builder overrides it with:
sphinx/sphinx/builders/html/__init__.py, lines 673 to 678 in 8c4865c:

def write_doc_serialized(self, docname: str, doctree: nodes.document) -> None:
    self.imgpath = relative_uri(self.get_target_uri(docname), self.imagedir)
    self.post_process_images(doctree)
    title_node = self.env.longtitles.get(docname)
    title = self.render_partial(title_node)['title'] if title_node else ''
    self.index_page(docname, doctree, title)
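Looking at this override, all three steps touch builder-level state (self.imgpath, the images collected by post_process_images(), the search index fed by index_page()). My understanding, and this is an assumption on my part, is that because the workers are forked, any such mutation made in a worker stays in the child's copy of memory and never reaches the parent, which would explain both why these calls need to stay serial and why images broke with my patch. A toy demonstration of that effect (standard library only, not Sphinx code):

import multiprocessing

state = {'images': {}}   # stands in for shared builder state

def worker(docname):
    # analogous to post_process_images()/index_page() updating the builder
    state['images'][docname] = docname + '.png'

if __name__ == '__main__':
    ctx = multiprocessing.get_context('fork')   # Unix-only, like Sphinx's parallel mode
    procs = [ctx.Process(target=worker, args=(name,)) for name in ('a', 'b')]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    print(state['images'])   # {} -- the parent never sees the children's writes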
In addition, I noticed that _read_parallel seems to do some post-processing/merging in
sphinx/sphinx/builders/__init__.py, lines 467 to 475 in 8c4865c:
def merge(docs: List[str], otherenv: bytes) -> None:
    env = pickle.loads(otherenv)
    self.env.merge_info_from(docs, env, self.app)

    next(progress)

tasks = ParallelTasks(nproc)
for chunk in chunks:
    tasks.add_task(read_process, chunk, merge)
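That merge step is the part I find interesting: the expensive reading happens in the workers, which hand back a pickled partial result, and only the (comparatively cheap) merge runs serially in the parent. A stripped-down re-creation of that pattern with the standard library, to make sure I understood it correctly (my own toy, not Sphinx code):

import pickle
from multiprocessing import get_context

def read_chunk(docnames):
    # stands in for read_process(): build a partial "environment" and pickle it
    partial_env = {docname: {'title': docname.title()} for docname in docnames}
    return pickle.dumps(partial_env, pickle.HIGHEST_PROTOCOL)

if __name__ == '__main__':
    env = {}
    chunks = [['intro', 'usage'], ['api', 'changelog']]
    with get_context('fork').Pool(2) as pool:          # Unix-only
        for payload in pool.imap_unordered(read_chunk, chunks):
            env.update(pickle.loads(payload))          # the merge() counterpart
    print(sorted(env))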
At this point, my lack of knowledge of Sphinx internals prevented me from digging deeper, but I was left wondering:
1. whether the fix added in 5cd0841 is still relevant after moving away from threading, or whether it can be revisited/reverted (at least partially);
2. whether the issue fixed by the above commit can be addressed with some post-processing similar to the one used in _read_parallel (a rough sketch of what I mean follows this list);
3. whether any of the operations currently executed serially by write_doc_serialized can be parallelized.
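To make question 2 concrete, here is the kind of restructuring I have in mind, as a hypothetical sketch only: the workers would do the per-document work and return the data that write_doc_serialized currently records on the builder, and the parent would merge it through the result callback, exactly as _read_parallel does for the environment. Every name below (render_one, merged_images, merged_index) is invented for illustration; this is not a working patch.

from sphinx.util.parallel import ParallelTasks, make_chunks

merged_images = {}   # would live on the builder (images to copy)
merged_index = {}    # would feed the search indexer

def render_one(docname):
    # pretend this resolves and writes one document, returning the side data
    # that write_doc_serialized() currently produces by mutating the builder
    return {'images': {docname + '.png': docname + '.png'},
            'index': {docname: [docname, 'example']}}

def write_chunk(chunk):
    # worker: per-document work only, no shared builder state touched
    return [render_one(docname) for docname in chunk]

def merge_chunk(chunk, results):
    # parent: the serial, order-independent merge step
    for partial in results:
        merged_images.update(partial['images'])
        merged_index.update(partial['index'])

if __name__ == '__main__':
    docnames = ['intro', 'usage', 'api']
    tasks = ParallelTasks(2)
    for chunk in make_chunks(docnames, 2):
        tasks.add_task(write_chunk, chunk, merge_chunk)
    tasks.join()
    print(sorted(merged_images), sorted(merged_index))

Whether the real index_page()/post_process_images() data can be merged after the fact like this is exactly what I'm unsure about.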
Perhaps someone more familiar with the Sphinx internals can take a look and confirm whether there is room for improvement or not?
Addressing this (especially point 2 above) might also help fix the following issue (cc @tk0miya, since you worked on this code):