-
-
Notifications
You must be signed in to change notification settings - Fork 2.3k
_check_toc_parents
should consider only the descendants of root_doc and n…
#13038
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from 1 commit
f371bbb
cc5f3f8
26b216e
85b092c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -771,7 +771,7 @@ def check_consistency(self) -> None: | |
# Call _check_toc_parents here rather than in _get_toctree_ancestors() | ||
# because that method is called multiple times per document and would | ||
# lead to duplicate warnings. | ||
_check_toc_parents(self.toctree_includes) | ||
_check_toc_parents(self.toctree_includes, self.config.root_doc) | ||
|
||
# call check-consistency for all extensions | ||
self.domains._check_consistency() | ||
|
@@ -819,19 +819,28 @@ def _traverse_toctree( | |
traversed.add(sub_docname) | ||
|
||
|
||
def _check_toc_parents(toctree_includes: dict[str, list[str]]) -> None: | ||
def _check_toc_parents(toctree_includes: dict[str, list[str]], root_doc: str) -> None: | ||
"""Checks if document is referenced in multiple toctrees. | ||
Based on the current implementation of `global_toctree_for_doc`, | ||
it considers only the descendants of root_doc and not the whole graph. | ||
""" | ||
toc_parents: dict[str, list[str]] = {} | ||
for parent, children in toctree_includes.items(): | ||
for child in children: | ||
toc_parents.setdefault(child, []).append(parent) | ||
|
||
def _find_toc_parents_dfs(node: str) -> None: | ||
for child in toctree_includes.get(node, []): | ||
toc_parents.setdefault(child, []).append(node) | ||
is_child_already_visited = len(toc_parents[child]) > 1 | ||
if not is_child_already_visited: | ||
_find_toc_parents_dfs(child) | ||
|
||
_find_toc_parents_dfs(root_doc) | ||
for doc, parents in sorted(toc_parents.items()): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A personal opinion: the code might be easier to understand if the iteration source of the subsequent Explaining why: to me, function calls that have side-effects that affect outer-scoped variables are slightly hard to follow. I think that another potential benefit could be that it'd be easier to write test coverage for the helper function (although I admit that it's a small one, and that perhaps the enclosing function is a better candidate for testing here). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I didn't worry too much about side-effects as it being more simplistic this way. Here is the DFS without side-effect: def _find_toc_parents_dfs(node: str, toc_parents: dict[str, list[str]] = {}) -> dict[str, list[str]]:
for child in toctree_includes.get(node, []):
already_visited = child in toc_parents
toc_parents.setdefault(child, []).append(node)
if already_visited:
continue
_find_toc_parents_dfs(child, toc_parents)
return toc_parents Personally I found it slightly more complicated than needed because of Edit: There exists another DFS implementation, without taking |
||
if len(parents) > 1: | ||
logger.info( | ||
__( | ||
'document is referenced in multiple toctrees: %s, selecting: %s <- %s' | ||
), | ||
parents, | ||
sorted(parents), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Was the sorting added here for debugging/investigation purposes? (and should we include it with these changes?) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The helper function uses preorder traversal which does not guarantee sorted parents as was before. Sorting is kept for consistency reasons (independent of the helper function traversal order) in the logged output, this way it is also easier to write the corresponding tests instead of dry running the traversal order and depending on the helper functions implementation. Further, it also makes it easier for the user to spot the pattern that the lexicographically greatest parent is being selected. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My initial sense here is that I'm not too keen on the practice of modifying application code in order to make test expectations easier to write. I do understand that it helps in this case, but I think that unit test coverage of different tree/graph structures would be more robust over time. (apologies for taking a while to add further review commentary) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I do not understand how the application code is modified since the function
I don’t have a strong opinion on writing unittests for helper functions, in my opinion we should be testing based on functionality and not the implementation of a function which in this current case would mean that the tests should only care about consistent warning/logging and not about whatever method of traversal is used internally to achieve so. For example the helper method which used node-wise traversal previously, and now inorder traversal in this PR, should ideally NOT break the existing tests and hence having a determined order of parents regardless of the traversal algorithm helps achieve it. I’d like to know more about what you think of this. If you still believe that we shouldn’t guarantee sorted order of parents anymore, I’d happily remove it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Logging is sometime more than a side-effect, and it can either express or hide internal application state. Sometimes that's risky, and sometimes it's useful. In the context of trying to improve the consistency of build output, I think it's likely to be useful to log the iteration order of the parent document names without sorting applied to them. As I understand it, commit 8351936 does sort the keys of the toctree includes -- but it doesn't sort the values (the parent document list). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for the feedback, I'll remove sorting from here then.
Note that the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Ok, this makes sense, thank you. I didn't understand this until I began attempting to write some unit tests locally. What I'd started on was a test to provide three or four permutations of the same graph -- sometimes with child value lists randomized, sometimes with parent keys randomized -- with the intent of asserting on consistent inverted-graph structure as output. (in particular, refreshing my memory about the lexicographic sorting took me some time -- and now I understand from that plus your message here, that the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
While it is true that the order of nodes visited during the reversal of the graph edges Was your idea about testing whether the Edit: The dfs algorithm presented here relies on children-values ordering of |
||
max(parents), | ||
doc, | ||
location=doc, | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
test-toctree-multiple-parents | ||
============================= | ||
|
||
.. literalinclude:: relation_graph.txt | ||
|
||
.. toctree:: | ||
delta |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
index | ||
/ \ | ||
index pdfindex | ||
/ \ / | ||
alpha delta | ||
\ / | ||
bravo / | ||
|
Uh oh!
There was an error while loading. Please reload this page.