Skip to content

lint: Add cpp_docstring_lint #13461

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

EricCousineau-TRI
Copy link
Contributor

@EricCousineau-TRI EricCousineau-TRI commented Jun 1, 2020

I would like to not care anything about docstring whitespace styles, and I remember there being a chat about using a consistent format.

Towards #13491

Requires:


This change is Reviewable

@EricCousineau-TRI EricCousineau-TRI force-pushed the feature-cc-transform-doxygen-comments branch 5 times, most recently from 3db0f8a to 99430dc Compare June 3, 2020 07:13
@EricCousineau-TRI EricCousineau-TRI marked this pull request as ready for review June 3, 2020 07:13
@EricCousineau-TRI EricCousineau-TRI changed the title doc: Transform all docstring comments FTW lint: Add cpp_docstring_lint Jun 3, 2020
Copy link
Contributor Author

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+@sherm1 for feature review, please!

Reviewable status: LGTM missing from assignee sherm1(platform), needs at least two assigned reviewers, labeled "do not merge", commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @sherm1)

Copy link
Contributor Author

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er, rather - review on the unittest functionality
+@ggould-tri could I ask for your feature on the code itself over the next couple of days?

Context is, I would like to not care anything about docstring styles, and I remember there being a chat about using a consistent format.

Only briefly searched to see if there was a tool to do this.

Reviewable status: LGTM missing from assignees sherm1(platform),ggould-tri(platform), commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @ggould-tri and @sherm1)

Copy link
Contributor Author

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested testing:

$ git fetch upstream pull/13461/head && git checkout FETCH_HEAD
$ bazel build //tools/lint:cpp_docstring_lint

$ bazel-bin/tools/lint/cpp_docstring_lint --all --lint
# See a slew of errors.

$ bazel-bin/tools/lint/cpp_docstring_lint --all
# Fix said errors.

$ bazel-bin/tools/lint/cpp_docstring_lint --all --lint
# See no more errors.

$ git diff --ignore-all-space --word-diff
# See the diffs on the errors

Reviewable status: LGTM missing from assignees ggould-tri(platform),sherm1(platform), commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @ggould-tri and @sherm1)

@jwnimmer-tri
Copy link
Collaborator

jwnimmer-tri commented Jun 3, 2020

Style change should begin their review with a PR to the styleguide with all of platform team approving the new policy, not an implementation. It is difficult to review a linter implementation without knowing what specification it is targeting.

I'd believe in a proposal to auto-detect (and fix?) mistakes related to mkdoc (mixing // between doxygen and declaration). I'm not sure what the rest here is about.

@jwnimmer-tri
Copy link
Collaborator

I remember there being a chat about using a consistent format.

My styleguide WIP for comment formatting is here:
https://github.com/jwnimmer-tri/styleguide/commits/comment-formatting

Copy link
Contributor

@ggould-tri ggould-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can look over this code a bit today and mostly tomorrow (it is very large and I am heavily loaded today). However I agree strongly with Jeremy that this requires platform group buy-in and a clear statement of what the standard is (perhaps in the matching revision to the styleguide ??). This is particularly acute as this new linter seems to lack any form of line-level nolint/noqa mechansim, making unusually prescriptive.

Consider suspending this PR and doing its styleguide PR first. That will also let your reference that material in your comments, which will make them clearer.

Reviewed 7 of 10 files at r1.
Reviewable status: 1 unresolved discussion, LGTM missing from assignees ggould-tri(platform),sherm1(platform), commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @EricCousineau-TRI, @ggould-tri, and @sherm1)


tools/lint/cpp_docstring_lint.py, line 642 at r1 (raw file):

def lint_multiline_token(lint_errors, token, new_lines, verbose):
    """Detects if there are any changes between `token` and the re-processed
    lines from `new_lines."""

minor: unclosed backtick

@EricCousineau-TRI
Copy link
Contributor Author

Sounds good to me! Will try to get to styleguide PR in ~2 weeks.
Thanks!

Copy link
Contributor Author

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 unresolved discussions, LGTM missing from assignees ggould-tri(platform),sherm1(platform), commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @EricCousineau-TRI, @ggould-tri, and @sherm1)

a discussion (no related file):
Requires styleguide PR + discussion + review with platform.


Copy link
Contributor Author

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dunno how to get the red dots off of y'all in the Reviewable dashboard, so for now:
-@ggould-tri -@sherm1

Reviewable status: 2 unresolved discussions, needs platform reviewer assigned, needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @EricCousineau-TRI)

Copy link
Contributor

@ggould-tri ggould-tri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That keeps us red-dotted, so I'm tossing in some notes and marking reviewed to clear up my dots that way.

You can ignore the comments below if other rewrite issues come up of course.

Reviewed 3 of 10 files at r1.
Reviewable status: 12 unresolved discussions, needs platform reviewer assigned, needs at least two assigned reviewers, commits need curation (https://drake.mit.edu/reviewable.html#curated-commits) (waiting on @EricCousineau-TRI)

a discussion (no related file):
Meta: There is an assumption throughout that any file docstring-linteed is already adherent to GSG (for instance, assuming that space is the only valid whitespace, or that C++ code whitespace rules exactly match python data whitespace rules). Which is fine (probably even necessary), but:

  • You need to make that precondition explicit
  • You need to make sure that cpp_docstring_lint respects the same exemption syntax (NOLINTNEXTLINE) that our GSG compliance check does.

a discussion (no related file):
Meta: Your use of the word "token" is unclear. It's certainly not the C++ lex.token production, nor does it appear to match any of doxygen's lexer productions https://github.com/doxygen/doxygen/blob/master/src/scanner.l .

If you're defining a grammar you're going to need to state it clearly.


a discussion (no related file):
Metameta: Another way of viewing that previous meta is: You're writing a lexer, by hand, with if statements.

I can hear the voices of a dozen programming language theorist former coworkers and professors screaming in my head. Possibly Saman is breaking into my house right now to loom behind me with a threat of retroactively failing me on my compilers midterm.

I think about 80% of your grammar rules are regular, so a generic regex-based lexer will solve them and be easier to reason about. A correct implementation of your * rule may not be regular since I think it implies a chunk of the C++ lexer by reference; I haven't thought that through. But that rule might not be sound anyway.



tools/lint/cpp_docstring_lint.py, line 40 at r1 (raw file):

        elif text.startswith("/*"):
            start_token = "/*"
        elif text.startswith("* ") or text.strip() == "*":

This will capture operator-first products (foo \n * bar \n * baz) outside of comments.
(per GSG "wrapping all operators at the beginning of the line is also allowed")


tools/lint/cpp_docstring_lint.py, line 416 at r1 (raw file):

    """Reformats a multiline token for applying / checking lint."""
    if isinstance(token, DocstringMultilineToken):
        # Multi-pass for idempotent.

This comment is not clear.


tools/lint/cpp_docstring_lint.py, line 421 at r1 (raw file):

        first_line = token.lines[0]
        prev_lines = None
        for i in range(3):

3 is magic number here. Consider a while instead? Or for that matter "2" should work if you're really idempotent.


tools/lint/cpp_docstring_lint.py, line 664 at r1 (raw file):

def is_ignored_file(relpath):
    # TODO(eric.cousineau): Figure out better heuristic for this.

This heuristic needs to match or, better, reuse the GSG linter heuristic (possibly that is just "let bazel decide"?).


tools/lint/test/cpp_docstring_lint_test.py, line 16 at r1 (raw file):

class TestCppDocstringLint(unittest.TestCase):
    def test_abstract_regex(self):
        """Tests basic pattern matching. Used in reorder_multiline_tokens"""

minor: Complete sentence needs a period.


tools/lint/test/cpp_docstring_lint_test.py, line 207 at r1 (raw file):

        """.rstrip())
        actual_errors = "\n".join(str(x) for x in lint_errors).rstrip()
        self.assertEqual(expected_errors, actual_errors)

You are missing tests for code containing comment characters in noncomment contexts, eg multiplication, division, dereference, etc.


tools/lint/test/cpp_docstring_lint_test.py, line 207 at r1 (raw file):

        """.rstrip())
        actual_errors = "\n".join(str(x) for x in lint_errors).rstrip()
        self.assertEqual(expected_errors, actual_errors)

You are missing edge case tests for unicode (including unicode whitespace, which is general whitespace to python but not to C++), empty files, and unclosed comments and groups.

@jamiesnape
Copy link
Contributor

@drake-jenkins-bot linux-focal-gcc-bazel-experimental-release please

@EricCousineau-TRI
Copy link
Contributor Author

Punting another 2 weeks down the road. Will wait til Q2 wind-down + Q3 wind-up settles.

@EricCousineau-TRI EricCousineau-TRI force-pushed the feature-cc-transform-doxygen-comments branch from 7867b1a to 5f7c040 Compare July 18, 2020 15:54
@EricCousineau-TRI
Copy link
Contributor Author

@jwnimmer-tri
Copy link
Collaborator

This PR doesn't seem like it needs CI, even though there is a little recent discussion newer than 6 months here. Can we close this for now?

@EricCousineau-TRI
Copy link
Contributor Author

Sorry, and yup!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants