Skip to content

Commit 6833076

Browse files
authored
[analyzer][NFC] Introduce framework for checker families (llvm#139256)
The checker classes (i.e. classes derived from `CheckerBase` via the utility template `Checker<...>`) act as intermediates between the user and the analyzer engine, so they have two interfaces: - On the frontend side, they have a public name, can be enabled or disabled, can accept checker options and can be reported as the source of bug reports. - On the backend side, they can handle various checker callbacks and they "leave a mark" on the `ExplodedNode`s that are created by them. (These `ProgramPointTag` marks are internal: they appear in debug logs and can be queried by checker logic; but the user doesn't see them.) In a significant majority of the checkers there is 1:1 correspondence between these sides, but there are also many checker classes where several related user-facing checkers share the same backend class. Historically each of these "multi-part checker" classes had its own hacks to juggle its multiple names, which led to lots of ugliness like lazy initialization of `mutable std::unique_ptr<BugType>` members and redundant data members (when a checker used its custom `CheckNames` array and ignored the inherited single `Name`). My recent commit 2709998 tried to unify and standardize these existing solutions to get rid of some of the technical debt, but it still used enum values to identify the checker parts within a "multi-part" checker class, which led to some ugliness. This commit introduces a new framework which takes a more direct, object-oriented approach: instead of identifying checker parts with `{parent checker object, index of part}` pairs, the parts of a multi-part checker become stand-alone objects that store their own name (and enabled/disabled status) as a data member. This is implemented by separating the functionality of `CheckerBase` into two new classes: `CheckerFrontend` and `CheckerBackend`. The name `CheckerBase` is kept (as a class derived from both `CheckerFrontend` and `CheckerBackend`), so "simple" checkers that use `CheckerBase` and `Checker<...>` continues to work without changes. However we also get first-class support for the "many frontends - one backend" situation: - The class `CheckerFamily<...>` works exactly like `Checker<...>` but inherits from `CheckerBackend` instead of `CheckerBase`, so it won't have a superfluous single `Name` member. - Classes deriving from `CheckerFamily` can freely own multiple `CheckerFrontend` data members, which are enabled within the registration methods corresponding to their name and can be used to initialize the `BugType`s that they can emit. In this scheme each `CheckerFamily` needs to override the pure virtual method `ProgramPointTag::getTagDescription()` which returns a string which represents that class for debugging purposes. (Previously this used the name of one arbitrary sub-checker, which was passable for debugging purposes, but not too elegant.) I'm planning to implement follow-up commits that convert all the "multi-part" checkers to this `CheckerFamily` framework.
1 parent 3033f20 commit 6833076

File tree

13 files changed

+183
-198
lines changed

13 files changed

+183
-198
lines changed

clang/include/clang/Analysis/ProgramPoint.h

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -40,8 +40,8 @@ class ProgramPointTag {
4040
ProgramPointTag(void *tagKind = nullptr) : TagKind(tagKind) {}
4141
virtual ~ProgramPointTag();
4242

43-
/// The description of this program point which will be displayed when the
44-
/// ExplodedGraph is dumped in DOT format for debugging.
43+
/// The description of this program point which will be dumped for debugging
44+
/// purposes. Do not use in user-facing output!
4545
virtual StringRef getTagDescription() const = 0;
4646

4747
/// Used to implement 'isKind' in subclasses.

clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -643,9 +643,10 @@ class BugReporter {
643643
/// reports.
644644
virtual void emitReport(std::unique_ptr<BugReport> R);
645645

646-
void EmitBasicReport(const Decl *DeclWithIssue, const CheckerBase *Checker,
647-
StringRef BugName, StringRef BugCategory,
648-
StringRef BugStr, PathDiagnosticLocation Loc,
646+
void EmitBasicReport(const Decl *DeclWithIssue,
647+
const CheckerFrontend *Checker, StringRef BugName,
648+
StringRef BugCategory, StringRef BugStr,
649+
PathDiagnosticLocation Loc,
649650
ArrayRef<SourceRange> Ranges = {},
650651
ArrayRef<FixItHint> Fixits = {});
651652

clang/include/clang/StaticAnalyzer/Core/BugReporter/BugType.h

Lines changed: 22 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -27,11 +27,7 @@ class BugReporter;
2727

2828
class BugType {
2929
private:
30-
struct CheckerPartRef {
31-
const CheckerBase *Checker;
32-
CheckerPartIdx Idx;
33-
};
34-
using CheckerNameInfo = std::variant<CheckerNameRef, CheckerPartRef>;
30+
using CheckerNameInfo = std::variant<CheckerNameRef, const CheckerFrontend *>;
3531

3632
const CheckerNameInfo CheckerName;
3733
const std::string Description;
@@ -43,7 +39,7 @@ class BugType {
4339
public:
4440
// Straightforward constructor where the checker name is specified directly.
4541
// TODO: As far as I know all applications of this constructor involve ugly
46-
// hacks that could be avoided by switching to a different constructor.
42+
// hacks that could be avoided by switching to the other constructor.
4743
// When those are all eliminated, this constructor should be removed to
4844
// eliminate the `variant` and simplify this class.
4945
BugType(CheckerNameRef CheckerName, StringRef Desc,
@@ -52,18 +48,11 @@ class BugType {
5248
SuppressOnSink(SuppressOnSink) {}
5349
// Constructor that can be called from the constructor of a checker object.
5450
// At that point the checker name is not yet available, but we can save a
55-
// pointer to the checker and later use that to query the name.
56-
BugType(const CheckerBase *Checker, StringRef Desc,
51+
// pointer to the checker and use that to query the name.
52+
BugType(const CheckerFrontend *Checker, StringRef Desc,
5753
StringRef Cat = categories::LogicError, bool SuppressOnSink = false)
58-
: CheckerName(CheckerPartRef{Checker, DefaultPart}), Description(Desc),
59-
Category(Cat), SuppressOnSink(SuppressOnSink) {}
60-
// Constructor that can be called from the constructor of a checker object
61-
// when it has multiple parts with separate names. We save the name and the
62-
// part index to be able to query the name of that part later.
63-
BugType(const CheckerBase *Checker, CheckerPartIdx Idx, StringRef Desc,
64-
StringRef Cat = categories::LogicError, bool SuppressOnSink = false)
65-
: CheckerName(CheckerPartRef{Checker, Idx}), Description(Desc),
66-
Category(Cat), SuppressOnSink(SuppressOnSink) {}
54+
: CheckerName(Checker), Description(Desc), Category(Cat),
55+
SuppressOnSink(SuppressOnSink) {}
6756
virtual ~BugType() = default;
6857

6958
StringRef getDescription() const { return Description; }
@@ -72,8 +61,7 @@ class BugType {
7261
if (const auto *CNR = std::get_if<CheckerNameRef>(&CheckerName))
7362
return *CNR;
7463

75-
auto [Checker, Idx] = std::get<CheckerPartRef>(CheckerName);
76-
return Checker->getName(Idx);
64+
return std::get<const CheckerFrontend *>(CheckerName)->getName();
7765
}
7866

7967
/// isSuppressOnSink - Returns true if bug reports associated with this bug
@@ -82,6 +70,21 @@ class BugType {
8270
bool isSuppressOnSink() const { return SuppressOnSink; }
8371
};
8472

73+
/// Trivial convenience class for the common case when a certain checker
74+
/// frontend always uses the same bug type. This way instead of writing
75+
/// ```
76+
/// CheckerFrontend LongCheckerFrontendName;
77+
/// BugType LongCheckerFrontendNameBug{LongCheckerFrontendName, "..."};
78+
/// ```
79+
/// we can use `CheckerFrontendWithBugType LongCheckerFrontendName{"..."}`.
80+
class CheckerFrontendWithBugType : public CheckerFrontend, public BugType {
81+
public:
82+
CheckerFrontendWithBugType(StringRef Desc,
83+
StringRef Cat = categories::LogicError,
84+
bool SuppressOnSink = false)
85+
: BugType(this, Desc, Cat, SuppressOnSink) {}
86+
};
87+
8588
} // namespace ento
8689

8790
} // end clang namespace

clang/include/clang/StaticAnalyzer/Core/Checker.h

Lines changed: 78 additions & 57 deletions
Original file line numberDiff line numberDiff line change
@@ -484,83 +484,90 @@ class Call {
484484

485485
} // end eval namespace
486486

487-
class CheckerBase : public ProgramPointTag {
488-
/// A single checker class (i.e. a subclass of `CheckerBase`) can implement
489-
/// multiple user-facing checkers that have separate names and can be enabled
490-
/// separately, but are backed by the same singleton checker object.
491-
SmallVector<std::optional<CheckerNameRef>, 1> RegisteredNames;
492-
493-
friend class ::clang::ento::CheckerManager;
487+
/// A `CheckerFrontend` instance is what the user recognizes as "one checker":
488+
/// it has a public canonical name (injected from the `CheckerManager`), can be
489+
/// enabled or disabled, can have associated checker options and can be printed
490+
/// as the "source" of bug reports.
491+
/// The singleton instance of a simple `Checker<...>` is-a `CheckerFrontend`
492+
/// (for historical reasons, to preserve old straightforward code), while the
493+
/// singleton instance of a `CheckerFamily<...>` class owns multiple
494+
/// `CheckerFrontend` instances as data members.
495+
/// Modeling checkers that are hidden from the user but can be enabled or
496+
/// disabled separately (as dependencies of other checkers) are also considered
497+
/// to be `CheckerFrontend`s.
498+
class CheckerFrontend {
499+
/// The `Name` is nullopt if and only if the checker is disabled.
500+
std::optional<CheckerNameRef> Name;
494501

495502
public:
496-
CheckerNameRef getName(CheckerPartIdx Idx = DefaultPart) const {
497-
assert(Idx < RegisteredNames.size() && "Checker part index is too large!");
498-
std::optional<CheckerNameRef> Name = RegisteredNames[Idx];
499-
assert(Name && "Requested checker part is not registered!");
500-
return *Name;
501-
}
502-
503-
bool isPartEnabled(CheckerPartIdx Idx) const {
504-
return Idx < RegisteredNames.size() && RegisteredNames[Idx].has_value();
505-
}
506-
507-
void registerCheckerPart(CheckerPartIdx Idx, CheckerNameRef Name) {
508-
// Paranoia: notice if e.g. UINT_MAX is passed as a checker part index.
509-
assert(Idx < 256 && "Checker part identifiers should be small integers.");
510-
511-
if (Idx >= RegisteredNames.size())
512-
RegisteredNames.resize(Idx + 1, std::nullopt);
513-
514-
assert(!RegisteredNames[Idx] && "Repeated registration of checker a part!");
515-
RegisteredNames[Idx] = Name;
516-
}
517-
518-
StringRef getTagDescription() const override {
519-
// When the ExplodedGraph is dumped for debugging (in DOT format), this
520-
// method is called to attach a description to nodes created by this
521-
// checker _class_. Ideally this should be recognizable identifier of the
522-
// whole class, but for this debugging purpose it's sufficient to use the
523-
// name of the first registered checker part.
524-
for (const auto &OptName : RegisteredNames)
525-
if (OptName)
526-
return *OptName;
527-
528-
return "Unregistered checker";
503+
void enable(CheckerManager &Mgr) {
504+
assert(!Name && "Checker part registered twice!");
505+
Name = Mgr.getCurrentCheckerName();
529506
}
507+
bool isEnabled() const { return Name.has_value(); }
508+
CheckerNameRef getName() const { return *Name; }
509+
};
530510

511+
/// `CheckerBackend` is an abstract base class that serves as the common
512+
/// ancestor of all the `Checker<...>` and `CheckerFamily<...>` classes which
513+
/// can create `ExplodedNode`s (by acting as a `ProgramPointTag`) and can be
514+
/// registered to handle various checker callbacks. (Moreover the debug
515+
/// callback `printState` is also introduced here.)
516+
class CheckerBackend : public ProgramPointTag {
517+
public:
531518
/// Debug state dump callback, see CheckerManager::runCheckersForPrintState.
532519
/// Default implementation does nothing.
533520
virtual void printState(raw_ostream &Out, ProgramStateRef State,
534521
const char *NL, const char *Sep) const;
535522
};
536523

537-
/// Dump checker name to stream.
538-
raw_ostream& operator<<(raw_ostream &Out, const CheckerBase &Checker);
539-
540-
/// Tag that can use a checker name as a message provider
541-
/// (see SimpleProgramPointTag).
542-
class CheckerProgramPointTag : public SimpleProgramPointTag {
524+
/// The non-templated common ancestor of all the simple `Checker<...>` classes.
525+
class CheckerBase : public CheckerFrontend, public CheckerBackend {
543526
public:
544-
CheckerProgramPointTag(StringRef CheckerName, StringRef Msg);
545-
CheckerProgramPointTag(const CheckerBase *Checker, StringRef Msg);
527+
/// Attached to nodes created by this checker class when the ExplodedGraph is
528+
/// dumped for debugging.
529+
StringRef getTagDescription() const override;
546530
};
547531

548-
template <typename CHECK1, typename... CHECKs>
549-
class Checker : public CHECK1, public CHECKs..., public CheckerBase {
532+
/// Simple checker classes that implement one frontend (i.e. checker name)
533+
/// should derive from this template and specify all the implemented callbacks
534+
/// (i.e. classes like `check::PreStmt` or `eval::Call`) as template arguments
535+
/// of `Checker`.
536+
template <typename... CHECKs>
537+
class Checker : public CheckerBase, public CHECKs... {
550538
public:
551539
template <typename CHECKER>
552-
static void _register(CHECKER *checker, CheckerManager &mgr) {
553-
CHECK1::_register(checker, mgr);
554-
Checker<CHECKs...>::_register(checker, mgr);
540+
static void _register(CHECKER *Chk, CheckerManager &Mgr) {
541+
(CHECKs::_register(Chk, Mgr), ...);
555542
}
556543
};
557544

558-
template <typename CHECK1>
559-
class Checker<CHECK1> : public CHECK1, public CheckerBase {
545+
/// Checker families (where a single backend class implements multiple related
546+
/// frontends) should derive from this template and specify all the implemented
547+
/// callbacks (i.e. classes like `check::PreStmt` or `eval::Call`) as template
548+
/// arguments of `FamilyChecker.`
549+
///
550+
/// NOTE: Classes deriving from `CheckerFamily` must implement the pure
551+
/// virtual method `StringRef getTagDescription()` which is inherited from
552+
/// `ProgramPointTag` and should return the name of the class as a string.
553+
///
554+
/// Obviously, this boilerplate is not a good thing, but unfortunately there is
555+
/// no portable way to stringify the name of a type (e.g. class), so any
556+
/// portable implementation of `getTagDescription` would need to take the
557+
/// name of the class from *somewhere* where it's present as a string -- and
558+
/// then directly placing it in a method override is much simpler than
559+
/// loading it from `Checkers.td`.
560+
///
561+
/// Note that the existing `CLASS` field in `Checkers.td` is not suitable for
562+
/// our goals, because instead of storing the same class name for each
563+
/// frontend, in fact each frontendchecker needs to have its own unique value
564+
/// there (to ensure that the names of the register methods are all unique).
565+
template <typename... CHECKs>
566+
class CheckerFamily : public CheckerBackend, public CHECKs... {
560567
public:
561568
template <typename CHECKER>
562-
static void _register(CHECKER *checker, CheckerManager &mgr) {
563-
CHECK1::_register(checker, mgr);
569+
static void _register(CHECKER *Chk, CheckerManager &Mgr) {
570+
(CHECKs::_register(Chk, Mgr), ...);
564571
}
565572
};
566573

@@ -581,6 +588,20 @@ class EventDispatcher {
581588
}
582589
};
583590

591+
/// Tag that can use a checker name as a message provider
592+
/// (see SimpleProgramPointTag).
593+
/// FIXME: This is a cargo cult class which is copied into several checkers but
594+
/// does not provide anything useful.
595+
/// The only added functionality provided by this class (compared to
596+
/// SimpleProgramPointTag) is that it composes the tag description string from
597+
/// two arguments -- but tag descriptions only appear in debug output so there
598+
/// is no reason to bother with this.
599+
class CheckerProgramPointTag : public SimpleProgramPointTag {
600+
public:
601+
CheckerProgramPointTag(StringRef CheckerName, StringRef Msg);
602+
CheckerProgramPointTag(const CheckerBase *Checker, StringRef Msg);
603+
};
604+
584605
/// We dereferenced a location that may be null.
585606
struct ImplicitNullDerefEvent {
586607
SVal Location;

0 commit comments

Comments
 (0)