Commit 41ac245

[include-cleaner] Include-cleaner library structure, and simplistic AST walking.
Include-cleaner is a library that uses the clang AST and preprocessor to
determine which headers are used. It will be used in clang-tidy, in clangd,
in a standalone tool (at least for testing), and in out-of-tree tools.

Roughly, it walks the AST, finds referenced decls, maps these to used
SourceLocations, then to FileEntrys, then matches these against #includes.
However there are many wrinkles: dealing with macros, standard library
symbols, umbrella headers, IWYU directives etc.

It is not built on the C++20 modules concept of usage, to allow:
- use with existing non-modules codebases
- a flexible API embeddable in clang-tidy, clangd, and other tools
- avoiding a chicken-and-egg problem where include cleanups are needed
  before modules can be adopted

This library is based on existing functionality in clangd that provides an
unused-include warning. However it has design changes:
- it accommodates diagnosing missing includes too (this means tracking
  where references come from, not just the set of targets)
- it more clearly separates the different mappings
  (symbol => location => header => include) for better testing
- it handles special cases like standard library symbols and IWYU directives
  more elegantly by adding unified Location and Header types instead of
  side-tables
- it will support some customization of policy where necessary (e.g. for
  style questions of what constitutes a use, or to allow both missing-include
  and unused-include modes to be conservative)

This patch adds the basic directory structure under clang-tools-extra and
a skeleton version of the AST traversal, which will be the central piece.
A more end-to-end prototype is in https://reviews.llvm.org/D122677

RFC: https://discourse.llvm.org/t/rfc-lifting-include-cleaner-missing-unused-include-detection-out-of-clangd/61228

Differential Revision: https://reviews.llvm.org/D124164
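The chain of mappings described in the commit message (symbol => location => header => include) can be illustrated with a small standalone sketch. This is not the library's API: `Location`, `IncludeReport`, and `analyze` are hypothetical names, and plain strings stand in for clang's SourceLocation and FileEntry. It only shows how tracking the used headers lets one analysis yield both unused and missing includes.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a declaration's source location.
struct Location {
  std::string File; // header that contains the declaration
  unsigned Line;
};

// Result of comparing used headers against the file's #includes.
struct IncludeReport {
  std::set<std::string> Unused;  // included, but no used symbol lives there
  std::set<std::string> Missing; // declares a used symbol, but not included
};

// Chains symbol => location => header, then matches headers against includes.
IncludeReport analyze(const std::vector<std::string> &UsedSymbols,
                      const std::map<std::string, Location> &DeclLocations,
                      const std::set<std::string> &Includes) {
  std::set<std::string> UsedHeaders;
  for (const std::string &Sym : UsedSymbols) {
    auto It = DeclLocations.find(Sym);
    if (It != DeclLocations.end())
      UsedHeaders.insert(It->second.File); // location => header
  }

  IncludeReport Report;
  for (const std::string &Inc : Includes)
    if (!UsedHeaders.count(Inc))
      Report.Unused.insert(Inc);
  for (const std::string &Header : UsedHeaders)
    if (!Includes.count(Header))
      Report.Missing.insert(Header);
  return Report;
}
```

In this toy model a header is identified by its file name alone; the wrinkles the message lists (macros, umbrella headers, IWYU directives) are exactly what such a direct mapping does not handle.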
1 parent 14869bd commit 41ac245

15 files changed: +336 -0 lines changed

clang-tools-extra/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ add_subdirectory(clang-doc)
 add_subdirectory(clang-include-fixer)
 add_subdirectory(clang-move)
 add_subdirectory(clang-query)
+add_subdirectory(include-cleaner)
 add_subdirectory(pp-trace)
 add_subdirectory(pseudo)
 add_subdirectory(tool-template)
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+add_subdirectory(lib)
+if(CLANG_INCLUDE_TESTS)
+  add_subdirectory(test)
+  add_subdirectory(unittests)
+endif()

clang-tools-extra/include-cleaner/README.md

Whitespace-only changes.
Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
+//===--- AnalysisInternal.h - Analysis building blocks ------------- C++-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file provides smaller, testable pieces of the used-header analysis.
+// We find the headers by chaining together several mappings.
+//
+// AST => AST node => Symbol => Location => Header
+//                   /
+// Macro expansion =>
+//
+// The individual steps are declared here.
+// (AST => AST Node => Symbol is one API to avoid materializing DynTypedNodes).
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef CLANG_INCLUDE_CLEANER_ANALYSISINTERNAL_H
+#define CLANG_INCLUDE_CLEANER_ANALYSISINTERNAL_H
+
+#include "clang/Basic/SourceLocation.h"
+#include "llvm/ADT/STLFunctionalExtras.h"
+
+namespace clang {
+class Decl;
+class NamedDecl;
+namespace include_cleaner {
+
+/// Traverses part of the AST from \p Root, finding uses of symbols.
+///
+/// Each use is reported to the callback:
+/// - the SourceLocation describes where the symbol was used. This is usually
+///   the primary location of the AST node found under Root.
+/// - the NamedDecl is the symbol referenced. It is canonical, rather than e.g.
+///   the redecl actually found by lookup.
+///
+/// walkAST is typically called once per top-level declaration in the file
+/// being analyzed, in order to find all references within it.
+void walkAST(Decl &Root, llvm::function_ref<void(SourceLocation, NamedDecl &)>);
+
+} // namespace include_cleaner
+} // namespace clang
+
+#endif
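The contract documented for walkAST (one call per top-level declaration, each use reported with its location and the canonical declaration) can be mimicked without clang. The sketch below is only an analogy: `ToyDecl`, `ToyRef`, `walkToy`, and `usedSymbols` are hypothetical names, and `std::function` stands in for `llvm::function_ref`. It shows why reporting the canonical declaration lets a caller deduplicate references that were resolved to different redeclarations.

```cpp
#include <functional>
#include <set>
#include <vector>

struct ToyDecl;

// A reference found inside a declaration: where it occurs and what it names.
struct ToyRef {
  unsigned Offset;       // stand-in for a SourceLocation
  const ToyDecl *Target; // the redeclaration found by lookup, may be null
};

// Toy stand-in for a declaration with redeclarations.
struct ToyDecl {
  const ToyDecl *Canonical = nullptr; // null means "this decl is canonical"
  std::vector<ToyRef> Refs;           // references made inside this decl
};

using RefCallback = std::function<void(unsigned, const ToyDecl &)>;

// Reports each reference under Root, always passing the canonical target.
void walkToy(const ToyDecl &Root, const RefCallback &Callback) {
  for (const ToyRef &Ref : Root.Refs) {
    if (!Ref.Target)
      continue; // nothing was referenced
    const ToyDecl *Canon =
        Ref.Target->Canonical ? Ref.Target->Canonical : Ref.Target;
    Callback(Ref.Offset, *Canon);
  }
}

// Calls the walker once per top-level declaration and collects the distinct
// canonical symbols that were referenced.
std::set<const ToyDecl *>
usedSymbols(const std::vector<const ToyDecl *> &TopLevel) {
  std::set<const ToyDecl *> Used;
  for (const ToyDecl *D : TopLevel)
    walkToy(*D, [&](unsigned, const ToyDecl &Sym) { Used.insert(&Sym); });
  return Used;
}
```

Two functions that refer to a forward declaration and to the definition of the same entity therefore contribute a single entry to the result.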
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+set(LLVM_LINK_COMPONENTS Support)
+
+add_clang_library(clangIncludeCleaner
+  WalkAST.cpp
+
+  LINK_LIBS
+  clangBasic
+  clangAST
+  )
+
Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
+//===--- WalkAST.cpp - Find declaration references in the AST -------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "AnalysisInternal.h"
+#include "clang/AST/RecursiveASTVisitor.h"
+
+namespace clang {
+namespace include_cleaner {
+namespace {
+using DeclCallback = llvm::function_ref<void(SourceLocation, NamedDecl &)>;
+
+class ASTWalker : public RecursiveASTVisitor<ASTWalker> {
+  DeclCallback Callback;
+
+  void report(SourceLocation Loc, NamedDecl *ND) {
+    if (!ND || Loc.isInvalid())
+      return;
+    Callback(Loc, *cast<NamedDecl>(ND->getCanonicalDecl()));
+  }
+
+public:
+  ASTWalker(DeclCallback Callback) : Callback(Callback) {}
+
+  bool VisitTagTypeLoc(TagTypeLoc TTL) {
+    report(TTL.getNameLoc(), TTL.getDecl());
+    return true;
+  }
+
+  bool VisitDeclRefExpr(DeclRefExpr *DRE) {
+    report(DRE->getLocation(), DRE->getFoundDecl());
+    return true;
+  }
+};
+
+} // namespace
+
+void walkAST(Decl &Root, DeclCallback Callback) {
+  ASTWalker(Callback).TraverseDecl(&Root);
+}
+
+} // namespace include_cleaner
+} // namespace clang
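ASTWalker relies on the CRTP pattern of RecursiveASTVisitor: the base class drives the recursion and calls Visit hooks defined by the derived class. A much-reduced sketch of that dispatch, assuming a toy tree rather than the clang AST, might look as follows. `Node`, `ToyVisitor`, `RefCollector`, and `collectRefs` are hypothetical names, and the single `Visit` hook stands in for the many per-node-type hooks of the real visitor.

```cpp
#include <string>
#include <vector>

// Toy tree node; the real visitor walks clang AST nodes.
struct Node {
  std::string Kind;   // e.g. "DeclRefExpr"
  std::string Target; // referenced symbol name, empty if none
  std::vector<Node> Children;
};

// Minimal CRTP traversal: the base owns the recursion and statically
// dispatches to the derived class's Visit hook for every node.
template <typename Derived> class ToyVisitor {
public:
  bool Traverse(const Node &N) {
    if (!static_cast<Derived *>(this)->Visit(N))
      return false; // a hook returning false aborts the traversal
    for (const Node &Child : N.Children)
      if (!Traverse(Child))
        return false;
    return true;
  }

  // Default hook, used when the derived class does not define its own.
  bool Visit(const Node &) { return true; }
};

// Collects the targets of reference nodes, in pre-order.
class RefCollector : public ToyVisitor<RefCollector> {
public:
  std::vector<std::string> Refs;

  bool Visit(const Node &N) {
    if (N.Kind == "DeclRefExpr" && !N.Target.empty())
      Refs.push_back(N.Target);
    return true;
  }
};

std::vector<std::string> collectRefs(const Node &Root) {
  RefCollector Collector;
  Collector.Traverse(Root);
  return Collector.Refs;
}
```

Because the hook is resolved at compile time through `static_cast<Derived *>`, no virtual calls are needed, which is the same reason the real visitor is parameterized on its derived class.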
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+set(CLANG_INCLUDE_CLEANER_TEST_DEPS
+  ClangIncludeCleanerTests
+  )
+
+foreach (dep FileCheck not count)
+  if(TARGET ${dep})
+    list(APPEND CLANG_INCLUDE_CLEANER_TEST_DEPS ${dep})
+  endif()
+endforeach()
+
+configure_lit_site_cfg(
+  ${CMAKE_CURRENT_SOURCE_DIR}/lit.site.cfg.py.in
+  ${CMAKE_CURRENT_BINARY_DIR}/lit.site.cfg.py
+  MAIN_CONFIG
+  ${CMAKE_CURRENT_BINARY_DIR}/lit.cfg.py)
+
+configure_lit_site_cfg(
+  ${CMAKE_CURRENT_SOURCE_DIR}/Unit/lit.site.cfg.py.in
+  ${CMAKE_CURRENT_BINARY_DIR}/Unit/lit.site.cfg.py
+  MAIN_CONFIG
+  ${CMAKE_CURRENT_BINARY_DIR}/Unit/lit.cfg.py)
+
+add_lit_testsuite(check-clang-include-cleaner "Running the clang-include-cleaner regression tests"
+  ${CMAKE_CURRENT_BINARY_DIR}
+  DEPENDS ${CLANG_INCLUDE_CLEANER_TEST_DEPS})
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+import lit.formats
+config.name = "clangIncludeCleaner Unit Tests"
+config.test_format = lit.formats.GoogleTest('.', 'Tests')
+config.test_source_root = config.clang_include_cleaner_binary_dir + "/unittests"
+config.test_exec_root = config.clang_include_cleaner_binary_dir + "/unittests"
+
+# Point the dynamic loader at dynamic libraries in 'lib'.
+# FIXME: it seems every project has a copy of this logic. Move it somewhere.
+import platform
+if platform.system() == 'Darwin':
+    shlibpath_var = 'DYLD_LIBRARY_PATH'
+elif platform.system() == 'Windows':
+    shlibpath_var = 'PATH'
+else:
+    shlibpath_var = 'LD_LIBRARY_PATH'
+config.environment[shlibpath_var] = os.path.pathsep.join((
+    "@SHLIBDIR@", "@LLVM_LIBS_DIR@",
+    config.environment.get(shlibpath_var,'')))
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+@LIT_SITE_CFG_IN_HEADER@
+# This is a shim to run the gtest unittests in ../unittests using lit.
+
+config.llvm_libs_dir = path("@LLVM_LIBS_DIR@")
+config.shlibdir = path("@SHLIBDIR@")
+
+config.clang_include_cleaner_binary_dir = path("@CMAKE_CURRENT_BINARY_DIR@/..")
+
+# Delegate logic to lit.cfg.py.
+lit_config.load_config(config, "@CMAKE_CURRENT_SOURCE_DIR@/Unit/lit.cfg.py")
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+import lit.llvm
+
+lit.llvm.initialize(lit_config, config)
+lit.llvm.llvm_config.use_default_substitutions()
+
+config.name = 'ClangIncludeCleaner'
+config.suffixes = ['.test', '.c', '.cpp']
+config.excludes = ['Inputs']
+config.test_format = lit.formats.ShTest(not lit.llvm.llvm_config.use_lit_shell)
+config.test_source_root = config.clang_include_cleaner_source_dir + "/test"
+config.test_exec_root = config.clang_include_cleaner_binary_dir + "/test"
+
+config.environment['PATH'] = os.path.pathsep.join((
+    config.clang_tools_dir,
+    config.llvm_tools_dir,
+    config.environment['PATH']))
