| 1 | +# In-memory file analysis |
| 2 | + |
| 3 | +<note>Experimental API. Subject to changes.</note> |
| 4 | + |
| 5 | +Parsing a Kotlin file does not require any knowledge about the project it belongs to. On the other hand, for *semantic |
| 6 | +code analysis*, understanding dependencies and compilation options of a file is crucial. |
| 7 | + |
| 8 | +In most cases, source files — whether written by a user or auto-generated — are stored on disk and belong to a specific module. |
| 9 | +The build system understands the project layout and instructs the compiler or the IDE to use appropriate dependencies for |
| 10 | +all files in that module. For script files, such as `build.gradle.kts`, the situation is more complex since |
| 11 | +scripts technically do not belong to any module. However, in such cases, the build system also provides the necessary |
| 12 | +context. For example, Gradle build scripts include the Gradle API and all libraries Gradle depends on in their classpath. |
| 13 | + |
| 14 | +In certain cases, it might be useful to analyze a file without storing it in the file system. For example, an IDE |
| 15 | +inspection may use in-memory files to verify whether code will remain valid after applying a proposed change. |
| 16 | +Specifically, the inspection might create a copy of the `KtFile`, apply the change to the copy, and check for new |
| 17 | +compilation errors. In such scenarios, the inspection needs to supply the correct analysis context for the in-memory |
| 18 | +`KtFile`. |
| 19 | + |
| 20 | +The Analysis API provides multiple approaches for analyzing in-memory files and attaching context to them. |
| 21 | +Below, we explore these options and their differences. |
| 22 | + |
| 23 | +## Stand-alone file analysis |
| 24 | + |
| 25 | +Let's begin with the most generic case: creating an in-memory file with arbitrary content and analyzing it. |
| 26 | + |
| 27 | +To create a file, we use the `KtPsiFactory` class: |
| 28 | + |
| 29 | +```kotlin |
| 30 | +val text = |
| 31 | + """ |
| 32 | + package test |
| 33 | + |
| 34 | + fun foo() { |
| 35 | + println("Hello, world!") |
| 36 | + } |
| 37 | + """.trimIndent() |
| 38 | + |
| 39 | +val factory = KtPsiFactory(project) |
| 40 | +val file = factory.createFile(text) |
| 41 | +``` |
| 42 | + |
| 43 | +`KtPsiFactory` offers many utilities for creating chunks of Kotlin code. In our case, we are primarily interested in |
| 44 | +creating entire Kotlin files. |
| 45 | + |
| 46 | +If we analyze the created file with `analyze {}`, we notice that the `println` reference is reported as unresolved: |
| 47 | + |
| 48 | +```kotlin |
| 49 | +analyze(file) { |
| 50 | + val diagnostics = file.collectDiagnostics(KaDiagnosticCheckerFilter.ONLY_COMMON_CHECKERS) |
| 51 | + // ["Unresolved reference 'println'."] |
| 52 | + val messages = diagnostics.map { it.defaultMessage } |
| 53 | +} |
| 54 | +``` |
| 55 | + |
| 56 | +This happens because the `KtFile` we created lacks any attached context, making the Kotlin Standard Library unavailable. |
| 57 | +However, code in the file still *can* access a few basic Kotlin types, such as `Int` or `String`, and can resolve |
| 58 | +references to declarations from the same file. |
| 59 | + |
| 60 | +Now, let's assume we have a `contextFile` that belongs to a module, and we want to analyze our in-memory file as if it |
| 61 | +were in that module. First, we retrieve the containing module of the `contextFile`: |
| 62 | + |
| 63 | +```kotlin |
| 64 | +val contextModule = KaModuleProvider.getModule(project, contextFile, useSiteModule = null) |
| 65 | +``` |
| 66 | + |
| 67 | +Now that we have the module, we attach it to our file: |
| 68 | + |
| 69 | +```kotlin |
| 70 | +@OptIn(KaExperimentalApi::class) |
| 71 | +file.contextModule = contextModule |
| 72 | +``` |
| 73 | + |
| 74 | +If the context module includes a dependency on the Kotlin Standard Library, the analysis will no longer produce errors. |
| 75 | + |
| 76 | +The created file can reference declarations from the context module, including `internal` ones. |
| 77 | +However, whatever the newly created file contains, it does not affect resolution of the context file. |
| 78 | +For example, if we declare a function in `file`, it will not be visible from `contextFile`. |
| 79 | + |
| 80 | +Here is the complete code of our example: |
| 81 | + |
| 82 | +```kotlin |
| 83 | +val text = |
| 84 | + """ |
| 85 | + package test |
| 86 | + |
| 87 | + fun foo() { |
| 88 | + println("Hello, world!") |
| 89 | + } |
| 90 | + """.trimIndent() |
| 91 | + |
| 92 | +val factory = KtPsiFactory(project) |
| 93 | +val file = factory.createFile(text) |
| 94 | + |
| 95 | +val contextModule = KaModuleProvider.getModule(project, contextFile, useSiteModule = null) |
| 96 | + |
| 97 | +@OptIn(KaExperimentalApi::class) |
| 98 | +file.contextModule = contextModule |
| 99 | + |
| 100 | +analyze(file) { |
| 101 | + val diagnostics = file.collectDiagnostics(KaDiagnosticCheckerFilter.ONLY_COMMON_CHECKERS) |
| 102 | + // An empty list |
| 103 | + val messages = diagnostics.map { it.defaultMessage } |
| 104 | +} |
| 105 | +``` |
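The steps above can be wrapped in a single helper. The following is a minimal sketch, not part of the Analysis API: the function name `findErrorsInContextOf` is hypothetical, the import paths are assumed from the current Analysis API package layout, and the filtering by `KaSeverity.ERROR` is an addition that the example above does not perform.

```kotlin
import org.jetbrains.kotlin.analysis.api.KaExperimentalApi
import org.jetbrains.kotlin.analysis.api.analyze
import org.jetbrains.kotlin.analysis.api.components.KaDiagnosticCheckerFilter
import org.jetbrains.kotlin.analysis.api.diagnostics.KaSeverity
import org.jetbrains.kotlin.analysis.api.projectStructure.KaModuleProvider
import org.jetbrains.kotlin.analysis.api.projectStructure.contextModule
import org.jetbrains.kotlin.psi.KtFile
import org.jetbrains.kotlin.psi.KtPsiFactory

// Hypothetical helper: analyzes `text` as if it were a file in the module of `contextFile`
// and returns the messages of error-level diagnostics only.
@OptIn(KaExperimentalApi::class)
fun findErrorsInContextOf(contextFile: KtFile, text: String): List<String> {
    val project = contextFile.project

    // Non-physical in-memory file; it has no analysis context until one is attached.
    val file = KtPsiFactory(project).createFile(text)
    file.contextModule = KaModuleProvider.getModule(project, contextFile, useSiteModule = null)

    // `analyze` returns the value of its lambda, so plain strings can leave the session.
    return analyze(file) {
        file.collectDiagnostics(KaDiagnosticCheckerFilter.ONLY_COMMON_CHECKERS)
            .filter { it.severity == KaSeverity.ERROR }
            .map { it.defaultMessage }
    }
}
```

Returning plain strings rather than diagnostics keeps session-bound objects from leaking outside the `analyze {}` block.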
| 106 | + |
| 107 | + |
| 108 | +## Context modules |
| 109 | + |
| 110 | +In the previous example, we used the `KaModuleProvider.getModule()` function to retrieve the module containing the |
| 111 | +`contextFile`. The returned value is of type `KaModule`, an Analysis API abstraction over the `Module`, `Library`, |
| 112 | +and `Sdk` concepts from IntelliJ IDEA. Specifically, a `KaSourceModule` represents a source module, while libraries and |
| 113 | +SDKs are represented by `KaLibraryModule`s. Every `KaSymbol` in the Analysis API is associated with some `KaModule`. |
| 114 | + |
| 115 | +If you already have a reference to a `Module`, you can convert it to a `KaModule` using one of the Kotlin plugin helper |
| 116 | +functions: |
| 117 | + |
| 118 | +```kotlin |
| 119 | +fun Module.toKaSourceModule(kind: KaSourceModuleKind): KaSourceModule? |
| 120 | +fun Module.toKaSourceModuleForProduction(): KaSourceModule? |
| 121 | +fun Module.toKaSourceModuleForTest(): KaSourceModule? |
| 122 | +``` |
| 123 | + |
| 124 | +For more related APIs, refer to the Kotlin plugin's |
| 125 | +[source code](https://github.com/JetBrains/intellij-community/blob/master/plugins/kotlin/base/project-structure/src/org/jetbrains/kotlin/idea/base/projectStructure/api.kt). |
| 126 | +There are also overloads that accept `ModuleId` and `ModuleEntity` from the newer project model API. |
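As an illustration, one of these helpers can be combined with `contextModule` when an IntelliJ `Module` is already at hand and no context file is available. This is a sketch under assumptions: the function name `createFileInProductionSources` is hypothetical, and the import paths are inferred from the package of the linked `api.kt` file and the Analysis API package layout.

```kotlin
import com.intellij.openapi.module.Module
import org.jetbrains.kotlin.analysis.api.KaExperimentalApi
import org.jetbrains.kotlin.analysis.api.projectStructure.contextModule
import org.jetbrains.kotlin.idea.base.projectStructure.toKaSourceModuleForProduction
import org.jetbrains.kotlin.psi.KtFile
import org.jetbrains.kotlin.psi.KtPsiFactory

// Hypothetical helper: creates an in-memory file that is analyzed as part of
// the production sources of `module`.
@OptIn(KaExperimentalApi::class)
fun createFileInProductionSources(module: Module, text: String): KtFile? {
    // The conversion is nullable, for example when the module has no production source module.
    val kaModule = module.toKaSourceModuleForProduction() ?: return null

    val file = KtPsiFactory(module.project).createFile(text)
    file.contextModule = kaModule
    return file
}
```

Using `toKaSourceModuleForTest()` instead would analyze the file as if it belonged to the test sources of the same module.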
| 127 | + |
| 128 | +In the `KaModuleProvider.getModule` function call, we passed `useSiteModule = null`. For advanced scenarios, |
| 129 | +you might want to analyze files from other modules in the context of a synthetic module. In such cases, that synthetic |
| 130 | +module can be passed as a `useSiteModule`. For typical use cases, it is safe to pass `null`. |
| 131 | + |
| 132 | + |
| 133 | +## Physical and non-physical files |
| 134 | + |
| 135 | +In the previous examples, we used `KtPsiFactory` to create a non-physical file. From IntelliJ IDEA's perspective, |
| 136 | +whether a file is "physical" is independent of whether it is stored on disk. We can create both physical and |
| 137 | +non-physical `KtFile`s that are never written to disk. |
| 138 | + |
| 139 | +So what is the difference? Non-physical files are typically created for one-shot analysis. Creating them has a lower |
| 140 | +cost, but IntelliJ IDEA does not track changes in them. A physical file, on the other hand, is handled by the PSI |
| 141 | +modification machinery, allowing the Analysis API to track changes and update analysis results accordingly. |
| 142 | + |
| 143 | +If you need a long-lived file that will be modified and analyzed multiple times, you should create a physical file by |
| 144 | +using a `KtPsiFactory` with `eventSystemEnabled = true`: |
| 145 | + |
| 146 | +```kotlin |
| 147 | +val factory = KtPsiFactory(project, eventSystemEnabled = true) |
| 148 | +val file = factory.createFile(text) |
| 149 | +``` |
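The choice between the two kinds of files can be made explicit at the creation site. The sketch below is illustrative only: `createInMemoryFile` and its `longLived` parameter are hypothetical names, and neither variant writes anything to disk.

```kotlin
import com.intellij.openapi.project.Project
import org.jetbrains.kotlin.psi.KtFile
import org.jetbrains.kotlin.psi.KtPsiFactory

// Hypothetical helper: `longLived = true` requests a physical file whose changes are tracked,
// `longLived = false` requests a cheaper non-physical file for one-shot analysis.
fun createInMemoryFile(project: Project, text: String, longLived: Boolean): KtFile {
    val factory = KtPsiFactory(project, eventSystemEnabled = longLived)
    return factory.createFile(text)
}
```

To check which kind of file you are holding, `PsiElement.isPhysical` on the resulting `KtFile` is expected to reflect the `eventSystemEnabled` flag, although that is an assumption about the platform implementation rather than something this page guarantees.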
| 150 | + |
| 151 | + |
| 152 | +## File copies |
| 153 | + |
| 154 | +In some cases, you might want to create a complete copy of an existing file, modify it in some way, and analyze |
| 155 | +the result. |
| 156 | + |
| 157 | +To create a copy of a file, you can use the `copy` function: |
| 158 | + |
| 159 | +```kotlin |
| 160 | +val fileCopy = file.copy() as KtFile |
| 161 | +``` |
| 162 | + |
| 163 | +The `copy()` function sets a reference to the original file in the produced copy. As a result, the `getOriginalFile()` |
| 164 | +of the newly created file points back to the original file. This allows the Analysis API to automatically use the |
| 165 | +context of the original file. In other words, there is no need to manually set the `contextModule` for a copied file. |
| 166 | + |
| 167 | +The `copy()` function creates non-physical files. For this setup (a non-physical file with a `getOriginalFile()` set), |
| 168 | +the Analysis API uses a different analysis strategy by default. |
| 169 | + |
| 170 | +This behavior is optimized for efficiency. Since the original file might be large, analyzing it from scratch could be |
| 171 | +unnecessarily resource-intensive. In most cases, file copies primarily differ in declaration bodies, so the Analysis API |
| 172 | +leverages existing analysis results from the original file. |
| 173 | + |
| 174 | +If you make changes to the declaration signatures in the copied file, you should analyze it independently of the |
| 175 | +original file. To do so, use the `PREFER_SELF` resolution mode with the `analyzeCopy()` function: |
| 176 | + |
| 177 | +```kotlin |
| 178 | +analyzeCopy(fileCopy, KaDanglingFileResolutionMode.PREFER_SELF) { |
| 179 | + // Analysis code, just as in `analyze()` |
| 180 | +} |
| 181 | +``` |
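Putting the pieces together, the inspection scenario from the introduction (copy a file, apply a change, check for new problems) might look as follows. This is a sketch, not an official recipe: `diagnosticsAfterChange` and `applyChange` are hypothetical names, the import paths are assumed, `analyzeCopy()` is assumed to return the value of its lambda just as `analyze()` does, and an additional opt-in annotation may be required depending on the Analysis API version.

```kotlin
import org.jetbrains.kotlin.analysis.api.analyzeCopy
import org.jetbrains.kotlin.analysis.api.components.KaDiagnosticCheckerFilter
import org.jetbrains.kotlin.analysis.api.projectStructure.KaDanglingFileResolutionMode
import org.jetbrains.kotlin.psi.KtFile

// Hypothetical helper: applies `applyChange` to a copy of `original` and returns
// the diagnostic messages reported for the modified copy.
fun diagnosticsAfterChange(original: KtFile, applyChange: (KtFile) -> Unit): List<String> {
    // The copy keeps a reference to `original`, so no `contextModule` needs to be set.
    val copy = original.copy() as KtFile
    applyChange(copy)

    // PREFER_SELF: the change may touch declaration signatures, so the copy is
    // analyzed independently of the original file.
    return analyzeCopy(copy, KaDanglingFileResolutionMode.PREFER_SELF) {
        copy.collectDiagnostics(KaDiagnosticCheckerFilter.ONLY_COMMON_CHECKERS)
            .map { it.defaultMessage }
    }
}
```

If the change is known to affect only declaration bodies, calling the usual `analyze()` on the copy keeps the cheaper default behavior.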
| 182 | + |
| 183 | +Conversely, if you manually create a physical file copy, you can still request the more efficient analysis by passing |
| 184 | +the `KaDanglingFileResolutionMode.IGNORE_SELF` option: |
| 185 | + |
| 186 | +```kotlin |
| 187 | +val factory = KtPsiFactory(file.project, eventSystemEnabled = true) |
| 188 | +val fileCopy = factory.createFile("text.kt", file.text) |
| 189 | +fileCopy.originalFile = file |
| 190 | + |
| 191 | +analyzeCopy(fileCopy, KaDanglingFileResolutionMode.IGNORE_SELF) { |
| 192 | + // Analysis code, just as in `analyze()` |
| 193 | +} |
| 194 | +``` |
| 195 | + |
| 196 | +The `analyzeCopy()` function works exclusively for file copies. Unless you need to configure the resolution mode |
| 197 | +explicitly, use the usual `analyze()` instead. |
| 198 | + |
| 199 | +<note>In the future, the Analysis API may be able to track changes in file copies and decide on an appropriate |
| 200 | +resolution mode automatically.</note> |
| 201 | + |
| 202 | + |
| 203 | +## Code fragments |
| 204 | + |
| 205 | +If you only need to analyze a single expression or reference within the context of surrounding code, creating a full |
| 206 | +file copy is often unnecessary. For these use cases, the Analysis API provides the concept of *code fragments*. |
| 207 | + |
| 208 | +A code fragment is a small piece of code that can be analyzed within the context of some other code. |
| 209 | +There are three types of code fragments: |
| 210 | + |
| 211 | +- `KtExpressionCodeFragment` for analyzing a single expression; |
| 212 | +- `KtBlockCodeFragment` for analyzing a block of statements; |
| 213 | +- `KtTypeCodeFragment` for analyzing a single type reference. |
| 214 | + |
| 215 | +All three types of code fragments extend the `KtCodeFragment` class, which itself extends `KtFile`. |
| 216 | + |
| 217 | +Code fragments differ from typical `KtFile`s in several important ways: |
| 218 | + |
| 219 | +- **No package directive**: A code fragment cannot have a `package` directive; its package is inherently the same as |
| 220 | + the package of the context file. |
| 221 | +- **Import handling**: Imports are supplied separately from the fragment text, and import aliases are not permitted. |
| 222 | +- **Local-only content**: All content within a code fragment is considered local. |
| 223 | + |
| 224 | +To create a code fragment, you need two inputs: the source text of the fragment and a context element from the |
| 225 | +surrounding code. For example, consider the following code snippet where `print(name)` is a context element: |
| 226 | + |
| 227 | +```kotlin |
| 228 | +fun test() { |
| 229 | + val name = "poem.txt" |
| 230 | + print(name) |
| 231 | +} |
| 232 | +``` |
| 233 | + |
| 234 | +Now, let's create a code fragment that references `name` to read from a file: |
| 235 | + |
| 236 | +```kotlin |
| 237 | +val fragment = KtExpressionCodeFragment( |
| 238 | + project, |
| 239 | + name = "fragment.kt", |
| 240 | + text = "File(name).readText()", |
| 241 | + context = contextElement, |
| 242 | + imports = listOf("java.io.File") |
| 243 | +) |
| 244 | +``` |
| 245 | + |
| 246 | +A code fragment can reference any declaration visible from its context element, including local declarations. |
| 247 | +For the above example, the code fragment accesses the local variable `name`. |
| 248 | + |
| 249 | +If we pass the `val name = "poem.txt"` declaration as a context element, the code fragment analysis will result in an |
| 250 | +error, as variables are not yet available on the line of their declaration. |
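The examples above do not show where `contextElement` comes from. One possible way to obtain it, assuming the snippet with `test()` lives in a `KtFile`, is to search the PSI tree for the `print(...)` call. The helper name `findPrintCall` is hypothetical; `PsiTreeUtil` and `KtCallExpression` are regular IntelliJ Platform and Kotlin PSI classes.

```kotlin
import com.intellij.psi.util.PsiTreeUtil
import org.jetbrains.kotlin.psi.KtCallExpression
import org.jetbrains.kotlin.psi.KtFile

// Hypothetical helper: returns the first call in `file` whose callee is written as `print`,
// or null if there is none. The match is purely textual; no resolution is performed.
fun findPrintCall(file: KtFile): KtCallExpression? =
    PsiTreeUtil.findChildrenOfType(file, KtCallExpression::class.java)
        .firstOrNull { it.calleeExpression?.text == "print" }
```

In an IDE feature, the context element more commonly comes from the caret position or from the element an inspection is currently visiting, so a lookup like this is mostly useful in tests and experiments.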
| 251 | + |
| 252 | +Since code fragments extend `KtFile`, you can analyze them in the same way as you analyze files: |
| 253 | + |
| 254 | +```kotlin |
| 255 | +analyze(fragment) { |
| 256 | + // Analysis code |
| 257 | +} |
| 258 | +``` |
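For instance, an analysis session can be used to inspect the type of the expression held by a `KtExpressionCodeFragment`. The following is a minimal sketch: the function name `isStringExpression` is hypothetical, and it relies on `getContentElement()`, `expressionType`, and `isStringType` being available in the Analysis API version in use.

```kotlin
import org.jetbrains.kotlin.analysis.api.analyze
import org.jetbrains.kotlin.psi.KtExpressionCodeFragment

// Hypothetical helper: reports whether the fragment's expression has the type `kotlin.String`.
fun isStringExpression(fragment: KtExpressionCodeFragment): Boolean =
    analyze(fragment) {
        // The content element is null when the fragment text is not a single expression.
        val expression = fragment.getContentElement() ?: return@analyze false

        // `expressionType` is resolved using the declarations visible from the context element.
        expression.expressionType?.isStringType == true
    }
```

Only the `Boolean` leaves the session; the `KaType` itself is valid only inside the `analyze {}` block.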
| 259 | + |
| 260 | +The context element of a code fragment can itself belong to another code fragment or to an in-memory file. |