Simple changes to a file shouldn't cause a reanalysis of everything #4498
Replies: 8 comments
-
Thanks for the issue. Are you seeing it happen continually? Do you have a repro that can show this (and a log of it happening)? Typing in a file should only analyze that file. There is an indexing process that happens in the background, but once that's done, it shouldn't run on every keystroke.
-
It will also analyze other open files that import from the modified file (directly or indirectly), because changes in the modified file could affect their analysis. This reanalysis normally takes only a few tens of milliseconds, so it shouldn't be that noticeable. However, if you're editing a core file that every other file imports, it's possible that it will be observable. The worst case is a bunch of large files that are part of a large import cycle. That's normally something one would want to avoid, but I've seen it occur in some Python projects. Keep in mind that even seemingly innocuous code changes within a file can have an impact on the types of externally visible symbols, and therefore on the type information and diagnostics within files that depend on them.
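As a concrete (hypothetical) illustration of that last point: changing only the body of an unannotated function changes its inferred return type, and with it the diagnostics in every file that imports it. The file and function names below are invented for this sketch:

```python
# shared.py (hypothetical): a module that many other files import
def parse_port(raw: str):
    # With no return annotation, the inferred return type is int.
    return int(raw)

# A one-line edit to the body -- no signature change -- widens the
# inferred return type to int | None:
#
#     def parse_port(raw: str):
#         return int(raw) if raw else None

# client.py (hypothetical): depends on shared.py
#
#     port: int = parse_port("8080")  # fine before the edit; a type
#                                     # error after it, so client.py
#                                     # must be reanalyzed too
```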
-
I added this function to one Python code file:

```python
def for_test(self) -> None:
    pass
```

Then I turned on logging and added a new line before `pass` and a tab:

```python
def for_test(self) -> None:

    pass
```

You can see in the log file that both changes caused several other files to be accessed. The processing times can be much longer, depending on the set of files that get triggered. Given how the project is structured, I am wondering whether this cascading to other files is always necessary. From what I am seeing, any keystroke causes a multi-file update, and this leads to reported issues and warnings becoming laggy and out of sync with the actual code. If updates, for instance to symbols, genuinely impact several files, then this delay is not much of a problem given the overall structure of the code, but I'd assume that a lot of edits have only local implications, and a more responsive reaction to typos would be much appreciated.
-
I believe you're asking that we scan for syntax errors before (or faster than) other problems? I'm not sure we can do analysis without parsing import trees. When an edit happens, we have to reparse all imports. Maybe we could do a two-pass approach? @erictraut, do you think it's feasible to split the work up? Is this example the one that prompted you to log the issue, or are you seeing much larger slowdowns elsewhere? Everything seems to have been analyzed in less than 500 ms. For this example, it doesn't seem worth changing the behavior.
-
As I mentioned above, any change to a file triggers a reanalysis of that file and any files that depend on it (transitively). It doesn't matter whether the change is an empty line, a string, etc.

You mentioned that you "added in one python code file this function". Which file in the log does that correspond to?

The logs that you provided show very little (and very fast) foreground analysis (the lines labeled "FG").

The background analysis ("BG") is done lazily to refresh the symbol index. This should have no bearing on responsiveness, so you can ignore those log items.

Pylance is already highly optimized to deliver diagnostic results as quickly as possible for the foreground (active) window, then for the other open windows, then for any closed files (if diagnosticMode is set to "workspace").
-
Pylance does include a 250 ms delay after you type before it performs any analysis. This allows you to type multiple characters before it does any work. Any time you stop typing for 250 ms, it starts analyzing any dirty files in priority order.

What is the actual problem you're seeing here? Is there a noticeable lag that you're experiencing? Is there high CPU utilization over a long period of time? Or are you just concerned about the information that you're seeing in the logs?
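The delay described above is a debounce. Here is a minimal sketch of the idea, assuming nothing about Pylance's actual scheduler (the class and function names are invented for illustration):

```python
import threading
from typing import Callable, Optional

class AnalysisDebouncer:
    """Run `callback` only after `delay_s` seconds without a keystroke."""

    def __init__(self, delay_s: float, callback: Callable[[], None]) -> None:
        self._delay_s = delay_s
        self._callback = callback
        self._timer: Optional[threading.Timer] = None

    def on_keystroke(self) -> None:
        # Every keystroke cancels the pending run and restarts the clock,
        # so analysis starts only once typing pauses for `delay_s` seconds.
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(self._delay_s, self._callback)
        self._timer.start()

def analyze_dirty_files() -> None:
    print("analyzing dirty files in priority order...")

debouncer = AnalysisDebouncer(0.25, analyze_dirty_files)
for _ in range(10):          # ten rapid keystrokes...
    debouncer.on_keystroke()
# ...produce a single analysis pass, 250 ms after the last one.
```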
-
Many thanks, these explanations are very helpful.

Actually, the reason I became aware of the processing times is that Visual Studio displays "N files to analyze" while processing. This message is continuously updated with changing file counts until the background analysis is finished and "Python analysis done" is displayed. The content of the Error List is not updated any earlier, either. Thus, errors and warnings can be out of sync with the latest code revisions until all processing has completed, which is quite a nuisance. Using "BG" to track progress seems rather unhelpful, since you suggest ignoring it. Should I open an issue in PTVS to get timely error and status updates, or does Pylance track this?

Still, the background analysis in this code base (and in others I'm working with) is so time-consuming that it doesn't finish for minutes, with increased CPU utilization and fan noise for as long as I'm coding, because it constantly gets restarted. (The 500 ms in the example are actually a 750 ms delay for the user, and times can go up to around 5,000 ms depending on which file is edited and which files are open.)

While it might not improve reaction times, I think quite a few of the recalculations could be avoided if changes that cannot impact the indices were not processed by the background analysis after the foreground analysis is finished, thus bringing down the total CPU load.
-
Now that I think about it more, I think I was wrong above when I said that "BG" is used only to refresh the symbol index. I think "BG" is also used for full file analysis.

I suspect that your situation involves many open files (some of which contain significant amounts of code) that have a direct or indirect import dependency on the file you're editing. This will trigger reanalysis of these dependent files.

To reduce this effect, try to eliminate any import cycles. Ideally, your program's import graph should be a DAG. This will limit reanalysis to only those files that are "upstream" from the edited file. If the edited file is a leaf node that all other files depend upon, then there's not much to be done other than reanalyzing everything else.

Another thing you can do is not leave so many windows open and/or switch diagnosticMode from "workspace" to "openFilesOnly".
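To illustrate the cycle-breaking advice, here is a minimal sketch with hypothetical module names (`orders.py`, `customers.py`, and `models.py` are not from this thread):

```python
# Before: orders.py and customers.py import each other, so editing
# either one forces reanalysis of both (and of everything importing them):
#
#     orders.py:     from customers import Customer
#     customers.py:  from orders import Order
#
# After: the shared types move into a module that imports nothing,
# turning the import graph into a DAG.

# --- models.py (hypothetical) ---
from dataclasses import dataclass

@dataclass
class Customer:
    name: str

@dataclass
class Order:
    customer: Customer
    total: float

# --- orders.py (hypothetical) ---
#     from models import Customer, Order   # no longer imports customers.py

# --- customers.py (hypothetical) ---
#     from models import Customer          # no longer imports orders.py
```

For the last suggestion, the corresponding setting is `python.analysis.diagnosticMode`, which accepts `"workspace"` and `"openFilesOnly"`.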
Beta Was this translation helpful? Give feedback.
-
(This was initially filed in PTVS but is related to Pylance.)
Edits of purely local relevance, like adding a new line or changing code from `if i < 0:` to `if i > 0:`, cause a reevaluation of all open files (or the whole workspace, depending on the settings). Files that are neither changed nor impacted by changes are thus checked hundreds of times. If these other files take time to evaluate, even minor edits in small files take several seconds to evaluate, which is quite a nuisance.

The only solution seems to be keeping as few files open as possible, which isn't that helpful, and I do not understand the rationale: if the outcome of analyzing the file being edited depends on other files being checked, then that check should not depend on whether those files are open or not. If it does not depend on them, then why are other files checked at all?