Commit 459ca3f

Add a troubleshooting guide for encoding issues (#1075)
* Add a troubleshooting guide for encoding issues
* Use vsce 1.x since 2.x is not compatible with Node 12.x
* Tune the text in the doc
* Adjust the paragraph structure per review comment
1 parent 92542b6 commit 459ca3f

3 files changed, +81 -4 lines changed

.github/workflows/build.yml

Lines changed: 3 additions & 3 deletions
@@ -35,7 +35,7 @@ jobs:
       run: npm install
 
     - name: Install build tools
-      run: npm install -g vsce typescript
+      run: npm install -g "vsce@<2" typescript
 
     - name: Lint
       run: npm run tslint
@@ -71,7 +71,7 @@ jobs:
       run: npm install
 
     - name: Install build tools
-      run: npm install -g vsce typescript --force
+      run: npm install -g vsce@">1.0 <2.0" typescript --force
 
     - name: Lint
       run: npm run tslint
@@ -107,7 +107,7 @@ jobs:
       run: npm install
 
     - name: Install build tools
-      run: npm install -g vsce typescript
+      run: npm install -g "vsce@<2" typescript
 
     - name: Lint
       run: npm run tslint

README.md

Lines changed: 2 additions & 1 deletion
@@ -127,7 +127,8 @@ Please also check the documentation of [Language Support for Java by Red Hat](ht
 Pro Tip: The documentation [Configuration.md](https://github.com/microsoft/vscode-java-debug/blob/master/Configuration.md) provides lots of samples to demonstrate how to use these debug configurations, recommend to take a look.
 
 ## Troubleshooting
-Reference the [troubleshooting guide](https://github.com/Microsoft/vscode-java-debug/blob/master/Troubleshooting.md) for common errors.
+Reference the [Troubleshooting Guide](https://github.com/Microsoft/vscode-java-debug/blob/master/Troubleshooting.md) for common errors.
+Reference the [Troubleshooting Guide for Encoding Issues](https://github.com/Microsoft/vscode-java-debug/blob/master/Troubleshooting_encoding.md) for encoding issues.
 
 ## Feedback and Questions
 You can find the full list of issues at [Issue Tracker](https://github.com/Microsoft/vscode-java-debug/issues). You can submit a [bug or feature suggestion](https://github.com/Microsoft/vscode-java-debug/issues/new), and participate community driven [![Gitter](https://badges.gitter.im/Microsoft/vscode-java-debug.svg)](https://gitter.im/Microsoft/vscode-java-debug)

Troubleshooting_encoding.md

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
# Troubleshooting Guide for Encoding Issues

This document is a guide, mostly for Windows users, to solving common Java encoding issues.

## 1. Background
Computers only understand binary data (0s and 1s), and they use a charset to encode and decode that data into real-world characters. When two processes exchange data through I/O, they must use compatible charsets for encoding and decoding; otherwise garbled characters are likely to appear. macOS and Linux generally default to UTF-8, so encoding is rarely a problem there. On Windows, however, the default charset is not UTF-8 and is platform-dependent, which can lead to inconsistent encoding between different tools.
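
The effect of a charset mismatch can be reproduced inside a single Java program. The following is a minimal sketch, not part of the extension: the class name `CharsetMismatch` and the helper `roundTrip` are hypothetical names chosen for illustration. It encodes a string with one charset and decodes the resulting bytes with another, which is roughly what happens when a JVM and a terminal disagree.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of vscode-java-debug.
class CharsetMismatch {
    // Encodes the text with the writer's charset, then decodes the bytes
    // with the reader's charset, as two processes exchanging data would.
    static String roundTrip(String text, Charset writer, Charset reader) {
        byte[] bytes = text.getBytes(writer);
        return new String(bytes, reader);
    }

    public static void main(String[] args) {
        // Two Chinese characters written as Unicode escapes, so the result
        // does not depend on the encoding of this source file.
        String text = "\u4f60\u597d";

        // Same charset on both sides: the text survives. Prints "true".
        System.out.println(text.equals(
                roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8)));

        // Different charsets: the reader misinterprets the bytes. Prints "false".
        System.out.println(text.equals(
                roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1)));
    }
}
```

The program prints booleans rather than the decoded text, so its output is readable regardless of the terminal's own code page.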

## 2. Common Problems
Below are the typical encoding problems when running a Java program in a Windows terminal.

2.1) The file or directory name contains Unicode characters, and the Java launcher cannot find the corresponding classpath or main class.
```
C:\Test>java -cp 中文目录 Hello
Error: Could not find or load main class Hello
```

```
C:\Test>java -cp ./Exercises 练习
Error: Could not find or load main class ??
Caused by: java.lang.ClassNotFoundException: ??
```

2.2) String literals with Unicode characters appear garbled when printed to the terminal.
```java
public class Hello {
    public static void main(String[] args) {
        System.out.println("你好!");
    }
}
```

```
C:\Test>java -cp ./Exercises Hello
??!
```

2.3) Characters are garbled when a Java program interacts with the terminal for I/O.

```java
import java.util.Scanner;

public class Hello {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        System.out.println(scanner.nextLine());
    }
}
```

```
C:\Test>chcp
65001

C:\Test>java -Dfile.encoding=UTF-8 -cp ./Exercises Hello
你好
��
```

## 3. Troubleshooting Suggestions
The following diagram shows the parts of the toolchain where encoding may be involved when writing and running Java in VS Code.
![image](https://user-images.githubusercontent.com/14052197/140934909-20ce8482-d39c-4c8b-a92b-2878861a5b08.png)

- During the compilation phase, the VS Code Java extension uses the file encoding from the VS Code settings to read .java source files and compile them to .class files. The encoding is therefore consistent between the editor and the Java extension.

- During the run/debug phase, the Java extension launches the application in the terminal by default. Most encoding problems occur because the terminal and the JVM use incompatible charsets for data processing, or use charsets that do not support the target Unicode characters.
  - <b>JVM</b> - Uses a default charset compatible with the Windows system locale. You can change it with the JVM argument `-Dfile.encoding`, or with the `encoding` setting in launch.json when running through the Java debugger extension.
  - <b>Windows terminals</b> - Use a code page to handle encoding. You can use the `chcp` command to view and change the code page.
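
To see which charset the JVM is actually using, and compare it with the code page reported by `chcp`, a small diagnostic program can help. This is a sketch under stated assumptions: the class name `ShowEncoding` and the helper `prop` are hypothetical, and `sun.jnu.encoding` is a JDK-internal, implementation-specific property that may be absent on some runtimes.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic class; not part of vscode-java-debug.
class ShowEncoding {
    // Returns the system property value, or a placeholder if it is not set.
    static String prop(String key) {
        return System.getProperty(key, "(not set)");
    }

    public static void main(String[] args) {
        // The charset the JVM uses when none is given explicitly.
        System.out.println("Default charset:  " + Charset.defaultCharset().name());
        // The value changed by the -Dfile.encoding JVM argument.
        System.out.println("file.encoding:    " + prop("file.encoding"));
        // Assumption: implementation-specific property, shown for comparison only.
        System.out.println("sun.jnu.encoding: " + prop("sun.jnu.encoding"));
    }
}
```

Running it once with and once without `-Dfile.encoding=UTF-8` shows whether the argument takes effect, and the names printed can be compared against the terminal's active code page.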

To solve encoding problems, the straightforward approach is to use UTF-8 across the whole toolchain. Unfortunately, Windows terminals (such as cmd) do not fully support UTF-8. The alternative is therefore to have the terminal and the JVM use compatible character sets for data processing.

### 3.1) Fix Suggestion: Change the system locale to the target language.

On Windows, changing the system locale changes the default Java charset to one compatible with that locale, and the terminal's (e.g. cmd) code page is automatically updated to match. Changing the system locale to the target language can therefore solve most encoding issues on Windows. This is also suggested on the Java site: https://www.java.com/en/download/help/locale.html

The following screenshot shows how to change the system locale in Windows. For example, to enter Chinese characters into a Java program from a terminal, set the Windows system locale to Chinese. The default Java charset will then be `GBK` and the cmd code page will be `936`, both of which support Chinese characters.
![image](https://user-images.githubusercontent.com/14052197/138408027-da71d3f4-7f64-4bfb-8b34-89d0605606f5.png)
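
Whether a given charset supports the target characters can be checked from Java before changing any system setting. The sketch below is illustrative only: `CanEncodeCheck` and `canEncode` are hypothetical names, and it assumes the charsets of interest (such as `GBK`) are present in the running JDK, which is not guaranteed for trimmed-down runtimes.

```java
import java.nio.charset.Charset;

// Hypothetical illustration class; not part of vscode-java-debug.
class CanEncodeCheck {
    // Returns true if the named charset is available in this JVM
    // and can represent every character of the text.
    static boolean canEncode(String charsetName, String text) {
        if (!Charset.isSupported(charsetName)) {
            return false;
        }
        Charset charset = Charset.forName(charsetName);
        return charset.canEncode() && charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // Two Chinese characters written as Unicode escapes, so the result
        // does not depend on the encoding of this source file.
        String text = "\u4f60\u597d";
        for (String name : new String[] {"UTF-8", "GBK", "US-ASCII"}) {
            System.out.println(name + " can encode the text: " + canEncode(name, text));
        }
    }
}
```

A charset that reports `false` here cannot carry the text between the JVM and the terminal, whichever code page the terminal uses.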
