Description
I noticed that decoding quoted-printable text containing traditional Chinese characters produces garbled output.
Examples are =CD=A8=D6=AA=EAP=EC=B6 and =BE=A7=D4=AA=8FS=B7=BF=EAP=E9]=CA=C2=ED=97, where the second byte of some GBK characters appears as a plain ASCII character (P, S, ]) instead of an =XX escape.
I have resolved this with the modification below. If the author finds it acceptable, it can be merged into the main branch.
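To make the problem concrete, here is a minimal sketch that turns the first sample above into raw bytes. The helper names `hexDigit` and `qpToBytes` are hypothetical and not part of the library; the sketch assumes soft line breaks were already removed and that unescaped characters are plain ASCII.

```dart
/// Returns the value of a hex digit code unit, or null if it is not one.
int? hexDigit(int c) {
  if (c >= 0x30 && c <= 0x39) return c - 0x30; // 0-9
  if (c >= 0x41 && c <= 0x46) return c - 0x41 + 10; // A-F
  if (c >= 0x61 && c <= 0x66) return c - 0x61 + 10; // a-f
  return null;
}

/// Hypothetical helper: converts quoted-printable text into raw bytes.
/// An =XX escape becomes one byte; any other character is taken as-is.
List<int> qpToBytes(String input) {
  final bytes = <int>[];
  var i = 0;
  while (i < input.length) {
    int? hi;
    int? lo;
    if (input[i] == '=' && i + 2 < input.length) {
      hi = hexDigit(input.codeUnitAt(i + 1));
      lo = hexDigit(input.codeUnitAt(i + 2));
    }
    if (hi != null && lo != null) {
      bytes.add(hi * 16 + lo);
      i += 3;
    } else {
      // literal character (assumed ASCII): keep its code unit as the byte
      bytes.add(input.codeUnitAt(i));
      i += 1;
    }
  }
  return bytes;
}

void main() {
  final bytes = qpToBytes('=CD=A8=D6=AA=EAP=EC=B6');
  // prints: cd a8 d6 aa ea 50 ec b6
  print(bytes.map((b) => b.toRadixString(16)).join(' '));
}
```

The byte 0x50 (the letter P) directly follows 0xEA, so the two belong to one double-byte character. A decoder that decodes each run of =XX escapes on its own separates them.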
In quoted_printable_mail_codec.dart:
`/// Decodes the specified text
///
/// [part] the text part that should be decoded
/// [codec] the character encoding (charset)
/// Set [isHeader] to true to decode header text using the Q-Encoding scheme,
/// compare https://tools.ietf.org/html/rfc2047#section-4.2
@override
String decodeText(
  final String part,
  final Encoding codec, {
  bool isHeader = false,
}) {
  final buffer = StringBuffer();
  // remove all soft-breaks:
  final cleaned = part.replaceAll('=\r\n', '');
  for (var i = 0; i < cleaned.length; i++) {
    final char = cleaned[i];
    if (char == '=') {
      final hexText = cleaned.substring(i + 1, i + 3);
      var charCode = int.tryParse(hexText, radix: 16);
      if (charCode == null) {
        buffer.write(hexText);
      } else {
        final charCodes = [charCode];
        // collect the run of consecutive =XX escapes into one byte list
        while (cleaned.length > (i + 4) && cleaned[i + 3] == '=') {
          i += 3;
          final hexText = cleaned.substring(i + 1, i + 3);
          charCode = int.parse(hexText, radix: 16);
          charCodes.add(charCode);
        }
        // Special case: the run is followed by a literal character that is
        // really the second byte of a GBK character, as in
        // =CD=A8=D6=AA=EAP=EC=B6 or
        // =BE=A7=D4=AA=8FS=B7=BF=EAP=E9]=CA=C2=ED=97
        if (cleaned.length >= (i + 4)) {
          final nextStr = cleaned.substring(i, i + 4);
          if (nextStr.startsWith('=') && !nextStr.endsWith('=')) {
            // the character directly after the last =XX escape
            final tempStr = cleaned.substring(i + 3, i + 4);
            charCode = int.tryParse(tempStr, radix: 16);
            if (charCode == null) {
              final asciiValue = tempStr.codeUnitAt(0);
              final tempList = <int>[charCodes.last, asciiValue];
              if (isGBK(tempList)) {
                // treat the literal character as a byte and skip over it
                charCodes.add(asciiValue);
                i += 1;
              }
            }
          }
        }
        try {
          final decoded = codec.decode(charCodes);
          buffer.write(decoded);
        } on FormatException catch (err) {
          print('unable to decode quotedPrintable buffer: ${err.message}');
          buffer.write(String.fromCharCodes(charCodes));
        }
      }
      i += 2;
    } else if (isHeader && char == '_') {
      buffer.write(' ');
    } else {
      buffer.write(char);
    }
  }
  return buffer.toString();
}

/// Checks whether [bytes] is a valid sequence of ASCII bytes and
/// GBK double-byte characters.
bool isGBK(List<int> bytes) {
  var i = 0;
  while (i < bytes.length) {
    final byte = bytes[i] & 0xFF;
    if (byte <= 0x7F) {
      // single-byte ASCII
      i++;
    } else {
      // the first byte of a double-byte character must be 0x81..0xFE
      if (byte < 0x81 || byte > 0xFE) {
        return false;
      }
      i++;
      if (i >= bytes.length) {
        return false;
      }
      // the second byte must be 0x40..0x7E or 0x80..0xFE
      final secondByte = bytes[i] & 0xFF;
      final validSecond = (secondByte >= 0x40 && secondByte <= 0x7E) ||
          (secondByte >= 0x80 && secondByte <= 0xFE);
      if (!validSecond) {
        return false;
      }
      i++;
    }
  }
  return true;
}
`
The main idea is to take the extra character that follows the last =XX escape, convert it to its byte value, and check whether it forms a valid GBK sequence together with the preceding byte. If it does, the byte is appended to the byte array so that both bytes are decoded together as one character, rather than being decoded separately, which results in garbled characters.
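The pair check described above can also be sketched on its own. The version below is one possible refinement, not the code from the patch: it walks the whole run of escaped bytes to find out whether the last byte is still waiting for its second byte, instead of looking only at the last byte. The helper names `isGbkLead`, `isGbkTrail`, `endsWithDanglingLead` and `shouldAppendLiteral` are hypothetical, the byte ranges are the ones used by `isGBK` in this issue, and the sketch assumes the run starts at a character boundary.

```dart
/// First-byte range, as used by the isGBK check in this issue.
bool isGbkLead(int b) => b >= 0x81 && b <= 0xFE;

/// Second-byte range, as used by the isGBK check in this issue.
bool isGbkTrail(int b) =>
    (b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFE);

/// Hypothetical helper: true when [run] ends with a first byte that still
/// needs its second byte. Assumes [run] starts at a character boundary.
bool endsWithDanglingLead(List<int> run) {
  var i = 0;
  while (i < run.length) {
    final b = run[i] & 0xFF;
    if (isGbkLead(b)) {
      if (i + 1 >= run.length) return true; // first byte without a partner
      i += 2; // skip the second byte (its validity is not checked here)
    } else {
      i += 1; // single-byte value
    }
  }
  return false;
}

/// Hypothetical helper: should the literal character [next] be appended
/// to [run] as the missing second byte?
bool shouldAppendLiteral(List<int> run, String next) =>
    next.isNotEmpty &&
    endsWithDanglingLead(run) &&
    isGbkTrail(next.codeUnitAt(0));

void main() {
  // =CD=A8=D6=AA=EA followed by P: 0xEA is unpaired, so P (0x50) joins it.
  print(shouldAppendLiteral([0xCD, 0xA8, 0xD6, 0xAA, 0xEA], 'P')); // true
  // =CD=A8 followed by P: the run is already complete, so P stays text.
  print(shouldAppendLiteral([0xCD, 0xA8], 'P')); // false
}
```

Walking the run from its start avoids treating the second byte of a complete character as the first byte of a new one, which a check on the last byte alone cannot distinguish.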
Thanks to the author for their selfless dedication.