Extra compatibility handling for decoding

I noticed that when decoding real-world messages, some Traditional Chinese characters come out garbled.

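This happens with input such as =CD=A8=D6=AA=EAP=EC=B6 or =BE=A7=D4=AA=8FS=B7=BF=EAP=E9]=CA=C2=ED=97, where an escaped byte is directly followed by an unescaped character (`P`, `S`, `]`).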
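To illustrate what goes wrong, here is a minimal sketch that turns a quoted-printable string into raw bytes without applying any charset. The helper names `hexPair` and `qpToBytes` are hypothetical and not part of the library, and soft line breaks are ignored for brevity. It shows that the literal `P` after `=EA` is really the second byte (0x50) of a two-byte character, so it has to be decoded together with the preceding byte.

```dart
/// Hypothetical helper: parses exactly two hex digits, otherwise returns null.
int? hexPair(String s) =>
    RegExp(r'^[0-9A-Fa-f]{2}$').hasMatch(s) ? int.parse(s, radix: 16) : null;

/// Hypothetical helper: converts quoted-printable text to raw bytes.
/// Soft line breaks are not handled in this sketch.
List<int> qpToBytes(String text) {
  final bytes = <int>[];
  var i = 0;
  while (i < text.length) {
    if (text[i] == '=' && i + 3 <= text.length) {
      final value = hexPair(text.substring(i + 1, i + 3));
      if (value != null) {
        bytes.add(value);
        i += 3;
        continue;
      }
    }
    // Printable ASCII is transmitted unescaped, even when it is the
    // second byte of a multi-byte character.
    bytes.add(text.codeUnitAt(i));
    i++;
  }
  return bytes;
}

void main() {
  final bytes = qpToBytes('=CD=A8=D6=AA=EAP=EC=B6');
  // prints: cd a8 d6 aa ea 50 ec b6
  print(bytes.map((b) => b.toRadixString(16).padLeft(2, '0')).join(' '));
}
```

The fifth and sixth bytes (0xEA, 0x50) belong to one character. If the `P` is written to the output on its own, the lone 0xEA can no longer be decoded.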

I have fixed this with the modification below. If the maintainer finds it acceptable, it can be merged into the main branch.


The change is in quoted_printable_mail_codec.dart:

`/// Decodes the specified text
  ///
  /// [part] the text part that should be decoded
  /// [codec] the character encoding (charset)
  /// Set [isHeader] to true to decode header text using the Q-Encoding scheme,
  /// compare https://tools.ietf.org/html/rfc2047#section-4.2
  @override
  String decodeText(
    final String part,
    final Encoding codec, {
    bool isHeader = false,
  }) {
    final buffer = StringBuffer();
    // remove all soft-breaks:
    final cleaned = part.replaceAll('=\r\n', '');
    for (var i = 0; i < cleaned.length; i++) {
      final char = cleaned[i];
      if (char == '=') {
        final hexText = cleaned.substring(i + 1, i + 3);
        var charCode = int.tryParse(hexText, radix: 16);
        if (charCode == null) {
          buffer.write(hexText);
        } else {
          final charCodes = [charCode];
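          // collect all directly following escaped bytes; the length check
          // ensures that both hex digits are present before reading them
          while (cleaned.length >= (i + 6) && cleaned[i + 3] == '=') {
            final nextCode =
                int.tryParse(cleaned.substring(i + 4, i + 6), radix: 16);
            if (nextCode == null) {
              // malformed escape: stop here and let the outer loop handle it
              break;
            }
            i += 3;
            charCodes.add(nextCode);
          }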
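          // Special case: an escaped byte is followed by an unescaped character
          // that is the second byte of a GBK character, for example
          // =CD=A8=D6=AA=EAP=EC=B6 or =BE=A7=D4=AA=8FS=B7=BF=EAP=E9]=CA=C2=ED=97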
          if (cleaned.length >= (i + 4)) {
            String nextStr = cleaned.substring(i, i + 4);
            if (nextStr.startsWith('=') && !nextStr.endsWith("=")) {
              String tempStr = cleaned.substring(i + 3, i + 4);
              charCode = int.tryParse(tempStr, radix: 16);
              if (charCode == null) {
                int asciiValue = tempStr.codeUnitAt(0);
                List<int> tempList = [charCodes.last, asciiValue];
                if (isGBK(tempList)) {
                  charCodes.add(asciiValue);
                  i += 1;
                }
              }
            }
          }
          try {
            final decoded = codec.decode(charCodes);
            buffer.write(decoded);
          } on FormatException catch (err) {
            print('unable to decode quotedPrintable buffer: ${err.message}');
            buffer.write(String.fromCharCodes(charCodes));
          }
        }
        i += 2;
      } else if (isHeader && char == '_') {
        buffer.write(' ');
      } else {
        buffer.write(char);
      }
    }

    return buffer.toString();
  }

  bool isGBK(List<int> bytes) {
    int i = 0;
    while (i < bytes.length) {
      int byte = bytes[i] & 0xFF;
      if (byte <= 0x7F) {
        i++;
      } else {
        if (byte < 0x81 || byte > 0xFE) {
          return false;
        }
        i++;
        if (i >= bytes.length) {
          return false;
        }
        int secondByte = bytes[i] & 0xFF;
        if (!((secondByte >= 0x40 && secondByte <= 0x7E) || (secondByte >= 0x80 && secondByte <= 0xFE))) {
          return false;
        }
        i++;
      }
    }
    return true;
  }
`

The main idea: when an escaped byte is followed by an unescaped character, take that character's ASCII value and check whether it forms a valid GBK byte pair with the preceding byte. If it does, append it to the byte array so that it is decoded together with the other bytes. Writing it to the output as a separate character is what produced the garbled text.
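As a minimal sketch of that check in isolation: the function name `isGbkPair` is hypothetical, and the byte ranges are the ones used by `isGBK` in the patch above (first byte 0x81 to 0xFE, second byte 0x40 to 0x7E or 0x80 to 0xFE).

```dart
/// Hypothetical helper: true if [lead] and [trail] fall into the
/// double-byte ranges that the patch's isGBK accepts.
bool isGbkPair(int lead, int trail) {
  final leadOk = lead >= 0x81 && lead <= 0xFE;
  final trailOk = (trail >= 0x40 && trail <= 0x7E) ||
      (trail >= 0x80 && trail <= 0xFE);
  return leadOk && trailOk;
}

void main() {
  // =EAP: 0xEA followed by 'P' (0x50) is accepted as a pair.
  print(isGbkPair(0xEA, 'P'.codeUnitAt(0))); // true
  // =E9]: 0xE9 followed by ']' (0x5D) is accepted as a pair.
  print(isGbkPair(0xE9, ']'.codeUnitAt(0))); // true
  // A space (0x20) is outside the second-byte range.
  print(isGbkPair(0xEA, ' '.codeUnitAt(0))); // false
  // 'A' (0x41) is plain ASCII and cannot be a first byte.
  print(isGbkPair(0x41, 'P'.codeUnitAt(0))); // false
}
```

Only when the pair is accepted is the unescaped character's byte appended to `charCodes`. Otherwise it is left for the normal character handling in the loop.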

Thanks to the author for their selfless dedication.

Extra compatibility handling for decoding #261
