ArrayIndexOutOfBoundsException thrown for invalid ending XML string when using JDK default Stax XML parser #618

@arthurscchan

Description

In the XmlTokenStream::_collectUntilTag method, a while (true) loop reads the provided XML input event by event through _xmlReader.next(). The loop exits only through return statements, when next() returns one of the event types XMLStreamConstants.START_ELEMENT, XMLStreamConstants.END_ELEMENT or XMLStreamConstants.END_DOCUMENT. If the provided XML string is invalid and never produces one of those events, the loop keeps calling next() until the input is exhausted, and the call eventually throws ArrayIndexOutOfBoundsException when _xmlReader has nothing more to return. Several other methods likewise depend on seeing END_ELEMENT to stop before reading out of bounds. A simple fix would be to wrap the ArrayIndexOutOfBoundsException in a JsonParseException.

        CharSequence chars = null;
        while (true) {
            switch (_xmlReader.next()) {
            case XMLStreamConstants.START_ELEMENT:
                return (chars == null) ? "" : chars.toString();

            case XMLStreamConstants.END_ELEMENT:
            case XMLStreamConstants.END_DOCUMENT:
                return (chars == null) ? "" : chars.toString();

            // note: SPACE is ignorable (and seldom seen), not to be included
            case XMLStreamConstants.CHARACTERS:
            case XMLStreamConstants.CDATA:
                // 17-Jul-2017, tatu: as per [dataformat-xml#236], need to try to...
                {
                    String str = _getText(_xmlReader);
                    if (chars == null) {
                        chars = str;
                    } else  {
                        if (chars instanceof String) {
                            chars = new StringBuilder(chars);
                        }
                        ((StringBuilder)chars).append(str);
                    }
                }
                break;
            default:
                // any other type (proc instr, comment etc) is just ignored
            }
        }
    }
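
The suggested fix could look roughly like the sketch below. It is only an illustration under stated assumptions, not the actual patch: `CollectUntilTagSketch` and `ParseFailure` are hypothetical names, `ParseFailure` stands in for Jackson's JsonParseException so that the example needs only the JDK, and `reader.getText()` replaces the library's `_getText` helper.

```java
import java.io.IOException;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class CollectUntilTagSketch {

    /** Hypothetical stand-in for Jackson's JsonParseException. */
    static class ParseFailure extends IOException {
        ParseFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Collects text until the next start tag, end tag or end of document.
     * An ArrayIndexOutOfBoundsException escaping from the underlying
     * parser is reported as a checked parse failure instead.
     */
    static String collectUntilTag(XMLStreamReader reader)
            throws IOException, XMLStreamException {
        StringBuilder chars = new StringBuilder();
        try {
            while (true) {
                switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                case XMLStreamConstants.END_ELEMENT:
                case XMLStreamConstants.END_DOCUMENT:
                    return chars.toString();
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                    chars.append(reader.getText());
                    break;
                default:
                    // any other event type (proc instr, comment etc) is ignored
                }
            }
        } catch (ArrayIndexOutOfBoundsException e) {
            // Assumption: the parser failed internally on malformed input
            throw new ParseFailure(
                    "Internal processing error in XMLStreamReader", e);
        }
    }
}
```

With this shape, callers that already handle IOException see a regular parse error for malformed input rather than an unchecked exception from inside the parser. The original cause stays attached for debugging.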

We found this issue with OSS-Fuzz; it is reported in https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=64655, https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=64667 and https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=64659.

Metadata

Labels: 2.17 (Issues planned at earliest for 2.17)
