Yarchik 0aee1d5382 Do not drop the character after U+FFFE or U+FFFF in encodeToXmlString
encodeToXmlString detects surrogate pairs with the guard
`char > 0xd7ff && (char < 0xe000 || char > 0xfffd)` and then does `i++` to step
over the low surrogate. That predicate is also true for U+FFFE and U+FFFF, which
are single UTF-16 code units, not surrogate pairs. The `i++` then skips the
character that follows them, so it is silently dropped.

For example, encodeToXmlString of U+FFFF followed by "A" returned "&#xFFFF;"
instead of "&#xFFFF;A". The function serializes XML text nodes and attribute
values in xml_parser.js and xfa_object.js, so this corrupts round-tripped XML
and XFA content.
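
The skipped index can be reproduced in isolation. This is a minimal sketch, not
the real function: `visitedIndices` and `oldGuard` are hypothetical names used
only for illustration, and only the predicate itself comes from the description
above.

```javascript
// Hypothetical helper: records which indices a loop visits when it applies
// a guard and then does `i++` on a match. Not the real encodeToXmlString.
function visitedIndices(str, isPair) {
  const visited = [];
  for (let i = 0; i < str.length; i++) {
    visited.push(i);
    if (isPair(str.codePointAt(i))) {
      i++; // step over what is assumed to be a low surrogate
    }
  }
  return visited;
}

// The guard quoted above, wrapped in a function for the demonstration.
const oldGuard = char => char > 0xd7ff && (char < 0xe000 || char > 0xfffd);

// U+FFFF is a single code unit, yet the guard matches it.
visitedIndices("\uffffA", oldGuard); // [0]: index 1 ("A") is never visited

// U+1F600 is a real pair: only its low surrogate at index 1 is skipped.
visitedIndices("\u{1F600}A", oldGuard); // [0, 2]
```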

The correct test for a surrogate pair is `char > 0xffff`, since codePointAt
returns a value at or above 0x10000 only for a real pair. This preserves the
existing behavior for emoji, the U+FFFD boundary, and lone surrogates, and stops
dropping the character after U+FFFE and U+FFFF.
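
A simplified sketch of the corrected guard in context. `encodeSketch` is a
hypothetical stand-in, not the real encodeToXmlString: it assumes that only
printable ASCII passes through unchanged and it ignores named entities such as
`&amp;`.

```javascript
// Simplified sketch, NOT the real encodeToXmlString: printable ASCII is
// copied as-is, everything else becomes a numeric character reference.
function encodeSketch(str) {
  let out = "";
  for (let i = 0; i < str.length; i++) {
    const char = str.codePointAt(i);
    if (char >= 0x20 && char <= 0x7e) {
      out += str[i];
      continue;
    }
    out += `&#x${char.toString(16).toUpperCase()};`;
    if (char > 0xffff) {
      i++; // codePointAt consumed two UTF-16 code units
    }
  }
  return out;
}

encodeSketch("\uffffA");    // "&#xFFFF;A"
encodeSketch("\ufffeA");    // "&#xFFFE;A"
encodeSketch("\ufffdA");    // "&#xFFFD;A"
encodeSketch("\u{1F600}A"); // "&#x1F600;A"
```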
2026-07-02 14:03:49 +01:00