It was very hard to figure out why � is invalid. Both the XML 1.0 and 1.1 specs say that a CharRef can be '&#' [0-9]+ ';' (sec 4.1), but there is a note on the side that it must also be a valid Char. The Char definition (sec 2.2) then lays out which characters are valid. (The 1.0 and 1.1 specs disagree though.) I've gone with the 1.1 definition.
I did not update the rest of the reader to disallow those characters in pcdata segments. If this hurts you or you morally disagree, submit another PR, please.