Some outputs and downstream systems treat certain Unicode code point ranges as invalid. For example, the documentation for SQS's SendMessage endpoint carries this warning:
> A message can include only XML, JSON, and unformatted text. The following Unicode characters are allowed.
> For more information, see the [W3C specification for characters](http://www.w3.org/TR/REC-xml/#charsets).
>
> #x9 | #xA | #xD | #x20 to #xD7FF | #xE000 to #xFFFD | #x10000 to #x10FFFF
>
> If a message contains characters outside the allowed set, Amazon SQS rejects the message and returns an InvalidMessageContents error.
> Ensure that your message body includes only valid characters to avoid this exception.
While it is possible to reference individual Unicode code points in a Bloblang mapping:
root.result = this.data.contains("\uFFFE")
We appear to lack an ergonomic way to handle ranges of Unicode code points, as is possible with Go's strings.Map function.
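For comparison, here is a minimal sketch of what the range-based approach looks like in plain Go with strings.Map. The names `allowedBySQS` and `stripDisallowed` are hypothetical and exist only for illustration; they are not part of Bento or Bloblang. The ranges are copied from the SQS warning quoted above.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedBySQS reports whether r is in the character set from the SQS warning:
// #x9 | #xA | #xD | #x20 to #xD7FF | #xE000 to #xFFFD | #x10000 to #x10FFFF.
// (Hypothetical helper, named for this example only.)
func allowedBySQS(r rune) bool {
	switch {
	case r == 0x9, r == 0xA, r == 0xD:
		return true
	case r >= 0x20 && r <= 0xD7FF:
		return true
	case r >= 0xE000 && r <= 0xFFFD:
		return true
	case r >= 0x10000 && r <= 0x10FFFF:
		return true
	}
	return false
}

// stripDisallowed drops every rune outside the allowed set.
// strings.Map removes a character when the mapping function returns a negative value.
func stripDisallowed(s string) string {
	return strings.Map(func(r rune) rune {
		if allowedBySQS(r) {
			return r
		}
		return -1
	}, s)
}

func main() {
	// U+FFFE and NUL are both outside the allowed set and are removed.
	fmt.Printf("%q\n", stripDisallowed("ok\uFFFE\x00 go"))
}
```

Dropping the characters is only one possible policy. The same mapping function could instead return a replacement rune, which is the kind of choice a Bloblang-level method would presumably need to expose.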
Relevant closed PR: https://github.com/warpstreamlabs/bento/pull/717/changes