I am working with the REST API from the statistical language R. Whichever of several packages I use, I always run into the same problem: as soon as I start dealing with emojis, they appear to arrive as UTF-16 surrogates that have each been encoded as UTF-8. This makes it hard to map them to the standard Unicode code points defined for them.
For example, you can see what I am talking about here:
As you can see in the second question, the text that comes from the REST API is “\xED\xA0\xBD”, but as far as I understand, there is no way to convert it to the high surrogate (\xED\xA0\xBD is the high surrogate U+D83D, and \xED\xB2\x83 is the low surrogate U+DC83).
You can check the equivalence here: https://codepoints.net/U+D83D
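To make the byte arithmetic concrete, here is a minimal sketch in Python. Python is used only for illustration, the function names are made up for this example, and the same bit operations could be written in R. It assumes the six bytes really are a surrogate pair whose two halves were each encoded as a 3-byte UTF-8-style sequence. It decodes each half to its 16-bit value and then combines the pair into one code point with the standard UTF-16 formula.

```python
def decode_surrogate_unit(three_bytes: bytes) -> int:
    """Decode one 3-byte sequence (1110xxxx 10xxxxxx 10xxxxxx) to a 16-bit value."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly three bytes")
    b0, b1, b2 = three_bytes
    # Keep the 4 payload bits of the lead byte and the 6 payload bits of each
    # continuation byte, then join them into one 16-bit number.
    return ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)


def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# The six bytes from the example above (assumed to arrive together).
raw = b"\xED\xA0\xBD\xED\xB2\x83"

high = decode_surrogate_unit(raw[:3])
low = decode_surrogate_unit(raw[3:])
code_point = combine_surrogates(high, low)

print(f"U+{high:04X} U+{low:04X} -> U+{code_point:04X}")
# prints "U+D83D U+DC83 -> U+1F483"
```

For these bytes the result is U+1F483, a code point outside the Basic Multilingual Plane, which is what you would expect for an emoji.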