Getting strange characters in full_text from REST API


#1

Hi, I have a problem when using /statuses/user_timeline.

The original tweet is (handshake symbol)We’ll see you next time.
What I get in the full_text field of the API response is 🤝We’ll see you…

There are a few other posts about this from a few years ago… So is the issue still not resolved?


#2

This probably depends on the encoding you are using.

Can you provide specific Tweet IDs that demonstrate this issue? What code are you using to read the timeline?


#3

Thanks for the reply, and sorry for sounding rude in my reply in the other thread.

The tweet url is https://twitter.com/echorealty/status/829116715510415360

In the full text field, what I get is

"full_text": "RT @echorealty: And just like that, another #MidAtConf is in the books. Thanks to #ICSC for another great deal-making date! 🤝We'll see you…",

I am using this package https://www.npmjs.com/package/twitter to get the timeline response and no code is specified so I assume it is using their default code…

Edit: it is a retweet, and its id is 829116821349552130. The tweet URL above is the original tweet.


#4

I think this is a language or encoding thing. Here’s what I see if I pull that RT using twurl (block below scrolls right for the full value).

$ twurl "/1.1/statuses/show/829116715510415360.json?tweet_mode=extended"
{
  "created_at": "Tue Feb 07 23:56:43 +0000 2017",
  "id": 829116715510415360,
  "id_str": "829116715510415360",
  "full_text": "And just like that, another #MidAtConf is in the books. Thanks to #ICSC for another great deal-making date! 🤝We'll see you next time. https://t.co/Q3U4LPfiyv",
  "truncated": false,
...

So there’s an emoji in there, but that’s the only “unusual” character. twurl is built on the Ruby twit gem so I assume there’s a difference in the way that the twitter node module is encoding the text value in the response.
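One way "strange characters" like this can arise, purely as an illustration and not a diagnosis of the twitter node module, is that the emoji's UTF-8 bytes get decoded with a single-byte encoding somewhere along the way. A minimal Node.js sketch, assuming Latin-1 as the wrong decoder (nothing in this thread confirms which encoding, if any, was involved):

```javascript
// Illustration only: decode the UTF-8 bytes of the handshake emoji (U+1F91D)
// with the wrong encoding and see what comes out.
const original = "\u{1F91D}We'll see you next time.";

// Encode correctly as UTF-8, then decode (incorrectly) as Latin-1.
const utf8Bytes = Buffer.from(original, "utf8");
const garbled = utf8Bytes.toString("latin1");

// Latin-1 maps each byte to exactly one character, so the mistake is reversible.
const repaired = Buffer.from(garbled, "latin1").toString("utf8");

console.log(utf8Bytes.subarray(0, 4).toString("hex")); // f09fa49d (the emoji's four UTF-8 bytes)
console.log(JSON.stringify(garbled.slice(0, 4)));      // four separate characters instead of one emoji
console.log(repaired === original);                    // true
```

If the text you receive looks like the `garbled` value here, the bytes themselves are probably intact and only the decoding step is wrong.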


#5

Thanks for the investigation. The problem seems to lie in JavaScript's native JSON.stringify function. I followed http://stackoverflow.com/questions/4901133/json-and-escaping-characters and use the following function as a workaround…

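// Serialize to JSON; unless emit_unicode is true, escape every character
// from U+007F upward as \uXXXX so the output is plain ASCII.
function JSON_stringify(s, emit_unicode)
{
   var json = JSON.stringify(s);
   return emit_unicode ? json : json.replace(/[\u007f-\uffff]/g,
      function(c) {
        // Zero-pad the UTF-16 code unit to four hex digits.
        return '\\u'+('0000'+c.charCodeAt(0).toString(16)).slice(-4);
      }
   );
}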
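
As a quick check of what this kind of workaround produces, here is a small self-contained sketch. The helper name `toAsciiJson` and the sample object are hypothetical; the helper applies the same replacement as the function above, and the last lines confirm that JSON.parse recovers the original string:

```javascript
// Hypothetical helper: same escaping idea as the function above.
function toAsciiJson(value) {
  return JSON.stringify(value).replace(/[\u007f-\uffff]/g, (c) =>
    "\\u" + c.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

// Sample data for illustration; "\u{1F91D}" is the handshake emoji.
const tweet = { full_text: "\u{1F91D}We'll see you next time." };
const escaped = toAsciiJson(tweet);

// The emoji lies outside the Basic Multilingual Plane, so it becomes
// two \uXXXX escapes (a UTF-16 surrogate pair).
console.log(escaped); // {"full_text":"\ud83e\udd1dWe'll see you next time."}

// Parsing the ASCII-only JSON gives back the original text.
const roundTrip = JSON.parse(escaped);
console.log(roundTrip.full_text === tweet.full_text); // true
```

The escaped form is still valid JSON, so any consumer that parses it sees the same emoji; only the bytes sent or stored are restricted to ASCII.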

Another strange thing: in the JSON block you posted above, I only see a square symbol in place of the emoji in Chrome on Mac, but in mobile Safari I can actually see the handshake emoji. My Chrome version is 56.0, so it supports emoji for sure. This is probably off topic, but do you have any insight into this?


#6

For anyone else having the same issue of desktop browsers not showing emojis from Twitter: it seems that Twitter's emoji and the OS's emoji are not identical, so maybe that's why some systems cannot display them? I am using https://github.com/twitter/twemoji to convert Twitter emoji to img tags and unify the display behaviour across all platforms.