This was raised in another topic in 2017, but it still seems to have no answer.

I’m using twitteroauth for PHP.

The indices for hashtags etc. become unreliable when emojis are present in a tweet. The emoji seems to be counted as one character, but that character does not seem to be taken into account when calculating the start and end indices for hashtags, user_mentions, etc.

In the example below, the indices returned in the JSON block select ‘ @bcui’ instead of ‘@bcuic’, because the emoji at the start of the tweet seemingly counts as one character that is not accounted for in the index values:

\ud83c\udf40Good luck to our UK colleges @bcuic & …ect

Here’s one more example with a screen shot of my render.
"full_text": "Will be interesting to see how the #AFL suspends Toby Greene this week…\ud83e\udd14 #getcreative #AFLFinals

"text": "getcreative",
"indices": [
75,
87
]

Should I look to strip out emojis from the full_text result?
Should I follow up the links on this page…

The algorithms documented and implemented in the twitter-text library are the correct way to handle counting emoji characters. We don’t currently ship a PHP implementation of this algorithm, but it should be possible to derive one from the other code samples.
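As a stopgap in PHP, here is a minimal sketch of applying the indices. It assumes that the entity indices count Unicode code points (so an emoji sent as a surrogate pair such as \ud83c\udf40 counts as one) and that the text is UTF-8 after json_decode. It is not a port of twitter-text. The helper name entity_text is hypothetical, and the indices [30, 36] were worked out by hand for the sample tweet from the question rather than copied from an API response. The mbstring extension is required.

```php
<?php
// Hypothetical helper: return the slice of $text covered by an entity's
// [start, end) indices, treating the indices as Unicode code points.
function entity_text(string $text, array $indices): string
{
    [$start, $end] = $indices;
    // mb_substr counts characters (code points) in the given encoding,
    // whereas plain substr counts bytes and would cut in the wrong place.
    return mb_substr($text, $start, $end - $start, 'UTF-8');
}

// json_decode turns the \ud83c\udf40 surrogate pair into one UTF-8 emoji.
$tweet = json_decode(
    '{"full_text":"\ud83c\udf40Good luck to our UK colleges @bcuic"}',
    true
);
$text = $tweet['full_text'];

// Indices computed by hand for this sample (assumption, not API output).
echo entity_text($text, [30, 36]), PHP_EOL; // prints "@bcuic"
```

If the offsets are later handed to something that counts UTF-16 code units (JavaScript strings, for example), each emoji outside the Basic Multilingual Plane takes two units there, which would shift every later index by one, consistent with the ‘ @bcui’ result described above.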
