Underscores disappearing from Arabic hashtags


#1

We’re receiving reports from users who add Arabic hashtags containing underscores: once the post reaches Twitter, the underscore is removed. One user told us that the hashtag displays correctly in our app and only loses the underscore once it reaches Twitter. They are trying to post the hashtag #عام_زايد (and say it happens with multiple hashtags), and they see the same thing when posting directly on Twitter. Is this expected behavior for Arabic-script hashtags?


#2

Can you please explain the endpoint used and (preferably) provide a code snippet or example? Thank you.


#3

Hi @andypiper, I unfortunately don’t have a code snippet, but we are using the POST statuses/update endpoint. As far as an example goes, here’s a link to one of their posts where the underscore was removed after being posted: https://twitter.com/dctabudhabi/status/951488046242856961

We did some testing: copying and pasting the text of the message into another program does show the underscore, as does the URL shown when hovering over the hashtag. So I’m wondering if this is maybe just a display issue?
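One way to make the copy/paste check less dependent on how a given program renders the text is to list the code points of the pasted string. The following is only a minimal sketch: `describe_code_points` is a hypothetical helper name, and the hashtag is written with escapes so that right-to-left rendering cannot hide or reorder the underscore.

```python
import unicodedata

def describe_code_points(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in text]

# The hashtag from this thread, spelled out as escapes (assumed to match
# the text the user typed): '#', three Arabic letters, '_', four Arabic letters.
tag = "#\u0639\u0627\u0645_\u0632\u0627\u064a\u062f"

for code_point, name in describe_code_points(tag):
    print(code_point, name)
```

If the underscore is still in the text, one of the printed lines will be U+005F LOW LINE, which would point to a rendering issue rather than the character being stripped.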


#4

This could be associated with the twitter-text parsing algorithm. I’m not sure whether that supports underscores in hashtags or in Arabic specifically.

[update: just tested that, and it seems to parse them correctly]
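For anyone who wants to try a similar check locally, here is a rough sketch. It does not use twitter-text or its actual regular expression. It is a simplified stand-in that relies on Python's `\w`, which matches Unicode letters, digits and the underscore; `extract_hashtags` is a hypothetical helper name.

```python
import re

# Simplified stand-in pattern (an assumption, NOT twitter-text's real regex):
# a '#' followed by one or more Unicode word characters, underscore included.
HASHTAG_PATTERN = re.compile(r"#(\w+)")

def extract_hashtags(text):
    """Return the hashtag bodies (without '#') found in text."""
    return HASHTAG_PATTERN.findall(text)

# The hashtag from this thread, written with escapes to avoid bidi confusion.
TAG = "\u0639\u0627\u0645_\u0632\u0627\u064a\u062f"
sample = f"Testing #{TAG} in a post"

hashtags = extract_hashtags(sample)
print(hashtags == [TAG])        # the tag is matched as one unit
print("_" in hashtags[0])       # the underscore is part of the match
```

With this simplified pattern the Arabic hashtag is matched as a single token that includes the underscore, which is at least consistent with the parser not being the cause.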

Can I confirm that this is only happening when you post via statuses/update, but not from the Twitter apps?


#5

Hi @andypiper, our client did confirm that this occurred for them posting directly from the Twitter app as well as via statuses/update.

When I copy/paste the text elsewhere, the underscore does show up, and the hashtag does appear to be one single link. Do you think this is just a display issue?


#6

That’s really weird. I can’t seem to reproduce this: I just tried posting via the API and the underscore remains (and the hashtag becomes a clickable link in the Tweet). It could indeed be a display issue. The Tweet you provided (951488046242856961) does seem to have the underscore in both the text and the hashtags field of the Tweet object.
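A check along those lines could look like the sketch below. It assumes the Tweet has already been fetched and parsed into a dict with the standard `text` and `entities.hashtags` fields. The sample dict is hand-written for illustration and is not the actual API response for that Tweet, and `underscore_survived` is a hypothetical helper name.

```python
def underscore_survived(tweet, expected_tag):
    """Check that expected_tag appears intact in the text and hashtag entities."""
    in_text = f"#{expected_tag}" in tweet.get("text", "")
    entity_tags = [h["text"] for h in tweet.get("entities", {}).get("hashtags", [])]
    return in_text and expected_tag in entity_tags

# The hashtag from this thread, written with escapes to avoid bidi confusion.
TAG = "\u0639\u0627\u0645_\u0632\u0627\u064a\u062f"

# Hand-written sample shaped like a Tweet object (illustrative only).
sample_tweet = {
    "text": f"#{TAG}",
    "entities": {"hashtags": [{"text": TAG, "indices": [0, 9]}]},
}

print(underscore_survived(sample_tweet, TAG))
```

If this returns True for the real Tweet object while the rendered Tweet still shows no underscore, the data is intact and the problem is in how the text is displayed.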