Statuses/update can return a 187 error for non-duplicate tweets


#1

When I tweet a single Unicode character (say U+1F601) and immediately tweet another Unicode character (U+1F602), the second tweet always fails with a duplicate tweet error (187).

I think something is wrong here because I also observed the following:

  1. When I repeat the above scenario but wait 10 seconds between tweets, there is no failure.
  2. When I repeat the above scenario (no wait) but use two different ASCII characters instead of Unicode, there is no failure.

#2

Twitter can use some fuzzy, multi-stage logic to determine which content it considers duplicative, and when. I don’t think there’s anything unusual going on here.


#3

It seems odd that the logic only produces bad results with Unicode.

These tweets are seen as duplicates:
“I am [smiley emoji] today”
“I am [frowny emoji] today”

But these are not:
“I am X today”
“I am Y today”


#4

Episod, are you sure about the “fuzzy, multi-stage logic” explanation? I ask because I thought of another possible explanation: the emoji I am testing with lie outside the Basic Multilingual Plane, so each one is a surrogate pair in UTF-16. Maybe the initial test for duplication only looks at the first code unit? Since each of these emoji is made up of two UTF-16 code units and the leading part is the same for the ones I tested (the high surrogate 0xD83D in UTF-16, or the bytes \xF0\x9F in UTF-8), that might explain why all emoji appear to be duplicates.
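
To make the idea concrete, here is a small Python sketch of the two characters from the first post. It only shows how they are encoded and what a check that stops after the leading unit would conclude. The `looks_duplicate` function is purely hypothetical and made up for illustration; nothing here is known about how Twitter actually detects duplicates, and it only models the single-character case.

```python
# The two characters from the original report: U+1F601 and U+1F602.
first = "\U0001F601"
second = "\U0001F602"


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def looks_duplicate(a, b):
    """Hypothetical buggy check (an assumption, not Twitter's code):
    compare only the first UTF-16 code unit of each string."""
    return utf16_units(a)[:1] == utf16_units(b)[:1]


# Both emoji share the high surrogate 0xd83d and differ only in the low one.
print([hex(u) for u in utf16_units(first)])   # ['0xd83d', '0xde01']
print([hex(u) for u in utf16_units(second)])  # ['0xd83d', '0xde02']

# In UTF-8 they are four bytes each and differ only in the last byte.
print(first.encode("utf-8").hex())   # f09f9881
print(second.encode("utf-8").hex())  # f09f9882

print(looks_duplicate(first, second))  # True: the emoji collide
print(looks_duplicate("X", "Y"))       # False: ASCII characters do not
```

If something like this were happening, it would match observation 2 in the first post (ASCII characters are fine), but it would not by itself explain why waiting 10 seconds makes the error go away.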