Excluding tweets with certain words does not work as expected


While trying to clean a stream of tweets with our hashtag, we encountered an odd issue with the exclude filter.

Our search query looks like this: #hastag -cam -sexy -dating -naked. It filters out the tweets containing “naked”, but not the ones containing “cam”, “sexy”, or “dating”.

Is this an issue with our query, or are we missing something?


Can you provide details of the actual request you’re making to the API?

Note that the track= parameter of the statuses/filter streaming API doesn’t support negation.
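Since negation isn’t available on track=, one option is to track the hashtag alone and drop unwanted tweets on the client side. A minimal sketch, assuming you already have each tweet’s text as a string; the function name passes_filter is made up for illustration and is not part of any Twitter library:

```python
import re

def passes_filter(text, excluded_words):
    """Return True if none of the excluded words occurs as a whole word in text."""
    # Split into lowercase word tokens so "-cam" does not also reject "camera".
    words = set(re.findall(r"\w+", text.lower()))
    return words.isdisjoint(w.lower() for w in excluded_words)

excluded = ["cam", "sexy", "dating", "naked"]
print(passes_filter("New camera day #hastag", excluded))
print(passes_filter("Free cam show #hastag", excluded))
```

This only compares exact code points, so it has the same blind spot as any plain keyword match: a word spelled with look-alike characters from another script will still pass.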


It actually happens on the advanced Twitter search too, so I might as well give that as an example.


There are at least two “cam” tweets and one “dating” tweet in the first few results.


Thanks, that’s a great reproducible example. Checking internally; stay tuned.


On closer inspection, it looks like some of these tweets use homoglyphs (characters from other scripts that look identical to Latin letters) to escape filtering.

For instance, in this tweet the “c” of “cam” is actually Unicode U+0441, “CYRILLIC SMALL LETTER ES” and not a lowercase “c” at all… which is why it’s not filtered out by “-cam”.
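If you want to check a suspicious word yourself, here is a small sketch using Python’s standard unicodedata module. The helper name describe_non_ascii is my own, and the sample string is built by hand to mimic the tweet rather than copied from it:

```python
import unicodedata

def describe_non_ascii(text):
    """Return (character, code point, Unicode name) for each non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if ord(ch) > 127
    ]

# Looks like "cam", but the first letter is Cyrillic es (U+0441).
suspicious = "\u0441am"
print(describe_non_ascii(suspicious))
# → [('с', 'U+0441', 'CYRILLIC SMALL LETTER ES')]
```

A genuine Latin “cam” returns an empty list, so any output at all for an English-looking word is a hint that something is being disguised.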

We’ll need to look into homoglyph canonicalization in our search pipeline.
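To give an idea of what canonicalization means here, below is a rough sketch in Python. It is not how our search pipeline works; the CONFUSABLES table is a tiny hand-picked sample of Cyrillic look-alikes chosen for illustration, and a real implementation would more likely draw on the Unicode confusables data from UTS #39.

```python
import unicodedata

# Hand-picked sample of Cyrillic letters that resemble Latin ones (illustrative only).
CONFUSABLES = str.maketrans({
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
    "\u0443": "y",  # CYRILLIC SMALL LETTER U
})

def canonicalize(text):
    """Fold compatibility forms, lowercase, then map known look-alikes to Latin."""
    # NFKC handles things like fullwidth letters; it does NOT map Cyrillic to Latin,
    # which is why the explicit table is still needed.
    folded = unicodedata.normalize("NFKC", text).lower()
    return folded.translate(CONFUSABLES)

print(canonicalize("\u0441am") == "cam")
```

Both the indexed text and the query terms would need to go through the same function for “-cam” to match the disguised spelling.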


At least we can filter these specific tweets out for now by explicitly excluding each homoglyph spelling we have seen, but I suppose the bots have outsmarted us again here.
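For anyone wanting to do the same, this is roughly how the extra exclusion terms could be generated. It is only a sketch: the LOOKALIKES table covers a handful of Cyrillic letters I picked by hand, the helper names are my own, and I have not checked how many terms a single search query will accept.

```python
from itertools import product

# Latin letter -> the spellings to try (itself plus a Cyrillic look-alike).
LOOKALIKES = {
    "a": "a\u0430",
    "c": "c\u0441",
    "e": "e\u0435",
    "o": "o\u043e",
    "p": "p\u0440",
}

def spelling_variants(word):
    """Return every spelling of word obtained by swapping in known look-alikes."""
    choices = [LOOKALIKES.get(ch, ch) for ch in word]
    return ["".join(combo) for combo in product(*choices)]

def exclusion_query(base, words):
    """Build a search string that excludes every variant of every word."""
    terms = [f"-{variant}" for word in words for variant in spelling_variants(word)]
    return " ".join([base, *terms])

print(len(spelling_variants("cam")))
print(exclusion_query("#hastag", ["cam"]))
```

The number of variants doubles with every replaceable letter, so this gets out of hand quickly for longer words and is really only a stopgap.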

Thanks for the quick support and looking into this.