Bug with keywords prefixed with accent


#1

When I track the keyword “lula”, I get tweets containing the word “célula”. Is this right?

I don’t get the tweet if it contains “celula” (without the accent), which I think is the correct behaviour: I should only receive tweets containing the word “lula”.

Thanks


#2

I’m interested in this too - I can’t check this myself right now, but this might be a mistake in how characters are split for matching. Do you have some example tweets? Could there be a non-breaking space or some other UTF-8 weirdness in that word?

And yes, as you say, celula should not match célula: https://developer.twitter.com/en/docs/tweets/filter-realtime/guides/basic-stream-parameters#track
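
To illustrate the kind of splitting mistake I have in mind, here is a minimal Python sketch. It is only a guess at the failure mode, not how the Streaming API is actually implemented; `ascii_tokens`, `unicode_tokens` and `matches` are hypothetical names made up for this example. If a tokenizer treats only ASCII letters as word characters, the “é” in “célula” acts as a separator and “lula” appears as its own token.

```python
import re
import unicodedata


def ascii_tokens(text):
    # Hypothetical buggy tokenizer: only ASCII letters, digits and "_"
    # count as word characters, so an accented letter splits the word.
    return re.findall(r"[A-Za-z0-9_]+", text.lower())


def unicode_tokens(text):
    # Compose decomposed sequences (e + combining accent) into one code
    # point first, then split on Unicode-aware word characters.
    text = unicodedata.normalize("NFC", text)
    return re.findall(r"\w+", text.lower())


def matches(keyword, text, tokenize):
    # A keyword matches when it equals one whole token of the text.
    return keyword.lower() in tokenize(text)


print(ascii_tokens("uma c\u00e9lula"))    # accented word is split at "é"
print(unicode_tokens("uma c\u00e9lula"))  # accented word stays whole
```

With the ASCII-only splitter, “lula” matches “célula” but not “celula”, which is exactly the pattern reported above. The NFC step is there because a decomposed “é” (plain “e” followed by a combining accent) would otherwise split the word even with Unicode-aware matching.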


#3

Hi @brnalencar - which API are you using, and if possible, could you share your full request (or part) with keys/tokens redacted?


#4

Hi @Hamza and @IgorBrigadir.

I’m using the Streaming API, “statuses/filter”. To get example tweets, just use the keyword “lula” (ex-president of Brazil) in the track parameter.

Then, tweet something containing “célula” or “libélula” (words in Brazilian Portuguese). The Streaming API tracks this tweet because the word ends in “lula” and the letter right before it has an accent.

Without the accent (celula), the tweet is not tracked (the right behaviour).
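
As a possible stopgap on the client side, the unwanted tweets could be dropped after they arrive. This is only a sketch under my own assumptions: `is_whole_word_match` is a hypothetical helper, it looks only at the tweet text, and it does not try to reproduce the full matching rules of the track parameter.

```python
import re
import unicodedata


def is_whole_word_match(keyword, text):
    # Normalize both sides so composed and decomposed accents compare equal.
    keyword = unicodedata.normalize("NFC", keyword).casefold()
    text = unicodedata.normalize("NFC", text).casefold()
    # Require that the keyword is not glued to another letter or digit,
    # where "letter" includes accented characters such as "é".
    pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
    return re.search(pattern, text) is not None


tweets = ["Lula falou hoje", "uma c\u00e9lula", "lib\u00e9lula azul", "uma celula"]
kept = [t for t in tweets if is_whole_word_match("lula", t)]
print(kept)
```

Only the first tweet in the sample list is kept, because “lula” stands alone there; in the other three it is part of a longer word.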