[Resolved] Entities (e.g. hashtags) are not correctly extracted from tweet text


#1

Hello!
Please take a look at the tweet with id 575668739456237600.

Text:
"#no #comment :smirk: #ne #güzel #istanbul #be #deniz #sea #stil #boğaz #hisar #üstü #bosphorus #karizma #like https://t.co/wWNTVNBKuR"

#ne has indices [15, 18]. I think they should be [16, 19]. Is it because of the smiley symbol?

Thanks.


#2

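It’s because JavaScript counts :smirk: as two characters. The emoji lies outside the Basic Multilingual Plane, so a JavaScript string stores it as a surrogate pair (two UTF-16 code units), while the indices in the tweet data appear to count it as a single character. Everything after the emoji is therefore shifted by one when you index the string in JavaScript.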
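
If you need offsets that work with JavaScript string methods, something like the sketch below may help. It assumes the reported indices count Unicode code points with an exclusive end, which is what the numbers in this thread suggest. The helper name codePointToUtf16Indices is made up for illustration and is not part of any library, and the sample text is a shortened version of the tweet with :smirk: written as U+1F60F.

```javascript
// Hypothetical helper (not a library function): converts [start, end] indices
// counted in Unicode code points into offsets counted in UTF-16 code units,
// which is what JavaScript string methods such as slice() expect.
function codePointToUtf16Indices(text, [start, end]) {
  // Array.from iterates a string by code point, so an emoji is one element.
  const codePoints = Array.from(text);
  // Number of UTF-16 code units covered by the code points in [from, to).
  const unitLength = (from, to) => codePoints.slice(from, to).join("").length;
  const utf16Start = unitLength(0, start);
  return [utf16Start, utf16Start + unitLength(start, end)];
}

// Shortened sample of the tweet text; \u{1F60F} is the smirking face emoji.
const text = "#no #comment \u{1F60F} #ne";

console.log(text.length);             // 19 (UTF-16 code units)
console.log(Array.from(text).length); // 18 (code points)

const [start, end] = codePointToUtf16Indices(text, [15, 18]);
console.log([start, end]);            // [ 16, 19 ]
console.log(text.slice(start, end));  // #ne
console.log(text.slice(15, 18));      // " #n" (the reported indices used directly)
```

With the converted offsets, slice() returns the hashtag. Using [15, 18] directly is off by one because of the surrogate pair.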