Search not Working for Unicode Character in source: Field


#1

I can search on Twitter (or in my application) by source using a query such as “source:Zendesk AND what”. This will bring up tweets posted from Zendesk and containing the word “what”.

One source I can’t find ANY way to search for is Adobe Social. I believe it is represented in the Twitter source field as “Adobe® Social”. The registered-trademark symbol seems to break the search functionality.

A search query such as “source:“Adobe® Social” AND what” does not work. Neither does it work when I remove the registered symbol.

How can I use search to find tweets posted by Adobe Social?


#2

That particular search filter is a bit over-simplified and, because of the way application names are flattened for that filter, will probably be difficult to use with an application that has special characters in its name. If you control this application, you would be better served by tracking tweets from the program itself as they are created, rather than trying to track them after the fact using the Search API. The Search API has many quirks in its indexing techniques and, overall, is not an exhaustive source of tweets.

If tracking by the source tag is important to you for this application, your best bet is probably to simplify the name of your application so that it does not include the special character.


#3

As I do not control Adobe Social, is there a way to track the creation of its tweets?


#4

Partner providers of Twitter data, like Datasift and Gnip, have some more fine-grained options for digging into/tracking tweets.

Given the way that the Search service indexes tweets, I would never use the Search API in a case where you’re trying to track “all” tweets that match condition XYZ. You’ll almost always want to use a Streaming API for a scenario like that, but our Streaming API doesn’t support tracking on individual field values like this.

Since you’re already using the Search API and might not have completeness as a requirement, have you considered just listening to the ~1% statuses/sample stream and collecting tweets that have a specific value for the “source” field?
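As a rough illustration of that idea, here is a minimal Python sketch of the client-side filter. It assumes you already have the sample stream arriving as newline-delimited JSON (connecting and authenticating are left out), and that the "source" field holds the application name wrapped in an HTML anchor tag. The helper names `source_name` and `tweets_from_source` are made up for this example. Because the comparison happens in your own code, a character like ® is matched as ordinary text.

```python
import html
import json
import re
import unicodedata

# Matches any HTML tag, e.g. the <a ...> wrapper around the application name.
_TAG_RE = re.compile(r"<[^>]+>")


def source_name(tweet):
    """Return the plain application name from a decoded tweet's "source" field."""
    raw = tweet.get("source") or ""
    text = html.unescape(_TAG_RE.sub("", raw))
    # Normalize so that equivalent Unicode spellings compare equal.
    return unicodedata.normalize("NFC", text).strip()


def tweets_from_source(lines, wanted):
    """Yield decoded tweets whose source name equals `wanted` (case-insensitive)."""
    wanted = unicodedata.normalize("NFC", wanted).strip().casefold()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank keep-alive lines
        message = json.loads(line)
        # Non-tweet messages (e.g. delete notices) have no "source" and never match.
        if source_name(message).casefold() == wanted:
            yield message
```

You would pass it whatever iterable yields the raw lines from your stream connection, for example `tweets_from_source(stream_lines, "Adobe® Social")`, where `stream_lines` stands in for your own connection code.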


#5

I hadn’t considered that. For reference, that would entail using the streaming API, sampling, and then filtering by source field, correct? That would be an interesting way to keep tabs on overall usage of different Twitter sources.

Anyways, I’ve decided it’s not imperative to search for sources that use irregular characters. Thanks for the help.


#6

Yep, exactly. Just listening to a steady sample stream of tweets and either collecting the tweets themselves based on criteria you’re interested in or otherwise tabulating derived data like the frequency of specific sources.
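For the tabulating side, something along these lines could work. It is only a sketch under the same assumptions: the input is newline-delimited JSON from the sample stream, the "source" value is the application name inside an HTML anchor tag, and `count_sources` is a hypothetical name chosen for this example.

```python
import json
import re
from collections import Counter


def count_sources(lines):
    """Count how often each application name appears in a run of stream lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank keep-alive lines
        message = json.loads(line)
        source = message.get("source")
        if source is None:
            continue  # not a tweet (e.g. a delete notice)
        # Drop the surrounding HTML tag to leave just the application name.
        name = re.sub(r"<[^>]+>", "", source).strip()
        counts[name] += 1
    return counts
```

Calling `count_sources(stream_lines).most_common(10)` would then list the ten most frequent sources seen so far. Keep in mind these are counts within the sample only, not totals for all of Twitter.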