Hi, two questions:
- Is it possible to minimise the number of tweets you get from bots when using the Twitter API?
- If not, how do you best detect them in your data?
On (2), so far I have looked for users who post duplicated tweets, which has been successful in finding a couple of bots.
Thanks!
Specifically, I have obtained data by filtering on campaign hashtags for US presidential elections. I want to find tweets from the respective voter bases, but it seems that my sample mainly consists of bots.
Firstly, many tweets come from the same users. If I remove duplicate user IDs, I am left with roughly 50% of the sample. Then, many of the users receive a high score on Botometer (roughly 0.75 to 0.8).
What should I do?
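The duplicate-tweet check described in the question could be sketched roughly as follows. This is only an illustration, not the poster's actual code: it assumes each tweet is a dict with `author_id` and `text` keys (as in v2 API responses when those fields are requested), and the function name `find_duplicate_posters` and the `min_repeats` threshold are hypothetical.

```python
from collections import Counter


def find_duplicate_posters(tweets, min_repeats=2):
    """Map each author to the texts they posted at least `min_repeats` times.

    Assumes every tweet is a dict with "author_id" and "text" keys.
    """
    # Normalise case and whitespace so trivially altered copies still match.
    counts = Counter(
        (t["author_id"], " ".join(t["text"].lower().split())) for t in tweets
    )
    flagged = {}
    for (author_id, text), n in counts.items():
        if n >= min_repeats:
            flagged.setdefault(author_id, {})[text] = n
    return flagged


# Made-up sample data, for illustration only.
sample = [
    {"author_id": "1", "text": "Vote now! #campaign"},
    {"author_id": "1", "text": "vote   now! #campaign"},
    {"author_id": "2", "text": "Watching the debate tonight"},
]
print(find_duplicate_posters(sample))
```

Repeated text alone does not prove automation, so the flagged accounts are probably best treated as candidates for manual review rather than confirmed bots.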
You can try some heuristics, like excluding tweets posted from certain apps (filtering on your end on the source: field).
I wouldn’t rely on Botometer scores blindly; there are some methodological flaws with doing that. Here’s a good discussion and paper on this:
Hi Igor - thanks! Just wondering, do you know of any literature that could help with which apps to exclude?
This will depend on what your study is and why you’re filtering, but I would generally filter out anything that’s not posted from the official Twitter clients (with a few exceptions, such as popular social media management tools like mixpanel, etc.). There are bots that automate the mobile and web clients and so appear to be tweeting from the website, but those are few in my experience. Unfortunately, I don’t have a comprehensive list.
Thanks, Igor - much appreciated! Could you please point me to any website or guide that explains how to implement this filter in the query? I have not found anything, unfortunately.
This is something you would define manually in your own code; unfortunately, there’s no search operator for specifying it in v2 search: Search Tweets - How to build a query | Docs | Twitter Developer Platform
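As a rough sketch of what filtering on your end might look like: the snippet below assumes the tweets were already collected with the `source` field requested and that each tweet is a dict with a `source` key. The `ALLOWED_SOURCES` set is an illustrative guess at official client names, not a vetted or complete list, and `filter_by_source` and `ExampleScheduler` are hypothetical names.

```python
# Illustrative allow-list of client names; adjust it to your own study.
ALLOWED_SOURCES = {
    "Twitter Web App",
    "Twitter for iPhone",
    "Twitter for Android",
    "Twitter for iPad",
}


def filter_by_source(tweets, allowed=ALLOWED_SOURCES):
    """Keep only tweets whose "source" value is in the allow-list.

    Tweets without a "source" key are dropped as well.
    """
    return [t for t in tweets if t.get("source") in allowed]


# Made-up sample data, for illustration only.
collected = [
    {"id": "10", "source": "Twitter for iPhone", "text": "At the rally"},
    {"id": "11", "source": "ExampleScheduler", "text": "Vote! Vote! Vote!"},
    {"id": "12", "text": "No source field on this one"},
]
kept = filter_by_source(collected)
print([t["id"] for t in kept])
```

An allow-list is used here instead of a block-list because the set of third-party apps is open-ended; tools you decide to keep as exceptions can simply be added to the set.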
Ah, okay. Will see if there exists some literature on this. Thanks again, Igor.
In case it helps, you can take a look at this paper:
“SOCIAL MEDIA, SENTIMENT AND PUBLIC OPINIONS: EVIDENCE FROM #BREXIT AND #USELECTION” Gorodnichenko, Pham and Talavera
They have a relatively easy way of doing it, and they also mention some literature that might come in handy.