Hello folks, I’m trying to study the number of tweets containing specific query words in each country over the last year. Right now I’m using the get_geo_tweets() function from the academictwitteR package in R.
Although only a subset of the resulting data contained exact coordinates, all of it contained a place_id. To find out what each place_id refers to, I’m using the API.geo_id(place_id) function from the tweepy module in Python 3.
This does give me the country for each place_id. However, the GET geo/id/:place_id endpoint has a rate limit of 75 requests per 15-minute window, so this approach is extremely time-consuming: I have to look up thousands of place_ids for my project. Is there a way to get what I want faster?
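One speed-up worth noting regardless of endpoint: tweets from the same place share a place_id, so deduplicating before looking anything up can cut the request count dramatically. A minimal sketch (plan_lookups is a hypothetical helper of my own, not part of tweepy):

```python
# Deduplicate place_ids and estimate lookup time under the rate limit.
# plan_lookups is a hypothetical helper, not part of any library.

def plan_lookups(place_ids, per_window=75, window_min=15):
    """Return the unique ids and a rough worst-case time estimate
    under a rate limit of `per_window` requests per `window_min` minutes."""
    unique_ids = sorted(set(place_ids))
    windows = -(-len(unique_ids) // per_window)  # ceiling division
    return unique_ids, windows * window_min

# Many tweets, few distinct places: only 3 lookups are actually needed.
ids = ["01a", "01a", "02b", "03c", "01a"]
unique_ids, minutes = plan_lookups(ids)
print(len(unique_ids), minutes)  # 3 15
```

Caching the id → country mapping between runs (e.g. in a dict dumped to disk) helps the same way across queries.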
I’m studying tweets from every country, so searching Tweets from specific countries does not suit that well for me. Although I can see how it can serve as a way to circumvent my current problem (by searching Tweets from every single country), I have no idea how to do it.
Also, do people know why Twitter doesn’t just put out a dictionary-ish thing for the place_ids?
If you use fields and expansions (see “Using fields and expansions” in the Twitter Developer Platform docs), you should already have the place objects in the original responses; you do not need to look them up at all.
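For anyone doing this over raw HTTP rather than through a client library, a sketch of what that looks like: the v2 search endpoints accept expansions=geo.place_id and place.fields=country,country_code, and the matching place objects come back under includes.places. The helper function names and the BEARER_TOKEN placeholder below are my own; only the endpoint and parameter names come from the v2 docs.

```python
# Sketch: fetch tweets with their place objects included via expansions,
# so no separate geo/id/:place_id lookups are needed.
import json
import urllib.parse
import urllib.request


def build_params(query):
    """Ask the v2 search endpoint to attach place objects inline."""
    return {
        "query": query,
        "expansions": "geo.place_id",            # pull in referenced places
        "place.fields": "country,country_code",  # include country per place
    }


def tweet_countries(response_json):
    """Join each tweet in a v2 response to the country of its place."""
    places = {p["id"]: p.get("country")
              for p in response_json.get("includes", {}).get("places", [])}
    return [places.get((t.get("geo") or {}).get("place_id"))
            for t in response_json.get("data", [])]


def search_with_places(query, bearer_token):
    url = ("https://api.twitter.com/2/tweets/search/recent?"
           + urllib.parse.urlencode(build_params(query)))
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {bearer_token}"})
    with urllib.request.urlopen(req) as resp:
        return tweet_countries(json.load(resp))


# Offline example of the join logic, using a hand-made response:
sample = {
    "data": [{"id": "1", "geo": {"place_id": "abc"}}],
    "includes": {"places": [{"id": "abc", "country": "Japan",
                             "country_code": "JP"}]},
}
print(tweet_countries(sample))  # ['Japan']
```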
I’m not too familiar with the R library, but it sounds like an omission not to include the place objects that come back from the API.
I had a quick look and the library does not appear to be processing place objects? https://github.com/cjbarrie/academictwitteR/blob/master/R/get_geo_tweetsv2.R
I would recommend using twarc instead to gather the data, then load it into R for analysis if you need to.
Thank you so much, IgorBrigadir! In case other people have the same problem, I’ll document what I did:
I used twarc2, which comes with twarc. The twarc2 configuration step was buggy when I tried it in Git Bash on my Windows 10 laptop, and also for a friend who uses Jupyter Notebook: there should be a prompt where you can enter your bearer_token, but on those platforms that prompt never appeared.
I ended up using Google colab. Here’s the code that worked for me:
!pip install twarc
!twarc2 configure  # prompts for your bearer token

word = 'asianfood'
q = 'has:geo #' + word  # restrict to tweets that carry geo information

# full-archive search over one month (requires Academic Research access)
!twarc2 search '$q' --archive --start-time 2020-03-01T00:00:00 --end-time 2020-04-01T00:00:00 > raw_output.json

# merge the expansions (including place objects) into each tweet object
!twarc2 flatten 'raw_output.json' '$word'_output.json
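In case it helps anyone post-processing the flattened file: each line is one tweet as JSON, with the expansions merged in. A sketch of counting tweets per country, assuming flatten leaves the place’s country field under the tweet’s geo object (inspect one line of your own output and adjust the path if it differs):

```python
# Sketch: count tweets per country from a flattened twarc2 JSONL file.
# Assumes `twarc2 flatten` merged the place object's fields into each
# tweet's `geo` object; check your own file and adjust if needed.
import json
from collections import Counter


def count_countries(jsonl_lines):
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        geo = json.loads(line).get("geo") or {}
        if geo.get("country"):
            counts[geo["country"]] += 1
    return counts


# Hand-made sample lines standing in for a real flattened file:
sample = [
    '{"id": "1", "geo": {"place_id": "abc", "country": "Japan"}}',
    '{"id": "2", "geo": {"place_id": "def", "country": "Japan"}}',
]
print(count_countries(sample))  # Counter({'Japan': 2})
```

With a real file you would pass the open file object, e.g. `count_countries(open('asianfood_output.json'))`.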
The project yielded interesting results. Thank you again!