Hello
I run a Python script with an academic licence. I have around 10’000 conversation_ids and I want to download them all. I run a loop and query the search endpoint. The problem is that after a while (maybe 20 ids) the API starts to return result_count=0 for each request in the loop. If I stop the script and run it again, the API returns conversations again.
Any ideas?
Thanks in advance
Sounds like you may be running into a rate limit. It’s hard to say without seeing the implementation.
Twarc can do this too, in case you’re not using it already: twarc2 - twarc. There’s a command for fetching these, and twarc can also be used as a library.
I don’t think a rate limit is the problem. I immediately get search results back when I restart the script.
Here is the code:
def search_tweet(self, params, search_type):
    search_url = ""
    if search_type == "full":
        search_url = "https://api.twitter.com/2/tweets/search/all"
        start_time = str(datetime.datetime(2021, 1, 1).replace(microsecond=0).isoformat()) + "Z"
        params["start_time"] = start_time
        params["max_results"] = 500
    if search_type == "standard":
        search_url = "https://api.twitter.com/2/tweets/search/recent"
        params["max_results"] = 100
    tweets = []
    next_token = 0
    while next_token != 1:
        # twitter request limit 1s per request
        time.sleep(1.1)
        if next_token != 0:
            params["next_token"] = next_token
        response = requests.request("GET", search_url, headers=self.create_headers(), params=params)
        if response.status_code != 200:
            raise Exception(response.status_code, response.text)
        if int(response.headers["x-rate-limit-remaining"]) == 0:
            time.sleep(960)
        response = response.json()
        if int(response["meta"]["result_count"]) == 0:
            break
        if "next_token" in response["meta"]:
            next_token = response["meta"]["next_token"]
        else:
            next_token = 1
        tweets += self.merge_full_search(response)
    return tweets
I’m pretty sure the issue is this: the x-rate-limit-remaining header can be 0 here without any error, but the wait time should not be 960 seconds, that’s the rate-limit window for the v1.1 API. The wait time should come from the headers too. It may be only a couple of seconds.
According to GET /2/tweets/search/all | Docs | Twitter Developer Platform, the rate limit is 300 requests every 15 minutes (900 seconds).
Here is an example header response:
{
"date":"Tue, 31 Aug 2021 11:56:00 UTC",
"server":"tsa_o",
"set-cookie":"personalization_id=\"XXXX\"; Max-Age=63072000; Expires=Thu, 31 Aug 2023 11:56:00 GMT; Path=/; Domain=.twitter.com; Secure; SameSite=None, guest_id=XXXX; Max-Age=63072000; Expires=Thu, 31 Aug 2023 11:56:00 GMT; Path=/; Domain=.twitter.com; Secure; SameSite=None",
"content-type":"application/json; charset=utf-8",
"cache-control":"no-cache, no-store, max-age=0",
"content-length":"668",
"x-access-level":"read",
"x-frame-options":"SAMEORIGIN",
"content-encoding":"gzip",
"x-xss-protection":"0",
"x-rate-limit-limit":"300",
"x-rate-limit-reset":"1630411820",
"content-disposition":"attachment; filename=json.json",
"x-content-type-options":"nosniff",
"x-rate-limit-remaining":"296",
"strict-transport-security":"max-age=631138519",
"x-connection-has
Yes, what I mean is that you can check the x-rate-limit-reset time and use that, in case it’s a shorter wait than the 15 minutes. But those headers are sometimes flaky, so you should also try to retry a failed call much sooner: usually it will just fail again, but sometimes it will work and you can continue. This complicates the retry logic but does speed things up.
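The wait calculation described above could be sketched like this (a minimal, hypothetical helper, not code from the thread; the `margin` and `window` values are assumptions). It trusts x-rate-limit-reset when it is present and sane, and falls back to the full 15-minute window otherwise:

```python
import time


def compute_wait(headers, now=None, margin=5, window=900):
    """Seconds to sleep until the rate limit resets.

    Reads the x-rate-limit-reset header (a Unix timestamp). Because
    these headers are sometimes flaky, the result is clamped: never
    less than `margin` seconds, never more than the full `window`.
    """
    now = int(time.time()) if now is None else now
    try:
        reset_at = int(headers["x-rate-limit-reset"])
    except (KeyError, ValueError):
        return window  # header missing or malformed: wait out the window
    return min(max(reset_at - now + margin, margin), window)
```

On a 429 you could first retry after `margin` seconds (the optimistic early retry mentioned above) and only sleep for the full `compute_wait(...)` result if that fails again.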
I updated the code like this:
time.sleep(int(response.headers["x-rate-limit-reset"])-int(time.time())+5)
I added 5 seconds as a margin, but that doesn’t solve my issue. After a few conversation_ids the API just returns empty responses like this:
{
"meta":{
"result_count":0
}
}
When I abort the Python script and restart it, the API immediately starts to return responses again.
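Since restarting the script makes the very same request succeed, one workaround (a hypothetical sketch, not the poster’s code) is to treat result_count=0 as possibly transient: retry the request a few times before trusting the empty response. `fetch` stands in for whatever callable issues the request and returns the decoded JSON body:

```python
import time


def fetch_until_nonempty(fetch, attempts=3, delay=5):
    """Retry a search call that transiently returns result_count == 0.

    `fetch` is any zero-argument callable returning the decoded JSON
    response body. An empty result is only accepted after `attempts`
    empty responses in a row; otherwise the first non-empty body wins.
    """
    body = {}
    for i in range(attempts):
        body = fetch()
        if int(body.get("meta", {}).get("result_count", 0)) > 0:
            return body
        if i < attempts - 1:
            time.sleep(delay)
    return body  # still empty after all attempts: genuinely no results
```

If the restart really is what fixes it, opening a fresh requests.Session inside `fetch` for each retry might mimic that effect, but that is speculation.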
Strange - is there a way to see the exact call being made? Do you have the conversation ids? I’d like to try it to see if twarc does the same thing - but if the API itself is giving you inconsistent results, there’s not much you can do.
I’ll use twarc myself to query the conversation_ids and check whether I get different results.
Were you able to get this issue resolved? I am facing the same problem :(