Can I get blacklisted for getting one 429 status code per window?


#1

Hi,

It seems I got blacklisted. I’d like to fix this ASAP but I need to understand why I’ve been blacklisted.

The logic I’ve been applying to get data from the API was to perform requests until I get a 429, then sleep for the indicated number of seconds (plus a little extra), then resume data extraction.

So, I was getting one 429 per window.
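The post doesn’t say where the “given amount of seconds” comes from; in Twitter API v1.1 the window reset time is exposed through the `x-rate-limit-reset` response header (epoch seconds). A minimal sketch of that sleep computation, assuming that header is the source:

```java
// Sketch of the sleep step in the "request until 429, then wait" strategy.
// resetEpochSeconds would come from the x-rate-limit-reset response header.
public class RateLimitSleep {

    // Milliseconds to sleep until the window resets, plus a safety margin
    // (the "little more" mentioned in the post).
    public static long sleepMillisUntilReset(long resetEpochSeconds,
                                             long nowEpochSeconds,
                                             long marginMillis) {
        long waitMillis = (resetEpochSeconds - nowEpochSeconds) * 1000L;
        return Math.max(0L, waitMillis) + marginMillis;
    }

    public static void main(String[] args) {
        // Window resets 90 seconds from "now"; add a 2-second margin.
        long now = 1_000_000L;
        long reset = now + 90;
        System.out.println(sleepMillisUntilReset(reset, now, 2_000L)); // 92000
    }
}
```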

  1. Could that be the reason why I got blacklisted?
  2. Does the blacklisting expire after a specified period of time?
  3. What is the process to follow in order to be removed from the list? I’m quite open to fixing my code, but I really need to know what I’ve been doing wrong.

Thanks.


#2

Bump?


#3

If you’re on shared hosting, it’s possible to get blacklisted due to abusive activity anywhere else in the IP block.

But generally you wouldn’t get blacklisted for doing what you describe above. Which API methods are you using? Are you handling other error conditions well, like when you get other 400 or 500 series codes? How aggressively are you making requests within a rate limit window?

Are you crawling data? We don’t typically allow for serial collection of data.


#4

For some reason I got removed from the blacklist. Maybe the blacklisting isn’t permanent but only lasts a couple of days.

Now, specifically answering your questions:
Q: Which API methods are you using?
A: GET followers/ids and GET users/lookup

Q: Are you handling other error conditions well, like when you get other 400 or 500 series codes?
A: I can’t recall receiving any other 4xx condition. I occasionally get a 503 (“The Twitter servers are up, but overloaded with requests. Try again later.”). In that case I immediately retry the request (should I add a sleep there too?)
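On retrying 503s: retrying immediately can make an overloaded server worse. A common general-purpose pattern, not anything Twitter-specific, is to back off exponentially between attempts. A minimal sketch of the delay schedule:

```java
// Exponential backoff delays for retrying 503 responses.
// The delay doubles per attempt (0-based) and is capped at maxMillis.
public class Backoff {
    public static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis << Math.min(attempt, 20); // clamp shift to avoid overflow
        return Math.min(delay, maxMillis);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            System.out.println(delayMillis(i, 1_000L, 60_000L)); // 1000, 2000, 4000, 8000, 16000
        }
    }
}
```

Before each retry, the caller would sleep for `delayMillis(attempt, ...)` and give up after some maximum number of attempts.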

Q: How aggressively are you making requests within a rate limit window?
A: I believe this is the key factor, if I’m not misunderstanding the question. I basically try to use up my requests as quickly as possible (no sleep between requests) and then wait for the rest of the window. Twitter would probably expect me to sleep between requests, so that instead of hitting the server heavily at the beginning of the window and not at all for the rest of it, I would spread my requests evenly across the whole window. Is this correct?

Q: Are you crawling data?
A: I’m actually testing a Java library (Twitter4j) and some modifications I made to its code and sent to the author of the library. So I was hitting the API heavily to check that the data I was fetching matched what I could actually see on the page, and also running some other internal code tests. I guess that from Twitter’s perspective this could have looked like a crawl. I also started and stopped the processes many times during testing, which resulted in making the same call against the same user more than once. I think this could also be related to the blacklisting.

So far, I added a sleep between calls and everything seems to be OK. I’m also not exhausting all the calls in a window so that should help too.

Btw, thanks for your reply @episod


#5

Hi, did you ever figure this out?
I’m actually doing the exact same thing: running my program until I get the 429 response and then putting it to sleep for 15 minutes. When it resumes, it is still out of requests, so I changed the sleep time to 20 minutes, and when it tries to continue it gets another 429 response. Not sure what I’m doing wrong; any help would be appreciated! (I’m just searching Twitter handles.)
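A more robust alternative to guessing a fixed sleep duration is to read the rate-limit headers Twitter v1.1 returns on each response (`x-rate-limit-remaining` and `x-rate-limit-reset`) and pause only when the budget is exhausted, exactly until the window resets. A minimal sketch, with the header values passed in as plain numbers:

```java
// Pause decision based on Twitter's documented rate-limit response headers.
// remaining  = value of the x-rate-limit-remaining header
// resetEpoch = value of the x-rate-limit-reset header (epoch seconds)
public class HeaderGuard {
    public static long pauseMillis(int remaining, long resetEpoch, long nowEpoch) {
        if (remaining > 0) {
            return 0L; // budget left in this window, no need to pause
        }
        return Math.max(0L, (resetEpoch - nowEpoch) * 1000L) + 1_000L; // 1 s margin
    }

    public static void main(String[] args) {
        System.out.println(pauseMillis(5, 2_000L, 1_000L)); // 0
        System.out.println(pauseMillis(0, 2_000L, 1_900L)); // 101000
    }
}
```

Checking `remaining` before each call means the program never actually triggers a 429, and sleeping until the server-reported reset time avoids both undershooting and oversleeping a hardcoded 15 or 20 minutes.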
I am actually doing the exact same thing-- running my program until I get the 429 response and then putting it to sleep for 15 minutes. When it continues, it is still out of requests, though, so I changed the sleep time to 20 minutes and when it tries to continue, it gets another 429 response. Not sure what I’m doing wrong, any help would be appreciated! (I’m just searching twitter handles)