The Streaming API docs state that
a) each account may create only one standing connection to the Streaming API, and
b) excessive connection attempts will result in an automatic ban of the IP.
So if I use multiple Twitter applications and clients for the same account/screen_name, would they compete for the standing connection? And if they use the same IP address (router), would the ones trying to reconnect count as “Excessive connection attempts” and eventually get my IP banned?
In most cases for personal use you’re not going to run into problems unless the software you’re using has aggressive reconnect strategies to the Streaming API. A single IP address or account connected to User Streams isn’t likely to ruffle any feathers – you can run two streaming Twitter clients from your machine pretty easily.
The more intense parts of the streaming API like the firehose and track capabilities are really meant to be connected to sparingly and for long periods of time. Most consumer applications don’t make use of these parts of the streaming API.
Most of the protections are in place to protect against those who are circumventing rate limits and data access limitations by opening excessive simultaneous connections, or rotating through multiple accounts.
When you’ve reached too many connections for the same IP address or account, you’ll get a 420 error code on subsequent connection attempts. Responding to that 420 appropriately is important – if you continue attempting connections without adding significant waiting periods (and disconnecting unneeded connections) you can run the risk of more long-term denial of access.
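To make "adding significant waiting periods" concrete, here is a minimal sketch of an exponential backoff schedule for 420 responses. The starting delay of 60 seconds, the doubling factor, and the 16-minute cap are assumptions chosen for illustration, and `backoff_delays` and `wait_before_retry` are hypothetical helper names, not part of any Twitter library.

```python
def backoff_delays(initial=60.0, factor=2.0, cap=960.0):
    """Yield an endless sequence of wait times in seconds.

    The defaults (60 s start, doubling, 960 s cap) are illustrative
    assumptions, not values taken from this thread.
    """
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def wait_before_retry(status_code, delays):
    """Return how long to sleep before the next connection attempt.

    Only a 420 response advances the backoff schedule; any other
    status is left to the caller and returns 0.0 here.
    """
    if status_code == 420:
        return next(delays)
    return 0.0
```

A client would create one `backoff_delays()` generator per connection, call `time.sleep(wait_before_retry(status, delays))` after each failed attempt, and create a fresh generator once a connection succeeds.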
@episod, is it possible to get an account upgraded such that it can run multiple track streams simultaneously? We have been using a single stream for development which has worked well. As we plan our production environment the natural course would be to open up multiple streams. Our use case calls for up to 16 simultaneous but narrow (max we have seen is 2 tweets/second) streams. Each one starts and stops at a different time with a different set of key track terms.
We’ve looked through the dev site for directions on this with no luck. We also looked at gnip. It looks like a great service but also appears to be overkill for us at this point. Any direction you could provide would be helpful. We would be happy to share specific details of our project over email.
Am I right in understanding that I may use an account for connecting to the Streaming API, and still be able to use it for personal use on a desktop client (e.g. Twitter for Mac)? Will there be a risk of banning/disconnects? I am planning to migrate to the Streaming API soon, and I’m a bit worried about this issue.
I wouldn’t recommend this approach.
When you say you want to open 16 simultaneous streams, what’s the reasoning for that? You should combine as many predicates as you have permission for into a single outstanding stream connection, then reconnect the stream at specific intervals to change the terms. The onus would then be on your app to properly sort through the results you’re getting and bucket them the way that makes sense of your application (rather than bucketing per open stream, which is what I’m imagining you’re suggesting).
Ultimately your goal should be two-fold: combine as many predicates as possible into a single stream and to keep the connection open for as long as possible.
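As a rough sketch of the "one stream, bucket in your app" idea: the hypothetical helpers below merge several term sets into one comma-separated track list and then sort incoming tweet text back into named buckets. The bucket names and terms are made up, and the case-insensitive substring match is a simplifying assumption; the Streaming API's real matching rules for track terms are more involved.

```python
def combined_track(buckets):
    """Merge every bucket's terms into one comma-separated track string."""
    terms = {term.lower() for bucket_terms in buckets.values() for term in bucket_terms}
    return ",".join(sorted(terms))


def buckets_for(text, buckets):
    """Return the names of all buckets whose terms appear in the tweet text.

    Assumption: a plain case-insensitive substring test stands in for
    the API's actual term matching.
    """
    lowered = text.lower()
    return [
        name
        for name, bucket_terms in buckets.items()
        if any(term.lower() in lowered for term in bucket_terms)
    ]
```

For example, with `buckets = {"web": ["python", "django"], "systems": ["rust"]}`, a single connection would track `combined_track(buckets)`, and each received tweet would be routed with `buckets_for(tweet_text, buckets)` instead of relying on which connection delivered it.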
You’re correct that you can use the same account for both – as long as you aren’t aggressively reconnecting or ignoring HTTP status codes, you shouldn’t have many issues using the same account. Of course, it’s best to use a dedicated account for streaming access that is decoupled from your traditional user accounts – it separates concerns considerably and protects you against a few kinds of issues.
Thank you, I will play it safe and proceed to create separate accounts.
Similar to the other questions. If we are building a service that has authenticated clients (either OAuthed to our API key – or they could set up their own API key), each tracking a small number of hashtags, we should not open one streaming connection per client (from our server)… right? If we build one tracking list on one stream connection for all clients, obviously (1) we will need to keep reconnecting with a new tracking list each time any client makes a change, which is probably OK, and (2) after a bunch of clients we will hit the limit of 400 tags per connection. When we get to this point, is the standard Twitter policy to push people off to GNIP/DataSift? Or could we get an upgrade for our single OAuthed API key?
I came here to ask the same question as it happens. Can someone address this issue?
Is it possible / acceptable to use the Streaming API via a PHP server for all of an application’s users (server side) instead of client side? In theory this would have to open too many streams, which is frowned upon under the guidelines?
“b) excessive connection attempts will result in an automatic ban of the IP.”
Is the Streaming API only for desktop apps / client-side applications? If it was server side, would it have to be one stream per user, or is there a way to open a stream for all users? (I assume not.)
Any advice, thoughts would be useful.
There are a few different ways to approach this, but when you’re thinking about the core Streaming API (not User Streams or Site Streams), it’s essentially a server-to-server technology. The user context you’re operating in isn’t your end-user context; it’s a user dedicated to your application for working with the Streaming API. You work to keep connections open for as long as possible and queue up changes to the stream (based on your end users’ requests) so that you reconnect as seldom as possible. In this case, you’d open up a single connection to stream on behalf of many of your end users.
Site Streams operates in a similar context: you’re still primarily connecting as a user who represents the application (and owns the application) – however, you’re also providing a list of users you’re acting on behalf of (and have access tokens for). You’re then streamed the equivalent of the “me feed” from User Streams for each user you’re acting on behalf of.
The core Streaming API is limited in the percentage of the firehose you can access and in the number of distinct query terms, queried users, and geographical areas. For a lot of developers, the basic limits are sufficient for their user base – for others greater limits become ideal, or other forms of compromise like prioritizing one class of end user over another.
“For a lot of developers, the basic limits are sufficient for their user base – for others greater limits become ideal, or other forms of compromise like prioritizing one class of end user over another.”
The basic limits are free? But the greater limits are ‘pay for’ via resellers?
@episod - I have put through an application for site streams - thanks for your advice, we feel this is the right way for us going forward.
This can be a complex issue. GNIP and DataSift provide data for a number of use cases, but one of the primary areas they serve is tweets in non-display contexts or tweets in very limited (paywall, for example) contexts.
My recommendation: use the Search API for ad-hoc queries by your end-users. If they decide they want a query to be recurring, that they want to watch it from “now on” – add it to a queue you have for the next time you add additional terms to your outstanding streaming connection. Consider tiering access to your service to further allocate/direct resources.
A queuing system becomes pretty necessary when you’re working with the Streaming API. You want to limit reconnects as much as possible – one way to do that is to use two streams (different users) and use them alternately.
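One way to picture such a queue, as a sketch only: end-user changes accumulate as pending additions and removals, and the stream is reconnected with the new term list only when something is pending and a minimum interval has passed. `TrackQueue` and its method names are hypothetical, the 300-second interval is an arbitrary assumption, and the 400-term default simply mirrors the per-connection limit mentioned earlier in this thread.

```python
class TrackQueue:
    """Batch track-term changes so the stream reconnects rarely."""

    def __init__(self, min_interval=300.0, max_terms=400):
        self.active = set()          # terms on the currently open stream
        self.pending_add = set()     # terms waiting for the next reconnect
        self.pending_remove = set()  # terms to drop at the next reconnect
        self.min_interval = min_interval
        self.max_terms = max_terms
        self.last_reconnect = None   # timestamp of the last applied change

    def add(self, term):
        self.pending_remove.discard(term)
        if term not in self.active:
            self.pending_add.add(term)

    def remove(self, term):
        self.pending_add.discard(term)
        if term in self.active:
            self.pending_remove.add(term)

    def should_reconnect(self, now):
        """True only if changes are pending and the interval has elapsed."""
        if not (self.pending_add or self.pending_remove):
            return False
        if self.last_reconnect is None:
            return True
        return now - self.last_reconnect >= self.min_interval

    def apply(self, now):
        """Fold pending changes into the active set; return the new term list."""
        wanted = (self.active - self.pending_remove) | self.pending_add
        if len(wanted) > self.max_terms:
            raise ValueError("too many track terms for one connection")
        self.active = wanted
        self.pending_add.clear()
        self.pending_remove.clear()
        self.last_reconnect = now
        return sorted(self.active)
```

The caller would poll `should_reconnect(time.time())` periodically and, when it returns true, reopen the stream with the list returned by `apply(time.time())`. Alternating between two connections, as suggested above, is not modeled here.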
When you run into a situation where the basic limits are no longer adequate (not just because you want lots of breathing room), send an email to email@example.com explaining in detail how you’re currently using the Streaming API, who your users are and how they access your product, what they do with tweets within your product, and how heightened streaming limits will aid your ability to serve our mutual users. Often we’ll recommend you go to GNIP or DataSift, if it’s appropriate for your scenario.
It never hurts to make sure that tweets are fully actionable in your application: reply, retweet, favorite, follow; that your application meets display guidelines and so on.
Hope this helps you think about this. The Streaming API and OAuth API keys have very few relationships with each other when it comes to this stuff.
"you can run two streaming Twitter clients from your machine pretty easily."
Is it still possible? I have failed to maintain two filter stream connections in parallel; they got disconnected all the time. It looked to me like the Twitter API was disconnecting me, but it might be related to the client library I am using. Any opinion on this? Each stream was following approx. 1000 Twitter IDs.
I have the same issue. Running a sample stream and a filter stream with the same credentials seems to make them disconnect each other. I’m using the Phirehose library. The filter stream follows 1400 IDs. @episod, can you confirm that we can run two streams concurrently? Thanks!
The disconnect logic has been improved since his post. You can now maintain one connection per account. Connecting with the same credentials will kick off the oldest connection.
Can I use 2 Twitter developer accounts, one for the streaming API access and another for giving my users oauth access to sign up/publish posts to Twitter?
Is this against Twitter’s rules?
As long as you’re striving to maintain a single open connection to the Streaming API, it doesn’t matter if it’s the same account as the one that owns the application you use for REST API access – though it’s often better to have them be the same.
Why is it better to have them be the same?
To better identify that they’re associated with each other. Twitter doesn’t have a way for you to tell us that two accounts are part of the same application, developer, or user.