Streaming API for Data Analysis

streaming
api

#1

Hi All,

Just wanted to be sure before running the Streaming API. We have to create a pilot/POC for demonstration purposes and would like to use the Streaming API for some data analysis. At this point, we are fine with the restrictions of the Streaming API, i.e. 1% of the data in the entire Twitter firehose. If things go well, we will move to the full Twitter firehose, i.e. by using Gnip.

We would like to test the following:

  • Running the streaming API for a longer period
  • Single account credentials (authentication & keys) and one live connection
  • Richness of the data

Please guide!

Best Regards


#2

If you’re OK with the limitation on volume of data available, then the other points should not be an issue for you. I’d recommend against frequent reconnections to the Streaming API endpoint, as that can lead to issues.
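
To illustrate the point about reconnections, here is a minimal sketch of reconnecting with exponential backoff instead of retrying immediately. The names `backoff_delays`, `run_with_reconnect` and `connect` are hypothetical, and the 5 second start, doubling factor and 320 second cap are assumptions for the example; check the Streaming API connection documentation for the recommended values.

```python
import time
from itertools import islice


def backoff_delays(initial=5.0, factor=2.0, cap=320.0):
    """Yield wait times in seconds, growing by `factor` and never exceeding `cap`."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def run_with_reconnect(connect, max_attempts=8, sleep=time.sleep):
    """Call connect() until it succeeds, waiting longer after each failure.

    `connect` is a placeholder for whatever opens and consumes the stream;
    it is assumed to raise ConnectionError when the connection drops.
    """
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up after the last allowed attempt
            sleep(next(delays))


# First few waits under the assumed defaults
print(list(islice(backoff_delays(), 4)))
```

Passing `sleep` as a parameter keeps the sketch easy to try without actually waiting between attempts.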


#3

Thanks Andy, this really helps! : )

  • Do you suggest we stop/start the Streaming API every now and then, e.g. stop it every 12 hrs?
  • We want to run the Streaming API for a few days. What is the typical lag/delay between a tweet being posted and it appearing in the stream?
  • Are there any other critical things/issues to keep in mind?

This will give us a good start to evaluate/demonstrate the real-time streaming capability and the potential use of the full firehose in the near future.

Thanks again for all your help!

Best Regards


#4

You can read about connections to the Streaming API for answers to those questions.


#5

You don’t have to stop and restart the stream as long as you can consume it. Keep it open as long as you wish.

The delay is minimal. Maybe a few seconds if that.

Keep in mind the Tweets will stream as fast as you can download them. Shouldn’t be an issue for your servers. More of a warning if you are demoing off a laptop on a cellular connection.
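
A rough sketch of what "consume it fast enough" can look like: read lines off the connection on one thread and do the slower processing on another, so the reader never stalls. Everything here is an assumption for illustration: `consume`, `handle` and `max_backlog` are hypothetical names, `lines` stands for any iterable of raw JSON lines from the stream, and blank lines are treated as keep-alives to skip.

```python
import json
import queue
import threading


def consume(lines, handle, max_backlog=10000):
    """Read lines as they arrive and process them on a separate thread.

    Returns the number of lines dropped because the backlog was full.
    """
    backlog = queue.Queue(maxsize=max_backlog)
    done = object()  # sentinel telling the worker to stop

    def worker():
        while True:
            item = backlog.get()
            if item is done:
                break
            handle(json.loads(item))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()

    dropped = 0
    for line in lines:
        if not line.strip():
            continue  # assumed keep-alive: nothing to process
        try:
            backlog.put_nowait(line)  # never block the reader
        except queue.Full:
            dropped += 1  # processing is falling behind the stream

    backlog.put(done)
    thread.join()
    return dropped
```

If the dropped count keeps climbing, processing is slower than the stream and needs to be made cheaper or moved elsewhere.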


#6

Thanks Jonathan! Thanks Andy!

One final question: using the Streaming API, we want to develop the demo as a proof of concept (POC) for one of our clients. As mentioned earlier, for the POC we are fine with 1% of the Twitter data and the 400 keyword limit. Is there any other risk here, e.g. legal?

Please guide! Thanks for all the help.

Best Regards


#7

For a POC you should be ok. If you plan on building a business or product around the 1% stream I would encourage you to review everything under the Developer Terms of Use. If you eventually go the Gnip route, you can discuss data licensing with a representative at that time.