How do I access a higher volume of tweets in real time, beyond the default 1%?


#1

I am one of the original authors of Spark Streaming (https://spark.apache.org/streaming/), and I would like to use a higher volume of streaming tweets to develop applications. I know that this question has been asked many times in this forum, but most, if not all, of those threads seem to be inconclusive. Please help!


#2

They’re not inconclusive, actually. All of them since the Twitter ToS change state the same three tenets:

  • Elevated access (“privileged” access, if you prefer) is gone
  • The 1% you get “for free” is not usable for commercial purposes, and is also limited to 400 keywords on track
  • Anything beyond the 1% cap should be acquired through a third-party data provider - Gnip ( http://www.gnip.com/ ) or DataSift ( http://www.datasift.com/ )

Bear in mind that 1% of Twitter is already big. If that’s still not enough for you, I strongly suggest you check out Gnip or DataSift. I’m only familiar with Gnip’s product, as we use them as our data provider, and they’re reliable and efficient.


#3

Hi @SebRenauld, if I use the basic-level Streaming API, which falls under the 1% limit and has limited keywords and followers, can I use those tweets in my application?
My application pulls stock market data from various sites and shows it on a dashboard.
I am fine with limited access to tweets unless my clients demand access to more data, in which case I know they need to enter into a contract with Gnip or DataSift.
So I wanted to know: do I need to pay Twitter for access to the basic Streaming APIs and for giving data to my clients, even if it is less than 1%?
Please help.


#4

At present I am using the streaming API to receive tweets automatically.
I have attached an example of the code. Please check whether it is useful for you.

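//Requires: using System; using System.IO; using System.Net; using System.Text; using Newtonsoft.Json.Linq;
public string MakeStreamRequest(TwitterRequest twitterRequest, int wait)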
{

        string ReturnValue;
        HttpWebRequest webRequest;
        StreamWriter requestWriter;
        string responseData = "";

        string methodPath = DefineRequestPath(twitterRequest.RequestMethod.ToString());
        string requestUrl = BuildRequestURL(twitterRequest.StreamApiUrlHost, twitterRequest.ResponseType, twitterRequest.UserId, methodPath);
        string postData = BuildParameters(twitterRequest.APIversion, requestUrl, twitterRequest.ConsumerKey, twitterRequest.ConsumerSecret, twitterRequest.Token, twitterRequest.TokenSecret, (twitterRequest.RequestMode == RequestMode.QueryString) ? "GET" : "POST", twitterRequest.UrlParameters, methodPath);


        CookieContainer cookies = new CookieContainer();

        if (twitterRequest.RequestMode == RequestMode.AuthHeaders)
        {
            webRequest = (HttpWebRequest)WebRequest.Create(requestUrl);
        }
        else
        {
            webRequest = (HttpWebRequest)WebRequest.Create(requestUrl + postData);
        }
        webRequest.Method = (twitterRequest.RequestMode == RequestMode.QueryString) ? "GET" : "POST";
        webRequest.ContentType = "application/x-www-form-urlencoded";
        //webRequest.CookieContainer = cookies 

        //Eventually we might implement post data once Twitter has provided 
        //methods that work with post, currently there are only retrieve/get methods. 
        //------------------------------------------------------------- 

        if (twitterRequest.RequestMode == RequestMode.AuthHeaders &&
            twitterRequest.UrlParameters != null)
        {
            string vars = postData;
            if (vars.Length > 1)
            {
                vars = vars.Substring(1);
            }

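            //Content-Length is a byte count; vars.Length is only correct for pure ASCII data.
            webRequest.ContentLength = Encoding.UTF8.GetByteCount(vars);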
            webRequest.Timeout = 1000000;
            webRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
            requestWriter = new StreamWriter(webRequest.GetRequestStream());
            requestWriter.Write(vars);
            requestWriter.Close();
        }

        try
        {

            using (var webResponse = (HttpWebResponse)webRequest.GetResponse())
            {
                System.Console.WriteLine("Response content encoding :" + webResponse.ContentEncoding);
                Encoding encode = System.Text.Encoding.GetEncoding("utf-8");

                using (var responseStream = new StreamReader(webResponse.GetResponseStream(), encode))
                {
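                    //Stop when the application asks to stop or the server closes the stream.
                    while (!MaxTweetApp.IsStopped && !responseStream.EndOfStream)
                    {
                        try
                        {
                            ParseString(responseStream);
                            //System.Threading.Thread.Sleep(1000);
                        }
                        catch (IOException ex)
                        {
                            //The connection dropped; leave the loop instead of spinning on a dead stream.
                            Console.WriteLine(ex.Message);
                            break;
                        }
                    }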

                }
            }
            

        }
        catch (WebException ex)
        {
            Console.WriteLine(ex.Message);

            if (ex.Status == WebExceptionStatus.ProtocolError)
            {
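                //-- From Twitter Docs --
                //When a HTTP error (> 200) is returned, back off exponentially.
                //Perhaps start with a 10 second wait, double on each subsequent failure,
                //and finally cap the wait at 240 seconds.
                //Exponential Backoff
                //Note: wait is passed by value, so the caller only sees the new delay
                //if it is returned or the parameter is changed to ref.
                if (wait < 10000)
                {
                    wait = 10000;
                }
                else
                {
                    //Math.Min keeps the doubled value from overshooting the 240 second cap.
                    wait = Math.Min(wait * 2, 240000);
                }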
            }
            else
            {
                //-- From Twitter Docs --
                //When a network error (TCP/IP level) is encountered, back off linearly.
                //Perhaps start at 250 milliseconds and cap at 16 seconds.
                //Linear Backoff
                //Math.Min keeps the increased value from overshooting the 16 second cap.
                wait = Math.Min(wait + 250, 16000);

            }
        }
        catch (Exception ex)
        {
            //Log unexpected failures instead of discarding them silently.
            Console.WriteLine(ex.Message);
            responseData = null;
        }

        ReturnValue = responseData;

        return ReturnValue;
    }


    private void ParseString(StreamReader responseStream)
    {
        string content;
        content = responseStream.ReadLine();

        //ReadLine returns null at the end of the stream, so check for null as well as empty.
        if (!string.IsNullOrEmpty(content))
        {
            try
            {
                Newtonsoft.Json.Linq.JObject jObject = JObject.Parse(content);
                if (jObject["text"] != null)
                {
                    OnStatusUpdate((string)jObject["text"], (string)jObject["user"]["screen_name"]);
                }
            }
            catch (System.Exception ex)
            {
                Console.WriteLine(DateTime.Now.ToString() + " " + ex.ToString());
            }
        }
    }
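
The backoff rules quoted in the comments above can be pulled out into a small helper, so the reconnect loop can actually use the updated delay. This is only a sketch under assumptions: the names StreamBackoff and NextDelay are made up for illustration and are not part of the code above or of any Twitter library. The limits are the ones from the quoted guidance.

```csharp
using System;

// Hypothetical helper: computes the delay before the next reconnect attempt.
public static class StreamBackoff
{
    // Limits taken from the guidance quoted in the comments above.
    private const int HttpStartMs = 10000;   // first wait after an HTTP error
    private const int HttpCapMs = 240000;    // cap for HTTP errors
    private const int NetworkStepMs = 250;   // increment per network error
    private const int NetworkCapMs = 16000;  // cap for network errors

    // previousMs is the delay used before the last attempt (0 on the first failure).
    public static int NextDelay(int previousMs, bool isHttpError)
    {
        if (isHttpError)
        {
            // Exponential: start at 10 s, double on each failure, cap at 240 s.
            if (previousMs < HttpStartMs)
            {
                return HttpStartMs;
            }
            return Math.Min(previousMs * 2, HttpCapMs);
        }

        // Linear: add 250 ms on each failure, cap at 16 s.
        return Math.Min(previousMs + NetworkStepMs, NetworkCapMs);
    }
}

public static class Program
{
    public static void Main()
    {
        int wait = 0;
        for (int attempt = 1; attempt <= 6; attempt++)
        {
            wait = StreamBackoff.NextDelay(wait, isHttpError: true);
            Console.WriteLine("HTTP failure " + attempt + ": wait " + wait + " ms");
        }
    }
}
```

A caller would sleep for the returned delay before reconnecting, keep the value for the next failure, and reset it to 0 once a connection succeeds.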

#5

You do not need to pay; however, there are strict restrictions on what you can do with the data. From memory, you cannot:

  • Export the data in any other form than the tweet IDs for any client
  • Provide single-user based data guessed from your tweets (but you can aggregate)
  • Display anything without following the (insanely strict, but rightly so) Twitter ToS. This includes a bird logo slapped right beside each and every tweet, unambiguously pointing out that it is from Twitter

Ignore the C# copy-paste in the previous post. And Gnip doesn’t have that restriction, by the way, but Gnip costs a pretty penny.


#6

Hi
I’m reviving this old thread as it matches my question. I have a banking customer in India, and we are using the streaming API to access the relevant data for analytics. Do I need to pay for any service? At this point we may be fine with the 1% and 400-keyword limit. Please help.
Regards
Rahul