Streaming filter

php

#1

I’m currently doing an internship in which I have to build a Twitter density map. I’ve used the free 140dev.com source code for the Twitter database server (http://140dev.com/free-twitter-api-source-code-library/twitter-database-server/).

My aim is to get all the tweets in a specific area (Dublin and its suburbs) and display them on a map at the place they are associated with.
In the file get_tweet.php of 140dev.com, I’ve replaced the line “$stream->setTrack(array(…” with “$stream->setLocations(array…”. So I use a function from the Phirehose library to filter tweets according to their geographical origin.
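For reference, here is a minimal sketch of how the rectangle passed to setLocations() could be built. Phirehose expects each box as four numbers in the order longitude/latitude of the south-west corner, then longitude/latitude of the north-east corner. The Dublin coordinates below are rough illustrative values, and makeBoundingBox() is a hypothetical helper, not part of Phirehose or 140dev.

```php
<?php
// Approximate, illustrative bounding box for Dublin and its suburbs.
// These values are assumptions for this sketch; adjust them to your own area.
const DUBLIN_WEST  = -6.55; // minimum longitude
const DUBLIN_SOUTH = 53.20; // minimum latitude
const DUBLIN_EAST  = -6.04; // maximum longitude
const DUBLIN_NORTH = 53.65; // maximum latitude

/**
 * Hypothetical helper: build one bounding box in the order the Streaming API
 * expects (south-west corner first, longitude before latitude).
 */
function makeBoundingBox(float $west, float $south, float $east, float $north): array
{
    if ($west >= $east || $south >= $north) {
        throw new InvalidArgumentException('The south-west corner must come before the north-east corner.');
    }
    return array($west, $south, $east, $north);
}

$dublinBox = makeBoundingBox(DUBLIN_WEST, DUBLIN_SOUTH, DUBLIN_EAST, DUBLIN_NORTH);

// In the 140dev script this would take the place of the setTrack() call:
// $stream->setLocations(array($dublinBox));
```

A common mistake is to pass latitude first; the check in makeBoundingBox() only catches swapped corners, not swapped axes, so it is worth verifying the box on a map.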

I’d like to catch all the tweets that are geolocated in, or associated with a place in, the Dublin area (which I defined as a rectangle with lat/long coordinates in the setLocations function). After searching on the Internet, I found that the Streaming API only gives access to a sample of the tweets being created. Some sources state that this sample is about 1% of all tweets worldwide.
But I found a post (https://twittercommunity.com/t/limit-on-streaming-tweets/8453/3) on the Twitter dev forum where a Twitter staff member explains that if a filter is set and the tweets matching the filter parameters amount to less than 1% of all tweets worldwide, then I’ll get all of the matching tweets.

My first question may seem stupid: will I get all the tweets that are geolocated in the Dublin area with my code?
I believe that the number of tweets matching my request (geolocated tweets in the Dublin area, or tweets associated with a place in that area) is under 1% of all tweets worldwide, but I’m not completely sure that setLocations is the kind of filter the forum post refers to. In science, especially when someone tries to popularise a theory, words are often misused.

Finally, I’m using a MySQL database, but I was advised to use a PostgreSQL database with pgAdmin 3 in order to work with map data and place tweets on a map. The problem is that PostgreSQL is more difficult to use: I have never used it, and I found no tutorial about using it with the Twitter Streaming API.
My second question is: do you think it’s possible to continue my project with a MySQL database (I’m supposed to use Leaflet to work with map and geo data, but I haven’t begun this part yet), or should I migrate to a PostgreSQL database? In either case, do you know of any tutorials, software or anything else that could help me?


#2

Yes - in theory - that looks like it would work. One thing to be aware of is that the proportion of Tweets that have geo data associated with them is very small (users have to opt in to share their locations), so you may get none at all at some times.
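Because only some tweets carry an exact point, a mapping script may want a fallback. The sketch below assumes the v1.1 tweet JSON layout, where "coordinates" is a GeoJSON Point and "place.bounding_box" is a GeoJSON Polygon; tweetPoint() is a hypothetical helper name, and taking the centre of the place box is just one possible approximation.

```php
<?php
/**
 * Hypothetical helper: return array(longitude, latitude) for a decoded tweet,
 * or null when the tweet has neither an exact point nor a place.
 */
function tweetPoint(array $tweet): ?array
{
    // Exact point shared by the user (GeoJSON order: longitude, latitude).
    if (!empty($tweet['coordinates']['coordinates'])) {
        list($lon, $lat) = $tweet['coordinates']['coordinates'];
        return array((float) $lon, (float) $lat);
    }

    // Otherwise fall back to the centre of the place's bounding box.
    if (!empty($tweet['place']['bounding_box']['coordinates'][0])) {
        $ring = $tweet['place']['bounding_box']['coordinates'][0];
        $lons = array_column($ring, 0);
        $lats = array_column($ring, 1);
        return array(
            (min($lons) + max($lons)) / 2,
            (min($lats) + max($lats)) / 2,
        );
    }

    return null;
}
```

A place can be as large as a city or a country, so points derived from a place box are much less precise than exact coordinates; it may be worth storing which kind each tweet had.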

Not sure what to suggest on the subject of database choice, except that I know that Postgres has strong geo support (through the PostGIS extension), while MySQL’s spatial features are more limited - if you are happy with your code and method, I’d stick with what works for you.
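As a rough illustration of why a plain MySQL table can be enough for a simple map: Leaflet can read GeoJSON, and GeoJSON points only need a latitude and a longitude per row. The sketch below assumes rows already fetched from the database; the keys lat, lon and text and the function name rowsToGeoJson() are hypothetical and should be renamed to match the actual table.

```php
<?php
/**
 * Hypothetical helper: turn rows with plain latitude/longitude columns into
 * a GeoJSON FeatureCollection string.
 */
function rowsToGeoJson(array $rows): string
{
    $features = array();
    foreach ($rows as $row) {
        $features[] = array(
            'type' => 'Feature',
            'geometry' => array(
                'type' => 'Point',
                // GeoJSON puts longitude before latitude.
                'coordinates' => array((float) $row['lon'], (float) $row['lat']),
            ),
            'properties' => array('text' => $row['text']),
        );
    }

    return json_encode(array(
        'type' => 'FeatureCollection',
        'features' => $features,
    ));
}

// Example with one made-up row:
// echo rowsToGeoJson(array(array('lat' => '53.35', 'lon' => '-6.26', 'text' => 'Hello Dublin')));
```

A PHP page that echoes this string could then be loaded by the map page and handed to Leaflet’s GeoJSON layer. Heavier spatial queries (for example density per district) are where PostGIS would start to pay off.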

We have some geo demos available at https://twitterdev.github.io including a globe and a mapping app - although neither of them are in PHP.


#3

Thank you very much!