What types of data analysis are allowed with datasets (for researchers)?


I have a few questions/clarifications on what analysis is permitted (for research papers, etc.).

  1. Can Twitter data be analysed qualitatively in order to place it into categories? E.g. 1800 tweets collected over a week from the Streaming API, sorted into categories such as:
    Tweets reacting negatively towards Obamacare = 360
    Tweets reacting positively… = 360
    Tweets reacting to possible economic impact… = 360
    Tweets unsure of… = 360
    Tweets from media… = 360

Then, for each category, 1 or 2 anonymised example tweets would be shown, i.e. not disclosing the name or Twitter handle and not including the bird logo, but clearly indicating that the tweet is from Twitter by following the offline display guideline (minus the name and Twitter handle). Pie charts or other charts would be created from the analysed data.

  2. Creating various graphs of peak activity over that week, and a table of the most active users with their number of tweets? Also displaying the most retweeted tweets.

  3. Using semantic analysis on the tweets to detect positive and negative tweets.

  4. Identifying the location of the tweets, e.g. on a map of the earth, i.e. a heat map.

  5. Creating infographics with the most frequent keywords.
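
To make the aggregate counts in the list above concrete, here is a minimal sketch of the kind of summary I have in mind: tweets per hour (for a peak-activity graph) and a most-active-users table. The records, the field names `user` and `created_at`, and the helper names are all hypothetical placeholders of my own, not the actual Twitter API schema.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records; field names are assumptions, not the Twitter API schema.
tweets = [
    {"user": "user_a", "created_at": "2013-10-01T09:15:00", "text": "example 1"},
    {"user": "user_b", "created_at": "2013-10-01T09:40:00", "text": "example 2"},
    {"user": "user_c", "created_at": "2013-10-01T09:55:00", "text": "example 3"},
    {"user": "user_a", "created_at": "2013-10-01T10:05:00", "text": "example 4"},
]


def tweets_per_hour(records):
    """Count tweets in each hourly bucket, keyed by 'YYYY-MM-DD HH:00'."""
    return Counter(
        datetime.fromisoformat(r["created_at"]).strftime("%Y-%m-%d %H:00")
        for r in records
    )


def most_active_users(records, n=3):
    """Return the n users with the most tweets as (user, count) pairs."""
    return Counter(r["user"] for r in records).most_common(n)


hourly = tweets_per_hour(tweets)
top_users = most_active_users(tweets, n=1)
```

Only the aggregated counts (`hourly`, `top_users`) would feed into the charts and tables; the underlying tweets themselves would not be published.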

I have read that some of the above are considered bad practice, but I have seen data analysed like this in highly cited journals. Is bad practice something you can do but Twitter doesn’t like, whereas certain other things are banned and cannot be done under any circumstances because they go against the privacy policy, such as sharing or publishing Twitter data?

  1. Can Twitter analysis be compared to data from other social media, as long as it is clearly stated which analysis is associated with Twitter? And is it possible to aggregate sentiments from Twitter with those from other platforms to see the overall opinion on social media?

Datasets would not be shared at all, and it would be impractical to publish the entire list of tweets, so they would not be released.