Twitter Streaming API not working with twitter4j and Apache Flume

streaming
flume
twitter4j

#36

@IgorBrigadir I didn’t find the twitter4j.properties file in the Flume folder. Could you please provide the path where this file is actually placed?


#37

You’ll need to create a twitter4j.properties file in that case. The classpath for Flume (http://www.tutorialspoint.com/apache_flume/apache_flume_environment.htm) depends on your environment and how you’re running the collector.
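
To check whether a twitter4j.properties file is actually visible to the JVM, a small lookup like the sketch below may help. It uses only the standard library. The class name FindTwitter4jProperties is made up for illustration, and it only reports on the classpath of the JVM it runs in, so it would have to be started with the same classpath as the Flume agent to be meaningful.

```java
import java.net.URL;

class FindTwitter4jProperties {
    // Returns the location of the named resource on the classpath, or null if it is absent.
    static URL locate(String resourceName) {
        return ClassLoader.getSystemResource(resourceName);
    }

    public static void main(String[] args) {
        URL found = locate("twitter4j.properties");
        if (found == null) {
            // Print the classpath so it is clear which directories and jars were searched.
            System.out.println("twitter4j.properties is not on the classpath: "
                    + System.getProperty("java.class.path"));
        } else {
            System.out.println("twitter4j.properties found at: " + found);
        }
    }
}
```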


#38

The problem has been solved. The trick is to use a newer version of twitter4j that has the new 1.1 endpoint set as the streaming API URL, and to use its jar file with Flume. For Flume, we need to recompile the Flume sources with Maven, with the twitter4j version set to 4.0.4 or later in the pom.xml file. The compiled Flume jar file can then be used with Apache Hadoop to stream tweets.


#39

Hi folks. I am also having this issue where I get the 404 error.

I am using flume-sources-1.0-snapshot.jar and am not using any twitter4j jar files. I have deactivated all of them because they previously interfered with tweet collection and threw errors. After I deactivated the twitter4j jars, I was able to collect keyword-specified tweets just fine for over a month.

However, this error has come up recently and I am stumped. Since I didn’t previously use the twitter4j jars, I would think that the URL specification change to the twitter4j.properties file would not apply to me. In any case, I cannot locate this file and do not believe it exists in my environment. There has previously been no need for it.

That being said, are there any other options for me to consider? I see that the streaming API is down: https://dev.twitter.com/overview/status. Is that the issue? Please recommend any troubleshooting options here. Thanks in advance for your time, I really appreciate it.


#40

Hello,

@umersafeer ,
Add -Dtwitter4j.streamBaseURL=https://stream.twitter.com/1.1/ to your command line.

Stephane
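
For anyone unsure what the -D option does: it sets a JVM system property, which Java code can read with System.getProperty. Below is a minimal sketch, assuming the property name from the line above. The class StreamBaseUrlCheck and its usesV11 helper are hypothetical and are not part of twitter4j or Flume; they only show how to confirm that the value reached the JVM.

```java
class StreamBaseUrlCheck {
    // Property name taken from the -D option suggested in this thread.
    static final String PROPERTY = "twitter4j.streamBaseURL";

    // True when the given base URL points at version 1.1 of the streaming API.
    static boolean usesV11(String baseUrl) {
        return baseUrl != null && baseUrl.contains("/1.1");
    }

    public static void main(String[] args) {
        // Null when the JVM was started without -Dtwitter4j.streamBaseURL=...
        String baseUrl = System.getProperty(PROPERTY);
        if (baseUrl == null) {
            System.out.println(PROPERTY + " is not set");
        } else if (usesV11(baseUrl)) {
            System.out.println(PROPERTY + " points at the 1.1 endpoint: " + baseUrl);
        } else {
            System.out.println(PROPERTY + " does not look like a 1.1 endpoint: " + baseUrl);
        }
    }
}
```

Starting this class with and without the -D option shows whether the setting is being passed through.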


#41

This dashboard is not currently reporting the status correctly - the Streaming API is working fine.

I do not understand how you can be collecting Tweets in Flume without using Twitter4J. There’s no other way for you to connect to the Twitter API.


#42

Sorry, my friend, and thank you for your attention, but can you give us a longer explanation? I don’t know whether I need to decompile a file to change the path and then compile it again. Please help.


#43

I’ve seen your solution, but I don’t know exactly where I have to add the line you mentioned. Which file do I need to add it to? Thank you very much.


#44

I also had the same problem for 2 days. Then I followed this link, and now the problem is solved for me: I added twitter4j-media-support-3.0.3.jar, twitter4j-stream-3.0.3.jar, and twitter4j-core-3.0.3.jar to Flume’s lib directory and set FLUME_CLASSPATH to these jar files.


#45

I added twitter4j-media-support-3.0.3.jar, twitter4j-stream-3.0.3.jar, and twitter4j-core-3.0.3.jar to Flume’s lib directory and set FLUME_CLASSPATH to these jar files, but the problem is not solved.
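
When adding the jars does not help, it may be worth listing which twitter4j jars the JVM really has on its classpath, since a leftover jar of another version could be picked up instead. The sketch below does this with the standard library only. The class name Twitter4jJarScan is made up for illustration, and it simply matches file names starting with "twitter4j-", which is an assumption about how the jars are named.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class Twitter4jJarScan {
    // Returns the file names of classpath entries that look like twitter4j jars.
    static List<String> twitter4jJars(String classpath) {
        List<String> jars = new ArrayList<>();
        for (String entry : classpath.split(File.pathSeparator)) {
            String name = new File(entry).getName();
            if (name.startsWith("twitter4j-") && name.endsWith(".jar")) {
                jars.add(name);
            }
        }
        return jars;
    }

    public static void main(String[] args) {
        // java.class.path holds the classpath of the running JVM.
        for (String jar : twitter4jJars(System.getProperty("java.class.path"))) {
            System.out.println(jar);
        }
    }
}
```

If the output shows more than one version of the same jar, or none at all, the classpath is the first thing to fix.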


#46

Thanks a lot, I have also got a solution! The v1 endpoints of the streaming API have expired, so from now on you have to use the new endpoints. We have made modifications to the jar file; you can download it from this link: https://www.dropbox.com/s/qvudqfym5givwdg/flume-sources-1.0-SNAPSHOT.jar?dl=0


#47

Using this jar, streaming became slow.


#48

The Flume problem is solved, but now Hive is not working.

Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
… 7 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
… 13 more
Caused by: javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
NestedThrowables:
java.lang.reflect.InvocationTargetException
at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:587)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:788)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
at java.security.AccessController.doPrivileged(Native Method)
at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:57)
at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:66)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:593)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:571)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:624)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
… 18 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:325)
at org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:282)
at org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:240)
at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:286)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:631)
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
at org.datanucleus.NucleusContext.createStoreManagerForProperties(NucleusContext.java:1187)
at org.datanucleus.NucleusContext.initialise(NucleusContext.java:356)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:775)
… 47 more
Caused by: java.lang.NoClassDefFoundError: Could not initialize class com.jolbox.bonecp.BoneCPConfig
at org.datanucleus.store.rdbms.connectionpool.BoneCPConnectionPoolFactory.createConnectionPool(BoneCPConnectionPoolFactory.java:59)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:238)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:131)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.<init>(ConnectionFactoryImpl.java:8


#49

I don’t think this error has anything to do with the Twitter API.


#50

Please, can you help me solve this problem? I am facing it too, and I don’t know what I should do to get data from Twitter. The commands are working, but Flume does not transfer data into Hadoop.

This is my log file:
23 May 2016 06:05:25,211 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:459) - process failed
java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.refreshAfterWrite(JLjava/util/concurrent/TimeUnit;)Lcom/google/common/cache/CacheBuilder;
at org.apache.hadoop.security.Groups.<init>(Groups.java:97)
at org.apache.hadoop.security.Groups.<init>(Groups.java:74)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:303)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:283)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:260)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:790)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:760)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:633)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2812)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2802)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2668)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:243)
at org.apache.flume.sink.hdfs.BucketWriter$1.call(BucketWriter.java:235)
at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:679)
at org.apache.flume.auth.SimpleAuthenticator.execute(SimpleAuthenticator.java:50)
at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:676)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)


#51

Is there any solution for “twitter4j.TwitterStreamImpl: 404:The URI requested is invalid or the resource requested, such as a user, does not exist”? We’re stuck at this point and couldn’t find a solution.


#52

Is this related to Apache Flume?

What URI are you calling?


#53

Thanks @umersafeer. You saved my life. I spent the last 3 hours trying to figure out what went wrong in my perfectly fine code. Just one question: are these endpoints responsible for generating the URL/URI that then brings back the data? Also, what is the significance of the twitter4j-media-support-3.0.3.jar, twitter4j-stream-3.0.3.jar, and twitter4j-core-3.0.3.jar files?


#54

Thanks a lot for the solution, it’s working fine!


#55