Hi,
I have installed twarc in the command prompt with the command:
pip install twarc (which works).
But now, when I type the command “twarc configure”, it returns: “‘twarc’ is not recognized as an internal or external command, operable program or batch file.” I have tried multiple times but still get the same message.
I am a tiny bit more familiar with Python environments, but not too comfortable with running code from the command line. Could someone please suggest a fix?
Are you using Windows? Maybe this will help: Windows 10 - twarc
Also, for the Academic endpoint using the v2 API, the command is twarc2 configure. Unfortunately, it is separate from twarc configure.
Yes, I am using Windows. I think it has something to do with the PATH.
I tried twarc2 configure as well, but no luck.
Yes, this is a common issue on Windows. How did you install Python? This affects how the PATH is set. Did any of the commands and settings from the Windows 10 - twarc page work?
I can’t get past the first command for install. When I run the install command again, it says that the requirement has been satisfied. I attached a screenshot of the message that I get.
I originally downloaded Python from its website, for Windows. Later I downloaded Anaconda since I had to use some editors on there. Now I use Python through Anaconda.
If you have Anaconda, can you try running all the commands through “Anaconda Prompt” instead of the default Windows command prompt? You should see (base) in the prompt if it worked. Running the twarc commands in Anaconda Prompt should work (unfortunately I don’t have Windows running to check right now).
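For anyone else hitting the “not recognized” error, a quick diagnostic is to ask Python where it installs console scripts and whether that folder is on the PATH. This is only a sketch using the standard library. The helper name find_command is made up for illustration (it is not part of twarc), and it assumes you run it with the same Python that pip install twarc used. A pip install --user setup may put scripts in a different folder than the one reported here.

```python
import os
import shutil
import sysconfig


def find_command(name):
    """Report where `name` resolves on PATH and where this Python puts scripts.

    `find_command` is a hypothetical helper for illustration, not a twarc API.
    """
    # Folder where pip places console scripts for this interpreter
    # (typically a "Scripts" folder on Windows, "bin" elsewhere).
    scripts_dir = sysconfig.get_path("scripts")

    # Normalise PATH entries so the comparison ignores case and slashes on Windows.
    path_dirs = [
        os.path.normcase(os.path.normpath(entry))
        for entry in os.environ.get("PATH", "").split(os.pathsep)
        if entry
    ]
    normalised_scripts = os.path.normcase(os.path.normpath(scripts_dir))

    return {
        "command": name,
        "found_at": shutil.which(name),  # None means the shell cannot find it
        "scripts_dir": scripts_dir,
        "scripts_dir_on_path": normalised_scripts in path_dirs,
    }


if __name__ == "__main__":
    for command in ("twarc", "twarc2"):
        print(find_command(command))
```

If found_at is None and scripts_dir_on_path is False, the install itself is probably fine and the prompt simply does not know where to look, which is consistent with the commands working from Anaconda Prompt.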
Thank you so much for your help. It worked in Anaconda Prompt… yay!
Is it not recommended to use twarc as a library? Does it have the same functionality when used as a library as it does on the command line?
Also, would you suggest using the basic JSON file for big data (consisting of millions of tweets), or should one use a database to go with it?
Sure, twarc can work either way. To use it as a library, this is a good place to start:
The functionality is the same, but the command line makes things slightly more convenient.
I would recommend keeping the original API responses exactly how twarc stores them: as newline-delimited JSON, one response per line. This means that you can process the data and load it into whatever database you need later on. Which one is best depends on what your analysis is.
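To illustrate why the newline-delimited layout scales to millions of tweets, here is a minimal sketch that reads one response at a time instead of loading the whole file into memory. It assumes each non-empty line is a v2 API response whose "data" field holds the tweets (a list, or a single object for some endpoints). The name iter_tweets is hypothetical and not part of twarc.

```python
import json


def iter_tweets(lines):
    """Yield tweets one by one from an iterable of JSON lines.

    Assumes one API response per line, with the tweets under "data".
    `iter_tweets` is a hypothetical helper for illustration, not a twarc API.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        response = json.loads(line)
        data = response.get("data", [])
        if isinstance(data, dict):
            data = [data]  # some endpoints return a single object
        yield from data


# Example usage with a file written by twarc2 (the file name is an example):
#
# with open("results.jsonl", encoding="utf-8") as handle:
#     total = sum(1 for _ in iter_tweets(handle))
#     print(total)
```

Because a file object is itself an iterable of lines, only one response is held in memory at any time.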
Some useful things may be:
Convert to 1 tweet per line as opposed to 1 response with a set of tweets and metadata:
twarc2 flatten results.jsonl tweets.jsonl
Or install twarc-csv with pip install twarc-csv and convert to a CSV for importing into other tools:
twarc2 csv results.jsonl tweets.csv
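If a database does turn out to be useful later, the flattened file can be loaded into one without touching the original responses. The following is only a sketch of one option, using SQLite from the Python standard library. It assumes the flattened file has one tweet object per line with an "id" field and usually a "text" field. The table layout and the name load_tweets are made up for illustration.

```python
import json
import sqlite3


def load_tweets(lines, conn):
    """Insert flattened tweets (one JSON object per line) into SQLite.

    Keeps the id, the text, and the full original JSON so nothing is lost.
    `load_tweets` and the table layout are hypothetical, not part of twarc.
    Returns the number of rows in the table after loading.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "id TEXT PRIMARY KEY, text TEXT, raw TEXT)"
    )
    # Generator: rows are produced lazily, so the file is never fully in memory.
    rows = (
        (tweet["id"], tweet.get("text"), json.dumps(tweet))
        for tweet in (json.loads(line) for line in lines if line.strip())
    )
    with conn:  # commits on success, rolls back on error
        # OR IGNORE skips tweets whose id is already stored (duplicates).
        conn.executemany("INSERT OR IGNORE INTO tweets VALUES (?, ?, ?)", rows)
    return conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]


# Example usage (file names are examples):
#
# conn = sqlite3.connect("tweets.db")
# with open("tweets.jsonl", encoding="utf-8") as handle:
#     print(load_tweets(handle, conn))
```

Storing the raw JSON alongside a few extracted columns keeps the option of pulling out more fields later, depending on the analysis.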
Great! Thank you so much for the links and suggestions. You have been a great help.