I’ve been reading a few interesting analyses of Twitter data recently, such as this #gamergate analysis by Andy Baio. I thought it would be nice to have a mechanism for people to quickly and easily import data from Twitter to Neo4j for research purposes. Like a good programmer I had to go up at least one level of abstraction. Thus were born the Ruby gems neo4apis and neo4apis-twitter (and, incidentally, neo4apis-github just to prove it was repeatable).
The neo4apis-twitter gem is easy to use, either from your Ruby code or from the command line.
neo4apis takes care of loading your data efficiently as well as creating database indexes so that you can query it effectively.
First you need to make an instance of Neo4Apis::Twitter. Pass in a Neo4j::Session from the neo4j-core gem to specify the database to import to:
neo4japis_twitter = Neo4Apis::Twitter.new(Neo4j::Session.open, import_retweets: true, import_hashtags: true)
From there we can initialize a client from the twitter gem:
twitter_client = Twitter::REST::Client.new ...

neo4japis_twitter.batch do
  twitter_client.search('ebola', result_type: 'recent').take(100).each do |tweet|
    neo4japis_twitter.import :Tweet, tweet
  end
end
This will perform a search using the twitter gem client and, for each of up to 100 tweets it returns, import:
- The tweet
- The tweeter
Because we set the import_retweets and import_hashtags options to true above, it will also import:
- The original tweet if a retweet
- The tweeter for the original tweet
- The hashtags from the tweet
These are all loaded as nodes into the neo4j database and connected with relationships as appropriate.
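The mapping described above can be sketched in plain Ruby without a database. This is only an illustration, not the gem's implementation: the graph_elements_for helper, the hash shape of the tweet, and the label and relationship names (Tweet, User, Hashtag, TWEETED, RETWEET_OF, HAS_HASHTAG) are all assumptions made up for this example, and the names neo4apis-twitter actually uses may differ.

```ruby
require 'set'

# Hypothetical helper: lists the nodes and relationships that one tweet would
# contribute to the graph. The tweet is a plain Hash here, not a Twitter::Tweet.
def graph_elements_for(tweet, import_retweets: false, import_hashtags: false)
  nodes = Set.new
  rels  = Set.new

  # The tweet and the tweeter are always imported.
  nodes << [:Tweet, tweet[:id]]
  nodes << [:User, tweet[:user]]
  rels  << [[:User, tweet[:user]], :TWEETED, [:Tweet, tweet[:id]]]

  original = tweet[:retweeted_tweet]
  if import_retweets && original
    # The original tweet and its tweeter, linked back to the retweet.
    nodes << [:Tweet, original[:id]]
    nodes << [:User, original[:user]]
    rels  << [[:User, original[:user]], :TWEETED, [:Tweet, original[:id]]]
    rels  << [[:Tweet, tweet[:id]], :RETWEET_OF, [:Tweet, original[:id]]]
  end

  if import_hashtags
    tweet.fetch(:hashtags, []).each do |tag|
      # Downcasing is an assumption so that "Ebola" and "ebola" share one node.
      nodes << [:Hashtag, tag.downcase]
      rels  << [[:Tweet, tweet[:id]], :HAS_HASHTAG, [:Hashtag, tag.downcase]]
    end
  end

  { nodes: nodes.to_a, relationships: rels.to_a }
end

retweet = {
  id: 2, user: 'bob', hashtags: ['Ebola'],
  retweeted_tweet: { id: 1, user: 'alice' }
}

elements = graph_elements_for(retweet, import_retweets: true, import_hashtags: true)
puts "#{elements[:nodes].size} nodes, #{elements[:relationships].size} relationships"
```

Using sets means a user or hashtag that shows up more than once is only listed once, which is the same idea as having one node per entity in the graph.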
Using the command line
If you don’t want to bother with writing the Ruby code, some commands are provided for you! After installing the gem and setting up the Twitter authentication config file (see the README), you just need to perform your search:
neo4apis twitter search "hugs and puppies" 50 --import-retweets --import-hashtags
That will import 50 tweets (and associated users, and retweets, and hashtags, etc…) for the “hugs and puppies” search into neo4j! Note that you can see the documentation for the search command by typing:
neo4apis twitter help search
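If you would rather drive the search command from a script, a small wrapper can assemble the arguments for you. This is a minimal sketch: the search_command helper is hypothetical, and it relies only on the command shape and the two flags shown above. Actually running it assumes the neo4apis executable is on your PATH and the authentication config file is already in place.

```ruby
require 'shellwords'

# Hypothetical helper: builds the argument list for the search command shown above.
def search_command(query, count, import_retweets: false, import_hashtags: false)
  argv = ['neo4apis', 'twitter', 'search', query, count.to_s]
  argv << '--import-retweets' if import_retweets
  argv << '--import-hashtags' if import_hashtags
  argv
end

argv = search_command('hugs and puppies', 50, import_retweets: true, import_hashtags: true)

# Shellwords.join escapes the spaces in the query so the line can be pasted into a shell.
puts Shellwords.join(argv)

# To run the import for real, pass the array to system. The array form hands
# each argument over as-is, so no shell quoting is needed:
# system(*argv)
```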
There is also a command to stream tweets continuously:
neo4apis twitter filter "ballooning etiquette"
Data will be continuously imported until you hit Ctrl-C. Again, for help with this command you can type:
neo4apis twitter help filter
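If you write your own continuous import in Ruby, the same "run until Ctrl-C" behaviour can be had by rescuing Interrupt, which is the exception Ruby raises when you press Ctrl-C. The sketch below is not how the filter command is implemented; the import_until_interrupted helper and the fake stream are hypothetical stand-ins so the example runs without Twitter or Neo4j.

```ruby
# Hypothetical helper: feeds items from a stream to a block until the stream
# ends or the user presses Ctrl-C (which Ruby raises as Interrupt).
def import_until_interrupted(stream)
  imported = 0
  begin
    stream.each do |item|
      yield item
      imported += 1
    end
  rescue Interrupt
    # Ctrl-C: stop quietly instead of printing a stack trace.
  end
  imported
end

# Stand-in for a live stream: yields three tweets, then simulates Ctrl-C.
fake_stream = Enumerator.new do |y|
  %w[first second third].each { |text| y << text }
  raise Interrupt
end

seen = []
count = import_until_interrupted(fake_stream) { |tweet| seen << tweet }
puts "Imported #{count} tweets before stopping"
```

In a real script the block would hand each tweet to the importer, and the returned count gives you something to log once the stream stops.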