Realtime visualisation of @replies

4 April 2011

See @replies visualised in realtime

Back on September 15, 2010, it was "R U OK?" Day. This is a national day of action which is designed to raise awareness of suicide by encouraging people to reach out and make contact with others by asking "Are you OK?"

We were commissioned by the R U OK? organisation to create a visualisation that highlighted the connections people were making throughout the day, and after a number of rounds of brainstorming (and budget cuts) we chose to highlight the connections that were being made over Twitter.

Because of concerns about the subject matter, we weren't allowed to highlight the content of any tweets. So to gather our data, we search for all tweets in Australia that are directed at someone (@replies), geocode both ends of the conversation, record the timestamp of the message, and map this connection in time and space. For each location we also perform frequency analysis on the tweets and provide a tag cloud of the most-used words (available if you hover over a city).
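To give a feel for the frequency analysis behind the tag clouds, here's a minimal sketch in Python. It isn't the code that runs the visualisation: the `(city, text)` input shape, the tiny `STOP_WORDS` list and the `top_words_by_city` name are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "and", "you", "for", "that", "this", "with", "have"}


def top_words_by_city(tweets, n=10):
    """Return {city: [(word, count), ...]} with the n most used words per city.

    `tweets` is an iterable of (city, text) pairs (an assumed shape).
    """
    counts = defaultdict(Counter)
    for city, text in tweets:
        # Drop @mentions and URLs so usernames and links don't dominate the cloud.
        cleaned = re.sub(r"@\w+|https?://\S+", " ", text.lower())
        words = re.findall(r"[a-z']+", cleaned)
        # Skip stop words and very short words (two letters or fewer).
        counts[city].update(
            w for w in words if w not in STOP_WORDS and len(w) > 2
        )
    return {city: counter.most_common(n) for city, counter in counts.items()}
```

The counts per word can then be mapped to font sizes to draw the cloud for each city.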

All of this is done in realtime, so you can see actual conversations as they are formed. Beyond the realtime aspect, we also give you two weeks' worth of historical data so that you can see the rise and fall of activity throughout the day and across weeks. Each of these hourly periods also has a unique pattern of replies that crisscross the country, showing up when people from different cities strike up conversation with one another.
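The hourly periods boil down to bucketing reply timestamps by hour. A rough sketch of that idea, assuming the timestamps are already Python `datetime` objects (the `hourly_counts` name is made up for this example):

```python
from collections import Counter
from datetime import datetime


def hourly_counts(timestamps):
    """Count replies per hour; keys are datetimes truncated to the hour."""
    return Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )
```

Plotting those counts in time order gives the rise and fall of activity across the day.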

It's interesting to see the times of day when people are on Twitter the most, and also see the days of the week when activity is high. On the actual R U OK? Day we saw a dramatic (~2x) spike in @reply activity which we could hopefully attribute to the spread of R U OK? Day.

The Achilles heel of this visualisation, however, is the amount of processing that it has to do. Firstly, it fetches a search from the Twitter API several times a minute (you have to be careful to stay within the rate limits), then for each tweet it has to check whether the sender and receiver have location data, and if they do, geocode both locations.
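As a rough illustration of those two steps, the sketch below works out how long to wait between searches for a given hourly quota, and filters out replies that are missing a location at either end. The quota is a parameter because the real limit isn't stated here, and the dictionary keys (`sender_location`, `receiver_location`, `created_at`) are hypothetical names, not fields from the Twitter API.

```python
def poll_interval(requests_per_hour, headroom=0.8):
    """Seconds to wait between polls so we use only `headroom` of the quota."""
    if requests_per_hour <= 0 or not 0 < headroom <= 1:
        raise ValueError("need a positive quota and 0 < headroom <= 1")
    return 3600.0 / (requests_per_hour * headroom)


def replies_with_locations(tweets):
    """Keep only replies where both sender and receiver have a location string.

    Each tweet is a dict with hypothetical keys; returns a list of
    (sender_location, receiver_location, created_at) tuples.
    """
    kept = []
    for tweet in tweets:
        src = (tweet.get("sender_location") or "").strip()
        dst = (tweet.get("receiver_location") or "").strip()
        if src and dst:
            kept.append((src, dst, tweet["created_at"]))
    return kept
```

Only the tuples that survive the filter need to go on to geocoding, which keeps the expensive step as small as possible.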

Geocoding is an expensive operation, and because I'm geocoding roughly 200 points a minute, we quickly fall afoul of hourly geocoding rate limits. To counteract this, I have it set up to try geocoding via Google Maps, and then fail over to Yahoo! once we hit the limit. Somewhere along the way I implemented caching of geocoding results and we now have a pretty handy database of geocoded strings.
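The cache-then-fail-over pattern looks something like the sketch below. It's a simplified stand-in rather than the production code: the providers are plain callables you pass in (no real Google or Yahoo! client calls are shown), `RateLimited` is a made-up exception they're assumed to raise when their quota runs out, and the cache is an in-memory dict instead of a database.

```python
class RateLimited(Exception):
    """Raised by a provider (by assumption) when its quota is exhausted."""


class Geocoder:
    def __init__(self, providers, cache=None):
        # Providers are tried in order, e.g. [google_lookup, yahoo_lookup].
        self.providers = list(providers)
        self.cache = {} if cache is None else cache

    def geocode(self, place):
        """Return whatever the first available provider returns, or None."""
        # Normalise case and whitespace so "Sydney" and " sydney " share an entry.
        key = " ".join(place.lower().split())
        if key in self.cache:
            return self.cache[key]
        for provider in self.providers:
            try:
                result = provider(key)
            except RateLimited:
                continue  # fail over to the next provider
            self.cache[key] = result
            return result
        # Every provider is rate limited; don't cache, so we can retry later.
        return None
```

Because location strings on Twitter profiles repeat a lot, most lookups end up being answered from the cache without touching either API.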

The price for daisy-chaining all these APIs is fragility. It's a lot of work to keep this system up, and as a result I'll be taking the whole thing offline in about a week. (Also, it's costing me a fair bit in App Engine hosting charges.)

Still, if you want to check it out, you can for now. Thereafter, I'll replace it with a video of what it once was.

Update 2011-04-19: I've replaced it with a video of what it once was.

Comments

  1. 1/4

    Day Barr commented on 4 April 2011 @ 01:05

    For geocoding, you can use the GeoLite City database which is available for free from MaxMind at http://www.maxmind.com/app/geolitecity

    As you can host the dataset yourself, there is no limit on lookups and no API dependency.

    Getting this data into App Engine is tricky though. See http://stackoverflow.com/q/4608239/445073

  2. 2/4

    Lachlan Hardy commented on 4 April 2011 @ 12:22

    I've been using https://simplegeo.com/products/ for geo stuff lately. Love their work. Especially the new Storage product.

    I don't use AppEngine any more, though, so I've no experience with it on that platform.

  3. 3/4

    Erietta Sapounakis commented on 5 April 2011 @ 12:23

    Thanks for the technical performance run down. Always good to know, as a non geek, the implications and limitations to keep in mind when coming up with designs.

  4. 4/4

    Daniel Cates commented on 5 April 2011 @ 23:07

    The map is beautiful, and as a statistician I believe that if you can reveal the meaning of data through a beautiful image, your job is done for the most part.
