Plotting Tweets from DB in Ruby, grouping by hour.

Posted by plotti on Stack Overflow
Published on 2010-04-20T11:47:53Z

Hey guys, I've got a couple of issues with my code.

  • I suspect I am plotting the results very inefficiently, since the grouping by hour takes ages.
  • The DB is very simple: it contains the tweet text, the creation date and the username. It is fed by the Twitter gardenhose.

Thanks for your help!

require 'rubygems'
require 'sequel'
require 'gnuplot'

DB = Sequel.sqlite("volcano.sqlite")
tweets = DB[:tweets]

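# Returns [hours, counts] for the tweets whose text contains keyword.
# Hours are offsets from the earliest matching tweet.
def get_values(keyword, tweets)
  # Order by date so that the first row really is the earliest tweet.
  my_tweets = tweets.filter(:text.like("%#{keyword}%")).order(:created_at)
  first_tweet = my_tweets.first
  return [[], []] if first_tweet.nil? # no tweet matched the keyword
  start = first_tweet[:created_at]

  # Hash.new(0) lets every hour start at zero, so no nil check is needed.
  r = Hash.new(0)
  my_tweets.each do |t|
    hour = ((t[:created_at] - start) / 3600).round
    r[hour] += 1
  end

  # Turn the sorted [hour, count] pairs into [x, y].
  r.sort.transpose
end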

keywords = ["iceland", "island", "vulkan", "volcano"]
values  = {}

keywords.each do |k|
  values[k] = get_values(k,tweets)
end


Gnuplot.open do |gp|
  Gnuplot::Plot.new(gp) do |plot|
    plot.terminal "png"
    plot.output "volcano.png"
    plot.data = []
    # One line per keyword: x is the hour offset, y the tweet count.
    values.each do |keyword, (x, y)|
      plot.data << Gnuplot::DataSet.new([x, y]) { |ds|
        ds.with = "linespoints"
        ds.title = keyword
      }
    end
  end
end
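
One thing that might help with the slow grouping is to do the counting in a single pass over nothing but the timestamps. The following is only a minimal sketch, assuming the created_at values arrive as Ruby Time objects. The helper name hourly_counts and the sample data are made up for illustration and are not part of the code above. It also uses floor rather than round, so each bin covers one whole hour. Whether the counting loop is the real bottleneck is not established here.

```ruby
# Hypothetical helper (not from the original post): counts timestamps per
# hour offset from the earliest one, in a single pass.
def hourly_counts(timestamps)
  return [[], []] if timestamps.empty?

  start = timestamps.min
  counts = Hash.new(0) # missing hours start at zero
  timestamps.each do |ts|
    # Time - Time yields seconds as a Float; floor gives whole-hour bins.
    counts[((ts - start) / 3600).floor] += 1
  end

  # Sorted [hour, count] pairs become [hours, counts].
  counts.sort.transpose
end

# Made-up sample data, for illustration only.
base = Time.utc(2010, 4, 20, 11, 0, 0)
sample = [base, base + 600, base + 3700, base + 7300, base + 7400]
x, y = hourly_counts(sample)
```

With Sequel, the timestamps could presumably be collected with something like my_tweets.select(:created_at).map { |t| t[:created_at] }, so that only one column is read per row. That is an assumption about the schema and has not been measured against this database.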

© Stack Overflow or respective owner