Happy Thanksgiving

The kids are playing, the wife is out seeing a movie with a friend, and I’m checking RSS feeds and going through browser tabs. We’ll be going to my parents’ later this afternoon. I’ve seen many posts today taking stock of how things are and where things should go. I should probably do the same.

    Things I’m thankful for…
  • I’m employed
  • I have an amazing family which keeps me grounded
  • I am once again proud of my country after Election Day
  • I find new things to learn about almost daily
  • This barely covers everything I’m thankful for, but I’m having difficulty putting the rest into words.

    To finish off, read this post about doing better.

    In my opinion, being thankful shouldn’t just be a warm and fuzzy feeling; it should also be a call to take stock of those around you and to do better. I know I am.

    Google Flu Trends

    Lots of people are linking to it but Google’s Flu Trends is a pretty amazing site.

    With the incredible amount of data Google has access to, you can get insights that simply weren’t possible before. I really think the idea that the CDC was up to two weeks behind in noticing the outbreaks says the most.

    You can also download the raw data and display it in other ways if you’d like.

    Looking into HBase

    HBase is an open source implementation of Google’s Bigtable design. I’ve been keeping my eye on it in combination with Hadoop. I had some extra time today, so I decided to see how easy it would be to hook it up with the aggregator we built for things like Topics.

    One of the nice things about HBase is the REST interface that can read and write data. I hooked up the Ruby client so that whenever I saved posts from the feed to MySQL, it would also send data to HBase.

    Writing to HBase is pretty straightforward, and the REST client makes it really easy. However, getting the data out needs a closer look.

    HBase is NOT a relational database. If you approach it as if it were one, you will get utterly confused and frustrated. Instead, it can be thought of as a collection of Maps. So, in order to get data out, you need to iterate over the Maps looking for particular columns.
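
    To make the "collection of Maps" idea concrete, here is a rough sketch using plain Python dictionaries. It is only a toy model and does not talk to HBase at all; the row keys, the column names such as content:title, the timestamps, and the helper functions are all made up for illustration.

```python
# Toy in-memory model of an HBase-style table: nested maps, NOT a real client.
# Shape: row key -> column name ("family:qualifier") -> timestamp -> value.
# All keys, columns and values below are hypothetical.
table = {
    "post-001": {
        "content:title": {1227800000: "Happy Thanksgiving"},
        "meta:feed": {1227800000: "personal-blog"},
    },
    "post-002": {
        "content:title": {
            1227800500: "Looking into HBase",
            1227800900: "Looking into HBase (edited)",
        },
        "meta:feed": {1227800500: "personal-blog"},
    },
    # Rows are sparse: this one has no content:title column at all.
    "post-003": {
        "meta:feed": {1227801000: "personal-blog"},
    },
}


def latest(versions):
    """Return the value stored under the highest timestamp."""
    return versions[max(versions)]


def select_column(rows, column):
    """Walk the rows in sorted key order and collect one column where present."""
    found = {}
    for row_key in sorted(rows):
        columns = rows[row_key]
        if column in columns:
            found[row_key] = latest(columns[column])
    return found


titles = select_column(table, "content:title")
print(titles)
```

    There is no query planner or join here. Getting the titles means walking the rows and checking each one for the column, which is roughly the mental shift away from SQL.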

    When you use the REST API, you do this by creating a scanner and popping the results off it, as you would from a queue.
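
    The scanner pattern can be sketched in memory like this. It is not the REST API itself: ToyScanner, next_row and the sample table are hypothetical names for illustration, and this toy collects its matches up front, whereas a real scanner would fetch rows from the server as you ask for them.

```python
from collections import deque


class ToyScanner:
    """In-memory stand-in for a scanner (hypothetical, not the HBase REST API).

    It is created for a set of columns and then hands out matching rows
    one at a time, like popping items off a queue.
    """

    def __init__(self, rows, columns):
        wanted = set(columns)
        self._pending = deque()
        for row_key in sorted(rows):
            # Keep only the requested columns; skip rows that have none of them.
            cells = {c: v for c, v in rows[row_key].items() if c in wanted}
            if cells:
                self._pending.append((row_key, cells))

    def next_row(self):
        """Pop the next matching row, or return None once the scanner is exhausted."""
        return self._pending.popleft() if self._pending else None


# Hypothetical sample data: row key -> column name -> value.
table = {
    "post-001": {"content:title": "Happy Thanksgiving", "meta:feed": "blog"},
    "post-002": {"meta:feed": "blog"},
    "post-003": {"content:title": "Looking into HBase", "meta:feed": "blog"},
}

scanner = ToyScanner(table, ["content:title"])
rows = []
while True:
    row = scanner.next_row()
    if row is None:
        break
    rows.append(row)

print(rows)
```

    The loop keeps asking for the next row until nothing comes back, which is the same shape the client code ends up with when it drains a scanner.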

    That’s some of what I found out. Let’s see what else I can dig into today.