HBase is the open source implementation of Google’s Bigtable. I’ve been keeping my eye on it in combination with Hadoop. I had some extra time today so I decided to see how easy it would be to hook it up with the aggregator we built for things like Topics.
One of the nice things about HBase is the REST interface that can read and write data. I hooked up the Ruby client so that whenever I saved posts from the feed to MySQL, it would also send data to HBase.
Writing to HBase is pretty straightforward, and the REST client makes it really easy. Getting the data back out, however, needs a closer look.
HBase is NOT a relational database. If you approach it like one, you will get utterly confused and frustrated. Instead, think of it as a collection of Maps. To get data out, you iterate over the Maps looking for particular columns.
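To make the "collection of Maps" idea concrete, here is a rough sketch in plain Ruby. It does not talk to HBase at all. It just models a table as a Hash of row keys, each pointing at a Hash of columns, and walks it looking for one column. The row keys, column names, and the `rows_with_column` helper are all made up for illustration.

```ruby
# A toy in-memory model of an HBase table:
#   row key => { "family:qualifier" => value }
# Row keys and column names are invented for this example.
POSTS = {
  "post-001" => { "content:title" => "Hadoop at my desk", "meta:source" => "feed-a" },
  "post-002" => { "content:title" => "Starling notes" },
  "post-003" => { "content:title" => "HBase and REST", "meta:source" => "feed-b" }
}.freeze

# Walk every row and keep only the rows that actually have the column.
# Rows are free to be missing columns, which is the big difference from
# a relational table with a fixed schema.
def rows_with_column(table, column)
  table.each_with_object({}) do |(row_key, columns), found|
    found[row_key] = columns[column] if columns.key?(column)
  end
end

rows_with_column(POSTS, "meta:source").each do |row_key, value|
  puts "#{row_key}: #{value}"
end
```

Note that "post-002" simply has no "meta:source" entry. There is no NULL to deal with, the key just isn't in that row's Map.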
When you use the REST API, you do this by creating a scanner and popping results off it, much like a queue.
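The scanner pattern looks roughly like the sketch below. This is NOT the real REST client. `FakeScanner`, `next_row`, and `drain` are hypothetical names standing in for whatever the client exposes, and the rows live in memory instead of coming over HTTP. The shape is the part that matters: open a scanner, keep asking for the next row until there isn't one, then close it.

```ruby
# Hypothetical stand-in for a REST scanner. The class and method names
# are assumptions for illustration, not the actual client API.
class FakeScanner
  def initialize(rows)
    @pending = rows.to_a # [[row_key, columns], ...]
    @open = true
  end

  # Returns the next [row_key, columns] pair, or nil once exhausted.
  def next_row
    raise "scanner is closed" unless @open
    @pending.shift
  end

  def close
    @open = false
  end
end

# Pop rows off the scanner until it runs dry, always closing it afterwards.
def drain(scanner)
  rows = []
  while (row = scanner.next_row)
    rows << row
  end
  rows
ensure
  scanner.close
end

scanner = FakeScanner.new(
  "post-001" => { "content:title" => "Hadoop at my desk" },
  "post-002" => { "content:title" => "Starling notes" }
)
drain(scanner).each { |row_key, columns| puts "#{row_key}: #{columns.inspect}" }
```

Closing in an `ensure` is a design choice in this sketch: a scanner is state you opened somewhere, so it seems safer to release it even if something blows up mid-scan.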
That’s some of what I found out. Let’s see what else I can dig into today.
