Cloud Architectures

The Amazon Web Services Blog points to a new white paper, written by one of their engineers, on Cloud Architectures. It’s a really good overview of the cloud services Amazon offers, and it gets into the architecture decisions you face when building something for the cloud.

Cloud Architectures address key difficulties surrounding large-scale data processing. First, in traditional data processing it is difficult to get as many machines as an application needs. Second, it is difficult to get the machines when one needs them. Third, it is difficult to distribute and co-ordinate a large-scale job on different machines, run processes on them, and provision another machine to recover if one machine fails. Fourth, it is difficult to auto-scale up and down based on dynamic workloads. Fifth, it is difficult to get rid of all those machines when the job is done. Cloud Architectures solve such difficulties.

Applications built on Cloud Architectures run in-the-cloud where the physical location of the infrastructure is determined by the provider. They take advantage of simple APIs of Internet-accessible services that scale on-demand, that are industrial-strength, where the complex reliability and scalability logic of the underlying services remains implemented and hidden inside-the-cloud. The usage of resources in Cloud Architectures is as needed, sometimes ephemeral or seasonal, thereby providing the highest utilization and optimum bang for the buck.
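To make the auto-scaling point above a little more concrete, here is a minimal sketch of how a job might decide how many machines it wants at any moment. Nothing here comes from the white paper: the function name, the idea of sizing by queue depth, and the default limits are all assumptions made for illustration, and a real deployment would call the provider's API to actually start or stop machines.

```python
import math


def desired_workers(queue_depth, jobs_per_worker, min_workers=0, max_workers=20):
    """Return how many workers to run for the current backlog.

    Hypothetical helper: queue_depth is the number of pending jobs and
    jobs_per_worker is how many jobs one machine is expected to handle.
    """
    if jobs_per_worker <= 0:
        raise ValueError("jobs_per_worker must be positive")
    # Round up so a partial batch of work still gets a machine.
    needed = math.ceil(queue_depth / jobs_per_worker)
    # Clamp to the configured floor and ceiling (assumed limits).
    return max(min_workers, min(max_workers, needed))


if __name__ == "__main__":
    for depth in (0, 25, 1000):
        print(depth, "pending jobs ->", desired_workers(depth, 10), "workers")
```

With an empty queue the function returns the floor, so the machines can go away once the job is done, which is the fifth difficulty in the list.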

Definitely check out the tips for building cloud applications; they are relevant whether you are deploying on Amazon or on your own system.

Molten Data and NPR

Update: Jeff Jarvis asks a great question about what people could do with this data. It’ll be fun to find out.

I was re-reading Matt Waite’s post on molten data and then read about NPR releasing an API for parts of their content. The two seem linked.

I know the East Coast Times is working on some sort of API, but I’ve been thinking about how we could open things up and give folks access to so much of our good stuff. Why not start with just articles, using dates, keywords or writers as the inputs? From there, you could add photos, video and then more of our data apps. That seems pretty straightforward to me.
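Here is a rough sketch of what that first step, articles filtered by dates, keywords or writers, might look like. It is not based on any real API: the `Article` fields, the `search_articles` function and the sample records are all made up for illustration, and a real version would sit behind a web endpoint and query the archive rather than an in-memory list.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Article:
    # Hypothetical fields; a real archive would carry many more.
    headline: str
    writer: str
    published: date
    keywords: tuple


def search_articles(articles, start=None, end=None, keyword=None, writer=None):
    """Return articles matching every filter that was supplied, newest first."""
    results = []
    for article in articles:
        if start is not None and article.published < start:
            continue
        if end is not None and article.published > end:
            continue
        if keyword is not None:
            # Case-insensitive exact match against the article's keywords.
            if keyword.lower() not in [k.lower() for k in article.keywords]:
                continue
        if writer is not None and article.writer.lower() != writer.lower():
            continue
        results.append(article)
    return sorted(results, key=lambda a: a.published, reverse=True)


# Made-up sample data, purely for demonstration.
SAMPLE = [
    Article("Sample headline A", "Writer One", date(2008, 7, 1), ("budget", "schools")),
    Article("Sample headline B", "Writer Two", date(2008, 7, 15), ("budget",)),
    Article("Sample headline C", "Writer One", date(2008, 8, 2), ("weather",)),
]

if __name__ == "__main__":
    for hit in search_articles(SAMPLE, keyword="budget", start=date(2008, 7, 10)):
        print(hit.published, hit.writer, hit.headline)
```

Every input is optional, so the same function covers "everything by this writer", "everything in July" and any combination. Photos, video and data apps could later get their own record types with the same kind of filters.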