LinkLog: Streaming Data, Distributed Execution Engines

Posted on March 17, 2009


It is rare to come across four references to similar technologies within three days. That is what happened to me a couple of days ago.

  1. There was a reference to Hadoop on Twitter. I had almost forgotten about Hadoop, the open source implementation of Google's Map/Reduce.
  2. I was watching a rather unusual Google Tech Talk the other day. It was unusual because someone from Microsoft Research was at Google, talking about Dryad, their distributed execution engine.
  3. One of the participants asked whether the speaker could compare Dryad to IBM’s Stream Processing Core, so I had to look it up.
  4. Following a few links from the IBM article, I found SPADE, a declarative language for handling streaming data. I have always been fascinated by domain-specific languages built to solve specialized problems, especially with data. You learn a lot just by understanding the high-level concepts.

So here they are. A set of related technologies with some overlap.

Google Map/Reduce

Apache Hadoop

Microsoft Research’s Dryad

IBM Stream Processing Core

IBM SPADE – Stream Processing Application Declarative Engine

Posted in: Software, Trends