In Piam Memoriam Fundatoris Nostri
Posted by John Bates
There have been a number of exchanges recently on the cep-interest group and on this blog on the topic of “the origins of event processing and CEP”. As someone who has been involved in event processing research and products for 17 years, I’ve been asked to add a perspective here. Wow, this makes me feel old.
Although I started researching composite/complex event processing as part of my PhD at Cambridge in 1990, I certainly wasn’t the first. So I can’t claim to have invented CEP. As Opher Etzion correctly observed in an email to cep-interest, my experience was also that this discipline originated from the “active database” community. There was much work done prior to 1990 that added events and event composition to databases. The term “ECA rules” – Event-Condition-Action rules – was a popular way of describing the complex/composite event processing logic.
When I was experimenting with multimedia and sensor technologies in the early 90s – and trying to figure out how to build applications around distributed asynchronous occurrences (such as tagged people changing location) – I realized that building “event-driven” applications in a distributed context was a new and challenging problem from a number of angles. I looked for prior work in the area but didn’t find anything specific to it, so I turned to the active database community. In particular, a paper by Gehani et al on “composite event expressions” (as recently mentioned by Opher) looked ideal for the applications I had in mind. This paper outlined an algebra for composing events and a model to implement the subsequent state machines. I implemented the Gehani model as part of my early work. While it was a great concept, it had a number of shortcomings:
- Although it claimed to be able to compose any complex event sequence, it was incredibly difficult to compose even a simple scenario.
- It didn’t consider the issues of distributed event sources, such as network delays, out-of-order events, etc.
- It didn’t consider the key issue of performance – how to process large volumes of events against a large number of active composite event expressions.
Active databases had only considered events within the database. And databases carried the fundamental overheads of store-index-query, which are not well suited to such fast-moving updates. To make composite events applicable as a way of building applications, the above shortcomings had to be addressed.
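To make the out-of-order problem concrete, here is a minimal sketch (my illustration, not any of the original systems) of one standard remedy: buffer events from delayed sources and release them in timestamp order once a bounded lateness window – the assumed `max_delay` parameter – has passed.

```python
import heapq

def reorder(events, max_delay):
    """Re-sequence (timestamp, payload) events that may arrive up to
    `max_delay` time units late, as network-delayed sources often do."""
    buffer = []
    for ts, payload in events:
        heapq.heappush(buffer, (ts, payload))
        # Anything older than the newest timestamp minus max_delay is
        # guaranteed (under the lateness assumption) to be safe to emit.
        while buffer and buffer[0][0] <= ts - max_delay:
            yield heapq.heappop(buffer)
    while buffer:  # drain whatever remains at end of stream
        yield heapq.heappop(buffer)

arrived = [(1, "a"), (3, "c"), (2, "b"), (5, "e"), (4, "d")]
print(list(reorder(arrived, max_delay=2)))
# [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e')]
```

The catch, of course, is the assumption itself: if an event arrives later than `max_delay`, it is emitted out of order anyway – which is why the reordering question depends so much on the event sources.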
Composite event expressions were only one aspect of my work initially, but as the volumes of real-time data continued to grow and new sources of data continued to emerge, it became clear that the work on distributed composite/complex event processing had legs. Also, it seemed to excite many people.
There were of course the cynics. Many of my Cambridge colleagues thought that events were already well understood in hard and soft real-time systems and in operating systems – and that’s where they belonged. It is true that event processing has been part of systems for several decades. Traditional systems handle events in the operating system. However, never before had events been exposed at the user level as ad hoc occurrences, requiring specific responses. There was a new requirement for applications that could “see” and respond to events.
Some closed “event-based systems”, such as GUI toolkits like X Windows, allowed users to handle events in “callback routines”. For example, when someone clicked on a window, a piece of user code would be called. This approach tried to make the most of traditional imperative languages and make them somewhat event-based. But this paradigm is tough to program and debug – and events are implicit rather than explicit. Also, the program has to enter an “event loop” in order to handle these events – to make up for the fact that the programming paradigm wasn’t designed to handle events explicitly.
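A toy sketch of that callback-plus-event-loop style (my own minimal illustration, not any particular toolkit’s API) shows why events end up implicit: user code registers callbacks, then surrenders control to a loop that dispatches occurrences.

```python
from collections import deque

class EventLoop:
    """Minimal callback-style event dispatcher."""
    def __init__(self):
        self.handlers = {}   # event type -> list of callbacks
        self.queue = deque() # pending events

    def on(self, event_type, callback):
        self.handlers.setdefault(event_type, []).append(callback)

    def post(self, event_type, payload):
        self.queue.append((event_type, payload))

    def run(self):
        # The program must block here to handle any events at all.
        while self.queue:
            event_type, payload = self.queue.popleft()
            for cb in self.handlers.get(event_type, []):
                cb(payload)

loop = EventLoop()
clicks = []
loop.on("click", lambda pos: clicks.append(pos))  # event is implicit in the callback
loop.post("click", (10, 20))
loop.post("click", (30, 40))
loop.run()
print(clicks)  # [(10, 20), (30, 40)]
```

Notice that nothing in the language itself says “click” is an event – it is just a string keying a callback table, which is exactly the implicitness complained about above.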
So we began to realize that events had to be explicit “first class citizens” in development paradigms. Specifically, we saw a new set of capabilities would be required:
- An event engine – a service specifically designed to look for and respond to complex event patterns. This engine must be able to receive events from distributed sources and handle distributed systems issues.
- An event algebra – a way of expressing event expressions, involving composing events using temporal, logical and spatial logic, and associated actions. These might be accessible through a custom language or maybe even through extensions of existing languages.
- Event storage – a service specifically designed to capture, preserve in temporal order and analyze historic event sequences.
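To give a flavour of what an event algebra expresses, here is a toy sketch (not Apama’s actual language – the function name and parameters are my invention) of one temporal operator: “A followed by B within a window”, matched over a stream of timestamped events.

```python
def followed_by(stream, first, second, window):
    """Yield (t_first, t_second) pairs where a `second` event occurs
    after a `first` event within `window` time units."""
    pending = []  # timestamps of as-yet-unmatched `first` events
    for etype, ts in stream:
        if etype == first:
            pending.append(ts)
        elif etype == second:
            # Match every pending `first` still inside the window.
            matched = [t for t in pending if 0 <= ts - t <= window]
            pending = [t for t in pending if t not in matched]
            for t in matched:
                yield (t, ts)

stream = [("A", 1.0), ("C", 2.0), ("B", 3.5), ("A", 10.0), ("B", 20.0)]
print(list(followed_by(stream, "A", "B", window=5.0)))
# [(1.0, 3.5)]  -- the second A finds no B within 5 time units
```

A real event algebra composes such operators (sequence, conjunction, negation, spatial predicates) and attaches actions to matches; the engine’s job is to evaluate many such expressions concurrently and efficiently.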
A number of my colleagues in Apama worked on these and other areas. As for a research community, we published mostly in distributed systems journals and conferences, such as SIGOPS venues. We worked closely with other institutions interested in events, such as Trinity College Dublin.
In 1998, along with a colleague, Giles Nelson, I decided to start Apama. I later found out that, concurrent with this, and coming from different communities, other academics had also founded companies – Mani Chandy with iSpheres and David Luckham with ePatterns. These companies had different experiences – David told me ePatterns unfortunately overspent and became a casualty of the Internet bubble bursting in 2000. David of course went on to write a very successful book on event processing. iSpheres went on to do brilliantly in energy trading but was hurt by the Enron meltdown and struggled to compete with Apama in capital markets. Apama focused primarily on capital markets, with some focus on telco and defence, and went on to be very successful, being acquired by Progress in 2005. Interestingly, long after these pioneering companies were started, several new startups appeared – all claiming to have invented CEP!
So that’s my potted history of CEP. I don’t think any of us can claim to have invented it. I think some of us can claim to have been founding pioneers in taking it into the distributed world. Some others of us can claim to have started pioneering companies. All of us in this community are still at an early stage – and it is going to get even more fun.
There’s one bit I haven’t talked about yet – and that’s terminology. Most researchers originally called this area “composite event processing”. The term “complex event processing” now seems to be popular – due to David’s book. There are some arguments about the differences between “complex/composite event processing” and “event stream processing”. From my perspective, when Apama was acquired by Progress, Mark Palmer invented the term “event stream processing” to avoid using the word “complex” – which Roy Schulte from Gartner thought would put off customers looking for ease-of-use. However, the industry then seemed to decide that event stream processing and complex event processing were different – the former being about handling “simple occurrences” in “ordered event streams” and the latter about handling “complex occurrences” in “unordered event streams”. In my opinion, any system that can look for composite patterns in events from potentially distributed sources is doing complex/composite event processing. Yes, there may be issues to do with reordering, but there may not be. It depends on the event sources and the application.
It's often tricky to work out "who was first" in academic developments. But it's good to know we have some excellent pioneers active within our CEP community who all deserve a lot of respect.