As the CEP community is well aware, the technology is often compared and contrasted with traditional data processing practices. It's a topic that ebbs and flows like the ocean's tides. It's a conversation that occurs regularly in various forums, with clients, with analysts and of course within this community at large. Far be it from me to stir things up again, but I believe there is some credible evidence the critics draw upon. This touched a nerve not too long ago when Curt Monash hammered the CEP community. The response was swift, but frankly unconvincing.
In many respects, I believe this argument is rooted in how some CEP vendors have architected their products. Many vendors treat event stream processing as a variant of database processing. They see streaming data as just a database in motion and have therefore designed and built their products myopically around that paradigm. By doing so those vendors have (possibly inadvertently) plotted a course where their products are fit for purpose mainly in use-cases that are data-intake dominant. They can consume the fire-hose of data, then filter, aggregate and enrich it temporally, but little else. What is missing in this equation is the definition of the semantics of the application, whether that is a custom application such as algo trading or monitoring telco subscribers, or some business intelligence (BI) dashboard. To those vendors, that is viewed as adjunct or external (i.e. the client), solved with the typical complement of technologies (Java, C++, .NET) and considered outside the direct domain of CEP.
While this paradigm clearly does work, it incites the (CEP) critics: "Where are the distinguishing characteristics? Why can't I just do this with traditional data processing technologies?"
That is a challenging question when so many CEP products are stuck with the look and feel of a database. Even a few of the academic projects I've reviewed seem to be myopically centered on this same paradigm. It reminds me of that television commercial with the tag line: "Stay Within the Lines. The Lines Are Our Friends." (I think it was for an SUV). Quite frankly, such thinking does not represent the real world. Time to think outside the box (or table, as the case may be).
Yes, the real world is full of entities in motion, most often interacting with each other in some way. Cars and trucks careening down the freeway zigzag from one lane to another at high speed, with the objective of reaching a destination in the shortest possible time without a collision. It would be an interesting CEP application to track and monitor the location and movement of such things.
In fact, location tracking is beginning to show signs of being a common use-case for the Apama platform. Not long ago we announced a new customer, Royal Dirkzwager, which uses Apama to track ship movements in sea lanes. My colleagues Matt Rothera and David Olson recently published a webinar on maritime logistics. This webinar follows the same basic premise as the Royal Dirkzwager use-case, that of tracking the location of ships at sea.
And we aren't the only ones seeing activity in location tracking. Here's a similar use-case for CEP in location-based defense intelligence. The basic idea is the ability to track the movement of some entity, typically in relation to other entities. Are they getting closer together (i.e. collision detection) or moving further apart (i.e. collision avoidance)? Are they moving at all? At what speed? Will they reach a destination at an appropriate time? A CEP system needs, at its core, both temporal and geospatial concepts to easily support this paradigm. Here are a handful of examples where this applies:
- Tracking ship movements at sea (as I mentioned with Royal Dirkzwager, and the Apama webinar on maritime logistics)
- Airplanes moving into and out of an airspace
- Baggage movement in an airport
- Delivery trucks en route to destinations
- Service-enabled mobile phones delivering content as people move through shopping and urban areas
- Men, machines and material moving on the battlefield
These are just a handful of location tracking use-cases for which the Apama platform is well suited.
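To make the questions above a little more concrete (closer together or further apart? at what speed? will they arrive on time?), here is a rough sketch in Python rather than the Apama EPL. It is purely illustrative: the function names, the (x, y) tuples and the straight-line distance are all my own assumptions for this example and are not part of any Apama product.

```python
import math


def distance(a, b):
    """Straight-line distance between two (x, y) positions (illustrative assumption)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def closing_speed(a_prev, b_prev, a_now, b_now, dt):
    """Rate at which two entities approach each other, in grid units per second.

    Positive means they are getting closer, negative means they are moving
    apart, zero means their separation did not change over the interval dt.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    return (distance(a_prev, b_prev) - distance(a_now, b_now)) / dt


def relative_motion(a_prev, b_prev, a_now, b_now, dt):
    """Classify the relative movement of two entities over one interval."""
    rate = closing_speed(a_prev, b_prev, a_now, b_now, dt)
    if rate > 0:
        return "closing"
    if rate < 0:
        return "separating"
    return "steady"


def eta_seconds(position, destination, speed):
    """Time to reach the destination at a constant speed; infinite if not moving."""
    if speed <= 0:
        return math.inf
    return distance(position, destination) / speed
```

Two ships that were 10 units apart and are now 8 units apart one second later would be classified as "closing"; comparing eta_seconds against a scheduled arrival time answers the "appropriate time" question. A real application would of course use proper geographic coordinates instead of a flat grid.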
Another colleague, Mark Scannell, has written a location tracking tutorial that is part of the Apama Studio product. It is a great example of the power of the Apama EPL for building location tracking CEP applications. The tutorial provides a narrative description explaining its purpose and implementation. Below I've included a small snippet of that example to highlight the elegant simplicity, yet powerful efficiency, of the Apama EPL. If you're in need of a brief introduction to the Apama EPL, you can find that here, the beginning of a three-part series on the language.
Location Tracking in the Apama EPL.
// Track me - the tracked entity
action trackMe() {
    // Call self again when new location is detected
    on LocationUpdate( id = me.id ):me {
        trackMe();
    }

    // Detect a neighbor in our local area -
    // do this by excluding events that are for our id,
    // which will also cause us to reset this listener
    // through calling trackMe() again.
    LocationUpdate n;
    on all LocationUpdate( x in [ me.x - 1.0 : me.x + 1.0 ],
                           y in [ me.y - 1.0 : me.y + 1.0 ] ):n and
       not LocationUpdate( id = me.id ) {

        // Increment number of neighbors that have been spotted
        // and update the alarm
        spotted := spotted + 1;
        updateAlarm();

        // Decrement count of spotted neighbors after one second
        on wait ( 1.0 ) {
            spotted := spotted - 1;
            updateAlarm();
        }
    }
}
As a brief introduction, the Location Tracker tutorial is designed to track the movement of entities (i.e. cars, ships, trucks, planes, or any of those things I listed above) in relation to other entities within a coordinate system or grid. An entity is considered a neighbor if it is within a window of 2 grid units (-1, +1 on each axis) around another entity. The grid and the units within the grid are largely irrelevant to the syntactic definition of entity tracking. Their semantic meaning, on the other hand, depends on the context of a specific use-case (i.e. a shipping harbor, air space, battlefield, etc.).
From the tutorial I pulled a single action, trackMe, which contains the heart and soul of the tracking logic. As entities move they produce LocationUpdate events. Each event contains the entity's unique id and the X,Y coordinate of the new location. The trackMe action is designed to track their movement by monitoring LocationUpdate events. For each unique entity there is a spawned monitor instance (a running micro-thread, so to speak) of this trackMe action.
The idea is that when an entity moves, its new location is instantly compared against all other tracked entities (except of course itself; witness the recursive call to trackMe when ids match (id = me.id)) to determine if it has become a neighbor (remember the 2 grid units). This is elegantly implemented with the listener "on all LocationUpdate( x in [me.x - 1.0 : me.x + 1.0], ...". In a narrative sense, this can be read as "If the X,Y coordinate of this entity's new location is within 2 grid units (-1.0, +1.0) of me then identify it as a neighbor and update an alarm condition" (via a call to updateAlarm()).
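For readers who don't have Apama Studio to hand, the behavior of trackMe can be approximated in a few lines of plain Python. This is only my own sketch of the logic described above, not the tutorial's code and not how the Apama correlator works internally. The Tracker class, its method names, the inclusive window bounds and the explicit "now" timestamps (standing in for the engine's clock and the "on wait" timer) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LocationUpdate:
    """Hypothetical stand-in for the LocationUpdate event: an id plus X,Y coordinate."""
    id: str
    x: float
    y: float


class Tracker:
    """Tracks one entity and counts neighbors spotted within the last `ttl` seconds."""

    def __init__(self, me, radius=1.0, ttl=1.0):
        self.me = me            # latest known location of the tracked entity
        self.radius = radius    # half-width of the window (1.0 gives 2 grid units)
        self.ttl = ttl          # how long a sighting counts, in seconds
        self._sightings = []    # timestamps of sightings that have not yet expired

    def on_update(self, update, now):
        """Process one LocationUpdate observed at time `now` (seconds)."""
        self._expire(now)
        if update.id == self.me.id:
            # Our own move: re-centre the window, like the recursive trackMe() call.
            self.me = update
            return
        in_window = (abs(update.x - self.me.x) <= self.radius
                     and abs(update.y - self.me.y) <= self.radius)
        if in_window:
            self._sightings.append(now)

    def spotted(self, now):
        """Number of neighbor sightings still counted at time `now`."""
        self._expire(now)
        return len(self._sightings)

    def _expire(self, now):
        # Drop sightings once `ttl` seconds have passed, like the "on wait" decrement.
        self._sightings = [t for t in self._sightings if now - t < self.ttl]
```

Feeding a tracker centred on (5, 5) an update at (5.5, 4.2) counts one neighbor for the following second, while an update at (7, 5) is ignored until the tracked entity itself moves closer. The point of the comparison is that in the EPL the window test, the exclusion of our own id and the timer are all expressed declaratively in the listener, rather than hand-coded as here.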
This small bit of code (about 20 lines) exhibits an immensely powerful geospatial concept: the ability to track the movement of hundreds, thousands, even tens of thousands of entities against each other as they move. And of course this is accomplished with millisecond latency.
This small example demonstrates a few characteristics of the Apama EPL, specifically that it is an integrated, well-typed procedural language with event expressions. It allows you to express complex event conditions elegantly in the middle of your procedural program. This allows you to focus on the logic of your application instead of just selecting the right event condition.
However, to get a clear picture, remember that the language of CEP is just one aspect of an overall platform. The Apama strategy has also been focused on a whole product principle, one where the whole is greater than the sum of the parts. As a mandate to our vision we have outlined four key defining characteristics: 1) A scalable, high performing Event Engine. 2) Tools for rapid creation of event processing applications supporting business and IT users. 3) Visualization technologies for rich interactive dashboards. 4) An integration fabric for the rapid construction of high performance, robust adapters to bridge into the external world.
The possibility exists that CEP will struggle to break out as a truly unique technology platform when so many just see a variant of database technology. It's time to break out of the box, drive over the lines and succinctly answer the critics' questions. CEP is not about tables, rows and columns but about events, events that are often artifacts of the real world. That world is in constant motion, be it ships, planes, cars, trucks, phones, or you and me. Information flows from it all in many forms, but that does not mean we have to squeeze it into the database paradigm.
Once again thanks for reading,
Louie