Summary
The Event Joiner Processor is used to combine data from two event records with different GeoEvent Definitions using a matching key. The processor creates a new GeoEvent Definition which includes the attribute fields from each specified GeoEvent Definition. Once the processor has received an event record of each GeoEvent Definition, it copies the most recent data from each event record and sends the merged event out as a new event record.
Examples
- A real-time join between live weather data and live stationary sensor data can provide ancillary information about the environment where the sensors are located. This is useful for monitoring flood, wind, or other natural conditions that could interfere with the sensors in real-time.
- The output of a real-time join between two moving assets from two different feeds could be used to derive additional real-time information. For example, the location of each moving asset as a part of one new event record could be used to calculate a bearing between those assets in real-time. For more information, see the Bearing Calculator Processor.
- Event records derived from the Incident Detector Processor have a different schema from their original source input. Using the Event Joiner Processor, the processed event records from the Incident Detector Processor can be joined again with the original source data to maintain both schemas in real-time.
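As a rough illustration of the second example, once both asset locations are carried in one joined event record, a bearing between them can be derived. The sketch below uses the standard initial great-circle bearing formula. It is not the Bearing Calculator Processor's implementation, and the field names in `joined` (`TruckA_lat`, `TruckB_lon`, and so on) are hypothetical.

```python
import math


def initial_bearing(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return (math.degrees(math.atan2(x, y)) + 360.0) % 360.0


# Hypothetical joined event record: fields are prefixed with the name of
# their source GeoEvent Definition (assumed here to be TruckA and TruckB).
joined = {
    "TruckA_lat": 0.0, "TruckA_lon": 0.0,
    "TruckB_lat": 0.0, "TruckB_lon": 1.0,
}

bearing = initial_bearing(
    joined["TruckA_lat"], joined["TruckA_lon"],
    joined["TruckB_lat"], joined["TruckB_lon"],
)
```

Here the second asset lies due east of the first on the equator, so the computed bearing is approximately 90 degrees.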
Usage notes
- To configure the Event Joiner Processor, select two GeoEvent Definitions and a field from each schema whose values will be used to perform the join. Once the processor receives an event record of each GeoEvent Definition type, with a matching attribute value (or key) from the join fields, a new event record will be created. The new event record will share the combined schema of the two GeoEvent Definitions used in the join.
- While waiting to perform a join, the processor uses an internal cache to retain event records. The processor caches one event record of each unique attribute value for each GeoEvent Definition type. When the processor receives an event record with a previously observed attribute value, the older event record is discarded. This way, only the most recently received event record for a given key value is used when joining two event records.
- The Event Joiner Processor, like the Field Enricher (Feature Service) Processor, has a configurable cache size. When the number of cached event records exceeds the configured cache size, older records in the cache will be purged to accommodate new event records.
- The cache for the processor is held in memory. The cache will be recreated if changes are published to a GeoEvent Service which includes an Event Joiner Processor or if the ArcGIS GeoEvent Server service is restarted. Stopping and starting a GeoEvent Service in GeoEvent Manager will not cause the cache to be recreated.
- When configuring the Event Joiner Processor, you must specify what should be done with the latest cached event records from each GeoEvent Definition after they are used to perform an event join. By default, cached event records are discarded after they have been used, and the processor waits to receive a new event record of each type for a subsequent join. Alternatively, you can configure the processor to retain its cached event records after a join and produce a new combined event record whenever a new event record of either GeoEvent Definition type is received.
- As a best practice, it is strongly recommended that a Field Mapper Processor be used immediately after the Event Joiner Processor for schema correction. The Event Joiner Processor makes several necessary changes to the schema of the output GeoEvent Definition, including:
  - Existing field names are prepended with their source GeoEvent Definition name. For example, a field called Altitude from a first GeoEvent Definition named A will be renamed to A_Altitude in the output GeoEvent Definition. This is necessary to distinguish what might otherwise be duplicate field names from each source GeoEvent Definition used for the join.
  - Existing tags are removed from the output GeoEvent Definition. Reserved tags such as GEOMETRY and TRACK_ID cannot be applied to more than one field in a GeoEvent Definition.
  - All the fields from each source GeoEvent Definition are included in the output GeoEvent Definition.
- As a best practice, set the Cache Size to a value greater than the expected number of unique event records for the highest volume input feed. The Event Joiner Processor purges its cache once the maximum number of events has been reached, so if the Cache Size is set too low, records may be purged before a successful join can occur. Avoid setting the Cache Size too high as well: event records are stored in memory, so an excessively large cache can degrade performance.
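The caching and join behavior described in the usage notes above can be sketched in a few lines of Python. This is a simplified model for illustration only, not the processor's implementation: the class `EventJoinerSketch`, its method names, and the sample definitions and fields are hypothetical, and the Cache Size limit is left out.

```python
class EventJoinerSketch:
    """Hypothetical in-memory model of the join behavior described above."""

    def __init__(self, first_def, first_key, second_def, second_key,
                 clear_after_join=True):
        self.first_def, self.second_def = first_def, second_def
        self.key_fields = {first_def: first_key, second_def: second_key}
        self.clear_after_join = clear_after_join
        # One cache per GeoEvent Definition, keyed by the join field value.
        self.caches = {first_def: {}, second_def: {}}

    def receive(self, definition, record):
        """Cache the record; return a joined record if both sides are present."""
        key = record[self.key_fields[definition]]
        # A newer record with the same key replaces the older one.
        self.caches[definition][key] = record
        first = self.caches[self.first_def].get(key)
        second = self.caches[self.second_def].get(key)
        if first is None or second is None:
            return None  # still waiting for the other definition
        # Prefix every field with the name of its source definition.
        joined = {f"{self.first_def}_{name}": value
                  for name, value in first.items()}
        joined.update({f"{self.second_def}_{name}": value
                       for name, value in second.items()})
        if self.clear_after_join:
            del self.caches[self.first_def][key]
            del self.caches[self.second_def][key]
        return joined


joiner = EventJoinerSketch("A", "id", "B", "station")
first_result = joiner.receive("A", {"id": "S1", "Altitude": 120})
second_result = joiner.receive("A", {"id": "S1", "Altitude": 125})
joined_result = joiner.receive("B", {"station": "S1", "wind": 14})
```

In this run the first two calls return `None` because no record of definition `B` has arrived yet, and the third call joins the most recent `A` record (Altitude 125) with the `B` record. With `clear_after_join=False`, the sketch instead keeps both cached records and emits a new joined record whenever either side receives an update.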
Parameters
Parameter | Description |
---|---|
Name | A descriptive name for the processor used for reference in GeoEvent Manager. |
Processor | The name of the selected processor. |
First GeoEvent Definition | The name of the first GeoEvent Definition. The first GeoEvent Definition is used to identify the schema of the first set of event records in the real-time join. Event records associated with the first GeoEvent Definition will be joined to event records associated with the second GeoEvent Definition. |
First GeoEvent Definition Join Field | The name of the field from the first GeoEvent Definition whose value will be used as a key to perform the join with the real-time data from the second GeoEvent Definition. Individual event records are cached using the value in the First GeoEvent Definition Join Field until a corresponding matching value exists from the Second GeoEvent Definition Join Field. While waiting to perform a valid join, the processor only caches the latest event record for each key. The name of the First GeoEvent Definition Join Field does not have to be the same as the Second GeoEvent Definition Join Field for a join to take place. Only the value in each join field, which is used as a common key, must be the same for a successful join to occur. |
Second GeoEvent Definition | The name of the second GeoEvent Definition. The second GeoEvent Definition is used to identify the schema of the second set of event records in the real-time join. Event records associated with the second GeoEvent Definition will be joined to event records associated with the first GeoEvent Definition. |
Second GeoEvent Definition Join Field | The name of the field from the second GeoEvent Definition whose value will be used as a key to perform the join with the real-time data from the first GeoEvent Definition. Individual event records are cached using the value in the Second GeoEvent Definition Join Field until a corresponding matching value exists from the First GeoEvent Definition Join Field. While waiting to perform a valid join, the Event Joiner Processor only caches the latest event record for each key. The name of the Second GeoEvent Definition Join Field does not have to be the same as the First GeoEvent Definition Join Field for a join to take place. Only the value in each join field, which is used as a common key, must be the same for a successful join to occur. |
New GeoEvent Definition Name | The name assigned to the new GeoEvent Definition created by the processor. The new GeoEvent Definition will combine the schema of the first and second GeoEvent Definitions. Note: While the new GeoEvent Definition maintains the existing field order and data types, the field names are changed. Additionally, existing tags from each source GeoEvent Definition do not carry over to the new GeoEvent Definition. See the Usage notes above for more details. |
Cache Size | Specifies the maximum number of event records to maintain in the cache. The maximum applies separately to each GeoEvent Definition. For example, if the Cache Size is set to 2000, the cache will store 2000 event records corresponding to the first GeoEvent Definition and 2000 event records corresponding to the second GeoEvent Definition. Event records are cached using the value of the first and second GeoEvent Definition join field. The default is 1000. Note: Once the set Cache Size is exceeded, the cache will be purged. |
Clear After Join | Specifies whether to clear corresponding event records after a successful join. The default is Yes. This setting changes the core behavior of the Event Joiner Processor. When set to Yes, the processor will only join the latest event records sharing a matching key value. When set to No, the processor behaves quite differently: event joins will consistently occur using a combination of the latest real-time event records and older cached event records. |
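To make the effect of Cache Size more concrete, the following minimal sketch models one size-limited cache, of which the processor would hold one per GeoEvent Definition. It assumes the oldest cached record is evicted first when the limit is exceeded; the exact purge policy is not specified here, and the class `BoundedKeyCache` and its methods are hypothetical.

```python
from collections import OrderedDict


class BoundedKeyCache:
    """Hypothetical per-definition cache holding one record per join key."""

    def __init__(self, cache_size=1000):
        self.cache_size = cache_size
        self._records = OrderedDict()

    def put(self, key, record):
        # Re-inserting a key keeps only the latest record and marks it newest.
        if key in self._records:
            self._records.pop(key)
        self._records[key] = record
        # Assumed policy: evict the oldest entries once the limit is exceeded.
        while len(self._records) > self.cache_size:
            self._records.popitem(last=False)

    def get(self, key):
        return self._records.get(key)

    def __len__(self):
        return len(self._records)


cache = BoundedKeyCache(cache_size=2)
cache.put("S1", {"id": "S1"})
cache.put("S2", {"id": "S2"})
cache.put("S3", {"id": "S3"})  # exceeds the limit, so S1 is evicted
```

With a Cache Size of 2 and three distinct keys, the record for `S1` is gone before a matching record from the other GeoEvent Definition could arrive, which is why the Cache Size should exceed the expected number of unique key values.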
Considerations and limitations
- Joins cannot be made on event records sharing the same GeoEvent Definition name. The processor uses the name of the first and second GeoEvent Definition to construct separate caches for each set of event records. If data is coming from two different source feeds and they share the same schema and GeoEvent Definition, consider creating a duplicate GeoEvent Definition with a different name for one of the source feeds before passing the event records to the processor.
- Only fixed GeoEvent Definitions are supported for event joins. Managed GeoEvent Definitions are not supported.