Introduction to Data Events
The day-to-day business operations of the organization are driven by, and result in the generation of, transactions. In the business environment these transactions typically take the form of documents: orders, invoices, shipping documents, payments, vouchers, requisitions, applications, contracts, policies, plans, and a myriad of internal forms which record, for some variable length of time, what has occurred. They can also take a less tangible form, such as telephone conversations and meetings. These documents are both internally and externally generated. The documents appear directly as entities in the real-world model, and their contents appear indirectly in the data model in the form of entities, attributes of entities, and relationships between entities. Because the events which they represent occur randomly, these documents also arrive in a relatively random pattern, and they require varying periods of time to process. These periods can range from seconds to days.
The length of time needed to process a document is determined by its complexity, its interaction with other transactions (processing dependencies), its completeness, its cleanliness, and the time delay for data research, calculation, verification, decision making, etc.
For ease of discussion and presentation, the event which generates the document and the receipt and processing of the document, will be called a data event.
A data event is defined as some organizational activity, either internal or external, by which new data enters the files of the firm, or by which some change in status, condition or relationship is determined.
The files of the firm are those collections of like, or related, transaction, or data event documents.
A document can be considered a collection of data elements which enter the firm as a result of a data event. A document can also be considered a data carrier since it transports data into and from the firm.
A data element is a single item of data at its lowest meaningful level.
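The definitions above — a data element as the lowest meaningful item of data, and a document as a collection of data elements carried into the firm by a data event — can be sketched as a simple structure. All names here are illustrative, not taken from the text:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class DataElement:
    """A single item of data at its lowest meaningful level."""
    name: str
    value: str

@dataclass
class Document:
    """A data carrier: the collection of data elements delivered by one data event."""
    source: str                      # internal or external originator
    elements: list = field(default_factory=list)

    def element(self, name):
        """Look up one data element by name; None if the carrier lacks it."""
        return next((e.value for e in self.elements if e.name == name), None)

# A customer order arriving as a document (the trigger of a data event):
order = Document(source="customer", elements=[
    DataElement("customer_name", "Acme Corp"),
    DataElement("item", "Widget"),
    DataElement("quantity", "12"),
])
```

Note that, as the text goes on to say, a real carrier is rarely this self-contained; some of its elements are of immediate use and others are extraneous to the receiving area.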
While the documents which enter into the organizational sphere of awareness have been discussed as if they were homogeneous and self-contained, in fact they usually are not. The normal business transaction (document) contains a mixture of data relating to the immediate transaction and other useful, but extraneous, data. Some data on the transaction is of immediate use and some is destined for later use. Not all data is usable by, or relevant to, the operational area receiving the transaction.
The receipt of the order initiates a sequence of parallel and sequential, but not necessarily continuous, processes which, potentially and ultimately, affect all aspects of the organization from sales through shipping, invoicing, inventory, accounts receivable, manufacturing and a myriad of reporting, accounting and control activities. This is the life cycle view of the order entity.
In one form or another, the data on that customer purchase order (or sales generated order, or whatever), affects a large part of the organization. In fact, in a sense, it can be said that most organizations which produce products or provide services, exist for, and because of, that order. Proper processing of that order is essential to the life and financial health of the organization, and all organizational activities are aimed toward satisfying the request it represents and following its directions.
Thus, the efficient, complete and accurate processing of the order and the recording of the data contained therein, is vital to the company. Its contents are examined, updated, copied, propagated, filed and finally archived. The primary point however, is that one, and only one, (data) event triggered all of this activity - the receipt of that order.
In the course of its passage through the organization, bits and pieces of data were picked off, isolated, immediately processed, filed for later use, used to reference other files, or used to create other documents. It can be seen from this example that no data exists in isolation. Data is used in context, and in conjunction with other data. The incoming document only acts as a trigger which causes the activation of immediate processes and, in delayed form, the activation, after other events, of still more processing. As the data passes through each operational area of the organization, it is acted upon, and those actions, in turn, cause the generation of internal data recording current activity and triggering subsequent processes.
If the data on this order were complete as is, and if its content never changed but were preserved in its entirety and in its original form, the data event and the data could be considered the same. Normally, however, many other things happen which cause change, both major and minor, to the original order data. These changes are also called data events. In the course of the normal business day many different data events occur which likewise propagate more data through the organization.
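The pattern described above — one data event triggering some processing immediately while other processing is deferred until later events occur — can be sketched as a minimal handler registry. The names (`on_event`, `fire`, the sample handlers) are illustrative assumptions, not anything defined in the text:

```python
# Registries for the processing a data event triggers.
immediate_handlers = []   # steps performed as soon as the event occurs
deferred_queue = []       # steps held for activation after later events

def on_event(handler, deferred=False):
    """Register a processing step to run when the data event occurs."""
    (deferred_queue if deferred else immediate_handlers).append(handler)
    return handler

def fire(event):
    """The event occurrence: run the immediate steps now, leave the rest pending."""
    return [h(event) for h in immediate_handlers]

@on_event
def validate(event):
    return f"validated {event}"

def invoice(event):        # deferred: in the example, invoicing follows shipment
    return f"invoiced {event}"
on_event(invoice, deferred=True)

results = fire("order-123")      # immediate processing happens here
pending = len(deferred_queue)    # deferred processing still awaits later events
```

The point the sketch makes is the chapter's: a single event occurrence activates some processes at once and merely queues others.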
These events represent the activities of the business and ultimately generate the records and files of the business. With few exceptions, it is the business's ability to receive, process, store, and retrieve these transactions which determines its ability to grow and prosper. The reports generated from this data enable management to control, coordinate, plan and monitor organizational efficiency.
But it is the sum total of all the data events in order entry, customer service, credit, order fulfillment, shipping, invoicing and payment operations which are meaningful. Sales analysis, accounts receivable, inventory processing all reflect data event activity.
But, as in the preceding example of the order, the transaction in and of itself means little. It does, however, contain the raw materials (data) which drive many activities.
Integration of Business Activities
The activities of business are integrated. That is, they are connected and interdependent, and the connection is the data which drives them or, more precisely, the data they need to accomplish their objectives. What happens to, and because of, the data at each operational processing station determines what happens at subsequent stations. In an assembly line environment, one free from external stimulus and external change (in other words a completely controlled environment), the data can be viewed as flowing through the business, causing activity to occur and more data to be generated. While this is seemingly in conflict with our previously explained transactional view, it is in harmony with our entity activity life cycle view.
Under either view, each business task, each business activity etc., consists of a sequence of data retrievals, data process steps, archival activities for data being changed but whose old values need to be recorded for historic reasons (filing) and subsequent data update or maintenance steps, followed by recording of the newly changed data back in the corporate files.
The entity activity life cycle view of business operations treats this sequence of activities as its main focus and determines the retrieval, archiving and updating of all the data necessary to support the collection of activities or tasks within each of the identified entity life cycles.
Within the business, none of these processes is of long duration as measured from initiation to completion. The duration is dependent upon both the amount of data in the transaction initiator and the actions to be taken on the data contained in that initiator.
Transactions as Business Stimuli
If the organization can be viewed as an entity, the transactions are the stimuli which cause it to react. The more complex the stimulus, the more complex the reaction, and the longer its effects are felt.
All businesses react to data events which are incoming, and initiate data events which are outgoing. A data event is some occurrence, internal or external, which causes the business to record data for immediate or delayed retrieval and use. Delayed use could be for processing, reference or both. Rarely are data events recorded and forgotten. Each data event causes organizational reaction and, in a sense, can be viewed as a trigger of that reaction.
The data events of the organization can be identified and their (processing) reactions can be mapped (or modeled), starting from the event occurrence (receipt or action), through the point at which all of its data contents have been examined, referenced, validated, and recorded.
Data Entity Activity Time Lines as Data Event Identifiers
The mechanism of the entity time lines can identify all data events which could occur within the life of each data entity as defined by its life cycle. These events can be ordered along the time line by the major groups of activities which occurred within the life cycle of the entity. Each data event can then be mapped in terms of:
In the real world, business activities are data event driven, and data events are, for the most part, external. Data events are seen not in terms of single elements or element entities, but as complex aggregates of data which cause multiple impacts on the organization. Data event triggers, or initiators, thus contain many data elements which affect many data entities, and establish or modify many relationships.
Notice that this chapter refers to data entities, data actions, and data events. There are, to be sure, an equal number of process activities, process entities and process events. However, process implies manipulation and handling of existing data rather than generation of new data. Process results in information being generated. Process rarely generates data.
Data versus Information
There is a fundamental distinction to be made between data analysis and the data model, and information analysis and information models. Data are facts (plural) (datum is singular). Data is processed to become information. Data is thus the raw material from which information is extracted or generated. While raw data may have intrinsic value, structured, complete, timely data becomes information that is of immense value to the organization. The value of information to the firm is directly related to the accuracy of the initial data from which it was derived. The capture of data in a flexible, fully accessible form in the data base is thus important. Data bases are just that - bases of data; pools of raw, unprocessed facts. Data also has a structure which represents the manner in which the data was aggregated and classified. This structure shows implicit and explicit relationships, dependencies, levels of detail and access sequences - all of which are of value to the organization, enabling data, once recorded, to be retrieved.
The data entity time lines record and partially organize the data events. Each data event identified on each time line must be further described, either by physical examination or, in the case of hypothesized events, by hypothesizing as to the data content or data availability when the event occurs. Regardless of whether the processes or activities initiated by the data event are mental, manual or automated, the steps needed to fully record the data event in the files of the firm, and to determine where the data that triggered or initiated that event is to be recorded, must be documented. Data event analysis examines each data event in light of a limited set of data activities. These data activities are:
None of these data activities involve processing, per se. The activities of retrieve, update and verify are used to identify data requirements. How these work and help should become obvious in the discussion that follows. One goal of a design is to identify each of the data events for each of the entities and to use those data events to determine the descriptive attributes, operational attributes and data relationships of each of those data entities in light of the data introduced by each of the data events. The data event models document the steps that must be accomplished to fully record the data event into the corporate files.
Corporate data results from the corporation's day-to-day business activities and transactions, and from the records of actions taken by or against the data entities with which the corporation is concerned. Obviously, it is not enough to identify the data transactions; it is also necessary to identify how they are verified and validated and, based upon what is known about the data event, how and where they are to be recorded.
The data event is both real and conceptual. Its description must answer the questions: What is known about the content of the data event trigger? What does that data trigger look like and what can be gleaned from it, either explicitly or implicitly? What opportunity information or data is on the trigger? What are all the forms it can take? What are all the sources from which it can arise? What are all the shapes it can assume?
For instance, take the data event - order receipt. It can originate from the sales force, from a P.O., from a phone call, from a letter, from a magazine response card, from blind or unsolicited inquiries.
The ordering entity, or the entity for whom the order is placed, must be assumed to know nothing other than who it is and what is wanted. No assumptions can be made as to whether it is a new customer, an old (active or inactive) customer, or what. No assumptions can or should be made as to customer identifiability other than from what is assumed to be present on the order trigger. Nor can assumptions be made as to the completeness, accuracy or validity of the data received. The trigger is defined as that physical, oral or written vehicle of communication which initiates the data event. While an order is being used to illustrate the complexity of the procedure, the principles can be applied to any data event.
Additionally, some data events fold into others as they are modeled into the data activities necessary to record them.
In developing the data entity time line, data events were mapped to an entity. Data event modeling reverses the process: the data entities are mapped to the data event via data activities. Obviously, just as one data event can map to multiple data entities, so too can one data entity map to multiple data events.
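This many-to-many relationship between data events and data entities can be sketched as a simple mapping that is queried in either direction. The event and entity names are illustrative assumptions:

```python
# Each data event mapped to the data entities it touches.
event_to_entities = {
    "order receipt":   ["customer", "product", "order"],
    "payment receipt": ["customer", "invoice"],
}

def entities_for(event):
    """Data event modeling direction: event -> the entities it affects."""
    return event_to_entities[event]

def events_for(entity):
    """Time line direction: entity -> the events mapped to it."""
    return [ev for ev, ents in event_to_entities.items() if entity in ents]
```

Inverting the same mapping gives back the entity time line view, which is the symmetry the text describes.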
The goal is to identify what data must be referenced, the context within which that data must be referenced, the data that must be archived, updated, added or deleted, and the context within which that data must be archived, updated, added or deleted.
Data Event Triggers
Each data event model graphically illustrates the data activities triggered by that data event. Each data event model is accompanied by a narrative, describing it in detail. These narratives, without describing who or how, detail the assumptions made about the data event. The trigger document or event is described in all its shapes and forms. The probable and possible data event trigger contents are described answering the questions: What does it look like, what does it contain, who originated it (where does it come from)?
After defining and describing the trigger the information of interest contained within it can be analyzed. In other words, what new data or information does the document or event provide us about our environment? Does it provide insight or new information about its originator? Does it identify some new relationship, either implicitly or explicitly? Is it useful to record that new data, information or relationship?
Obviously, in any organization any given data carrier (in our on-going example, orders) can come in many forms. While generally similar, each form is different and must potentially be handled differently. Each contains a different combination of data elements. For every list of the different kinds of orders that can be developed, one or two additional kinds can always be added. This appears to imply that the design can never be complete. In a sense this is correct, and in a sense it is not. Because the world, and thus the environment, is constantly changing, it can be assumed that contrary to previous "static" assumptions, the inputs to the company will also change. But while the form and some content may change, the personality of those inputs, because they are used to communicate specific kinds of information, should not change drastically, if at all. Data event modeling ignores the form of the trigger and concentrates on the content. Identifying the various forms only helps to focus attention on the many different ways that the same data can appear.
The Conceptual Data Event
The design must focus on the conceptual role of the data event and the data carrier. For instance, what is an order? It conveys to the company a request for some product or service provided by the company. It also conveys the identity of the requestor (the customer). It specifies conditions, terms and delivery information, and of course the specifics as to what product or service is being requested, how many of each, what special options, etc. For the order to be of value, the firm must
This same order also initiates a variety of other activities, some of which will be performed immediately, and some which must be deferred for some reason to a later time. Data must be recorded to enable the firm to perform those later activities.
Data event actions only include the need to receive, validate and record. Little information is needed about the validation processing itself other than to state that validation is a comparison against other data, and to describe what that other data is, and the sequence in which it must be assembled to perform validation. The data event ends with the step that records the relevant, verified information in the files of the firm.
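The text defines validation as a comparison against other data already held by the firm, with the data event ending at the step that records the verified data. A minimal sketch, with the reference data and field names assumed for illustration:

```python
# Other data of the firm, against which the trigger's contents are compared.
reference = {
    "known_customers": {"Acme Corp"},
    "catalog": {"Widget", "Gadget"},
}
recorded = []   # the files of the firm, as far as this event is concerned

def validate_and_record(order):
    """Receive, validate (compare against other data), and record."""
    if order["customer"] not in reference["known_customers"]:
        return False                      # fails comparison against customer data
    if order["item"] not in reference["catalog"]:
        return False                      # fails comparison against product data
    recorded.append(order)                # final step: record in the files
    return True
```

Nothing here describes *how* validation is processed internally, only which other data each comparison is made against and the sequence in which it is assembled — which is all the data event model requires.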
The data event model details each data access step for validating and recording.
The narrative which accompanies each model describes, in user terms, the sources, content and role of the data event trigger in the business activities and its immediate or deferred effect on each business activity. All activities which result from that data event should be identified as well as all conditions under which it can occur.
Data Event Categories
As the data events are modeled, they can be grouped into the following categories:
It is immaterial whether the event involves mental, manual or automated actions, or a combination of all of these. It represents a data event that can be simulated manually and mentally, if necessary. The data event records which attributes of each entity are necessary, which are received, which attributes of each entity must be referenced or updated, and above all, why.
Data Event Narratives
The narrative describes the flow of data not only from data action to data action within a data event, but also the flow of data from the data entities of the firm, and to the data entities of the firm. It describes where the data carrier comes from, what it contains, the data that must be referenced to verify and validate it and where that data is located, and, from the company's overall standpoint, the data that will be changed, added or deleted as a result of this data event. The narrative describes only those actions that can be accomplished with the data contained in this data carrier and with data already in the files of the firm.
The data event describes what must be done, not how. It describes:
The end result of the systems analysis process is the identification of all existing system components (both data and task) and the framework within which those components operate. The task descriptions were in terms of the task steps and the data needed by each step. Data descriptions were in terms of the current organization of the data (data framework).
Because the final components of the analysis (the tasks and data) are known, can be seen, and can be enumerated, the analysis phase can be viewed as an inductive process.
The end result of the examination and study phase identifies existing components which should remain unchanged, existing components requiring change, requirements for additional capability (but not the new components themselves), and existing components which should be eliminated.
Because some of the components of the examination and study phase are known and others must be derived (they do not yet exist) this phase (almost always driven by sets of rules) is both inductive (enumerative) and deductive (rule driven).
The design phase identifies the new placement of existing components (within the redesigned framework) and the new components (based upon identified requirements). This identification of new components, while based upon identified requirements, is predictive in that nothing yet exists. The new components are only modeled, or described, not built. To the extent that the requirements are understood, the new components will be relatively easy to design; to the extent that the requirements are vague or ill-defined, the new components will be difficult to design.
The components which must be changed contain both existing and predictive pieces (the changes being the predictive parts) but since the changes are within an existing component framework or structure they are relatively easier to redesign than a completely new component.
Because designs are derived from logical processes (rule based) they are almost completely deductive. Opportunity analysis (the determination of additional benefits from the same set of resources, or the attainment of synergy where the whole is greater than the sum of its parts) is also a deductive process.
By removing the existing task orientation of the components and replacing it with a data event orientation, all components can be reexamined on an equal basis, and can be portrayed in a common form.
Benefits of Data Event Orientation
The data event orientation has one additional benefit in that it ties together both data and data action into a unified whole. The data event models in total validate the remainder of the design. Users do not view data and process separately, they are two sides of the same coin, with one determining the other. Data events recognize this view and use it to build the new system design.
Because each data event is data initiated, and because the data event focuses on the actions necessitated by that data, data events are a more stable form of representation; the data of the firm is more stable than the processes of the firm.
Because the data trigger or data carrier is treated in conceptual form rather than on a document-by-document basis, processing can be more easily proceduralized and thus can be more generalized and standardized.
Data events can be identified simply and easily by repeatedly asking the same question: What can happen that changes what I know about my environment?
Data events, because they are organized around, and derived from, the activities within each individual entity life cycle, are much simpler than standard process-generated transactions, because each data event brings in information about one entity.
If the data event can be identified, then it is a simple matter to determine how the firm was notified that the event happened, or to determine what the firm must know about the event. All data events have a trigger, tangible or intangible; all data events have something which tells us that they occurred. These "told-me"s are what tell us that something of interest has happened. If the designers cannot readily identify the "told-me" for a data event, they should easily be able to construct one.
Rule Based Design
A system design provides successive levels of detail and clarity to the collective set of user business operational requirement specifications. These user business operational requirement specifications can also be thought of as business rules. They specify how the user wishes the business to operate and what business requirements the user wishes to have satisfied by the system when it is implemented.
These (business) rules cover both the actual physical and non-physical parts of the system design.
Because of the extensive reliance on rules, the non-physical part has been called the logical design (that is the statements of business rules).
The on-going and routine operations of the business can be viewed as a chain of predictable and semi-predictable events, actions and activities where each event and each condition has a set of rules which govern how it must be handled. Each item of data that enters the firm is also governed by a set of rules which determine when each item may be considered valid, how that validation must take place, where that data should be filed and what must happen to the data already filed which covers the same subject.
The business rules of the firm cover every condition and association, with respect to validity, acceptability, and verification. If these rules do not exist they must be created. If they are not complete they must be augmented. If they do not reflect current conditions they must be changed.
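The per-item rules described above — when each item of data may be considered valid and where it should be filed — can be sketched as a small rule table. The item names, predicates, and file destinations are illustrative assumptions:

```python
# Each item of data that enters the firm is governed by a rule stating when it
# is valid and where it should be filed.
rules = {
    "quantity": {"valid": lambda v: v.isdigit() and int(v) > 0, "file_in": "order"},
    "customer": {"valid": lambda v: len(v.strip()) > 0,         "file_in": "customer"},
}

def apply_rules(item, value):
    """Return the destination file if the item passes its rule, else None.

    A None result for a known item means the data failed validation; a None
    result for an unknown item means a rule is missing and must be created.
    """
    rule = rules.get(item)
    if rule is None or not rule["valid"](value):
        return None
    return rule["file_in"]
```

The sketch also reflects the text's point that missing rules must be created: an item with no entry in the table simply cannot be accepted until a rule exists for it.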
Categories of Data Events
Although all events result in data being recorded, not all events are true data events; indeed, not all events are the same, data or otherwise. Business events may be categorized into the following:
While these events encompass most, if not all, business activity, and include processing, decision making, management, and operational activities, because we are designing business processing systems, and specifically business systems which process data, we are only interested at this point in the data processing portions of these events.
The designer's goal is to determine what activities the business would like to engage in, and more importantly what activities it must engage in. Those activities, once identified, must be organized into a reasoned, rational, logical (rule based) order, with as little redundant effort as possible and as much management visibility and control as possible, such that each activity is performed in its necessary order from the perceived start of the overall business life cycle to its perceived end.
Processes in the traditional sense are repetitive sets of activities with built in levels of freedom, interdependence and control. Data events have no built in levels of freedom, little if any interdependence and a high degree of control.
Data events have no decision making aspects. The data event is a schematic representation of a predetermined set of business rules whose execution is triggered by the occurrence of some event or the arrival of some data (also an event).
Documentation of Data Events
The data event model and narrative answer some, but not all, of the following questions:
Data events also document, to the extent possible, the conditions which govern the handling of each event. For instance:
Data events also cover internal actions such as changes in state, work readiness, work processed, credit approved or disapproved, item out of stock, item low in stock, etc.
All of these events are covered by business rules, which must be documented and which can be documented as data events.
Data Directed Systems Design - A Professional's Guide
Written by Martin E. Modell
Copyright © 2007 Martin E. Modell
All rights reserved. Printed in the United States of America. Except as permitted under United States Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means, or stored in a data base or retrieval system, without the prior written permission of the author.