4 Extensions to the Core Mapping

(1)

Formal Semantics and Automated Analysis of BPMN Process Models

^?

Remco M. Dijkman¹, Marlon Dumas², and Chun Ouyang²

1 Department of Technology Management, Eindhoven University of Technology, GPO Box 513, NL-5600 MB, The Netherlands

{r.m.dijkman}@tm.tue.nl

2 Faculty of Information Technology, Queensland University of Technology, GPO Box 2434, Brisbane QLD 4001, Australia

{m.dumas,c.ouyang}@qut.edu.au

Abstract. The Business Process Modelling Notation (BPMN) is a standard for capturing business processes in the early phases of systems development. The mix of constructs found in BPMN makes it possible to obtain models with a range of semantic errors. The ability to statically check the semantic correctness of models is thus a desirable feature for modelling tools based on BPMN. However, the static analysis of BPMN models is hindered by ambiguities in the standard specification and the complexity of the language. The fact that BPMN integrates constructs from graph-oriented process definition languages with features for concurrent execution of multiple instances of a subprocess and exception handling, makes it challenging to provide a formal semantics of BPMN.

Even more challenging is to define a semantics that can be used to analyse BPMN models. This paper proposes a formal semantics of BPMN defined in terms of a mapping to Petri nets, for which efficient analysis techniques exist. The proposed mapping has been implemented as a tool that generates code in the Petri Net Markup Language. This formalisation exercise has also led to the identification of a number of deficiencies in the BPMN standard specification.

1 Introduction

The Business Process Modeling Notation (BPMN) [11] has emerged as a standard notation for capturing business processes, especially at the level of domain analysis and high-level systems design. The notation inherits and combines elements from a number of previously proposed notations for business process modeling, including the XML Process Definition Language (XPDL) [16] and the Activity Diagrams component of the Unified Modeling Notation (UML) [10].

Like these predecessors, a key idea of BPMN is that process models are composed of: (i) activity nodes, denoting business events or items of work performed

?This work is partially supported by the Australian Research Council under Discovery Grant DP0451092. The second author was also funded by a Smart State Fellowship co-sponsored by Queensland Government and SAP.

(2)

by humans or by software applications; and (ii) control nodes capturing the flow of control between activities. Activity nodes and control nodes can be connected by means of a flow relation in almost arbitrary ways.

Languages that follow a similar paradigm, known as graph-oriented process definition languages, have been previously studied from a formal perspective (e.g.

the work on task structures [2]). It is known that models defined in this family of languages can exhibit a range of semantic errors, including deadlocks and livelocks. Furthermore, BPMN brings additional features not traditionally associated with graph-oriented languages, drawn from a range of sources including Workflow Patterns [5] and Business Process Execution Language (BPEL) [6], a standard for defining business processes at the implementation level. These features include the ability to define: (i) subprocesses that may be executed multiple times concurrently; (ii) subprocesses that may be interrupted as a result of exceptions; and (iii) message flows between processes. The interaction between these features and the more traditional features found in graph-oriented languages, increases the types of semantic errors that can be found in BPMN models. Hence, the ability to statically analyse BPMN models is likely to become a desirable feature for tools supporting process modelling in BPMN. Anecdotal evidence suggests that BPMN users are producing models with semantic errors that could be detected using existing verification technology.³

The semantic analysis of BPMN models is hindered by the heterogeneity of its constructs and the lack of an unambiguous definition of the notation. While syntactic rules are comprehensively documented in tables throughout the BPMN standard specification, the actual semantics is only described in narrative form using sometimes inconsistent terminology. This paper takes on the challenge of defining a formal semantics for a large subset of BPMN. The semantics is defined as a mapping between BPMN models and Petri nets. The choice of using plain Petri nets as a target for the mapping is motivated by the availability of efficient static analysis techniques. Thus, the proposed mapping not only serves the purpose of disambiguating the core constructs of BPMN, but it also provides a foundation to statically check the semantic correctness of BPMN models. To support this claim, we have implemented a tool that translates between the XML serialization of BPMN models supported by an existing BPMN tool, and the Petri Net Markup Language (PNML). The paper reports experiences in importing the resulting BPMN models into a Petri net analysis toolset for the purpose of performing semantic analysis.

The paper focuses on the control-flow perspective of BPMN, that is, the subset of the notation that deals with the order in which activities and events are allowed to occur. It does not deal with its non-functional features (i.e. artifacts, groups and associations) and organisational modeling features (i.e. lanes and pools). Also, the proposed mapping is specifically designed to produce Petri nets that are suitable for static analysis.

The rest of the paper is structured as follows. Section 2 provides an overview of BPMN and introduces an abstract syntax capturing the essence of the nota-

3 Seewww.brsilver.com/wordpress/2006/09/06/whats-wrong-with-this-picture/.

(3)

tion. Section 3 presents a mapping from BPMN to Petri nets. Possible exten- stions that can be made to this proposed mapping are discussed in Section 4.

Next, Section 5 addresses a number of deficiencies identified during the formalisation. Section 6 reports the corresponding tool implementation and its application to static analysis. Finally, related work is discussed in Section 7 while conclusions and future work are outlined in Section 8. A meta-model of BPMN defined during the tool implementation is included in an appendix.

2 Business Process Modeling Notation (BPMN)

2.1 Overview

BPMN essentially provides a graphical notation for business process modelling, with an emphasis on control-flow. It defines aBusiness Process Diagram (BPD), a kind of flowchart incorporating constructs tailored to business process modelling, such as AND-split, AND-join, XOR-split, XOR-join, and deferred (event- based) choice. A BPD is made up of BPMN elements. We consider a core subset of BPMN elements shown in Figure 1. There are objects and sequence flows.

An object can be an event, anactivity or agateway. A sequence flow links two objects in a BPD and shows the control flow relation (i.e. execution order).

Figure 1.A core subset of BPMN elements.

An event may signal the start of a process (start event), the end of a process (end event), a message that arrives (intermediate message event), a specific time-date being reached (intermediate timer event), or an error being detected (intermediate error event) during a process.

An activity can be atask or a subprocess. A task is an atomic activity and stands for work to be performed within a process. There are seven task types:

service,receive,send,user,script,manual, andreference. For example, a receive

(4)

task is used when the process waits for a message to arrive from an external partner. A subprocess is a compound activity in that it is defined as a flow of other activities. There areembedded subprocesses andindependentsubprocesses.

The difference is that an embedded subprocess is part of the process while an independent subprocess can be called by different processes. A subprocess can be invoked via asubprocess invocation activity.⁴

A gateway is a routing construct used to control the divergence and conver- gence of sequence flow. There are:parallel fork gateways(AND-split) for creating concurrent sequence flows, parallel join gateways (AND-join) for synchronizing concurrent sequence flows,data/event-based XOR decision gatewaysfor selecting one out of a set of mutually exclusive alternative sequence flows where the choice is based on either the process data (data-based, i.e. XOR-split) or external event (event-based, i.e. deferred choice), andXOR merge gateways(XOR-join) for join- ing a set of mutually exclusive alternative sequence flows into one sequence flow.

In particular, an event-based XOR decision gateway must be followed by either receive tasks or intermediate events to capture race conditions based on timing or external triggers (e.g. the receipt of a message from an external partner).

Finally, an intermediate message, timer, or error event attached to the boundary of an activity signals an exception. We use the term “exception event” to refer to such an event. The occurrence of the activity will be interrupted upon the occurrence of the exception, and the process execution along the normal sequence flow will switch to the exception flow at the point when exception occurs. Note that an intermediate error event on a normal sequence flow models “throwing” an error, while an error event attached on the boundary of the activity models “catching” an error. This is similar to the strictly hierarchical throw-catch mechanism used in most programming languages (e.g. Java).

2.2 Abstract Syntax of BPMN

A BPMN process, which is described by a BPD using the core subset of BPMN elements shown in Figure 1, is referred to as a core BPMN process. First, we define the syntax of a core BPMN process.

Definition 1 (Core BPMN Process). A core BPMN process is a tupleP = (O,A,E,G,T,S,T^R,E^S,EÎ,EÊ,EÎ^M,EÎ^T, EÎ^R,G^F,G^J,G^X,G^M,G^V,F, Cond,Excp)where:

– O is a set of objects which can be partitioned into disjoint sets of activities A, events E, and gateways G,

– Acan be partitioned into disjoint sets of tasksT and subprocess invocation activitiesS,

– T^R ⊆ T is a set of receive tasks,

– Ecan be partitioned into disjoint sets of start eventsE^S, intermediate events E^I, and end events E^E,

4 This is called acollapsed subprocess in the current BPMN specification [11].

(5)

– EÎ can be partitioned into disjoint sets of intermediate message events EÎ^M, intermediate timer eventsEÎ^T, and intermediate error eventsEÎ^R,

– G can be partitioned into disjoint sets of parallel fork gateways G^F, parallel join gatewaysG^J, data-based XOR decision gatewaysG^X, event-based XOR decision gatewaysG^V, and XOR merge gateways G^M,

– F ⊆ O ×Ois the control flow relation, i.e. a set of sequence flows connecting objects,

– Cond:F ∩(G^X× O)→ Cis a function which maps sequence flows emanating from data-based XOR gateways to conditions,⁵, and

– Excp:E^I 9A is a function assigning an intermediate event to an activity such that the occurrence of the event signals an exception and thus interrupts the performance of the activity.

A core BPMN process is a directed graph with nodes (objects) Oand arcs (sequence flows)F. For any nodex ∈ O, input nodes of x are given by in(x) = {y ∈ O |yFx} and output nodes of x are given by out(x) = {y ∈ O | xFy}.

Also, for a core BPMN processP, if ambiguity is possible, we useP as subscripts to each element defined in the tuple P. For example, SP refers to the set of subprocess invocation activities inP. Next, we define the syntax of a core BPMN model which may comprise a set of core BPMN processes.

Definition 2 (Core BPMN Model). A core BPMN model is a tuple M= (Q, top,S,map,HR)where:

– Qis a set of core BPMN processes, – top∈ Q is the top level process,

– S=∪_P∈QS_P is the set of all subprocess invocation activities,

– map: S → Q\{top} is an injective function which maps each subprocess invocation activity onto a core BPMN process, and

– HR={(P1,P2)∈ Q × Q | ∃_s∈S

P1map(s) =P2} is a connected graph.

Remark 1. The function map is defined as an injective function capturing di- rectly the concept of embedded subprocesses. For independent subprocesses, it is possible to call the same subprocess via different subprocess invocation activities in a BPMN model. In this case, we can generate multiple copies of the subprocess, assign different names to them, and then add them into the set Q.

This way the function map also captures the concept of independent subprocesses. Also note that BPMN does not forbid recursive calls, e.g., a process P1

invoking another processP2, andP2invokingP1. Thus, the relationship between a process and its subprocesses is an arbitrarily connected graph, as opposed to a tree or a directed acyclic graph (DAG).

5 C is the set of all possible conditions. A condition is a boolean function operating over a set of propositional variables. Note that we abstract from these variables in the control flow definition. We simply assume that a condition evaluates to true or false, which determines whether or not the associated sequence flow is taken during the process execution.

(6)

Definition 1 allows for graphs which are unconnected, not having start or end events, containing objects without any input and output, etc. Therefore we need to restrict the definition towell-formed core BPMN processes andmodels.

Note that these restrictions are without loss of generality and are to facilitate the definition of the mapping. It is possible to re-write, for example, an activity with multiple incoming arcs into an activity with only one incoming arc and preceded by an XOR merge gateway.

Definition 3 (Well-formed core BPMN Process). A core BPMN process P in Definition 1 is well formed if relationFsatisfies the following requirements:

– ∀ s ∈ E^S ∪dom(Excp), in(s) = ∅ ∧ | out(s) | = 1, i.e. start events and exception events have an indegree of zero and an outdegree of one,

– ∀ e∈ E^E, out(e) =∅∧ |in(e)|= 1, i.e., end events have an outdegree of zero and an indegree of one,

– ∀x ∈ A∪(E^I\dom(Excp)),|in(x)|= 1and|out(x)|= 1, i.e. activities and non-exception intermediate events have an indegree of one and an outdegree of one,

– ∀ g ∈ G^F ∪ G^X ∪ G^V: | in(g) | = 1 ∧ | out(g) | > 1, i.e. fork or decision gateways have an indegree of one and an outdegree of more than one, – ∀ g ∈ G^J ∪ G^M, |out(g)| = 1 ∧ |in(g)| >1, i.e. join or merge gateways

have an outdegree of one and an indegree of more than one,

– ∀g∈ G^V,out(g)⊆ E^I^M∪ E^I^T∪ T^R, i.e. event-based XOR decision gateways must be followed by intermediate message or timer events or receive tasks, – ∀ g ∈ G^X,∃ an order <g which is a strict total order over the set of flows

{g} ×out(g), and ∃ x ∈ out(g) such that ¬ ∃_f∈{g}×out(g)((g,x)<gf), i.e.

(g,x)is the default flow among all the outgoing flows from g,

– ∀x ∈ O,∃s ∈ E^S∪dom(Excp),∃e∈ E^E, sF^∗x ∧xF^∗e,⁶ i.e. every object is on a path from a start event or an execption event to an end event.

Definition 4 (Well-formed core BPMN Model). A core BPMN modelM given in Definition 2 is well-formed, iff Q is a set of well-formed core BPMN processes and the relationHR is a DAG.

3 Mapping BPMN onto Petri Nets

We establish a mapping of well-formed core BPMN models to Petri nets. We allow the usage of both labelled and non-labelled transitions. The labelled transitions model tasks and events. The transitions without a label (“silent” transitions) capture internal actions that cannot be observed by external users.

6 F^∗is a reflexive transitive closure ofF, i.e.xF^∗yif there is a path fromx toy and by definitionxF^∗x.

(7)

3.1 Task, Events, and Gateways

Figure 2 depicts the mapping from a set of BPMN task, events, and gateways to Petri-net modules. A task or an intermediate event is mapped onto a transition with one input place and one output place. The transition, being labelled with the name of that task or event, models the execution of the task or event. A start or end event is mapped onto a similar module except that a silent transition is used to signal when the process starts or ends. Gateways, except event-based decision gateways, are mapped onto small Petri-net modules with silent transitions capturing their routing behaviour. These mappings, as shown in Figure 2, are straightforward. For an event-based gateway, the race condition between events or receive tasks is captured in a way that all the corresponding event/task transitions compete for the token available in the output place from the mapping of the gateway’s input object. Finally, places, which are drawn in dashed borders, indicate that their usage is not unique to one module. They are used to link the Petri net modules of two connecting BPMN objects and therefore are identified by both objects. Generally, any sequence flow is mapped onto a place except for event-based decision gateways.

Figure 2.Mapping task, events, and gateways onto Petri-net modules.

3.2 Subprocess without Exception Handling

A subprocess may be viewed as an independent BPMN process. Without considering exception handling, the mapping of a subprocess is straightforward, as shown in Figure 3. Figure 4 depicts the mapping of calling a subprocess (P) via a subprocess invocation activity (SI). The two places drawn in dashed borders capture respectively the incoming and the outgoing flows of activity SI. There

(8)

are also two new transitions: one with an identifier t_(SI_,call₎modelling the invocation of the subprocess P, the other with t(SI,return) modelling the return of the flow to the parent process afterP has completed.

Figure 3.Mapping of a subprocessP without considering exception handling.

Figure 4.Calling a subprocessPvia a subprocess invocation activitySI.

It can be observed that the above subprocessP has a single start event and a single end event. For a well-formed core BPMN process, there may be multiple start events as well as multiple end events. A BPMN process with multiple start events can be transformed into one with a unique start event by using an event-based XOR decision gateway. Therefore, without loss of generality, we can assume that each (sub)process in the rest of the mappings has a single start event. On the other hand, the semantics of a process with multiple end events is not clear in the current BPMN specification (see further discussion in Section 5), and thus we have to restrict to a (sub)process with a single end event only.

3.3 Exception Handling

In BPMN, exception handling is captured by exception flow. An exception flow originates from an exception event attached to the boundary of an activity, which is either a task or a subprocess. Figure 5 shows the mapping of an exception associated with a task. Given that a task is an atomic activity, the occurrence of the exception may only interrupt the normal flow at the point when it is ready to execute the task. Hence, this can be viewed as that there is a race condition between the execution of the task and the occurrence of the exception.

Figure 5.Mapping of a taskT with an exception flow originating from eventEx.

For exception handling associated with a subprocess, the occurrence of the exception event will interrupt the execution of the normal flow within the subprocess when it is active (running). The mapping is then complicated by the fact

(9)

that it needs to capture the cancellation of the running subprocess atany point when the exception occurs. This means that, in the mapping of this subprocess, when the transition modelling the exception event fires, all the tokens left in the net need to be removed from various places without knowing where the tokens reside. Due to the local nature of Petri net transitions, it is cumbersome to model a so-called “vacuum cleaner” removing tokens from selected parts of the net [3].

Hence, to model the cancellation of a subprocess, we adopt another approach of bypassing tasks and events in the subprocess upon occurrence of the exception.

The basic idea is to attach a status flag to the subprocess, which may have a value ofokornok. If the flag is set took, it allows the normal flow to continue;

otherwise (the flag is set tonok), it signals the occurrence of the exception, and thereby stops the normal flow but enables to bypass the remaining tasks and events until the end of the subprocess. This way we direct all the tokens left in the various places in the net to flow to the end of the subprocess, without executing any remaining tasks or events on the way. After the above bypassing finishes, the normal flow will be withdrawn before the exception handling starts.

However, the above mapping of exception handling does not work properly in the case where an activity within the subprocess is enabled multiple times concurrently. In such a case, the activity would need to be bypassed multiple times (as many times as it is enabled) and thus a counter would be required to record how many times the activity needs to be bypassed. Hence, we have to impose two restrictions when mapping subprocesses with exception handling.

First, we will be able to map a subprocess with exception handling, only if this subprocess (without considering the exception handling) can be mapped onto a1-safe Petri net. For a subprocess that does not satisfy this condition, we cannot ensure in the mapping that all enabled tasks/events in the subprocess are correctly bypassed (i.e. cancelled) before the exception handling starts. Thus, given a subprocess with exception handling, we will first translate it into a Petri net; then, we will check whether this Petri net is 1-safe, and only if it is 1-safe, we would be able to translate the associated exception handling.

Second, the fact that a subprocess on itself is 1-safe ensures that, assum- ing the subprocess is not executed multiple times concurrently, none of its tasks/events will ever be executed multiple times concurrently. However, we still need to ensure that, once the subprocess has been invoked, it is not invoked again until the first invocation has completed. This condition may be violated as a result of the “unsafeness” coming from “upstream” in the process model.

That is, we need to ensure that the fragment of the model that precedes the subprocess invocation is also 1-safe. This scenario is different from the previous one in that the cause for multiple concurrent enablements/executions of a given task/event is external to the subprocess. Rather than excluding these scenarios from the mapping, we propose to map such scenarios into a Petri net which would prevent a subprocess from being executed multiple times concurrently. This is achieved by introducing a “blocking mechanism” which effectively withholds a new execution of the subprocess until its previous execution has completed. This mechanism may be extended in such a way that a subprocess is allowed to be

(10)

executed a bounded number of times concurrently. However, we will not be able to discuss such extension in this paper, as further effort on formalising this extended mechanism revealed that the semantics of the exception handling for concurrent executions of a subprocess is not clear in the BPMN specification (see detailed discussion in Section 5).

Figure 6 depicts the mapping of an exception flow associated with a subpro- cessP via a non-error exception eventEx. The two places,p_(P_,ok)andp_(P_,nok), model theok andnokvalues of the status flag attached toP, respectively. As soon asP starts, the flag is set took, and each task or event along the normal flow inP needs to check this value (via the bidirectional arc top_(P,ok)) before it can be executed. If the exceptionEx occurs before subprocessP ends, the value of the flag will change fromoktonok. As a result, any remaining task or event in P will be skipped (e.g. transition t_(T,skip) models the skipping of task T).

Finally, just before reaching the end of P, the flow switches to the exception handling (which starts with taskTx) via transition t_(P_,excp). The occurrence of t_(P,excp) also clears the nok value of the status flag. Whereas, if no exception occurs, the flag will remainokuntil the end of subprocessP. Note that there is another place namedp_(P,excp)of which the details will be discussed shortly.

Figure 6.Mapping of a subprocessPassociated with an exception flow via a non-error exception eventEx.

In Figure 6, the net enclosed within the dotted box, which is the mapping of subprocessP without considering the exception handling, must be a 1-safe net.

Also, we restrict ourselves to the case where the subprocess is only allowed to be executed at most once at a time. According to this, a place named p_(P_,enabled) and its surrounding arcs are added to ensure that a new execution ofP will have to wait until the previous execution ofP finishes before it can start. Intuitively, the placep(P,enabled)can be viewed as holding a “resource” for execution of the subprocessP. This resource is created when the top level process starts and will be collected when the top level process ends.

Note that a special case of exception handling is the “throw-catch” error exceptions. Figure 7 shows the corresponding mapping as an invariant of the mapping shown in Figure 6. As mentioned before, an error event (Ex) attached to the boundary of a subprocess models “catching” an error exception, and it must have a matching error event (Et) on the normal flow of the subprocess which models “throwing” that error exception. Therefore, bothEt andExsignal

(11)

the occurrence of one error exception, and can be viewed as one atomic action.

They are modelled by one transition namedTEx in Figure 7.

Figure 7. Mapping of a subprocessP associated with an exception flow via “throw- catch” error exception.

Finally, taking into account the exception handling, we may need to extend the mapping of a subprocess with cancellation. Assume that a subprocess P is nested within another subprocessP⁰(i.e.P⁰ is the “parent” ofP). The execution ofP may be cancelled at any point due to the cancellation ofP⁰, despite whether or not there is an exception associated withP. Figure 8 shows the corresponding mapping. The transitiont_(P,cancel) is used to capture the trigger for cancelling the execution ofP, i.e. the cancellation ofP’s parent subprocessP⁰. Note that each task or event inP also needs to check theokstatus ofP⁰ (via the bidirectional arc top_(P0,ok)). This is to ensure that onceP⁰ is cancelled the execution of P stops immediately. In Figure 8, when the bypassing has finished, the flow still continues along the normal flow (via transitiont_(P_,nok)), as opposed to the mapping of exception handling in Figure 6 where the flow switches to the exception flow (via transitiont_(P_,excp)) at that point. Also, it is necessary to add two placesp_(P_,excp) andp_(P,cancel)to ensure correct execution of transitiont_(P,excp) ort_(P,nok). More generally, the execution of subprocessP may be cancelled due to the cancellation of one of its “ancester” subprocessP⁰⁰ (e.g. the “parent” of subprocessP⁰). In this case, the cancellation can be viewed as propogated from, e.g.,P⁰⁰ toP⁰ and thenP⁰ toP. The mapping is given in the next subsection.

Figure 8. Mapping of a subprocessP that may be cancelled due to the cancellation of its parent subprocessP⁰.

(12)

3.4 Formal Definition of the Mapping

We formally define the mapping of BPMN to Petri nets (using set notations).

To facilitate the definition, we introduce two auxiliary functions par and anc.

LetM = (Q, top, S, map, HR) be a well-formed core BPMN model. For any subprocessP ∈ Q,par(P) is the parent process ofP (i.e. (par(P),P)∈HR), and anc(P) ={P⁰ | P⁰HR⁺P}⁷ is the set of ancester processes ofP. Also, we define two predicates_HasEHand_AncHasEH, both operating over a set of subprocesses.

For any subprocessP ∈ Q,_HasEH(P) is used to indicate if there is an exception associated with P. It returnstrue iff map^-1(P) ∈range(Excp_par(P)). Similarly, AncHasEH(P) is used to indicate if there is an ancester process ofP which has an exception. It returnstrueiff∃ P⁰ ∈anc(P) such that_HasEH(P⁰) is true.

Definition 5 (Petri net semantics of well-formed core BPMN models).

Let M= (Q, top,S, map,HR) be a well-formed core BPMN model (given in Definition 4). Without considering exception handling, Mcan be mapped onto a preliminary Petri net PN⁰= (P⁰,T⁰,F⁰)where:

P⁰=S

P∈Q({ps|s∈ E_P^S} ∪ – start event

S

P∈Q({pe|e∈ E_P^E} ∪ – end event

S

P∈Q({p(x,y)|(x,y)∈ FP∧x6∈ G^V_P}) – sequence flow

T⁰=S

P∈Q({x|x∈ TP∪ E_P^I ∧in(x)6=∅} ∪ – task/interm.-event S

P∈Q({tx|x∈ G_P^F∪ G_P^J} ∪ – fork/join

S

P∈Q({t(x,y)|x∈ G^X_P∧y∈out(x)} ∪ – data decision

S

P∈Q({t_(x,y)|x∈ G^M_P ∧y∈in(x)} ∪ – merge

S

P∈Q({ts|s∈ E_P^S} ∪ – to start a process

S

P∈Q({te|e∈ E_P^E} ∪ – to end a process

S

P∈Q({t_(x,call)|x∈ SP} ∪ – subprocess call

S

P∈Q({t(x,return)|x∈ SP}) – subprocess return

F⁰=S

P∈Q({(p_(x,y),y)|y∈ TP∪ E_P^I ∧x∈in(y)\G_P^V} ∪ S

P∈Q({(y,p_(y,z))} |y∈ TP∪ E_P^I ∧in(y)6=∅∧z∈out(y)} ∪ – task/interm.-event S

P∈Q({(p(x,y),t_y)} |y∈ G^F_P∪ G_P^J ∧x∈in(y)} ∪ S

P∈Q({(t_y,p_(y,z))} |y∈ G_P^F∪ G_P^J ∧z∈out(y)} ∪ – fork/join S

P∈Q({(p_(x,y),t_(y,z))|y∈ G^X_P∧x∈in(y)∧z∈out(y)} ∪ S

P∈Q({(t(y,z),p_(y,z))|y∈ G_P^X∧z∈out(y)} ∪ – data decision S

P∈Q({(p_(x,y),t_(y,x))|y∈ G_P^M∧x∈in(y)} ∪ S

P∈Q({(t_(y,x),p_(y,z))|y∈ G^M_P ∧x∈in(y)∧z∈out(y)} ∪ – merge S

P∈Q({(p_(x,y),z)|y∈ G_P^V∧x∈in(y)∧z∈out(y)} ∪ – event decision S

P∈Q({(ps,t_s)|s∈ E_P^S} ∪ S

P∈Q({(ts,p_(s,y))|s∈ E_P^S ∧y∈out(s)} ∪ – process start

S

P∈Q({(p_(x,e),te)|e∈ E_P^E∧x∈in(e)} ∪ S

P∈Q({(te,p_e)|s∈ E_P^E} ∪ – process end S

P∈Q({(p_(x,y),t_(y,call))|y∈ SP∧x∈in(y)} ∪ S

P∈Q({(t(y,call),p_s)|y∈ SP∧s∈ E_map(y)^S } ∪ – subprocess call S

P∈Q({(p_e,t_(y,return))|y∈ SP∧e∈ E_map(y)^E } ∪ S

P∈Q({(t(y,return),p_(y,z))|y∈ SP∧z∈out(y)}) – subprocess return

Then, by taking into account exception handling,M is mapped onto a Petri net PN_M= (P_M,T_M,F_M)where:

7 HR⁺ is a (non-reflexive) transitive closure, i.e.xHR⁺y⇒x 6=y.

(13)

PM=P⁰∪S

P∈Q∧(HasEH(P)∨AncHasEH(P)){p_(P,ok),p_(P,nok)} – status flag P⁰∪S

P∈Q∧(HasEH(P)∨AncHasEH(P)){p(P,enabled)} – resource place P⁰∪S

P∈Q∧HasEH(P){p(P,excp)} – exception P⁰∪S

P∈Q∧AncHasEH(P){p(P,cancel)} – cancellation TM=T⁰∪S

P∈Q{x|x∈(E^I_P\E_P^IR)∩dom(Excp_P)} – non-error exception P⁰∪S

P∈Q∧(HasEH(P)∨AncHasEH(P))

T⁰∪S

{t_(x,skip)|x∈ TP∪ E_P^I ∧in(x)6=∅} – skip task/interm.-event

P⁰∪S

P∈Q∧HasEH(P){t_(P,excp)} – exception P⁰∪S

P∈Q∧AncHasEH(P){t_(P,cancel),t_(P,nok)} – cancellation FM=F⁰∪S

P∈Q{(p_(x,y),z)|y∈ TP∧x∈in(y)∧z=Excp^-1_P(y)}

P⁰∪S

P∈Q{(z,p_(z,x))|(∃_y_∈TPz=Excp^-1_P(y))∧x∈out(z)} – exception@task P⁰∪S

P∈Q∧(HasEH(P)∨AncHasEH(P)) –↓exception@subprocess P⁰∪S

({(ts,p_(P,ok))|s∈ E_P^S} ∪ {(p(P,ok),t_e)|e∈ E_P^E} ∪ – okstatus P⁰∪S({(p_(x,y),t_(y,skip))|y∈ TP∪ E^I_P∧x∈in(y)\G^V_P} ∪

P⁰∪S

({(p(z,x),t_(y,skip))|x∈ G_P^V∧y∈out(x)∧z∈in(x)} ∪

P⁰∪S({(t_(y,skip),p_(y,z))|y∈ TP∪ E_PÎ ∧in(y)6=∅∧z∈out(y)} ∪ – skip task/interm.-event P⁰∪S({(p(P,ok),y)|y∈ TP∪ E_PÎ\E_PÎR∧in(y)6=∅} ∪

P⁰∪S({(y,p_(P,ok))|y∈ TP∪ E_P^I\E_P^IR∧in(y)6=∅} ∪ – check okstatus P⁰∪S

({(p(P,nok),t_(y,skip))|y∈ TP∪ E_P^I ∧in(y)6=∅} ∪

P⁰∪S({(t_(y,skip),p(P,nok))|y∈ TP∪ E_P^I ∧in(y)6=∅} ∪ – check nokstatus P⁰∪S

({(ts,p(P,enabled))|s∈ E_top^S } ∪ P⁰∪S

({(p(P,enabled),t_e)|e∈ E_top^E } ∪ – create/collect resource

P⁰∪S

({(p(P,enabled),t_s)|e∈ E_P^S} ∪

P⁰∪S({(te,p(P,enabled))|e∈ E_P^E})∪ – consume/release resource

P⁰∪S

P∈Q∧HasEH(P)

P⁰∪S

({(p(P,ok),x)|(Excp_par(P)(x) =map^-1(P)∧x6∈ E^IR_par(P)) P⁰∪S({(p(P,ok),x)| ∨x∈ E_P^IR\dom(Excp_P)} ∪

P⁰∪S({(x,p(P,nok))|(Excp_par(P)(x) =map^-1(P)∧x6∈ E^IR_par(P))

P⁰∪S({(x,p(P,nok))| ∨x∈ E_P^IR\dom(Excp_P)} ∪ – status-change@exception P⁰∪S({(x,p(P,excp))|(Excp_par(P₎(x) =map^-1(P)∧x6∈ E_par(P)^IR )

P⁰∪S({(x,p(P,excp))| ∨x∈ E_P^IR\dom(Excp_P)} ∪ P⁰∪S

({(p(P,excp),t_(P,excp))} ∪ – mark/unmark exception

P⁰∪S({(p_(P,nok),t_(P,excp))} ∪ P⁰∪S

({(p(x,e),t_(P_,excp))|e∈ E_P^E∧x∈in(e)} ∪ – clear-nok@exception P⁰∪S({(t(P,excp),p_(x,y))|Excp_par(P)(x) =map^-1(P)∧y∈out(x)} ∪ – switch to exception flow

P⁰∪S({(t_(P,excp),p(P,enabled))}) – release-resource@exception

P⁰∪S

P∈Q∧AncHasEH(P)

P⁰∪S

({(p(P,ok),t_(P,cancel)),(t_(P,cancel),p_(P,nok))} ∪ – status-change@cancellation P⁰∪S({(t_(P,cancel),p_(P,cancel)),(p_(P,cancel),t_(P,nok))} ∪ – mark/unmark cancellation P⁰∪S

({(p(P,nok),t_(P,nok))} ∪

P⁰∪S({(p_(x,e),t_(P_,nok))|e∈ E_P^E∧x∈in(e)} ∪ – clear-nok@cancellation P⁰∪S

({(t(P,nok),p_e)|e∈ E_P^E} ∪ – continue@normal-flow

P⁰∪S({(t(P,nok),p(P,enabled))} ∪ – release-resource@cancellation

P⁰∪S(S

P0 ∈anc(P)∧(HasEH(P0)∨AncHasEH(P0))

P⁰∪S(S

({(p_(P0_,ok),y)|y∈ TP∪ E_P^I ∧in(y)6=∅} ∪ P⁰∪S(S

({y,(p_(P0,ok))|y∈ TP∪ E_P^I ∧in(y)6=∅})∪ – check okof ancesters

P⁰∪S({(p(par(P),nok),t(P,cancel))}) – trigger-cancellation@parent

4 Extensions to the Core Mapping

In this section, we discuss how to extend the scope of the mapping presented in the previous section to cover several advanced BPMN constructs, and also to capture the behaviour of interacting BPMN processes.

(14)

4.1 Activity Looping, Multiple Instances, and OR-Split Gateway

In BPMN, an activity may have attributes specifying its additional behaviour, such as looping and parallel multiple instances. Activity looping constructs capture both “while-do” and “do-until” loops. These are “macros” for structured looping of the activity, and thus can be replaced by the normal activity construct connected via sequence flow looping. Parallel multi-instance activity constructs reflect the programming construct “for-each”. In this case, the number of multiple instances (n) running in parallel is always known as a priori, either at design time or at runtime. Ifn is known at design time, the constructs are “macros” for concurrent executions of n copies of the activity, and thus can be replaced by n normal activity constructs enclosed within a pair of AND split and join gateways. Ifn is only known at run time, to capture the computation of n (at run time) it is necessary to apply some high-level net contructs (e.g. arc expressions in Coloured Petri nets), which is however beyond the scope of this paper.

An OR-split gateway is known as aninclusive OR decision gatewayin BPMN.

It is used for selecting any number of branches among all its outgoing flows. Since the behaviour of an OR-split gateway can be captured through a combination of AND-split and XOR-split gateways [5], the mapping can be achieved accordingly.

4.2 Interacting Processes

In BPMN, the communication between two interacting processes is captured via message flows. The two processes are located respectively within two separate pools, representing two participants (e.g. business entities or business roles). A message flow is used to show the transmission of messages between two participants via communication actions such as send task, receive task, or message event. In an abstract way, it can be mapped to a place with an incoming arc from the transition modelling the send action and an outgoing arc to the trasition modelling the receive action. An example of two interacting BPMN processes and their mapping to Petri nets are shown in Figure 9. It should be mentioned that a detailed extension to the proposed mapping to cover inter-process communication features is the subject of separate work.⁸

Figure 9.Mapping of two interacting BPMN processes.

8 See Consistency Checker tool and documentation athttp://is.tm.tue.nl/staff/

rdijkman/cbd.html

(15)

5 Issues in the BPMN Specification

During the formalisation, we identified a number of deficiencies in the BPMN specification. Below, we outline the most salient ones and discuss options for resolving them.

Completion of process model executions The BPMN specification does not clearly state when should an execution of a process model be considered to be “completed”. This is particularly an issue for process models with multiple end events since many options are possible in this case, e.g. is it enough that one end event occurs for the process instance to be completed, or should we wait for all end events to occur, or should we wait until no end event can be reached any more?

We could only find in the specification one statement regarding this issue: “the process MUST NOT end until all parallel paths have completed”. However, the notion of “parallel path” is not defined, nor is the notion of “completion of a path”. Completion policies for process models with multiple end tasks have been studied in [13]. This paper formalizes the notion that “an instance of a process model is completed when at least one of its end tasks has been executed at least once, and there is no other enabled task for that process instance”. We suggest that the BPMN specification should adopt this approach to define its completion policy. Interestingly, the adoption of this completion policy in BPMN would make it impossible to translate unsafe BPMN model into Workflow nets as ex- plained in [13]. Such models would need to be translated into nets that may have multiple “sink places” (i.e. the place with no outgoing arcs).

Exception Handling for Concurrent Subprocess Instances The BPMN specification is also unclear regarding the semantics of an exception handler attached to a parallel multi-instance activity that invokes a subprocess. It is not clear from the BPMN specification whether an exception thrown by an instance of such a subprocess and caught by an exception handler attached to the parallel multi-instance activity, should interrupt: (i) only the subprocess instance in question; or (ii) all instances of that subprocess. If the first of these two options was adopted, another ambiguity would need to be resolved, namely: should the interrupted instance of the subprocess be considered as “completed” for the purpose of determining the completion of the multi-instance activity in question? Indeed, a multi-instance activity that invokes a subprocess is completed, by default, if all the subprocess instances it spawns have completed.

Also, a subprocess may be executed multiple times as a result of unsafeness in the parent process model. If a process is not 1-safe, it may happen that one of its activities invokes a subprocess once, and while the subprocess instance spawned by this invocation is still executing, the same activity is executed again and thus invokes the subprocess a second time, thus leading to two subprocess instances that execute concurrently. Again, if an exception is thrown by one of these instances and is caught by an exception handler attached to the invocation activity, it is unclear whether this exception would only affect that subprocess instance, or all subprocess instances spawned by the invocation activity.

(16)

OR-Join Gateway The BPMN specification states that an OR-join gateway “will wait for (synchronize) all Tokens that have been produced upstream” and that the “Process flow SHALL continue when the signals (Tokens) arrive from all of the incoming Sequence Flow that are expecting a signal based on the upstream structure of the Process”. However, the notion of “upstream” is unclear, especially when the OR-join is part of a cycle in the process model, in which case the OR-join is “upstream” with respect to itself. Thus, situations may occur in which the firing of a given OR-join depends on whether or not this same OR-join may eventually fire, leading to a vicious cycle. The semantics of OR-join gateways has been extensively studied for other process modeling languages, most notably YAWL [17]. It is perhaps best for the BPMN specification to adopt this same semantics.

6 Analysis of BPMN Models

The mapping from BPMN to Petri nets presented in Section 3 serves as a specification for a tool that transforms BPMN models into Petri nets. This section shows how we implemented such a tool (Section 6.1) and how it can be used to semantically analyse the BPMN models (Section 6.2).

6.1 Tool Implementation

Figure 10 shows the structure of the tool that we implemented. The tool uses standard file formats to keep it as open as possible. It is freely available at http://is.tm.tue.nl/staff/rdijkman/cbd.html#transformer.

Figure 10.Structure of the BPMN to Petri net transformation tool.

The tool takes an XML Metadata Interchange (XMI) [9] file that contains the model as input. XMI is a standardized file format for storing models, such that if there is agreement on the meta-model, the XMI is tool independent. In that way all tools that conform to the XMI standard and a meta-model can seamlessly exchange models (which are instances of that meta-model). Seamless

(17)

exchange of models would also be possible between our transformation tool and graphical modelling tools. However, to the best of our knowledge, no meta-model has been standardized for BPMN yet. Pending such a standard we defined our own meta-model, which follows straightforwardly from the BPMN specification.

This meta-model is given and illustrated in the Appendix.

We use the ILog BPMN Modeller⁹ as a graphical editor to create BPMN models. Since this tool does not generate standard XMI output, we need a simple pre-processor to transform the tools output into XMI.

The transformation tool subsequently transforms the BPMN model into a Petri net and exports the Petri net as a PNML file. PNML [8] is a standardized file format for storing Petri net models. PNML files can be read by a number of Petri net modelling and analysis tools.

6.2 Analysis and Examples

We used ProM [15], which accepts PNML files as input, to semantically analyse the BPMN models. ProM can check for the following properties:

– Absence of dead tasks: this is to ensure that there are no tasks that can never be performed within a model. It can be checked through the absence of dead transitions within the corresponding net.

– Absence of incomplete process executions: this is to ensure that a process instance can always complete, and that when the process instance completes, no tasks of that process instance are left active. This property can be checked through the absence of deadlocks and the absence of states with trash tokens in the corresponding net. A trash token is a token that remains in the net when the corresponding process has already completed. Such a token signi- fies that, even though the process has completed, tasks may be active. The marking representing the process completion is a marking where only the sink place is marked and all the other places are empty. Any dead marking that is different from this marking represents either a deadlock (the sink place is unmarked and no transitions can be enabled) or a state with trash tokens (in addition to the sink place, some other places are also marked).

We tested these properties for a selection of BPMN models that can be downloaded along with the tool. Figure 11 shows three of these BPMN models and their mappings to Petri nets as obtained from our transformation tool. The first process, an order process in (a), may not complete properly. If the credit card check fails, the process will complete but the preparation of products still remains active. The second process, a travel itinerary process in (b), does not complete at all. The reason is that initially there is a choice toeither confirm the itinerary or discuss it with the client, while the process only completes if both these tasks are executed. The third process, an answer process in (c), contains dead tasks, because an e-mail is never sent. This might indicate a design error, e.g., the designer forgot to draw a flow between the two data-based decision gateways at the start of the process.

9 http://www.ilog.com/products/jviews/diagrammer/bpmnmodeler/

(18)

Figure 11.Examples of BPMN process models and transformations to Petri nets.

7 Related Work

To the best of our knowledge, this paper documents the first attempt at defining a formal semantics for a comprehensive subset of BPMN. On the other hand, formal semantics have been defined for other informal languages that share common features with BPMN. For example, [2] defines a mapping from a language called workflow task structures into Workflow nets while [1] provides a similar mapping for Event-driven Process Chains (EPCs). Task structures are composed of tasks, AND split and join gateways, and XOR split and join gateways. Task structures support subprocess invocation, but not exception handling or multiple instances of subprocesses as in BPMN, thus making their mapping to Petri nets easier. A task structure can have multiple sink tasks, like BPMN can have multiple end events. In task structures the intended termination semantics is that ofimplicit termination as defined in [13] – that is, an instance of the process model is considered to be completed when one of the sink tasks has been performed and no other task is active or enabled. To map this feature into Petri nets, the mapping defined in [2] uses so-called “shadow places” that keep track of the number of active parallel streams. Termination is detected once the number of streams goes back to zero. However, this solution only works when the resulting net is bounded and in addition, it uses weighted arcs with potentially large weights. This idea can be used to extend the proposed BPMN mapping to deal with implicit termination, but it has an adverse effect on the complexity of analysis algorithms due to the use of weighted arcs.

Next, the mapping of EPCs in terms of Workflow nets (a subclass of Petri nets) provided in [1] is similar to the one for task structures discussed above.

The main difference is that EPCs have OR-join connectors, which is present as OR-join gateways in BPMN.

(19)

BPMN also shares many common features with Business Process Execution Language [7] for which a number of formal semantics in terms of Petri nets and other formal models of concurrency have been defined [12, 14]. In particular, BPEL provides an equivalent to XOR and AND gateways, although in a block-structured manner, i.e., every AND-split (XOR decision) gateway has a corresponding AND-join (XOR merge gateway) with which it forms a block that has a single entry and a single exit point. BPEL also supports error handling using a throw-catch style. The Petri net mapping for exception handling in BPMN given in this paper is partially based on the one developed for BPEL in [14].

8 Conclusions

The BPMN standard specification is relatively detailed when it comes to specifying syntactic constraints on BPMN models, but it is unsystematic and sometimes inconsistent when it comes to defining their semantics. The lack of formal semantics of BPMN hinders on the development of tool support for checking the correctness of BPMN models from a semantic perspective. This paper has taken a first step to address this gap by providing a mapping from a comprehensive subset of BPMN to Petri nets. The mapping has been implemented in a tool and its application to verifying the soundness and liveness of BPMN models has been tested using the ProM framework. In addition, this formalisation exercise has permitted us to unveil a number of issues in the BPMN specification and to suggest possible solutions.

The proposed mapping does not fully deal with: (i) parallel multi-instance activities; (ii) exception handling in the context of subprocesses that are executed multiple times concurrently; and (iii) OR-join gateways. These three missing features coincide with the limitations of Petri nets that motivated the design of YAWL [4]: a workflow definition language that extends Petri nets with a number of high-level features. In future work we plan to adapt the proposed mapping so that it can generate YAWL nets, especially in those cases where a translation to Petri nets is not feasible. The resulting YAWL nets can then be analysed using techniques such as those described in [17]. Of course, the tradeoff is that verification of YAWL nets is computationally more complex than the corresponding verification problems on Petri nets.

References

1. W.M.P. van der Aalst. Formalization and Verification of Event-driven Process Chains. Information and Software Technology, 41(10):639–650, 1999.

2. W.M.P. van der Aalst and A.H.M. ter Hofstede. Verification of Workflow Task Structures: A Petri-net-based Approach. Information Systems, 25(1):43–69, 2000.

3. W.M.P. van der Aalst and A.H.M. ter Hofstede. Workflow patterns: On the expres- sive power of (Petri-net-based) workflow languages (invited talk). InProceedings of 4th Workshop on the Practical Use of Coloured Petri Nets and CPN Tools (CPN 2002), volume 560 ofDAIMI, pages 1–20. University of Aarhus, Demark, 2002.