A semi-automatic approach for workflow staff assignment - Faculty of ...

Y. Liu et al. / Computers in Industry 59 (2008) 463–476 471 

represents the activity number, which is defined in decreasing 

order of their training set sizes. 

In Fig. 7, the first row of subfigures (subfigure a–c) illustrates 

each activity’s training set size. The vertical axis represents the 

number of training samples for each activity. In the real 

situation, it reveals the number of times that the target activity 

has been executed. The second row of subfigures illustrates the 

number of classes for each training set. The vertical axis 

represents the class number. In our approach, the class is 

represented by actor’s identity. Therefore, this number 

demonstrates how many different actors have executed the 

target activity. The last row of subfigures depicts the number of 

features for each training set. The vertical axis is the feature 

number. Because we use previous activities of the target activity 

as features, this number demonstrates how many activities in a 

workflow model have the precedence relationship with the 

target activity. 

For example, in enterprise A (the first column of subfigures 

in Fig. 7), the property of the first activity corresponds to the left 

most pixel of each subfigure. This activity is a review activity of 

a CAD drawing design process, which is the second activity of 

the workflow model. 58 different actors executed it for 316 

times in the past. Therefore, the training set of this activity 

contains 316 training samples, 58 classes and 2 features (one 

feature for the actor of the first activity and the other for the 

initiator of the workflow instance). It is important to mention 

that, in Definition 3, the start node is also included in the 

precedence activity sequences to represent the instantiation 

event of workflow, so there is always one more feature than the 

number of activities. 

According to Fig. 7, we may find that the size of the 

training set is not the same for each activity, which implies a 

different execution frequency. However, this unbalanced 

execution frequency does not have much impact on the 

number of past actors (i.e., the second row of subfigures), 

which means the activities are designed to be undertaken by 

many actors. Moreover, judging from the number of features, 

we can see that the activities that need run-time staff 

assignment are located in different places of the workflow 

model. 

4.2. Prediction accuracy of classifiers 

Fig. 8 shows the prediction accuracy of 10-fold crossvalidation 

for the selected activities in three enterprises. In 

order to provide a reference, we connected the pixels of Naïve 

Bayes classifiers using dotted lines. As shown in the figure, the 

prediction accuracy is not the same for all the selected 

Fig. 8. Prediction accuracy of classifiers in three enterprises. 

activities. Some of them are well predicted by the generated 

classifier (>80%), but some of them are not. However, most 

classifiers have achieved a prediction accuracy that is greater 

than 50%. Just few activities’ staff assignment is poorly 

predicted in three enterprises. 

4.3. Overall performance of learning algorithms 

Table 4 lists the overall prediction accuracy of three 

algorithms. We use the percentage of correct predictions as a 

measure of overall performance. According to the result, in 

three enterprises, all the algorithms have achieved an overall 

prediction accuracy of well over 75%. SVM generally performs 

the best, but only marginally. 

Table 5 lists the training and testing time of three algorithms. 

The testing time of three algorithms is very quick, but the 

training time differs greatly. Naïve Bayes performs the best in 

terms of time consumption. In contrast, SVM’s training time is 

very long. 

4.4. Analysis of some special cases 

As a further investigation, we evaluated some special cases. 

Table 6 lists the properties of some worst predicted activities in 

three enterprises (the first, third and fifth rows). Their related 

workflow models are depicted in Fig. 9. As a comparison, we 

also provide some of the best predicted activities in Table 6 (the 

second, fourth and sixth rows). 

Table 4 

Overall prediction accuracy in three enterprises 

Enterprise Number of predictions Number of correct predictions Prediction accuracy (%) 

C4.5 Naïve Bayes SVM C4.5 Naïve Bayes SVM 

A 3,447 2,739 2,532 2,740 79.460 73.455 79.489 

B 13,483 10,669 10,294 10,712 79.129 76.348 79.448 

C 24,347 20,028 19,766 20,178 82.261 81.185 82.877

Previous page

Next page

1

2

3

4

5

6

7

8

9

10

11

12

13

14

A semi-automatic approach for workflow staff assignment - Faculty of ...

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?