17.04.2015 Views

DARPA ULTRALOG Final Report - Industrial and Manufacturing ...

DARPA ULTRALOG Final Report - Industrial and Manufacturing ...

DARPA ULTRALOG Final Report - Industrial and Manufacturing ...

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

A DI value greater than one implies that the particular<br />

parameter can help in discriminating between the situations<br />

(more discrimination power). We calculated DI values for<br />

38 behavioral parameters <strong>and</strong> selected those parameters with<br />

DI values larger than one. This resulted in 10 parameters,<br />

comprising of eight statistical <strong>and</strong> two deterministic<br />

parameters, as shown in Table III. ‘# of events’ from ‘Task<br />

Arrival’ <strong>and</strong> ‘Time to Solution’ was the most discriminating<br />

behavioral parameter. Note that ‘# of events’ is the same for<br />

both time series’ as arrived tasks are processed.<br />

TABLE III<br />

DISCRIMINABILITY INDEX OF BEHAVIORAL PARAMETERS<br />

Rank DI Time Series Parameter<br />

1 2477.5 Task Arrival/Time to Solution # of events<br />

2 5.7 Time to Solution(G) Variance<br />

3 5.1 Time to Solution(G) Radius<br />

4 4.4 Time to Solution(G) Average<br />

5 4.2 Time to Solution(G) Maximum<br />

6 2.9 Queue length # of events<br />

7 2.8 Queue length Maximum<br />

8 2.2 Queue length AMI<br />

9 1.2 Queue length Average<br />

10 1.1 Time to Solution(C) L_Expo<br />

(G): Generation, (C): Completion<br />

V. SITUATION IDENTIFICATION<br />

Results from preliminary experimentation shows that 10<br />

of the 38 behavioral parameters have better discriminating<br />

power in the stress space. By using them as features we<br />

identify the stress situations using k-nearest neighbor<br />

classification algorithm.<br />

A. k-nearest neighbor algorithm<br />

k-nearest neighbor algorithm, one of the instance-based<br />

learning methods, is conceptually straightforward. The<br />

summary reported here is based on [14]. In this algorithm<br />

learning is simply storing the training instances, in which<br />

each instance corresponds to a point in the n-dimensional<br />

feature space. Given a new query instance k nearest<br />

neighbors are retrieved from memory <strong>and</strong> used to classify<br />

the new query instance. The nearest neighbors of an instance<br />

are defined in terms of the st<strong>and</strong>ard Euclidean distance. One<br />

problem in this algorithm is the sensitivity to noise axes in<br />

high dimensional problems. One possible solution would be<br />

to normalize each feature. However, normalization does not<br />

resolve this problem because Euclidean distance can become<br />

very noisy for high dimensional problems where only a few<br />

of the features carry classification information. The solution<br />

to this problem is to modify the Euclidean metric by a set of<br />

weights that represents the information content or goodness<br />

of each feature. Therefore, given a set of weights w the<br />

distance between two normalized instances x i <strong>and</strong> x j with m<br />

features can be calculated as in (2).<br />

m<br />

∑<br />

d( x , x ) = w[<br />

k](<br />

x [ k]<br />

− x [ k])<br />

(2)<br />

i<br />

j<br />

k = 1<br />

B. Empirical results<br />

i<br />

We performed 200 experiments with the same<br />

experimental configuration as in Fig. 2 to construct the<br />

database of training instances. Each training instance is<br />

represented with the 10 behavioral parameters. In this<br />

experimentation each agent’s OpTempo (Low/High) <strong>and</strong> the<br />

number of agent 1 (0/1/2) are r<strong>and</strong>omly chosen. Given a<br />

new instance we select 20 nearest neighbors (10% of the<br />

population of training instances) from the database <strong>and</strong><br />

estimate the stress situations of the agents in the society by<br />

using those neighbors. To address the effectiveness of DI we<br />

use 12 different sets of weights to calculate the distance. In<br />

the first 10 sets only one parameter is considered with other<br />

parameters’ weights equal to zero. We weight the<br />

parameters equally in the 11th set <strong>and</strong> proportionally to DI<br />

in the 12th set.<br />

We estimated the stress situations for 100 new instances<br />

using those 12 different sets of weights. To eliminate the<br />

noise from those agents that has no significant effect on the<br />

behavioral parameters, we removed the agents from our<br />

analysis that we cannot identify more than 2/3 by using any<br />

weight set. Through this procedure only 8 agents are<br />

selected. The results of correct estimation from different<br />

weight sets are shown in Fig. 4.<br />

Correctness (%)<br />

75<br />

70<br />

65<br />

60<br />

55<br />

50<br />

1 2 3 4 5 6 7 8 9 10<br />

On the whole, performance of behavioral parameters to<br />

identify the stress situations is quite correlated with DI of<br />

the parameters. As a parameter’s DI ranked high its<br />

estimation accuracy is also high. And, when weighted<br />

equally the performance is in the middle. But, when we<br />

weight proportionally to DI the performance becomes the<br />

highest. This result demonstrates the effectiveness of DI to<br />

get the goodness of the behavioral parameters for situation<br />

identification. Fig. 5 shows the performance for each agent<br />

when we weight proportionally to DI. As agents located<br />

farther from the TAO the performance becomes degraded.<br />

j<br />

DI Rank<br />

2<br />

Weighted w ith DI<br />

Weighted equally<br />

Fig. 4. Correct estimation using different weight sets

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!