DARPA ULTRALOG Final Report - Industrial and Manufacturing ...

DMAS Controller. The functional units of the architecture are distributed, but for each community that forms part of a MAS society, O* will be calculated by a single agent. We now examine the functionality and role of each component of the framework in greater detail.

4.1 Self-Monitoring Capability

Any system that wants to control itself should possess a clear specification of the scope of the variables it has to monitor. The TechSpecs is a distributed structure that supports this purpose by housing meta-data about all variables, X, that have to be monitored in different portions of the community (see [14]). The data and statistics collected in a distributed way are then aggregated by the top-level controller that each community possesses, to assist in evaluating control alternatives. The attributes that need to be tracked are formulated as measurement points (MP). For example, one simple measurement could be specified as {what = delay, when = every packet, how = timestamp at receiving end − timestamp at sending end}, which is subsequently stored in an MP. Each agent can look up its own TechSpecs and from time to time forward a measurement to its superior. The superior can analyze this information (e.g., calculate statistics such as mean or variance) and/or add to it and forward it again. We have measurement points for time periods, timestamps, operating modes, control, and generic vector-based measurements. These measurement points can be chained to track information for a flow, such that information is tagged on at every point the flow traverses. For the sake of reliability, the information contained in these agents is replicated at several points, so that when packets arrive late or not at all, previously stored packets and their corresponding information can be used for control purposes.
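The delay measurement described above can be illustrated with a minimal Python sketch. The class names `MeasurementPoint` and `Superior`, their fields, and the sample timestamps are hypothetical and chosen only for illustration; they are not the TechSpecs or measurement-point interfaces of the actual system.

```python
from dataclasses import dataclass, field
from statistics import mean, pvariance


@dataclass
class MeasurementPoint:
    """Hypothetical measurement point holding a {what, when, how} specification."""
    what: str
    when: str
    samples: list = field(default_factory=list)

    def record(self, sent_at: float, received_at: float) -> None:
        # how = timestamp at receiving end - timestamp at sending end
        self.samples.append(received_at - sent_at)


@dataclass
class Superior:
    """Hypothetical superior agent that aggregates forwarded measurements."""
    received: list = field(default_factory=list)

    def forward(self, mp: MeasurementPoint) -> None:
        # A subordinate forwards its stored measurements from time to time.
        self.received.extend(mp.samples)

    def summary(self) -> dict:
        # The superior analyzes the information, e.g. mean and variance.
        return {"mean": mean(self.received), "variance": pvariance(self.received)}


delay_mp = MeasurementPoint(what="delay", when="every packet")
for sent, received in [(0.0, 0.2), (1.0, 1.4), (2.0, 2.3)]:  # made-up timestamps
    delay_mp.record(sent, received)

superior = Superior()
superior.forward(delay_mp)
stats = superior.summary()
```

Here the superior only aggregates; in the chained case described in the text, it could also append its own measurements before forwarding the result further up.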
4.2 Self-Modeling Capability

One of the key features of this framework is its capability to choose a type of performance model for analyzing the current system configuration from several queueing model templates provided. The type of model used depends on the accuracy, the computation time, and the history of effectiveness of the model. For example, a simulation-based queueing model may be very accurate but cannot evaluate enough alternatives in limited time, in which case an analytical model (such as BCMP [5] or QNA [38]) is preferred. The inputs to the model builder are the flows that traverse the network (F), the types of packets (T), and the current configuration of the network. If at a given time we know that there are n agents interconnected in a hierarchical fashion, then the role of this unit is to represent that information in the required template format (Q). The current number of agents is known to the controller by tracking the measurement points. For example, if there is no response from an agent for a sufficient period of time, then for the purpose of modeling the controller may assume the agent to be non-existent. In this way dynamic configurations can be handled. On the other hand, TechSpecs do mandate connections according to superior-subordinate relationships, thereby maintaining the flow structure at all times. Once the modeling is complete, the MAS has the capability to analyze its current performance using the selected type of model. The MAS also has the flexibility to choose another model template for a different iteration.

4.3 Self-Evaluating Capability

The evaluation capability, the first step in control, allows the MAS to examine its own performance under a given set of plausible conditions. This prediction of performance is used to eliminate control alternatives that may lead to instabilities. Our notion of performance evaluation is similar to [34]. While Tesauro et al.
[34] compute the resource level utility functions (based on the application manager's knowledge of the system performance model) that can be combined to obtain a globally optimal allocation of resources, we predict the performance of the MAS as a function of its operating modes in real-time (within Queueing Model) and then use it to calculate its global utility (some more differences are pointed out in Section 4.4). By introducing a level of indirection, we may gain some desirable properties because we separate an application's domain-specific utility computation from performance prediction (or analysis). This theoretically enables us to predict the performance of any application whose TechSpecs are clearly defined and then compute the application-specific utility. In both cases, control alternatives are picked based on best utility. We discuss the notion of control alternatives in Section 4.4. Also, our performance metrics (and hence utility) are based on service level attributes such as end-to-end delay and latency, which is a desirable attribute of autonomic systems [34].

When plan, update and control tasks (as mentioned in Section 3) flow in this heterogeneous network of agents along predefined routes (called flows), the processing and wait times of tasks at various points in the network are not alike. This is because the configuration (number of agents allocated on a node), resource availability (load due to other contending software) and environmental conditions at each agent are different. In addition, the tasks themselves can be of varying qualities or fidelities, which affects the time taken to process a task. Under these conditions, performance is estimated on the basis of the end-to-end delay involved in a "sense-plan-respond" cycle.

The primary performance prediction tools that we use are Queueing Network Models (QNM) [5]. The QNM is the representation of the agent community in the queueing domain. As the first step of performance estimation, the agent community needs to be translated into a queueing network model. Table 1 provides the notation used in this section.

Table 1. Notation

Symbol    Description
N         Total number of nodes in the community
λ_ij      Average arrival rate of class j at node i
1/µ_ijk   Average processing time of class j at node i at quality k
M         Total number of classes
T_i       Routing probability matrix for class i
W_ijk     Steady-state waiting time for class j at node i at quality k
Q_ij      Set of qualities at which a class j task can be processed at node i

Inputs and outputs at a node are regarded as tasks. The rate at which tasks of class j are received at node i is captured by the arrival rate (λ_ij). Actions by agents consume time, so they are abstracted as processing rates (µ_ij). Further, each task can be processed at a quality k ∈ Q_ij, so the processing rates are represented as µ_ijk. Statistics of processing times are maintained at each agent in PDB to arrive at a linear regression model between quality k and µ_ijk. Flows are associated with classes of traffic denoted by the index j. If a connection exists between two nodes, it is converted to a transition probability p_ij, where i is the source and j is the target node. Typically, we consider flows that originate from the environment, get processed, and exit the network, which makes the agent network an open queueing network [5]. Since we may typically have multiple flows through a single node, we consider multi-class queueing networks where the flows are associated with a class. Performance metrics such as delays for the "sense-plan-respond" cycle are captured in terms of average waiting times, W_ijk.
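To make the translation into an open queueing network concrete, the following Python sketch evaluates a small single-class Jackson network, one of the model templates named in the text. It is a simplification under stated assumptions: one class and one quality level instead of the multi-class µ_ijk setting, exponential service at each node, and a hypothetical three-node chain with made-up rates. Here W_i is read as the mean time spent at node i (queueing plus service).

```python
def solve_traffic_equations(external, routing, iterations=1000):
    """Solve lambda_i = gamma_i + sum_m lambda_m * p_mi by fixed-point iteration.

    external[i] is the arrival rate from the environment at node i,
    routing[m][i] is the transition probability from node m to node i.
    """
    n = len(external)
    lam = list(external)
    for _ in range(iterations):
        lam = [external[i] + sum(lam[m] * routing[m][i] for m in range(n))
               for i in range(n)]
    return lam


def mm1_waiting_times(lam, mu):
    """Mean time at each node for an M/M/1 queue: W_i = 1 / (mu_i - lambda_i)."""
    waits = []
    for arrival, service in zip(lam, mu):
        if arrival >= service:
            raise ValueError("unstable node: arrival rate must be below service rate")
        waits.append(1.0 / (service - arrival))
    return waits


# Hypothetical chain: node 0 -> node 1 -> node 2 -> exit (illustrative rates).
external = [2.0, 0.0, 0.0]
routing = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]
mu = [4.0, 3.0, 5.0]

lam = solve_traffic_equations(external, routing)
waits = mm1_waiting_times(lam, mu)

# Little's law: mean end-to-end delay = (sum_i lambda_i * W_i) / total external rate.
end_to_end_delay = sum(l * w for l, w in zip(lam, waits)) / sum(external)
```

The fixed-point iteration is used only to keep the sketch free of external libraries; a direct linear solve of the traffic equations would serve equally well.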
As mentioned earlier, TechSpecs is a convenient place where information such as flows and Q_ij can be embedded. The choice of QNM depends on the number of classes, the arrival distribution and the processing discipline, as well as on a suggestion C from the DMAS controller, which makes this choice based on the history of prior effectiveness. Some analytical approaches to estimate performance can be found in [5, 38]. In the context of agent networks, Jackson and BCMP queueing networks are used to estimate performance in [14]. By extending this work we support several templates of queueing models (such as BCMP [5], Whitt's QNA [38], Jackson [5], M/G/1, and a simulation) that can be utilized for performance prediction.

4.4 Self-Controlling Capability

In contrast to [34], we deal with optimization of the domain utility of a single MAS that is distributed, rather than allocating resources in an optimal fashion to multiple applications that have a good idea of their utility function (through policies). As mentioned before, opmodes allow for trading off quality of service (task quality and response time) and performance. We assume there is a maximum ceiling R on the amount of resources, and that the available resources fluctuate depending on stresses S = S_e + S_a, where S_e are the stresses from the environment (i.e., multiple contending applications, changes in the infrastructural or physical layers) and S_a are the application stresses (i.e., increased tasks). The DMAS controller receives from the measurement points (MP) a measurement of the actual performance P and a vector of other statistics (relating to X). Also, at the top level, the overall utility U(P, S) = ∑_n w_n x_n is known, where x_n is the actual utility component and w_n is the associated weight specified by the user or another superior agent. We cannot change S, but we can adjust P to get better utility.
Since P depends on O, which is a vector of opmodes collected from the community, we can use the QNM to find O* and hence P* that maximizes U(P, S) for a given S from within the set OS. In words, we find the vector of opmodes (O*) that maximizes domain utility at current S and opmodes O. This computation is performed in the Utility Calculator module using a learned utility model based on UDB. In addition to the differences pointed out thus far, here are some more differences between this work and [34]:

• Tesauro et al. [34] assume that the application manager in Unity has a model of system performance, which we do not assume. Although they allude to a modeler module, they do not explain the details of their performance model. We use a queueing network model that is constructed in real-time to estimate the performance for any set of opmodes O′ by taking the current opmodes O and scaling them appropriately based on observed histories (X) to X′ in the Control Set Evaluator.
• Because of the interactions involved and the complexity of performance modeling [19, 27], it may be time-consuming to utilize statistical inferencing and learning mechanisms in real-time. This is why we use an analytical queueing network model to estimate performance quickly.
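The search for O* described above can be sketched as an exhaustive evaluation of candidate opmode vectors: predict performance with a queueing model, compute the weighted utility, and keep the best stable alternative. Everything specific in this Python sketch is an assumption made for illustration: the tandem M/M/1 performance model, the rule that quality k divides the service rate, the two utility components (mean quality and negative delay), and all numeric values. It does not reproduce the Utility Calculator or Control Set Evaluator of the actual system.

```python
from itertools import product


def predict_delay(opmodes, arrival_rate, base_rates):
    """Hypothetical performance model: tandem M/M/1 nodes, one opmode per node."""
    total = 0.0
    for quality, base_rate in zip(opmodes, base_rates):
        rate = base_rate / quality  # assumed: higher quality means slower processing
        if arrival_rate >= rate:
            return float("inf")     # unstable node under this opmode
        total += 1.0 / (rate - arrival_rate)
    return total


def utility(opmodes, delay, weights):
    """U = sum_n w_n * x_n with two assumed components: mean quality, negative delay."""
    mean_quality = sum(opmodes) / len(opmodes)
    components = (mean_quality, -delay)
    return sum(w * x for w, x in zip(weights, components))


def best_opmodes(candidates, arrival_rate, base_rates, weights):
    """Return the candidate opmode vector with the highest predicted utility."""
    best, best_utility = None, float("-inf")
    for opmodes in candidates:
        delay = predict_delay(opmodes, arrival_rate, base_rates)
        if delay == float("inf"):
            continue  # eliminate alternatives that would lead to instability
        value = utility(opmodes, delay, weights)
        if value > best_utility:
            best, best_utility = opmodes, value
    return best, best_utility


# Two nodes, three quality levels each (illustrative values only).
candidates = list(product([1, 2, 3], repeat=2))
o_star, u_star = best_opmodes(
    candidates, arrival_rate=1.0, base_rates=[6.0, 4.0], weights=(1.0, 0.5)
)
```

With these made-up weights the search favors high quality at the faster node and a moderate quality at the slower one; changing the weights w_n shifts that trade-off, which is the role the text assigns to the user or a superior agent.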