DARPA ULTRALOG Final Report - Industrial and Manufacturing ...

Proceedings of the 1st Open Cougaar Conference

... promising approach to deal with large-scale systems is multiagent systems (MAS), we agentify the components from a purely control point of view. In MAS, agents address the scalability issue by computing solutions locally and then using this information socially. In this paper we develop a multiagent-based adaptive control mechanism with scalability and predictability to support the survivability of large-scale networks. Specifically, we discuss the problem domain in Section 2 and formally define the problem in Section 3. We review previous control approaches in Section 4, design an adaptive control mechanism in Section 5, and show empirical results in Section 6. Finally, we discuss implications and possible extensions of our work in Section 7.

2. Problem domain

The networks we study in this paper have distributed, component-based architectures. One instance is Cougaar (Cognitive Agent Architecture: http://www.cougaar.org), developed by DARPA (Defense Advanced Research Projects Agency), which follows such an architecture for building large-scale multiagent systems. Recently, there have been efforts to combine agent and component technologies to improve the way large-scale software systems are built [8][9][10]. While component technology focuses on reusability, agent technology focuses on processing complex tasks as a community. Cougaar is in line with this trend. In Cougaar, a software system comprises agents, and an agent comprises components (called plugins). The task flow structure in those systems is that of components, as a combination of intra-agent and inter-agent task flows. Because the agents in Cougaar can be distributed both geographically and in terms of information content, the networks implemented in Cougaar have a distributed, component-based architecture. UltraLog (http://www.ultralog.net) networks are military supply chain planning systems implemented in Cougaar.
Agents in those networks represent organizations in military supply chains. The objective of an UltraLog network is to provide an appropriate logistics plan for a military operational plan. The system produces a logistics plan by decomposing the operational plan into logistics tasks and processing them through a task flow structure. The system performs initial planning for a given operation and continuous replanning in execution mode to cope with logistics plan deviations or operational plan changes. As the scale of the operation increases, there can be thousands of agents working together to generate a logistics plan. Initial planning or replanning generates a logistics plan as a global solution, which is an aggregate of the individual schedules built by plugins through their task flow structure. Each plugin can use one of its available implementation alternatives, which trade off processing time against the quality of the schedule. Quality of service is determined by two metrics: the quality of the logistics plan and the plan completion time. These two metrics directly affect the performance of the operation. Planning and replanning in UltraLog networks are instances of the current research problem.

An UltraLog network cannot work in isolation from the outside world because it uses external databases and users must be able to access the system. This unavoidable connection to the outside exposes the system to malicious attacks in addition to accidental failures. The question, then, is how to make this system survivable, so that it generates high-quality logistics plans in a timely manner in the presence of accidental failures and malicious attacks.

3. Problem specification

In this section we formally define the problem by detailing the network model. We concentrate on computational (CPU) resources, assuming that the system is computation-bound.

3.1. Network model

We define four elements of the network to clarify its mechanics: network configuration, implementation alternatives, quality of service, and stress environment.

Network configuration

A network is composed of a set of agents $A$, with each agent located on its own machine. The task flow structure of the network, which defines the precedence relationships between agents, is a directed acyclic graph in which each link is assigned a positive real number. A link number $l_{ij}$ ($i \neq j$) indicates the number of tasks generated for successor agent $j$ when agent $i$ processes a task in its queue. Once the tasks accumulated for a successor agent exceed one, the corresponding integer number of tasks is sent to the successor agent. Using real numbers lets us represent a wide range of task flow structures, including noninteger aggregation and expansion. A problem given to a network is decomposed into root tasks for some agents, and those tasks are propagated through the task flow structure.

Implementation alternatives

An agent can have multiple implementation alternatives for processing a task. Different alternatives trade off CPU time and solution value, with more CPU time resulting in higher solution value. Because alternatives can be mixed optimally, each agent has a monotonically increasing convex function, called the value function, that gives CPU time as a function of value. We call the value in this function the value mode; it is the agent's decision variable. A value function is defined by three components:

$$\langle f_i(v_i),\ v_{i(\min)},\ v_{i(\max)} \rangle$$

This says that agent $i$'s expected CPU time to process a task is $f_i(v_i)$ at value mode $v_i$, where $v_{i(\min)} \le v_i \le v_{i(\max)}$.

Quality of service

As described above, a problem given to the network is decomposed into root tasks that propagate through the task flow structure. The service provided by the network is to produce a global solution to the given problem, which is an aggregate of the partial solutions obtained from processing tasks. The QoS of the network is determined by the value of the global solution and the cost of the completion time for generating it. The value of the global solution is the sum of the partial solution values. The cost of completion time is given by a cost function $\mathrm{CCT}(T)$, which is monotonically increasing in the completion time $T$. Let $v_i^{d}$ denote the value mode used by agent $i$ to process its $d$-th task and $e_i$ the number of tasks processed to completion by agent $i$. Then QoS can be calculated as:

$$\mathrm{QoS} = \sum_{i \in A} \sum_{d=1}^{e_i} v_i^{d} - \mathrm{CCT}(T)$$

Stress environment

Survivability stresses, such as accidental failures and malicious attacks, affect the system by consuming resources, either directly or indirectly by activating defense mechanisms as remedies against them. For example, a "denial of service" attack consumes resources directly, while the relevant defense mechanism also consumes resources for resistance, recognition, and recovery [1]. From the viewpoint of the agents in the network, we consider both survivability stresses and their remedies to be part of the stress environment.
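The link-number rule and the QoS formula from the network model in Section 3.1 can be illustrated with a small sketch. This is not the authors' implementation: the function names, the three-agent network, the value modes, and the linear form of CCT(T) are all made up for illustration. The sketch also assumes the send threshold is inclusive (an accumulated count of exactly 1 triggers a send), which the text does not state explicitly.

```python
import math
from collections import defaultdict


def propagate(links, root_tasks, order):
    """Count tasks processed per agent in an acyclic task flow.

    links[i][j] is the link number l_ij: tasks generated for successor j
    each time agent i processes one task. Fractional amounts accumulate
    per link, and only the whole-number part is sent to the successor.
    `order` must be a topological order of the agents (hypothetical input).
    """
    queue = defaultdict(int, root_tasks)   # whole tasks waiting at each agent
    pending = defaultdict(float)           # accumulated fractional tasks per link
    processed = defaultdict(int)
    for i in order:
        for _ in range(queue[i]):
            processed[i] += 1
            for j, l_ij in links.get(i, {}).items():
                pending[(i, j)] += l_ij
                whole = math.floor(pending[(i, j)])
                if whole >= 1:             # assumption: inclusive threshold
                    queue[j] += whole
                    pending[(i, j)] -= whole
    return dict(processed)


def qos(value_modes, completion_time, cct):
    """QoS = sum over agents i and tasks d of v_i^d, minus CCT(T)."""
    total_value = sum(sum(modes) for modes in value_modes.values())
    return total_value - cct(completion_time)


# Illustrative network: a aggregates toward b (0.5) and expands toward c (2.5).
links = {"a": {"b": 0.5, "c": 2.5}}
processed = propagate(links, root_tasks={"a": 4}, order=["a", "b", "c"])

# One value mode per completed task, constant per agent for simplicity.
modes_per_agent = {"a": 1.0, "b": 2.0, "c": 0.5}
value_modes = {i: [modes_per_agent[i]] * n for i, n in processed.items()}

# Hypothetical linear completion-time cost and completion time.
score = qos(value_modes, completion_time=8.0, cct=lambda t: 0.25 * t)
print(processed, score)
```

With these made-up numbers, agent a processes its 4 root tasks, b receives 2 tasks, and c receives 10, giving a total value of 13 and a QoS of 11 after subtracting the completion-time cost of 2.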
The stress environment space is high-dimensional and evolving [11][12]. However, because we concentrate on computational (CPU) resources, a stress environment can be regarded as a set of threads that reside on the machines of the network and share resources with the agents. These threads, which we call stressors, may be admitted with priorities or weights for resource allocation, or may steal resources without admission.

3.2. Problem definition

In this paper we develop an adaptive control mechanism with scalability and predictability to support the survivability of large-scale networks. The system needs to adapt to the changing stress environment to provide high QoS by choosing among implementation alternatives ($v$):

$$\arg\max_{v} \mathrm{QoS}$$

We discuss several characteristics of the problem that are helpful in understanding it and in developing an appropriate control mechanism:

• Large-scale network: The network can be large-scale, as the number of agents and nodes increases with the scale of the problem given to the network.
• Finite time horizon: The time horizon for a network to generate a global solution is finite.
• Indecomposable QoS: QoS cannot be decomposed into the performance of individual elements, because one of the two conflicting QoS elements is the completion time, which is common throughout the network.
• Complex dynamics: Agents interact with each other through task flow and with stressors through shared resources. Because those interactions occur in parallel with control actions, the dynamics of the system are intrinsically complex, especially in large-scale networks.
• Non-availability of statistics: Statistics such as arrival rates or service rates are neither fixed nor given; they change as the system evolves. In addition, the stress environment changes.

4. Control approaches in dynamic systems

In dynamic systems, both centralized and decentralized control approaches are used.

4.1. Centralized approaches

There are three centralized control approaches: dynamic programming (DP), reinforcement learning (RL), and model predictive control (MPC). DP solves the optimality equation to produce reactive strategies in the form of an optimal closed-loop control policy, which is a rule specifying the optimal action as a function of state and time [13]. It assumes that the structure of the dynamic model is fixed and that the model parameters are known in advance. DP yields an optimal policy, but the complexity of solving the optimality equation grows exponentially with the dimension of the state space. RL is an adaptive version of DP to develop a