Elektronika 2009-11.pdf - Instytut Systemów Elektronicznych

A system automating repairs of IT systems
(System automatyzujący naprawy systemów IT)

MSc MAREK KAMIŃSKI
Gdańsk University of Technology, Faculty of Electronics, Telecommunication
Lufthansa Systems Poland, Sp. z o.o., Gdańsk

The rapid, progressive informatization of almost every domain of our lives has become an undisputed fact. Because a huge number of IT systems of various kinds need supervision and maintenance 24 hours a day, 365 days a year, many IT companies now offer their customers a new service: remote monitoring, technical support and assistance in taking care of their systems.
Monitoring is relatively easy to implement, because many monitoring systems have already been developed [1] and they usually accomplish their task satisfactorily. We distinguish traditional monitoring systems [2-7] (usually with centralized monitoring logic) from those dedicated to monitoring grid or cluster structures [8-11] (usually with distributed monitoring logic). Regardless of its kind, a monitoring system aims to give administrators of the monitored systems a clear indication of what is wrong. The next step is to solve the problem (to repair the system), so integrating monitoring and repair seems natural. Repairs, however, are usually more complicated than monitoring, as they often involve manual and time-consuming interventions by administrators. Observations made by the author of this article show that in many cases even repairs can be automated and, moreover, integrated with the monitoring task. Their automation is a complex problem and still an area of active research. The solutions proposed so far are usually too trivial to handle complex repairs in real-world situations; others have vulnerabilities that exclude their industrial application. Some solutions rely on timeouts or on retrying failed actions a finite number of times, hoping the problem will not reoccur, or on event handlers [4-6], which are simple executions of remote commands [7] telling the system which program to execute on undesired monitoring results. Some solutions address only the automation of administrative tasks [12,13] and do not integrate with monitoring. Other attempts focus on describing the architecture a system should have to facilitate automatic repair [14-16]. Efforts have also been made to integrate the Nagios Monitoring System [4-6] with the CFengine system (in a particular situation, regarding network problems [17]), and some works show the directions in which monitoring and healing (repairing) can go [18].
Goal and context

This article focuses mainly on the Repair Management System (RMS), one of the parts of the developed Repair Management Framework (RMF), which aims at automating the process of repairing IT systems. The RMF also consists of the Repair Management Model (RMM) and the repair library. Both were formally and precisely specified using the Z notation [19-21] and are described in [22]. The RMM introduces two mathematical models (a model of monitoring and a model of repair processes), general enough to cover the existing problems and solutions, while the RMS, with its complex architecture, uses those concepts as its foundation to incorporate and exploit existing monitoring solutions for triggering and conducting repairs automatically. The repair library introduces an abstraction layer (in the form of an API, Application Programming Interface) that allows repair algorithms to be constructed easily, taking advantage of the programming language they are embedded in and of a set of predefined routines (procedures and functions), which hide from programmers many unimportant details of monitoring and repairing the particular problems they solve. All parts of the RMF have already been instantiated and are under test at Lufthansa Systems Poland.

Definition of the solved problem

The problem addressed in this article may be briefly defined as follows:
• an IT company continuously supervises the work of many objects of various kinds,
• information on their state comes from various sources (e.g. monitoring systems, emails sent by regularly executed maintenance tasks, customers),
• those objects exist on machines connected by a network with a complex topology,
• an undesired state of specified objects, or of a combination of them, indicates problems that must be solved (usually as soon as possible),
• a large number of repairs are conducted manually, although they are repeatable and can evidently be specified formally and precisely.

ELEKTRONIKA 11/2009

Main ideas and terms

The following subsections explain the main ideas and terms used in the developed solution and introduce a unified naming convention for both of the discussed problems, monitoring and repair.

Monitoring

Analysis of multiple monitoring systems made it possible to create a more general description of monitoring, which can be perceived as a process of examining the states of groups of objects of any kind and of reporting any abnormalities. Each object may assume a precisely determined number of mutually exclusive states, which take their values from a precisely determined domain. Some values mean that the state is correct, while others mean that the object is associated with a problem that needs resolving. A simple method of identifying those objects is to associate each of them with a complex variable of a precisely determined name. Monitoring then becomes a process of continuously checking (or observing) the values of many variables and of extracting from their space those variables that do not fulfill the required constraints. Binding the names of those variables to the real monitored objects is done in monitoring procedures.
Because most of those objects may be placed in the leaves of a tree representing the hierarchy of the monitored world, it is convenient to create a space of variables with qualified names, resembling the directory structure of a hard disk drive, the DNS (Domain Name System of Internet names) or the MIB tree of SNMP (Simple Network Management Protocol). Staying with the directory analogy, objects (and so variables) are represented by files, while directories represent the units grouping them. For example, eclipse.oracle.TESTDB.fs.usage may be a variable representing the percentage usage of a filesystem of the machine eclipse, belonging to the DBMS (DataBase Management System) oracle, identified by the SID (System IDentifier) TESTDB. The values of those variables change over time, and the frequency of their changes depends on the applied monitoring mechanisms, on how often those mechanisms process the data coming from the monitored objects, on the stability of the objects' states and on their character. Sometimes there is also a need to bind a state with some details.

Summarizing, we may say:
• the state of a monitored object is represented by the value of the complex variable associated with it,
• each variable has a name that unambiguously identifies its position in the monitoring tree,
• the values change over time and can be assigned optional details.

Monitoring is a process of continuously checking (or observing) the values of the variables making up the monitoring tree. A problem is a situation in which one or more variables have undesired values. Undesired values, and so problems, are expressed by predicates over problem-related variables of the monitoring tree. More precisely, a problem is defined as a situation in which any of those predicates is unfulfilled (false), and the scope of a problem is the set of problem-related variables that falsify the problem-related predicate.
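The terms defined above (variable, monitoring tree, problem, scope) can be illustrated with a minimal Python sketch. It is not the RMS implementation or the Z specification from [22]. The class names, the 90 percent threshold, the detail text and the second variable (fs.inodes) are hypothetical. The sketch also assumes that a problem-related predicate is a conjunction of per-variable constraints, so that the scope is simply the set of variables whose own constraint fails.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set


@dataclass
class Variable:
    """Value of one monitored object, with optional details bound to the state."""
    value: object
    details: Optional[str] = None


# Monitoring tree flattened to qualified names, like paths in a directory tree.
MonitoringTree = Dict[str, Variable]


@dataclass
class Problem:
    """A problem described by one constraint per problem-related variable.

    Assumption: the problem-related predicate is the conjunction of the
    per-variable constraints, so the scope is the set of variables whose
    own constraint is false.
    """
    name: str
    constraints: Dict[str, Callable[[object], bool]]

    def scope(self, tree: MonitoringTree) -> Set[str]:
        """Return the problem-related variables that falsify the predicate."""
        return {
            name
            for name, is_desired in self.constraints.items()
            if name in tree and not is_desired(tree[name].value)
        }

    def occurs(self, tree: MonitoringTree) -> bool:
        """The problem occurs when the predicate is false (non-empty scope)."""
        return bool(self.scope(tree))


# Hypothetical sample data: only the first name comes from the article.
tree: MonitoringTree = {
    "eclipse.oracle.TESTDB.fs.usage": Variable(97, "sample detail text"),
    "eclipse.oracle.TESTDB.fs.inodes": Variable(12),
}

filesystem_full = Problem(
    name="filesystem-full",
    constraints={
        "eclipse.oracle.TESTDB.fs.usage": lambda percent: percent < 90,
        "eclipse.oracle.TESTDB.fs.inodes": lambda percent: percent < 90,
    },
)
```

With this sample data, filesystem_full.scope(tree) contains only eclipse.oracle.TESTDB.fs.usage, because only that variable violates its constraint. A flat dictionary keyed by qualified names is used instead of nested nodes purely to keep the sketch short.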
Repair

Each problem that the RMS is able to solve is associated with one or more repair procedures, and repair is the process of using those procedures to bring monitored objects back from undesired states to desired ones. A repair is initiated (triggered) when the monitoring tree is found to contain a set of unprocessed variables that make up the scope of a problem associated with at least one repair procedure. Repair procedures are interactive algorithms written in a high-level programming language, using the standard flow control instructions and taking advantage of other features of the language they are embedded in. Those algorithms execute so-called corrective steps, whose invocations may be perceived as calls of external routines (procedures or functions). External routines are a well-known programming language concept: like internal ones, they resemble black boxes with a precisely determined interface (taking determined input and returning determined output) but, unlike internal ones, their bodies are defined outside the algorithms calling them. In the RMS, before execution they are exported to the remote machines related to the problems they help to solve, and they are executed on those machines with the access rights of the users defined in the repair algorithms. Programmers writing repair procedures may use the means of expression encapsulated in the routines provided by the repair library (the repair API). Each repair is started in an environment containing all the necessary information on the particular problem that initiated it, and with the repair library this information is easily retrievable.

Construction of the RMS

The ideas and considerations outlined in the previous subsections, as well as the existing industrial reality and the monitoring systems already employed, have led to the following construction of the RMS:

Architecture of the RMS

The architecture of the RMS is shown in Figure 1.
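The triggering rule and the notion of a corrective step described in the Repair subsection can be sketched as follows. This is an illustration only, not the repair library API, which is not shown in this article. All names here (corrective_step, free_filesystem_space, trigger_repairs, the routine purge_old_trace_files and the user oracle) are hypothetical. The remote export and execution of a routine is replaced by a stand-in that merely records the call.

```python
from typing import Callable, Dict, List, Set, Tuple

# Log of corrective steps. In this sketch nothing is really exported to or
# executed on a remote machine; the call is only recorded.
executed: List[Tuple[str, str, str, Tuple[str, ...]]] = []


def corrective_step(host: str, user: str, routine: str, *args: str) -> int:
    """Hypothetical stand-in for running an external routine on a remote host."""
    executed.append((host, user, routine, args))
    return 0  # pretend exit status: success


RepairProcedure = Callable[[Set[str]], bool]


def free_filesystem_space(scope: Set[str]) -> bool:
    """Hypothetical repair procedure: plain flow control around corrective steps."""
    for variable in sorted(scope):
        host = variable.split(".")[0]  # assumed: first name component is the machine
        if corrective_step(host, "oracle", "purge_old_trace_files") != 0:
            return False
    return True


# Problems associated with at least one repair procedure.
procedures: Dict[str, List[RepairProcedure]] = {
    "filesystem-full": [free_filesystem_space],
}

# Variables that have already triggered a repair.
processed: Set[str] = set()


def trigger_repairs(problem: str, scope: Set[str]) -> List[bool]:
    """Run the associated procedures for the unprocessed part of a scope."""
    unprocessed = scope - processed
    if not unprocessed or problem not in procedures:
        return []
    processed.update(unprocessed)
    return [procedure(unprocessed) for procedure in procedures[problem]]
```

Calling trigger_repairs("filesystem-full", {"eclipse.oracle.TESTDB.fs.usage"}) runs the procedure once. A second call with the same scope returns an empty list, because the variable is no longer unprocessed.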
MON- JAMI and HVRMONITOR correspond to systems described in [2,3]. The RMS collects data coming from various sources, initializes repairs and manages their whole lifecycle. Each repair may assume one of the following states:
• pending: the need for a repair was detected and the appropriate repair is waiting to be started,
• running: the repair has been started and is in progress,
• suspended: the repair has been suspended because it needs additional information,
• finished: the repair has finished (regardless of its result) and its result has been saved.

Features of the RMS

The most important features of the RMS are the following:
• it starts up subsequent repairs in parallel (not sequentially),
• starting up follows the order in which requests for repairs come into the repairs queue,
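The four repair states and the two features listed so far (parallel start-up, in queue order) can be sketched in Python. The article names the states but not the transitions between them, so the transition table is an assumption, as are the names Repair, move_to and run_all. Standard-library threads stand in for whatever execution mechanism the RMS actually uses.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from enum import Enum
from typing import Callable, Deque, List


class RepairState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUSPENDED = "suspended"
    FINISHED = "finished"


# Assumed transitions; the article only lists the states themselves.
ALLOWED = {
    RepairState.PENDING: {RepairState.RUNNING},
    RepairState.RUNNING: {RepairState.SUSPENDED, RepairState.FINISHED},
    RepairState.SUSPENDED: {RepairState.RUNNING},
    RepairState.FINISHED: set(),
}


class Repair:
    def __init__(self, problem: str) -> None:
        self.problem = problem
        self.state = RepairState.PENDING
        self.result: object = None

    def move_to(self, new_state: RepairState) -> None:
        """Change state, rejecting transitions outside the assumed table."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state


def run_all(queue: Deque[Repair], work: Callable[[Repair], object]) -> List[str]:
    """Start queued repairs in arrival order and let them run in parallel."""
    started: List[str] = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        in_progress = []
        while queue:
            repair = queue.popleft()  # first come, first started
            repair.move_to(RepairState.RUNNING)
            started.append(repair.problem)
            in_progress.append((repair, pool.submit(work, repair)))
        for repair, future in in_progress:
            try:
                repair.result = future.result()
            except Exception as error:  # finished regardless of the result
                repair.result = error
            repair.move_to(RepairState.FINISHED)
    return started
```

The start order is recorded by the dispatching thread, so it always matches the queue order even though the repairs themselves may complete in any order. A failed repair still ends in the finished state, with the exception saved as its result.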