Back Room Front Room 2

MINING THE RELATIONSHIPS IN THE FORM OF THE PREDISPOSING FACTORS AND CO-INCIDENT FACTORS AMONG NUMERICAL DYNAMIC ATTRIBUTES IN TIME SERIES DATA SET BY USING THE COMBINATION OF SOME EXISTING TECHNIQUES

Suwimon Kooptiwoot, M. Abdus Salam
School of Information Technologies, The University of Sydney, Sydney, Australia
Email: {suwimon, msalam}@it.usyd.edu.au

Keywords: Temporal Mining, Time series data set, numerical data, predisposing factor, co-incident factor

Abstract: Temporal mining is a natural extension of data mining with the added capabilities of discovering interesting patterns and inferring relationships of contextual and temporal proximity, which may also lead to possible cause-effect associations. Temporal mining covers a wide range of paradigms for knowledge modeling and discovery. A common practice is to discover frequent sequences and patterns of a single variable. In this paper we present a new algorithm that combines several existing ideas: the reference event proposed in (Bettini, Wang et al. 1998), the event detection technique proposed in (Guralnik and Srivastava 1999), the large fraction proposed in (Mannila, Toivonen et al. 1997), and the causal inference proposed in (Blum 1982). We use all of these ideas to build our new algorithm for the discovery of multivariable sequences in the form of the predisposing factors and co-incident factors of the reference event of interest. We define an event as a positive or negative direction of data change above a threshold value. From these patterns we infer predisposing and co-incident factors with respect to a reference variable. For this purpose we study the Open Source Software data collected from the SourceForge website. Out of 240+ attributes we consider only thirteen time-dependent attributes: Page-views, Download, Bugs0, Bugs1, Support0, Support1, Patches0, Patches1, Tracker0, Tracker1, Tasks0, Tasks1 and CVS.
These attributes indicate the degree and patterns of activity of the projects over the course of their progress. The number of downloads is a good indication of the progress of a project, so we use Download as the reference attribute. We also test our algorithm on four synthetic data sets with up to 50% noise. The results show that our algorithm works well and tolerates noisy data.

I. Seruca et al. (eds.), Enterprise Information Systems VI, 135–142. © 2006 Springer. Printed in the Netherlands.

1 INTRODUCTION

Time series mining has a wide range of applications and mainly discovers movement patterns in data, for example in the stock market, ECG (electrocardiograms), weather patterns, etc. There are four main types of data movement:
1. Long-term or trend pattern
2. Cyclic movement or cyclic variation
3. Seasonal movement or seasonal variation
4. Irregular or random movements

A full sequential pattern is a kind of long-term or trend movement, which can be a cyclic pattern such as the working pattern of a machine. The idea is to catch the error pattern that differs from the normal pattern in the operation of that machine. For trend movement, we try to find the pattern of change that can be used to estimate the next unknown value at the next time point or at any specific time point in the future. A periodic pattern repeats itself throughout the time series, and this part of the data is called a segment. The main idea is to separate the data into piecewise periodic segments. There can be a problem with periodic behavior within only one segment of time series data or with a seasonal pattern. Several algorithms separate segments in order to find a change point, or event. Some works (Agrawal and Srikant 1995; Lu, Han et al. 1998) applied Apriori-like algorithms to find the relationships among attributes, but these works still point to only one dynamic attribute. Then we find the association among static attributes by using the dynamic attribute of interest. The work in (Tung, Lu et al. 1999) proposed the idea of finding inter-transaction association rules such as RULE1: If the prices of IBM and SUN go up, Microsoft’s will most likely (80% of the time) go up the next day. Event detection from time series data (Guralnik and Srivastava 1999) utilises the interesting event detection idea, that is, a dynamic phenomenon whose behaviour changes over time enough to be considered a qualitatively significant change.

It would be interesting to find the relationships among dynamic attributes in time series data that consist of many dynamic attributes in numerical form. We attempt to find the relationships that give the factors that can be regarded as the cause and effect of the event of interest. We call these factors the predisposing and co-incident factors with respect to a reference variable. The predisposing factor tells us the events of other dynamic attributes that mostly happen before the reference event happens. The co-incident factor tells us the events of other dynamic attributes that mostly happen at the same time as, or a little after, the reference event.

2 TEMPORAL MINING

An interesting work is (Roddick and Spiliopoulou 2002), which reviews research related to temporal mining and its contributions to various aspects of temporal data mining and knowledge discovery, and also briefly discusses the relevant previous work. In the majority of time series analysis, we focus either on predicting the curve of a single time series or on discovering similarities among multiple time series. We call a time-dependent variable a dynamic variable and a time-independent variable a static variable. Trend analysis focuses on how to estimate the value of the dynamic attribute of interest at the next time point or at a specific time point in the future.
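One common way to produce such an estimate is to average recent values. The sketch below shows a simple moving average and an exponential moving average. The function names, the window size and the smoothing factor `alpha` are illustrative assumptions and are not taken from the paper.

```python
def moving_average_forecast(values, window):
    """Estimate the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return sum(values[-window:]) / window


def exponential_moving_average_forecast(values, alpha):
    """Estimate the next value with exponential smoothing.

    `alpha` in (0, 1] controls how strongly recent observations dominate
    (assumed parameterisation: estimate = alpha * new + (1 - alpha) * old).
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    estimate = values[0]
    for value in values[1:]:
        estimate = alpha * value + (1 - alpha) * estimate
    return estimate


# Hypothetical weekly counts for one dynamic attribute.
history = [1, 2, 3, 4]
next_by_window = moving_average_forecast(history, window=2)
next_by_smoothing = exponential_moving_average_forecast(history, alpha=0.5)
```

A short window or a large `alpha` reacts quickly to recent change, while a long window or a small `alpha` smooths out noise at the cost of lagging behind the trend.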
There are many kinds of patterns depending on the application data. We can separate pattern types into four groups.
1. A cyclic pattern has an exact form and repeats that form in cycles, for example ECG, the tidal cycle, sunrise-sunset.
2. A periodic pattern has an exact form in only part of the cycle and repeats this exact form at the same part of each cycle. For example, “Every morning at 7:30-8:00 am, Sandy has breakfast”; for the rest of the day Sandy has many activities with no exact pattern. Within the daily cycle, the exact form occurs only in the morning at 7:30-8:00 am.
3. A seasonal pattern is a subtype of the cyclic and periodic patterns. There is a pattern at a specific range of time during the year’s cycle, for example half-year sales, fruit season, etc.
4. An irregular pattern has no exact form within a cycle, for example network alarms or computer crashes.

2.1 Trend Analysis Problem

Trend analysis is normally done by finding the cyclic patterns of one numerical dynamic attribute of interest. Once we know the exact pattern of this attribute, we can forecast its value in the future. If we cannot find the cyclic pattern of this attribute, we can use a moving average window or an exponential moving average window (Weiss and Indurkhya 1998; Kantardzic 2003) to estimate the value of this dynamic attribute at the next time point or at a specific time point in the future.

2.2 Pattern Finding Problem

In the pattern finding problem, we have to find the change points that mark the start and the end of a cycle or segment. Then we look for the segment or cycle pattern that is repeated across the whole data set, and observe the pattern to see which type it is: a cyclic, periodic, seasonal or irregular pattern. The pattern matching problem is to find a way of matching the segment patterns.
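One simple way to match segment patterns is to slide a template along the series and accept every window whose points all lie within a tolerance of the template. The sketch below works under that assumption; the function names and the pointwise tolerance are illustrative and are not the method of any of the works cited in this section.

```python
def segment_matches(segment, template, tolerance=0.0):
    """Return True if every point of `segment` is within `tolerance` of `template`."""
    if len(segment) != len(template):
        return False
    return all(abs(a - b) <= tolerance for a, b in zip(segment, template))


def find_segment_matches(series, template, tolerance=0.0):
    """Return the start indices of all windows in `series` that match `template`."""
    n = len(template)
    if n == 0:
        raise ValueError("template must not be empty")
    return [
        start
        for start in range(len(series) - n + 1)
        if segment_matches(series[start:start + n], template, tolerance)
    ]


# Hypothetical series with one slightly distorted repetition of the template.
series = [1, 2, 3, 1, 2, 3.2, 1, 2, 3]
template = [1, 2, 3]
strict_hits = find_segment_matches(series, template)                 # tolerance 0
loose_hits = find_segment_matches(series, template, tolerance=0.25)  # allows small deviations
```

With a tolerance of zero only identical windows are accepted; raising the tolerance also accepts the distorted repetition in the middle of the series.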
Pattern matching can be exact (Dasgupta and Forrest 1995) or rough (Hirano, Sun et al. 2001; Keogh, Chu et al. 2001; Hirano and Tsumoto 2002), depending on the data application. An example of an exact pattern is the working cycle of a machine; an example of a rough pattern is the hormone level in the human body. Another problem is multi-scale pattern matching, as seen in (Ueda and S.Suzuki 1990; Hirano, Sun et al. 2001; Hirano and Tsumoto 2001), which matches patterns on different time scales.

One interesting work is Knowledge Discovery in Time Series Databases (Last, Klein et al. 2001). Last et al. proposed a whole process of knowledge discovery in time series databases. They used signal processing techniques and an information-theoretic fuzzy approach. The computational theory of perception (CTP) is used to reduce the set of extracted rules by fuzzification and aggregation.
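As a closing illustration of the event definition from the abstract and the predisposing and co-incident factors introduced in Section 1, the sketch below labels each time step as a rise, a fall or no event, and then measures how often events of another attribute occur before, or together with, a reference event. This is not the authors' algorithm, which this excerpt does not spell out. The function names, the one-step time offsets, the thresholds and the sample numbers are all assumptions made for illustration.

```python
from collections import Counter


def detect_events(series, threshold):
    """Label each step +1 (rise), -1 (fall) or 0 (no event).

    Position i describes the change from series[i - 1] to series[i];
    position 0 has no predecessor and is always 0.
    """
    events = [0]
    for previous, current in zip(series, series[1:]):
        change = current - previous
        if change > threshold:
            events.append(1)
        elif change < -threshold:
            events.append(-1)
        else:
            events.append(0)
    return events


def factor_fractions(reference_events, other_events, reference_direction, offsets):
    """Fraction of reference events accompanied by each direction of the other attribute.

    `offsets` are time shifts relative to the reference event: negative values
    look before it (predisposing), 0 or positive values look at the same time
    or shortly after it (co-incident).
    """
    counts = Counter()
    total = 0
    for t, reference in enumerate(reference_events):
        if reference != reference_direction:
            continue
        total += 1
        seen = set()
        for offset in offsets:
            u = t + offset
            if 0 <= u < len(other_events) and other_events[u] != 0:
                seen.add(other_events[u])
        counts.update(seen)
    if total == 0:
        return {}
    return {direction: count / total for direction, count in counts.items()}


# Made-up numbers: Download is the reference attribute, Page-views the other one.
downloads = [10, 10, 30, 30, 30, 60, 60]
page_views = [100, 150, 150, 150, 220, 220, 220]

download_events = detect_events(downloads, threshold=5)
page_view_events = detect_events(page_views, threshold=20)

# Events of Page-views one step before a rise in Download (predisposing candidates).
predisposing = factor_fractions(download_events, page_view_events, 1, offsets=(-1,))
# Events of Page-views at the same step or one step after (co-incident candidates).
co_incident = factor_fractions(download_events, page_view_events, 1, offsets=(0, 1))
```

In this toy data every rise in Download is preceded by a rise in Page-views, so a rise in Page-views would be a candidate predisposing factor, while no co-incident candidate appears. In practice a minimum fraction would be needed to decide which candidates count as factors.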
