Xiao Liu PhD Thesis.pdf - Faculty of Information and Communication ...
K-MaxSDev) achieves a large improvement in the average accuracy of interval
forecasting over the other representative forecasting strategies. The average
accuracy of our K-MaxSDev is 80%, while that of the other strategies is mainly
around 40% to 50%. Among the other 6 strategies, AR performs the best, while the
others have similar performance except MEAN, which performs poorly compared
with the rest. However, such results actually demonstrate the highly dynamic
performance of the underlying services in cloud workflow systems.

To conclude, in comparison to other representative forecasting strategies in
traditional workflow systems, our forecasting strategy can effectively address the
problems of limited sample size and frequent turning points, and hence achieve
higher accuracy.
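
The accuracy figures above can be read as coverage rates: the share of observed activity durations that fall inside their forecast intervals. The sketch below illustrates that reading only. It assumes a symmetric interval of the form mean plus or minus `z` sample standard deviations, and the names `forecast_interval`, `interval_accuracy` and the parameter `z` are hypothetical, not taken from the thesis. The exact accuracy metric used in the experiments may differ.

```python
from statistics import mean, stdev


def forecast_interval(samples, z=1.0):
    """Return (lower, upper) as mean -/+ z * sample standard deviation.

    Assumption: a symmetric interval around the sample mean; this is an
    illustrative choice, not necessarily the one used by K-MaxSDev.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    mu = mean(samples)
    # stdev() needs at least two points; fall back to a zero-width interval.
    sigma = stdev(samples) if len(samples) > 1 else 0.0
    return mu - z * sigma, mu + z * sigma


def interval_accuracy(intervals, actuals):
    """Fraction of actual durations that lie inside their forecast interval."""
    if len(intervals) != len(actuals):
        raise ValueError("intervals and actuals must have the same length")
    if not actuals:
        raise ValueError("at least one observation is required")
    hits = sum(1 for (low, high), x in zip(intervals, actuals) if low <= x <= high)
    return hits / len(actuals)


if __name__ == "__main__":
    # Hypothetical duration samples (seconds) for one matched pattern.
    interval = forecast_interval([10, 12, 14])
    print(interval)
    # Hypothetical forecasts paired with the durations observed afterwards.
    print(interval_accuracy([(1, 3), (2, 4), (5, 6), (0, 1)], [2, 5, 5.5, 0.5]))
```

Under this reading, an accuracy of 80% means that four out of five observed durations fell within the interval forecast for them.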
4.5 Summary
Interval forecasting for activity durations in cloud workflow systems is of great
importance since it is related to most of the cloud workflow QoS and non-QoS
functionalities such as load balancing, workflow scheduling and temporal
verification. However, predicting accurate duration intervals is very challenging due
to the dynamic nature of cloud computing infrastructures. Most of the current work
focuses on the prediction of CPU load to facilitate the forecasting of
computation-intensive activities. In fact, the durations of scientific cloud workflow
activities are normally dominated by the average performance of the workflow system
over their lifecycles. Therefore, the forecasting strategies in cloud workflow systems
should be able to adapt to the characteristics of scientific cloud workflow activities.

In this chapter, rather than fitting conventional linear time-series models, which
are not ideal, we have investigated the idea of pattern-based time-series forecasting.
Specifically, based on a novel non-linear time-series segmentation algorithm named
K-MaxSDev, a statistical time-series pattern-based forecasting strategy has been
proposed. It consists of four major functional components: duration series building,
duration pattern recognition, duration pattern matching, and duration interval
forecasting. K-MaxSDev includes an offline time-series pattern discovery
process and an online forecasting process. The simulation experiments conducted in
our SwinDeW-C cloud workflow system have demonstrated that our time-series