21.01.2014 Views

Xiao Liu PhD Thesis.pdf - Faculty of Information and Communication ...

K-MaxSDev) achieves a large improvement in the average accuracy of interval forecasting over the other representative forecasting strategies. The average accuracy of our K-MaxSDev is 80%, while that of the other strategies is mainly around 40% to 50%. Among the other 6 strategies, AR behaves the best, while the others have similar performance except MEAN, which behaves poorly compared with the others. However, such results actually demonstrate the highly dynamic performance of the underlying services in cloud workflow systems.
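To make the notion of interval forecasting accuracy concrete, the following minimal sketch assumes that accuracy is measured as the fraction of observed activity durations that fall inside their forecast intervals, and that a simple baseline interval is the sample mean plus or minus one sample standard deviation. Both assumptions, and the names `interval_forecast` and `interval_accuracy`, are illustrative only and are not taken from the experiments reported here.

```python
from statistics import mean, stdev


def interval_forecast(samples, width=1.0):
    """Hypothetical baseline: forecast (lower, upper) as mean +/- width * stdev."""
    mu = mean(samples)
    # A single sample has no spread, so the interval collapses to a point.
    sigma = stdev(samples) if len(samples) > 1 else 0.0
    return mu - width * sigma, mu + width * sigma


def interval_accuracy(intervals, actuals):
    """Assumed metric: share of actual durations inside their forecast interval."""
    if len(intervals) != len(actuals):
        raise ValueError("intervals and actuals must have the same length")
    if not actuals:
        raise ValueError("at least one forecast is required")
    hits = sum(lo <= a <= hi for (lo, hi), a in zip(intervals, actuals))
    return hits / len(actuals)
```

Under these assumptions, a strategy whose intervals contain 4 out of every 5 observed durations would score 80%. A very wide interval would trivially score well, so a metric of this kind is only meaningful when the compared strategies produce intervals of comparable width.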

To conclude, in comparison to other representative forecasting strategies in traditional workflow systems, our forecasting strategy can effectively address the problems of limited sample size and frequent turning points, and hence achieve higher accuracy.

4.5 Summary

Interval forecasting for activity durations in cloud workflow systems is of great importance since it is related to most cloud workflow QoS and non-QoS functionalities such as load balancing, workflow scheduling and temporal verification. However, predicting accurate duration intervals is very challenging due to the dynamic nature of cloud computing infrastructures. Most of the current work focuses on the prediction of CPU load to facilitate the forecasting of computation-intensive activities. In fact, the durations of scientific cloud workflow activities are normally dominated by the average performance of the workflow system over their lifecycles. Therefore, the forecasting strategies in cloud workflow systems should be able to adapt to the characteristics of scientific cloud workflow activities.

In this chapter, rather than fitting conventional linear time-series models, which are not ideal, we have investigated the idea of pattern-based time-series forecasting. Specifically, based on a novel non-linear time-series segmentation algorithm named K-MaxSDev, a statistical time-series pattern-based forecasting strategy has been proposed. It consists of four major functional components: duration series building, duration pattern recognition, duration pattern matching, and duration interval forecasting. K-MaxSDev includes an offline time-series pattern discovery process and an online forecasting process. The simulation experiments conducted in our SwinDeW-C cloud workflow system have demonstrated that our time-series

