
Fig. 1 and the data elements in Fig. 2 are considered when creating and training the predictive models.

A. Fault Prediction Models

Four models were created to identify the weather patterns most likely to result in a fault event, using the NN, KSVM, RPART, and NB algorithms. The models were constructed from weather data points joined to fault events, together with randomly selected samples of weather data taken when no fault events were recorded at the selected IOU substations. Because there was a large number of weather points for times at which no fault occurred, the random sample of no-fault records was drawn so that all months, days, and hours of the day are represented. The dataset contained a total of 3471 records (1725 with faults and 1746 without faults), of which 2430 were used for training each of the four models and 1041 for testing them (see Fig. 3). The best-performing model is the feed-forward multi-layer perceptron trained with the back-propagation NN algorithm (see the shaded area in Fig. 3). This trained model produced an accuracy of 75%, an average precision of 77%, an average recall of 73%, and an f-measure of 75%. Table I shows a sample of the results generated in RapidMiner for the NN model.
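As a rough illustration of this setup, the sketch below trains a feed-forward multi-layer perceptron on the fault/no-fault data and reports the same four metrics. It uses scikit-learn as a stand-in for the RapidMiner NN operator; the file name, column names, and hidden-layer size are assumptions, not details from the paper.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Weather points joined to fault events plus random no-fault samples (assumed file/columns).
data = pd.read_csv("weather_lightning_fault_no_fault.csv")
X = data.drop(columns=["fault"])          # hypothetical label column: 1 = fault, 0 = no fault
y = data["fault"]

# Roughly 70/30 split, mirroring the 2430 training / 1041 test records above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)

# Feed-forward MLP trained with back-propagation; the hidden-layer size is an assumption.
nn = MLPClassifier(hidden_layer_sizes=(20,), max_iter=1000, random_state=42)
nn.fit(X_train, y_train)
pred = nn.predict(X_test)

print("accuracy :", accuracy_score(y_test, pred))
print("precision:", precision_score(y_test, pred))
print("recall   :", recall_score(y_test, pred))
print("f-measure:", f1_score(y_test, pred))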

B. Zone Prediction Models

The four zone prediction models were trained on historical fault data from the IOU grid together with weather data. Of the 1725 records with faults and weather data, 70% were used for training and 30% for testing the trained models. These models predict the zone (AMZ, UMZ, or PMZ) of the IOU grid in which the fault occurred. The best-performing model was the one created by training a Neural Network algorithm, as shown in the shaded area in Fig. 3. The model contains one hidden layer with 20 nodes and produced an accuracy of 66%, an average precision of 69%, an average recall of 68%, and an f-measure of 68%.
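A comparable sketch for the zone model is shown below. The only structural detail taken from the text is the single hidden layer with 20 nodes; the file and column names are hypothetical, and the "average" precision and recall are computed here as unweighted (macro) means over the three zone classes, which is one plausible reading of the reported averages.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# The 1725 fault records with their weather attributes (assumed file/column names).
faults = pd.read_csv("weather_lightning_infrastructure_faults.csv")
X = faults.drop(columns=["zone"])
y = faults["zone"]                        # AMZ, UMZ, or PMZ

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)

# One hidden layer with 20 nodes, as described for the best zone model.
zone_nn = MLPClassifier(hidden_layer_sizes=(20,), max_iter=1000, random_state=42)
zone_nn.fit(X_train, y_train)
pred = zone_nn.predict(X_test)

print("accuracy :", accuracy_score(y_test, pred))
print("precision:", precision_score(y_test, pred, average="macro"))
print("recall   :", recall_score(y_test, pred, average="macro"))
print("f-measure:", f1_score(y_test, pred, average="macro"))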

C. Substation Prediction Models

The four substation prediction models were trained on historical fault data from the IOU grid together with weather data. Of the 1725 records with faults and weather data, 70% were used for training and 30% for testing the trained models. These models predict the IOU substation ID where the fault occurred. The best-performing model was the one created with the RPART algorithm, as shown in the shaded area in Fig. 3. This trained model produced an accuracy of 59%, an average precision of 66%, an average recall of 54%, and an f-measure of 59%.
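RPART refers to recursive partitioning, i.e. a decision-tree learner. A rough Python analogue of the substation model, with scikit-learn's DecisionTreeClassifier standing in for RPART and with assumed file and column names, could look like this:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

faults = pd.read_csv("weather_lightning_infrastructure_faults.csv")   # assumed file name
X = faults.drop(columns=["substation_id"])                            # assumed label column
y = faults["substation_id"]

# 70% training / 30% testing, as above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42)

tree = DecisionTreeClassifier(random_state=42)   # recursive partitioning of the feature space
tree.fit(X_train, y_train)
pred = tree.predict(X_test)

print("accuracy :", accuracy_score(y_test, pred))
print("precision:", precision_score(y_test, pred, average="macro", zero_division=0))
print("recall   :", recall_score(y_test, pred, average="macro", zero_division=0))
print("f-measure:", f1_score(y_test, pred, average="macro", zero_division=0))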

D. Infrastructure Prediction Models

The four infrastructure prediction models were trained on historical fault data from the IOU grid together with weather data. Of the 1725 records with faults and weather data, 70% were used for training and 30% for testing the trained models. These models predict the type of infrastructure, OH (overhead) or UG (underground), on the section of the IOU grid where the fault occurred. The best-performing model was the one created by training a Neural Network algorithm, as shown in the shaded area in Fig. 3. The model produced an accuracy of 77%, an average precision of 62%, an average recall of 52%, and an f-measure of 57%.

TABLE I. NEURAL NETWORK OUTPUT OF FAULT PREDICTION MODEL

[Fig. 3 diagram: the Weather_Lightning_Fault_No_Fault table (historical substation data joined to the weather and other properties) feeds the Fault Prediction Models (NN, KSVM, RPART, NB), which output the predicted faults and no-faults; the weather, lightning, infrastructure, date, and time table (faults only) feeds the Zone, Substation, Infrastructure, and Feeder Prediction Models (each built with NN, KSVM, RPART, and NB), which output the predicted zone, substation, infrastructure, and feeder.]

Figure 3. Machine Learning Models

E. Feeder Prediction Models

The four feeder prediction models were trained on historical fault data from the IOU grid together with weather data. Of the 1725 records with faults and weather data, 70% were used for training and 30% for testing the trained models. These models predict the IOU feeder where the fault occurred. The best-performing model was the one created with the recursive partitioning (RPART) algorithm, as shown in the shaded area in Fig. 3. This trained model produced an accuracy of 74%, an average precision of 79%, an average recall of 70%, and an f-measure of 74%.

F. Comparative f-measures of Predictive Models

Fig. 4 displays a graph of the average f-measure values from the four models created for each analysis. The f-measure is calculated as the harmonic mean of precision and recall, using the formula given in Eq. 1. Precision is the fraction of the data points assigned to a class that have been correctly classified; it can also be viewed as the probability that a retrieved data point, selected uniformly at random, is relevant. Recall is the fraction of the actual set of data points that have been correctly classified.
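For reference, the standard definitions underlying these metrics (presumably what Eq. 1 states), written in terms of true positives (TP), false positives (FP), and false negatives (FN) for a given class, are:

\[
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \qquad
F = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\]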

