58 3 Algorithms and Techniques

    end
    d_1 = d_1 ∪ X_2;
    Train c_1 based on d_1 and c_2 based on d_2;
    U = U - X_1 - X_2;
end

The advantages of the co-training approach are: (1) the method is simple and based on self-training, and it can be wrapped around other, more complex classifiers; and (2) it is less sensitive to early errors than self-training. The disadvantage of co-training is that its assumption seems too strong: the features of a dataset cannot always be partitioned into two independent subsets.

3.4.3 Generative Models

The key idea of the generative model based semi-supervised learning approach is to construct an appropriate model that best “matches” the data, so that classification can then be done easily. The essential part of constructing the model is finding proper values for the model parameters.

It is assumed that both the labeled and the unlabeled datasets follow the same kind of model with similar parameters. One common strategy is to employ an EM-like (i.e., expectation maximization [72]) algorithm to estimate the most likely values of the parameters. Generally, the model to be chosen is application oriented and may differ from user to user, e.g., mixture of Gaussians, naive Bayes, and so forth. Algorithm 3.12 shows example pseudocode that combines the naive Bayes model with the EM procedure [192, 53].

Algorithm 3.12: The generative model based algorithm
Input: A labeled dataset L, an unlabeled dataset U
Output: All data with labeled class
Build the initial model M_0 based on L;
for i = 0, 1, 2, ... while results are not satisfactory do
    for each data d ∈ U do
        E-step: estimate Pr(c|d, M_i) using a naive Bayes classifier, where c is a cluster;
    end
    for each cluster c do
        M-step: estimate θ_c to build the next model M_{i+1}, where θ is the parameter vector;
    end
end

The advantages of the generative model based approach are: (1) the method is clear and embedded in a probabilistic framework; and (2) it is very effective if the chosen model matches the data well. The main drawbacks of the generative model based approach are: (1) the appropriate model is difficult to choose, and a wrong model may weaken the effectiveness of the approach; and (2) EM is prone to local maxima. If the global maximum is very different from the local maximum that EM converges to, then the effectiveness of the method is weakened.
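The loop of Algorithm 3.12 can be illustrated with a small Python sketch. It is not the authors' implementation, only one possible reading under stated assumptions: each data item is a list of word tokens, the model is a multinomial naive Bayes classifier with Laplace smoothing (`alpha`), and "results are not satisfactory" is interpreted as "the posteriors of the unlabeled data still change by more than `tol`". The function names `train_nb`, `posterior`, and `em_naive_bayes`, as well as the toy data, are hypothetical.

```python
import math
from collections import Counter


def train_nb(docs, weights, classes, vocab, alpha=1.0):
    """M-step: estimate class priors and word likelihoods from soft class weights."""
    prior, cond = {}, {}
    total = sum(sum(w.values()) for w in weights)
    for c in classes:
        mass = sum(w[c] for w in weights)
        prior[c] = (mass + alpha) / (total + alpha * len(classes))
        counts = Counter()
        for doc, w in zip(docs, weights):
            for word in doc:
                counts[word] += w[c]  # fractional (expected) word counts
        denom = sum(counts.values()) + alpha * len(vocab)
        cond[c] = {v: (counts[v] + alpha) / denom for v in vocab}
    return prior, cond


def posterior(doc, model, classes):
    """E-step for one item: Pr(c | d, M) under the naive Bayes assumption."""
    prior, cond = model
    log_p = {c: math.log(prior[c]) + sum(math.log(cond[c][w]) for w in doc)
             for c in classes}
    top = max(log_p.values())  # subtract the maximum for numerical stability
    unnorm = {c: math.exp(lp - top) for c, lp in log_p.items()}
    z = sum(unnorm.values())
    return {c: p / z for c, p in unnorm.items()}


def em_naive_bayes(labeled, unlabeled, max_iter=50, tol=1e-6):
    classes = sorted({y for _, y in labeled})
    vocab = sorted({w for d, _ in labeled for w in d} |
                   {w for d in unlabeled for w in d})
    l_docs = [d for d, _ in labeled]
    # Labeled items keep hard (0/1) class weights throughout.
    l_weights = [{c: 1.0 if c == y else 0.0 for c in classes} for _, y in labeled]
    model = train_nb(l_docs, l_weights, classes, vocab)  # initial model from L
    prev = None
    for _ in range(max_iter):
        # E-step: soft class assignments for every unlabeled item.
        u_weights = [posterior(d, model, classes) for d in unlabeled]
        # M-step: re-estimate the parameters from labeled and unlabeled data.
        model = train_nb(l_docs + unlabeled, l_weights + u_weights, classes, vocab)
        flat = [w[c] for w in u_weights for c in classes]
        # Assumed stopping rule: posteriors have stopped changing.
        if prev is not None and max(
                (abs(a - b) for a, b in zip(flat, prev)), default=0.0) < tol:
            break
        prev = flat
    final = [posterior(d, model, classes) for d in unlabeled]
    labels = [max(classes, key=p.get) for p in final]
    return model, labels


# Toy data (hypothetical): two labeled and four unlabeled "documents".
labeled = [(["ball", "goal", "team"], "sport"),
           (["vote", "party", "law"], "politics")]
unlabeled = [["goal", "team", "coach"], ["law", "vote", "senate"],
             ["coach", "ball"], ["senate", "party"]]
model, labels = em_naive_bayes(labeled, unlabeled)
print(labels)
```

In this sketch the unlabeled data contribute fractional counts in the M-step, so words that never occur in L (here "coach" and "senate") can still acquire class-specific probabilities. Because EM only finds a local maximum, the result depends on the initial model built from L.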

