
C that contains every teachable concept relative to X. The set C of all definable target concepts corresponds to the power set of X (the set of all subsets of X), which contains |C| = 2^|X| concepts. Suppose that instances in X are defined by n boolean features. In this case, there will be |X| = 2^n distinct instances, and therefore |C| = 2^|X| = 2^(2^n) distinct concepts. Of course, to learn such an unbiased concept class, the learner must itself use an unbiased hypothesis space H = C. Substituting |H| = 2^(2^n) into Equation (7.2) gives the sample complexity for learning the unbiased concept class relative to X:

m ≥ (1/ε)(2^n ln 2 + ln(1/δ))    (7.5)

Thus, this unbiased class of target concepts has exponential sample complexity under the PAC model, according to Equation (7.2). Although Equations (7.2) and (7.5) are not tight upper bounds, it can in fact be proven that the sample complexity for the unbiased concept class is exponential in n.
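To make this growth concrete, here is a minimal Python sketch (the function and variable names are mine, not from the text) that evaluates the bound of Equation (7.2), m ≥ (1/ε)(ln|H| + ln(1/δ)), for the unbiased class, where ln|H| = 2^n ln 2:

```python
import math

def pac_sample_bound(ln_h, epsilon, delta):
    """Equation (7.2): m >= (1/epsilon) * (ln|H| + ln(1/delta))."""
    return math.ceil((1.0 / epsilon) * (ln_h + math.log(1.0 / delta)))

# Unbiased class over n boolean features: |H| = 2^(2^n), so ln|H| = 2^n * ln 2.
for n in range(1, 11):
    m = pac_sample_bound(ln_h=(2 ** n) * math.log(2), epsilon=0.1, delta=0.05)
    print(f"n = {n:2d}  ->  m >= {m}")
```

The required number of training examples roughly doubles with each additional feature, which is the exponential growth in n described above.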

7.3.3.2 k-TERM DNF AND k-CNF CONCEPTS


It is also possible to find concept classes that have polynomial sample complexity, but nevertheless cannot be learned in polynomial time. One interesting example is the concept class C of k-term disjunctive normal form (k-term DNF) expressions. k-term DNF expressions are of the form T1 ∨ T2 ∨ ... ∨ Tk, where each term Ti is a conjunction of n boolean attributes and their negations. Assuming H = C, it is easy to show that |H| is at most 3^(nk) (because there are k terms, each of which may take on 3^n possible values: each attribute can occur positively, occur negated, or be absent). Note that 3^(nk) is an overestimate of |H|, because it double-counts the cases where Ti = Tj and where Ti is more_general_than Tj. Still, we can use this upper bound on |H| to obtain an upper bound on the sample complexity, substituting it into Equation (7.2):

m ≥ (1/ε)(nk ln 3 + ln(1/δ))

which indicates that the sample complexity of k-term DNF is polynomial in 1/ε, 1/δ, n, and k. Despite having polynomial sample complexity, the computational complexity is not polynomial, because this learning problem can be shown to be equivalent to other problems that are known to be unsolvable in polynomial time (unless RP = NP). Thus, although k-term DNF has polynomial sample complexity, it does not have polynomial computational complexity for a learner using H = C.
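For comparison with the unbiased case, the same sketch with ln|H| ≤ nk ln 3 (again, the names are illustrative, not from the text) shows the k-term DNF bound growing only linearly in n and k:

```python
import math

def pac_sample_bound(ln_h, epsilon, delta):
    """Equation (7.2): m >= (1/epsilon) * (ln|H| + ln(1/delta))."""
    return math.ceil((1.0 / epsilon) * (ln_h + math.log(1.0 / delta)))

# k-term DNF over n boolean attributes: |H| <= 3^(nk), so ln|H| <= n * k * ln 3.
for n in (10, 20, 40):
    for k in (2, 3):
        m = pac_sample_bound(ln_h=n * k * math.log(3), epsilon=0.1, delta=0.05)
        print(f"n = {n}, k = {k}  ->  m >= {m}")
```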

The surprising fact about k-term DNF is that although it is not PAC-learnable, there is a strictly larger concept class that is! This is possible because the larger concept class has polynomial computational complexity per example and still has polynomial sample complexity. This larger class is the class of k-CNF expressions: conjunctions of arbitrary length of the form T1 ∧ T2 ∧ ... ∧ Tj, where each Ti is a disjunction of up to k boolean attributes. It is straightforward to show that k-CNF subsumes k-term DNF, because any k-term DNF expression can easily be rewritten as an equivalent k-CNF expression (though not vice versa).
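The rewriting is just the distributive law: T1 ∨ ... ∨ Tk is equivalent to the conjunction of all clauses (l1 ∨ ... ∨ lk) formed by choosing one literal li from each term Ti, and every such clause contains at most k literals. The following rough Python sketch of this expansion uses an encoding of my own (a literal is a pair of attribute index and polarity):

```python
from itertools import product

def dnf_to_kcnf(terms):
    """Rewrite a k-term DNF as a k-CNF by distributing OR over AND.

    Each term is a tuple of literals; a literal is (attribute_index, polarity),
    e.g. (2, False) means "not x2". Returns a list of clauses, each a tuple of
    at most k literals (one chosen from each of the k terms).
    """
    clauses = []
    for choice in product(*terms):                  # one literal from each term
        clauses.append(tuple(sorted(set(choice))))  # drop duplicated literals
    return clauses

# (x0 AND x1) OR (NOT x2): a 2-term DNF over three attributes.
terms = [((0, True), (1, True)), ((2, False),)]
for clause in dnf_to_kcnf(terms):
    print(clause)
# Prints the clauses of (x0 OR NOT x2) AND (x1 OR NOT x2), a 2-CNF.
```

With k terms of up to n literals each, the expansion yields at most n^k clauses, which is polynomial in n for any fixed k; this is why the k-CNF representation keeps the per-example computation polynomial.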
