Fast subtree kernels on graphs - VideoLectures
Fast subtree kernels on graphs<br />
Nino Shervashidze<br />
joint work with Karsten Borgwardt<br />
Machine Learning and Computational Biology Research Group<br />
Max Planck Institute for Biological Cybernetics, Tübingen<br />
Max Planck Institute for Developmental Biology, Tübingen<br />
9 December 2009<br />
N. Shervashidze, K. Borgwardt Fast subtree kernels on graphs NIPS 1
Introduction<br />
◮ Kernels are inner products in some feature space H:<br />
$k(x, x') = \langle \phi(x), \phi(x') \rangle$.<br />
◮ Intuitively, $k(x, x')$ is a measure of similarity of $x$ and $x'$.<br />
◮ $x$ and $x'$ can be vectors, but also strings, trees, graphs.<br />
◮ Kernels are used within kernel methods in<br />
  ◮ classification (SVM),<br />
  ◮ regression,<br />
  ◮ feature selection,<br />
  ◮ two-sample problems, etc.<br />
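As a minimal sketch of the definition above, the following Python snippet evaluates a kernel through an explicit feature map. The bag-of-characters feature map `phi` and the function name `kernel` are illustrative assumptions and do not come from the talk.

```python
from collections import Counter


def phi(s):
    """Hypothetical feature map: count how often each character occurs in s."""
    return Counter(s)


def kernel(x, y):
    """k(x, y) = <phi(x), phi(y)>, the inner product of the two count vectors."""
    fx, fy = phi(x), phi(y)
    # A Counter returns 0 for missing keys, so unmatched features add nothing.
    return sum(count * fy[feature] for feature, count in fx.items())


print(kernel("abca", "aab"))  # prints 5 (a: 2*2, b: 1*1, c: 1*0)
```

The same pattern carries over to graphs once a feature map that counts substructures is available.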
Why graph kernels?<br />
For instance, they can be used in graph classification.<br />
[Figure by Koji Tsuda]<br />
Overview<br />
Overview of graph kernels<br />
◮ Graph kernels usually count matching subgraphs (Haussler, 1999):<br />
  ◮ paths, walks, cycles, graphlets, etc.<br />
◮ The all-subgraphs kernel is at least as hard to compute as isomorphism checking (Gärtner et al., 2003)<br />
◮ Restricted classes of subgraphs: better runtime (and no isomorphism checking)<br />
◮ But we still need graph kernels that<br />
  ◮ can take into account node and edge labels<br />
  ◮ are efficient to compute even on large graphs<br />
[Figure: Runtime for labeled graphs, plotted against graph size (100 to 1000). Curves: Subtree kernel (Ramon and Gaertner, 2003); Fast Random Walk (Vishwanathan et al., 2007); Shortest Path (Borgwardt and Kriegel, 2005); 3-Graphlet (Shervashidze et al., 2009); Weisfeiler-Lehman subtree kernel (this talk).]<br />
Setting: 100 graphs, subtree height 3, alphabet size 25, max. degree $n/2$, $n^2/2$ edges<br />
Subtree kernels<br />
◮ Informally, subtree kernels iteratively look at neighborhoods of nodes.<br />
◮ Unfolding the structure over iterations, we get a tree-like pattern, called “subtree” or “tree-walk” in the literature.<br />
[Figure: an example graph with nodes 1 to 6 and the subtree of height 2 rooted at node 1]<br />
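The unfolding described above can be sketched in a few lines of Python. The function name `unfold`, the nested-tuple representation, and the 4-node example graph are assumptions made for illustration; the graph is not the one drawn on the slide.

```python
def unfold(adj, root, height):
    """Return the tree-like pattern of the given height rooted at `root`.

    The pattern is a nested tuple (node, children), where the children are
    the patterns of height - 1 rooted at each neighbor of `node`.
    """
    if height == 0:
        return (root, ())
    children = tuple(unfold(adj, nb, height - 1) for nb in sorted(adj[root]))
    return (root, children)


# Hypothetical 4-node graph given as adjacency lists.
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
tree = unfold(adj, 1, 2)
print(tree)
```

Note that a node can reappear further down the pattern (here node 1 shows up again among the leaves), which is why these patterns are also called tree-walks.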
The 1-dimensional Weisfeiler-Lehman algorithm (1968)<br />
Given two graphs G and G′:<br />
[Figure: two graphs G and G′ in which every node carries the initial label 1]<br />
Are they non-isomorphic?<br />
The 1-dimensional WL algorithm may answer this question.<br />
The 1-dimensional Weisfeiler-Lehman algorithm: Iteration 1<br />
Each iteration of the 1-dimensional WL test comprises the following steps:<br />
1. Multiset-label determination and sorting: O(m) via bucket sort<br />
2. Label compression: O(m) via radix sort<br />
3. Relabeling: O(n)<br />
[Figure: in both graphs, each node receives its own label followed by the sorted labels of its neighbors, giving the multiset labels 1,11; 1,111 and 1,1111. Compression maps 1,11 → 2, 1,111 → 3 and 1,1111 → 4, and the nodes are relabeled accordingly.]<br />
Are the label sets of G and G′ identical? Yes. Continue.<br />
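The three steps of one iteration can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `wl_iteration`, the adjacency-list representation, and the two 3-node example graphs are assumptions, and Python's built-in comparison sort stands in for the bucket and radix sorts named on the slide, so the sketch does not achieve the O(m) bound.

```python
def wl_iteration(graphs, next_label):
    """Run one WL relabeling step jointly on several graphs.

    graphs:     list of (adj, labels) pairs; adj maps a node to its neighbors,
                labels maps a node to its current integer label
    next_label: first unused integer label
    Returns the list of new label dicts and the compression table.
    """
    # Step 1: own label followed by the sorted labels of the neighbors.
    multisets = [
        {v: (labels[v], tuple(sorted(labels[u] for u in adj[v]))) for v in adj}
        for adj, labels in graphs
    ]
    # Step 2: sort the distinct multiset labels and give each a short new label.
    distinct = sorted({ms for per_graph in multisets for ms in per_graph.values()})
    table = {ms: next_label + i for i, ms in enumerate(distinct)}
    # Step 3: replace every multiset label by its compressed label.
    relabeled = [
        {v: table[ms] for v, ms in per_graph.items()} for per_graph in multisets
    ]
    return relabeled, table


# Hypothetical inputs: a 3-node path and a triangle, all nodes labeled 1.
path = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
ones = {0: 1, 1: 1, 2: 1}
relabeled, table = wl_iteration([(path, ones), (triangle, dict(ones))], next_label=2)
print(relabeled)
```

Compressing jointly over both graphs matters: a given multiset label must map to the same new label in G and in G′, otherwise the labels could not be compared afterwards.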
The 1-dimensional Weisfeiler-Lehman algorithm: Iteration 2<br />
1. Multiset-label determination and sorting: O(m) via bucket sort<br />
2. Label compression: O(m) via radix sort<br />
3. Relabeling: O(n)<br />
[Figure: after sorting, the multiset labels occurring in the two graphs are 2,24; 2,33; 2,34; 3,224; 3,234 and 4,2233. Compression maps 2,24 → 5, 2,33 → 6, 2,34 → 7, 3,224 → 8, 3,234 → 9 and 4,2233 → 10, and the nodes are relabeled accordingly.]<br />
Are the label sets of G and G′ identical? No.<br />
Output: YES (G and G′ are non-isomorphic).<br />
Overall complexity: O(hm) for h iterations<br />
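Putting the iterations together, the whole test can be sketched as below. The names `relabel` and `wl_nonisomorphic` and all example graphs are hypothetical. The sketch compares labels with their multiplicities, which is an assumption about what "label sets" means on the slide, and it uses Python's built-in sort instead of bucket and radix sort.

```python
def relabel(graphs):
    """One joint WL iteration over a list of (adj, labels) pairs."""
    multisets = [
        {v: (labels[v], tuple(sorted(labels[u] for u in adj[v]))) for v in adj}
        for adj, labels in graphs
    ]
    distinct = sorted({ms for per_graph in multisets for ms in per_graph.values()})
    table = {ms: i for i, ms in enumerate(distinct)}
    return [{v: table[ms] for v, ms in per_graph.items()} for per_graph in multisets]


def wl_nonisomorphic(adj_g, labels_g, adj_h, labels_h, max_iter):
    """Return True if the test proves non-isomorphism, False if inconclusive."""
    for _ in range(max_iter):
        if sorted(labels_g.values()) != sorted(labels_h.values()):
            return True
        labels_g, labels_h = relabel([(adj_g, labels_g), (adj_h, labels_h)])
    return sorted(labels_g.values()) != sorted(labels_h.values())


# Hypothetical example 1: a 4-node path versus a 4-node star.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star4 = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
ones4 = {v: 1 for v in range(4)}
print(wl_nonisomorphic(path4, ones4, star4, dict(ones4), max_iter=4))

# Hypothetical example 2: a 6-cycle versus two disjoint triangles.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
ones6 = {v: 1 for v in range(6)}
print(wl_nonisomorphic(cycle6, ones6, triangles, dict(ones6), max_iter=6))
```

The second example shows why the algorithm only "may" answer the question: both graphs are 2-regular, so every node keeps receiving the same label and the test stays inconclusive even though the graphs are not isomorphic.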
The Weisfeiler-Lehman kernel on graphs<br />
Differences between test and kernel<br />
WL kernel vs isomorphism test<br />
The test<br />
◮ checks sets of node labels of two graphs for identity after each iteration<br />
◮ stops when the sets become different or when the number of iterations reaches n<br />
◮ is computed in O(hm)<br />
The kernel<br />
◮ counts matching pairs of labels in two graphs after each iteration<br />
◮ the number of iterations h is a parameter of the algorithm (in practice h of 2 or 3 gives the best results)<br />
◮ is computed in O(hm)<br />
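Read as a formula, the description of the kernel can be sketched as follows. The notation is an assumption introduced here and does not appear on the slides: $V$ and $V'$ are the node sets of $G$ and $G'$, $l_i(v)$ is the label of node $v$ after iteration $i$ (with $l_0$ the original label), and $\delta$ equals 1 when its two arguments are equal and 0 otherwise.

```latex
k^{(h)}_{WL}(G, G') = \sum_{i=0}^{h} \sum_{v \in V} \sum_{v' \in V'} \delta\bigl(l_i(v),\, l_i(v')\bigr)
```

The inner double sum counts the matching pairs of labels after iteration $i$, and the outer sum accumulates these counts from the initial labeling up to iteration $h$.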
Definitions<br />
The Weisfeiler-Lehman kernel on a pair of graphs: Initialization<br />
[Figure: graph G with node labels 5, 2, 4, 3, 1, 1 and graph G′ with node labels 5, 2, 4, 3, 1, 2]<br />
Initial feature vector representations of G and G′ (counts of the labels 1, 2, 3, 4, 5):<br />
$\phi_0(G) = (2, 1, 1, 1, 1)$<br />
$\phi_0(G') = (1, 2, 1, 1, 1)$<br />
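The initial feature vectors are simply label counts, which the short sketch below reproduces. The node-label lists are chosen to be consistent with the feature vectors on the slide, and the helper name `feature_vector` is hypothetical.

```python
from collections import Counter

labels_g = [5, 2, 4, 3, 1, 1]        # node labels of G (two nodes labeled 1)
labels_g_prime = [5, 2, 4, 3, 1, 2]  # node labels of G' (two nodes labeled 2)
alphabet = [1, 2, 3, 4, 5]


def feature_vector(labels, alphabet):
    """Count how many nodes carry each label of the alphabet."""
    counts = Counter(labels)
    return tuple(counts[a] for a in alphabet)


phi_g = feature_vector(labels_g, alphabet)
phi_g_prime = feature_vector(labels_g_prime, alphabet)
k0 = sum(a * b for a, b in zip(phi_g, phi_g_prime))

print(phi_g)        # prints (2, 1, 1, 1, 1)
print(phi_g_prime)  # prints (1, 2, 1, 1, 1)
print(k0)           # prints 7
```

The inner product of the two count vectors equals the number of node pairs, one node from each graph, that share a label.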
The Weisfeiler-Lehman kernel on a pair of graphs: Iteration 1
1. Multiset-label determination and sorting, O(m) via bucket sort
2. Label compression, O(m) via radix sort
3. Relabeling, O(n)
Update the feature vector representations of G and G′.
Sorted multiset-labels (the label of a node, followed by the sorted labels of its neighbours):
G: 1,4  1,4  2,35  3,245  4,1135  5,234
G′: 1,4  2,3  2,45  3,245  4,1235  5,234
Label compression:
1,4 → 6
2,3 → 7
2,35 → 8
2,45 → 9
3,245 → 10
4,1135 → 11
4,1235 → 12
5,234 → 13
Node labels after relabeling:
G: 6, 6, 8, 10, 11, 13
G′: 6, 7, 9, 10, 12, 13
Feature vector representations after the first iteration (entries 1 to 5 count the labels from the initialization, entries 6 to 13 the labels from the 1st iteration):
$\phi_1(G) = (2, 1, 1, 1, 1, 2, 0, 1, 0, 1, 1, 0, 1)$
$\phi_1(G') = (1, 2, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1)$
$k^{(1)}_{WL}(G, G') = 11.$
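The value 11 can be checked directly from the two feature vectors: a label that occurs a times in G and b times in G′ contributes a·b matching pairs, so the kernel value is the inner product of the vectors. The snippet below is only a sanity check; the variable names and the helper `dot` are chosen here for illustration.

```python
# Feature vectors after the 1st iteration, copied from the slide.
phi1_G = [2, 1, 1, 1, 1, 2, 0, 1, 0, 1, 1, 0, 1]
phi1_G_prime = [1, 2, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1]

def dot(a, b):
    """Inner product of two equally long count vectors."""
    return sum(x * y for x, y in zip(a, b))

k0 = dot(phi1_G[:5], phi1_G_prime[:5])  # initialization only (labels 1 to 5)
k1 = dot(phi1_G, phi1_G_prime)          # initialization plus 1st iteration

print(k0, k1)  # 7 11
```

The initial labels alone give 7 matching pairs; the first iteration adds 4 more (two for label 6, one each for labels 10 and 13).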
The Weisfeiler-Lehman kernel on a pair of graphs, more formally
The Weisfeiler-Lehman kernel on two graphs G and G′ is defined as:
$$k^{(h)}_{WL}(G, G') = \left|\{(s_i(v), s_i(v')) \mid f(s_i(v)) = f(s_i(v')),\ i \in \{0, \ldots, h\},\ v \in V,\ v' \in V'\}\right|,$$
where
◮ $s_i(v)$ is the sorted multiset-label of node v in iteration i,
◮ f is an injective label compression function,
◮ the sets $\{f(s_i(v)) \mid v \in V \cup V'\}$ and $\{f(s_j(v)) \mid v \in V \cup V'\}$ are disjoint for all $i \neq j$,
◮ $s_0(v)$ is the original label of v in the case of labeled graphs and 0 in the case of unlabeled graphs,
◮ and $f(s_0(v)) = s_0(v)$.
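The definition can be turned into a short Python sketch that counts, for every iteration i from 0 to h, the node pairs (v, v′) with equal compressed labels. Several things here are assumptions rather than part of the talk: the function names, the adjacency-dictionary format, the use of `sorted` and a dictionary in place of bucket and radix sort, and the edge lists of G and G′, which are reconstructed from the multiset-labels of the example and are not given explicitly in the slides. Because matches are counted separately per iteration, the disjointness of label sets across iterations holds implicitly.

```python
from collections import Counter

def adjacency(n, edges):
    """Build an undirected adjacency dictionary on nodes 0..n-1."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

def wl_kernel(adj_g, lab_g, adj_h, lab_h, h):
    """Count node pairs with equal compressed labels over iterations 0..h."""
    graphs = [adj_g, adj_h]
    labels = [dict(lab_g), dict(lab_h)]
    k = 0
    for i in range(h + 1):
        if i > 0:
            table = {}  # f: sorted multiset-label -> compressed label
            refined = []
            for adj, lab in zip(graphs, labels):
                new = {}
                for v, nbrs in adj.items():
                    s = (lab[v], tuple(sorted(lab[u] for u in nbrs)))
                    new[v] = table.setdefault(s, len(table))
                refined.append(new)
            labels = refined
        count_g = Counter(labels[0].values())
        count_h = Counter(labels[1].values())
        # A label seen a times in G and b times in G' yields a * b pairs.
        k += sum(c * count_h[l] for l, c in count_g.items())
    return k

# Example graphs, reconstructed (assumption) from the slide's multiset-labels.
adj_G = adjacency(6, [(0, 4), (1, 4), (2, 3), (2, 5), (3, 4), (3, 5), (4, 5)])
lab_G = {0: 1, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
adj_Gp = adjacency(6, [(0, 4), (1, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)])
lab_Gp = {0: 1, 1: 2, 2: 2, 3: 3, 4: 4, 5: 5}

print(wl_kernel(adj_G, lab_G, adj_Gp, lab_Gp, 0))  # 7
print(wl_kernel(adj_G, lab_G, adj_Gp, lab_Gp, 1))  # 11
```

With h = 1 the sketch reproduces the value 11 from the worked example.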
The Weisfeiler-Lehman kernel on N graphs
◮ Naive computation of our kernel on N graphs is $O(N^2 hm)$.
◮ Instead, perform the following steps for all graphs in each iteration:
1. Multiset-label determination and sorting
2. Label compression via hashing
3. Relabeling
◮ The WL kernel for all pairs can be computed in $O(Nhm + N^2 hn)$.
◮ In practice the first term dominates the runtime.
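The joint computation can be sketched as follows: all N graphs are relabeled together with one shared hash table per iteration (the $O(Nhm)$ part), and the kernel matrix is then filled from sparse label counts (the $O(N^2 hn)$ part, since a graph has at most n distinct labels per iteration). The function names, the input format, and the example graphs (reconstructed from the multiset-labels of the worked example) are assumptions made for this illustration, and Python's `sorted` replaces the linear-time sort.

```python
from collections import Counter

def wl_feature_maps(graphs, h):
    """graphs: list of (adjacency dict, label dict); returns one Counter each."""
    labels = [dict(lab) for _, lab in graphs]
    # Keys are (iteration, label), so labels of different iterations never mix.
    feats = [Counter((0, l) for l in lab.values()) for lab in labels]
    for i in range(1, h + 1):
        table = {}  # hash table shared by all graphs in this iteration
        refined = []
        for (adj, _), lab in zip(graphs, labels):
            new = {}
            for v, nbrs in adj.items():
                s = (lab[v], tuple(sorted(lab[u] for u in nbrs)))
                new[v] = table.setdefault(s, len(table))
            refined.append(new)
        labels = refined
        for feat, lab in zip(feats, labels):
            feat.update((i, l) for l in lab.values())
    return feats

def gram_matrix(feats):
    """Kernel matrix from sparse inner products of the label counts."""
    n = len(feats)
    K = [[0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a, n):
            K[a][b] = K[b][a] = sum(c * feats[b][l] for l, c in feats[a].items())
    return K

# Example graphs G and G' (edge lists are an assumption, see lead-in).
G = ({0: [4], 1: [4], 2: [3, 5], 3: [2, 4, 5], 4: [0, 1, 3, 5], 5: [2, 3, 4]},
     {0: 1, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5})
G_prime = ({0: [4], 1: [3], 2: [4, 5], 3: [1, 4, 5], 4: [0, 2, 3, 5], 5: [2, 3, 4]},
           {0: 1, 1: 2, 2: 2, 3: 3, 4: 4, 5: 5})

K = gram_matrix(wl_feature_maps([G, G_prime], h=1))
print(K)  # [[16, 11], [11, 14]]
```

Each graph is relabeled once per iteration regardless of how many pairs it takes part in, which is where the saving over the naive pairwise computation comes from.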
Runtime behaviour on synthetic graphs
Runtime comparison of naive and hashing approaches
[Figure: runtime in seconds (logarithmic axis) against the number of graphs N (10^1 to 10^3) for the two computations, "naive" and "with hashing".]
Experimental evaluation
Setup
Datasets
◮ MUTAG - mutagenic/non-mutagenic nitro compounds for Salmonella typhimurium
◮ NCI1 and NCI109 - active/inactive compounds in an anti-cancer screen
◮ D & D - enzymes/non-enzymes
Dataset            MUTAG   NCI1    NCI109  D & D
Maximum # nodes    28      111     111     5748
Average # nodes    17.93   29.87   29.68   284.32
# labels           7       37      54      89
Number of graphs   188     4110    4127    1178
Comparison partners
◮ Subtree kernel (Ramon and Gaertner, 2003)
◮ Fast Random Walk (Vishwanathan et al., 2007)
◮ Shortest Path (Borgwardt and Kriegel, 2005)
◮ 3-Graphlet (Shervashidze et al., 2009)
◮ Weisfeiler-Lehman subtree kernel (this talk)
[Figure: "Runtime for labeled graphs", plotted against graph size (100 to 1000).]
Results
Runtime and accuracy
[Figure: runtime (from 10 sec to 1000 days; values marked * are extrapolated) and accuracy (50 % to 85 %) of the WL, RG, 3 Graphlet, RW and SP kernels on MUTAG, NCI1, NCI109 and D&D.]
Conclusion
Conclusion and outlook
◮ We have defined a subtree kernel on graphs that is able to deal with node and edge labels. Its computation time is O(Nhm):
◮ linear in the number of graphs N,
◮ linear in subtree height h,
◮ linear in the number of edges in each graph, m.
◮ Inexact matching of the subtrees?
◮ Continuous or high-dimensional node labels?
Acknowledgements
We would like to thank Kurt Mehlhorn, Pascal Schweitzer, and Erik Jan van Leeuwen for fruitful discussions.