Tamino XQuery User Guide - Software AG Documentation
Text Retrieval<br />
Wildcard Characters<br />
In contrast to the maskcard character, which matches exactly one character, the wildcard character<br />
matches zero or more characters in a word. By default, the wildcard character is an asterisk "*".<br />
Consider the following query:<br />
let $text := text{"one, two"}<br />
return tf:containsAdjacentText($text, 1, "one", "*", "*")<br />
This query returns false: tf:containsAdjacentText expects two word tokens adjacent to<br />
"one", one for each "*", but the text contains only one further word, "two".<br />
If you use the default tokenizer, i.e. the white space-separated tokenizer, then the wildcard character<br />
is always the asterisk "*" (Unicode value U+002A).<br />
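
The matching rule described above can be modelled outside Tamino. The following Python sketch is not Tamino's implementation: it assumes a simplified tokenizer that splits on white space and punctuation, it does not model the distance argument (matching tokens must be consecutive and in order), and the helper names tokenize, word_matches and contains_adjacent are hypothetical. It only illustrates that a wildcard matches zero or more characters within a single word, while every search term still needs a word token of its own.

```python
import re

WILDCARD = "*"  # default wildcard character (U+002A)


def tokenize(text):
    """Simplified stand-in for a white space-separated tokenizer (assumption)."""
    return re.findall(r"\w+", text)


def word_matches(pattern, word):
    """True if the pattern matches the whole word; '*' matches zero or more characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split(WILDCARD))
    return re.fullmatch(regex, word) is not None


def contains_adjacent(text, patterns):
    """True if consecutive word tokens match the patterns in the given order."""
    words = tokenize(text)
    n = len(patterns)
    return any(
        all(word_matches(p, w) for p, w in zip(patterns, words[i:i + n]))
        for i in range(len(words) - n + 1)
    )


# Three search terms need three word tokens, but "one, two" has only two.
print(contains_adjacent("one, two", ["one", "*", "*"]))  # False
# Two search terms fit: "t*" matches the word "two".
print(contains_adjacent("one, two", ["one", "t*"]))      # True
```

In this model "t*o" also matches the word "to", because the wildcard may stand for zero characters.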
Using the Japanese Tokenizer<br />
If you use the Japanese tokenizer, all of the following characters are recognized as wildcard characters:<br />
Unicode Name                Code Value<br />
ASTERISK                    U+002A<br />
ARABIC FIVE POINTED STAR    U+066D<br />
ASTERISK OPERATOR           U+2217<br />
HEAVY ASTERISK              U+2731<br />
SMALL ASTERISK              U+FE61<br />
FULLWIDTH ASTERISK          U+FF0A<br />
Note: In contrast to the standard white space-separated tokenizer, this definition of wildcard<br />
characters is fixed and cannot be changed.<br />
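
When building search strings in client code, it can help to treat the six characters from the table above as equivalent. The following Python sketch is an illustration under assumptions: the names JAPANESE_WILDCARDS and normalize_wildcards are hypothetical, and whether Tamino performs any such normalization internally is not stated in this guide. The code points are those listed in the table, and the names are read from Python's standard Unicode database.

```python
import unicodedata

# Code points listed for the Japanese tokenizer in the table above.
JAPANESE_WILDCARDS = {
    "\u002A",  # ASTERISK
    "\u066D",  # ARABIC FIVE POINTED STAR
    "\u2217",  # ASTERISK OPERATOR
    "\u2731",  # HEAVY ASTERISK
    "\uFE61",  # SMALL ASTERISK
    "\uFF0A",  # FULLWIDTH ASTERISK
}


def normalize_wildcards(pattern):
    """Replace every listed wildcard variant with the plain asterisk U+002A."""
    return "".join("*" if ch in JAPANESE_WILDCARDS else ch for ch in pattern)


# Print each code point with its name from the Unicode database.
for ch in sorted(JAPANESE_WILDCARDS):
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

A search string typed with a full-width asterisk, for example, comes out of normalize_wildcards with a plain "*" in the same position.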
The Japanese tokenizer does not support wildcard characters in the middle of a word, since there<br />
are no explicit delimiter characters. So " * " will be treated as " *" adj " ".<br />
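
The rewriting described above, in which a search string with an embedded wildcard is handled as a wildcard-terminated prefix adjacent to the remainder, can be sketched as follows. This is a hypothetical Python model, not Tamino code: the function name split_mid_wildcard and the sample string are illustrative assumptions, only the plain asterisk is considered, and only the first wildcard is used as the split point.

```python
def split_mid_wildcard(pattern, wildcard="*"):
    """Split 'head*tail' into ['head*', 'tail']; other patterns are returned unchanged."""
    head, sep, tail = pattern.partition(wildcard)
    if sep and head and tail:
        # Wildcard sits between two non-empty parts: keep it on the prefix.
        return [head + wildcard, tail]
    return [pattern]


# Hypothetical sample: a wildcard between two characters.
print(split_mid_wildcard("東*京"))  # ['東*', '京']
# A trailing wildcard is left as it is.
print(split_mid_wildcard("東*"))    # ['東*']
```

The two resulting parts correspond to the operands of the adj relation mentioned above.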
The example queries below focus on the contents of the patient/submitted/diagnosis nodes to<br />
show the effect of performing search operations with or without wildcard characters on segmentation<br />
of Japanese words.<br />