"); - for (Token token : tokens) { - if (token.isMatch()) { - html.append(""); - } - html.append(token.getFragment()); - if (token.isMatch()) { - html.append(""); - } + @Override + public void emit(Emit emit) { + emits.add(emit); } - html.append("
"); - System.out.println(html); +}; +``` + +In many cases you may want to do perform tasks with both the non-matching +and the matching text. Such implementations may be better served by using +`Trie.tokenize()`. The `tokenize()` method allows looping over the +corpus to deal with matches as soon as they are encountered. Here's an +example that outputs key words as italicized HTML elements: + +```java +String speech = "The Answer to the Great Question... Of Life, " + + "the Universe and Everything... Is... Forty-two,' said " + + "Deep Thought, with infinite majesty and calm."; + +Trie trie = Trie.builder().ignoreOverlaps().onlyWholeWords().ignoreCase() + .addKeyword("great question") + .addKeyword("forty-two") + .addKeyword("deep thought") + .build(); + +Collection"); + +for (Token token : tokens) { + if (token.isMatch()) { + html.append(""); + } + html.append(token.getFragment()); + if (token.isMatch()) { + html.append(""); + } +} + +html.append("
"); +System.out.println(html); ``` You can also emit custom outputs. This might for example be useful to implement a trivial named entity recognizer. In this case use a PayloadTrie instead of a Trie: ```java - class Word { - private final String gender; - public Word(String gender) { - this.gender = gender; - } +class Word { + private final String gender; + public Word(String gender) { + this.gender = gender; } - - PayloadTrie