Douglas Lovell
360d76193a
Express parameters as final
2017-10-13 21:17:04 -03:00
Douglas Lovell
82d36386fd
Add comments, rename variables, remove unused lookahead
2017-10-13 20:47:21 -03:00
Douglas Lovell
b11c655ccf
Merge pull request #1 from stillleben/heartbeat
Heartbeat improvements from stillleben
2016-08-03 14:38:57 -06:00
Benni
165f18e581
Remove unused variables, remove duplicate tests (both "nonOverlappingWordTransitions" and "nonOverlappingWholeWords" contain exactly the same code)
2016-02-17 18:05:42 +01:00
Benni
06675a6074
Add support for punctuation in text: Those characters will form a separate token now.
2016-02-17 17:56:11 +01:00
Douglas Lovell
3839e406ce
we want the keywords back in all cases after all, but the start and end indexes refer to the text
2015-11-02 17:50:27 -07:00
Douglas Lovell
d764017abe
use word transitions for whole word only mode
2015-11-02 17:14:28 -07:00
Douglas Lovell
f89b000894
progress on leading and trailing white space problem
2015-11-02 11:49:44 -07:00
Douglas Lovell
9aa9695d38
version change for heartbeat changes
2015-11-02 11:49:14 -07:00
Douglas Lovell
c0d89cec2d
add a few word transition tests
2015-10-30 16:25:04 -06:00
Douglas Lovell
f9c2d9d4aa
all the tests pass
2015-10-30 15:03:06 -06:00
Douglas Lovell
b7bb0cbf5b
put text positions on the transitions and track in the token stream
2015-10-30 10:58:15 -06:00
Douglas Lovell
dd5f9b25fa
fix the off by ones
2015-10-29 15:29:44 -06:00
Douglas Lovell
7514478a65
Make the transition token the hash key for a transition
2015-10-29 15:28:23 -06:00
Douglas Lovell
8560af8cce
test trie in word transition mode
2015-10-28 17:03:29 -06:00
Douglas Lovell
51940af6e7
test state with word transitions
2015-10-28 16:57:37 -06:00
Douglas Lovell
1f63ae71d4
StringBuffer, StringIterator are old school
2015-10-28 16:56:27 -06:00
Douglas Lovell
4be3e115b6
add builder method for setting word transitions
2015-10-28 14:10:53 -06:00
Douglas Lovell
a646f233a5
improve the option name
2015-10-28 11:17:46 -06:00
Douglas Lovell
f05026cf90
added word transitions. all tests pass
2015-10-28 10:59:45 -06:00
Douglas Lovell
6283bf039d
first pass refactoring for word transitions
2015-10-27 16:55:35 -06:00
robert-bor
3393e4f51f
Issue #26 tokens report if they are 100% whitespace
2015-09-27 20:56:04 +02:00
robert-bor
438e546245
Issue #25 match tokens report back whether they are whole words or not
2015-09-27 18:22:38 +02:00
robert-bor
877a56c956
Issue #24 removed if condition that checked for empty emit strings, whereas emit() always returns a collection
2015-09-27 17:58:55 +02:00
robert-bor
b42c664796
Issue #24 big cleanup, removed all post-processing methods for whole words and non-overlapping sequences and integrated the same functionality closer to the AC algorithm.
2015-09-27 17:56:42 +02:00
robert-bor
b274844b75
Issue #24 stopOnHit removed; the functionality has been replaced by the superior firstMatch
2015-09-27 14:40:04 +02:00
robert-bor
bfaa32b20e
Issue #24 tokenize() method implementation extracted to separate class
2015-09-27 14:37:36 +02:00
robert-bor
5203efbbcb
Extra explanation on containsMatch
2015-09-23 20:56:24 +02:00
robert-bor
a46177415f
Updated README.md documentation
2015-09-23 08:39:13 +02:00
robert-bor
e365689391
v0.3.0
v0.3.0
2015-09-22 22:27:30 +02:00
robert-bor
dc27d6e3e9
pull #17 changes adopted to implement a whole word check on the entire keyword, including whitespaces.
2015-09-22 22:22:20 +02:00
robert-bor
76ae8222ea
issue #12 adopted the suggestion by yim1990 with a small change, so that the keyword emit is lowercased as well
2015-09-22 22:10:19 +02:00
robert-bor
e2c5334234
pull #14 implemented pull request by rripken for containsMatch and firstMatch
2015-09-22 22:02:30 +02:00
robert-bor
4633b1ba2a
Merge branch 'rripken-master' into feature/footprint-reduction
Conflicts:
src/main/java/org/ahocorasick/trie/Trie.java
src/test/java/org/ahocorasick/trie/TrieTest.java
2015-09-22 20:38:29 +02:00
robert-bor
30f003c5ae
Issue #18 fixed link to broken PDF, now points to http://cr.yp.to/bib/1975/aho.pdf
2015-09-22 20:22:24 +02:00
robert-bor
c18e030459
Merge branch 'SubOptimal-contrib' into feature/footprint-reduction
2015-09-22 20:18:52 +02:00
robert-bor
023c253c93
Issue #16 #20 #21 adopted pull request from remen which makes sure the failure states are constructed as part of the trie construction. This prevents the NPE which the referenced issues are complaining about.
2015-09-22 20:14:48 +02:00
robert-bor
fcefdfdaf9
Merge branch 'remen-master' into feature/footprint-reduction
Conflicts:
src/main/java/org/ahocorasick/trie/Trie.java
2015-09-22 20:06:04 +02:00
robert-bor
b85f8fc08f
Issue #22 added possibility to stop processing on generating at least one emit
2015-09-22 19:31:05 +02:00
robert-bor
4399e42b99
Issue #23 apply CharSequence to top-level parseText as well
2015-09-22 06:25:50 +02:00
robert-bor
055e13c298
Issue #23 removed the ParseConfiguration, rely on CharSequence instead
2015-09-21 22:03:59 +02:00
robert-bor
88799fb3da
Issue #23 added callback handler concept, which omits the custom setup of a list and instead places direct calls to the handler. Handlers are only supported at the lowest level of aho-corasick, i.e. no overlap, whole-word, or token support
Also added the possibility to pass a reader to the same level as above.
2015-09-21 21:09:26 +02:00
Petter Remen
9bce51e001
Issue #16 Use builder pattern to create Trie
Previously, there was a race condition in Trie#parseText since
it called constructFailureStates on first run without synchronization.
See https://github.com/robert-bor/aho-corasick/issues/16
This commit fixes this by using the builder pattern in order to
create a fully initialized Trie.
N.B. This changes the API
2015-07-03 12:29:31 +02:00
Frank Dietrich
285a74c37f
fix broken link to the white paper
2015-04-30 01:10:47 +02:00
ryan
d1478c7480
HashMap has better performance in my test cases.
2014-10-06 13:34:03 -07:00
ryan
a46e7dfe1d
Fixed formatting changes.
2014-10-06 11:02:01 -07:00
ryan
df503bae43
Added method and tests for a faster path to return the first match.
2014-10-06 10:52:35 -07:00
robert-bor
25eeef5168
v0.2.4 with bugfix #10
v0.2.4
2014-08-27 08:44:06 +02:00
robert-bor
2b125d2689
Issue #10 make sure that State emits a specific match only once
2014-08-27 08:42:46 +02:00
robert-bor
c96c57399a
update README.md
2014-08-26 10:11:05 +02:00