public class NgramExtractor extends Object
| Modifier and Type | Method and Description |
|---|---|
| Map<String,Integer> | extractCountedGrams(CharSequence text) |
| List<String> | extractGrams(CharSequence text): Creates the n-grams for a given text in the order they occur. |
| NgramExtractor | filter(NgramFilter filter) |
| List<Integer> | getGramLengths() |
| static NgramExtractor | gramLength(int gramLength) |
| static NgramExtractor | gramLengths(Integer... gramLength) |
| NgramExtractor | textPadding(char textPadding): To ensure having border grams, this character is added to the left and right of the text. |
public static NgramExtractor gramLength(int gramLength)
public static NgramExtractor gramLengths(Integer... gramLength)
public NgramExtractor filter(NgramFilter filter)
public NgramExtractor textPadding(char textPadding)
Example: when textPadding is a space ' ', the text input "foo" becomes " foo ", which ensures that border n-grams such as " f" are created.
If the text already has that character in that position (e.g. it already starts with it), the character is not added there.
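The padding rule can be illustrated with a small standalone sketch. It is not the library's implementation: the class `PaddingSketch` and the method `pad` are hypothetical names introduced here, and the handling of empty input is an assumption, since this page does not describe it.

```java
// Illustrative sketch only: mimics the padding rule described above.
// PaddingSketch and pad(...) are hypothetical names, not part of NgramExtractor.
final class PaddingSketch {

    // Adds the padding character to each side of the text,
    // unless the text already has that character on that side.
    static String pad(CharSequence text, char padding) {
        int len = text.length();
        StringBuilder sb = new StringBuilder(len + 2);
        // Assumption: empty input gets padded on both sides.
        if (len == 0 || text.charAt(0) != padding) {
            sb.append(padding);
        }
        sb.append(text);
        if (len == 0 || text.charAt(len - 1) != padding) {
            sb.append(padding);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Brackets make the added spaces visible.
        System.out.println("[" + pad("foo", ' ') + "]");  // prints "[ foo ]"
        System.out.println("[" + pad(" foo", ' ') + "]"); // prints "[ foo ]"
    }
}
```

In the second call the leading space is already present, so only the trailing one is added.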
textPadding - for example a space ' '.
@NotNull public List<String> extractGrams(@NotNull CharSequence text)
Example: with a gram length of 2, extractGrams("Foo bar") => [Fo,oo,o , b,ba,ar]
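The example corresponds to a plain sliding window over the text. The following is a minimal sketch of that idea, assuming a single gram length, no padding, and no filter. `GramSketch` and `grams` are hypothetical names, and this is not the library's own code.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: sliding-window n-gram extraction in order of occurrence.
// GramSketch and grams(...) are hypothetical names, not part of NgramExtractor.
final class GramSketch {

    // Returns every substring of length n, from left to right.
    static List<String> grams(CharSequence text, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        List<String> out = new ArrayList<>();
        // The last window starts at text.length() - n.
        for (int i = 0; i + n <= text.length(); i++) {
            out.add(text.subSequence(i, i + n).toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // Yields the six bigrams "Fo", "oo", "o ", " b", "ba", "ar".
        for (String gram : grams("Foo bar", 2)) {
            System.out.println("\"" + gram + "\"");
        }
    }
}
```

Going by the signatures listed on this page, the equivalent library call would presumably be `NgramExtractor.gramLength(2).extractGrams("Foo bar")`.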
text -
@NotNull public Map<String,Integer> extractCountedGrams(@NotNull CharSequence text)
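This page gives no description for extractCountedGrams. Judging from its name and return type, it presumably maps each gram to the number of times it occurs in the text. The sketch below shows that interpretation under stated assumptions: a single gram length, no padding, no filter, and insertion-ordered keys. `CountSketch` and `countGrams` are hypothetical names, not the library's API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: counts how often each n-gram occurs.
// CountSketch and countGrams(...) are hypothetical names, not part of NgramExtractor.
final class CountSketch {

    static Map<String, Integer> countGrams(CharSequence text, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        // Assumption: keys are kept in order of first occurrence.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i + n <= text.length(); i++) {
            String gram = text.subSequence(i, i + n).toString();
            // Start at 1 for a new gram, otherwise add 1 to the existing count.
            counts.merge(gram, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countGrams("banana", 2)); // prints {ba=1, an=2, na=2}
    }
}
```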
Copyright © 2015. All rights reserved.