Abstract
The current string matching algorithms nearly can not afford the burden of large memory demand when the patters amount increases dramatically. Matching automaton can not be established at all when the amount of patterns is at least tens of thousands. We present a solution to the problem of super large scale patterns matching (SLSPM). In our design, a matching trie is divided into one block matching trie and many general character matching tries if possible. During a block matching procedure our block matching automaton (trie) does not read the current state first. Instead, the automaton first reads the current text block symbol and decides whether it will be matched or not by a hash function. Then, the automaton looks for the current state in the states set in which all the states recognize the same current text block symbol. After the current state is found the automaton continues to read the next text block symbol. The theoretical analysis shows that under the worst case the proposed algorithm takes O(n) time approximately, where n is the length of the text. The experiment results show that our design matches only a little faster than the advanced Aho-Corasick because in the advanced Aho-Corasick the entire possible transition information has been stored. The advantage of SLSPM is that under software environment SLSPM is not slower than AC during the matching procedure, and also at least tens of thousands patterns can be loaded into the hybrid automatons of SLSPM so that it can be used well for super large scale patters matching environment.
| Original language | English |
|---|---|
| Pages (from-to) | 1147-1158 |
| Number of pages | 12 |
| Journal | Jisuanji Xuebao/Chinese Journal of Computers |
| Volume | 37 |
| Issue number | 5 |
| DOIs | |
| State | Published - May 2014 |
Keywords
- Algorithm
- Hybrid automaton
- Information security
- Network security
- String matching
- Super large scale patterns matching
Fingerprint
Dive into the research topics of 'Pattern matching for super large patterns set'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver