The security of data and network is in the hands of intrusion detection systems, virus scanners and spam filters, which are all critically based on string matching. But network links are getting faster and faster, and string matching is getting more and more difficult to per-form in real time. Traditional processors are not keeping up with the performance demands, whereas specialized hard-ware will never be able to compete with commodity hard-ware in terms of cost effectiveness, reusability and ease of programming.
We can manage the thing by the Cell processor when employed for string matching algorithms based on Deterministic Finite-state Automata(DFA). read about the Cell processor here and here.
The Cell BE is a heterogeneous, multi-core chip capable of massive floating point processing, optimized for compute-intensive workloads and broadband, rich media applications. It is composed of one 64-bit Power Processor Element (PPE), 8 specialized co-processors called Synergistic Processing Elements (SPE), a high-speed memory controller and a high-bandwidth bus interface.
The PPE is responsible for running the operating system and coordinating the SPEs. It is a traditional 64-bit PowerPC processor core with a VMX unit, 32 KB Level 1 instruction cache, 32 KB Level 1 data cache, and 512 KB Level 2 cache. The PPE is a dual issue, in-order execution design, 2-way SMT, running at 3.2 GHz.
The DFA reads in the input one symbol at a time and performs a state transition according to the current state and the value of the input. If the destination state s is final, then the current string is recognized. The acceptor enters a final state whenever a portion of the stream matches.
The optimized representation of the DFA is as follows. We represent the State Transition Table (STT) as a data structure in the form of a complete table of words, having a row for each state and a column for each of the possible inputs.
When more throughput is needed, multiple identical tiles can be used concurrently on distinct (with a minor overlap-ping) portions of the input streams. We say that two DFAs combined in this way are “in parallel “.The two tiles have an identical STT and therefore recognize the same dictionary. Since they operate on two portions of the input streams , which are separate (except for a small overlapping region, to allow matching of strings which cross the boundary), the combined throughput is effectively doubled.
Conmtributed by : Partha Sarathi Mondal, Faculty, Dept of IT, MIT
No comments:
Post a Comment