APPREAL (APProximate REduction of Automata and Languages) tool used to obtain the experimental results in the TACAS'18 paper: Approximate Reduction of Finite Automata for High-Speed Network Intrusion Detection

This is an artifact containing the tool APPREAL (APProximate REduction of Automata and Languages), used to obtain the experimental results in the following paper accepted for publication at TACAS'18:<br>M. Ceska, V. Havlena, L. Holik, O. Lengal, and T. Vojnar. <i>Approximate Reduction of Finite Automata for High-Speed Network Intrusion Detection.</i><div><br></div><div>The approach outlined in the related TACAS'18 paper is implemented as a Python prototype in this repository. Other file types include C source and header files (<b>.c</b> and <b>.h</b>), .html files, Perl-compatible regular expression (.pcre) files, and plain-text files, all in openly accessible formats.</div><div><br></div><div>The subdirectory <b>.src</b> contains source code relating to the reduction of NFAs (nondeterministic finite automata) obtained from PCREs (Perl compatible regular expressions) that occur in <a href="https://www.snort.org">Snort</a> rules:</div><div><br></div><ul><li><b>experiments/</b> - the settings of our experiments</li><li><b>netbench/</b> - the Netbench tool that we use to transform PCREs into NFAs</li><li><b>preproc/</b> - small programs used for pre-processing network traffic PCAP files</li><li><b>reduce/</b> - the tool performing the reduction itself</li><li><b>regexps/</b> - regular expressions that we have collected</li></ul><div><br></div><div>The subdirectory <b>.packages</b> contains the required software packages and a Bash shell script (install.sh) to install them.</div><div><div><br></div><div><div><p><b>Virtual Machine</b></p></div><div><br></div><div>The artifact is prepared to run on the TACAS'18 artifact evaluation virtual machine, available here: <a href="https://doi.org/10.6084/m9.figshare.5896615.v1">https://doi.org/10.6084/m9.figshare.5896615.v1</a></div><div><br></div><div>2048 MiB of memory should be sufficient to reproduce the results from Tables 1a and 1b in the related proceedings paper; for reproducing the experiments from Tables 2 and 3, it is recommended to set 
the memory to 8192 MiB or more.</div></div><div><br></div><div>Detailed instructions on how to reproduce the results from the TACAS'18 paper are available in <b>README.txt</b>.</div><div><b><br></b></div><div><b>Background</b></div><div><b><br></b></div><div>The related TACAS'18 paper considers the problem of approximate reduction of nondeterministic automata that appear in hardware-accelerated network intrusion detection systems (NIDSes). We define the error distance of a reduced automaton from the original one as the probability of packets being incorrectly classified by the reduced automaton (with respect to the probability distribution of packets in the network traffic). We use this notion to design an approximate reduction procedure that achieves a substantial size reduction (well beyond that of state-of-the-art language-preserving techniques) with a small, controlled error. We have implemented our approach and evaluated it on use cases from Snort, a popular NIDS. Our results provide experimental evidence that the method can be highly efficient in practice, allowing NIDSes to keep up with the rapid growth in network speeds.<br></div></div>
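<div>To make the error distance described in the Background more concrete, the following Python sketch estimates it empirically as the fraction of sample packets on which a reduced NFA and the original NFA disagree. This is only an illustration under stated assumptions: the tuple-based NFA representation and the names <b>accepts</b> and <b>empirical_error</b> are hypothetical and are not taken from APPREAL, and a plain sample-based estimate is swapped in here for the probability distribution of network traffic, which is not necessarily how the tool computes the distance.</div>

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical minimal NFA representation (NOT APPREAL's data structures):
# an NFA is a tuple (initial states, final states, transitions), where
# transitions maps (state, symbol) to the set of successor states.
Transitions = Dict[Tuple[int, str], Set[int]]
NFA = Tuple[Set[int], Set[int], Transitions]


def accepts(initial: Set[int], final: Set[int], delta: Transitions, word: str) -> bool:
    """Return True if the NFA accepts the word (on-the-fly subset simulation)."""
    current = set(initial)
    for symbol in word:
        current = {t for s in current for t in delta.get((s, symbol), set())}
        if not current:
            return False
    return bool(current & final)


def empirical_error(original: NFA, reduced: NFA, packets: Iterable[str]) -> float:
    """Fraction of sample packets classified differently by the two NFAs."""
    sample = list(packets)
    if not sample:
        raise ValueError("need at least one packet")
    mismatches = sum(accepts(*original, p) != accepts(*reduced, p) for p in sample)
    return mismatches / len(sample)


# The original NFA accepts exactly "ab".
original: NFA = ({0}, {2}, {(0, "a"): {1}, (1, "b"): {2}})
# A smaller NFA that over-approximates the language to "a" followed by any number of "b".
reduced: NFA = ({0}, {1}, {(0, "a"): {1}, (1, "b"): {1}})

# Toy "traffic" sample; the two NFAs disagree on "a" and "abb".
sample_packets = ["ab", "a", "abb", "ba"]
error = empirical_error(original, reduced, sample_packets)
```

<div>In this toy example the reduced automaton has one state fewer than the original and misclassifies two of the four sample packets, so a reduction of this kind would only be worthwhile if such packets were rare in the actual traffic.</div>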