figshare
Browse
file.pdf (159.02 kB)

Polyglot: Automatic Extraction of Protocol Message Format using Dynamic Binary Analysis

Download (159.02 kB)
journal contribution
posted on 2007-01-01, 00:00 authored by Juan Caballero, Heng Yin, Zhenkai Liang, Dawn Song
Protocol reverse engineering, the process of extracting the application-level protocol used by an implementation, without access to the protocol specification, is important for many network security applications. Recent work [17] has proposed protocol reverse engineering by using clustering on network traces. That kind of approach is limited by the lack of semantic information on network traces. In this paper we propose a new approach using program binaries. Our approach, shadowing, uses dynamic binary analysis and is based on a unique intuition—the way that an implementation of the protocol processes the received application data reveals a wealth of information about the protocol message format. We have implemented our approach in a system called Polyglot and evaluated it extensively using real-world implementations of five different protocols: DNS, HTTP, IRC, Samba and ICQ.We compare our results with the manually crafted message format, included in Wireshark, one of the state-of-the-art protocol analyzers. The differences we find are small and usually due to different implementations handling fields in different ways. Finding such differences between implementations is an added benefit, as they are important for problems such as fingerprint generation, fuzzing, and error detection.

History

Date

2007-01-01

Usage metrics

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC