Conference Proceeding

Abstract or Description

Fingerprinting is a technique widely used in the networking and security communities to identify different implementations of the same piece of networking software running on a remote host. A fingerprint is essentially a set of queries together with a classification function that is applied to the responses to those queries in order to assign the software to a class. So far, identifying fingerprints has remained a largely manual and arduous process. This paper proposes a novel approach to automatic fingerprint generation that explores a set of candidate queries and applies machine learning techniques to identify the valid queries and to learn an adequate classification function. Our results show that such an automatic process can generate accurate fingerprints that classify each piece of software into its proper class, and that the search space for query exploration remains largely unexploited, with many new queries awaiting discovery. Even with a preliminary exploration, we are able to identify new queries not previously used for fingerprinting.
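
As a rough illustration of the definition above (a fingerprint as a set of queries plus a classification function over the responses), the following minimal sketch may help. It is not the method of the paper: the `Host` type, the probe names, the toy implementations `impl_a` and `impl_b`, and the helper functions are all hypothetical, and a plain lookup table stands in for the machine learning step, which the abstract does not specify.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical model of a remote host: a function from a query to a response.
Host = Callable[[bytes], bytes]


def valid_queries(candidates: List[bytes],
                  labeled: List[Tuple[Host, str]]) -> List[bytes]:
    """Keep the candidate queries that draw different responses from
    at least two of the labeled hosts (assumed validity criterion)."""
    kept = []
    for query in candidates:
        responses = {host(query) for host, _ in labeled}
        if len(responses) > 1:
            kept.append(query)
    return kept


def learn_classifier(queries: List[bytes],
                     labeled: List[Tuple[Host, str]]) -> Callable[[Host], str]:
    """Build a classification function from labeled hosts.

    A lookup table keyed by the tuple of responses is used here as a
    simple stand-in for a learned classifier.
    """
    table: Dict[Tuple[bytes, ...], str] = {}
    for host, label in labeled:
        table[tuple(host(q) for q in queries)] = label

    def classify(host: Host) -> str:
        key = tuple(host(q) for q in queries)
        return table.get(key, "unknown")

    return classify


# Two toy implementations that differ only in how they answer one probe.
def impl_a(query: bytes) -> bytes:
    return b"RST" if query == b"probe-2" else b"ACK"


def impl_b(query: bytes) -> bytes:
    return b"ACK"


training = [(impl_a, "A"), (impl_b, "B")]
candidates = [b"probe-1", b"probe-2", b"probe-3"]

queries = valid_queries(candidates, training)
classify = learn_classifier(queries, training)
```

In this toy setup only `probe-2` separates the two implementations, so it is the only query kept, and the resulting fingerprint is the pair (`queries`, `classify`). A host whose responses match no table entry is reported as `"unknown"`.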