Abstract
Fingerprinting is a technique widely used in the networking and security communities to identify different implementations of the same piece of networking software running on a remote host. A fingerprint is essentially a set of queries together with a classification function that is applied to the responses to those queries in order to assign the software to a class. So far, identifying fingerprints remains a largely arduous and manual process. This paper proposes a novel approach to automatic fingerprint generation that explores a set of candidate queries and applies machine learning techniques to identify the valid queries and to learn an adequate classification function. Our results show that such an automatic process can generate accurate fingerprints that classify each piece of software into its proper class, and that the search space for query exploration remains largely unexploited, with many new queries awaiting discovery. Even with a preliminary exploration, we are able to identify new queries not previously used for fingerprinting.
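
To make the definition of a fingerprint concrete, the following is a minimal sketch of the two steps described above: selecting valid queries from a candidate set and deriving a classification function from the responses. It is not the method of the paper. The query names (`q1`, `q2`, `q3`), the three simulated implementations, and the helper functions are all hypothetical, and a plain lookup table stands in for the machine learning step.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a remote host: maps a query to a response.
Implementation = Callable[[str], str]


def impl_a(query: str) -> str:
    return {"q1": "ok", "q2": "reset", "q3": "ok"}[query]


def impl_b(query: str) -> str:
    return {"q1": "ok", "q2": "ignore", "q3": "ok"}[query]


def impl_c(query: str) -> str:
    return {"q1": "ok", "q2": "reset", "q3": "error"}[query]


# Assumed training set: known implementations labeled with their class.
TRAINING: Dict[str, Implementation] = {"A": impl_a, "B": impl_b, "C": impl_c}
CANDIDATES: List[str] = ["q1", "q2", "q3"]


def select_valid_queries(
    candidates: List[str], training: Dict[str, Implementation]
) -> List[str]:
    """Keep only queries whose responses differ between at least two classes."""
    valid = []
    for query in candidates:
        responses = {impl(query) for impl in training.values()}
        if len(responses) > 1:
            valid.append(query)
    return valid


def learn_classifier(
    queries: List[str], training: Dict[str, Implementation]
) -> Callable[[Implementation], str]:
    """Build a lookup table from response tuples to class labels.

    A real system would train a learned classifier here; the table is a
    simplification for illustration only.
    """
    table = {
        tuple(impl(q) for q in queries): label
        for label, impl in training.items()
    }

    def classify(target: Implementation) -> str:
        return table.get(tuple(target(q) for q in queries), "unknown")

    return classify


valid = select_valid_queries(CANDIDATES, TRAINING)
classify = learn_classifier(valid, TRAINING)
```

In this toy setting, `q1` is discarded because every implementation answers it identically, and the pair (`valid`, `classify`) plays the role of the fingerprint: the queries to send and the function applied to the responses.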