In his campaign for the U.S. presidency from 1975 to 1979, Ronald Reagan delivered over 1000 radio broadcasts. For over 600 of these we have direct evidence of Reagan's authorship of the text of the speeches, in the form of yellow pads, with material written ``in his own hand''. The aim of this study was to determine the authorship of 314 of the broadcasts for which no direct evidence is available.

Peter Hannaford was Reagan's main aide in drafting texts for the radio addresses during the years 1976-79, whereas the situation was less clear in 1975. To address the prediction problem for speeches delivered in different periods, we therefore learned both how to discriminate between the writing styles of Reagan and Hannaford and how to distinguish Reagan from the undifferentiated pool of his collaborators. We explored a wide range of off-the-shelf classification methods as well as fully Bayesian Poisson and Negative-Binomial models for word counts. Simple majority voting reinforced the cross-validated accuracies of our predictions on speeches of known authorship, which exceeded 90% in most cases. For the 314 speeches whose author is uncertain, we produced separate sets of predictions using the most accurate classification methods and the fully Bayesian models. All the predictions agree on 135 of the ``unknown'' speeches, whereas the fully Bayesian models agree on 289 of them. We further approximated the log-odds of authorship as a measure of the strength of our predictions.
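
To make the notions of log-odds and majority voting concrete, the following is a minimal sketch and not the models fitted in this study. It assumes independent Poisson word counts with known per-author rates (occurrences per 1000 words) and equal prior odds; the function names, the example words and all the rates are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def poisson_log_odds(counts, length, rates_a, rates_b):
    """Log-odds that a text is by author A rather than author B.

    Assumes independent Poisson counts for each word and equal prior odds.
    `rates_a` and `rates_b` are hypothetical rates per 1000 words;
    `length` is the number of words in the text.
    """
    total = 0.0
    for word, x in counts.items():
        mu_a = rates_a[word] * length / 1000.0
        mu_b = rates_b[word] * length / 1000.0
        # log Poisson(x; mu_a) - log Poisson(x; mu_b); the x! terms cancel.
        total += x * math.log(mu_a / mu_b) - (mu_a - mu_b)
    return total


def majority_vote(labels):
    """Return the label predicted by the largest number of classifiers."""
    return Counter(labels).most_common(1)[0][0]


# Illustrative values only; these are not estimates from the speeches.
rates_reagan = {"the": 60.0, "upon": 3.0}
rates_hannaford = {"the": 60.0, "upon": 0.5}
speech_counts = {"the": 30, "upon": 2}

log_odds = poisson_log_odds(speech_counts, 500, rates_reagan, rates_hannaford)
vote = majority_vote(["Reagan", "Hannaford", "Reagan"])
```

A positive value of `log_odds` favors the first author and a negative value favors the second, while its magnitude indicates how strongly the word counts discriminate between the two. A word used at the same rate by both authors, such as "the" in this toy example, contributes nothing to the sum.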