Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)
Copyright 2011 ACL
Abstract or Description
We consider the problem of predicting measurable responses to scientific articles based primarily on their text content. Specifically, we consider papers in two fields (economics and computational linguistics) and make predictions about downloads and within-community citations. Our approach is based on generalized linear models, allowing interpretability; a novel extension that captures first-order temporal effects is also presented. We demonstrate that text features significantly improve the accuracy of predictions over metadata features such as authors, topical categories, and publication venues.
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.
Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 594-604.