• Aha, D. W., Kibler, D., and Albert, M. K. (1991). Instance-based learning algorithms. Mach. Learn., 6(1): 37–66.
  • Aizawa, A. (2000). The feature quantity: an information theoretic perspective of tfidf-like measures. In Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '00, pages 104–111, New York, NY, USA. ACM.
  • Baierlein, R. (1971). Atoms and Information Theory: An Introduction to Statistical Mechanics. W.H. Freeman and Company.
  • Craven, M., DiPasquo, D., Freitag, D., McCallum, A., Mitchell, T., Nigam, K., and Slattery, S. (1998). Learning to extract symbolic knowledge from the world wide web. In Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence, AAAI '98/IAAI '98, pages 509–516, Menlo Park, CA, USA. American Association for Artificial Intelligence.
  • Fano, R. M. (1961). Transmission of Information: A Statistical Theory of Communication. MIT Press.
  • Ke, W., Mostafa, J., and Fu, Y. (2007). Collaborative classifier agents: studying the impact of learning in distributed document classification. In Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries, JCDL '07, pages 428–437, New York, NY, USA. ACM.
  • Knight, K. (1999). Mining online text. Commun. ACM, 42(11): 58–61.
  • Kullback, S. and Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22: 79–86.
  • Lang, K. (1995). NewsWeeder: Learning to filter netnews. In Proceedings of the Twelfth International Conference on Machine Learning, pages 331–339.
  • Lewis, D. D., Yang, Y., Rose, T. G., and Li, F. (2004). RCV1: A new benchmark collection for text categorization research. J. Mach. Learn. Res., 5: 361–397.
  • Liu, T., Liu, S., Cheng, Z., and Ma, W.-Y. (2003). An evaluation on feature selection for text clustering. In Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003), Washington DC.
  • Lovins, J. B. (1968). Development of a stemming algorithm. Mechanical Translation and Computational Linguistics, 11: 22–31.
  • Manning, C. D., Raghavan, P., and Schütze, H. (2008). Introduction to Information Retrieval. Cambridge University Press, 1st edition.
  • Robertson, S. (2004). Understanding inverse document frequency: on theoretical arguments for IDF. Journal of Documentation, 60: 503–520.
  • Robertson, S. and Zaragoza, H. (2009). The probabilistic relevance framework: BM25 and beyond. Foundations and Trends® in Information Retrieval, 3(4): 333–389.
  • Sebastiani, F. (2002). Machine learning in automated text categorization. ACM Comput. Surv., 34(1): 1–47.
  • Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27: 379–423 and 623–656.
  • Shaw, D. and Davis, C. H. (1983). Entropy and information: A multidisciplinary overview. Journal of the American Society for Information Science, 34(1): 67–74.
  • Siegler, M. and Witbrock, M. (1999). Improving the suitability of imperfect transcriptions for information retrieval from spoken documents. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 505–508. IEEE Press.
  • Spärck-Jones, K. (2004). A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 60: 493–502.
  • Taulbee, O. E. (1965). Invited papers: classification in information storage and retrieval. In Proceedings of the 1965 20th national conference, ACM '65, pages 119–137, New York, NY, USA. ACM. Chairman-House, R. W.
  • Witten, I. H. and Frank, E. (2005). Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann, San Francisco, 2nd edition.
  • Yang, Y. and Pedersen, J. O. (1997). A comparative study on feature selection in text categorization. In Proceedings of the Fourteenth International Conference on Machine Learning, ICML '97, pages 412–420, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.
  • Zhang, D., Wang, J., and Si, L. (2011). Document clustering with universum. In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, SIGIR '11, pages 873–882, New York, NY, USA. ACM.