Research Article
Cross-validation of neural network applications for automatic new topic identification
Article first published online: 10 DEC 2007
DOI: 10.1002/asi.20696
Copyright © 2008 Wiley Periodicals, Inc., A Wiley Company
Issue

Journal of the American Society for Information Science and Technology
Volume 59, Issue 3, pages 339–362, 1 February 2008
Additional Information
How to Cite
Ozmutlu, H. C., Cavdur, F. and Ozmutlu, S. (2008), Cross-validation of neural network applications for automatic new topic identification. Journal of the American Society for Information Science and Technology, 59: 339–362. doi: 10.1002/asi.20696
Publication History
- Issue published online: 29 JAN 2008
- Article first published online: 10 DEC 2007
- Manuscript Revised: 3 MAR 2007
- Manuscript Accepted: 3 MAR 2007
- Manuscript Received: 5 APR 2005
Abstract
This study reports experiments on the cross-validation of an artificial neural network application that automatically identifies topic changes in Web search engine user sessions, using data logs from different Web search engines to train and test the network. Sample data logs from the FAST and Excite search engines are used. The results show that topic shifts and continuations in user sessions on one Web search engine can be identified with neural networks trained on a data log from a different Web search engine. Although FAST and Excite users differ in some characteristics (e.g., number of queries per session, number of topics per session), the results demonstrate that users of both search engines behave similarly as they shift from one topic to another within a single search session. The key finding is that a neural network trained on a selected data log could be universal; that is, it can be applied to all Web search engine transaction logs regardless of the source of the training data log.
