Research Article
Language evolution and the spread of ideas on the Web: A procedure for identifying emergent hybrid word family members
Article first published online: 6 JUN 2006
DOI: 10.1002/asi.20437
Copyright © 2006 Wiley Periodicals, Inc., A Wiley Company
Issue

Journal of the American Society for Information Science and Technology
Volume 57, Issue 10, pages 1326–1337, August 2006
Additional Information
How to Cite
Thelwall, M. and Price, L. (2006), Language evolution and the spread of ideas on the Web: A procedure for identifying emergent hybrid word family members. J. Am. Soc. Inf. Sci., 57: 1326–1337. doi: 10.1002/asi.20437
Publication History
- Issue published online: 13 JUL 2006
- Article first published online: 6 JUN 2006
- Manuscript Revised: 31 AUG 2005
- Manuscript Accepted: 31 AUG 2005
- Manuscript Received: 26 JAN 2005
Abstract
Word usage is of interest to linguists for its own sake as well as to social scientists and others who seek to track the spread of ideas, for example, in public debates over political decisions. The historical evolution of language can be analyzed with the tools of corpus linguistics through evolving corpora and the Web. But word usage statistics can only be gathered for known words. In this article, techniques are described and tested for identifying new words from the Web, focusing on the case when the words are related to a topic and have a hybrid form with a common sequence of letters. The results highlight the need to employ a combination of search techniques and show the wide potential of hybrid word family investigations in linguistics and social science.
