Acquisition of synonyms using relations in product specification tables



In this paper, we propose a method for acquiring synonyms from product specifications. Product specifications are presented in tabular form, and we use their structural information, such as attribute-value pairs, in the acquisition process. First, we extract target words from the attributes in the specifications and vectorize each target word using the words that appear in its value field. Next, we compress the vector space using Latent Semantic Indexing (LSI). Finally, we cluster the words using the k-means method. The k-means method has known problems, such as how to choose the initial clusters in the first step and how to determine the appropriate number of clusters. We address these problems using domain knowledge and statistical information. © 2007 Wiley Periodicals, Inc. Syst Comp Jpn, 38(12): 25–36, 2007; Published online in Wiley InterScience (www.interscience. DOI 10.1002/scj.20827
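
The three steps named in the abstract (vectorizing attribute words by their value-field words, compressing with LSI, and clustering with k-means) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the paper: the toy SPECS data, the function names, raw term counts as weights, and unit-length normalization before clustering. The abstract does not say how domain knowledge and statistical information are applied, so the sketch simply stands in hand-picked seed attributes as the initial cluster centers, which also fixes the number of clusters.

```python
import numpy as np

# Hypothetical toy data: attribute name -> words seen in its value field.
SPECS = {
    "weight": ["kg", "g", "approx"],
    "mass": ["kg", "g"],
    "body weight": ["kg", "approx"],
    "display": ["inch", "lcd", "tft"],
    "screen": ["inch", "lcd"],
    "monitor": ["inch", "tft", "lcd"],
}


def build_matrix(specs):
    """Return (attribute names, attribute-by-value-word count matrix)."""
    attrs = sorted(specs)
    vocab = sorted({w for words in specs.values() for w in words})
    col = {w: j for j, w in enumerate(vocab)}
    X = np.zeros((len(attrs), len(vocab)))
    for i, attr in enumerate(attrs):
        for w in specs[attr]:
            X[i, col[w]] += 1.0
    return attrs, X


def lsi(X, k):
    """Project rows of X onto the top-k singular directions (truncated SVD)."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]


def normalize_rows(Z):
    """Scale each row to unit length; all-zero rows are left unchanged."""
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    return Z / np.where(norms == 0, 1.0, norms)


def kmeans_seeded(Z, seed_rows, iters=20):
    """Plain k-means whose initial centers are the given rows of Z."""
    centers = Z[seed_rows].copy()
    labels = np.zeros(len(Z), dtype=int)
    for _ in range(iters):
        # Squared Euclidean distance from every point to every center.
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for c in range(len(centers)):
            members = Z[labels == c]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[c] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def cluster_attributes(specs, seeds, k):
    """Group attribute names; cluster i starts from the attribute seeds[i]."""
    attrs, X = build_matrix(specs)
    Z = normalize_rows(lsi(X, k))
    labels = kmeans_seeded(Z, [attrs.index(s) for s in seeds])
    groups = [[] for _ in seeds]
    for attr, label in zip(attrs, labels):
        groups[label].append(attr)
    return groups


# Seeds are hand-picked here as a stand-in for domain knowledge.
groups = cluster_attributes(SPECS, seeds=["weight", "display"], k=2)
for group in groups:
    print(group)
```

In this toy data the two attribute families share no value words, so the two retained LSI dimensions separate them cleanly; real specification tables would be noisier and would likely need term weighting and a more careful choice of the number of retained dimensions.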