Abstract

In this paper, we investigate the difference between metadata generated by users and metadata generated by authors. We collect Delicious tags and HTML keyword META tags associated with the same set of web pages on topics related to the Semantic Web, forming two datasets (the Delicious dataset and the HTML dataset). We compare the two datasets in terms of micro and macro vocabulary overlap and in terms of the classification of web pages. The results show that (1) the two datasets overlap; (2) non-overlapping tags in the Delicious dataset reveal systematic deficiencies of social tagging systems, while non-overlapping tags in the HTML dataset expose organization-oriented content; and (3) the Delicious dataset tends to cluster web pages according to their popularity and subject area, whereas the HTML dataset clusters them according to website or author.
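As an illustration of the kind of vocabulary comparison described above, the following minimal sketch computes overlap between user tags and author keywords. It rests on assumptions that the abstract does not state: "micro" overlap is taken to mean per-page overlap and "macro" overlap to mean overlap of the two whole-dataset vocabularies, both measured with the Jaccard coefficient. The function names, the lowercase normalization, and the sample data are hypothetical and are not the paper's actual measures or data.

```python
def normalize(tags):
    """Lowercase and strip tags so that 'RDF' and 'rdf ' compare equal (assumed normalization)."""
    return {t.strip().lower() for t in tags if t.strip()}


def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def micro_overlap(user_tags, author_tags):
    """Per-page overlap (assumed meaning of 'micro') for pages present in both datasets."""
    pages = user_tags.keys() & author_tags.keys()
    return {
        page: jaccard(normalize(user_tags[page]), normalize(author_tags[page]))
        for page in pages
    }


def macro_overlap(user_tags, author_tags):
    """Overlap of the two whole-dataset vocabularies (assumed meaning of 'macro')."""
    user_vocab = set().union(*(normalize(t) for t in user_tags.values()))
    author_vocab = set().union(*(normalize(t) for t in author_tags.values()))
    return jaccard(user_vocab, author_vocab)


# Hypothetical sample data: page identifier -> list of tags or keywords.
delicious = {
    "page1": ["SemanticWeb", "rdf", "tutorial"],
    "page2": ["ontology", "owl"],
}
html = {
    "page1": ["rdf", "semanticweb", "w3c"],
    "page2": ["company", "products"],
}

micro = micro_overlap(delicious, html)
macro = macro_overlap(delicious, html)
print(micro["page1"], micro["page2"], macro)
```

In this toy data, the second page shares no terms across the two sources: the user tags describe the subject while the author keywords describe the organization, which is the sort of non-overlap the results refer to.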