Keywords:

  • natural language watermarking;
  • agglutinative language;
  • morphological division;
  • syntactic dependency tree

ABSTRACT

We present a robust, adaptive-capacity watermarking algorithm for agglutinative languages. All processes, including the selection of sentences to be watermarked, watermark embedding, and watermark extraction, are based on syntactic dependency trees. We show that text watermarking based on syntactic dependency trees is more robust than watermarking based on the surface forms of sentences. We embed the watermark using two main characteristics of agglutinative languages. First, because a word consists of several morphemes, we can watermark sentences using morphological division/combination without deep linguistic analysis. Second, these languages permit relatively free word order, so we can move a syntactic constituent within its clause. Finally, to increase the information-hiding capacity, we adaptively compute the number of watermark bits to be embedded in each sentence.
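As a rough illustration of how free word order can carry bits and how capacity can vary per sentence, the following minimal sketch encodes an integer as the ordering of a clause's movable constituents. It is not the algorithm of the paper: the function names are hypothetical, the constituents are assumed to be distinct and already identified by a parser, and alphabetical sorting stands in for the canonical order that the paper derives from a syntactic dependency tree.

```python
import math
from itertools import islice, permutations


def capacity_bits(n_movable: int) -> int:
    """Bits carried by n freely orderable constituents: floor(log2(n!))."""
    # bit_length() - 1 is an exact integer floor(log2(x)) for x >= 1.
    return math.factorial(n_movable).bit_length() - 1


def embed(constituents: list[str], payload: int) -> list[str]:
    """Reorder the constituents so that their order encodes `payload`."""
    # Assumption: alphabetical order replaces a tree-based canonical order.
    canonical = sorted(constituents)
    bits = capacity_bits(len(canonical))
    if not 0 <= payload < 2 ** bits:
        raise ValueError(f"payload does not fit in {bits} bits")
    # Pick the payload-th permutation in lexicographic order.
    return list(next(islice(permutations(canonical), payload, None)))


def extract(observed: list[str]) -> int:
    """Recover the payload as the index of the observed ordering."""
    canonical = sorted(observed)
    for index, candidate in enumerate(permutations(canonical)):
        if list(candidate) == observed:
            return index
    raise ValueError("observed order is not a permutation of the constituents")


if __name__ == "__main__":
    clause = ["subject", "object", "adverbial"]
    print(capacity_bits(len(clause)))
    marked = embed(clause, 3)
    print(marked, extract(marked))
```

Under these assumptions a clause with more movable constituents carries more bits, which is one simple way a per-sentence capacity could be computed. Enumerating permutations is only practical for the small constituent counts found in a single clause.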

We evaluate our method on three criteria: perceptibility, robustness, and capacity. High capacity is achieved by dynamically determining the number of watermark bits that can be embedded in each sentence. The secret rank based on a syntactic dependency tree strengthens the robustness of our method. Finally, we show that the displacement of syntactic constituents and morphological division/combination do not affect the style and naturalness of the text. Copyright © 2011 John Wiley & Sons, Ltd.