A Lifestyle-Based Approach for Delivering Personalized Advertisements in Digital Interactive Television

Authors

  • George Lekakos

    Corresponding author
    1. Senior researcher and Doctoral candidate at the ELTRUN research group (www.eltrun.gr) of the Athens University of Economics & Business (AUEB). His research is in the area of Adaptive systems, focusing on the use of machine learning techniques for the provision of personalized content in emerging interactive platforms. He has published more than 15 papers in international journals and conferences and he has been actively involved in several European IST projects since 1999 while he currently acts as an expert reviewer for the European Commission.
  • George M. Giaglis

    Corresponding author
    1. Assistant Professor of eBusiness at the Department of Management Science and Technology of the Athens University of Economics and Business, Greece. His main teaching and research interests lie in the areas of eBusiness, with an emphasis on mobile and wireless applications and services. He has published more than 50 research articles in leading journals and international conferences and is a member of the editorial board of the International Journal of Mobile Communications and the Logistics Information Management Journal. He is the permanent secretary of the International Conference on Mobile Business and permanent Track Co-Chair in the European Simulation Symposium. Since 2001, he has been the Director of the ELTRUN Wireless Research Centre (www.eltrun.gr/wrc), pursuing research in mobile and wireless business applications.

Address: ELTRUN The E-Business Center, Department of Management Science and Technology, Athens University of Economics and Business, 47A Evelpidon Street, 113 62, Athens, Greece. Tel: +30-210-8203687.

Abstract

This paper presents a lifestyle-based approach for the delivery of personalized advertisements in digital interactive television. The theoretical basis of the approach is analyzed, and two variations are discussed. The first (segmentation variation) relies on interaction-based classification of users into lifestyle segments, while the second (similarities variation) is based on the identification of similarities among users based on demographic and TV program preferences data. In both variations, the user's interest is predicted by aggregating lifestyle neighbors' preferences. Results from an empirical validation, in the form of a laboratory experiment, are also presented in order to provide further evidence on the effectiveness and usefulness of the proposed approach when compared with machine learning algorithms, such as classification and nearest neighborhood. The superiority of the proposed approach is also demonstrated against user modeling evaluation methodologies, as well as against traditional marketing targeting practices.

Introduction

The vast majority of existing research on personalization is concerned with computer-based systems of some kind or other. In this paper, we discuss the potential application of personalization principles in the context of 30-second advertisements shown to viewers in a television environment. Personalization of advertisements in Interactive TV (iTV) refers to the delivery of advertisements tailored to the individual viewer's profile on the basis of user needs and interests. Several studies (Hawkins, Best, & Coney, 1998; iMedia, 2001) have revealed that fewer than 20% of viewers are happy with broadcast advertisements. Indeed, the majority of viewers find them annoying and intrusive to their primary objective, which is to be entertained or informed through watching TV programs. Personalizing advertisements, i.e. providing viewers with messages that they are most likely to be interested in, offers marketers the opportunity to increase the accuracy of their targeting, while at the same time providing viewers with messages that increase their satisfaction in terms of interest in the advertised product, thus increasing the message's communication effect.

The work reported so far in personalization over iTV platforms mainly concerns personalized recommendation of TV programs (e.g. Ardissono, Portis, Torasso, Bellifemine, Chiarotto, & Difino, 2001; Das & Horst, 1998; Gutta, Kuparati, Lee, Martino, Schaffer, & Zimmerman, 2000; Smyth & Cotter, 2000), personalized news (Maybury, 2001), personalized interactive documentaries (Nardon, Pianesi, & Zancanaro, 2002), and adaptive learning over Digital TV (Masthoff & Luckin, 2002). In this paper we build upon and extend the previous results of lifestyle-based classification reported in Lekakos and Giaglis (2002). More specifically, in the next section the context and background work of our research are presented, followed by an analytical presentation of our approach. Further user modeling issues, including the data acquisition mechanism, are then presented, followed by experimental results assessing the effectiveness of the proposed approach. The paper concludes with a discussion of achievements, limitations, and further research issues.

Context and Background Work

In order to design an effective personalization approach we consider targeting methods from marketing and advertising literature, and combine them with personalization methods from the literature of adaptive systems. As will be shown, methods and techniques from both domains can act in concert to overcome limitations of each method.

Personalization can be approached from many different angles depending on the unique characteristics and attributes of the application domain considered. For example, personalization of advertisements in TV environments can be very different from personalization of content in Web pages. In the literature of adaptive systems, the goal or task for using the system constitutes the fundamental feature upon which personalization is being built (Brusilovsky, 2001). For example, in intelligent tutoring systems the user has a learning goal, while in adaptive information retrieval systems the user has search goals, typically indicated by keywords. In personalized e-commerce systems, the user's task is to search and eventually purchase products; thus the main adaptation objective involves the personalized selection and presentation of relevant links and product features, possibly re-arranging the structure of the Web page in support of user browsing activities (e.g. Ardissono & Goy, 2000). The user goal guides her interactive behavior and typically engages her in rather long interactive sessions with the system, providing the necessary data for the user-modeling component.

Conversely, in our domain there is no goal directly associated with viewing advertisements. Moreover, user interaction with advertisements must be kept short, given the short duration of a commercial, the reluctance of viewers to engage in interactive sessions in the TV environment (Lee & Lee, 1995), and the disruption that the intrusive nature of advertising (Leather, McKechnie, & Amirkhanian, 1994) causes to the viewer's primary goal, which is to be entertained or informed through the TV program (Elliott & Speck, 1998; Mandese, 1992).

Although a directly observable goal cannot be associated with watching TV advertisements, viewers might be more interested in certain advertisements that provide them with information associated with their consumption needs. Indeed, consumers are theoretically known to first identify their consumption needs and then continuously scan for information related to these needs, including advertisement-oriented information (Elliott & Speck, 1998; Punj & Staelin, 1983). The consumer behavior model (CBM) illustrated in Figure 1 (Hawkins, Best, & Coney, 1998) indicates that consumer needs arise as a result of numerous internal (e.g. perception and learning) and external factors (e.g. culture and social status). Such hard-to-measure factors are suitably aggregated into the consumer's self-concept and lifestyle.

Figure 1.

The Consumer Behavior Model (Hawkins et al., 1998). External and internal factors contribute to the formulation of self-concept and lifestyle, which affects the consumer decision process. During this process, experiences and acquisitions update the original external and internal influences.

Self-concept refers to the way individuals think and feel about themselves, as well as how they would like to think and feel about themselves. Their actual and desired lifestyles are the way they translate their self-concepts into daily behaviors, including purchases (which is the ultimate objective of advertising). Thus, the model has been widely used to drive the development of lifestyle-based segmentation methods, which constitute one of the most popular advertising targeting methods today. One of the most widely used lifestyle segmentation methods is based on the VALS (Values and Lifestyles) concepts and methodology (http://www.sric-bi.com/VALS/), which is “the most popular lifestyle and psychographic research” (Hawkins, Best, & Coney, 1998, p. 438) and divides the whole population into eight clusters of consumers. However, these methods are static: they suffer from time-invariance and should be updated with the most current consumer-oriented data in order to remain effective (Kobsa, Koenemann, & Pohl, 2001).

The technological advances in iTV (storage and processing capabilities of set-top boxes, modem or cable return channel) enable the collection of such viewer data and the formulation of dynamic user models (behavioral modeling). In recent years, recommender systems have emerged as a sub-class of adaptive systems, in order to make personalized recommendations for information, products, or services (Sarwar, Karypis, Konstan, & Riedl, 2001). Recommender systems methods and techniques have been characterized as “the lightweight model for heavyweight applications” (Konstan, 2001, p. 314), since even one-click interactions can be utilized in the user model and processed in order to produce the recommendations. This makes recommender systems appropriate for our domain, since they allow for preserving the requirement for short interactive sessions.

Recommender systems rely on data such as explicit or implicit expression of users' interest in observed items. These expressions are measured on binary nominal (e.g. interesting/not-interesting) or numerical ordinal rating scales (e.g. 1 to 5) (Resnick, Iacovou, Suchak, Bergstrom, & Riedl, 1994). Two major approaches are then utilized for the prediction task: collaborative filtering and content-based filtering. Collaborative filtering is employed by many successful commercial recommender systems such as Amazon.com, CDNow.com and MovieFinder.com (Shafer, Konstan, & Riedl, 2001). It is based on the assumption that users who present some form of similarity in their previous preferences tend to share common behavior in the long-term. Content-based filtering is based on the assumption that the same user's previous history is a good predictor of his/her future behavior.

Content-based filtering requires textual description of items in order to achieve a machine-parsable form and complex computations to capture attributes such as aesthetics and the overall taste of a user for a given item (e.g. using utility functions). Such knowledge-intensive approaches are inappropriate in the low-interactivity domain of 30-second advertisements. Although the user's previous interaction history can be utilized to partially infer his or her future preferences, predictions based on content-based filtering restrict the spectrum of recommendations to items that are similar to the ones that the user has previously evaluated. This limitation renders the application of this technique not suitable, from a marketing perspective, for TV advertisements. Thus, in this paper we focus on collaborative filtering and we claim that, coupled with lifestyle-based methods, collaborative filtering can produce an efficient personalization strategy for TV advertisements.

Delivering Personalized TV Advertisements

The approach presented in this paper aims to combine established advertisement targeting methods, such as market segmentation based on lifestyles, with personalization practices in recommender systems, such as collaborative filtering. We claim that the combination of both methods can improve overall performance by overcoming the limitations that each method has when used in isolation. Lifestyle characteristics help in the identification of users' high-level consumption patterns, but they cannot capture the dynamic changes in user behavior through time. Conversely, interest-based user models take into account updated and more item-specific user interests; however they can work only in the presence of sufficient amounts of data. Indeed, a significant drawback of collaborative filtering is known as the sparsity problem (Billsus & Pazzani, 1998; Breese, Heckerman, & Kadie, 1998; Sarwar, Karypis, Konstan, & Riedl, 2001) that occurs in User × Item tables, since typically users evaluate only a small portion of the available items, thus leading to unreliable and misleading similarities among users. On the contrary, lifestyle segmentation and classification of users is capable of identifying reliable similarities in patterns of behavior regardless of the amount of data available.
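To make the sparsity problem concrete, the following minimal sketch measures how much of a User × Item table is filled and how many items two users have both evaluated. The data, the dictionary layout, and the helper names `density` and `co_rated` are assumptions made purely for illustration; they are not part of the system described in this paper.

```python
from typing import Dict

# Hypothetical layout: user -> {item: rating}; unrated items are simply absent.
Ratings = Dict[str, Dict[str, int]]


def density(ratings: Ratings, n_items: int) -> float:
    """Fraction of the User x Item table that holds an evaluation."""
    filled = sum(len(items) for items in ratings.values())
    return filled / (len(ratings) * n_items)


def co_rated(ratings: Ratings, u: str, v: str) -> int:
    """Number of items evaluated by both users (the basis of any similarity)."""
    return len(ratings[u].keys() & ratings[v].keys())


# Invented toy data: three users, ten available items.
ratings = {
    "u1": {"ad1": 1, "ad7": 0},
    "u2": {"ad2": 1, "ad7": 1},
    "u3": {"ad3": 0},
}
print(density(ratings, n_items=10))   # 5 of the 30 cells are filled
print(co_rated(ratings, "u1", "u2"))  # any similarity would rest on this overlap
```

With so few co-rated items, a similarity computed between two users rests on very little evidence, which is the weakness that lifestyle data are meant to offset.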

The proposed approach has two phases: (a) to identify the lifestyle neighbors of the target user and (b) to produce a prediction of the user's interest in the target item, derived from neighbors' evaluations. These phases are described in detail in the following sub-sections, while two variations are proposed for the first phase.

Phase A: Identifying the Lifestyle Neighbors (Segmentation-based Variation)

In order to exploit existing marketing methods and available data we have utilized a version of the well-known VALS lifestyle segmentation adjusted for the local population by AGB. According to this segmentation, which has been achieved by means of a psychographic questionnaire on a sample of 11,000 consumers, the local population can be divided into 9 segments: Domestic, Withdrawn, Comme Il Faut, Unsatisfied, Conventional, Socially Aware, Carefree, Upcoming, and Critical. The segments represent for each user her ‘lifestyle neighborhood’ which can be utilized to infer predictions concerning her future preferences. In order to apply this prediction scheme the users should be classified into one of those segments exploiting suitable user data. In traditional lifestyle segmentation methods, the classification task can be performed upon the data collected by means of the psychographic questionnaire. However, the need of using questionnaires presents a significant obstacle: although it can initially assess the consumer's need and desire for products, the evaluation of the accuracy of this assessment and its consequent adjustments cannot be performed unless the user is engaged in (difficult to perform in practice and annoying for the users) re-completions of the questionnaire at regular time intervals (Balabanovic & Shoham, 1997; Kobsa, Koenemann, & Pohl, 2001). Instead, to identify lifestyle neighbors, we propose a segmentation-based approach where only a portion of the population provides the lifestyle data through questionnaires which are utilized to infer the cluster membership. 
Having acquired the ground truth data concerning membership of the sample users, the idea is to exploit their behavioral data, namely their interactions with the advertisements, and some basic demographic data (which are collected easily and are used as lifestyle segment descriptors) to produce classification rules that are able to classify the rest of the users without requiring the completion of the lifestyle questionnaire. As more and more data are collected from the sample, along with adjustment of the training set size, rules are being refined up to the point where the noise is minimized. The benefit of this approach is two-fold: the disengagement of the process from the need for completing time-consuming questionnaires, and the continuous evaluation and update of the segmentation process using dynamic behavioral data.

Phase A: Identifying the Lifestyle Neighbors (Similarities-based Variation)

Despite the benefits of the approach described above, there are also important limitations:

  • a. The sparsity problem affects the performance of the classification process both when developing the classification rules and when applying the rules on sparse user data. This problem also applies to first-time users (known as the cold start problem).
  • b. VALS instruments are of a proprietary nature, making reliability and validity difficult to assess (Gunter & Furnham, 1992; Mowen & Minor, 1998). This can seriously limit the generalizability of the approach by anchoring the whole process in proprietary data of doubtful quality (Beatty, Homer, & Kahle, 1998).

Thus, we have also developed a variation of our approach towards a segment-independent way of identifying neighbors directly on the individual level. Such classification should be performed on the basis of some type of suitable user data. Evidence from literature suggests that user demographics can play this role. Indeed, demographics have been successfully used to classify users in Lifestyle Finder (Krulwich, 1997) and in SeAN (Ardissono, Console, & Torre, 2001), a system for news recommendations over the Web.

However, demographic data are usually too generic to achieve accurate classification results if used in isolation. For example, in SeAN, Ardissono et al. (2001) had to combine demographics with user hobbies, while in cross-selling of banking services (Peltier, Schibrowsky, Schultz, & Davis, 2002) demographics are used in conjunction with customer credit data.

We contend that the role of complementary data in the iTV environment can be successfully played by TV programme preferences of the users. To validate this hypothesis, we worked on a sample of 502 users who had been classified into the nine lifestyle segments by AGB. Following the approach proposed by Hair et al. (Hair, Anderson, Tatham, & Black, 1998), members of cluster 2 - who were fewer than 20 - were removed and the sizes of the remaining clusters were adjusted to the size of the smallest of them (24 members), thus leaving us with a final sample size of 192 respondents. A Chi-square test demonstrated the significance of dependence concerning the membership in one of the clusters and the combination of demographics and TV program preferences data (Chi-square observed value: 520.630; p < 0.0001; α = 5%; and Chi-square likelihood ratio: 552.243; p < 0.0001; α = 5%). The results allow us to reconsider our approach and formulate a user model consisting of two parts: a lifestyle part including demographic and TV program preferences data, and a behavioral part consisting of user interactions with advertisements. The first part is employed for the identification of similarities among users, and the second part is then used for the prediction on the target advertisements.
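As an illustration of the kind of test reported above, the sketch below computes the Pearson Chi-square statistic and its degrees of freedom for a contingency table of cluster membership against a profile category. The counts are invented and the function name `chi_square` is hypothetical. The sketch does not reproduce the statistical package or the data used in the study, and it assumes every expected count is non-zero.

```python
from typing import List, Tuple


def chi_square(table: List[List[int]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    Assumes every row and column total is positive, so no expected count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Invented counts: rows are two lifestyle clusters of 24 members each,
# columns are two categories of a combined demographic/TV-preference profile.
toy = [
    [18, 6],
    [5, 19],
]
stat, dof = chi_square(toy)
print(round(stat, 3), dof)
```

The statistic would then be compared with the critical value of the Chi-square distribution for the given degrees of freedom at the chosen significance level.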

Phase B: Predicting the user's interest

Both variations discussed above essentially result in the identification of lifestyle neighbors for the target user. In the segmentation-based variation each user is classified into an ‘existing’ lifestyle segment; thus prediction can be pre-assigned to the segment by a human expert (Figure 2). In this stereotypical (in user modeling terms) prediction technique, the target user is assigned the human-generated prediction that corresponds to his/her segment. This approach, despite its obvious restrictions in terms of producing really personalized recommendations, can prove beneficial in the case that a new item is introduced into the system for which no prior evaluation exists (in this case collaborative filtering methods typically fail to produce a prediction). However, in order to exploit the collaborative filtering approach, the neighbors' preferences for the target item should be taken into account. Thus, the prediction strategy is adjusted by assigning to the target item the value (interesting/not-interesting) of the most frequently observed value in the segment.
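The segment-level prediction strategy described above can be sketched as a simple majority vote. This is a minimal illustration under stated assumptions: the function name `segment_prediction`, the fallback to an expert-assigned value when no evaluations exist, and the tie-breaking rule (ties resolve to 'not interesting') are choices made for the example and are not specified in the text.

```python
from collections import Counter
from typing import Dict, Optional


def segment_prediction(
    segment_values: Dict[str, int],
    target_user: str,
    expert_default: Optional[int] = None,
) -> Optional[int]:
    """Predict 1 (interesting) or 0 (not interesting) for one advertisement.

    segment_values maps each member of the target user's lifestyle segment
    to that member's observed 0/1 value for the advertisement.
    """
    votes = [v for user, v in segment_values.items() if user != target_user]
    if not votes:
        # New item with no prior evaluation: fall back to the stereotype
        # prediction pre-assigned to the segment by a human expert.
        return expert_default
    counts = Counter(votes)
    # Assumption: a tie is resolved as 'not interesting'.
    return 1 if counts[1] > counts[0] else 0


# Invented example: three neighbors, two of whom interacted with the ad.
print(segment_prediction({"a": 1, "b": 1, "c": 0, "t": 0}, target_user="t"))
```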

Figure 2.

Prediction approaches applied in a neighborhood of lifestyle neighbors

The approach described above can also be implemented in the similarities-based variation. However, the direct computation of similarities among users allows us to follow a k-Nearest Neighbor (k-NN) approach and weight each neighbor's contribution according to his/her ‘lifestyle’ distance from the target user. Exploiting further the ability of our approach to accommodate various collaborative filtering techniques, the popular top-n prediction technique (Herlocker, Konstan, Borchers, & Riedl, 1999) is another alternative to be tested. In this technique the n closest neighbors of the target user are selected to contribute to the prediction.
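A distance-weighted top-n prediction of this kind might look as follows. The distance metric (share of mismatching categorical attributes), the weighting scheme (one minus the distance), the 0.5 decision threshold, and all names are assumptions introduced for this sketch; the text does not fix any of them.

```python
from typing import Dict, Sequence


def lifestyle_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Share of categorical lifestyle attributes on which two users differ
    (an assumed metric over demographics and TV program preferences)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def top_n_prediction(
    target: str,
    profiles: Dict[str, Sequence[str]],
    ad_values: Dict[str, int],
    n: int = 3,
) -> int:
    """Weighted vote of the n closest lifestyle neighbors on one advertisement."""
    neighbors = sorted(
        (u for u in ad_values if u != target),
        key=lambda u: lifestyle_distance(profiles[target], profiles[u]),
    )[:n]
    # Closer neighbors receive larger weights (assumed scheme: 1 - distance).
    weights = {
        u: 1.0 - lifestyle_distance(profiles[target], profiles[u]) for u in neighbors
    }
    total = sum(weights.values())
    if total == 0:
        return 0  # no usable neighbor: default to 'not interesting'
    score = sum(weights[u] * ad_values[u] for u in neighbors) / total
    return 1 if score > 0.5 else 0


# Invented profiles: (age group, education, two preferred program genres).
profiles = {
    "t": ("18-24", "graduate", "sports", "news"),
    "a": ("18-24", "graduate", "sports", "films"),
    "b": ("18-24", "graduate", "sports", "news"),
    "c": ("45-54", "elementary", "series", "films"),
    "d": ("25-34", "graduate", "sports", "news"),
}
ad_values = {"a": 1, "b": 1, "c": 0, "d": 0}  # observed 0/1 interest in one ad
print(top_n_prediction("t", profiles, ad_values, n=3))
```

Setting n to the size of the whole neighborhood turns the same function into a plain weighted k-NN vote over all available users.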

User Model: Data and Collection Mechanism

In the approaches described above the categories of data utilized to formulate the user model include demographic data, TV program preferences, and user-advertisement interactions. The first category represents observable characteristics of the user, while the second and third ones refer to behavioral elements. We now turn the discussion to the exact type of data needed to predict the user's interest in a given advertisement.

In order to decide on the type of data that we will employ to monitor the viewer's interest, we must take into account that the interface should be minimal in the iTV environment, as revealed by a number of usability tests (Lee & Lee, 1995; Lekakos, Chorianopoulos, & Spinellis, 2001). This affects the interactive overlay of the advertisements, i.e. the interactive buttons or icons appearing over the advertisement video. The design method chosen within the context of our research stems from the results of a user requirements survey (iMedia, 2001), the basic purpose of which was to identify the maturity of viewers and their intentions and desires towards the reception of personalized advertisements in a set-top-box-enabled TV environment. The survey was conducted on a sample of 476 randomly chosen respondents, aged 15 to 55, using personal interviews at home. The survey took place in the two largest cities of Greece, namely Athens (356 respondents) and Thessaloniki (120 respondents).

Amongst the questions of the survey (the detailed questionnaire and results can be found at iMedia), consumers were asked to indicate what type of information they would like to get as a result of their interaction with the advertisements and their most preferred modes of interaction with TV advertisements in order to get this information. Interestingly, respondents favored the ability to request personal contact with the advertiser or supplier of the product (65% of the respondents). Fifty-one percent selected the option to browse the advertisement for more information. With regard to the preferred modes of interaction, 43% of the respondents opted for the ability to ‘bookmark’ the advertisement and review it at their convenience. Users do not wish to leave off watching the TV program (by engaging in long interactive sessions) but rather prefer to be able to review an interactive advertisement at their convenience, while advertisers wish to nullify the possibility of viewers missing advertisements within an advertisement break as a result of interacting with a previous commercial.

As a result, we have implemented two types of interactive buttons on top of each advertisement:

  • a. A ‘Bookmark’ button, which adds the title of the selected advertisement to the user's personal favorites list. Upon the selection of the bookmarked advertisement from the list, a full interactive version is then displayed.
  • b. A ‘Contact me’ button by which the viewer requests to be contacted by the product/service vendor (for example for insurance or banking services).

Both interactions are interpreted as indicators of interest (coded as ‘1’), while no interaction is interpreted as absence of interest (coded as ‘0’). More specifically, the user model update has three phases:

  • a. Initiation phase: all advertisements are initially assigned a zero value since no interaction has occurred at this phase.
  • b. Updating ‘no interest’ to ‘interest’ (0 -> 1): Updates from ‘0’ to ‘1’ occur when a user interacts with an advertisement. Note that interaction may occur either at the initial views of the advertisement (for example, when the consumer's need pre-dates the initial viewing of the advertisement) or after a number of views (for example, when the need arises as a result of the advertising effect).
  • c. Updating ‘interest’ to ‘no interest’ (1 -> 0): Replacement of ‘1’ by ‘0’ is applied after a certain time period, since interest will eventually fade away. More specifically, when a user bookmarks an ad the system downloads and stores the corresponding full interactive advertisement in the ‘favorites’ folder for a certain period, allowing the user to review the ad at his/her convenience, possibly more than once. After that period - which is extended at each retrieval of the ad - the advertisement expires and the user model is updated (1 -> 0). Such an update may occur earlier than the expiration date if the user deletes the specific advertisement from the ‘favorites’ list (for example in order to shorten the list). When an ad has expired, and prior to its removal from the ‘favorites’ list, the system prompts the user to accept or reject the deletion. This interaction directly indicates whether the user is still interested in the specific advertisement, and the expiration date is extended accordingly.
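The three update phases can be summarized as a small state model. The sketch below is illustrative only: the class and method names, the use of integer day counters, and the retention period of 14 days are assumptions (the text says only "a certain period"), and applying the same expiry logic to both ‘Bookmark’ and ‘Contact me’ interactions is a simplification made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict

RETENTION_DAYS = 14  # hypothetical value; the period is not specified in the text


@dataclass
class UserModel:
    interest: Dict[str, int] = field(default_factory=dict)  # ad id -> 0/1
    expires: Dict[str, int] = field(default_factory=dict)   # ad id -> expiry day

    def register(self, ad: str) -> None:
        """Initiation phase: every advertisement starts at 0."""
        self.interest.setdefault(ad, 0)

    def interact(self, ad: str, today: int) -> None:
        """'Bookmark' or 'Contact me': 0 -> 1, and the retention period starts."""
        self.interest[ad] = 1
        self.expires[ad] = today + RETENTION_DAYS

    def retrieve(self, ad: str, today: int) -> None:
        """Reviewing a bookmarked ad extends its retention period."""
        if self.interest.get(ad) == 1:
            self.expires[ad] = today + RETENTION_DAYS

    def remove(self, ad: str) -> None:
        """The user deletes the ad from the favorites list: 1 -> 0."""
        self.interest[ad] = 0
        self.expires.pop(ad, None)

    def on_expiry(self, ad: str, today: int, keep: bool) -> None:
        """At expiry the user is prompted: keeping extends the period,
        accepting the deletion updates the model (1 -> 0)."""
        if self.interest.get(ad) != 1 or today < self.expires[ad]:
            return  # nothing to do: not bookmarked, or not yet expired
        if keep:
            self.expires[ad] = today + RETENTION_DAYS
        else:
            self.remove(ad)


model = UserModel()
model.register("ad12")
model.interact("ad12", today=0)
model.retrieve("ad12", today=10)               # expiry moves to day 24
model.on_expiry("ad12", today=24, keep=False)  # user accepts the deletion
print(model.interest["ad12"])
```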

Empirical Validation

The objectives of our empirical research were manifold:

  • a. To compare the performance of the similarities-based variation against its segmentation-based counterpart.
  • b. To test the efficiency of both approaches against the base case of randomly selected advertisements. This test represents a typical form of user modelling test (Chin, 2001) that is required in the evaluation of predictive models (Zukerman & Albrecht, 2001) in order to establish a certain level of usefulness of the proposed approach.
  • c. To test the performance of both variations against collaborative filtering techniques, such as classification and nearest neighborhood.
  • d. To compare the performance of both variations with predictions pre-assigned to each lifestyle segment by a human expert. In the marketing literature, lifestyle and psychographic segmentation is considered as superior to traditional demographic segmentation typically based on age and sex. Thus, through this test, we can compare our approach against a well-performing popular segmentation method. It must be noted that, in this test, results of the expert-driven approach are optimized, since predictions are made on the basis of knowing the lifestyle segment that the user belongs to (which is hardly known in traditional targeting).

Research Design

For the purposes of this experiment, which was conducted in laboratory conditions, a sample of 81 individuals was used, equally drawn from university staff (academic and non-academic) and staff from the largest Greek IT company, mainly in the age groups of 18-24 (43%) and 25-34 (30%), characterized by their familiarity with the technology. The main target group of our research is DTV interactive services adopters who are mainly young viewers (Bjoerner, 2003), familiar with technology (Freeman & Lessiter, 2003). The sampling frame incorporates individuals who meet the profile of the expected adopters of interactive services. However, in future work we plan to extend the sample to incorporate more target groups (for example persons younger than 18 years old). Within this sampling frame, the basic demographic characteristics of the sample are depicted in Table 1.

Table 1.  Demographic characteristics of the sample
Age             Monthly Income (Euros)    Education
15–17    5%     0–750        12%          Graduate       65%
18–24    43%    750–1500     17%          High School    21%
25–34    30%    1500–2200    19%          Elementary     14%
35–44    15%    2200–3000    6%
45–54    7%     3000+        16%
                DA           30%

The environment set-up consisted of a number of televisions networked locally with content players where both advertisements and TV programs were stored. Remote controls were given to viewers to facilitate the interaction with the system. Each user was given a brief introduction to the system functionality and was initially allowed to use it until he/she was satisfied with his/her ability to utilize the functionality without problems (especially to interact with advertisements of interest). Then, the user was engaged in repeated monitored sessions of about 30 minutes where TV content and advertisement breaks were shown.

Acquisition of Classification Rules

In order to validate the segmentation-based approach we needed to establish the performance of classification rules. For this purpose, the user's lifestyle data were pre-collected by means of a standard psychographics questionnaire used by AGB, and used to classify the users into one of the lifestyle segments. User interaction data was uploaded to the Oracle Data Mining engine (ODM) and a pre-test was performed to select the best approach for the classification task, among: ‘Tree’ (Classification and regression trees), ‘Net’ (Neural Networks), and ‘Match’ (Memory-based reasoning). The results favor the ‘Tree’ model (predicted accuracy 80.77%, 86.15% and 98.46% respectively), therefore the algorithm selected for the task was Bayesian Network supporting decision trees (called Adaptive Bayesian Network in Oracle terminology).

The results identify classification rules that can classify users into the predefined clusters according to their interactions with the advertisements and their demographic profile. The rules produced take the form of a model tree (Figure 3) which, when applied to the user data, can dynamically classify users into clusters.

Figure 3.

An example of a decision-tree representing the probabilities (conditioned on Ad 12 and Ad 3) derived from the respective Bayesian network (indicating only the most probable clusters)

In order to assess the validity of the results (classification rules) two criteria were applied:

  • Confidence (C) of the rule, which is defined as the conditional probability of the rule's result (B) given the rule's input (A), that is: C(A => B) = P(B | A).
  • Performance (P), which is a 2-dimensional matrix analyzing the error rates (E) for every combination of actual and predicted value for the whole model. The error rate (E) refers to the observed differences between the predicted value of the rule and the actual target value.
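Both criteria can be computed directly from labelled records, as in the following sketch. The record layout, the example rule (conditioned on Ad 12 and Ad 3, echoing Figure 3), the invented data, and the function names are assumptions for illustration; the study itself obtained these figures from the data mining engine.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Record = Dict[str, int]  # hypothetical layout: ad id -> 0/1 interaction


def confidence(
    records: List[Tuple[Record, str]],
    antecedent: Callable[[Record], bool],
    cluster: str,
) -> float:
    """C(A => B) = P(B | A): share of records matching A that belong to cluster B."""
    matching = [label for rec, label in records if antecedent(rec)]
    if not matching:
        return 0.0
    return matching.count(cluster) / len(matching)


def performance_matrix(
    actual: List[str], predicted: List[str]
) -> Dict[Tuple[str, str], int]:
    """Counts for every (actual, predicted) pair; off-diagonal cells are errors."""
    return dict(Counter(zip(actual, predicted)))


def error_rate(matrix: Dict[Tuple[str, str], int]) -> float:
    """Share of cases in which the predicted cluster differs from the actual one."""
    total = sum(matrix.values())
    errors = sum(count for (a, p), count in matrix.items() if a != p)
    return errors / total


# Invented records: (interactions, actual cluster).
records = [
    ({"ad12": 1, "ad3": 1}, "Critical"),
    ({"ad12": 1, "ad3": 1}, "Critical"),
    ({"ad12": 1, "ad3": 1}, "Socially Aware"),
    ({"ad12": 0, "ad3": 1}, "Socially Aware"),
]


def rule(rec: Record) -> bool:
    """Hypothetical rule input A: the user interacted with both Ad 12 and Ad 3."""
    return rec["ad12"] == 1 and rec["ad3"] == 1


print(confidence(records, rule, "Critical"))

matrix = performance_matrix(
    actual=["Critical", "Critical", "Socially Aware"],
    predicted=["Critical", "Socially Aware", "Socially Aware"],
)
print(error_rate(matrix))
```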

The validity of the rules according to the above criteria was evaluated for two different distributions of the original data between the training set (used for model learning), the test set (used for model refinement), and the evaluation set (used for model scoring), simulating the acquisition of more interactions over time. As far as the first criterion is concerned, the models produced by both distributions perform well, since the share of rules scoring above the 0.95 confidence level is at least 88% in both cases (Table 2).

Table 2.  Classification rules confidence level
Distribution    Confidence Level
                =1      >0.95    <0.95
60%-20%-20%     73%     90%      10%
80%-10%-10%     88%     88%      12%

Based on this analysis, the 80%-10%-10% distribution was favorably evaluated since it includes ten sub-trees with error rates below the threshold. Furthermore, as the training set increases from 60% to 80%, that is, as more data are used for the learning phase of the model, the increased accuracy of the model is also reflected in the percentage of users correctly classified to the cluster they actually belong to (over 83.30% for the available clusters).

Personalization Effect Measurement

We have employed Precision (P) and Recall (R) as the most appropriate measures of the personalization effect. Both measures originate from information retrieval systems (Herlocker, Konstan, Borchers & Riedl, 1999; Mathe & Chen, 1996) and are also employed in the performance evaluation of user modeling systems (Chin, 2001) in which users evaluate the personalization effect on a binary scale (interested/not interested). Precision is defined as the ratio of successful selections to the number of selections. Precision represents the probability that a selected document is relevant. Recall is defined as the ratio of successful selections to the number of relevant items. Recall represents the probability that a relevant document will be selected. For comparative reasons we use the F-measure (Lewis & Gale, 1994) assigning equal weights to both Precision and Recall:

$$F = \frac{2PR}{P + R}$$
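With equal weights, the F-measure reduces to the harmonic mean of Precision and Recall, 2PR / (P + R). The sketch below computes all three measures for one user from a set of selected advertisements and a set of advertisements the user actually found interesting. The advertisement identifiers are hypothetical.

```python
def precision_recall_f(selected, relevant):
    """Return (precision, recall, F) for one set of selections.

    precision = successful selections / all selections
    recall    = successful selections / all relevant items
    F         = 2PR / (P + R), i.e. equal weights on precision and recall
    """
    selected, relevant = set(selected), set(relevant)
    hits = len(selected & relevant)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Hypothetical example: four ads selected for the user, three ads actually of interest
p, r, f = precision_recall_f(
    selected={"ad1", "ad2", "ad3", "ad4"},
    relevant={"ad2", "ad3", "ad5"},
)
```

Here two of the four selections are successful, so precision is 0.5, recall is 2/3, and the F-measure is 4/7 (about 0.571).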

To measure the personalization effect, we used a sample of 37 individuals who were shown 65 advertisements. Users were asked to interact with the advertisements if they were interested in the advertised product. Note that it may be the case that users ‘bookmark’ an advertisement not only because they are interested in the advertised product but also because they just like the ad's creative or aesthetic appearance. However, likability is a piece of information exploitable in our approach since it is mostly associated with how meaningful and relevant a commercial seems to the consumer (Biel & Bridgwater, 1990).

In the data pre-processing phase, users were pre-classified according to their answers on the psychographic questionnaire and the results show that the sample consisted mainly of members of the ‘Socially Aware’ (11 individuals) and the ‘Criticals’ (19 members) clusters. Thus, we focused our experiment on those 30 users to have a sufficient basis for valid measurements. Furthermore, we removed advertisements with fewer than five positive evaluations (i.e. those that were evaluated as ‘interesting’ by fewer than five users). This process has produced a final set of 34 advertisements. We recursively considered all users as target users and produced predictions for every item in the resulting database, following the ‘Leave-one-out’ cross-validation technique (Hair, Anderson, Tatham, & Black, 1998).
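The pre-processing and evaluation loop described above can be sketched as follows. This is one plausible reading of the procedure, not the authors' implementation: the data layout (a dictionary of binary ratings per user), the function names, and in particular the majority-vote predictor are assumptions used only to make the example runnable.

```python
def filter_items(ratings, min_positive=5):
    """Keep only items evaluated as 'interesting' (1) by at least min_positive users."""
    items = next(iter(ratings.values())).keys()
    return [i for i in items if sum(u[i] for u in ratings.values()) >= min_positive]

def leave_one_out(ratings, items, predict):
    """Treat every user in turn as the target and predict each item from the other users only."""
    predictions = {}
    for target in ratings:
        others = {u: r for u, r in ratings.items() if u != target}
        predictions[target] = {i: predict(others, i) for i in items}
    return predictions

def majority_vote(others, item):
    """Placeholder predictor (an assumption): 1 if more than half of the other users liked the item."""
    votes = [r[item] for r in others.values()]
    return 1 if 2 * sum(votes) > len(votes) else 0

# Hypothetical binary evaluations (1 = interested, 0 = not interested)
ratings = {
    "u1": {"adA": 1, "adB": 0, "adC": 1},
    "u2": {"adA": 1, "adB": 0, "adC": 0},
    "u3": {"adA": 1, "adB": 1, "adC": 0},
}

# A threshold of 2 is used here only because the toy sample is tiny; the experiment used 5
items = filter_items(ratings, min_positive=2)
predictions = leave_one_out(ratings, items, majority_vote)
```

Any of the prediction techniques compared in the experiment could be passed in place of `majority_vote`, provided it takes the remaining users and an item and returns a binary prediction.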

The two variations of our approach (segmentation and similarities-based) have been combined with the three prediction techniques described earlier in this paper: weighting all neighbors' contributions equally, weighting contributions by distance from the target user (k-NN technique), and selecting the n closest neighbors (top-n technique, n=10). All variations were then compared against classification-based, non-personalized, expert-based, and nearest neighborhood approaches:

  • a. Classification-based: In the processing of binary data denoting the presence or absence of a user's interest, collaborative filtering can be seen as a classification task that induces a model for each user in order to classify unseen items into two or more classes, for example ‘like’ and ‘dislike’ (Basu, Hirsh, & Cohen, 1998; Billsus & Pazzani, 1999). In our tests, decision trees built with the CART (classification and regression tree) algorithm (Breiman, Friedman, Olshen, & Stone, 1984) were used, since we wanted to trace possible non-linear dependencies among item ratings, similarly to the technique employed by Alspector et al. (Alspector, Kolscz, & Karunanithi, 1997).
  • b. Random selection: We randomly selected 34 advertisements and measured prediction precision and recall.
  • c. Expert-based: Recommendations made by an external expert (in our case a media planning manager from a well-known multinational advertising agency) were compared to actual user preferences.
  • d. Nearest neighborhood (1-NN and k-NN): We employed the nearest neighborhood algorithm (Duda & Hart, 1973), also utilized by Syskill and Webert (Pazzani & Billsus, 1997) as a comparative method for personalized selection of interesting Web pages. The algorithm finds similarities among user features (preferences in our case) by measuring matches and assigns the class of the closest neighbor (1-NN) to the target item. We also measured the results of k-NN, assigning weights to each neighbor's evaluation according to his/her (binary) Euclidean distance from the target user (Mitchell, 1997). Results were then compared to those of the similarities-based lifestyle approach, which also involved k users contributing to the prediction for the target item according to their lifestyle ‘distance’ from the target user. For both tests, k was selected as the total size of the sample, since the weighting scheme reduces noise from irrelevant ‘neighbors’.
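The two nearest neighborhood baselines can be sketched on binary preference vectors as follows. The inverse-squared-distance weights are a common choice for distance-weighted k-NN and are assumed here, since the exact weighting function is not given in the text. The small constant `eps`, the 0.5 decision threshold, and the sample data are likewise assumptions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long binary preference vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def predict_1nn(target_profile, neighbors, item):
    """Return the evaluation of the single closest neighbor for the given item.

    neighbors is a list of (profile, ratings) pairs.
    """
    _, ratings = min(neighbors, key=lambda n: euclidean(target_profile, n[0]))
    return ratings[item]

def predict_weighted_knn(target_profile, neighbors, item, eps=1e-9):
    """Let every neighbor vote with weight 1 / d^2 (assumed weighting scheme).

    All neighbors contribute (k equals the sample size); the prediction is 1
    when the weighted share of 'like' votes exceeds 0.5. eps avoids division
    by zero for a neighbor with an identical profile.
    """
    like = total = 0.0
    for profile, ratings in neighbors:
        weight = 1.0 / (euclidean(target_profile, profile) ** 2 + eps)
        total += weight
        like += weight * ratings[item]
    return 1 if like / total > 0.5 else 0

# Hypothetical target user and three neighbors with known evaluations of "ad9"
target = [1, 0, 1]
neighbors = [
    ([1, 0, 0], {"ad9": 1}),  # squared distance 1
    ([0, 1, 1], {"ad9": 0}),  # squared distance 2
    ([0, 1, 0], {"ad9": 0}),  # squared distance 3
]

one_nn = predict_1nn(target, neighbors, "ad9")
weighted = predict_weighted_knn(target, neighbors, "ad9")
```

In this toy case an unweighted majority would predict ‘not interested’ (two votes against one), while the weighting lets the closest neighbor dominate, which illustrates how the weighting scheme reduces the influence of distant neighbors.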

Precision, Recall and F-measure for all cases are depicted in Table 3.

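Table 3.  Precision, recall and F-measure of the personalization approaches

| | Lifestyle based: Segmentation | Lifestyle based: Similarities (top-10) | Lifestyle based: Similarities (k-NN) | Classification based: CART | Classification based: 1-NN | Classification based: k-NN | Expert | Non personalized |
|---|---|---|---|---|---|---|---|---|
| Precision | 0.748 | 0.730 | 0.747 | 0.701 | 0.833 | 0.760 | 0.616 | 0.502 |
| Recall | 0.670 | 0.592 | 0.601 | 0.557 | 0.721 | 0.617 | 0.434 | 0.476 |
| F-measure | 0.701 | 0.648 | 0.659 | 0.603 | 0.764 | 0.675 | 0.505 | 0.481 |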

Discussion

Paired t-tests were performed to measure significance of differences in F-measure between algorithms. The results are illustrated in Table 4, while the main findings are discussed below.
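For reference, a paired t statistic on per-user F-measures of two algorithms can be computed as sketched below, using only the Python standard library. The scores are hypothetical and the function name is an assumption; the sketch returns the statistic and the degrees of freedom, leaving the lookup of the critical value to the reader.

```python
import math
import statistics

def paired_t(scores_a, scores_b):
    """Paired t statistic for two equally long lists of per-user scores.

    t = mean(d) / (stdev(d) / sqrt(n)), where d are the pairwise differences
    and stdev is the sample standard deviation (n - 1 denominator).
    Returns (t, degrees_of_freedom).
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return statistics.mean(diffs) / (sd / math.sqrt(n)), n - 1

# Hypothetical per-user F-measures for two algorithms evaluated on the same four users
f_algorithm_a = [0.70, 0.65, 0.80, 0.75]
f_algorithm_b = [0.60, 0.60, 0.70, 0.70]
t_value, dof = paired_t(f_algorithm_a, f_algorithm_b)
```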

Table 4.  T-tests for significant differences among algorithms (values in bold denote significant differences)

| | Segmentation | Similarities (top-10) | Similarities (k-NN) | Classification (CART) | 1-NN | k-NN | Expert | Non-personalized |
|---|---|---|---|---|---|---|---|---|
| Segmentation | | 1.535 | 1.187 | 2.813 | 2.107 | 0.707 | 6.692 | 7.684 |
| Similarities (top-10) | | | -0.92 | 1.26 | 4.748 | 2.057 | 5.590 | 4.840 |
| Similarities (k-NN) | | | | 1.593 | 3.886 | 2.335 | 6.283 | 4.709 |
| Classification (CART) | | | | | 6.439 | 1.981 | 2.789 | 3.107 |
| 1-NN | | | | | | 3.455 | 8.704 | 9.783 |
| k-NN | | | | | | | 6.723 | 5.027 |

  • Segmentation-based vs. similarities-based: No significant differences were found in the performance of the two variations. However, the similarities-based variation is much easier to implement, requires fewer interaction data, and does not depend on proprietary lifestyle databases.
  • Lifestyle vs. expert and non-personalized: The lifestyle approach clearly outperforms non-personalized recommendations, thus establishing the base case for the feasibility of the proposed approach according to the user modeling evaluation methodology. Furthermore, the assignment of a human expert prediction to the lifestyle segment, although possibly useful when a new item is introduced, produces poorer results when compared to our approach, which benefits by exploiting up-to-date user information in a collaborative fashion. Moreover, it is not surprising that NN and classification techniques perform significantly better than expert-based and non-personalized ones, indicating that the exploitation of up-to-date user preferences can significantly improve the personalization effect.
  • Lifestyle vs. Collaborative: The lifestyle (segmentation-based) approach outperforms the classification-based technique using the CART algorithm and is comparable to k-NN. However, it is worth investigating the superiority of the rather simple 1-NN against all techniques. It is not surprising that NN performs well under the presence of enough data since it is capable of identifying the ‘best’ neighbor who produces the closest match of preferences to the target user. But what would the performance of 1-NN be when fewer data are available for each user? In order to test whether 1-NN can produce accurate predictions from the beginning of the personalized service, we followed the Given-2, Given-5 and Given-10 protocol as introduced by Breese et al. (1998) for the estimation of performance of collaborative filtering algorithms when two, five, and ten evaluations on items (advertisements in our case) are available for each user. The results (illustrated in Table 5) clearly demonstrate that 1-NN is, as expected, significantly affected by the reduction of available data, while lifestyle based on similarities is only marginally affected.

Table 5.  Precision, recall and F-measure for the 1-NN and similarities-based approaches, for 2, 5 and 10 available evaluations for each user

| | Given 2: 1-NN | Given 2: Similarities (top-10) | Given 5: 1-NN | Given 5: Similarities (top-10) | Given 10: 1-NN | Given 10: Similarities (top-10) |
|---|---|---|---|---|---|---|
| Precision | 0.577 | 0.723 | 0.620 | 0.725 | 0.681 | 0.726 |
| Recall | 0.497 | 0.606 | 0.497 | 0.615 | 0.576 | 0.564 |
| F-measure | 0.534 | 0.660 | 0.552 | 0.666 | 0.624 | 0.635 |
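The Given-N protocol can be sketched as follows: for each target user, N of his or her evaluations are revealed to the algorithm and the remaining ones are withheld as prediction targets. Random selection of the revealed items, the fixed seed, and the function name are assumptions made for the example.

```python
import random

def given_n_split(user_ratings, n, seed=0):
    """Reveal n randomly chosen evaluations of a user and withhold the rest.

    Returns (given, held_out): the algorithm may only see `given` for the
    target user and is scored on its predictions for the items in `held_out`.
    """
    rng = random.Random(seed)
    items = sorted(user_ratings)
    observed = set(rng.sample(items, n))
    given = {i: user_ratings[i] for i in items if i in observed}
    held_out = {i: user_ratings[i] for i in items if i not in observed}
    return given, held_out

# Hypothetical binary evaluations of one target user
user = {"ad1": 1, "ad2": 0, "ad3": 1, "ad4": 1, "ad5": 0}
given, held_out = given_n_split(user, n=2)
```

Repeating the split with N set to 2, 5, and 10 for every target user and averaging the resulting measures yields a comparison of the kind reported in Table 5.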

Conclusions

In this paper we have discussed a lifestyle-based approach for the delivery of personalized advertisements in digital interactive television. We have argued for the theoretical potential of the approach and showed how to produce classification rules that are capable of classifying TV viewers into predefined clusters based on a combination of static and dynamic data, minimizing the need for long psychographic questionnaires. An alternative approach, which utilizes easy to obtain user data to overcome difficulties imposed by the proprietary nature of marketing databases, the need for regularly acquiring updates, and the sparsity effect on the accuracy of classification rules, has also been presented. Empirical evidence has been provided concerning the superiority of the lifestyle-based personalization in comparison with classification and nearest neighbor collaborative filtering techniques. The superiority of the proposed approach against traditional marketing practices and non-personalized selection of advertisements has also been demonstrated. Furthermore, since the proposed approach fulfils the low-interactivity domain requirement, it can implement a reliable long-term personalization strategy unaffected by limited user feedback data, which significantly affects collaborative filtering algorithms.

However, additional experimentation can prove beneficial in further validating and extending the aforementioned results. The work presented in this paper utilizes binary data and employs certain algorithms for their processing. We believe it is worth evaluating the lifestyle approach when numerical ratings are explicitly collected, since from an information-theoretic perspective it is argued that relevance has a middle range (Spink & Greisdorf, 1999), contrary to strictly dichotomous evaluations. The social aspect of TV viewing, which would involve personalized filtering of advertisements for more than a single viewer, is another important expansion of the current work. Further research incorporating product features as an additional data element to improve the personalization effect is also an attractive future research direction.

The above issues are part of an ongoing research program that aims to provide a scientific framework for extending the work presented in this paper towards the study of personalized promotions (including but not limited to advertisements) over other platforms, such as mobile marketing campaigns, that present similar characteristics and requirements, such as short interactive engagements and promotional messages of short duration.
