A data‐driven procedural‐content‐generation approach for educational games
Abstract
Although game‐based learning has been increasingly promoted in education, there is a need to adapt game content to individual needs for personalized learning. Procedural content generation (PCG) addresses the difficulty of developing such content by generating it automatically through algorithmic means, and it can produce individually customizable game contents applicable to various objectives. In this paper, we advance a data‐driven PCG approach that uses a genetic algorithm and support vector machines to automatically generate educational‐game contents tailored to individuals' abilities. In contrast to other content‐generation approaches, the proposed method does not depend on a designer's intuition in matching game contents to a player's abilities. We assessed this data‐driven PCG approach through an empirical study of children who played an educational language‐learning game designed to cultivate early English‐reading skills. To affirm the efficacy of the proposed method, we also compared the data‐driven approach with a heuristic‐based approach. Our results demonstrated two things. First, users realized greater performance gains from playing contents tailored to their abilities than from playing uncustomized game contents. Second, the data‐driven approach was more effective than the heuristic‐based approach in generating contents that closely match a specific player‐performance target.
Lay Description
What is already known about this topic
- Educational games are mostly designed with a fixed task‐progression difficulty. Given the current broad diversity of player background, preferences, and motivations, it is typically difficult to achieve a dynamic difficulty adaptation with any single and fixed progression.
- Procedural content generation provides a solution for this difficulty by developing game contents automatically through algorithmic means with or without the involvement of a human designer.
What this paper adds
- A data‐driven approach is employed to refine the fitness function for content evaluation and the prediction of players' capabilities.
- Unlike previous approaches that have predominantly steered the generation process through designer‐defined goals or heuristics, the proposed approach does not depend on the designer's intuition in aligning contents with fitness.
Implications for practice and/or policy
- The proposed framework can be applied to other educational games.
- The proposed approach offers users a different experience with every new game, which enhances the game's replayability.
1 INTRODUCTION
Adaptive learning is an educational method for personalized learning. Personalized learning refers to learning and instructional approaches that are driven by the needs and interests of individual learners. It represents a shift from teacher‐centred learning towards student‐centred learning. It is built upon constructivist learning theory, which emphasizes the critical role of the student in learning by constructing personally relevant meaning. Personalized learning cannot be realized in traditional classrooms on a large scale. Instead, computers or information and communication technologies play an important role in this regard by (a) adapting learning objectives, content, and progress according to individual learning needs, abilities, prior knowledge, and preferences; (b) recording individual activities for analysis; and (c) providing personalized feedback after assessing individual performance. Related studies include adaptive hypermedia (Brusilovsky & Peylo, 2003), intelligent tutoring systems (Hooshyar, Ahmad, Yousefi, Yusop, & Horng, 2015), and computer‐based pedagogical agents (Lai, Wang, & Wang, 2010).
Educational games have been increasingly promoted in recent years. Nonetheless, educational games tend to offer a static sequence of task difficulty. Whereas educational games are mostly designed with a fixed task‐progression difficulty, recent calls for dynamic tailoring of difficulty on a per‐player basis have emerged (e.g., Zook, Lee‐Urban, Riedl, et al., 2012). Given the current broad diversity of player background, preferences, and motivations, it is typically difficult to achieve a dynamic difficulty adaptation with any single and fixed progression. Procedural content generation (PCG) provides a solution for this difficulty by developing game contents automatically through algorithmic means with or without the involvement of a human designer (Shaker, Togelius, & Nelson, 2014). PCG offers several advantages over human‐crafted content development. First, it can augment existing human‐based content‐development practices to increase the availability of learning contents, thereby increasing the frequency at which learning can occur. Second, it can tailor game contents to individuals based on their abilities, thus increasing the effectiveness of user experience. Third, it can produce game contents on demand for learners without needing human instructional designers to prepare additional learning materials.
An evolutionary algorithm (EA)‐based search mechanism is a pre‐eminent method of PCG. Here, a population of game‐content instances is initially generated at random. It is subsequently evolved over iterations under the guidance of one or more evaluation functions. Adequately defining the evaluation function for fitness is a major challenge in EA‐based PCG because this function must both capture and evaluate a human player's experience.
A data‐driven approach can provide a more accurate means of scenario evaluation because the evaluation function is constructed from data collected on the actual effect of a game content (Hooshyar, Yousefi, & Lim, 2017a; Hooshyar, Yousefi, & Lim, 2018; Zook, Lee‐Urban, Drinkwater, & Riedl, 2012). Because this approach does not depend on a designer's intuition in matching contents to users' particular capabilities, it can more readily adapt to new types of game content. For these reasons, an EA‐based PCG is advanced in this paper through a data‐driven approach for content generation that is implemented in an English‐language‐learning game for developing children's early reading skills.
Reading is a complex and multidimensional construct that needs to be learnt. Phonology is essential to achieve reading competence in theoretical models of reading (Byrne, 2008). Therefore, we tested an English‐language‐learning game to cultivate reading skills in young children in this study. Reading games have received scant attention in the literature. Earlier studies have indicated appropriate training methods for enhancing early literacy development (e.g., Lundberg, Frost, & Petersen, 1988; Van de Ven, De Leeuw, Van Weerdenburg, & Steenbeek‐Planting, 2017). To the best of our knowledge, this is one of the first studies attempting to develop an adaptive data‐driven language‐learning game (DLLgame) to improve early reading ability of children. Other examples that we are aware of include Graphogame (Kyle, Kujala, Richardson, Lyytinen, & Goswami, 2013), self‐explanation reading training (e.g., McNamara, 2004), and “Literate” computer game (Lyytinen, Ronimus, Alanko, Poikkeus, & Taanila, 2007) specifically developed for dyslexia.
2 RELATED WORKS
In recent years, there has been increasing interest in applying data mining and data analytics to serious games. A developing body of research attests to the potential of serious game analytics, particularly how it can illuminate a connection between gameplay patterns and game design. In a study by Horn et al. (2016), players' action traces were clustered to reveal ways in which game mechanics affect students' thinking about their own gameplay, which in turn affects learning. Similarly, Harpstead and Aleven (2015) used educational data mining to perform learning curve analysis on a physics game with surprising results. In forecasting student error rates, they found a previously unanticipated “short‐cut” in which students developed strategies that successfully employed game mechanisms without any comprehension of the principles of physics they were supposed to learn. In a similar vein, Hicks, Liu, and Barnes (2016) used a zero‐inflation model to study user‐generated content within a programming game. Their study revealed that a game's variable content creation tool had an outsize effect on the appearance, affordability, and pedagogic usefulness of such user‐created content.
Data mining, data analytics, and algorithmic means can also be applied to serious games to create game content automatically, which is known as PCG. PCG has been used in a range of game‐design applications, including intelligent creation of levels (Hendrikx, Meijer, Van Der Velden, & Iosup, 2013). Recent efforts have focused extensively on adaptive games, particularly on how game contents can be made adaptive to a player's preferences using game analytics. For instance, in the Mario AI competition, entrants sought to develop a system capable of producing game contents for an individual user's enjoyment (Shaker et al., 2011). To date, however, there have only been a few attempts to employ PCG for educational ends. The game Refraction, for instance, teaches fractional arithmetic. It relies on PCG to create levels and introduce mathematical concepts in accordance with a given player's skills (Smith, Andersen, Mateas, & Popović, 2012). Hullett and Mateas (2009), meanwhile, have articulated a system for the creation of training scenarios within the domain of firefighting, for example, buildings partly collapsed or engulfed in flames, from which trainees must save victims. In this case, the generated scenario is the environment itself rather than any of the tasks or events taking place within it. Finally, Rodrigues, Bonidia, and Brancher (2017) have developed a math educational computer game using PCG with the aim of motivating users to practice math by solving problems. Results of their experiments, similar to previous related works, showed that the proposed game could successfully arouse students' interest.
Prior approaches have predominantly steered the generation process through designer‐defined goals or heuristics, so that the author is obliged to dictate challenging constraints or specific evaluative heuristics to evaluate game‐content fitness. The author must also provide a fair estimate of the effect of game contents on the learning process. In contrast, the proposed approach does not require any intuition on the part of the author or trainer in connecting game contents to players' capabilities or learning objectives. Instead, a data‐driven approach is employed to refine the fitness function for content evaluation and the prediction of players' capabilities. Moreover, this work represents a departure from previous approaches, not just in the generation but also in the evaluation of game contents regarding their impact on various learning objectives. In this respect, it does not depend on the designer's intuition in aligning contents with fitness. Its effectiveness is shown in an English‐language‐learning game used to develop children's early reading skills.
3 ENHANCING READING SKILLS
As a construct, reading is highly complex. It can be divided into lower order skills—phonological awareness, grapheme‐to‐phoneme conversion, lexical recall, reading fluency—and higher order (so called “complex”) skills (reading strategies, inference, word‐to‐text integration, and reading comprehension). Our current study has addressed the former (early reading development; Van de Ven et al., 2017), although it is entirely possible that more complex skills also undergo some long‐term alterations. Literacy research has identified various cognitive skills that contribute to early reading development. Previous studies have shown a causal relationship between reading at a young age and an understanding of phonemes and letter‐sound knowledge (Hulme & Snowling, 2013). Moreover, the relationship between letter‐sounding capacity and phoneme understanding seems to be bidirectional (Kyle et al., 2013).
A great deal of research has explored what outcomes are attained through early reading intervention (e.g., see a review by Blachman et al., 2014). A general conclusion is that such interventions have a moderate impact on word recognition. A study on games such as Graphogame, which aims to teach young readers English grapheme‐phoneme correspondences, has shown a beneficial effect of such games on English pseudoword and word decoding after roughly 11 hours of intervention (Kyle et al., 2013).
Moreover, students who struggle with reading tend to suffer a lack of autonomous motivation due to the external reading‐related pressure they feel. This is particularly problematic insofar as autonomous reading motivation is a major factor determining reading ability (e.g., Ashley, 2003). For this reason, it is vital that early intervention in reading aims not only at increased literacy but also at cultivating greater motivation.
4 TESTBED LANGUAGE‐LEARNING GAME
DLLgame is a web‐based adaptive learning game developed by Korea University, Republic of Korea. Its goal is to foster children's early English‐reading skills and motivation through a pair of activities. One involves alphabetic knowledge, whereas the other involves phonological awareness. To promote alphabetic knowledge, the DLLgame instructs players concerning lowercase and uppercase letter shapes as well as their corresponding sounds by playing them the sound of a letter's name, tracing its shape on the screen, and showing pictures of objects whose names start with the spoken phoneme (Figure 1).

When all letters of the alphabet have been reviewed, the player listens to a single phoneme while an assortment of graphemes (target and distractors) appears on the screen as shown in Figure 2. The player must identify the grapheme that corresponds to the sounded phoneme and select it by clicking on it. Instantaneous feedback then tells the player whether or not the selection is accurate. If the selection is inaccurate, the player is informed of the name of the letter and is instructed to make a second selection. The DLLgame gives players the chance to enhance their letter recognition automatically in a game‐like setting by recognizing and choosing the correct grapheme from various letters that are floating on the screen. Players score points when they click on the correct letter. These letter‐recognition games employ embedded assessments so that the system can speedily determine a player's familiarity with both uppercase and lowercase letters.

When the player masters relations between phonemes and graphemes, the DLLgame automatically advances to phonological training (early decoding skills) wherein it promotes phoneme recognition by extracting the initial sound along with pictures of objects whose names start with the sounded phoneme. In addition to promoting early decoding skills, it also implants vivid inter‐letter associations between word onsets and corresponding objects. These objects will subsequently act as cues for corresponding letter sounds as shown in Figure 3.

The proposed content‐generation framework has been applied in DLLgame to produce new customized contents. A player must identify and select either object pictures or the pertinent mnemonic image. Although the given graphemes are displayed alphabetically for all players alike, object pictures for each grapheme are chosen by the data‐driven content‐generation framework according to the player's knowledge level. Hence, DLLgame can produce game contents based on a player's knowledge strengths or deficiencies.
5 THE PROPOSED DATA‐DRIVEN APPROACH
Effective learning requires the generation of a range of contents (or learning materials) that enable the realization of various educational objectives with simultaneous tailoring to individual players. This study advances prior works in this field in which the process of content generation is ruled by either designer‐defined aims or heuristics. It does this by employing a data‐driven method in which support vector machines (SVMs) are used to construct the fitness function of a genetic algorithm (GA) for content assessment, together with a function for predicting player capability.
A summary of the data‐driven content‐generation method is given in Figure 4. It encompasses the following three primary modules: content design, data training, and content generation. The first module, content design, assists the designer in content creation. It generates the full set of domain‐specific contents, such as learning objectives, learning materials, and their instances. In the second module, data training, the SVM is trained using data gathered from the DLLgame. Once trained, the SVM is added to the GA‐based content generator in the third module, content generation, to assess content fitness. This content‐generation module can then produce contents that are suited to the learning aim and player capability. It then relays them to the DLLgame. The following subsections describe each of these three modules.

5.1 Content design
When children first start to read, they encounter letters as fundamentally meaningless symbols that they must memorize. As such, fundamental building blocks of reading in English such as letter recognition and sound knowledge pose significant challenges. By yoking letters with an object the student already knows and recognizes, letters can become more significant and thus easily memorable. Such tactics are mnemonic—joining something known with something unknown. By developing a mnemonic relationship, new information can be more easily remembered and recalled. Therefore, when children are given pictures of known objects (an apple, for example) to associate with unfamiliar objects (such as letters), they are able to memorize both names and sounds of letters better (Kyle et al., 2013). Such mnemonic strategies are also affirmed by aspects of dual‐coding theory, which characterizes processes of encoding and retrieving verbal and non‐verbal information. When verbal information is connected to non‐verbal content such as images, conjoined information is in fact stored twice and potentially easier to recollect subsequently. Moreover, dual‐coding theory underscores mnemonic benefits of images because they are easier to remember than words. As such, verbal information that has been yoked to imagery becomes much easier to remember (Paivio, 1991).
Likewise, using image‐based mnemonics to teach recognition of letters and letter sounds can greatly help young readers who are still unfamiliar with the basics of the alphabet. Researchers have even considered embedding picture mnemonics to more fully integrate images with letters and sounds, that is, literally embedding a letter within a mnemonic image that shares its first letter and sound. For instance, a snake shaped into an “S” would embed the letter “S.” This mnemonic technique joins letter shapes, sounds, and names within a familiar image. Research has shown that this is a successful approach to improving basic letter and sound recognition within a varied range of children (e.g., McNamara, 2012).
The game in question (DLLgame) has the advantage of an intelligent content generation approach to two primary learning objectives. First, it enhances alphabet competency through identification of mnemonic images that render graphemes as recognizable objects, linking the sound to the letter at once. Second, it enhances phonological competency by isolating the initial sound of words in connection with images of objects that start with the same phoneme. In order to generate variable content in support of a given learning goal, a designer needs to initially define the sought‐after Learning Objectives (LOs). To engage a range of LOs, the designer also needs to initially define a range of pedagogic material types according to the aforementioned theoretical disposition—like pictures of objects with names beginning with a given phoneme and mnemonic images—and establish LO intensities. An LO intensity denotes the designer's preference concerning the extent to which one of the LOs can be exercised in a scenario.
If a scenario contains $n$ LOs, the LO‐intensity vector that the designer enters can be presented as $\mathbf{a} = [a_1, a_2, \ldots, a_n]$, where $\mathbf{a} \in \mathbb{N}^n$. A designer enters high or low values as $a_i$ according to a preference concerning the specific LOs. Each type of learning material can contribute to the exercise of different LOs. A mnemonic image, for instance, could be employed for practice in LO1 or LO2. Learning materials are given parameters to which certain values are assigned and from which specific types of instances can be produced.
In the proposed framework, each generated content instance comprises a sequence of learning boxes as primary game‐content building blocks, grouping a game's basic learning materials together and carrying them out in a given time interval. In a content‐generation system, various learning materials can correspond to the acquisition of different LOs. Therefore, a learning box can group together learning materials that are connected to a specific LO(s).
5.2 Data training
The data‐training module comprises a data‐driven SVM‐training process. The trained SVMs can then be employed to predict a player's capabilities and the LO intensity of the content. To train SVMs to predict the LO intensities of contents and the player's proficiency level, player‐performance data are initially gathered by means of a sample set of scenarios generated from the input of random ability levels. These sample contents are likewise produced by means of a random content generator that chooses learning boxes from the repository at random. These boxes are then conveyed to the game for the playing process. Player‐performance data regarding the average performance for each LO are then gathered. Each player's average performance is calculated and mapped by the designer to an overall proficiency level ranging from 1 to 5. After simulation, these data are employed for mapping the LO intensities of the material and the player's level of proficiency.
For each LO, a corresponding LO performance that indicates the player's performance is obtained. LO2 (building phonological awareness), for example, is assessed in terms of the number of pictures displaying objects with names that begin with a given phoneme that a player correctly chooses from the total number of displayed object pictures. That is, LO intensities of a scenario are mirrored by LO performances, where each LO intensity inversely correlates with LO performance. If a player correctly chooses only a small number of the pictures depicting objects that start with the vocalized phoneme from the total number of pictures, the LO2 of the scenario for that player receives a high‐intensity value. For this study, a linear relationship between LO performance and LO intensity is presupposed. The former is normalized by way of min‐max normalization and subsequently scaled to the LO intensity's range of values.
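The inverse linear mapping from LO performance to LO intensity described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the sample performances, and the intensity range of 0 to 10 are assumptions made for the example.

```python
def min_max_normalize(value, lo, hi):
    """Scale value from [lo, hi] to [0, 1]; a degenerate range maps to 0.0."""
    if hi == lo:
        return 0.0
    return (value - lo) / (hi - lo)


def performance_to_intensity(performance, perf_min, perf_max,
                             intensity_min=0.0, intensity_max=10.0):
    """Map an LO performance to an LO intensity with an inverse linear rule.

    The intensity range [0, 10] is an assumption for this sketch.
    """
    normalized = min_max_normalize(performance, perf_min, perf_max)
    # Inverse relationship: the best performance receives the lowest intensity.
    return intensity_min + (1.0 - normalized) * (intensity_max - intensity_min)


# Hypothetical LO2 performances (fraction of correctly chosen object pictures).
performances = [0.2, 0.5, 0.9]
intensities = [
    performance_to_intensity(p, min(performances), max(performances))
    for p in performances
]
```

Under these assumptions, the weakest performance in the sample maps to the top of the intensity range and the strongest performance maps to the bottom.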
5.2.1 SVMs in the proposed approach
The SVM is a system of supervised classification designed to minimize the upper bound of an expected error. It seeks to identify the hyperplane dividing two classes of data that can best generalize to subsequent data. After gathering player data, SVMs are then used to predict the player's level of proficiency and LO intensity of contents. Because SVM is a binary classifier and predictions of the player's level of proficiency and LO intensity of contents comprise more than two classes, we used multi‐class classification with SVMs. A total of five SVMs were used for proficiency level, whereas 10 SVMs were used for LO intensity. Training‐target outputs are LO intensities and player‐proficiency level. A set of extracted features of the sample scenario and a set of player features and performances (for instance, descriptions of a user's gameplay behaviour such as switch time, which is the time between two actions, total time, correct and incorrect actions, and the index wherein the current game stage is indicated) then become SVM's input data. Regarding scenario features in this work, the following two sets of features were employed to act as scenario descriptors: (a) learning‐material intensity and (b) learning‐material type. A feature extractor was utilized to uncover values of these scenario features in every sample scenario. These values are then fed into SVMs as inputs. After receiving these inputs and target outputs, each SVM is trained to predict LO intensities of contents (to construct the fitness function for the scenario evaluation) and player‐proficiency level.
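One common way to build a multi‐class predictor from binary SVMs is a one‐vs‐rest scheme, in which one binary classifier is trained per class and the class with the largest decision value wins. The text does not state which scheme was used, so the sketch below is only an assumption. The linear scorers stand in for trained SVMs, their weights are hand‐set for illustration, and only three of the five proficiency classes are shown for brevity.

```python
def one_vs_rest_predict(scorers, features):
    """Return the class label whose binary scorer gives the largest score.

    scorers maps each class label to a function that returns a signed
    decision value (positive means "belongs to this class").
    """
    return max(scorers, key=lambda label: scorers[label](features))


def make_linear_scorer(weights, bias):
    """Build a linear decision function w . x + b (stand-in for a trained SVM)."""
    def score(features):
        return sum(w * x for w, x in zip(weights, features)) + bias
    return score


# Hypothetical hand-set weights; real values would come from SVM training.
# Feature order (assumed): [normalized accuracy, normalized switch time].
proficiency_scorers = {
    1: make_linear_scorer([-2.0, 1.0], 0.5),
    3: make_linear_scorer([0.0, 0.0], 0.0),
    5: make_linear_scorer([2.0, -1.0], -0.5),
}

strong_player = [0.9, 0.1]  # accurate and fast
weak_player = [0.1, 0.9]    # inaccurate and slow
```

Calling `one_vs_rest_predict(proficiency_scorers, strong_player)` selects the class whose scorer is most confident, which is how the per‐class outputs would be combined into a single proficiency level under this assumed scheme.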
5.3 Content generation
When data training was finished, the content‐generation module was used to generate contents. The proposed model offers designers leeway in the assignment of ways LOs are exercised within game contents. This is achieved by establishing LO intensities, $\mathbf{a}$, and player's proficiency, $\mathbf{b}$, both of which have been estimated by the trained SVM model. The player's proficiency is presented as the proficiency‐level vector $\mathbf{b} = [b_1, b_2, \ldots, b_n]$, where $\mathbf{b} \in \mathbb{N}^n$ and $n$ is the number of LOs in the game contents. It is used for adapting game contents to the player. It represents an estimation of the player's existing proficiency level. A high $b_i$ value generally implies a high proficiency of the player regarding task performance. From this pair of inputs, the desired LO‐intensity vector, $\mathbf{d}$, is derived from a combination of $\mathbf{a}$ and $\mathbf{b}$. Each element $d_i$ of the desired LO‐intensity vector, $\mathbf{d}$, is proportional to the product of the corresponding elements of $\mathbf{a}$ and $\mathbf{b}$, that is, $d_i \propto a_i b_i$. In practical terms, the desired value, $d_i$, is obtained by normalizing the products $a_i b_i$ to values between 0 and 10. This serves to compensate for the difference in proficiencies of various players. In learning processes, a player should be given contents that are suited to her or his current capabilities. Once $\mathbf{d}$ is specified, the GA‐based content generator is responsible for searching for scenarios that can best match $\mathbf{d}$.
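A minimal sketch of how the desired LO‐intensity values might be computed from the products of ai and bi follows. The text does not give the normalization rule, so dividing by the maximum proficiency level (assumed here to be 5) is an assumption chosen so that intensities up to 10 stay within 0 to 10. The function and variable names are illustrative.

```python
def desired_intensity(lo_intensity, proficiency,
                      max_proficiency=5, max_intensity=10.0):
    """Combine designer intensities a_i and player proficiencies b_i.

    Each desired value is proportional to a_i * b_i. Dividing by the
    maximum proficiency level is an assumed normalization, not a rule
    taken from the paper.
    """
    if len(lo_intensity) != len(proficiency):
        raise ValueError("both vectors need one entry per LO")
    desired = []
    for a_i, b_i in zip(lo_intensity, proficiency):
        d_i = a_i * b_i / max_proficiency
        # Clamp to the assumed intensity range as a safeguard.
        desired.append(min(max(d_i, 0.0), max_intensity))
    return desired


designer_intensity = [8, 9]   # LO1 = 8, LO2 = 9, as in the experiment
novice = [1, 2]               # hypothetical proficiency levels per LO
expert = [5, 5]

novice_target = desired_intensity(designer_intensity, novice)
expert_target = desired_intensity(designer_intensity, expert)
```

With this rule, a player at the top proficiency level receives the designer's intensities unchanged, and less proficient players receive proportionally lower targets.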
5.3.1 Genetic algorithm in the proposed approach
A GA‐based heuristic search is employed to uncover optimal sequences of learning boxes that are aligned with the designer's inputs. GA is a popular method for combinatorial optimization (Hendrikx et al., 2013). To generate the initial population, different sequences of learning boxes are randomly generated from the learning‐box repository. Candidate contents are then evolved by standard genetic operators such as addition, deletion, mutation, and cross‐over. To evaluate candidate content, the desired LO‐intensity vector, $\mathbf{d}$, and the aggregated LO‐intensity vector, $\mathbf{c}$, of the sequence must be compared. The aggregated LO‐intensity vector $\mathbf{c}$ comprises the sum of the respective LO intensities of all learning materials. Once $\mathbf{d}$ and $\mathbf{c}$ are determined, the fitness value for every box sequence in the population can be calculated primarily according to the Manhattan distance between $\mathbf{d}$ and $\mathbf{c}$. Instead of using designer‐defined heuristics to define $\mathbf{c}$, SVMs trained in the data‐training module are employed to determine $\mathbf{c}$ values.
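The search described above can be sketched as a small genetic algorithm over sequences of learning boxes. Everything in this example is an assumption made for illustration: the repository contents, the population size, the selection scheme (keeping the better half), and the use of the Manhattan distance alone as fitness. In the paper, the per‐box intensities would come from the trained SVMs rather than being hand‐set.

```python
import random

# Hypothetical repository: each learning box is an (LO1, LO2) intensity pair.
REPOSITORY = [(1, 0), (0, 1), (2, 1), (1, 2), (3, 0), (0, 3)]


def aggregate(sequence):
    """Sum the LO intensities of all boxes in a sequence."""
    return [sum(box[i] for box in sequence) for i in range(len(REPOSITORY[0]))]


def manhattan(u, v):
    return sum(abs(x - y) for x, y in zip(u, v))


def fitness(sequence, desired):
    """Lower is better: distance between aggregated and desired intensities."""
    return manhattan(aggregate(sequence), desired)


def mutate(sequence, rng):
    """Apply one random edit: add, delete, or replace a box."""
    child = list(sequence)
    op = rng.choice(["add", "delete", "replace"])
    if op == "add" or len(child) <= 1:
        child.insert(rng.randrange(len(child) + 1), rng.choice(REPOSITORY))
    elif op == "delete":
        del child[rng.randrange(len(child))]
    else:
        child[rng.randrange(len(child))] = rng.choice(REPOSITORY)
    return child


def crossover(left, right, rng):
    """One-point crossover; the head of `left` is never empty."""
    head = left[:rng.randrange(1, len(left) + 1)]
    tail = right[rng.randrange(len(right) + 1):]
    return head + tail


def evolve(desired, generations=50, population_size=20, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.choice(REPOSITORY) for _ in range(rng.randint(1, 5))]
        for _ in range(population_size)
    ]
    for _ in range(generations):
        population.sort(key=lambda s: fitness(s, desired))
        parents = population[:population_size // 2]  # keep the better half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return min(population, key=lambda s: fitness(s, desired))


best = evolve(desired=[8, 9])
```

Because the better half of each generation is carried over unchanged, the best fitness in the population can never get worse from one generation to the next, which is a convenient property for a sketch even though it is not something the paper specifies.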
6 EXPERIMENT
6.1 Experiment 1: Empirical study of human players
This study was carried out in three elementary schools in an urban area north of Seoul, Republic of Korea. A total of 150 students from 14 preschool classes were asked to participate. Of these, 120 students met age requirements of the study (turning 5 or 6 years old). However, 18 of these children were not used for analyses as they were not present for the collection of pre‐ and/or post‐test measures. The final sample comprised 102 children (33 females and 69 males). Parents or caretakers of these children received letters that outlined the study goal and requested permission regarding their children's participation. All parents granted permission.
Each teacher received a user manual detailing game contents and playing instructions. These teachers then helped participants play a sample scenario of the game that served to familiarize participants with player controls. Following this, the experiment was carried out in two phases for each study participant.
In Phase 1, participants played a set of pre‐generated game contents. Players' performances against LOs were recorded. Each player's proficiency at the time was estimated using a simple player model that derived a first‐order, linear approximation of player proficiency from performance data. Each participant's average performance was recorded and mapped to individual performance categories. From these mapped performance categories, each player was assigned an integer value between 1 and 5 to denote his or her proficiency in the LO.
Phase 2 of the experiment tested the capacity of the proposed framework regarding the generation of contents customized to the skills of individual players. The estimate of the player's proficiency gathered from Phase 1 became the input value for the content‐generation system, which then generated a set of customized content instances for every participant. For a point of comparison, an additional set of uncustomized contents was generated by the authors through assignment of randomly selected proficiency levels (either greater or less than their estimated proficiency) to players. Participants then played both the uncustomized and the customized sets of scenarios. It should be noted that LO intensities remained unchanged across all content‐generation procedures (LO1 = 8, LO2 = 9).
When determining the relative efficacy of generated contents, the extent to which the performance of a player increased from one round to the next within the game was considered. Customized and uncustomized sets of scenarios were similarly organized into pairs of instances generated with identical inputs (i.e., the proficiency level of the player was fixed for both). Thus, in each pair, difficulty levels of content instances were identical, and the participant played them as a pair. To determine in‐play performance gain, the previous performance was subtracted from the subsequent performance; the difference was then divided by the subsequent performance and multiplied by 100. The performance gain was calculated for each participant in both sets of scenario pairs (customized and uncustomized). The average value for each set was then determined. Regarding the average performance gain, it was hypothesized that customized scenarios would outperform uncustomized ones.
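The mapping from average performance to a proficiency category between 1 and 5 might look like the sketch below. The equal‐width bins and the function name `proficiency_level` are assumptions for illustration; the text only states that a first‐order, linear approximation was used.

```python
def proficiency_level(average_performance, levels=5):
    """Map an average performance in [0, 1] to an integer level 1..levels.

    Equal-width bins are an assumption made for this sketch.
    """
    if not 0.0 <= average_performance <= 1.0:
        raise ValueError("average_performance must lie in [0, 1]")
    # A perfect score would fall just past the last bin, so cap at `levels`.
    return min(int(average_performance * levels) + 1, levels)
```

For example, under these assumed bins an average performance of 0.55 falls in the third of five bins.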
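The performance‐gain calculation described in this paragraph can be written out as a short sketch. The function names and the sample values are illustrative assumptions; the formula itself follows the description in the text (difference divided by the subsequent performance, times 100).

```python
def performance_gain(previous, subsequent):
    """Percentage gain: (subsequent - previous) / subsequent * 100."""
    if subsequent == 0:
        raise ValueError("subsequent performance must be non-zero")
    return (subsequent - previous) / subsequent * 100.0


def average_gain(pairs):
    """Average the gain over (previous, subsequent) performance pairs."""
    gains = [performance_gain(prev, sub) for prev, sub in pairs]
    return sum(gains) / len(gains)


# Hypothetical performances for one participant over two scenario pairs.
example_pairs = [(0.5, 1.0), (0.75, 1.0)]
example_average = average_gain(example_pairs)
```

Note that the gain is expressed relative to the later performance rather than the earlier one, as stated in the text.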
6.2 Experiment 1: Results
Because the language‐learning game comprised a pair of learning objectives, each LO was evaluated in terms of its own specific measurement. LO1 (alphabetic‐principle acquisition) was calculated in terms of the number of correctly identified mnemonic images divided by the total number of mnemonic images (P1). LO2 (phonological‐awareness development) was calculated in terms of the number of correctly selected object pictures with names that began with the spoken phoneme divided by the total number of pictures of objects with names that began with the spoken phoneme (P2). At the end of each round of the game, P1 and P2 were recorded, and performance gain values of P1 and P2 were then derived for each pair of scenarios using the method described previously.
Using the estimated proficiency level of the player derived in Phase 1 of the experiment, participants were sectioned into five subsets for each LO, with each subset comprising participants who were judged as having similar proficiencies. Average improvements of the performance across all scenario pairs were then examined for each subset through comparison of customized sets with uncustomized ones. Figure 5 shows outcomes of this comparison. Across all five subsets, a greater increase of participants' performances in playing customized scenarios was evident compared with that in playing uncustomized ones. Moreover, results for customized scenarios showed that even the lowest‐proficiency participants (subsets 1 and 2) demonstrated greater performance improvements than more proficient participants.

To test the hypothesis put forward in the previous section, average improvements in the performance of the entire sample of 102 participants were also calculated. For this purpose, a one‐tailed t test was employed for paired samples due to dependency in sample data from customized and uncustomized scenarios. Average performance improvements for the two sets and p values from the t test are summarized in Table 1. The present‐research hypothesis—that customized scenarios will produce greater average performance gain than uncustomized scenarios—is clearly supported by small p values.
| | Customized | Uncustomized | p value |
|---|---|---|---|
| Performance gain (P1) | 30.40 | 9.26 | <.001 |
| Performance gain (P2) | 37.98 | 9.82 | <.001 |
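For readers who want to see what the paired comparison involves, the following sketch computes the paired‐samples t statistic using only the standard library. The function name and the sample gains are hypothetical; the study's raw per‐participant data are not reproduced here. The one‐tailed p value would then be read from a Student t distribution with the returned degrees of freedom.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(customized, uncustomized):
    """Return (t, degrees_of_freedom) for a paired-samples t test.

    The statistic is computed on per-participant differences
    (customized minus uncustomized), so a large positive t supports
    the one-tailed hypothesis that customized gains are greater.
    """
    if len(customized) != len(uncustomized) or len(customized) < 2:
        raise ValueError("need two equally sized samples with n >= 2")
    diffs = [c - u for c, u in zip(customized, uncustomized)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    n = len(diffs)
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Invented gains for four participants, for illustration only.
t, df = paired_t_statistic([30, 35, 28, 40], [10, 12, 9, 8])
print(t, df)
```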
6.3 Experiment 2: Training SVM‐based prediction model
The SVMs used for content generation were trained on data gathered through gameplay. Children in this study produced and played 100 random scenarios. Their LO performance data were collected to produce a dataset totaling 2,600 data points (100 children × the 26 A‐to‐Z contents of each scenario). This dataset was split into three parts, with 20% being designated as the testing set. A 10‐fold cross‐validation was employed for training and validating the SVMs. The SVM inputs are content and player features; the outputs are the predicted LO intensity and the player's proficiency. Min‐max normalization was employed to scale inputs to the [0, 1] interval.
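The preprocessing steps mentioned above, min‐max normalization and holding out 20% of the data for testing, might look roughly as follows. This is a standard‐library sketch under stated assumptions: the helper names are hypothetical, the feature rows are placeholders, and the random shuffling and seed are choices made for the example, since the text does not say how the split was drawn.

```python
import random


def min_max_normalize(rows):
    """Scale each feature column of `rows` into the [0, 1] interval."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


def holdout_split(items, test_fraction=0.2, seed=0):
    """Shuffle `items` and return (train, test) with the given test share."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder feature rows; the real content and player features differ.
features = min_max_normalize([[1, 10], [3, 20], [5, 30]])
train, test = holdout_split(range(2600))
print(features, len(train), len(test))
```

In practice the minimum and maximum would normally be estimated on the training portion only and then applied to the test portion, so that no information leaks from the held‐out data.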
To demonstrate the superiority of the SVM over the artificial neural network (ANN), which is widely used in PCG‐based games (Hendrikx et al., 2013), each classifier was trained with the gathered dataset to predict LO intensity and player proficiency. Table 2 summarizes the highest precision, recall, and F‐measure obtained in the evaluation, for which SVM and ANN testing sets are employed alike. Results demonstrate that the trained SVM can predict the LO intensity and the proficiency level with precision of 95.3% and 94.2%, respectively, and with recall of 90.5% and 88.6%, respectively. With the ANN, the LO intensity and the proficiency level can only be predicted with precision of 87.8% and 84.1%, respectively, and with recall of 83.5% and 76.3%, respectively. These results demonstrate that SVMs can generate stronger predictive results for LO intensity and proficiency level than the ANN. These align with target values.
| | Classifier | Precision (%) | Recall (%) | F measure (%) |
|---|---|---|---|---|
| LO intensity | SVM | 95.3 | 90.5 | 91.9 |
| | ANN | 87.8 | 83.5 | 79.9 |
| Proficiency level | SVM | 94.2 | 88.6 | 89.6 |
| | ANN | 84.1 | 76.3 | 75.9 |
- Note. ANN = artificial neural network; SVM = support vector machine.
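As a reminder of how the three metrics in Table 2 are defined, the sketch below computes them for a single class from true and predicted labels. It assumes that "F measure" refers to the balanced F1 score (the harmonic mean of precision and recall), which the text does not state explicitly; the function name and the labels are invented for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for one class treated as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented labels, for illustration only.
print(precision_recall_f1([1, 1, 1, 0, 0], [1, 1, 0, 1, 0]))
```

For multi‐level outputs such as LO intensity or proficiency level, these per‐class values would additionally need to be averaged across classes; which averaging scheme was used is not specified in the text.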
6.4 Experiment 3: Data‐driven versus heuristic‐based approaches
Likewise, the efficacy of the proposed data‐driven method was assessed in comparison with a previous heuristic‐based approach (Hooshyar, Yousefi, & Lim, 2017b), in which a heuristic function approximated the LO intensities of contents by extrapolating from the LO intensities of the learning boxes they contained. In that approach, both these intensities and player proficiencies were pre‐established according to the designer's knowledge or intuition (in the proposed data‐driven method, by contrast, they were obtained from the trained SVMs). To assess the efficacy of the two approaches, their generated contents were compared.
A target performance describing the desired LO performance to be realized in the gameplay was initially set for each LO to generate a scenario. Target performances were then ascribed an LO‐intensity value. To derive sought‐after LO intensities, these intensities were scaled by the designer's chosen player‐proficiency levels. With these values as established inputs, scenarios were generated from each method. When assessing these generated scenarios, LO performances of players in the study's game were considered against the specified target performance that indicated the sought‐after LO gameplay performance. A smaller discrepancy between the two suggests a closer alignment between the given scenario and the desired LO intensities. That is, the two methods are compared by determining which one generates scenarios with the smaller gap between the actual and target performances.
In this experiment, a set of 10 scenarios was generated from each method with the given target performance. Gameplay from real players was then employed in these scenarios to derive actual LO performance. Figure 6 presents performances (targeted and actual) of the heuristic‐based and data‐driven scenarios. Five instances of each scenario were derived using the same target performance. Actual performances displayed in Figure 6 represent the average value of these five instances.

From outcomes given in Figure 6, an average distance was derived between the target and resulting performances over all 50 scenarios (i.e., five instances each for 10 sample scenarios). In the data‐driven approach, average distances for Performance LO1 and Performance LO2 were 0.0330 and 0.0080, respectively. In the heuristic‐based approach, they were 0.0508 and 0.0800, respectively. A two‐sample one‐tailed t test (sample size = 50) was carried out to determine whether the data‐driven approach produced a smaller difference between the target and actual performances than the heuristic‐based method. The test yielded p values <.001 for the performances of both LO1 and LO2. This shows that the proposed data‐driven approach outperforms the heuristic‐based approach because it can generate scenarios that are more closely aligned with the specified target performance.
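The average‐distance comparison can be illustrated with a short sketch. It assumes that "distance" means the absolute difference between target and actual performance, which the text does not state explicitly; the function name and the numbers are hypothetical and are not the study's data.

```python
from statistics import mean


def average_distance(targets, actuals):
    """Mean absolute gap between target and actual LO performances."""
    if len(targets) != len(actuals) or not targets:
        raise ValueError("need two non-empty sequences of equal length")
    return mean(abs(t - a) for t, a in zip(targets, actuals))


# Invented target/actual performances for three scenarios.
targets = [0.80, 0.60, 0.70]
data_driven = [0.78, 0.61, 0.72]
heuristic = [0.70, 0.68, 0.75]

# The method with the smaller average distance tracks the target more closely.
print(average_distance(targets, data_driven))
print(average_distance(targets, heuristic))
```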
7 CONCLUSION AND FUTURE WORK
In this paper, a data‐driven PCG method is proposed that seeks to generate adaptive contents suited to the various proficiencies of individual users of educational games. Whereas previous works on content and story generation, both in games and in interactive‐storytelling research, have depended on a designer‐established heuristic fitness function, the proposed method does not depend on the intuition of the author or trainer in charting a relation between contents and a player's capabilities. Instead, the proposed approach employs a GA as the computational core for content generation and SVMs to construct the fitness function for scenario evaluation. In brief, it addresses the way that game contents can be modulated to meet the capabilities of individuals.
This data‐driven PCG approach was assessed at length. Its effectiveness was shown in results of an empirical study of 5‐ and 6‐year‐old children (total of 102) who played an educational language‐learning game to cultivate early English‐reading skills. To affirm the efficacy of the proposed data‐driven method, it was evaluated against a heuristic‐based approach. Results clearly demonstrated the following two outcomes. First, users realized greater performance‐based gains from playing contents tailored to their abilities compared with playing uncustomized game contents. Second, the data‐driven approach more effectively generated contents closely matched to a specific player‐performance target than the heuristic‐based approach. The proposed framework can be applied to other educational games provided that these do not have strict didactical constraints that limit variation in contents.
Future research should further investigate the effect of customized and uncustomized game contents on players' learning gain and motivation, measured before and after play. Moreover, an additional, carefully designed experiment should study the extent to which players remain motivated when playing each type of content.
ACKNOWLEDGEMENTS
This work was supported by a Korea University Grant as well as Ministry of Culture, Sport and Tourism (MCST) and Korea Creative Content Agency (KOCCA) in the Culture Technology (CT) Research & Development Program 2017 (R2016030031).




