Markov chain to analyze web usability of a university website using eye tracking data

Web usability is a crucial feature of a website, allowing users to easily find information in a short time. Eye tracking data recorded during the execution of tasks make it possible to measure web usability more objectively than questionnaires. In this work, we evaluated the web usability of the website of the University of Cagliari through the analysis of eye tracking data with qualitative and quantitative methods. The performance of two groups of students (i.e., high school and university students) across 10 different tasks was compared in terms of time to completion, number of fixations and difficulty ratio. Transitions between different areas of interest (AOIs) were analyzed in the two groups using a Markov chain. For the majority of tasks, we did not observe significant differences in the performance of the two groups, suggesting that the information needed to complete the tasks could easily be retrieved by students with little previous experience in using the website. For a specific task, high school students showed a worse performance based on the number of fixations and a different Markov chain stationary distribution compared to university students. These results highlighted elements of the pages that can be modified to improve web usability.


INTRODUCTION
The usability of a website is crucial for its success. Every public or private institution needs a well-designed website to offer information and services to its users. In the design of such websites, ease of use should be one of the main features. The website should contain all the useful information that a user might want to find, but it is also important for this information to be arranged in an orderly and consistent manner. A good level of web usability allows a user who is browsing the web portal to find the information needed as quickly as possible.
To assess whether the interface of a website is intuitive and easy to use, many studies rely on a measure defined as web usability, which is often evaluated using questionnaires only. A more objective measure of web usability can be obtained using an eye tracker, that is, an instrument that uses near-infrared light to create a reflection on the cornea and pupil of the subject's eye, making it possible to estimate the eye position and, therefore, the point observed. The eye tracker thus makes it possible to analyze viewing behavior during visualization of images, texts, or other visual stimuli [1,2]. Eye tracking technology has been increasingly applied to the study of web usability in different fields, for example, tourism [3] and e-commerce [4][5][6]. Indeed, a recent review of studies using neurophysiological tools to investigate e-commerce topics showed that, among 89 retrieved studies, eye tracking was the most commonly used instrument (87.6% of the studies) [4]. Eye tracking has also been used to highlight differences in viewing behavior according to demographic characteristics. For instance, a recent study used eye tracking to investigate gender differences in consumers' visual patterns of attention to online shopping information [5]. The authors identified the main areas of interest (AOIs) (product information, consumer opinions, offers and supplementary information) in different online shopping screens and analyzed the number and duration of visual observations in each AOI. Eye tracking has also proven useful for analyzing how users react to different designs of a web page, showing how position and visual salience can influence the way in which users find target web objects in web pages [6].
Although eye tracking has been widely used to evaluate the way in which consumers interact with a web page, only a few studies have used it to analyze the web usability of portals of public or educational institutions [7,8]. Specifically, two recent studies designed activity tasks to evaluate the efficiency of the website of a university library [8] and of a German educational institution platform [7]. Students had to perform either tasks related to the library database, such as searching for and requesting a textbook or downloading a book from electronic resources [8], or tasks that involved looking for specific courses [7]. In the latter study, the analysis of viewing behavior was conducted using a combination of qualitative analysis of heat maps and gaze plots (two types of graphs that represent the density and the chronological order of observations, respectively) and a quantitative analysis of metrics such as duration of the search or time to first observation [7]. Indeed, recording eye movements during the execution of tasks makes it possible to analyze web usability using a more comprehensive range of metrics beyond time to completion. However, eye tracking produces large amounts of data, raising the need for robust statistical approaches to analyze viewing behavior and find patterns [1]. Different approaches can be used to extract knowledge from eye tracking data, ranging from qualitative analysis of heat maps to quantitative analysis of different types of eye movements, such as fixations, saccades or smooth pursuits, or comparison of scanpaths [9,10]. In recent years, probabilistic approaches for the analysis of eye movements have been proposed. These approaches consider fixations as random variables generated by stochastic processes [9].
A Markov chain makes it possible to transform sequences of fixations of unequal length into fixed-length vectors of probabilities. This approach has been successfully applied to understand which features of an image gain more attention from users [11] as well as to compare viewing behavior of images between different groups of participants [12]. To our knowledge, probabilistic methods such as Markov chains have never been used to evaluate the web usability of websites of public or educational institutions. Since the large majority of students rely on it to retrieve information about, for example, exams, professors and courses, a university website should be properly organized to facilitate and speed up searches and information retrieval. However, to our knowledge, no previous study has leveraged the eye tracking methodology and Markov chain analysis to thoroughly assess the web usability of a university website, as previous studies assessed only specific parts of institutional websites (e.g., government or university libraries) or were limited to single media (e.g., images, texts and videos).
The aim of the present study was to evaluate the web usability of the website of the University of Cagliari (www.unica.it) in order to identify strengths as well as potential problems and to suggest improvements. To this aim, we combined qualitative and quantitative analysis of eye tracking data, using different metrics (number of fixations, task duration and a difficulty ratio) and a Markov chain approach to analyze the participants' behavior with respect to the main AOIs of the home page. In addition, we compared the performance of university and high school students to assess whether users with little knowledge of the website are able to use it efficiently.

Design of the tasks
The object of the study was the website of the University of Cagliari (UniCa, Italy), "www.unica.it," launched in 2017. This website was the first Italian university portal to meet the Agid (Agenzia Italiana digitale del Consiglio dei ministri) criteria. These criteria mainly promote a user-friendly layout, readability and alternative content for people with disabilities. Some of the main features of the UniCa website include being responsive (that is, able to adapt the layout of the pages to any device) and a site section specifically designed for future students. In order to assess whether the web portal of the University of Cagliari is easily browsable, 10 tasks that required an active search within the sections of the portal were designed (Table 1). The tasks required either reaching a specific page of the portal or retrieving a specific piece of information inside a page. The tasks were designed to require an approximately equal number of pages to be opened.

Sample
Students belonging to two distinct groups were recruited. The first group included randomly selected students from different high schools (ranging from more humanistic to more technical ones) recruited during a university fair, where students from any part of the region gathered information about faculties. The second group included university students, mainly from economics and law departments, randomly selected in group study rooms (on different days of the week and at different times of the day). Each student was randomly assigned one of the 10 tasks. Information regarding age, gender, high school institute, and/or university course was collected.

Assessment of viewing behavior using the eye tracker
For each participant, eye movements during the task were recorded with a screen-based Tobii X2-60 Compact eye tracker, which captures gaze data at 60 Hz, attached to a 25-inch monitor. Before the task was started, the study procedures were explained to each subject. Participants were instructed not to use search engines (internal or external to the site) to complete the task. Then, the instrument was calibrated according to the specific height and distance from the screen for each subject. During the task, the viewing behavior was monitored using a second screen in order to verify that the recording was proceeding correctly.
After all tasks were completed, the raw data were processed in order to classify eye movements into different types (e.g., saccades and fixations). Specifically, the velocity-threshold fixation identification (I-VT) algorithm implemented in Tobii Studio v. 3.3.1 was used. This algorithm classifies eye movements based on the velocity of the directional shifts of the eye. In this study, we focused on fixations, since they are considered the metric of most interest in research [13], as they are believed to indicate the moment in which information was most probably registered by the brain [14]. One of the approaches used in the literature consists of removing fixations whose duration is below a threshold, for example, 100 ms [15]. The hypothesis is that below a certain time threshold no useful information can be gathered, so the fixation can be considered noise. However, the real threshold can be affected by several factors, such as task complexity or gender [16], so a fixed cutoff may introduce a bias in the data analysis. Since we designed a comprehensive range of tasks requiring exploration of different parts of the website, we decided not to remove any fixation, on the hypothesis that even partial information might be useful to the user to complete the task.
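The velocity-threshold idea can be illustrated with a minimal sketch. It is not the Tobii Studio implementation: it works on pixel distances rather than visual angle, omits any noise filtering or merging of neighboring fixations, and the function names, the toy samples and the threshold of 1000 pixels per second are all hypothetical choices made for illustration.

```python
import math


def classify_ivt(samples, velocity_threshold):
    """Label each gaze sample 'fixation' or 'saccade' by point-to-point speed.

    samples: list of (t_seconds, x, y) tuples sorted by time.
    velocity_threshold: speed cutoff in pixels per second (hypothetical value).
    """
    labels = []
    for i, (t, x, y) in enumerate(samples):
        if i == 0:
            labels.append(None)  # no previous sample, so no speed yet
            continue
        t_prev, x_prev, y_prev = samples[i - 1]
        dt = t - t_prev
        speed = math.hypot(x - x_prev, y - y_prev) / dt if dt > 0 else float("inf")
        labels.append("fixation" if speed < velocity_threshold else "saccade")
    if len(labels) > 1:
        labels[0] = labels[1]  # the first sample inherits the second sample's label
    elif labels:
        labels[0] = "fixation"
    return labels


def _summarize(run):
    """Return (start_time, end_time, mean_x, mean_y) for a run of samples."""
    xs = [s[1] for s in run]
    ys = [s[2] for s in run]
    return (run[0][0], run[-1][0], sum(xs) / len(xs), sum(ys) / len(ys))


def group_fixations(samples, labels):
    """Collapse consecutive 'fixation' samples into single fixations."""
    fixations, current = [], []
    for sample, label in zip(samples, labels):
        if label == "fixation":
            current.append(sample)
        elif current:
            fixations.append(_summarize(current))
            current = []
    if current:
        fixations.append(_summarize(current))
    return fixations


# Toy 60 Hz recording: three stable samples, one large jump, two stable samples.
dt = 1 / 60
toy = [(0 * dt, 100, 100), (1 * dt, 101, 100), (2 * dt, 100, 101),
       (3 * dt, 400, 300), (4 * dt, 401, 300), (5 * dt, 400, 301)]
toy_labels = classify_ivt(toy, velocity_threshold=1000.0)
toy_fixations = group_fixations(toy, toy_labels)
```

In this toy recording the sample that arrives after the jump is labeled as a saccade, so the samples before and after it form two separate fixations. Because no minimum duration is imposed, short fixations are kept, in line with the choice described above.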

Comparison of performances between high school and university students
Age was compared between the two groups of students using Student's t test, while gender and previous knowledge of the website were compared using Pearson's chi-squared test. The performance of high school and university students was compared using three metrics: time to completion of the task (defined as the time at the last fixation for each participant), number of fixations and a difficulty ratio calculated as the ratio of the number of pages visited to the minimum number of pages required to complete a task [17]. Correlation between time to completion and the other two metrics was assessed using Spearman's correlation test. Tasks whose metrics were more than two standard deviations (SD) above the mean were considered to be more difficult than the average. For each task, normality of distribution for the three metrics was evaluated using the Shapiro-Wilk test. In case of normal distribution, Student's t test (with or without Welch's correction for unequal variances based on results of Levene's test) was used to compare performance between the two groups of students. Otherwise, the Mann-Whitney U test was used. Multiple testing was accounted for using the Bonferroni correction. As 10 different tasks were conducted, a p-value < 0.005 (i.e., 0.05/10) was considered significant. All analyses were conducted using R v. 3.6.1 [18]. Additionally, fixation coordinates were used to plot two types of graphs for a qualitative analysis: heat maps and gaze plots. Heat maps are graphical representations of data with colors ranging from red (areas of the page with more fixations) to green (areas of the page with fewer fixations). Gaze plots show fixations in the exact order in which they occurred [19].
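As a small worked illustration of two of the quantities defined above, the sketch below computes a difficulty ratio and the Bonferroni-adjusted significance threshold. The analyses in the study were run in R; this Python snippet is only a sketch, and the function names and the page counts in the example are hypothetical.

```python
def difficulty_ratio(pages_visited, min_pages_required):
    """Ratio of pages actually visited to the minimum needed for the task."""
    if min_pages_required <= 0:
        raise ValueError("min_pages_required must be positive")
    return pages_visited / min_pages_required


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Hypothetical participant: 6 pages opened for a task that needs at least 2.
example_ratio = difficulty_ratio(pages_visited=6, min_pages_required=2)

# Ten tasks tested at an overall alpha of 0.05, as in the text.
example_threshold = bonferroni_threshold(alpha=0.05, n_tests=10)
```

A ratio of 1 means the participant followed the shortest possible path, while larger values indicate detours through pages that were not needed.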

Analysis of the viewing behavior on the home page using Markov chain
A more detailed analysis comparing gaze transitions among AOIs between the two groups of participants was conducted on the home page of the UniCa web portal using a Markov chain. The home page was chosen because it was the starting point of each task and is commonly considered the most important part of a website. As this analysis was conducted on transitions, only participants with at least two fixations with known (x, y) coordinates on the part of the home page visible without scrolling were retained.
A Markov chain is a discrete-time stochastic process $\{X_0, X_1, \ldots\}$ for which the following property holds: the distribution of $X_t$ depends only on $X_{t-1}$ and is therefore independent of all earlier values $X_0, X_1, \ldots, X_{t-2}$ [20]. This can be written as a conditional probability:

$$P(X_t = j \mid X_0 = i_0, \ldots, X_{t-2} = i_{t-2}, X_{t-1} = i) = P(X_t = j \mid X_{t-1} = i) = p_{ij} \tag{1}$$

where $p_{ij}$ in Equation (1) is the transition probability, that is, the probability that the chain jumps from state $i$ to state $j$ in one step, and $i, j$ are states of a countable set $S$. The transition probabilities satisfy $\sum_{j \in S} p_{ij} = 1$ for every $i \in S$. The matrix $P = (p_{ij})$ is the transition matrix of the chain [21].
The stationary distribution of a Markov chain with transition matrix $P$ is a vector of probabilities $\pi$ satisfying the property $\pi P = \pi$. To converge to a stationary distribution, a Markov chain needs to be irreducible, recurrent and aperiodic [20].
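Let $X$ be a Markov chain. The chain $X$ is irreducible if, for all states $i, j$, there exists $t > 0$ such that:

$$P(X_t = j \mid X_0 = i) > 0 \tag{2}$$

An irreducible chain $X$ is recurrent if, denoting by $\tau_{ii} = \min\{t > 0 : X_t = i \mid X_0 = i\}$ the time of first return to state $i$:

$$P(\tau_{ii} < \infty) = 1 \quad \text{for some (and hence all) } i \tag{3}$$

An irreducible chain $X$ is called aperiodic if:

$$\gcd\{t > 0 : P(X_t = i \mid X_0 = i) > 0\} = 1 \quad \text{for some (and hence all) } i \tag{4}$$

If a chain $X$ satisfies the properties of Equations (2), (3), and (4), the stationary distribution represents the limiting distribution of successive iterates from the chain, regardless of the starting probabilities of each state [20].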
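The statement that the limiting distribution does not depend on the starting probabilities can be checked numerically on a toy example. The sketch below repeatedly applies one step of a chain to a probability vector until it stops changing. The three-state matrix is hypothetical and chosen to be irreducible and aperiodic; it is not derived from the study data.

```python
def step(dist, P):
    """One step of the chain: returns the row vector dist multiplied by P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def stationary_by_iteration(P, start, tol=1e-12, max_iter=10000):
    """Iterate dist <- dist P until successive vectors differ by less than tol."""
    dist = list(start)
    for _ in range(max_iter):
        nxt = step(dist, P)
        if max(abs(a - b) for a, b in zip(dist, nxt)) < tol:
            return nxt
        dist = nxt
    raise RuntimeError("iteration did not converge")


# Hypothetical 3-state transition matrix; every row sums to one.
P_demo = [[0.50, 0.50, 0.00],
          [0.25, 0.50, 0.25],
          [0.00, 0.50, 0.50]]

# Two very different starting distributions.
pi_from_first = stationary_by_iteration(P_demo, [1.0, 0.0, 0.0])
pi_from_last = stationary_by_iteration(P_demo, [0.0, 0.0, 1.0])
```

Both runs converge to the same vector, (0.25, 0.5, 0.25), which is left unchanged by a further step of the chain. For a chain that is reducible or periodic, this simple iteration may fail to converge or may depend on the starting vector.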
Seven AOIs were drawn around the main buttons of the menus in the UniCa home page and named with capital letters from A to G (Figure 1). A padding of 30 pixels was applied around every object of interest in order to be able to also capture fixations in proximity of each menu.
For each participant, fixations in the home page were extracted and assigned to the corresponding AOI according to their (x, y) coordinates, while fixations outside any AOI were assigned to an eighth region (named with the capital letter H). The letters from A to H represented the states of the Markov chain.
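The assignment of fixations to states can be sketched as a point-in-rectangle test on padded boxes. The AOI coordinates below are hypothetical and do not correspond to the actual layout of the UniCa home page; only the 30-pixel padding and the use of H for fixations outside every AOI follow the text. The sketch also assumes that padded boxes do not overlap; if they did, the first matching AOI would win.

```python
PADDING = 30  # pixels added around every object of interest, as in the text


def pad(box, padding=PADDING):
    """Enlarge a (left, top, right, bottom) box by `padding` on every side."""
    left, top, right, bottom = box
    return (left - padding, top - padding, right + padding, bottom + padding)


def assign_aoi(x, y, aois, outside_label="H"):
    """Return the label of the first padded AOI containing (x, y), else H."""
    for label, box in aois.items():
        left, top, right, bottom = pad(box)
        if left <= x <= right and top <= y <= bottom:
            return label
    return outside_label


# Hypothetical AOIs around two menu buttons (screen pixel coordinates).
toy_aois = {"A": (100, 50, 200, 80), "B": (300, 50, 400, 80)}

# Hypothetical fixation coordinates in chronological order.
toy_points = [(150, 60), (215, 95), (250, 60), (310, 40)]
toy_states = [assign_aoi(x, y, toy_aois) for x, y in toy_points]
```

The second point lies outside the unpadded box of A but inside its padded version, so it is still assigned to A, while the third point falls between the two padded boxes and is assigned to H.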
For each participant $k \in K$, let $D_k$ be a vector of variable size containing the sequence of states representing the regions in which each fixation was made. Transitions between states contained in $D_k$ were used to obtain an $n \times n$ transition matrix $T_k$, with $n = 8$. For each group of participants $g \in \{1, 2\}$, let $M^{(g)}$ be the matrix defined as:

$$M^{(g)} = \sum_{k \in K_g} T_k \tag{5}$$

where $K_g \subseteq K$ denotes the participants belonging to group $g$; that is, Equation (5) is the matrix containing all transitions between states. Finally, the transition probability matrix $P^{(g)}$ was obtained by dividing each entry of $M^{(g)}$ by the total of its row, $p^{(g)}_{ij} = m^{(g)}_{ij} / \sum_{j'} m^{(g)}_{ij'}$. The stationary distribution of the two transition probability matrices was computed using the steadyStates function of the markovchain R package [23]. In this function, eigenvectors corresponding to eigenvalues equal to one are identified and then normalized to sum to one.
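The construction of the group-level matrices can be sketched as follows, assuming that the group matrix is the element-wise sum of the participants' transition count matrices and that transition probabilities are obtained by dividing each count by its row total. This Python sketch is an illustration only and does not reproduce the markovchain R package; the function names and the two toy state sequences are hypothetical.

```python
STATES = "ABCDEFGH"  # seven AOIs plus H for the area outside any AOI
INDEX = {state: i for i, state in enumerate(STATES)}


def transition_counts(sequence):
    """Count transitions between consecutive states of one participant."""
    n = len(STATES)
    counts = [[0] * n for _ in range(n)]
    for current, following in zip(sequence, sequence[1:]):
        counts[INDEX[current]][INDEX[following]] += 1
    return counts


def sum_matrices(matrices):
    """Element-wise sum of the participants' count matrices."""
    n = len(STATES)
    total = [[0] * n for _ in range(n)]
    for matrix in matrices:
        for i in range(n):
            for j in range(n):
                total[i][j] += matrix[i][j]
    return total


def row_normalize(counts):
    """Turn counts into probabilities; rows with no transitions stay all zero."""
    probabilities = []
    for row in counts:
        row_total = sum(row)
        if row_total:
            probabilities.append([c / row_total for c in row])
        else:
            probabilities.append([0.0] * len(row))
    return probabilities


# Two hypothetical participants with sequences of different length.
toy_sequences = [["A", "A", "H", "B"], ["H", "A", "H", "H", "B"]]
M_toy = sum_matrices([transition_counts(seq) for seq in toy_sequences])
P_toy = row_normalize(M_toy)
```

Sequences of unequal length contribute to the same fixed-size 8 × 8 matrix. States that are never left in the data (B in this toy example) produce an all-zero row, which would have to be handled before computing a stationary distribution.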
The verifyHomogeneity function of the same package, which uses a chi-square-based test, was used to verify if transition matrices of the two groups of participants belonged to the same Markov chain.
Similar analyses were conducted at the level of single tasks in case a task showed significant differences between the two groups of participants in time to completion, number of fixations and difficulty ratio. The qgraph R package [24] was used to generate the oriented graphs of the states of the chain, ggplot2 [25] to plot the graphs showing the fixations in each AOI and scanpath [26] to plot the scanpaths of single participants.

Comparison of performances between high school and university students
A total of 56 high school (Group 1) and 66 university students (Group 2) were recruited. After exclusion of 5 high school and 3 university students for technical problems during the task, 51 and 63 participants from Group 1 and Group 2 were included in the analyses, respectively (Table 2). As expected, the two groups differed in terms of age, while there was no gender difference between high school and university students (Table 2). A higher proportion of students from the second group had previous knowledge of the UniCa website (Table 2). All participants completed the task within 6 min (mean ± SD, Group 1: 1.98 ± 1.33 min, Group 2: 1.63 ± 1.27 min). Individual task completion times across the tasks ranged from 15 s to 5.54 min for Group 1 and from 21 s to 5.43 min for Group 2. Besides time to completion, two other metrics were used to evaluate the efficiency of the website and differences in viewing behavior between the two groups: number of fixations and a difficulty ratio calculated as the number of visited pages divided by the minimum number of pages needed to complete a task. Time to completion was found to be positively correlated with both metrics (fixations: Spearman's rho = 0.81, p = 2.2E-16; difficulty ratio: Spearman's rho = 0.83, p = 2.2E-16). No task showed a time to completion or number of fixations more than two SD above the mean. Conversely, for Task 4 the mean difficulty ratio in both groups of participants (Group 1: 6.88; Group 2: 6.42) was more than two SD above the mean (Group 1: mean = 3.38; SD = 1.61; Group 2: mean = 3.10; SD = 1.60).
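The two-SD criterion for Task 4 can be verified directly from the values reported above. The short check below is only a worked illustration of the arithmetic; the function name is hypothetical.

```python
def exceeds_two_sd(value, mean, sd):
    """True if value lies more than two standard deviations above the mean."""
    return value > mean + 2 * sd


# Difficulty ratios for Task 4 against the reported group means and SDs.
group1_flag = exceeds_two_sd(value=6.88, mean=3.38, sd=1.61)  # cutoff 6.60
group2_flag = exceeds_two_sd(value=6.42, mean=3.10, sd=1.60)  # cutoff 6.30
```

Both flags are true: 6.88 exceeds the Group 1 cutoff of 3.38 + 2 × 1.61 = 6.60, and 6.42 exceeds the Group 2 cutoff of 3.10 + 2 × 1.60 = 6.30.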
For the majority of the tasks, no significant differences were observed between the two groups after multiple testing correction (Table 3). For Task 8, university students showed a better performance based on all three metrics. However, only the number of fixations was significantly different between the two groups after multiple testing correction. In addition, university students completed Task 5 with a lower difficulty ratio (Table 3). A qualitative analysis of heat maps and gaze plots was also conducted. Figure 2 shows an example of a heat map and gaze plot of the home page for high school students. As shown by the image, the majority of fixations were concentrated in the upper part of the page and specifically in the areas around the buttons of the menus. Therefore, a more detailed analysis of this area was conducted as described in the next section. Figure S1 shows a gaze plot of the destination page of Task 4, which was the only task with a difficulty ratio more than 2 SD above the mean. As shown in the plot, the majority of fixations were made in areas of the page outside the red rectangle highlighting the information to retrieve (deadline to register for the admission tests).
TABLE 2 Demographic characteristics of the sample

Analysis of the viewing behavior on the home page using Markov chain
For this analysis, 48 participants from Group 1 and 53 from Group 2 having at least two fixations with known (x, y) coordinates in the home page were included. The fixations made by the two groups of participants in the home page are plotted in Figure 3 before (Figure 3(A,B)) and after (Figure 3(C,D)) defining the AOIs corresponding to the main buttons of the menu. Table 4 shows the mean transitions between AOIs for each task. The oriented graphs constructed using the transition probabilities between the defined AOIs for the two groups of participants are shown in Figure 4. The width of the arcs is proportional to the value of the corresponding transition probabilities. For better clarity, only values of transition probabilities ≥0.30 are reported. For both high school (Figure 4(A)) and university students (Figure 4(B)), it can be observed that the highest transition probabilities are either self-loops or are directed to the H region (the area of the page between the different AOIs).
FIGURE 2 Heat map and gaze plot of the home page for high school students. In the heat map, clusters of fixations are represented in colors ranging from red (areas of the page with more fixations) to green (areas with fewer fixations). The gaze plot (on the right) shows fixations in the exact order in which they occurred.
The matrices $P^{(g)}$ with $g \in \{1, 2\}$ of these transition probabilities were used as input for the steadyStates function of the markovchain package to obtain the stationary distributions (Table 5). No significant differences between the two transition matrices were found ($\chi^2$ = 76.32, p = 0.12), supporting the hypothesis that, on a global level, users with different levels of knowledge of the portal show similar behavior when searching for information on the home page. We then focused on Task 8, as for this task all three metrics showed differences in the performance of the two groups of participants (although only the difference in number of fixations survived multiple testing correction). For this task, significant differences were found in the transition matrices of the two groups of participants ($\chi^2$ = 115.29, p = 6.5E-5). This observation is in line with the differences observed in the qualitative analysis of scanpaths for this task (Figure 5). As shown in Figure 5, the majority of high school students, compared to university students, went back and forth between different AOIs and made several observations in the area outside any AOI.

DISCUSSION
FIGURE 5 Scanpaths of high school and university students performing Task 8. Participants with at least two fixations with known (x, y) coordinates in the home page are represented. The numbers from 1 to 5 indicate high school students, the numbers from 6 to 10 university students. The letters from A to G indicate areas of interest (AOIs) as defined in Figure 1, while the letter H indicates the area outside any AOI. The plots show the sequence of fixations in the different areas of the home page (X axis) over time in milliseconds (Y axis). The scanpaths show that the majority of high school students, compared to university students, looked back and forth between the different AOIs and made a high number of fixations in the area of the page outside any AOI.
In this study, we combined qualitative and quantitative analytical approaches to evaluate the web usability of the website of the University of Cagliari using eye tracking. The fact that all participants successfully completed the tasks, together with the observation that for the majority of tasks there was no difference between the performance of two groups of participants with different levels of previous knowledge of the website, suggests that the website offers good web usability. The lack of differences between the performance of high school and university students was also confirmed through the analysis of sequences of fixations in the main AOIs of the home page using a Markov chain. However, some tasks completed less efficiently by high school students than by university students point to specific pages that might benefit from some changes to make them easier to browse for users with little or no knowledge of the website. In Task 8 (in which participants had to open the page of the Human Sciences library), high school students showed a worse performance on all three metrics we used (time to completion, number of fixations and difficulty ratio), although only the number of fixations was significant after multiple testing correction. The worse performance shown by high school students might have been prompted by a different viewing behavior in the home page, in line with the observation of different transition probability matrices for the two groups of participants. Specifically, for university students the areas with the highest probabilities in the stationary distribution were the buttons of the menu corresponding to AOIs A, B, and G, while for high school students it was the area of the page outside any AOI (H). This might be related to the absence of a specific item of the menu labeled "libraries." This absence did not represent a problem for university students, as they might be more aware that libraries can be found among university buildings or services to students. While the majority of tasks did not prove particularly difficult to complete, Task 4 showed a high difficulty ratio, that is, students opened a much higher number of pages than the minimum number required to complete the task. For this task, it can be hypothesized that participants were not able to understand the meaning of specific links at first sight or failed to retrieve the information required to complete the task even after reaching the correct page. As shown in the gaze plot of the destination page of the task (Figure S1), only a small proportion of fixations was in the area containing the information to retrieve, while the large majority of participants scanned the menu. Therefore, most participants were not able to find the information at first sight and to understand that they had reached the final destination of the task. In light of this, it would be important to highlight relevant information such as the deadline to register for admission tests and possibly to place it in the upper part of the page.
This suggestion is also supported by the observation of a high concentration of fixations on the upper part of the web pages (Figure 2), which suggests that a typical user tends not to scroll a page but merely observes the visible part.
To conclude, while we observed a high efficiency for most analyzed pages, the combination of qualitative and quantitative analysis made it possible to suggest some changes to improve the web usability of the University of Cagliari's website, with a specific focus on users with little previous knowledge of the website.

DATA AVAILABILITY STATEMENT
Data available on request from the authors.

ACKNOWLEDGMENT
Open access funding enabled and organized by Projekt DEAL.