A scientometric review of research on traffic forecasting in transportation

Research on traffic forecasting in transportation has attracted worldwide attention over the past three decades. While there are comprehensive review studies on traffic forecasting, few of them explore the research advancement in this field from a visual perspective. With the help of CiteSpace and VOSviewer, this study uses a scientometric review to identify the evolution and emerging trends of research in the field. In total, 1536 bibliographic records with references are extracted from Web of Science and used as the dataset to form the author network, institutional network, keyword network, and co-citation network. The visualization of the results characterizes the research progress in the field. It is found that Eleni I. Vlahogianni receives the highest citation frequency and that China and the United States contribute most of the journal articles. Some influential institutions and articles are also identified. With the author keyword network, the keywords "recurrent neural network", "convolutional neural network", "spatio-temporal correlation", "traffic pattern", and "feature selection" are identified as the emerging trends. The document citation bursts also reveal that the application of combined models and the study of traffic flow forecasting in atypical situations are becoming emerging trends. This study provides a valuable reference for the research community in this field.


INTRODUCTION
The rapid growth in the number of motor vehicles has greatly accelerated the imbalance between infrastructure service capacity and traffic demand in transportation, especially in urban transportation. Taking Hong Kong as an example, the number of private cars has increased by more than 47.50% over the past 10 years, while the road length grew by less than 5% during the same period [1]. If not managed efficiently, this imbalance leads to serious problems for urban transportation, such as severe traffic jams, long travel times, frequent traffic accidents, and serious air and noise pollution. All these issues further reduce the efficiency of urban operations and people's satisfaction with their quality of life. Fortunately, intelligent transportation systems (ITS), which are capable of improving operational efficiency and system integration, have been put forward and adopted worldwide [2,3]. The provision of accurate and reliable real-time information and prediction of traffic parameters has become one of the core aspects of ITS success [4]. Also, such information is one of [...] while the latter focuses on predictions for the next few seconds through a few hours [7,9]. Comparatively, more attention has been drawn to the latter due to its capability of adaptive implementation. Traffic forecasting has been reviewed in several earlier articles. The first article [10] presents neural network applications in civil engineering, with traffic forecasting mentioned as one part of the article. Later, [8] gave a systematic review of algorithm developments for short-term traffic forecasting based on publications up to 2003, analysing the determination of scope, conceptual output specification, and modelling. Ten years later, in 2014, [9] highlighted ten existing challenges to short-term traffic forecasting with a top-down approach and suggested directions for future research efforts.
Based on the review work done in [9], [11] systematically surveys the progress on data-driven traffic forecasting models from 2014 to 2016 and reveals the latest technical challenges faced by traffic forecasting. As pointed out by [9], Artificial Intelligence (AI) is a key technology and can be treated as an excellent alternative for data mining in transportation applications. Recently, Deep Learning (DL) has attracted growing attention, and its applications in traffic forecasting are reviewed in [12,13]. These reviews indicate that DL is the most effective candidate, owing to its powerful capacity for processing non-linear data.
Traffic forecasting is a highly complex problem in the transportation domain, involving various data sources, model selections, and so on. Although a few articles have reviewed this topic comprehensively, some gaps still need to be bridged from a new perspective. On the one hand, previous literature reviews rely heavily on the authors' experience and subjective judgment in this field; in other words, they are based mainly on qualitative rather than quantitative methods. On the other hand, previous review articles lack a visualized analysis of all the studies on traffic forecasting, so they cannot reveal co-citation relationships and keyword evolution well. This results mainly from the huge number of research reports published on this topic over more than four decades. It is therefore almost impossible to review all of the studies manually, which motivates us to conduct this study in a different way.
Based on the above discussion, this study attempts to present a comprehensive review of the research on traffic forecasting from a different perspective, namely, a scientometric review, to give a more direct understanding of all the aspects concerned. The goal of this study is to comprehensively review all the relevant articles retrieved from the Web of Science (WoS) platform, ranging from the first article on traffic forecasting indexed by WoS to the latest ones indexed in 2019. With scientometric analysis, several tasks can be carried out automatically, including (i) constructing an author network in the field of traffic forecasting, (ii) developing an institution network in the field, (iii) ranking the journals that publish articles on this topic, (iv) establishing an article co-citation network, (v) presenting a keyword network analysis and burst detection, and (vi) identifying the emerging trends and technological evolution of traffic forecasting research.
The main contributions of this study are as follows: (1) for the first time, we introduce a scientometric approach to the field of traffic forecasting research to quantify the research progress in this field; (2) the technological evolution in this field is visualized, and emerging trends are identified and analyzed; and (3) influential institutions, journals, scholars, and documents are identified to provide a reference for the research community, while popular forecasting techniques are listed to provide benchmarks for comparison.
The remainder of this work is structured as follows. Section 2 provides a detailed specification of the review methodology, namely, scientometrics and data acquisition. Section 3 discusses the co-authorship network and institutional network, followed by the keyword co-occurrence network in Section 4. The journal co-citation network, author co-citation network, and document co-citation network are analysed in Section 5. Based on the scientometric analysis, Section 6 lists several challenges in the field of traffic forecasting. Section 7 summarizes the entire research effort and suggests some directions for future inquiry.

Methodology
Scientometric review, regarded as a quantitative study of science, is an important method for comprehensively evaluating and examining the development of a research field [14-16]. Unlike the traditional literature review, it can cover a wider range of articles: not only the articles themselves but also their citing articles can be reviewed simultaneously [16]. Additionally, with scientometric analysis and its supporting software, researchers can repeat the review work with little effort rather than relying on domain experts, so that up-to-date emerging trends are always available. In recent years, scientometric analysis has been applied by researchers across different domains, including recommendation systems [16], building information modelling [14], sustainability and sustainable development [17], genome-wide association [18], and so on.
In the present study, we employ CiteSpace and VOSviewer to conduct the scientometric analysis for the domain of traffic forecasting. Both are very useful knowledge visualization tools for researchers. CiteSpace is a powerful and popular visualization application developed by Dr. Chen to explore and visualize hot topics, emerging trends, and fundamental changes in a field over time [19]. Apart from generating clusters of authors, institutions, and co-citations, CiteSpace can be used to construct a co-occurrence network that links keywords appearing in the same article. More importantly, burst detection, one of its most critical functions, mines emerging trends with a dedicated algorithm for detecting sharp changes in the frequency of occurrences. This freely available application can be downloaded from its official website together with the related manual and books. In addition, VOSviewer is used for scientometric analysis on the same dataset. This visualization tool, developed by van Eck and Waltman, can analyze authors, citations, keywords, etc., and is distinctive in its clustering and visualization techniques [20]. Unlike CiteSpace, VOSviewer mainly illustrates the clustering relationships among nodes in terms of distance and density, which together help reveal the nature of research topics.

Data collection
Web of Science (WoS) is one of the literature search engines most frequently used by researchers in different fields of science, and it provides comprehensive citation indexing. Specifically, it provides the basic information on publications that users are concerned with, ranging from titles, authors, journals, organizations, keywords, and abstracts to citation records. The core collection of WoS covers over 18,000 high impact journals and more than 148 million records from all over the world, tracing back to the early 20th century [21]. Here, we retrieve the records of the selected documents from WoS. The database from which records are extracted is limited to the "Web of Science Core Collection" to ensure the validity of the data source for this study. The search rules for retrieving the related articles are as follows. (i) An advanced search is carried out with the query "(TS = (traffic AND forecast*) OR TS = (traffic AND predict*)) AND (TI = forecast* OR TI = predict* OR TI = estimat*)", where * is a wildcard, and TI and TS denote the article title and topic, respectively; (ii) the language is restricted to English and the document type is restricted to article, mainly because journal articles are usually subject to rigorous peer review; (iii) the timespan is set to "from 1975 to 2019"; and (iv) the citation indices include Science Citation Index Expanded (SCI-EXPANDED) and Social Sciences Citation Index (SSCI). In total, 4776 bibliographic records are retrieved, and a significant number of them are not related to traffic forecasting in transportation. In particular, the given search rules also retrieve articles on traffic forecasting in the domain of communication networks. Hence, those records need to be excluded manually to guarantee the relevance of the selected records to the topic under study. After selection, 1536 bibliographic records were downloaded in January 2020, with cited references included in order to conduct the co-citation analysis in Sections 3-5. CiteSpace and VOSviewer are employed to perform the scientometric analysis of the acquired bibliographies. CiteSpace is used to display simple networks, time zone diagrams, etc., while VOSviewer is used for some complex network analysis, mainly because some nodes in CiteSpace need to be positioned manually when a network is complex.
Figure 1 shows the number of articles on traffic forecasting in transportation published annually over the last 45 years. The overall trend is clearly upward and has accelerated in recent years, confirming that research on traffic forecasting in transportation is gaining more and more attention. The first article in this domain was published in 1976, using a non-linear parametric model for traffic forecasting [22]. Since then, the number of published articles in this domain has increased year by year. Specifically, in 2019, the number of publications peaked at 330, accounting for 21.48% of all the articles published during this period. One likely reason is that, along with the development of the transportation domain, new methods and their variants, especially machine learning and DL models, have been introduced into traffic forecasting in recent years.

AUTHOR NETWORK ANALYSIS AND DISCUSSION
In this section, we first use the collected records to generate a co-authorship network and use the statistics from CiteSpace to identify the most productive authors. Then, considering the authors' institutions and countries, we construct a network of institutions and nations to identify the leading institutions and nations in this field.

Co-authorship network
An author's productivity level can, to some extent, represent the researcher's efforts in the corresponding field [23]. A co-authorship network not only identifies the most productive authors in the field of traffic forecasting research, but also clearly and visually demonstrates the co-authorship relationships among those authors. CiteSpace is used to visualize the co-authorship network on traffic forecasting in transportation, which is depicted in Figure 2. The figure contains 804 nodes and 698 links, and the network density is 0.0022. The number of researchers in this field is relatively large, and the co-authorship network basically shows a "core-edge" structure. Overall, collaboration among the authors is relatively fragmented: only a small number of academic teams have formed, and in part of the network author collaboration is not significant. In Figure 2, each node represents an author, and a link between two nodes indicates that the two authors have co-authored a publication. The size of each node denotes the number of publications, while the thickness of a link signifies the strength of the authors' relationship. In addition, the colours of the links represent different time slices between 1975 and 2019. In terms of productivity, Yinhai Wang, a professor with the Department of Civil and Environmental Engineering at the University of Washington, is the most productive author in the domain of traffic forecasting. The other authors among the top seven, each with more than 10 publications, are Bin Ran, Li Li, Lei Zhang, J. W. C. van Lint, Constantinos Antoniou, and Jianhua Guo. The remaining top authors with more than eight publications are listed with their details in Table 1, where "Year" indicates the year in which the corresponding author published their first article in this field.
FIGURE 2 The co-authorship network on traffic forecasting in transportation
Furthermore, when the cooperative relationships are taken into consideration, several research communities can be seen in the collaborative network, with productive authors generally located at the centre of the communities to which they belong. The first major community is centred on Yinhai Wang, Jinjun Tang, and Yunpeng Wang, and also includes Dongfang Ma, Yajie Zou, Weibin Zhang, Jing Qin, and Fang Zong. Another large community is centred on Wei Huang, Jianhua Guo, Yun Wei, Jinde Cao, and Bin Ran.
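The density of 0.0022 reported for the co-authorship network can be checked from the node and link counts alone. The following is a minimal sketch, assuming that the density is the standard ratio of existing links to possible links in an undirected network; the function name is ours and is not part of CiteSpace.

```python
def network_density(num_nodes: int, num_links: int) -> float:
    """Density of an undirected simple network: existing links / possible links."""
    if num_nodes < 2:
        return 0.0
    possible_links = num_nodes * (num_nodes - 1) / 2
    return num_links / possible_links


# Node and link counts reported for the co-authorship network (Figure 2).
density = network_density(804, 698)
print(f"{density:.4f}")  # prints 0.0022
```

With 804 authors there are 322,806 possible pairs, of which only 698 are linked, which is consistent with the fragmented collaboration pattern described above.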

Network of institutions and nations
Here, a network of the institutions and nations that the authors come from is constructed with CiteSpace to explore the geographical distribution of these articles. Figure 3 displays the network and we can see clearly that 16 [...] Besides, it is worth noting that the links among these countries are considerably dense, indicating close cooperation between them on traffic forecasting research. Traffic congestion is a common problem in many cities and needs to be alleviated through the development of ITS, whose effective operation in turn relies on advances in traffic forecasting. Close cooperation among these nations can therefore help improve people's mobility. Another index of the network, called centrality, leads to a similar conclusion. Nodes with a high centrality score are those that connect two or more large groups of nodes [14,17], so the countries with high centrality are the critical ones in this research field. The countries with high centrality include USA

KEYWORD NETWORK ANALYSIS AND DISCUSSION
Keywords represent the core content of articles and reflect the development of a research topic over time [14]. In this section, we analyse author keywords by constructing keyword networks to explore the hotspots and emerging trends in the field of traffic forecasting research. There are two types of keywords in the WoS bibliography: author keywords, given by the authors, and KeyWords Plus, generated by WoS from the titles of cited references. Author keywords are used for the analysis because they usually express the core content of an article more accurately. Note that some author keywords are synonymous or differ only in expression, such as "traffic flow forecasting" and "traffic flow prediction", or "Gaussian process" and "Gaussian processes", and most scientometric software tools cannot identify such variants automatically. Hence, the author keywords need to be merged first to allow an accurate analysis in the next step. We clean the author keywords programmatically, mainly by replacing synonyms with a single form.
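The cleaning step can be illustrated with a short sketch. The script actually used is not reproduced here, so the synonym table, the function names, and the choice of lower-casing are assumptions made for illustration; only the two example pairs come from the text above.

```python
import re

# Hypothetical synonym table (variant -> canonical form). The two entries
# mirror the examples in the text; a real table would be compiled by hand.
SYNONYMS = {
    "traffic flow prediction": "traffic flow forecasting",
    "gaussian processes": "gaussian process",
}


def normalize_keyword(keyword: str) -> str:
    """Lower-case, collapse whitespace, then map known variants to one form."""
    cleaned = re.sub(r"\s+", " ", keyword.strip().lower())
    return SYNONYMS.get(cleaned, cleaned)


def normalize_record(keywords: list[str]) -> list[str]:
    """Normalize one article's keywords, dropping duplicates but keeping order."""
    seen = set()
    result = []
    for kw in keywords:
        canonical = normalize_keyword(kw)
        if canonical and canonical not in seen:
            seen.add(canonical)
            result.append(canonical)
    return result


print(normalize_record(["Gaussian processes", "Gaussian process", "Traffic  flow prediction"]))
# prints ['gaussian process', 'traffic flow forecasting']
```

Merging before counting matters because otherwise the occurrences of one concept are split across several nodes, which understates its weight in the keyword network.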

Network of co-occurring keywords
VOSviewer is used to analyze and display the network of co-occurring keywords, which can reveal research hotspots. Compared with other scientometric software tools, VOSviewer is more suitable for the graphical representation of scientometric maps and is particularly good at presenting large maps in an easily interpretable way [20]. In total, 3713 keywords are extracted from the selected articles, of which only 605 meet a threshold of two, that is, occur at least twice. Here, we set the threshold to four and obtain 159 keywords to generate the network of co-occurring keywords, which is depicted in Figure 4. As in the CiteSpace networks, each node in Figure 4 represents a keyword. The size of a node or its label expresses the occurrence frequency of the corresponding keyword, meaning that the larger the node or label is, the more important the keyword is. The distance between two nodes indicates the strength of the relationship between the two keywords: the closer two nodes are to each other, the more frequently the two keywords appear together [24]. In addition, the colour of a node indicates the cluster to which it belongs. The largest node in Figure 4 is "traffic flow", with 136 occurrences (see Table 2), followed in descending order by "neural network" (103), "travel time" (78), "ITS" (77), "deep learning" (51), "short-term forecasting" (46), "time series" (41), "support vector machine" (39), "Kalman filtering" (33), and "genetic algorithm" (27). These keywords represent the research hotspots in the field of traffic forecasting.
FIGURE 5 Overlay visualization of the network of co-occurring keywords
Moreover, several of these nodes lie very close to one another, including "traffic flow", "neural networks", "travel time", "intelligent transportation systems", "deep learning", "support vector machines", "short-term forecasting", etc. This means that the research domain primarily focuses on using neural networks, DL, and support vector machines to study traffic flow prediction and travel time prediction in ITS, especially for short-term forecasting. Note that the temporal aspect is not reflected in Figure 4. We therefore also need to discover which keywords are emerging hotspots, which are slowly fading away, and which are historical research hotspots. To do so, the overlay visualization of keywords (see Figure 5) is generated with VOSviewer, which shows the evolution of research topics in the field of traffic forecasting over time in [...], which can describe the emerging hotspots well. In the network shown in Figure 5, the values $Y_i$ are distinguished by different colours in the overlay visualization, with warm colours (e.g. red, orange, and yellow) indicating keywords that occur mainly in recent articles, and cold colours (e.g. green, turquoise, and blue) indicating keywords that occur in earlier articles. In Figure 5, some of the themes in red and orange occur in articles published in recent years, including "deep learning", "road traffic", "learning (artificial intelligence)", "convolutional neural network", "feature selection", "long short-term memory network", "spatio-temporal correlation", "attention mechanism", "deep belief network", and so on, indicating that they are the emerging hotspots in the field of traffic forecasting research. Green nodes, such as "traffic flow" and "time series", indicate that researchers' interest in them is more evenly distributed over time.
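The counting that underlies such a keyword network, namely the occurrence threshold and the co-occurrence link weights, can be sketched as follows. This is only an illustration of the counting step under our own assumptions (a keyword is counted once per article, and a link weight is the number of articles containing both keywords); it does not reproduce VOSviewer's layout or clustering, and the toy records are made up.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(records, min_occurrences):
    """Return (keyword counts, link weights) for keywords meeting the threshold.

    records: iterable of keyword lists, one list per article.
    """
    # Count each keyword once per article.
    counts = Counter(kw for record in records for kw in set(record))
    kept = {kw for kw, n in counts.items() if n >= min_occurrences}

    # A link's weight is the number of articles that contain both keywords.
    links = Counter()
    for record in records:
        present = sorted(set(record) & kept)
        for pair in combinations(present, 2):
            links[pair] += 1
    return {kw: counts[kw] for kw in kept}, dict(links)


# Toy records, made up for illustration (not taken from the dataset).
records = [
    ["traffic flow", "neural network", "deep learning"],
    ["traffic flow", "neural network"],
    ["travel time", "kalman filtering"],
    ["traffic flow", "travel time"],
]
nodes, links = cooccurrence_network(records, min_occurrences=2)
print(nodes)
print(links)
```

In this toy example, "deep learning" and "kalman filtering" occur only once and are dropped by the threshold of two, in the same way that raising the threshold from two to four reduces the 605 keywords to 159 in the analysis above.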

Emerging trends based on keyword bursts
The word "burst" refers to a large change in the value of a variable over a short period of time [25]. Analysing the bursts of items, such as keywords or citations, is regarded as an important way to identify emerging trends in CiteSpace [26]. For instance, if the occurrence frequency of the keyword "recurrent neural network" has increased at a significantly faster rate than that of other keywords over the last three years, this keyword can be considered a keyword with a strong burst. The emergence of such keywords indicates that scholars in the field have paid or are paying particular attention to them. As shown in Figure 6, several keywords with the strongest citation bursts are identified from 1997 to 2019, listed by the time at which their bursts begin. The early keywords with the strongest citation bursts are "neural network" and "travel time", whose bursts last for nine and 12 years, respectively. However, the bursts of these two keywords did not persist over time. More recently, "recurrent neural network", "convolutional neural network", "spatio-temporal correlation", "traffic pattern", and "feature selection" have become the keywords with the strongest citation bursts. Consequently, we can conclude that using recurrent neural networks (RNNs), convolutional neural networks (CNNs), or their variants, and analysing spatio-temporal correlation, traffic patterns, or feature selection to conduct traffic forecasting have become emerging trends in the field.
These forecasting techniques are emerging trends mainly because of their good capabilities in traffic forecasting. RNNs, for example, place few requirements on the distribution of time series data and can learn complex patterns within them, rather than just mechanically targeting certain fixed factors. In addition, there are spatio-temporal correlations between the traffic states of adjacent roads, and CNNs can automatically and efficiently capture spatial information through convolutional operations. Of course, these popular DL methods also have drawbacks. First, building a network is complex and many hyper-parameters need to be set, and improper data processing or inappropriate parameter settings make good results hard to achieve. Second, a large amount of data is required, and the training process is time-consuming and demands substantial hardware resources. Third, the generalization capability can be problematic.
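The intuition behind a keyword burst, as in the "recurrent neural network" example above, can be sketched with a simple growth comparison. This is a deliberately simplified heuristic and not the burst-detection algorithm implemented in CiteSpace; the three-year window follows the example in the text, while the cut-off `factor`, the function names, and the yearly counts are hypothetical.

```python
def growth_ratio(counts, window=3):
    """Ratio of the mean count in the last `window` years to the mean before."""
    recent, earlier = counts[-window:], counts[:-window]
    if not earlier or sum(earlier) == 0:
        return float("inf") if sum(recent) > 0 else 0.0
    return (sum(recent) / len(recent)) / (sum(earlier) / len(earlier))


def bursting_keywords(yearly_counts, window=3, factor=2.0):
    """Flag keywords whose recent growth outpaces that of all other keywords.

    yearly_counts: dict mapping keyword -> list of yearly counts (equal length).
    factor: hypothetical cut-off for how much faster than the rest is a burst.
    """
    flagged = []
    for keyword, counts in yearly_counts.items():
        # Pool every other keyword into one background series.
        others = [c for k, c in yearly_counts.items() if k != keyword]
        background = [sum(year) for year in zip(*others)]
        if growth_ratio(counts, window) > factor * growth_ratio(background, window):
            flagged.append(keyword)
    return flagged


# Made-up yearly counts for six consecutive years (illustration only).
toy_counts = {
    "recurrent neural network": [0, 1, 1, 4, 9, 15],
    "neural network": [10, 11, 12, 12, 13, 14],
    "kalman filtering": [5, 5, 4, 5, 4, 5],
}
print(bursting_keywords(toy_counts))  # prints ['recurrent neural network']
```

A frequently used keyword such as "neural network" is not flagged here because its frequency is high but stable, which matches the distinction drawn above between historical hotspots and emerging trends.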

Technological evolution based on timezone view
With CiteSpace, the technological evolution of this field can be further identified based on the changes in keyword frequency over time. Specifically, a timezone map can be generated to display the characteristics of the primary keywords and their relationships visually. It is useful for analyzing and summarizing the technological evolution of traffic forecasting and for predicting future research trends in this field. The timezone map is shown in Figure 7 and covers the period from 1996 to 2019. Articles published before 1996 are not included in this part because they make up only a small percentage of the total publications and, more importantly, most of them have no keywords. In addition, keyword nodes that first occur in the same year are grouped into the same one-year time slice.
As can be seen in Figure 7, the largest node is "traffic flow", followed by "neural network", "travel time", "deep learning", and other keywords. The network structure formed by these keywords is complex and spans a relatively long period. The technologies involved include neural networks, Kalman filters, genetic algorithms, support vector machines, random forests, long short-term memory, etc., indicating that the research methods are diversified. Combining the timezone view with the current development status of traffic forecasting, the technological evolution of this field can be divided into three stages as follows.
FIGURE 7 Technological evolution
The first phase (1996-2003) is the inception of traffic forecasting research. Although this period lasts for eight years, only 85 articles were published, accounting for only about 5.53% of the total publications, which indicates that little attention was paid to this area during that time. The half-life of the keywords "neural network" and "Kalman filter", the most frequent keywords in that period, is long, indicating that these technologies have had a profound impact on the study of traffic forecasting and laid the foundation for research in this field.
The second phase (2004-2015) is a period of steady growth in traffic forecasting studies. The techniques used during this period are more diverse. Some parametric optimization methods, such as genetic algorithms and particle swarm algorithms, were introduced into the forecasting techniques to further improve forecasting performance. Support vector machines received a lot of attention during this period due to their good prediction performance. With the rise of machine learning and the development of big data technology, data mining in the field of traffic forecasting gradually received attention, and some new forecasting techniques, such as Bayesian networks and random forests, were applied to this field and became popular.
The third phase (2016-2019) is a period of explosive growth in traffic forecasting research. In terms of volume, the number of publications reached 779 in just four years, accounting for more than 50% of the total. Traffic forecasting is influenced by several factors, such as weather, holidays, traffic accidents, and the structure of the road network. DL forecasting techniques were introduced into the field during this period and proved to be among the most effective forecasting techniques due to their ability to handle non-linear data. CNNs, RNNs, and long short-term memory networks were widely used during this period, while deep belief networks also received some attention. As traffic prediction research was further refined, feature selection and spatio-temporal correlation methods were extensively developed to further improve prediction performance. Notably, the attention mechanism has received more attention in the past year due to its practical and superior capabilities.
In general, traffic prediction techniques can be broadly divided into two categories: parametric methods and non-parametric methods. Parametric methods include, for example, time series models and Kalman filters. These methods usually have a fixed number of parameters and are developed under assumptions suited to steady-state conditions, which is unrealistic for variable traffic states. In contrast, non-parametric methods do not impose many assumptions and are more suitable for traffic forecasting. Non-parametric approaches to traffic forecasting include Bayesian networks, k-nearest neighbor algorithms, support vector machines, artificial neural networks, multi-layer perceptrons, and DL methods. As can be seen from the technology evolution diagram in Figure 7, scholarly interest in traffic prediction techniques has generally moved from parametric models to machine learning, and then to DL. In addition, we suggest that these historically and currently popular approaches be used as baseline models for comparison when researchers demonstrate the advancement of their proposed methods.

Journal co-citation network
VOSviewer is used to generate a network of the source journals that published these 1536 articles, and this network file is then converted into Excel format to explore the distribution of the journals. The 13 top journals for traffic forecasting are extracted, as described in Table 3. The first journal, Transportation Research Record, has published 132 articles on this topic, followed closely by IEEE Transactions on Intelligent Transportation Systems (122 articles) and Transportation Research Part C-Emerging Technologies (120 articles). These three top source journals are all professional journals in the field of transportation and together account for 24.35% of the total number of articles. In addition, IEEE Access and IET Intelligent Transport Systems have each published more than 60 articles on this topic. Among these 13 journals, nine are published in the USA. Furthermore, the references cited by these 1536 articles are investigated with CiteSpace, and a journal co-citation network with 552 nodes and 2049 links is generated to identify the most frequently cited journals, which may play important roles in the development of the field. The journal co-citation network is depicted in Figure 8. Similar to Figure 2, the size of a node indicates its frequency, that is, the number of cited references published in the corresponding journal. The three most cited journals in the field of traffic forecasting are Transportation Research Record (frequency = 969), Transportation Research Part C-Emerging Technologies (frequency = 951), and IEEE Transactions on Intelligent Transportation Systems (frequency = 732). They are also the three top source journals in Table 3.
Interestingly, although Transportation Research Part B: Methodological (frequency = 624) and Journal of Transportation Engineering-ASCE (frequency = 533) are not among the top five journals in terms of the number of articles published in the field, they are in the top five in terms of citation frequency, indicating that the articles on traffic forecasting published in these journals are quite influential.
FIGURE 8 The journal co-citation network
Another index, called betweenness centrality, is used to identify the pivotal nodes in Figure 8. The pivotal journals with high centrality shown in

Author co-citation network
The author co-citation network is applied to analyse the relationships between authors. Two authors are said to be co-cited if articles by both authors are cited by the same article and both names appear in its references. Note that an article may have several co-authors and, when it is cited, only some of the authors may be listed in the references. Thus, authors whose names do not appear in the references are not treated as co-cited authors. With this network, we can understand how the scientific community evolves over time [14,23] and thus, eventually, figure out the intellectual structure of traffic forecasting research. As shown in Figure 9, the diversity of countries where these authors come from shows that the growth of the field of traffic forecasting research is worldwide. In terms of the betweenness centrality of the nodes, the authors who play a pivotal role are Howard R. Kirby (centrality = 0.28), Mukhtiar S. Ahmed (0.18), Nancy L. Nihan (0.14), Enrique Castillo (0.13), Federal Highway Administration (0.13), Moshe Ben-Akiva (0.12), and Gary A. Davis (0.11). They are the critically important nodes in the network, acting as intermediaries and helping to connect different research communities. Note that one of the authors with the highest centrality is a government agency, the Federal Highway Administration. This agency, which is part of the United States Department of Transportation, produces a large number of publications each year to introduce the public to some domains of transportation, especially new technologies in transportation. This finding is encouraging for the global push for traffic forecasting development and also demonstrates that the practices of traffic management agencies, as well as the work of scholars, are driving the field forward. 
When comparing the highly cited authors with the high centrality authors, an important conclusion is that a highly cited author is usually not a high centrality author, which is consistent with the finding reported in [27].
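The co-citation rule for authors described in this subsection can be sketched in a few lines. This is a minimal illustration under our own assumptions, not the procedure implemented in CiteSpace: each citing article is represented by the author names printed in its reference list, and the author labels are hypothetical placeholders.

```python
from collections import Counter
from itertools import combinations

def author_cocitation_counts(reference_lists):
    """Count how often two cited authors appear in the same reference list.

    reference_lists: one entry per citing article; each entry is a list of
    author names exactly as they are printed in that article's references.
    Authors who are not printed in the references never enter the count.
    """
    counts = Counter()
    for cited_authors in reference_lists:
        # An author counts once per citing article, however many of
        # their works that article cites.
        unique = sorted(set(cited_authors))
        for pair in combinations(unique, 2):
            counts[pair] += 1
    return counts

# Hypothetical toy data: three citing articles and the author names
# listed in their references.
citing = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author B", "Author C", "Author B"],
]
counts = author_cocitation_counts(citing)
print(counts[("Author A", "Author B")])  # co-cited by the first two articles
```

The pair counts become the link weights of the author co-citation network, and the number of citing articles per author gives the node size.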

Article co-citation network
When two documents are cited simultaneously by a third document, the first two documents have a co-citation relationship [28]. Document citation analysis is very effective in establishing the underlying knowledge structure of a research domain and identifying its evolutionary process. Unlike common citation analysis, article co-citation analysis is a citation network analysis method that selects representative articles and utilizes network analysis to establish the relationships among the selected articles. In this co-citation network, the importance of a node is reflected not only by its citation frequency, but also by its connectivity in the network. As previously discussed, the 1536 articles in this scientometric review cite 29,592 references. Among these references, 136 are cited more than 25 times and are used to generate the co-citation network with VOSviewer, as shown in Figure 10. In this network, the largest node represents article [3], which receives the highest citation frequency (citation weight) of 220, while its number of citations in WoS is 460 up to February 2020.
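The selection step described above (keep only references cited more than a threshold number of times, then link the references that are cited together) can be sketched as follows. This is an illustration under our own assumptions rather than VOSviewer's implementation; the reference labels and the toy threshold are hypothetical, whereas the study applies a threshold of 25 to 29,592 references.

```python
from collections import Counter
from itertools import combinations

def cocitation_network(reference_lists, min_citations):
    """Return (nodes, links) for references cited more than `min_citations` times.

    nodes: {reference: citation frequency}
    links: {(ref_a, ref_b): number of citing articles that cite both}
    """
    freq = Counter()
    for refs in reference_lists:
        freq.update(set(refs))  # one citation per citing article
    kept = {ref for ref, n in freq.items() if n > min_citations}

    links = Counter()
    for refs in reference_lists:
        selected = sorted(set(refs) & kept)
        for pair in combinations(selected, 2):
            links[pair] += 1
    nodes = {ref: freq[ref] for ref in kept}
    return nodes, links

# Hypothetical toy corpus of four citing articles.
corpus = [
    ["R1", "R2", "R3"],
    ["R1", "R2"],
    ["R1", "R4"],
    ["R2", "R1"],
]
nodes, links = cocitation_network(corpus, min_citations=1)
print(nodes)
print(dict(links))
```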

Analysis of highly cited articles
The size of each node in Figure 10 denotes the citation frequency of the corresponding article. The largest node in the graph is [3]. For traffic flow forecasting, this article compares a non-parametric regression model, called nearest-neighbour non-parametric regression, with a classical parametric modelling method, called seasonal autoregressive integrated moving average (ARIMA). It verifies the theoretical basis of non-parametric regression. Although non-parametric models do not perform as well as seasonal ARIMA, the proposed heuristic forecast generation technique, which is based on the concept of non-parametric models, significantly improves the forecasting performance of non-parametric regression. In particular, this article highlights other ways to further improve the forecasting performance of non-parametric regression models as a future research direction, which has become the foundation and guideline for the research of subsequent scholars. This is the major reason that this article has been cited so many times. Article [29], with 190 citations, is in second position among the top 10 highly cited articles. This article introduces two models based on Kalman filter theory for short-term traffic volume forecasting and is regarded as the first to use the Kalman filter for traffic volume forecasting [30]. Since then, a large number of Kalman-filtering-based methods and variants have been proposed for traffic forecasting, and article [29] has received much attention and many citations. The third place goes to [2] with 180 citations. This study successfully applies seasonal ARIMA based on Wold decomposition theory to traffic flow forecasting. The methods proposed by [2,29] are typical approaches for eliminating the daily-periodic trend in traffic prediction [31]. Article [32] occupies the fourth place with 149 citations. 
This article focuses on the application of DL models for traffic flow prediction and receives 149 citations in less than five years. This is largely due to the fact that the explosive growth of big data and machine learning models in recent years has driven a shift in traffic forecasting from traditional simple time series forecasting methods to DL models. This article is a pioneering work that applies DL models to traffic forecasting and has become highly popular in the field.

FIGURE 10 Document co-citation network generated by VOSviewer
The fifth, sixth, and eighth highly cited articles, [8,9,33], are published by the same authors. The first two are review articles in which the authors provide a comprehensive analysis of traffic forecasting achievements up to 2004 and 2014, respectively, and point out future research directions. The other article extends the previous research by using genetic algorithms to optimize neural networks for predicting short-term traffic flows, with satisfactory results. This idea has inspired many scholars in this field. Like the other highly cited articles, these three articles are quite instrumental in advancing traffic forecasting research. In particular, a recent study [11] makes efforts to cope with some of the challenges pointed out in these articles. Article [4] is ranked seventh in the citation network; it develops multivariate time series state space models to predict downstream traffic flows on urban arterial roads based on upstream data. The 30 most highly cited documents in the co-citation network are presented in Table 4.

High centrality documents
Articles with high betweenness centrality are also very important in the co-citation network, as they often act as a bridge in the evolution of knowledge domains [27]. The betweenness centrality of each article or node is computed in CiteSpace. In total, three articles in the network have a centrality value of at least 0.1. The book entitled "Neural networks for pattern recognition" has the highest centrality (0.2); it gives the first comprehensive introduction to feed-forward neural networks from the viewpoint of statistical pattern recognition [55]. This book has had a huge impact on many fields, including traffic forecasting, and is cited over 30,000 times in Google Scholar up to February 2020. Neural networks have also become the most widely used model in the field of traffic forecasting, as can be seen in Figure 4. The second highest centrality (0.11) corresponds to [56], which introduces the counter propagation network for freeway link travel time forecasting. The last one, with centrality 0.10, is [57], which provides a comprehensive review of neural networks and their applications in transportation. All these publications can be considered major intellectual turning points in the field of traffic forecasting, although they are not cited very frequently in the co-citation network.
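To make the notion of betweenness centrality concrete, the sketch below computes the normalized betweenness of every node in a small undirected, unweighted graph with Brandes' algorithm. It is an illustration under our own assumptions, not the CiteSpace implementation; the toy graph and node labels are hypothetical. A node scores highly when many shortest paths between other nodes pass through it, which is why bridging documents can have high centrality without being frequently cited.

```python
from collections import deque

def betweenness_centrality(adjacency):
    """Normalized betweenness centrality (undirected, unweighted graph).

    adjacency: dict mapping each node to the set of its neighbours.
    Brandes' algorithm: one breadth-first search per source node, followed
    by a backward accumulation of path dependencies.
    """
    centrality = {v: 0.0 for v in adjacency}
    for source in adjacency:
        stack = []
        predecessors = {v: [] for v in adjacency}
        sigma = {v: 0 for v in adjacency}  # number of shortest paths from source
        sigma[source] = 1
        dist = {v: -1 for v in adjacency}
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:              # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:   # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)
        delta = {v: 0.0 for v in adjacency}
        while stack:                         # farthest nodes first
            w = stack.pop()
            for v in predecessors[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    n = len(adjacency)
    # Every unordered pair was counted from both endpoints, and there are
    # (n - 1)(n - 2) / 2 pairs excluding the node itself, so the combined
    # scaling factor is 1 / ((n - 1)(n - 2)).
    scale = 1 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: c * scale for v, c in centrality.items()}

# Hypothetical toy graph: C and D bridge the leaves A, B and E.
graph = {
    "A": {"C"},
    "B": {"C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
bc = betweenness_centrality(graph)
print({node: round(value, 3) for node, value in sorted(bc.items())})
```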

References with citation bursts
Reference citation bursts, which describe the dynamics of a field to some extent, indicate that the scientific community has paid or is paying particular attention to the underlying contributions [19]. In terms of the strength of citation bursts, the top 30 references are generated from CiteSpace, as shown in Figure 11. Instead of discussing them all, we apply the following rule to select some of them for special analysis. The top 30 references are first divided into several groups according to the time when the citations begin to increase, and the reference with the strongest burst is chosen from each group for analysis. The start time of citation bursts of the 30 [53] with the strongest bursts is chosen. The selected references are shown in Table 5. From Table 5, we observe that the first article [58] in the table starts its citation burst in 2000. In this article, the authors first classify the historical link travel times and, for each class, a modular neural network is calibrated and then employed to predict the link travel time. The superiority of this approach is verified empirically. The article [59] receives the strongest burst that starts in 2003. This work identifies four different classes of time series based on white noise tests and uses them for single-step highway traffic volume prediction. Results demonstrate the superiority of a subset ARIMA model over the full ARIMA model. Article [3] has the strongest burst starting in 2004. Note that this article not only has the strongest citation burst among all the references, but also receives the highest citation frequency in the document co-citation network, further highlighting its significance to this field. The citation bursts starting in 2005 are led by [60], which proposes a linear model in which the coefficients vary as a smooth function of the departure time to predict highway travel time. 
Articles [4] and [33], which are among the top ten highly cited documents in Table 4, have the strongest bursts starting in 2006 and 2007, respectively. This also indicates that these two articles play important roles in the field.
The strongest burst from 2008 goes to [53], which presents a Bayesian combined neural network that linearly combines two single predictors, the back propagation and the radial basis function neural networks. This article also appears in the top 30 highly cited documents, partly because the idea of combining forecasts has come into focus in the field. As the strongest burst from 2009, [61] proposes an online learning method based on extended Kalman filtering to improve state-space neural networks; this method is applicable to situations where the real travel time is not available. The strongest burst from 2010 goes to [62], which improves the approach proposed by [53] and presents a nonlinear combination of an online adaptive Kalman filter and a neural network model. The next article is [63], which provides an aggregation approach, a combination of several classical predictive models. The most recent strongest burst, from 2013, goes to [35], which proposes an online support vector machine regression model for traffic flow prediction under atypical conditions. Note that this article is also listed in the top 30 highly cited documents.
It follows from the analysis of these critical article citation bursts that some of the highly cited articles in the citation network also have strong citation bursts, suggesting that these articles contribute significantly to the field; and recent citation bursts indicate that the application of combined models to traffic forecasting and the study of traffic flow forecasting in atypical situations, such as unavailability of actual data, traffic accidents, congestion etc., attract widespread interest of researchers in this field.
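The selection rule used in this subsection (group the references by the year in which their citation burst starts and keep the strongest burst in each group) can be written compactly. The sketch below is only an illustration: the reference labels and burst strengths are hypothetical placeholders and are not the values reported in Figure 11 or Table 5.

```python
def strongest_burst_per_start_year(bursts):
    """Pick the reference with the strongest burst for each start year.

    bursts: iterable of (reference, start_year, strength) tuples.
    Returns {start_year: (reference, strength)}, sorted by start year.
    """
    best = {}
    for reference, year, strength in bursts:
        if year not in best or strength > best[year][1]:
            best[year] = (reference, strength)
    return dict(sorted(best.items()))

# Hypothetical placeholder records, NOT the values from Figure 11.
records = [
    ("ref-a", 2004, 6.1),
    ("ref-b", 2004, 9.4),
    ("ref-c", 2008, 5.0),
    ("ref-d", 2000, 4.2),
    ("ref-e", 2008, 3.7),
]
selected = strongest_burst_per_start_year(records)
for year, (reference, strength) in selected.items():
    print(year, reference, strength)
```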

CHALLENGES IN THE FIELD OF TRAFFIC FORECASTING
Existing publications and reviews have already depicted the challenges in this field. In this section, we expand and present the latest challenges based on the results of the scientometric analysis.

Forecasting models
The preceding scientometric analysis suggests that deep neural networks (DNN) are of interest to the research community and are an emerging hotspot in the field of traffic forecasting. In engineering practice, data acquisition has become easy, yet training DNN with massive traffic data remains one of the biggest challenges. DNN models are trained through many iterations, and traditional training methods are extraordinarily time-consuming, which makes them unsuitable for practical applications. In addition, combined forecasting models also attract researchers' attention at present, but their training time is usually longer than that of a single model. How to speed up the training of DNN models is a challenging issue, and combined models are a focus of future research.
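As a minimal illustration of what a combined forecasting model does, the sketch below blends the outputs of two single predictors with weights that are inversely proportional to their recent mean squared errors. This weighting scheme is our own simplifying assumption for illustration; it is not the Bayesian combination of [53] or the method of any other cited article, and the numbers are hypothetical. It also assumes that neither predictor has a zero recent error.

```python
def inverse_error_weights(errors_a, errors_b):
    """Weights proportional to the inverse mean squared error of each predictor.

    errors_a, errors_b: recent forecast errors (forecast minus observation).
    Assumes both mean squared errors are strictly positive.
    """
    mse_a = sum(e * e for e in errors_a) / len(errors_a)
    mse_b = sum(e * e for e in errors_b) / len(errors_b)
    inv_a, inv_b = 1 / mse_a, 1 / mse_b
    total = inv_a + inv_b
    return inv_a / total, inv_b / total

def combine(forecast_a, forecast_b, weights):
    """Linear combination of two single-model forecasts."""
    w_a, w_b = weights
    return w_a * forecast_a + w_b * forecast_b

# Hypothetical recent errors (vehicles per interval) of two predictors.
weights = inverse_error_weights(errors_a=[2, -2], errors_b=[4, -4])
combined = combine(forecast_a=100.0, forecast_b=110.0, weights=weights)
print(weights, combined)
```

The more accurate predictor receives the larger weight, so the combined forecast stays closer to it. Both single models must still be trained, which is consistent with the longer training time of combined models noted above.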

Forecasting scope
Spatio-temporal correlation and feature selection have become hotspots of traffic forecasting, indicating that researchers are no longer limited to studying predictions at individual points. As demonstrated in the earlier reviews, network-wide forecasting continues to attract attention and is seen as a challenge, especially when the road network is complex. For example, in an urban road network, the traffic flow at one point may be influenced by multiple points nearby. How to identify the spatial and temporal correlation of different points based on the road network characteristics for more accurate and efficient prediction is a direction for future research.
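One simple way to examine the spatio-temporal correlation between two points is to correlate the upstream series with the downstream series shifted by different time lags. The sketch below is an illustration under our own assumptions (two detectors, equally spaced intervals, hypothetical flow values); it is not taken from any of the reviewed articles.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def lagged_correlation(upstream, downstream, max_lag):
    """Correlation between upstream[t] and downstream[t + lag], lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = upstream[: len(upstream) - lag]
        y = downstream[lag:]
        result[lag] = pearson(x, y)
    return result

# Hypothetical flows: the downstream detector repeats the upstream
# pattern two intervals later (the first two values are arbitrary).
upstream = [10, 12, 15, 20, 18, 14, 11, 13, 17, 21, 19, 15]
downstream = [9, 10] + upstream[:-2]

corr = lagged_correlation(upstream, downstream, max_lag=4)
best_lag = max(corr, key=corr.get)
print(best_lag, round(corr[best_lag], 3))
```

The lag with the highest correlation suggests how far back the upstream observations should be taken when they are used as input features for the downstream forecast.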

CONCLUSIONS AND FUTURE WORK
The importance of traffic forecasting in ITS is gaining more and more recognition among a growing number of researchers and practitioners. A comprehensive review of this topic helps to capture its evolution over time and clarify future research directions. In this study, author network analysis, institutional network analysis, keyword network analysis, and co-citation network analysis are performed on 1536 bibliographic records from WoS using scientometric methods with CiteSpace and VOSviewer. These methods are used to statistically identify the hotspots and emerging trends in the field of traffic forecasting research.
The number of annually published articles on traffic forecasting shows a steady increase over the years, which is further evidence of the increasing attention and effort devoted to this field. The analysis of the co-authorship network reveals that highly productive authors in this field include Yinhai Wang, Bin Ran, Li Li, Lei Zhang, J. W. C. van Lint, Constantinos Antoniou, and Jianhua Guo. Moreover, the analysis of the author co-citation network shows that Eleni I. Vlahogianni, Brian L. Smith, Billy M. Williams, Iwao Okutani, J. W. C. van Lint, Anthony Stathopoulos, Yisheng Lv, and others are the highly cited authors in the co-citation network. The comparisons indicate that, on the one hand, a few authors are both highly productive and highly cited, such as J. W. C. van Lint and Eleni I. Vlahogianni, who deserve special attention; on the other hand, highly productive authors are not always highly cited, and some authors are highly cited for their fundamental contributions to the field, even if they have a small number of publications in this field.
As for the geographic distribution of these journal articles on traffic forecasting, China and the United States contribute the majority of publications. At the same time, Southwestern University, Beijing Jiaotong University, Tsinghua University, Delft University of Technology, University of Maryland, Tongji University, and other institutions are the most productive institutions in terms of quantity. The diversity of countries/regions and institutions indicates that traffic forecasting becomes a global concern. The complexity of the network links also shows that some institutions have worked or are working closely together.
The analysis of the journal citation network shows that Transportation Research Record, Transportation Research Part C-Emerging Technologies, and IEEE Transactions on Intelligent Transportation Systems are not only the top three source journals on traffic forecasting but also the top three most cited journals in the field. In terms of citation frequency, Transportation Research Part B: Methodological and Journal of Transportation Engineering-ASCE are also influential journals in the field of traffic forecasting, although they do not have the highest volume of publications.
This study analyses the author keywords and draws important conclusions from three perspectives. First, the words "traffic flow", "neural network", "travel time", "ITS", "deep learning", and "short-term forecasting" appear most frequently, indicating that the hotspots in the field mainly focus on using neural networks or DL to study traffic flow or travel time in ITS. The overlay visualization of the keyword co-occurrence network shows that the words "deep learning", "road traffic", "learning (artificial intelligence)", "convolutional neural network", "feature selection", "long short-term memory network", "spatiotemporal correlation", "attention mechanism", and "deep belief network" are the emerging hotspots in this field. Second, keyword burst analysis reveals that the words "recurrent neural network", "convolutional neural network", "spatio-temporal correlation", "traffic pattern", and "feature selection" are becoming the emerging trends. It is recommended that researchers pay more attention to these aspects. Finally, the analysis shows that the technological evolution of this field can be divided into three stages.
In the document co-citation network, the main highly cited documents apply then-new and later popular methods to traffic forecasting, gaining significant attention from the research community. The analysis of reference citation bursts indicates that some of the highly cited articles in the citation network also have strong citation bursts, while recent citation bursts reveal that the applications of combined models to traffic forecasting and study of traffic flow forecasting in atypical situations, such as the unavailability of actual data, traffic accidents, congestion etc., attract widespread interest in the research community and may also become the emerging trends in the near future.
This study provides valuable information and describes the global landscape of traffic forecasting research through visualization techniques. It not only provides the research community with influential institutions, journals, scholars, and documents in the field, as well as some insights into research hotspots and emerging trends, but also provides a reference for practitioners to select strong institutions and people as partners. Also, the most commonly used models identified by this study can serve as baselines for comparison when the research community validates new models. Future research on traffic forecasting may focus on the emerging trends and hotspots identified in this study.