Predicting cyclone tracks in the north Indian Ocean: An artificial neural network approach



[1] Predicting cyclone tracks in the Indian Ocean has been a challenging problem. In this paper, we used the past 12 hours of observations (two positions at 6-hourly intervals and the present position) to predict the position of a cyclone 24 hours in advance in terms of latitude and longitude. For this purpose we adopted an artificial neural network approach using 32 years (1971–2002) of tropical cyclone best track analysis over the Indian Ocean. The mean absolute error between the estimated and actual latitude (longitude) is 0.75 (0.87) degrees, with a correlation coefficient of 0.98 (0.99), for the prediction data set that was not used for developing the model. The mean distance between the best track and the predicted positions is 137.5 km. Forecasts for 12, 36, 48, 60 and 72 hours were also attempted.

1. Introduction

[2] Several models and methods have been developed to predict cyclone positions accurately so that appropriate warnings can be issued for disaster management. Mohanty and Gupta [1997] and Gupta [2006] described different track prediction techniques. Bell [1979] described different operational numerical forecasting models. Storm surge is the most devastating impact of tropical cyclones, particularly for the Indian coastal regions. Because of the highly varying bathymetry of the Indian region, even a slight error in the predicted landfall point can lead to a totally different storm surge height. Hence, it is important to predict the position of the cyclone in advance with sufficient accuracy. Compared with the north Atlantic region, errors in track prediction have not reduced significantly over the north Indian Ocean [Gupta, 2006]. Objective track prediction of tropical cyclones may be grouped into four categories [Elsberry, 1995]: (1) empirical, e.g., climatology, persistence of past motion, climatology and persistence (CLIPER), and analogue techniques; (2) statistical-synoptic, in which additional meteorological information is incorporated, usually via statistical regression using grid-point values from synoptic analysis available at the forecast time; (3) statistical-dynamic, in which grid-point values from synoptic predictions are also incorporated; and (4) dynamic, in which a global or regional numerical weather prediction (NWP) model is integrated as an initial value problem to provide a track forecast. The empirical track forecasts are easy to understand given the simple inputs. On the other hand, the statistical models have additional complexity and are not easy to interpret because the grid-point predictions are generally not available to the forecaster. The dynamical models are even more complex, because steering influences at many levels and various physical processes, such as advective, adiabatic and frictional effects, may all contribute to cyclone motion.

[3] Traditionally, modeling a dynamical system requires deriving the equations of motion from first principles, measuring the initial conditions and, finally, integrating the equations of motion forward in time. Alternatively, when such an approach is not feasible, empirical laws governing the physical processes can be obtained by model-fitting approaches based on the observed variability of the system evolution. Lam [1993] used an NWP model for 24-hour advance prediction of cyclone tracks. But even the most sophisticated models in operation could not escape large forecast errors. According to Carr and Elsberry [2000], about one third of the 72-hour forecasts in 1997 had errors exceeding 555 km. Heming et al. [1995] pointed out peculiarities of the tropical cyclone problem in the NWP context.

[4] In this paper, we used the artificial neural network (ANN) approach to predict the position of Indian Ocean cyclones 24 hours in advance using only the locations over the past 12 hours at 6-hourly intervals, besides the present position. Neural networks have been applied to a wide variety of areas such as physical oceanography, biological oceanography, meteorology, acoustics, robotics and medical sciences [Nannariello and Fricke, 1998; Chen et al., 2000; Richaume et al., 2000; Pozzi et al., 2000; Schroeder et al., 2002; Bourras and Liu, 2003; Ali et al., 2004; Jain and Ali, 2006]. ANN models have been used to predict hurricane intensity in the north Atlantic basin [Ramirez and Castro, 2006; Ramirez and Veneros, 2004]. Baik and Paek [2000] compared the linear regression method and the ANN approach in predicting cyclone intensity and found that the ANN scheme improved the prediction. For the development and prediction of the ANN model, we used 32 years (1971–2002) of best track analysis from the Joint Typhoon Warning Center (JTWC), USA, which provides cyclone positions at 6-hourly intervals.

2. Data and Methods

[5] We divided the individual 6-hourly best track cyclone positions from the analysis of JTWC, during 1971–2002, into different track-segments consisting of (1) two 6-hourly positions and the present position in terms of latitude and longitude as the predictors, and (2) one position 24 hours in advance as the predictand. Thus, altogether 3684 track segments were available for 230 cyclones from JTWC analysis in the Indian Ocean consisting of Bay of Bengal and Arabian Sea.
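The segmentation described above can be sketched as a sliding window over one cyclone's 6-hourly positions. This is a minimal illustration, not the authors' code: the function name `make_segments`, the `lead_steps` parameter, and the sample track `demo_track` are hypothetical, and positions are assumed to be stored as (latitude, longitude) pairs in time order.

```python
from typing import List, Tuple

Position = Tuple[float, float]  # (latitude, longitude) in degrees; assumed layout


def make_segments(track: List[Position], lead_steps: int = 4):
    """Split one 6-hourly track into (predictors, predictand) pairs.

    predictors: positions at t-12 h, t-6 h and t, flattened to six numbers.
    predictand: the position lead_steps * 6 h after t (4 steps = 24 h).
    """
    segments = []
    # i indexes the present position; two earlier fixes and the
    # fix lead_steps later must both exist within the same track.
    for i in range(2, len(track) - lead_steps):
        past = (track[i - 2], track[i - 1], track[i])
        predictors = [coord for pos in past for coord in pos]
        segments.append((predictors, track[i + lead_steps]))
    return segments


# Hypothetical 42-hour track (8 fixes), for illustration only.
demo_track = [(10.0, 88.0), (10.4, 87.6), (10.9, 87.1), (11.5, 86.7),
              (12.0, 86.2), (12.6, 85.8), (13.3, 85.5), (14.0, 85.3)]
demo_segments = make_segments(demo_track)
```

With this scheme a track with n fixes yields n − 2 − `lead_steps` segments, so longer lead times leave fewer segments per cyclone.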

[6] A single ANN model is developed for the entire area including Arabian Sea and Bay of Bengal. The distance, D in degrees, between the best track position (X1, Y1) and the predicted position (X2, Y2) is calculated following great circle distance [Smith, 1958; H. G. Baker, Computing great circle distances from latitudes and longitudes, available at∼rusin/known-math/95/distance] as:

$$D = \cos^{-1}\left[\sin Y_1 \sin Y_2 + \cos Y_1 \cos Y_2 \cos(X_1 - X_2)\right]$$

where Xi and Yi represent the longitude and latitude, respectively, in degrees. We multiplied this value by 111.194928 (km per degree of arc) to obtain the distance in km.
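A minimal sketch of this distance calculation follows, assuming the spherical law of cosines as the great-circle form (equivalent forms such as the haversine give the same value). The function names are hypothetical; the conversion factor is the one quoted in the text.

```python
import math

KM_PER_DEGREE = 111.194928  # conversion factor quoted in the text


def great_circle_degrees(lon1, lat1, lon2, lat2):
    """Angular separation in degrees between two (longitude, latitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_d = (math.sin(phi1) * math.sin(phi2)
             + math.cos(phi1) * math.cos(phi2) * math.cos(dlon))
    # Rounding can push the value slightly outside [-1, 1]; clamp before acos.
    cos_d = max(-1.0, min(1.0, cos_d))
    return math.degrees(math.acos(cos_d))


def great_circle_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km using the text's degree-to-km factor."""
    return KM_PER_DEGREE * great_circle_degrees(lon1, lat1, lon2, lat2)
```

For two points on the same meridian one degree of latitude apart, `great_circle_km` returns the conversion factor itself, which is a quick sanity check on the implementation.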

3. ANN Analysis

[7] We used the linear ANN technique. An ANN is a powerful data modeling tool capable of capturing and representing complex input/output relationships. It is an artificial system that can perform intelligent tasks similar to those performed by the human brain. It acquires knowledge through learning, and this knowledge is stored within inter-neuron connection strengths known as synaptic weights. ANNs are able to represent both linear and non-linear relationships, and they learn these relationships directly from the data being modeled. Networks of linear units implement a basic linear model, used principally for regression problems. These models have no hidden layers and involve a simple linear relationship between input and output variables.

[8] We used the past two 6-hourly positions and the present position (latitude and longitude) as input parameters (predictors) and the position 24 hours in advance as the output (predictand). We tried several linear and non-linear transfer functions (e.g., radial basis function, linear least squares optimization) with different numbers of hidden layers and neurons. The results of these ANN models were inter-compared using several statistical criteria such as root mean square error (RMSE), scatter index, training/validation performance, and prediction errors. In this case the linear neural network employing a pseudo invert learning algorithm gave the best results, and hence this model was selected for the analysis. The pseudo invert algorithm uses the singular value decomposition technique to calculate the pseudo-inverse of the matrix needed to set the weights in a linear output layer, which yields the least mean squared solution. It is guaranteed to find the optimal setting of the weights in a linear layer, minimizing the RMS error on the training set. The ANN approach requires three types of data sets: (1) for training, (2) for validating, and (3) for predicting. We used 1713 (49.46%) cyclone track segments of 131 cyclones during 1971–1982 for training, 920 (26.56%) track segments of 58 cyclones during 1983–1994 for validation and 830 (23.96%) track segments of 41 cyclones during 1995–2002 for prediction. The distribution of the 24 hours advance positions used for training, validation and prediction is shown in Figure 1. The absolute mean errors for training, validation and prediction are 0.67 (0.96), 0.76 (0.91), and 0.74 (0.87) degrees for latitude (longitude) respectively (Table 1). This error for the entire data set is 0.71 (0.92) degrees. Incorporation of the speed and direction of the cyclone, the β-value and the time of the year did not improve the prediction skill of the ANN model. Besides, the model is sensitive only to the present and the past two positions. We also studied the effect of intensity on the prediction of cyclone position (only 1744 track segments could be analysed, as fewer JTWC cyclones carry intensity information). Including intensity reduced the absolute error mean for this smaller data set from 0.68 to 0.66 degrees for latitude, with no difference for longitude. Because this gain is marginal, we considered only the past two 6-hourly positions besides the present position in our present analysis. All the results discussed in the subsequent section refer to the predicted positions consisting of 830 track segments of 41 cyclones during 1995–2002 that were not used for developing the ANN model.
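The weight-setting step of a linear network can be illustrated with a short NumPy sketch. It is not the software used in the paper: `fit_linear_layer`, `predict`, the added bias column, and the synthetic data are all assumptions made for illustration. `np.linalg.pinv` computes the Moore-Penrose pseudo-inverse through a singular value decomposition, which matches the idea described above.

```python
import numpy as np


def _with_bias(X):
    """Append a column of ones so the layer can learn an offset (assumed)."""
    X = np.asarray(X, dtype=float)
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_linear_layer(X, Y):
    """Least-squares weights W such that [X, 1] @ W approximates Y.

    X: (n_segments, 6) predictors (three latitude/longitude pairs).
    Y: (n_segments, 2) predictands (latitude, longitude at the lead time).
    """
    # pinv uses an SVD internally and returns the minimum-norm
    # least-squares solution even when the columns are nearly dependent.
    return np.linalg.pinv(_with_bias(X)) @ np.asarray(Y, dtype=float)


def predict(W, X):
    """Apply the fitted linear layer to new predictors."""
    return _with_bias(X) @ W


# Synthetic check (not cyclone data): an exactly linear target is recovered.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 30.0, size=(50, 6))
W_true = rng.normal(size=(7, 2))
Y_demo = _with_bias(X_demo) @ W_true
W_fit = fit_linear_layer(X_demo, Y_demo)
```

In practice the weights would be fitted on the training segments only, with the validation and prediction segments passed through `predict` to measure the errors.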

Figure 1.

Distribution of 24 hours advance position of training, validation and prediction (dots, positions of training and validation data; circles, positions of prediction data).

Table 1. Statistics of the ANN Analysis for 24 Hours Forecast of Cyclone Positions in Terms of Latitudes (Longitudes)
Data Set | Absolute Error Mean | RMSE | Correlation Coefficient
Overall | 0.71 (0.92) | 0.95 (1.23) | 0.98 (0.99)
Train | 0.67 (0.96) | 0.93 (1.29) | 0.98 (0.99)
Validation | 0.76 (0.91) | 0.97 (1.20) | 0.98 (0.99)
Prediction | 0.74 (0.87) | 1.01 (1.16) | 0.98 (0.99)

4. Results

[9] The RMSE between the predicted and the actual latitudes (longitudes) is 1.01 (1.16) degrees, with correlation coefficients of 0.98 (0.99). The predicted latitudes and longitudes agree closely with the best track values (Figures 2a and 2b). The mean prediction error (distance between best track positions and the predicted positions) is 137.5 km. The India Meteorological Department (IMD) runs the Florida State University based Limited Area Model (LAM) and the National Centre for Environmental Prediction based Quasi Lagrangian Model (QLM) to forecast cyclone movements. We took the mean distance errors of these models from the regional specialized meteorological center reports during 2000–2005 and considered only those cyclones that are reported both by JTWC and in these reports. For this common set, the average errors from the LAM, QLM and ANN models are 132.6 km, 142.0 km and 127.5 km respectively.
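The error statistics quoted here can be computed as in the sketch below, which assumes the standard definitions of mean absolute error, RMSE and the Pearson correlation coefficient. The function names and sample values are hypothetical, and each function is applied separately to latitudes and to longitudes.

```python
import math


def mean_absolute_error(actual, predicted):
    """Mean of |actual - predicted| over all track segments."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square of the differences."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def correlation(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Hypothetical latitudes in degrees, for illustration only.
lat_actual = [10.0, 12.0, 14.0, 16.0]
lat_predicted = [11.0, 12.0, 13.0, 18.0]
demo_mae = mean_absolute_error(lat_actual, lat_predicted)
demo_rmse = rmse(lat_actual, lat_predicted)
```

RMSE weights large misses more heavily than the mean absolute error, which is why it is the larger of the two in Table 1.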

Figure 2.

Scatter of best-track and predicted (a) latitudes and (b) longitudes for prediction data set.

[10] The super cyclonic storm over the Bay of Bengal during 25–31 October 1999 was the most intense cyclone in the history of Orissa for the past 114 years, after the September 1885 cyclone. The JTWC best track and the ANN predicted track positions 24 hours in advance for this cyclone are shown in Figure 3a. The ANN based 24 hours forecast has a mean error of 110.95 km between the best track and the ANN predicted positions. The very severe cyclonic storm of the Arabian Sea during 16–22 May 1999 is the only storm that has crossed the Sindh coast of Pakistan after 1948. Its unpredictability by different models is another peculiarity of this storm. The ANN predicted track (Figure 3b) coincides very well with the JTWC best track. The ANN model could predict the track of this cyclone 24 hours in advance with a mean error of 109.05 km. The forecasts given by IMD using QLM for the Orissa super cyclone and using LAM for the Arabian Sea very severe cyclonic storm are shown in Figures 3g and 3h [India Meteorological Department, 2000] respectively. The comparison of Figures 3g and 3h with Figures 3a and 3b shows the superiority of the ANN model.

Figure 3.

A comparison of the best track (asterisks) and the predicted tracks (circles) of the cyclones during (a) 25–31 Oct 1999 (Orissa super cyclonic storm); (b) 16–22 May 1999 (Arabian Sea very severe cyclonic storm); (c) 05–10 May 2002; (d) 26 Nov–07 Dec 1996; (e) 14 Oct–02 Nov 1996; and (f) 11–18 May 1996. Tracks predicted by IMD (g) for the super cyclonic storm using QLM, and (h) for the Arabian Sea very severe cyclonic storm using LAM, along with IMD best tracks, are also shown for comparison. Figures 3g and 3h are re-generated using the digital values of the 24-hour forecast from Figures 2.7.3 and 2.2.2 respectively of India Meteorological Department [2000].

[11] Generally, the cyclones originating in the central Arabian Sea move towards the north. Only 11 cyclones have moved towards the west and crossed the Arabian coast since 1896 [India Meteorological Department, 2003]. Except for a few initial positions, the ANN predicted track coincides with the best track (Figure 3c) for the Arabian Sea cyclone during 5–10 May 2002, with an overall average error of 176.26 km. Because most of the cyclones originating in this location move northward, the ANN model initially predicted a northward movement, but after a few hours it picked up the correct motion. If the initial positions are not considered, the mean error reduces to 123.84 km. Though the ANN model could reproduce the looping cyclone of 26 Nov–7 Dec 1996 (Figure 3d), the errors are somewhat larger, with a mean error of 155.15 km. The ANN model could also predict the retraced cyclone of 14 Oct–2 Nov 1996 (Figure 3e) with a mean error of 134.80 km. Besides being able to reproduce retraced tracks, the model also performs well over land. Another interesting feature of this model is its ability to predict recurving cyclones. The Bay of Bengal cyclone during 11–18 June 1996 re-curved twice, which is well predicted by the model, with an average error of 136.9 km (Figure 3f).

[12] Besides 24 hours, forecast errors for 12, 36, 48, 60, and 72 hours were also estimated, maintaining the same periods for training, validation and prediction as used for the 24 hours forecast. The results of these forecast analyses are summarised in Table 2. We also estimated the mean distance errors using the CLIPER approach [Neumann and Mandal, 1978; Neumann and Randrianarison, 1976]. The period considered for developing and predicting the cyclone positions using the CLIPER model is the same as that used for the ANN model. The CLIPER errors for each lead time are also given in Table 2. The ANN errors are smaller than those obtained from the CLIPER method in all cases. The percentage reduction in the ANN errors with respect to CLIPER varies from 24.7% to 29.5% (Figure 4). The y-axis of Figure 4 is the percentage reduction (PR) of track forecasting error by ANN over the CLIPER method, given as $PR = [(MDE_{CLIPER} - MDE_{ANN})/MDE_{CLIPER}] \times 100$, where MDE is the mean distance error in km.
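As a worked check of the PR formula, the sketch below applies it to the mean distance errors listed in Table 2. The function and variable names are hypothetical; the numbers are the table's values.

```python
def percentage_reduction(mde_cliper, mde_ann):
    """PR = (MDE_CLIPER - MDE_ANN) / MDE_CLIPER * 100."""
    return (mde_cliper - mde_ann) / mde_cliper * 100.0


# Mean distance errors in km from Table 2: forecast hour -> (ANN, CLIPER).
table2_mde = {
    12: (59.4, 84.3),
    24: (137.5, 182.5),
    36: (224.1, 298.9),
    48: (307.9, 412.5),
    60: (395.3, 526.5),
    72: (474.6, 650.1),
}

reductions = {
    hours: round(percentage_reduction(cliper, ann), 1)
    for hours, (ann, cliper) in table2_mde.items()
}
```

The largest reduction comes from the 12-hour forecast and the smallest from the 24-hour forecast, which reproduces the 24.7% to 29.5% range stated in the text.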

Figure 4.

Percentage reduction in track forecast errors of ANN over CLIPER for different forecast hours.

Table 2. Statistics of the ANN Analysis for Different Hour Forecast of Cyclone Positions in Terms of Latitudes (Longitudes) Without Removal of Retraced and Looped Cyclones
Forecast, hours | Total (Predicted) Number of Points | Absolute Error Mean | RMSE | Correlation Coefficient | Mean Distance Error, km | Mean Distance Error From CLIPER, km
12 | 3908 (912) | 0.34 (0.35) | 0.48 (0.49) | 0.99 (0.99) | 59.4 | 84.3
24 | 3463 (830) | 0.75 (0.87) | 1.01 (1.16) | 0.98 (0.99) | 137.5 | 182.5
36 | 3033 (748) | 1.19 (1.45) | 1.57 (1.96) | 0.95 (0.98) | 224.1 | 298.9
48 | 2627 (668) | 1.59 (2.04) | 2.08 (2.78) | 0.90 (0.97) | 307.9 | 412.5
60 | 2243 (588) | 2.02 (2.62) | 2.62 (3.55) | 0.85 (0.95) | 395.3 | 526.5
72 | 1907 (520) | 2.44 (3.12) | 3.15 (4.21) | 0.77 (0.93) | 474.6 | 650.1

[13] There are 4 looping and 5 retraced tracks in the training and validation data sets. We developed another linear ANN model without considering these 9 tracks and compared the results after removing similar tracks from the prediction data set. The errors marginally decreased from 137.5 km to 132.3 km in case of 24 hours forecast. We could not carry out a separate analysis for the retraced/looped cyclones as the number of such cyclones is not sufficient for ANN analysis.

5. Summary and Conclusion

[14] We demonstrated the capability of the ANN approach in predicting cyclone tracks in the North Indian Ocean 24 hours in advance using only the past two 6-hourly positions and the present position in terms of latitude and longitude. For this purpose, best track cyclone positions of 131 cyclones from 1971 to 1982 were used for training, 58 cyclones from 1983 to 1994 for validation and 41 cyclones from 1995 to 2002 for prediction. Thus, the period used for prediction was not used for developing (neither for training nor for validating) the ANN model. The RMSE between the predicted and the actual latitudes (longitudes) is 1.01 (1.16) degrees, with a correlation coefficient of 0.98 (0.99), for the prediction data set. The mean error in the distance between the actual and predicted positions for the 41 predicted cyclones is 137.5 km. The model could predict the 24 hours advance positions of the Orissa super cyclonic storm (during 25–31 October 1999) and the very severe cyclonic storm of the Arabian Sea (during 16–22 May 1999) with mean errors of 110.95 km and 109.05 km respectively. However, errors are somewhat larger for the looping and retraced cyclones. Out of the 189 cyclones used for developing the model (131 for training and 58 for validating), only 4 cyclones looped and 5 cyclones retraced during 1971–1994. Had more cyclones of this nature been available for training and validation, the model would probably have predicted the positions of such cyclones with better accuracy. Besides 24 hours, forecasts for 12 to 72 hours were also attempted. The 12 hours forecast is the most accurate, and the model accuracy decreases progressively at longer lead times. The improvement in forecasting is only marginal if retraced and looped cyclones are excluded. The ANN predictions have smaller mean distance errors than the CLIPER and QLM predictions and are comparable to those of LAM.


[15] The authors acknowledge the facilities provided at National Remote Sensing Agency, Hyderabad, India, and Space Applications Center, Ahmedabad, India, to carry out this analysis. The cyclone data have been obtained from Joint Typhoon Warning Center (JTWC), USA. The software for the CLIPER method was provided by Julian Heming, Meteorological Office, United Kingdom. The authors thank R. C. Bhatia, Additional Director General of Meteorology, IMD, for providing information on cyclones in the Indian Ocean. Fellowship of Sarika Jain is supported by Department of Ocean Development (DOD). The authors thank the anonymous referees for their critical and constructive comments that have improved the quality of the paper.