The Helsinki bike‐sharing system—Insights gained from a spatiotemporal functional model

Understanding the usage patterns for bike‐sharing systems is essential in terms of supporting and enhancing operational planning for such schemes. Studies have demonstrated how factors such as weather conditions influence the number of bikes that should be available at bike‐sharing stations at certain times during the day. However, the influences of these factors usually vary over the course of a day, and if there is good temporal resolution, there could also be significant effects only for some hours/minutes (rush hours, the hours when shops are open and so forth). Thus, in this paper, an analysis of Helsinki's bike‐sharing data from 2017 is conducted that considers full temporal and spatial resolutions. The station hire data are analysed in a spatiotemporal functional setting, where the number of bikes at a station is defined as a continuous function of the time of day. For this completely novel approach, we apply a functional spatiotemporal hierarchical model to investigate the effect of environmental factors and the magnitude of the spatial and temporal dependence. Challenges in computational complexity are faced using a Monte Carlo subsampling approach. The results show the necessity of splitting the bike‐sharing stations into two clusters based on the similarity of their spatiotemporal functional observations in order to model the station hire data of Helsinki's bike‐sharing system effectively.

However, the influences of these factors usually vary over the course of a day, and if there is good temporal resolution, there could also be significant effects only for some hours/minutes (rush hours, the hours when shops are open and so forth). Thus, in this paper, an analysis of Helsinki's bike-sharing data from 2017 is conducted that considers full temporal and spatial resolutions. The station hire data are analysed in a spatiotemporal functional setting, where the number of bikes at a station is defined as a continuous function of the time of day.
For this completely novel approach, we apply a functional spatiotemporal hierarchical model to investigate the effect of environmental factors and the magnitude of the spatial and temporal dependence. Challenges in computational complexity are faced using a Monte Carlo subsampling approach. The results show the necessity of splitting the bike-sharing stations into two clusters based on the similarity of their spatiotemporal functional observations in order to model the station hire data of Helsinki's bike-sharing system effectively.

K E Y W O R D S
bike-sharing system, functional data analysis, spatiotemporal statistics, subsampling INTRODUCTION Bicycle-sharing systems have become popular in all the cities in which they have been implemented. Among these cities is the capital of Finland, Helsinki, where a station-based system has provided a flexible transport option since 2016 (City of Helsinki, a; Helsinki Region Transport, a). That is, a bike can be taken from a bike-sharing station and returned to any other station. The extension of the existing system to neighbouring cities and the rebalancing of the bike-sharing stations are challenging tasks for the operator and city planners (cf. Schuijbroek et al., 2017). Support can be provided via empirical research examining the demand at all stations over time, and determining the factors that influence the usage of a bike-sharing system is one of the main research interests. Furthermore, the environmental impact of implementing bike-sharing schemes is an important question in current research (e.g. Maranzano et al., 2020;Zhang & Mi, 2018).
In studies concerning bike-sharing systems in other cities, a variety of influential factors have been discussed and analysed using different statistical models (El-Assi et al., 2017;Wang et al., 2020;Yang et al., 2016). The significance and magnitudes of such factors have been analysed by Eren and Uz (2020) in a review. Moreover, Yang et al. (2020) focused on the analysis and prediction of the bike-sharing usage at different points in time. In these studies, the amount of data is often reduced through the aggregation of the observations in time spans consisting of a set number of hours (e.g. El-Assi et al., 2017) or a day (e.g. Buck & Buehler, 2012). So far, no study has analysed how factors' influences vary by time of day. Additionally, no study has utilised an entire dataset in its analyses, that is, all the studies to date have aggregated the data or reduced the dimensionality.
This gap in knowledge is addressed in this paper by means of a comprehensive analysis of the freely available station hire data for the bike-sharing system in Helsinki from 2017. The station hire data are meant to represent spatiotemporal functional observations. Thus, we apply a complex spatiotemporal functional hierarchical model implemented in the software package D-STEM (cf. Fassò et al., 2018;Finazzi & Fassò, 2014;Wang et al., 2021). This model can be used to predict and map the spatiotemporal process and its uncertainty over a geographical region across time. Applications of such dynamic coregionalisation models are used in Fassò et al. (2016), Fassò and Finazzi (2013), Finazzi et al. (2013) and Taghavi-Shahri et al. (2019) to assess the air quality in Europe and model the concentrations of several airborne pollutants in a multivariate setting or for land use regression in Teheran, Iran. In contrast to previous approaches, which handle the purely temporal dynamics separately from the purely spatial correlation component, the approach presented in Calculli et al. (2015) combines the spatial and temporal dependencies in an autoregressive spatial component. It is known as hidden dynamic geostatistical model (HDGM). The parameters are estimated using the maximum likelihood approach and an EM algorithm (cf. Finazzi & Fassò, 2014). Alternatively, Bayesian approaches can also be used (cf. Rue et al., 2009). These approaches are mostly based on computationally efficient integrated nested Laplace approximations (INLA). In this paper, we focus on the EM estimation for functional HDGM implemented in D-STEM, because there is no prior information about the model parameters (in this case, non-informative priors would be appropriate). Moreover, functional data can efficiently be handled. Although this technique was originally developed to handle spatiotemporal functional data from environmental sciences, such as atmospheric radiosonde profiles, its potential for modelling the number of allocated bikes at the bike-sharing system in Helsinki is demonstrated in this paper.
Because of the large amount of data from 140 stations (measured in 5-min intervals), we propose to combine this estimation with a Monte Carlo subsampling approach. That is, we repeatedly draw a smaller subset from all available spatial locations, bike-sharing stations. Using all functional observations of these stations over time, the spatiotemporal model was estimated. Thus, we will be able to estimate the standard errors of the estimated functional model parameters in a very efficient way, allowing a rich interaction model with spatiotemporal interactions to be estimated from the full data. Moreover, all results are validated in a cross-validation study.
The remainder of the paper is structured as follows. First, we provide an overview on different bike-sharing systems and previous empirical findings. Moreover, we briefly introduce the bike-sharing service in Helsinki and present some descriptive statistics for a first exploratory analysis. In the ensuing section, we introduce the spatiotemporal functional model from a theoretical perspective and explain the applied subsampling principle. The concept of functional data and the construction of a continuous function from discrete observations are described. These theoretical sections are followed by the empirical analysis of the data from Helsinki. Initially, we discuss several descriptive statistics and figures in detail to provide a comprehensive understanding of the data, which is highly complex (i.e. spatial, temporal domain; daily, weekly periodicity; high frequency; and so forth). Eventually, the estimated functional parameters are shown and the results are discussed in Section 5. In this section, we also explain how the specific model (hyper-)parameters are chosen. Section 6 concludes the paper.

BIKE-SHARING SYSTEMS
Many of the larger cities across the world have expanded their public transport systems by introducing bike-share schemes, which provide an alternative and sustainable transport mode. Starting in 2000, the number of bike-share systems worldwide has increased rapidly, and many studies have been conducted to improve these services and understand their usage patterns (Fishman, 2016;Gervini & Khanal, 2019). The systems differ in the way they handle bike usage (Eren & Uz, 2020). On the one hand, there are station-based systems, where users retrieve a bike from a particular station, take a ride and return it to any other station. Here, a positive effect is that the station locations are fixed and thus users know where to search for bikes. However, a bike station may already be full when a user wants to return a bike, as there are only a limited number of docks at each station. Then the bike can only be returned to another station. On the other hand, dockless sharing systems offer more flexibility, as users can return bikes anywhere and they are not bound to stations. However, the disadvantage is that users who want to pick up a bike need to be lucky to find a bike close by. Bike-sharing systems have become popular for various reasons (O'brien et al., 2014). City administrations aim at increasing the number of cyclists and reducing the car traffic in the cities (Fishman, 2016). Shared bikes can be used to overcome distances between public transport options, such as the metro or train, to reach specific destinations, such as the workplace or recreational areas. Hence, bike sharing improves the public transport network and helps users cover gaps in that network or the last miles (Willberg et al., 2019).
Research in this area proceeds in various directions but mainly aims to understand users' behaviour and the different facets of bike-sharing demand. The knowledge gained from the investigations helps improve bike-sharing systems and support operators in operational planning. Understanding usage patterns of, and dependencies between, stations may help when introducing similar systems to other cities (Tran et al., 2015). Martinez et al. (2012) and García-Palomares et al. (2012) address finding appropriate station locations and determining bike fleet size. Data from user registrations or from user surveys provide the users' perspective and give insights into the socio-demographic factors influencing the usage of bike-sharing systems (Willberg et al., 2019).
A different focus is set by data-driven demand analysis. On the one hand, there is station hire data, which give either the number of bikes or the number of check-outs and check-ins at each station at a certain point in time. On the other hand, trip-based data give information about the origin, destination and duration of each bicycle trip. Thus, there are several studies that investigate spatial and temporal factors influencing the demand on a station level or trip basis that try to predict future usage (Li et al., 2015;Rixey, 2013;Yang et al., 2016). However, station-based bike-sharing systems suffer from an unbalanced spatial distribution of bikes at the stations due to different levels of demand across space and time. Hence, this optimization problem must find the most effective rebalancing strategy for the bikes in the network (Shi et al., 2019).
A recently published literature review by Eren and Uz (2020) on the factors influencing bike-sharing demand focuses on six categories. The categories used are weather conditions, built environment, public transport and temporal factors that are used in many studies on station hire data. One of the main findings is that precipitation affects bike-sharing demand the most among the meteorological covariates. Its negative correlation with demand was found in almost all examined studies. Furthermore, increases in the humidity and wind speed decrease the demand, whereas air temperatures between 0 and 30 • C lead to more bicycle trips. The strongest positive correlation was found for temperatures between 20 and 30 • C, but the demand is less for temperatures under 0 • C and over 30 • C. Infrastructure and land use are widely investigated factors in the built environment category. Bicycle lanes and the proximity of bike-sharing stations to them are found to have high positive impacts on a station's demand. Furthermore, changes in the elevation across the area of a bike-sharing service are correlated with its demand. From trip-based data, it can be seen that users tend to use shared bikes to go downhill more than uphill. Moreover, considerable differences in demand are found for bike-sharing stations in commercial and residential areas, and a station's proximity to infrastructure, such as museums, shopping centres, schools, universities and restaurants, is investigated by many studies. Also, public transport options seem to be related to the bike-sharing demand. The more train, tram, bus and metro stations near a station, the higher its demand. Moreover, many studies have shown that the bike-sharing demand varies along the temporal dimension, with the most apparent differences occurring between weekdays and weekends due to different user travel motivations. The usage of bike sharing for commuting to work becomes visible via the peak usage during morning and afternoon rush hours on weekdays. On the other hand, trips during the weekend are more often for recreational purposes.
Most of the studies use linear models (e.g. generalised linear models (Chastenet de Castaing, 2017), hierarchical linear mixed effect models (El-Assi et al., 2017) or negative binomial models (Gebhart & Noland, 2014;Nair et al., 2013)) without explicitly addressing spatial dependence. Yang et al. (2020), Ji et al. (2018) and Wang et al. (2020) model the demand using ordinary least squares regression, which is inconsistent in the presence of spatial dependence, but they subsequently check residuals for spatial autocorrelation using Moran's I (Lee & Li, 2017). In other studies, spatial dependence is mostly addressed via cluster analyses (e.g. Froehlich et al., 2009;Lathia et al., 2012;Li et al., 2015;Raninen, 2018;Vogel et al., 2011;Zhou, 2015). In contrast, temporal dependence has been studied more accurately. For instance, Shi et al. (2018) studied metro riderships explicitly addressing its temporal dimension, while El-Assi et al. (2017) consider a first-order temporal autoregressive model. In some studies, temporal dependence has been ruled out for the dependent/independent variables through aggregation over time, for example, average number of trips per month (Rixey, 2013), per day (Buck & Buehler, 2012) or during the peak hours in the morning or afternoon (Nair et al., 2013;Tran et al., 2015;Wang et al., 2020).
The city of Helsinki introduced the station-based public bike sharing scheme in 2016 with 50 stations and further expanded it in 2017 with 100 additional stations (see City of Helsinki, a and Jäppinen et al., 2013). As a consequence of its high usage and popularity, the system was extended to the neighbouring cities Espoo (2018) (Helsinki Region Transport, a) and Vantaa (2019) (Helsinki Region Transport, b). Hence, the bike-sharing system covers wide areas of the larger Helsinki region and has become a dense network of bike-sharing stations, making this system an alternative transport mode. Helsinki's bike-sharing scheme has been addressed before in a few studies (see Chastenet de Castaing, 2017;Raninen, 2018;Tarnanen, 2017).

HELSINKI'S BIKE-SHARING SYSTEM AND DATA DESCRIPTION
The City of Helsinki is located at the coast, the Gulf of Finland, and is characterised by a primarily flat area with a few hills. In our empirical study, we exploited the data from the bike sharing season 2017 for which the bike sharing service was already operational in Helsinki city centre shown in Figure 1. In general, the area shown is widely covered with bike sharing stations, although they are denser and more evenly distributed in the city centre. In the north of Helsinki city centre, where there are more residential areas, coverage becomes sparse and rather irregular. Moreover, a public transport network connects the city centre with the greater Helsinki area (outside the map extent) consisting of its suburbs and neighbouring cities Vantaa and Espoo. In Figure 1, the metro and train lines are indicated by the orange and purple lines, respectively, with the markers indicating the stations. In addition, there is a dense tram and bus network in Helsinki, but this is not shown on the map. The background colours of the map correspond to the four city land use categories assigned by the 2016 Helsinki master plan for city development (City of Helsinki, b). These four categories consist of the city centre (orange), shops (red), recreational areas (green) and predominantly residential areas (yellow).
The company Helsinki Region Transport has provided an API to enable individuals and organizations to develop their own applications and investigate the data related to transport in Helsinki and neighbouring municipalities (cf. Kainu, 2017 Helsinki Region Transport, c). We selected 176 days in 2017, starting on May 9, which was the first day with full records, and ending on October 31, which appears to have been the end of the biking season that year. However, the data are incomplete, as shown in Figure 2, where black entries depict missing values. It is worth noting that the functional HDGM can be estimated even when the responses are not available for all stations and/or time points.
The station hire data contain information on the observed number of bikes recorded every 5 min at 140 bike-share stations in Helsinki. Thus, there are over 7 million bicycle counts in total. In a preliminary analysis, the time series for the stations were analysed in the frequency domain (cf. Brockwell & Davis, 2016;Cooley & Tukey, 1965). The Fourier transform was applied to decompose the time series of each individual bike sharing station into the weighted sum of its underlying periodic signals. The resulting periodograms depict the magnitude for each temporal frequency present in each time series. Analysing the time series of the bike sharing stations in this frequency domain shows us that the stations differ with respect to their magnitudes of the dominating temporal frequencies. Periodograms were computed for all the stations separately and are depicted in Figure 3c via a glyph-map (Eden et al., 2010;Wickham et al., 2012) with the station Arabiankatu shown in Figure 3a to illustrate the scale and axes definition of the glyphs. The periodograms are shown as small glyphs at the locations of the respective bike-sharing stations.
The glyph-map reveals differences in the periodograms that appear to be spatially correlated. In the city centre close to the main station, the magnitudes are higher than in most parts in the north of Helsinki. Moreover, the bike-sharing stations in the city centre have periodograms with several dominant peaks. However, most of the periodograms for the stations have one outstanding peak in common. This dominant peak corresponds to the daily frequency. This finding is further highlighted in the histogram for the relative magnitudes in Figure 3b. The histogram shows an where n is the number of stations, and N f is the number of frequencies in the periodogram. According to the histogram in Figure 3b, the daily cycle is most prominent, representing over 2% of the total signal from all stations. In general, time series can be subdivided into linear time granularities, for example, a sequence of subsequent days, and cyclic time granularities, for example, daily, weekly or yearly periodicities (Andrienko et al., 2010;Gupta et al., 2021). Moreover, cyclic data are often classified into regular and irregular cycles. For regular cycles, the time series are constructed by subdividing the linear time into pieces, and stacking them to match the cycles, for example, days, weeks, years, etc. The station hire data contain both temporal types of data, that is, linear time as the sequence of days and regular cyclic patterns on each weekday/weekend.
There are two predominant periodicities that can be understood as cyclic time. The most prominent cycle is the length of one day. Its prominence could be due to daily activity and sleep periods. Moreover, there is a cycle with a length of one week that is connected to the transition between workdays (Monday through Friday) and the weekend (Saturday and Sunday). Nevertheless, these repetitive structures appear for a sequence of days, which represents linear time. Both types of time have to be considered in the analysis of the bike-sharing data in order to cover all spatial and temporal dependencies within the data.
Therefore, this study makes the novel proposal that the cyclic time of one day should be treated as a functional observation (cf. Ferraty & Vieu, 2006;Ramsay & Silverman, 2007). To be precise, the number of bikes is one continuous function of time across a day, that is, a function that maps h ∈ [0, 24] to the non-negative integer y ∈ N 0 . Thus, the daily cycle is incorporated into the functional observations for every day at every station. This daily cycle is denoted by h (time during the day), whereas the day will be denoted by t below. It is important to note that h is not restricted to full hours, but can be any time point within the day.
To analyse the temporal dependence, functional boxplots are used (cf. Sun & Genton, 2011). Figure 4 show these plots for the stations Itämerentori ( Figure 4a) and Haukilahdenkatu (Figure 4b). Both functional boxplots highlight the periodic behaviour belonging to the cycle of one day. The stations are characterised by a change in the number of bikes allocated during the morning hours from 7 to 10 and another major change in the afternoon hours from about 14 to 18 o'clock. However, the directions of change, as well as the ranges of observed bikes, are different. At Itämerentori station, the number of bikes increases in the morning and decreases in the afternoon. In addition, up to 58 bikes were observed at maximum. The opposite happens at Haukilahdenkatu station. There is a decrease in the number of bikes in the morning and an increase in the afternoon. Here, the maximum number of bikes observed was 32 bikes.
Because we observed these two different type of stations, we performed a preliminary cluster analysis. We found no evidence for including more than two clusters. The analysis of the explained variance shows that only very weak improvements are possible by including more clusters. To be precise, k-means clustering was applied to the median function of all observations, ensuring that these clusters are robust against outliers. Moreover, we chose Pearson's correlation as a distance measure between the median curves and the cluster centres.
The clustering was conducted separately for each day of the week. Hence, the clustering yielded k = 2 cluster centres̃k(h) for each day of the week. These centres are shown in Figure 5a. The solid lines represent the cluster centres̃k(h) for Monday through Friday, and the cluster centres for the weekend are denoted by dashed lines. The weekend's dashed lines are similar to each other but differ from the solid lines for Monday through Friday by a positive shift on the time axis of approximately 4 h. Additionally, the magnitude of change in the function is less for the cluster centres for the weekend than for the cluster centres̃k(h) for Monday through Friday. Due to doing separate clusters for all days of the week, a station could be assigned to different clusters over the course of one week. However, for all days, the numbers of stations assigned to the clusters are similar, with roughly 40% of the stations belonging to the first cluster and consequently about 60% to the second cluster.
The assignments are shown in Figures 5b and c, where the sizes of the symbols denoting the locations of the stations are scaled by the number of assignments out of seven to that respective cluster. Looking more closely at the locations of both clusters, we can see that type 1 stations are mostly located in the centre, in areas where people work, while type 2 stations are located in regions where people live. We therefore interpret these main characteristic temporal patterns of the two clusters as the users commuting from home to work places and back. Hence, the clusters are hereafter named 'Work' and 'Home' respectively.

MODELLING SPATIOTEMPORAL DEPENDENCE IN FUNCTIONAL DATA
Functional data analysis deals with a functional random variable Y that is continuously defined, as in, for example, Ferraty and Vieu (2006). Observations y of a functional random variable are either measured on a regular grid or at random discrete points, thus leading to a set of q discrete measurements y i,1 , … , y i,q of the functional data i ∈ 1, … , m. It is worth noting that q can be different for different functional data y i . However, for functional data analysis, a continuous function is needed to evaluate the function y(h) for any argument h. Hence, the functional form is reconstructed, for instance, using the basis function expansion (cf. Ndongo, 2017;Wang et al., 2016). That is, the reconstruction is accomplished with a set of K known basis functions k with respective coefficients c k , y(h) that can be expressed as (2) Typical choices for the basis functions are the Fourier series for periodic data or the B-splines for non-periodic data (Ramsay & Silverman, 1997). For this analysis, we focus on the B-spline approach, where the number of free parameters is given by the order of the piecewise polynomials and the number of interior knots. The compact support of the B-spline basis functions has the advantage that the computational complexity increases only linearly with K. Furthermore, B-spline basis functions are flexible in the sense that the location of the break points can be chosen in order to approximate the function better in segments where it changes more frequently.
Let the number of available bikes in a bike-sharing station be described by a functional space-time random variable Y (s, t, h), and let y(s, t, h) be the observed functions from h ∈ [0, 24] ⊆ R to R at day t and station s ∈ D, with D ⊂ S 2 and S 2 being a sphere in R 3 (surface of the Earth). For this application, we consider that D is a discrete set of N bike-sharing stations. Even though the curvature of the Earth might be neglectable in this case because of the small extent of the city of Helsinki, we have used spherical coordinates. Thus, all distance measures reported below correspond to great-circle distances (in metres). Time is assumed to be discrete, with t ∈ {1, … ,T}. Furthermore, the actual observations of Y (s, t, h) are made at q discrete points along the dimension of the function y(s, t, h), meaning, in the course of the day. More precisely, the number of available bikes is available in a 5-min frequency. Hence, the observation at (s, t) is the q-dimensional vector y(s, t) = (y 1 (s, t), … , y q (s, t)) T , where q equals 288 5-min intervals within 24 h. In this paper, we do not explicitly account for integer-valued observations of Y (s, t, h), because this would result in non-smooth function over the day, but we consider the number of available bikes in a station as continuous variable. The sample size m of the dataset is then given by the number of functional observations m = NT = 24,640 with N = 140 being the number of stations and T = 176 the number of days. The spatiotemporal functional variable Y (s, t, h) is assumed to be first-order stationary (i.e. the mean of the spatiotemporal process does not depend on the location).
A hierarchical model is used to model the mean spatiotemporal functional process and its variation by splitting up the total uncertainty into separate components. For multivariate data, the model is commonly known as hidden dynamic geostatistical model (HDGM, cf. Calculli et al., 2015;Wang et al., 2021). The first level is given by where the (s, t, h) are fixed effects, and the (s, t, h) are spatially and temporally correlated random effects. The model errors (s, t, h) are assumed to be from independent Gaussian white noise processes with a constant variance that is allowed to vary over the functional domain. More precisely, with̃2 It is important to note that the spline basis functions could be chosen differently for each term. Hence, the basis functions T a and their dimension p a are denoted by the subscript corresponding to the terms a ∈ { , , ε} of the hierarchical model given in Equation (3). The fixed effect model consists of d space-time varying functional covariates x ,i (s, t, h), where the unknown coefficients i must be estimated. It is worth noting that these covariates could also be constant across space or time and/or in the functional dimension. Furthermore, the random effects model is given by For each time step and each location a function is sampled from the estimated random effect model. It covers both spatial and temporal dependencies by modelling the respective variation using a basis function expansion. Specifically, the spatiotemporal latent component z(s, t) has the Markovian dynamics z(s, t) = Gz(s, t − 1) + (s, t).
Thus, the random effect merely depends on the previous time step and the innovation (s, t).
The degree of dependence from the previous time step is specified by the transition matrix G that is assumed to be stable. Here, G is a diagonal matrix G = diag(g 1 , … , g p ) and accordingly the latent components z(s, t) are not cross-correlated across the functional dimension. For each day t an innovation (t) = vec( (t, s 1 ), … , (t, s N )) is sampled from the following spatially dependent Gaussian process Here, ⊗ stands for the Kronecker product and vec is the vectorisation operator. While the cross-covariance matrix V describes the correlation between all components, the spatial covariance function (||s − s ′ ||, , ) captures the correlation across space. For instance, this function could be a Matérn covariance function, which is an isotropic covariance function depending on the distance ||s − s ′ || between spatial locations only. The range of the spatial dependence is described by the coefficients and cover potential further parameters. Moreover, the variance of the random effects is given by the matrix V = diag( 2 1 , … , 2 p ). It is worth noting that the cross-covariance matrix V is restricted to a diagonal matrix in order to reduce computational effort and, hence, it acts as a scaling matrix of the random effects. To estimate the parameters, we follow the maximum likelihood approach using an EM algorithm implemented in D-STEM (see Wang et al., 2021), which uses a functional hierarchical model called the f-HDGM model. For more details on the closed form and the numerical computations of the parameters in the EM algorithm, we refer the reader to Wang et al. (2021), Calculli et al. (2015), Fassò and Finazzi (2011) and Fassò and Cameletti (2009).
However, for this approach, the computational costs (in particular, for the computation of variance-covariance matrix of the estimated parameters) increase drastically with the number of spatial locations N and the number of splines p a chosen for the basis function expansion. In our study of Helsinki's bike-sharing system, the number of bikes at 140 stations was observed every 5 min for 176 days. Thus, we propose to use the following subsampling procedure to estimate the standard errors of the model parameters more efficiently.
First, B independent samples of bike-sharing locations, b 1 , … , b B ∈ {s ∈ D ∶ y(s, t, h)}, are generated, each consisting of M locations that are randomly drawn without replacement from all locations where the process is being observed. To reduce the computational costs, we have chosen M ≪ N. Moreover, we followed a stratified sampling technique (see, e.g. Cochran, 2007), such that the stations of each cluster appear in their correct proportions in the subsamples b 1 , … , b B (i.e. 40% of cluster 'Work', 60% of cluster 'Home'). Then the parameter of interest is estimated separately for each of the B samples.
The estimator of the i-th sample {y(s, t, h) ∶ s ∈ b i } is denoted bŷi. Lastly,̂is given bŷ Furthermore, the standard error̂= is the estimate of the true standard error of the reduced sample. Finally, the confidence interval of the estimatêcan be computed using the percentile method (see, e.g. Efron & Hastie, 2016).
Here, the (1− ) confidence interval is approximated by the respective empirical quantiles of the distribution of {̂i}.
Since we have reduced the total number of stations in each subsample (i.e. M locations in each sample vs. N locations of the full sample) to make the calculation of the standard errors of the parameters computationally feasible, the estimated standard error must be interpreted as an upper bound of the true standard error of the whole sample. That is, all statistically significant results which we will report below would stay significant for full sample as well, while insignificant effects might be significant (but probably weak) when considering the full sample.

EMPIRICAL RESULTS AND INTERPRETATION
In the following section, the focus is on the results of the empirical analysis. That is, the bike-sharing usage in Helsinki is analysed in the spatiotemporal functional framework to gain insights into the influence of functional covariates and interaction effects on the number of allocated bikes. This knowledge can be used to understand and predict the number of bikes at existing stations over the course of the day due to the functional setting. Moreover, rebalancing strategies could be improved based on our empirical results. To a limited degree, the model could also be used for predicting the bikes at new locations, that is, kriging. However, one has to keep in mind that the overall demand is not arbitrarily scalable by introducing new stations. All included covariates with the notation of their corresponding coefficients are listed in Table 1. More precisely, we have fitted a model with two intercepts corresponding to the two cluster types 'Work' and 'Home'. Note that the cluster assignment was based on the preliminary analysis of the response variable. Moreover, a set of nine covariates varying either in space, in both time and the functional domain, or in time were included as interactions with each intercept. That is, the fixed effects models are considered like independent models for each cluster.  However, these two models are linked by the common random effects model, which accounts for the spatiotemporal dependence of the data. In the final model, we selected several meteorological covariates. Three observatories are located within the spatial extent of the bike-sharing stations in Helsinki, but we only have used data from the Kaisaniemi observatory located in the central city. The spatial differences in these weather covariates are neglectable; hence, the observations are assumed to be constant over space but not time. Figure 6 shows the four weather covariates for the period from May 9 to the October 31, 2017. Furthermore, the respective histogram shows the distribution of the meteorological observations with the relative frequencies of occurrence. The histograms on the right-hand side of Figure 6 are aligned with the observations over time on the left-hand side.

Cluster intercepts
In summary, we included dummy variables for Saturday and Sunday showing the weekend effects, meteorological variables (i.e. temperature in [ • C], cloud coverage in [%], wind speed in [m∕s], precipitation in [mm]), a geographical variable, namely the elevation of the station in [m], and two infrastructure variables (i.e. first, the distance of a bike-sharing station to its closest metro and second, to its closest train station in [km]). Doing so leads to two intercept functions and nine interactions for each cluster; in total, 20 parameter functions must be estimated (each consisting of several spline coefficients).
The model set-up (i.e. the choice of the spatial covariance function, splines basis, knots and so on) was determined via a cross-validation study using B = 1000 subsamples and a sample size of 30 stations for both the in-sample and out-of-sample cases. Note that the choice of size is a trade-off between reliability and computational complexity. The stations were drawn from the first and the second cluster according to the proportions described above (i.e. 41.4% for cluster 'Work', and 58.6% for cluster 'Home') to obtain distinct in-sample and out-of-sample sets in each subsample.
F I G U R E 6 Time series and histograms for four meteorological variables from May 9 to October 31, 2017.
The histograms on the right-hand side refer to the time series on the left-hand side. A cumulative histogram is shown for precipitation In our case, an exponential covariance function fits the data the best. Hence, there are no anisotropic dependencies, which would indicate a prevalent direction of bike usage. Furthermore, for the basis function expansion, the B-splines approach was chosen, although the spectral time series analysis revealed periodic structures in the time series for all the stations. However, the B-splines allow the knot positions to be adapted according to the variation in the data along the functional dimension. For this study, we have determined the knot positions in such a manner that the standard deviatioñ(h) of the modelling error (s, t, h) remained less than three bikes. To be precise, the position of the break points was set to break points = {0, 5, 7.14, 9.29, 11.43, 13.57, 15.71, 17.86, 20, 24} o' clock with only a few B-splines supporting the morning and evening hours, as there is little variation in the spatiotemporal functional observations during these periods. In contrast, there is high variation in the functional observations during the middle of the day; thus, the splines basis functions are denser during this period.

Fixed effects
The two intercepts referring to the stations' cluster memberships are shown in Figure 7. The mean curve of̂W ork (h) shows about seven bikes during the night and in the evening, while the number of bikes increases during the day and has a first peak at approximately 11 o'clock, with about 17 bikes, and a second peak at 15 in the afternoon, with 19 bikes. The confidence interval has the widest range at the peaks. As expected,̂H ome (h) shows the opposite shape. In the beginning and the end of the day, approximately 17 bikes are located at stations from cluster 'Home'. The number grows smaller during the day, with the minimum of nine bikes occurring at approximately 9 in the morning. Between 10 and 14 o'clock, there are 10 bikes; afterwards, the number of bikes increases again. For both intercepts, the 95%-confidence interval along the entire function indicates a range of possible values of up to ±2 bikes around the mean. However, these intercept curves represent the expected number of bikes in the case that all other covariates would be zero. Thus, they have to be rather interpreted along with the regressive effects. For instance, in Figure 8, the temperature effects in both clusters are depicted (i.e. Work * Temperature (h) and̂H ome * Temperature (h)). In the areas for working, we observe that the number of allocated bikes changes by −0.2 bikes∕ • C in the night and evening. Between 8 and 15 o'clock, the influence of the temperature is not significantly different from zero. As a consequence, the higher the temperature in Helsinki, the fewer bikes are located at the stations from cluster 'Work' in the evening and night, while a temperature change between 8 and 15 o'clock has no significant effect on the number of bikes. Regarding residential areas, the interaction starts in the night at around 0.1 bikes∕ • C, decreases rapidly to −0.15 bikes∕ • C from 11 to 16 o'clock and increases afterwards up to 0.14 bikes∕ • C. The shape of the function is similar to the valley-like shape of the intercept̂H ome (h). Consequently, the higher the temperature in Helsinki, the more bikes are located at the stations from cluster 'Home' in the night and evening, while the number of allocated bikes decreases during the daytime.
Both interactions of the temperature with the cluster effects yield functions with shapes similar to that of the function of the intercept itself, meaning that an increase in the temperature amplifies the already existing mountain-and valley-like shape of the intercepts. The effect of the temperature shows that usage of the bike-sharing scheme is higher when it is warmer, as the change in the number of allocated bikes at stations from both clusters increases. This effect becomes more clear when combining each estimated functional interaction covari-atêW ork * Temperature (h) and̂H ome * Temperature (h) with their corresponding functional interceptŝ Work (h) and̂H ome (h) respectively. Figure 9 illustrates the predicted number of bikes of a station at cluster 'Work' or 'Home' depending on the temperature. Importantly, all other covariates and the random effects are considered to be zero. The solid black lines show the respective intercept curves. For example, the number of allocated bikes at a station from cluster 'Home' at noon (indicated by the vertical black line) ranges from roughly 7 to 9 bikes, merely due to temperature variation when at noon temperature is realistically assumed to range between 10 and 25 • C.
The influence of a bike-sharing station's elevation on the number of allocated bikes is given bŷW ork * Elevation (h) and̂H ome * Elevation (h), as shown in Figure 10. Both functions are significant and have negative signs along their entire domains. The influence of the interaction of elevation with the cluster 'Home' is around −0.4 bikes/m. There is little variation around the mean, which does not change significantly. In comparison, the interaction̂W ork * Elevation (h) varies more. Due to the negative sign, the number of allocated bikes at a station decreases as the elevation of the station increases, showing empirically that cyclists prefer cycling downhill over cycling uphill. The change in̂W ork * Elevation (h) in the afternoon emphasises that cyclists might use the bikes even less for cycling uphill in their free time.
Most surprising are the results regarding the effect of precipitation as no effects could be identified from the data. However, bear in mind that only 8 % of all precipitation observations were non-zero. Including precipitation as a dummy variable could perhaps enable to see an effect for precipitation. Furthermore, public transport was included in the model, as Jäppinen et al. (2013) suggested that it was one of the major factors influencing bike-sharing usage. The influence of the distance of a bike-sharing station to public transport varies as shown in Figure 11. While the esti-matêH ome * Train (h) is not significantly different from zero,̂W ork * Train (h) shows significant effects from 6 to 8 in the morning and from 15 to 17 in the afternoon. In the morning, the number of bikes increases by about 3 bikes/km with increasing distance from the closest train station. The opposite is the case in the afternoon, with −2.2 bikes/km with increasing distance from the closest train station. The intercept of the cluster 'Work' increases during the respective morning period and decreases in the afternoon. This change is amplified by the interaction̂W ork * Train (h), meaning that more bikes are allocated to bike-sharing stations further away from the train stations. This finding supports the hypothesis that commuters use the public bike-share scheme to overcome distances from the train station to their destinations, also known as the last-mile problem. Moreover, the greater the distance of a bike-sharing station from the closest metro station, the more bikes are allocated to this station during the morning until the mid-afternoon. Thus, stations closer to the public transportation system are used more frequently (i.e. there are fewer bikes allocated during the day). Interestingly, the influence of the distance to metro stations is only less than half the effect of the distance to train stations.

Random effects
Below, we discuss the results of random effects and the error term briefly. The random effects model explains the random variation in the data that is not explained by the fixed effects, that is, the mean of the process and the covariates. First, the temporal autoregressive dependence is estimated with the diagonal elements of the transition matrixĜ. Consequently, about half of the random variation observed in a day is explained by the previous day. To analyse the spatial dependence, Figure 12 shows the estimated range parameterŝ. Their median values are around 160 m. Since the exponential covariance function declines rapidly and the covariance is about 0.37 at distance , the spatial dependence of the process is weak in most cases-the largest values are in the range [264,415] m. Considering the distances between the bike-sharing stations in Helsinki, where the median distance is 3 km, the estimated range param-eterŝreveal that the spatial dependence is constrained to stations that are very close together. Whereas, the diagonal elements of the matrixV indicate a smaller random effect at night than during the day, it seems that the degree of the spatial dependence is mostly stable, with slightly more variability occurring during the day.

Out-of-sample forecasts
Finally, one question remains open-how well does the model fit the data? To answer this question, an out-of-sample study was conducted for each of the 1000 subsamples. More precisely, 30 randomly chosen locations which were not included for estimation are used to compute the out-of-sample fit. The out-of-sample RMSE is depicted in Figure 13 with an orange line. For comparison, two alternative models (i.e. a simple intercept-only model, and a model with all regressors but no interactions with the clusters) are shown with blue and red curves. Interestingly, the three functions have roughly the same shape but are shifted. The RMSE is the lowest from midnight to 7 o'clock and increases drastically between 7 and 9 o'clock. After reaching a maximum at 9 in the morning, it decreases until 16 o'clock and has a local maximum between 17 and 19 o'clock. The range is ±1.5 bikes over the course of the day. Considering the daily out-of-sample RMSE, we observe median values varying between 5 and 9 bikes. Moreover, the estimated functional error variancê2(h) is shown in Figure 13. Across the entire day, the maximal standard error is less than two bikes. During the night from about 1 o'clock to 5 o'clock, the variance is the smallest, showing that the variation in the data for all bike-sharing stations is low in the night. Also, from 22 in the evening until midnight, the standard deviation is less than one bike. In contrast, the variance is higher in the morning from 6 to 9 and in the afternoon from 15 to 19 o'clock. Here, the corresponding standard deviation is up to about 1.9 bikes.

CONCLUSION
This paper focused on station hire data from the bike-sharing system in Helsinki. With over 7 million observations from 140 bike-sharing stations taken at 5-min intervals, the analysis of bike-sharing station usage was brought to a new level, as the entire complex dataset was considered and knowledge about the changes in the influencing factors over the course of a day was inferred. We simultaneously accounted for spatial, temporal and spatiotemporal dependence by applying a geostatistical model in a functional framework. The model parameters were estimated using the implemented maximum-likelihood approach of the software package D-STEMv2 (see Wang et al., 2021). To supply computationally efficient estimated standard errors and guarantee a certain robustness against outliers, a subsampling approach was applied. Most findings about the influencing factors were in line with the results from existing literature, although comparability is limited due to the use of a different methodology and data. We have shown that the bike-sharing stations can be divided into two clusters depending on the similarities in their spatiotemporal functional observations. It is important to note that these similar functional observations cluster together in space too. The estimated parameters have shown that the morning rush hour is particularly difficult to model and predict. There is a mountain-like shape to the daily available bikes for stations belonging to predominantly working areas. By contrast, we observe a valley-like shape in living areas. This behaviour is different on weekends, where the daily peaks are also shifted towards the afternoon. Furthermore, we examined which weather conditions could have an influence. According to Eren and Uz (2020), precipitation should affect bike-sharing station usage the most among the weather conditions. Here, however, the influence of precipitation was not significant.
A drawback of the model can be seen in the out-of-sample RMSE values, which ranged between seven and nine bikes, giving it a range similar to that for the random effects. The random effects covered the unexplained variation in the station hire data. Unfortunately, out-of-sample validation of models from the literature was not found, meaning that the error mentioned above could not be compared and evaluated. Moreover, a spatiotemporal integer-valued model might improve the prediction accuracy, especially for less frequently used stations. To better understand the implications of different types of the bike-sharing station usage, future studies could address spatially varying covariates, perhaps providing insights into the cause of the separation into clusters. On the other hand, bike-sharing system station usage is governed by the decisions made by individuals and maybe even pure coincidences in their behaviour. In general, analysing the relationship between the number of allocated bikes at the bike-sharing stations and the proposed covariates produces a correlation and does not necessarily imply causality.
It is worthwhile to develop and apply complex models to spatiotemporal functional data from bike-sharing systems, as detailed knowledge can be gained and perhaps lead to future improvements in implemented bike-sharing systems.