The Branching-Course MPC Algorithm for Maritime Collision Avoidance

This article presents a new algorithm for short-term maritime collision avoidance (COLAV) named the branching-course model predictive control (BC-MPC) algorithm. The algorithm is designed to be robust with respect to noise on obstacle estimates, which is a significant source of disturbance when using exteroceptive sensors such as radars for obstacle detection and tracking. Exteroceptive sensors do not require vessel-to-vessel communication, which enables COLAV toward vessels not equipped with, for example, automatic identification system (AIS) transponders, in addition to increasing the robustness with respect to faulty information which may be provided by other vessels. The BC-MPC algorithm is compliant with Rules 8 and 17 of the International Regulations for Preventing Collisions at Sea (COLREGs), and favors maneuvers following Rules 13-15. This results in a COLREGs-aware algorithm which can ignore Rules 13-15 when necessary. The algorithm is experimentally validated in several full-scale experiments in the Trondheimsfjord in 2017 using a radar-based system for obstacle detection and tracking. The COLAV experiments show good performance in compliance with the desired algorithm behavior.

avoidance (COLAV) systems. Such COLAV systems must make the ASVs, like other vessels, follow the International Regulations for Preventing Collisions at Sea (COLREGs), which contain a set of rules on how vessels should behave in situations where there is a risk of collision with another vessel (Cockcroft & Lameijer, 2004). However, COLREGs is written for human interpretation with few quantitative rules, which makes it challenging to develop algorithms that capture the intention of COLREGs through machine decision-making.
COLAV algorithms have typically been divided into reactive and deliberate algorithms. Reactive algorithms are characterized by considering a limited amount of information, originally only currently available sensor information (Tan, Sutton, & Chudley, 2004), and employing little motion planning in a short time frame. This makes reactive algorithms computationally cheap, and able to react to sudden changes in the environment. Examples include vessels making sudden unpredicted maneuvers, late detection of obstacles, and so forth.
However, since reactive algorithms consider a limited amount of information and employ little motion planning, they tend to make suboptimal choices in complex situations, which makes them sensitive to local minima. Examples of reactive algorithms are the velocity obstacles (Fiorini & Shiller, 1998; Kuwata, Wolf, Zarzhitsky, & Huntsberger, 2014) and the dynamic window (DW; Fox, Burgard, & Thrun, 1997) algorithms. Deliberate algorithms consider more information and plan for a longer time frame, which results in better choices at the cost of increased computational requirements. Examples of deliberate algorithms include the A* (Blaich, Rosenfelder, Schuster, Bittel, & Reuter, 2012; Hart, Nilsson, & Raphael, 1968) and the rapidly exploring random tree (La Valle, 1998) algorithms.
The previously clear border between reactive and deliberate algorithms has become somewhat artificial, since few algorithms utilize only currently available sensor information. However, the idea that reactive algorithms are capable of responding quickly to changes in the environment, while deliberate algorithms are capable of performing optimal motion planning over a longer time frame, is still relevant. We therefore rather use the terms "short-term" and "long-term" to distinguish the algorithms. In a practical COLAV system, both short-term and long-term algorithms are useful. For long time frames, all available information should be included, while one may use a less detailed vessel model for planning. For short-term COLAV, one can include less spatial and temporal information but may need to use a more detailed model of the vessel to ensure dynamically feasible maneuvers.
By combining short-term and long-term algorithms in a hybrid architecture, the benefits of both can be combined, ensuring responsiveness, feasibility, and optimality. An example of a hybrid architecture with three COLAV levels is shown in Figure 1.
The topmost level, named path planning, is intended to produce a nominal path or trajectory from the initial position to the goal. The spatial and temporal distance between the initial and goal positions may be large, allowing only for a limited complexity in this algorithm. For instance, moving obstacles could be neglected at this level. The mid-level COLAV algorithm tries to follow this nominal path or trajectory while performing COLAV with respect to all obstacles, and is characterized as a long-term COLAV algorithm. COLREGs is a natural part of this level, since it may be complex to decide the appropriate action with respect to COLREGs. The mid-level algorithm produces a modified trajectory which is passed to the short-term COLAV layer. This layer performs short-term COLAV, making sure to avoid obstacles that perform sudden maneuvers or that are detected too late to be handled by the mid-level algorithm, while also ensuring that the maneuvers are feasible with respect to the dynamic constraints of the vessel. The short-term layer can also act as a backup solution to avoid collisions in cases where the mid-level algorithm fails to produce feasible trajectories, for instance, due to time constraints or numerical issues (Eriksen & Breivik, 2017b). Furthermore, the short-term layer should be able to avoid collision in emergency situations, for example, when obstacles do not maneuver in accordance with COLREGs.
FIGURE 1 A hybrid COLAV architecture with three levels. The support functions provide relevant information for the COLAV algorithms, including obstacle trajectories, static obstacles from electronic nautical charts (ENC), and situational awareness in the form of COLREGs situations. The short-term layer does not currently utilize information from ENC or situational awareness [Color figure can be viewed at wileyonlinelibrary.com]
ERIKSEN ET AL.
COLAV algorithms depend on information about obstacle position, speed, and course to be able to avoid collisions. One possible source of such information is using automatic identification system (AIS) transponders. AIS is a vessel-to-vessel communication system where vessels transmit their current position and velocity to other vessels carrying AIS transponders (IMO, 2018). Passenger ships and vessels with a gross tonnage of over 300 are required to carry AIS transponders. This is of course valuable information when it comes to navigation and COLAV at sea. However, AIS transponders usually rely on satellite navigation and data inputs from the user, which results in the possibility of transmitting inaccurate or invalid data (Harati-Mokhtari, Wall, Brooks, & Wang, 2007). Also, vessels or objects not equipped with AIS transponders will not be detected. A more robust approach to obtain information about the environment is to employ exteroceptive sensors, which have the advantage of not relying on any infrastructure or collaboration with the obstacles to detect them. A commonly used exteroceptive sensor at sea is radar.
However, the data from a radar usually include a fair amount of noise, which makes this sensor more complex and difficult to work with than AIS (Eriksen, Wilthil, Flåten, Brekke, & Breivik, 2018). On-board radars have been used for full-scale COLAV experiments based on the A* algorithm in Schuster, Blaich, and Reuter (2014), and using a modified version of the DW algorithm in . In Elkins, Sellers, and Monach (2010) and Kuwata et al. (2014), other exteroceptive sensors such as cameras and lidar are used for COLAV.
Model predictive control (MPC) has long been a well-known and proven tool for motion planning and COLAV for, among others, ground and automotive robots (Gray, Ali, Gao, Hedrick, & Borrelli, 2013; Keller, Haß, Seewald, & Bertram, 2015; Ögren & Leonard, 2005), aerospace applications (Kuwata & How, 2011), and underwater vehicles (Caldwell, Dunlap, & Collins, 2010). In recent years, MPC has also been applied for COLAV in the maritime domain, both using sample-based approaches where one considers a finite space of control inputs (Hagen, Kufoalor, Brekke, & Johansen, 2018; Johansen, Perez, & Cristofaro, 2016; Švec et al., 2013) and conventional gradient-based search algorithms (Abdelaal & Hahn, 2016; Eriksen & Breivik, 2017b). None of these algorithms, however, considers the amount of noise which we expect to encounter using a radar-based tracking system. Gradient-based algorithms have the benefit of exploring the entire control input space, but the complexity of the COLAV problem can make it difficult to guarantee that a feasible solution will be found within the time requirements (Eriksen & Breivik, 2017b). This makes sample-based approaches well suited for short-term COLAV. In Benjamin, Leonard, Curcio, and Newman (2006, 2010), a protocol-based COLAV algorithm using interval programming is presented. The algorithm optimizes over multiple functions considering different behaviors, for example, waypoint following and adherence to different parts of COLREGs, by combining them in an objective function with adaptive weights. The algorithm does, however, use vessel-to-vessel communication to obtain obstacle information, and is not necessarily well suited for use with exteroceptive sensors.

| The International Regulations for Preventing Collisions at Sea
COLREGs regulate how vessels should behave in situations where there exists a risk of collision. There are in total 38 rules, where Rules 8 and 13-17 are the most relevant ones for designing COLAV algorithms for ASVs, although the rest must also be addressed in a COLREGs-compliant system. Rules 8 and 13-17 can be summarized as:
8: This rule requires, among other things, that maneuvers applied in situations where a risk of collision exists should be large enough to be readily observable for other vessels. Small consecutive maneuvers should hence be avoided.
13: In an overtaking situation, where a vessel is approaching another from an angle of more than 22.5° abaft the other vessel's beam, the overtaking vessel is required to keep out of the way of the overtaken vessel. The overtaking vessel is allowed to pass on either side. However, in a case where the overtaken vessel is required to avoid collision with another vessel, it may be required to make a starboard maneuver. To avoid blocking the path of the overtaken vessel in such a situation, we consider it most suitable to overtake a vessel on her port side.
14: In a head-on situation, where two vessels approach each other on reciprocal or nearly reciprocal courses (a margin of ±6° is often used), both vessels are required to do starboard maneuvers and pass the other vessel on her port side.
Rules 13-15. The interested reader is referred to Cockcroft and Lameijer (2004) for more details on the COLREGs rules.

| Contributions
The authors of this article have focused on short-term and reactive COLAV for ASVs for the last few years, starting with a modified version of the DW algorithm designed for use with autonomous underwater vehicles (AUVs; Eriksen, Breivik, Pettersen, & Wiig, 2016). This algorithm was adapted for use with high-speed ASVs, and tested in conjunction with a radar-based tracking system (Wilthil, Flåten, & Brekke, 2017) successfully demonstrating closed-loop radar-based COLAV in full-scale experiments

| Outline
The rest of the article is structured as follows: Section 2 describes modeling and control of ASVs, Section 3 presents the  (Faltinsen, 2005;Fossen, 2011).
The conventional approach to modeling ASVs is by using the three-degree-of-freedom (3DOF) model (1) (Fossen, 2011). There exist many versions of the model (1) (Fossen, 2011), but they require that the vessel operates in the displacement region. For the Telemetron ASV, this would require a maximum operating speed of approximately 3.5 m/s (Eriksen & Breivik, 2017a). This is a considerable limitation, and we therefore instead use a control-oriented non-first-principles model (2) developed for high-speed ASVs (Eriksen & Breivik, 2017a), valid for the displacement, semidisplacement, and planing regions. In the model, χ is the vessel course and β is the sideslip. For more details on the model, see Eriksen and Breivik (2017a).

| ASV control design
As shown in Figure 1, the COLAV system is built on top of the vessel controllers. Hence, the performance of the COLAV system can be limited by the performance of the vessel controllers. It is therefore beneficial to use high-performance vessel controllers ensuring that the maneuvers that the COLAV system specifies are properly executed, not limiting the performance of the COLAV system.
The model (2) can be used in control design. In particular, using it for model-based feedforward in speed and yaw rate is shown to provide good performance (Eriksen & Breivik, 2017a). A controller named the feedforward feedback (FF-FB) controller is presented in Eriksen and Breivik (2017a), which combines model-based feedforward terms with a gain-scheduled proportional-integral feedback controller for controlling the vessel speed and yaw rate. For the BC-MPC algorithm, we need a controller capable of following a speed and course trajectory. The FF-FB controller has proven to have high performance in experiments (Eriksen & Breivik, 2017a), so we extend the FF-FB controller to include course control in the control law (4), where $K_p$ is a matrix of proportional gains, $K_i > 0$ is a diagonal matrix of integral gains, and $\tilde{U} = U - U_d$, $\tilde{r} = r - r_d$, and $\tilde{\chi} = \Upsilon(\chi - \chi_d)$ are the speed, yaw rate, and course errors, respectively. The function $\Upsilon: \mathbb{R} \to S^1$ maps an angle to the domain $[-\pi, \pi)$.
FIGURE 4 Vessel variables. The superscripts $(\cdot)^n$ and $(\cdot)^b$ denote the NED and body reference frames (Fossen, 2011), respectively. The variables $N$, $E$, and $\psi$ represent the vessel pose, $u$, $v$, and $r$ represent the body-fixed vessel velocity, and $U$ is the vessel speed over ground. The course $\chi$ is the sum of the heading $\psi$ and the sideslip $\beta$.
In the control law (4), we use the desired yaw rate $r_d$ and its derivative $\dot{r}_d$. Through (3), the relation between the course and yaw rate is stated as $\dot{\chi} = r + \dot{\beta}$. The interested reader is referred to  for more details on the speed and course controller.
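The angle-mapping function used for the course error can be illustrated with a short sketch. This is a minimal Python example, not the implementation used on the vessel; the function names `wrap_to_pi` and `course_error` are hypothetical and chosen here for illustration only.

```python
import math


def wrap_to_pi(angle: float) -> float:
    """Map an angle in radians to the half-open interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def course_error(chi: float, chi_d: float) -> float:
    """Course error chi - chi_d, wrapped so that it never exceeds half a turn."""
    return wrap_to_pi(chi - chi_d)


# A course of 350 deg against a desired course of 10 deg is a -20 deg error,
# not a +340 deg error.
error_deg = math.degrees(course_error(math.radians(350.0), math.radians(10.0)))
```

Wrapping the error before it enters the feedback terms keeps the controller from commanding a turn the long way around when the course and desired course lie on opposite sides of north.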

| THE BC-MPC ALGORITHM
The BC-MPC algorithm is intended to avoid collisions with moving obstacles while respecting the dynamic constraints of the vessel to ensure feasible maneuvers, which is ideal for short-term COLAV. The algorithm is based on MPC, and plans vessel-feasible trajectories with multiple maneuvers where only the first maneuver is executed. The trajectories have continuous acceleration, which is beneficial for vessel controllers utilizing model-based feedforward terms, such as (4). To fit well with tracking systems based on exteroceptive sensors such as radars, the algorithm is designed to be robust with respect to noisy obstacle estimates.
Furthermore, the algorithm is designed with the short-term perspective of COLREGs in mind, namely situations where the stand-on requirement may need to be ignored to avoid collision in compliance with Rule 17. The algorithm is also modular, so it can easily be tailored for different applications.
The BC-MPC algorithm can be described by two steps, which will be explained in detail in the following sections:
1. Generate a search space consisting of feasible trajectories with respect to the dynamic constraints of the vessel.
2. Discretize the search space and compute an objective function value on the trajectories. The optimal trajectory is then selected as the one with the lowest objective function value.
The BC-MPC algorithm architecture is shown in Figure 5. The algorithm inputs a desired trajectory, which can originate from either another COLAV algorithm or directly from a user. The guidance function receives the desired trajectory, and computes a desired acceleration given a vessel state and time specified by the trajectory generation. The trajectory generation block creates a set of possible vessel trajectories, given an initial vessel state, initial desired velocity, and a desired acceleration from the guidance function. A tracking system provides obstacle estimates, which are used to calculate a part of the objective function. The optimization block computes the optimal trajectory based on an objective function, and outputs this as a desired velocity trajectory to the vessel controller (4).

| Trajectory generation
The search space consists of a number of trajectories, each consisting of a sequence of subtrajectories each containing one maneuver.
Having multiple maneuvers in each trajectory enables the algorithm to consider complex scenarios which may require a time-limited speed and/or course change, before selecting a new speed and/or course. In addition, it will allow the algorithm to consider a complete avoidance situation, consisting of an evasive maneuver and a plan for converging back to the desired trajectory. Each trajectory is defined by a desired velocity trajectory containing a speed and course trajectory with continuous acceleration, and feedback-corrected predicted pose and velocity trajectories.

| Trajectory generation: A single step
As mentioned, each trajectory consists of a sequence of maneuvers, resulting in trajectories that branch out from each other. Hence, the trajectory generation can be divided into repeatable steps. At each step, a set of subtrajectories, each containing one maneuver, is computed given an initial vessel configuration, initial time, and some step-specific parameters:
• The number of speed maneuvers $N_U$,
• The number of course maneuvers $N_\chi$,
• The time allowed for changing the actuator input, named the ramp time $T_{ramp}$,
• The maneuver time length in speed $T_U$ and course $T_\chi$,
• The total step time length $T$.
FIGURE 5 BC-MPC algorithm overview. The algorithm inputs a desired trajectory from a mid-level COLAV algorithm or an operator, obstacle estimates from a tracking system, and the current vessel state from a navigation system, and outputs a desired velocity trajectory for the vessel controllers
We start by generating the desired velocity trajectories, which should be feasible with respect to actuator rate and magnitude saturations. To ensure feasibility with respect to the actuator rate saturations, we start from the model (2) and consider the control inputs that can be reached within the ramp time given the actuator limits, where $T_{ramp} > 0$ is the ramp time, $\tau_0$ is the current control input, $\tau_{max}$ and $\tau_{min}$ are the maximum and minimum control input, respectively, and $\dot{\tau}_{max}$ and $\dot{\tau}_{min}$ are the maximum and minimum control input rate of change, respectively. The saturation function is applied element-wise, with $(\cdot)_i$ denoting element $i$ of a vector. Following this, we create a set of possible accelerations, which is then sampled uniformly to create a discrete set of candidate maneuver accelerations $A_d$. An example is shown in Figure 6, where $A_d$ is sampled with $N_U = 3$ speed samples and $N_\chi = 5$ course samples.
Given the acceleration samples, we create a set of $N_U$ motion primitives for speed based on the piecewise-linear speed acceleration trajectories (12), which are defined by the sampled acceleration for speed, the speed maneuver length $T_U$, and the total trajectory length $T > 0$. Similarly, we define $N_\chi$ course motion primitives by the piecewise-linear course acceleration trajectories (13), which are defined by the sampled acceleration for course motion and the course maneuver length $T_\chi > 0$. For notational simplicity and without loss of generality, we assume zero initial time $t_0 = 0$ in (12) and (13). The acceleration trajectories and parameters for $N_U = 5$ speed motion primitives and $N_\chi = 5$ course motion primitives are illustrated in Figure 7.
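The uniform sampling of the set of possible accelerations can be sketched as follows. This is a minimal Python illustration under stated assumptions: the acceleration bounds are taken as given intervals, the helper names `sample_uniform` and `candidate_accelerations` are hypothetical, and the handling of a single sample (the interval midpoint) is a choice made here rather than something specified by the algorithm.

```python
from typing import List, Tuple


def sample_uniform(lower: float, upper: float, n: int) -> List[float]:
    """Return n evenly spaced samples on [lower, upper], endpoints included."""
    if n < 1:
        raise ValueError("at least one sample is required")
    if n == 1:
        return [0.5 * (lower + upper)]  # assumption: use the midpoint
    step = (upper - lower) / (n - 1)
    return [lower + k * step for k in range(n)]


def candidate_accelerations(
    speed_bounds: Tuple[float, float],
    course_bounds: Tuple[float, float],
    n_speed: int,
    n_course: int,
) -> List[Tuple[float, float]]:
    """Combine speed and course acceleration samples into candidate pairs."""
    speed_samples = sample_uniform(speed_bounds[0], speed_bounds[1], n_speed)
    course_samples = sample_uniform(course_bounds[0], course_bounds[1], n_course)
    return [(a_u, a_chi) for a_u in speed_samples for a_chi in course_samples]


# Hypothetical bounds: 3 speed samples and 5 course samples give 15 candidates.
candidates = candidate_accelerations((-1.0, 1.0), (-0.2, 0.2), 3, 5)
```

With three speed samples and five course samples, one step of the trajectory generation considers up to 15 candidate maneuvers before any infeasible combinations are removed.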
Notice that the integral of the course acceleration maneuvers is zero, hence if the maneuver is initialized with zero yaw rate the maneuver will end with zero yaw rate. The motion primitives (12) and (13) are chosen as linear piecewise functions to ensure a continuous acceleration with a minimum complexity.
Based on the acceleration trajectories, we create trajectories for the desired speed, yaw rate, and course by integrating the expressions (12) and (13), as stated in (14). The initial values of the desired speed, yaw rate, and course in (14) are chosen independently of the measured vessel state. This implies that we do not include feedback in the desired trajectories. Furthermore, as in Section 2, the vessel sideslip is neglected. This could, however, be included by using a vessel model including sideslip. A numerical example of five speed and five course trajectories is shown in Figures 8 and 9, where a maneuver length of 5 s is used for both speed and course. Vessels at sea usually maneuver by either keeping a constant speed and course or by performing a speed and/or course change and continuing with this new speed and course for some time. By selecting a zero initial yaw rate in (14), we ensure that maneuvers start and end with constant-course motion, which mimics this behavior while also producing maneuvers that should be readily observable for other vessels, as required by Rule 8 of COLREGs.
Following this, we create a union set of the desired velocity trajectories, denoted $\mathcal{U}_d$, as stated in (15).
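To make the idea of piecewise-linear acceleration primitives concrete, the following Python sketch builds one speed and one course primitive and integrates them with the Euler method. The exact breakpoints of (12) and (13) are not reproduced here. The shapes below (a trapezoid for speed, and a positive trapezoid followed by a mirrored negative one for course) are assumptions chosen only so that the acceleration is continuous and the course acceleration integrates to zero. All function names and numerical values are hypothetical.

```python
def trapezoid(t, peak, t_ramp, t_len):
    """Continuous piecewise-linear pulse: ramp up, hold, ramp down, zero elsewhere.

    Assumes t_len >= 2 * t_ramp.
    """
    if t < 0.0 or t >= t_len:
        return 0.0
    if t < t_ramp:
        return peak * t / t_ramp
    if t < t_len - t_ramp:
        return peak
    return peak * (t_len - t) / t_ramp


def speed_acceleration(t, a_u, t_ramp, t_u):
    """Assumed speed primitive: one trapezoid of length t_u."""
    return trapezoid(t, a_u, t_ramp, t_u)


def yaw_acceleration(t, a_chi, t_ramp, t_chi):
    """Assumed course primitive: positive then negative trapezoid (zero integral)."""
    half = 0.5 * t_chi
    if t < half:
        return trapezoid(t, a_chi, t_ramp, half)
    return -trapezoid(t - half, a_chi, t_ramp, half)


def integrate_primitive(u0, chi0, a_u, a_chi, t_ramp, t_u, t_chi, t_total, dt):
    """Euler-integrate the primitives into desired speed, yaw rate, and course."""
    n_steps = int(round(t_total / dt))
    u, r, chi = u0, 0.0, chi0  # zero initial yaw rate
    for k in range(n_steps):
        t = k * dt
        chi += r * dt
        u += speed_acceleration(t, a_u, t_ramp, t_u) * dt
        r += yaw_acceleration(t, a_chi, t_ramp, t_chi) * dt
    return u, r, chi


u_end, r_end, chi_end = integrate_primitive(
    u0=5.0, chi0=0.0, a_u=0.5, a_chi=0.1,
    t_ramp=1.0, t_u=5.0, t_chi=5.0, t_total=5.0, dt=0.1,
)
```

In this sketch the maneuver ends with zero yaw rate and a changed course, so the vessel returns to constant-course motion at the end of the maneuver.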

Given the desired velocity trajectories, we calculate the feedback-corrected pose trajectories. To do this, we first predict the resulting speed and course trajectories. This is done by simulating the closed-loop error dynamics of the vessel and vessel controllers using the desired velocity trajectories as the input. In this article, we approximate the error dynamics using first-order linear models, which may seem to be rough approximations. However, this is justified by noting that the model-based speed and course controller demonstrates very good control performance for the Telemetron ASV, resulting in small control errors. Furthermore, the control errors are dominated by environmental disturbances, which are difficult to model without increasing the complexity to an unnecessarily high level. The initial condition of the closed-loop error models introduces feedback in the prediction through the current vessel speed and course, $U_0$ and $\chi_0$, respectively. Similarly to (15), we construct a set of predicted velocity trajectories $\bar{\mathcal{U}}$.
Combinations of speed and course trajectories that were considered infeasible when forming the set of desired velocity trajectories $\mathcal{U}_d$ are also removed from $\bar{\mathcal{U}}$.
Following this, vessel position trajectories are calculated from the predicted velocity trajectories using the kinematic model $\dot{\bar{p}} = \bar{U} [\cos\bar{\chi}, \sin\bar{\chi}]^T$, which is integrated using the current vessel position as the initial condition. The feedback-corrected predicted vessel pose trajectories are finally combined in the set $\bar{\mathcal{H}}$. To summarize, a single step of a trajectory is defined by the set of desired velocity trajectories $\mathcal{U}_d$, the set of predicted velocity trajectories $\bar{\mathcal{U}}$, and the set of predicted pose trajectories $\bar{\mathcal{H}}$.
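The prediction step can be sketched as below: a first-order model for the speed and course errors, initialized from the current vessel state, followed by Euler integration of the kinematics. This is only an illustrative Python sketch. The time constants `tau_u` and `tau_chi`, the function names, and the numerical values are assumptions introduced here and are not the values used in the experiments.

```python
import math


def wrap_to_pi(angle):
    """Map an angle in radians to [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def predict_pose(u_d, chi_d, u0, chi0, north0, east0, dt, tau_u=2.0, tau_chi=2.0):
    """Predict (north, east, course, speed) along a desired velocity trajectory.

    The speed and course errors decay as first-order systems, starting from
    the error between the current vessel state and the desired trajectory.
    """
    e_u = u0 - u_d[0]
    e_chi = wrap_to_pi(chi0 - chi_d[0])
    north, east = north0, east0
    poses = []
    for u_ref, chi_ref in zip(u_d, chi_d):
        u = u_ref + e_u          # predicted speed
        chi = chi_ref + e_chi    # predicted course
        north += u * math.cos(chi) * dt
        east += u * math.sin(chi) * dt
        poses.append((north, east, chi, u))
        e_u *= 1.0 - dt / tau_u        # Euler step of de/dt = -e / tau
        e_chi *= 1.0 - dt / tau_chi
    return poses


# Vessel already on the desired trajectory: 10 steps of 0.5 s at 5 m/s due north.
on_track = predict_pose([5.0] * 10, [0.0] * 10, 5.0, 0.0, 0.0, 0.0, 0.5)
# Vessel starting 2 m/s too slow: the predicted speed converges toward 5 m/s.
lagging = predict_pose([5.0] * 10, [0.0] * 10, 3.0, 0.0, 0.0, 0.0, 0.5)
```

The second call shows how the initial error feeds the current vessel state into the prediction while the desired trajectory itself stays free of feedback.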

| Trajectory generation: The full trajectory generation
A full trajectory consists of multiple subtrajectories, each containing one maneuver and constructed using the single-step procedure. This naturally forms a tree structure, with nodes representing vessel states and edges representing subtrajectories. The depth of the tree will be equal to the desired number of maneuvers in each trajectory.
The tree is initialized with the initial state as the root node, which the single-step procedure is performed on, generating a number of subtrajectories and leaf nodes. Following this, the single-step procedure is performed on each of the leaf nodes, adding the next subtrajectory and leaf nodes to the existing trajectories and expanding the tree depth. This procedure is repeated until the tree has the desired depth, resulting in each trajectory having the desired number of maneuvers. Using the same number of speed and course maneuvers at each level would result in the tree growing exponentially with the number of levels. To limit the growth, we therefore allow for choosing a different number of speed and course maneuvers at each level, for instance keeping the speed constant in all levels except the first, only allowing the speed to be changed during the first maneuver of a trajectory.
The remaining parameters can also be chosen differently for each level, and in principle the acceleration trajectories (12) and (13) can also be designed using different structures. However, we choose to use the same acceleration trajectory structure for each level, while also keeping the ramp time and maneuver time lengths constant. This leaves only the step time length and number of speed and course maneuvers as parameters that can change throughout the tree depth.
Choosing different step time lengths can be considered as an MPC input blocking scheme, requiring that the step lengths are integer multiples of the algorithm sample time.
A full trajectory generation can hence be defined by the following parameters:
• An initial vessel state including the current desired velocity.
• The number of maneuvers in each trajectory, or levels, defined as $B$.
• The step time length at each level, the ramp time $T_{ramp}$, and the speed and course maneuver lengths $T_U$ and $T_\chi$, respectively.
• The number of speed and course maneuvers at each level.
A set of predicted vessel pose trajectories with $B = 3$ levels is shown in Figure 10. Selecting the trajectory generation parameters is a complex task, and it is difficult to provide a general guideline on how to do this. However, we attempt to provide some thoughts and insight on this. Increasing the number of maneuvers in each trajectory $B$ increases the complexity of the solutions the algorithm looks for, and also increases the computational requirements. In general, most COLAV situations should be solvable by a few maneuvers, and selecting three maneuvers will allow the algorithm to plan for
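The effect of the per-level parameters on the size of the search space can be checked with a few lines of Python. This is a sketch with hypothetical helper names and example numbers. The 5 s sample time corresponds to an algorithm rate of 0.2 Hz.

```python
from math import prod


def count_trajectories(maneuvers_per_level):
    """Number of full trajectories for a list of (n_speed, n_course) per level.

    This is an upper bound when infeasible combinations are removed.
    """
    return prod(n_u * n_chi for n_u, n_chi in maneuvers_per_level)


def step_lengths_valid(step_lengths, sample_time, tol=1e-9):
    """Check that every step length is an integer multiple of the sample time."""
    return all(
        abs(t / sample_time - round(t / sample_time)) < tol for t in step_lengths
    )


# Same branching at all three levels: the tree grows exponentially.
full = count_trajectories([(3, 5), (3, 5), (3, 5)])
# Speed changes only in the first maneuver: far fewer trajectories.
reduced = count_trajectories([(3, 5), (1, 5), (1, 5)])
```

In this example, restricting speed changes to the first level reduces the number of candidate trajectories from 3375 to 375.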

| Calculating a desired acceleration
In the single-step trajectory generation, a desired acceleration can be used to include a desired maneuver in the search space. We therefore use a guidance algorithm to ensure that there exists a trajectory in the search space that converges toward the desired trajectory input to the BC-MPC algorithm. To achieve this, we use a modified version of a path tracking algorithm ensuring vessel convergence to a curved path (Breivik & Fossen, 2004). The control law is based on line of sight (LOS) guidance (Fossen, 2011), together with defining a desired point on the path whose velocity is controlled, named the path particle (PP).
The desired course is given by (21), where $\chi_{path}$ is the path angle at the desired point, $e$ is the cross-track error, and $\Delta > 0$ is the lookahead distance. The PP velocity along the path is given by (22), where $U$ is the vessel speed, $\gamma_s > 0$ is a tuning parameter, and $s$ is the along-track distance. The guidance scheme is illustrated in Figure 11.
The control law (22) controls the speed along the path $U_{PP}$ as a function of the vessel speed, course, and the along-track distance to the PP, letting the vessel converge toward the path with a constant speed. We rather want to be able to follow a desired trajectory by controlling the vessel speed and course based on the desired trajectory. We therefore fix the PP at the desired position on the trajectory, given the current time, and by reformulating (22) we obtain a desired vessel speed given the trajectory velocity, as stated in (23).
FIGURE 11 LOS guidance scheme. The path particle propagates along the path with the speed $U_{PP}$. The vessel course is denoted as $\chi$, while $\chi_{path}$ denotes the current tangential path course and $\chi_{d,LOS}$ denotes the desired course. The variables $e$, $s$, and $\Delta$ are the cross-track error, the along-track distance, and the lookahead distance, respectively
It is in general difficult to obtain detailed models of high-speed ASVs, while time delays, sensor noise, and modeling uncertainties are shown to cause robustness issues when using feedback-linearizing controllers (Eriksen & Breivik, 2017a). Hence, the simplicity of (21), (22), and (23) is appealing when a guarantee of stability and convergence is not required.
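A lookahead-based LOS law of the kind referred to above can be sketched in Python as follows. The expressions used are the forms commonly found in the LOS guidance literature and are assumed here; they are not copied from (21) and (22), and the sign convention for the cross-track error (positive to starboard of the path) is also an assumption. Function and parameter names are hypothetical.

```python
import math


def los_desired_course(chi_path, cross_track_error, lookahead):
    """Assumed LOS law: steer toward a point a lookahead distance ahead on the path."""
    return chi_path + math.atan2(-cross_track_error, lookahead)


def path_particle_speed(speed, chi, chi_path, along_track, gamma_s):
    """Assumed path particle law: along-path vessel speed plus an along-track correction."""
    return speed * math.cos(chi - chi_path) + gamma_s * along_track


# On the path, the desired course equals the path course.
on_path = los_desired_course(0.0, 0.0, 50.0)
# 50 m to starboard of a northbound path with a 50 m lookahead: steer 45 deg to port.
off_path = los_desired_course(0.0, 50.0, 50.0)
# Vessel aligned with the path and level with the particle: particle matches vessel speed.
u_pp = path_particle_speed(5.0, 0.0, 0.0, 0.0, 0.1)
```

A larger lookahead distance gives a gentler approach to the path, while a smaller one gives a more aggressive approach.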

| Selecting the optimal trajectory
Given the set of feasible trajectories, we solve an optimization problem to select the optimal trajectory. We start by defining a cost function (25) to assign a cost to each trajectory, where $w_{al}$, $w_{av}$, and $w_t$ are tuning parameters that control the weighting of the different objective terms, which can be selected heuristically by simulating the algorithm to obtain the desired behavior. In general, the weights should satisfy $w_{av} \gg w_{al}$ to ensure that the algorithm prioritizes avoiding obstacles over following the desired trajectory, while $w_t$ can be tuned to control how responsive, and sensitive to noise, the algorithm will be.
Using (25), we define the optimization problem whose solution $u_d^*(t)$ is the optimal desired velocity trajectory to be used as the reference for the vessel controllers. The optimization problem is solved by simply calculating the cost over the finite discrete set of trajectories and choosing the one with the lowest cost.
The next sections describe the different terms of the objective function (25). Notice that we strive to avoid using discontinuities and logic to improve the robustness with respect to obstacle estimate noise.
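The selection step amounts to an exhaustive search over a finite set. The Python sketch below assumes a weighted-sum cost with one alignment, one avoidance, and one transitional term per candidate. The weights, the term values, and the dictionary layout are hypothetical and only illustrate the mechanics.

```python
def total_cost(candidate, w_al, w_av, w_t):
    """Assumed weighted sum of the three objective terms for one candidate."""
    return (
        w_al * candidate["align"]
        + w_av * candidate["avoid"]
        + w_t * candidate["transition"]
    )


def select_optimal(candidates, w_al=1.0, w_av=100.0, w_t=0.5):
    """Evaluate every candidate and return the index and cost of the cheapest one."""
    costs = [total_cost(c, w_al, w_av, w_t) for c in candidates]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]


candidates = [
    # Follows the desired trajectory but passes close to an obstacle.
    {"align": 0.0, "avoid": 0.3, "transition": 0.0},
    # Deviates from the desired trajectory and keeps clear of the obstacle.
    {"align": 2.0, "avoid": 0.0, "transition": 1.0},
]
best_index, best_cost = select_optimal(candidates)
```

Because the avoidance weight is much larger than the alignment weight in this example, the evasive candidate is selected even though it deviates from the desired trajectory and changes the planned maneuver.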

| Trajectory alignment
The alignment between the desired trajectory and a candidate trajectory is used in the objective function (25) to motivate the algorithm to follow the desired trajectory. Given a desired trajectory $p_d(t): \mathbb{R}^+ \to \mathbb{R}^2$, required to be $C^1$, we obtain a desired course as

| Obstacle avoidance
Obstacle avoidance is achieved by penalizing candidate trajectories with small distances to obstacles. We define three regions around the obstacles, namely, the collision, safety, and margin regions. The idea behind this is to make it possible to use different gradients on the penalty depending on how close the ownship is to the obstacle. This, together with avoiding logic and discontinuities, should improve the robustness with respect to noise on the obstacle estimates.
We define a time-varying vector (29) between obstacle $i$ and a predicted vessel trajectory. The obstacle position in future time is computed under the common assumption that obstacles will keep their current speed and course (Johansen et al., 2016; Kuwata et al., 2014), which is a reasonable assumption for relatively short time periods. More complex techniques can also be applied for predicting the future position of obstacles, for instance based on historic AIS data (Dalsnes, Hexeberg, Flåten, Eriksen, & Brekke, 2018) or by estimating the turn rate of the obstacles. Using (29), we define the distance $d_i$ and relative bearing $\beta_i$ to obstacle $i$ given a predicted vessel trajectory, where the relative bearing is computed using the atan2 function and $\chi_i(t)$ is the course of obstacle $i$. The distance $d_i$ and relative bearing $\beta_i$ are illustrated in Figure 12.
FIGURE 12 Distance $d_i$ and relative bearing $\beta_i$ to obstacle $i$. The ownship is marked O/S
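The constant-velocity obstacle prediction and the distance and relative bearing computation can be sketched in Python as follows. The sketch assumes that the relative bearing is the direction from the obstacle to the ownship, measured relative to the obstacle course, so that a bearing of zero means the ownship is straight ahead of the obstacle. This convention and all names are assumptions made for illustration.

```python
import math


def wrap_to_pi(angle):
    """Map an angle in radians to [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def predict_obstacle(north0, east0, speed, course, t):
    """Obstacle position after t seconds, assuming constant speed and course."""
    return (
        north0 + speed * math.cos(course) * t,
        east0 + speed * math.sin(course) * t,
    )


def distance_and_relative_bearing(own_pos, obstacle_pos, obstacle_course):
    """Distance to the obstacle and bearing of the ownship relative to its course."""
    d_north = own_pos[0] - obstacle_pos[0]
    d_east = own_pos[1] - obstacle_pos[1]
    distance = math.hypot(d_north, d_east)
    bearing = wrap_to_pi(math.atan2(d_east, d_north) - obstacle_course)
    return distance, bearing


# Obstacle heading east at 2 m/s, predicted 10 s ahead.
obstacle_later = predict_obstacle(0.0, 0.0, 2.0, math.pi / 2.0, 10.0)
# Ownship 100 m north of a northbound obstacle: straight ahead of it.
d_ahead, beta_ahead = distance_and_relative_bearing((100.0, 0.0), (0.0, 0.0), 0.0)
# Ownship 100 m east of a northbound obstacle: on its starboard beam.
d_beam, beta_beam = distance_and_relative_bearing((0.0, 100.0), (0.0, 0.0), 0.0)
```

Evaluating these quantities at each time step of a predicted vessel trajectory gives the inputs needed by a bearing-dependent penalty function.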

The obstacle distance and relative bearing is used to calculate a penalty function, which we use to define the avoidance function as The penalty function can be designed in a variety of ways, with the simplest possibly being a circular penalty function. When using a circular penalty function, the relative bearing to the obstacle does not matter, and the function can be defined as are the margin, safety, and collision region sizes, respectively, while γ ∈ ( ) 0, 1 1 is a tuning parameter controlling the cost gradient inside the margin and safety regions. The circular penalty function is illustrated in Figure 13.
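One way to realize a circular penalty with three regions is a continuous piecewise-linear function of the obstacle distance. The Python sketch below is an assumed shape, not the published definition: the penalty is 1 inside the collision region, falls linearly to a tuning value `gamma1` at the edge of the safety region, and then falls linearly to 0 at the edge of the margin region. The parameter names and numerical values are hypothetical.

```python
def circular_penalty(distance, d_collision, d_safety, d_margin, gamma1):
    """Assumed continuous, piecewise-linear penalty on the obstacle distance.

    Requires 0 < d_collision < d_safety < d_margin and 0 < gamma1 < 1.
    """
    if distance < d_collision:
        return 1.0
    if distance < d_safety:
        fraction = (distance - d_collision) / (d_safety - d_collision)
        return 1.0 + (gamma1 - 1.0) * fraction
    if distance < d_margin:
        return gamma1 * (d_margin - distance) / (d_margin - d_safety)
    return 0.0


# Hypothetical region sizes in meters and gradient parameter.
regions = dict(d_collision=50.0, d_safety=100.0, d_margin=500.0, gamma1=0.1)
inside = circular_penalty(20.0, **regions)
at_safety_edge = circular_penalty(100.0, **regions)
outside = circular_penalty(600.0, **regions)
```

With a small `gamma1`, the penalty changes slowly across the wide margin region, so small fluctuations in the obstacle estimate produce only small changes in cost, while the steeper slope inside the safety region strongly discourages close passes.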
A circular penalty function is useful for static objects where there is no preference on which side of the object one should pass. For moving vessels, it should be considered to be more dangerous to be in front of the vessel than on the side or behind it, and COLREGs also introduce preferences on which side one should pass an obstacle. An intuitive approach to handle COLREGs would be to use logic to decide the applicable rule with respect to each obstacle, but this conflicts with the idea of designing the algorithm with high robustness to noisy obstacle estimates. Also noting that the BC-MPC algorithm is intended to be used in a hybrid architecture with a mid-level algorithm taking a more proactive approach to the COLREGs rules, we here focus our attention toward a smooth and continuous approximation. In a short-term COLAV perspective, it is not beneficial to constrain the algorithm to strictly follow the head- where * D 0 is given as The actual parameters of the obstacle function should be selected such that the safety region represent the desired clearance, while the collision region represents the absolute minimum clearance required.
The margin region should be selected as the distance at which we want the ownship to initiate a maneuver, and should be considerably larger than the safety region. This, together with a fairly small obstacle gradient parameter $\gamma_1$, will make the algorithm less sensitive toward fluctuating estimates of obstacle position, speed, and course. To reduce the number of parameters to select, we let the clearance in front of the obstacle be twice that behind it.

| Transitional cost
An important design criterion for the algorithm is that it should be robust with respect to noise on the obstacle estimates, making it well suited for use with tracking systems based on exteroceptive sensors.
By introducing a transitional cost in the objective function, a certain level of cost reduction will be required to make the algorithm change the current planned maneuver. This should increase the robustness to such noise. Notice that the transitional cost term introduces discontinuities, which we previously stated that we would like to avoid to improve the robustness with respect to noise on obstacle estimates. The transitional cost term does, however, not rely on obstacle estimates, making the term insensitive to noise on the obstacle estimates and justifying the use of a discontinuous transitional cost function.
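The hysteresis effect of a transitional cost can be sketched as follows. This is a simplified illustration, not the cost term used in the algorithm: each candidate is reduced to a hypothetical pair of (first course command, cost without the transitional term), and a fixed weight `w_transition` is added whenever the candidate's first course command differs from the one in the previous plan.

```python
def select_trajectory(candidates, previous_course_cmd, w_transition, tol=1e-9):
    """Pick the candidate with the lowest total cost.

    candidates: list of (course_cmd, base_cost) tuples (hypothetical representation).
    A fixed transitional cost w_transition is added to every candidate whose
    first course command differs from the previously planned one.
    """
    def total_cost(candidate):
        course_cmd, base_cost = candidate
        changed = abs(course_cmd - previous_course_cmd) > tol
        return base_cost + (w_transition if changed else 0.0)

    return min(candidates, key=total_cost)
```

In this sketch, a new maneuver is selected only when its cost is lower than that of the current maneuver by more than `w_transition`. Since the added term depends only on the previous plan and not on the obstacle estimates, noise on those estimates does not affect it.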

| EXPERIMENTAL RESULTS
Full-scale experiments were conducted in the Trondheimsfjord, Norway, on October 12, 2017. This section describes the experimental setup and results.

| Experimental setup
The Telemetron ASV, briefly introduced in Section 2, was used as the ownship. The vessel is fitted with a SIMRAD Broadband 4G™ Radar, and a Kongsberg Seatex Seapath 330+ GNSS-aided inertial navigation system was used during the experiments. See Table 1 for more details on the vessel specifications. The BC-MPC algorithm was implemented in discrete time using the Euler method to discretize the algorithm, see Table 2 for the algorithm parameters. The parameters were mostly selected heuristically through simulations of the algorithm and the vessel of interest, as described in Section 3. We used a user-specified straight-line trajectory with constant speed as the desired trajectory, and the elliptical COLREGs penalty function for obstacle avoidance. The BC-MPC algorithm was run at a rate of 0.2 Hz.
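To illustrate what Euler discretization means in this context, the sketch below predicts a trajectory with forward Euler steps. A purely kinematic point model with constant speed and turn rate stands in for the vessel model of Section 2, which is an assumption made for illustration. The function names and the step length are likewise hypothetical.

```python
import math

def euler_step(state, speed, turn_rate, h):
    """One forward Euler step of an assumed kinematic model.

    state: (north, east, course). The derivatives are evaluated at the
    current state, which is what makes this a forward Euler step.
    """
    north, east, course = state
    return (north + h * speed * math.cos(course),
            east + h * speed * math.sin(course),
            course + h * turn_rate)

def predict_trajectory(state, speed, turn_rate, h, steps):
    """Return the list of states obtained from repeated Euler steps."""
    trajectory = [state]
    for _ in range(steps):
        state = euler_step(state, speed, turn_rate, h)
        trajectory.append(state)
    return trajectory
```

Note that the run rate of 0.2 Hz corresponds to one algorithm iteration every 5 s. The step length used for prediction inside each iteration is a separate parameter.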
The implementation consists of a radar-based tracking system to provide obstacle estimates, the BC-MPC algorithm, and the model-based speed and course controller described in Section 2.2 for low-level vessel control. The system was implemented on a processing platform with an Intel® i7 3.4 GHz CPU running Ubuntu 16.04 Linux, using the Robot Operating System (ROS; Quigley et al., 2009). Figure 15 shows the implementation architecture.
The tracking system receives spoke detections from the radar through a UDP interface. The detections are transformed to a local reference frame and clustered together to form one measurement per obstacle, which is a common assumption for many tracking algorithms. The obstacle measurements are used by the radar tracker, which is based on a probabilistic data association filter (PDAF). See Wilthil et al. (2017) for more details on the tracking system.
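The transformation and clustering step can be sketched as below. The clustering method of the actual tracking system is described in Wilthil et al. (2017) and is not reproduced here. This sketch assumes detections given as range and bearing relative to the ownship heading, and uses a simple single-linkage rule with a hypothetical gap threshold `max_gap`.

```python
import math

def to_local_frame(range_m, bearing, own_north, own_east, own_heading):
    """Transform one detection (range, bearing relative to heading) to north-east."""
    angle = own_heading + bearing
    return (own_north + range_m * math.cos(angle),
            own_east + range_m * math.sin(angle))

def cluster_centroids(points, max_gap):
    """Group points linked by gaps of at most max_gap; return one centroid per group."""
    unvisited = list(range(len(points)))
    centroids = []
    while unvisited:
        stack = [unvisited.pop()]
        members = []
        while stack:
            i = stack.pop()
            members.append(i)
            # Collect all unvisited points within max_gap of point i.
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= max_gap]
            for j in near:
                unvisited.remove(j)
            stack.extend(near)
        north = sum(points[i][0] for i in members) / len(members)
        east = sum(points[i][1] for i in members) / len(members)
        centroids.append((north, east))
    return centroids
```

Each returned centroid then plays the role of a single position measurement for one obstacle.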
The BC-MPC algorithm interfaces the tracking system using a ROS service, which enables request-response functionality for providing obstacle estimates. The BC-MPC algorithm outputs a desired velocity trajectory to the model-based speed and course controller, which specifies a throttle and rudder command to the onboard control system through a TCP interface. The on-board control system has an electromechanical actuator for controlling the motor throttle, while the rudder command is handled by steering the outboard engine angle to the desired angle using a PD controller and a hydraulic actuation system.
The system receives AIS messages over VHF to obtain ground-truth trajectories for the vessels involved in the experiments. Notice that these are subject to the uncertainty of the navigation system providing the AIS data on the given vessels. They are, however, expected to be much more precise than the estimates from the radar-based tracking system. Figure 16a shows the inside of the Telemetron ASV, with the navigation system and processing platform.
The Kongsberg Seatex Ocean Space Drone 1 (OSD1) was used as the obstacle. This was originally an offshore lifeboat, which has been fitted with a full control and navigation system for testing autonomous control systems, shown in Figure 16b.
The vessel is 12 m long, and has a mass of approximately 10 metric tons.
During the experiments, the OSD1 was steered on a constant course with a speed of approximately 2.5 m/s (five knots) using an autopilot. In addition to the OSD1, several commercial and leisure craft were present in the area, affecting some of the scenarios.
We included four different scenarios in the experiments:

1. Head on. The ownship and the OSD1 approach each other on reciprocal courses. COLREGs requires both vessels to maneuver to starboard.

2. Crossing from starboard. The OSD1 approaches the ownship from the starboard side. COLREGs deems the ownship the give-way vessel, which should preferably maneuver to starboard and pass abaft of the OSD1.

3. Overtaking. The ownship approaches the OSD1 from behind with a higher speed. COLREGs requires the ownship to avoid collision by passing on either side. We prefer, however, to pass the OSD1 on its port side by doing a port maneuver.

4. Crossing from port (CP). A scenario similar to crossing from starboard, but here the OSD1 approaches the ownship from the port side. In this case, COLREGs deems the ownship the stand-on vessel, and the OSD1 is supposed to avoid collision. The OSD1 will, however, keep its speed and course, resulting in Rule 17 revoking the stand-on obligation and requiring the ownship to avoid collision, preferably avoiding maneuvering to port.

In the following sections, we present three head-on scenarios, two crossing from starboard scenarios, one overtaking scenario and one crossing from port scenario.

| Head on: Experiments 1.1-1.3
The first experiments we performed were a number of head-on scenarios. In these scenarios, the desired trajectory inputted to the BC-MPC algorithm is a straight-line trajectory approaching the OSD1 on a reciprocal course, resulting in a collision with a relative bearing of 0° if the desired trajectory is followed.
With respect to COLREGs, both vessels should perform starboard maneuvers. However, in our case, the OSD1 violates COLREGs by keeping its speed and course constant throughout the scenario.
To verify that the BC-MPC algorithm worked as it was supposed to, we first used AIS for providing obstacle estimates in Experiment 1.1. The OSD1 is equipped with an AIS transceiver providing low-noise estimates of the position, speed, and course, originating from a Kongsberg Seatex SeaNav 300 navigation system. As shown in Figure 17, we successfully avoid collision in this scenario. This is, however, achieved by performing a port maneuver, which violates the desired COLREGs behavior of maneuvering to starboard. This is most likely caused by the ownship approaching the obstacle on the port side of the desired trajectory, which together with the slightly angled obstacle trajectory makes a port maneuver attractive. The ownship is, however, either in a head-on or stand-on situation, but it is difficult to program an explicit understanding of this without introducing logic or discontinuous functions, which would reduce the robustness to noise. In addition, the algorithm is intended to handle short-term situations, in which the vague possibility of an obstacle making a port maneuver should not be neglected. The elliptical COLREGs obstacle function employs a soft COLREGs interpretation, which allows the algorithm to consider all actions in emergency situations, including maneuvering to port when the algorithm believes this is the safest. However, when making such nonconventional maneuvers, the algorithm requires a significantly increased obstacle clearance, which can be tuned. Notice also that in a hybrid architecture, the mid-level algorithm should have a harder interpretation of COLREGs which would maneuver to starboard at an earlier point, avoiding the situation in full as long as nothing unforeseen happens. Moreover, the maneuver is smooth with a sufficient course change to be readily observable for other vessels.
Figure 18a shows the distance between the OSD1 and the ownship, and the predicted future distance given the trajectory the […]
Following this experiment, we performed several experiments using the radar-based tracking system for providing obstacle estimates. Figure 19 shows the results from Experiment 1.2, an experiment similar to the one performed with AIS. In this experiment, the ownship performs a starboard maneuver to avoid collision, as preferred by COLREGs. As shown in the figure, there is a fair amount of noise on the obstacle estimates, in particular the course estimate. This is confirmed by the course estimate shown in Figure 20b, which shows course fluctuations often in excess of 20°. Despite this, the ownship performs a smooth maneuver, which demonstrates the BC-MPC algorithm's robustness with respect to noise on the obstacle estimates. This is also shown in Figure 20a, where the predicted distance to the obstacle varies considerably without making the algorithm decide on a new maneuver.
The last head-on scenario, Experiment 1.3, is shown in Figure 21, where we approach the OSD1 from northeast. The predicted future obstacle trajectories at each iteration are omitted from the following figures to improve the readability. This scenario was slightly more complex, as two other vessels unexpectedly entered the scenario. One of these was a high-speed leisure craft approaching from the west, while the other was a high-speed passenger ferry approaching from northeast, behind the ownship.
The leisure craft did not have AIS, and we therefore do not have a ground-truth trajectory for this vessel. Figure 22 shows an image captured by a drone during this experiment, with algorithm visualization embedded in the lower left corner. As in the previous scenario, we avoid the OSD1 by doing a starboard maneuver.
Following this, we approach the desired trajectory before the […] smooth maneuvers, which again shows robustness with respect to obstacle estimate noise. Figure 24 shows similar plots for the Trondheimsfjord II ferry. Notice that it takes some time before the tracking system detects that the passenger ferry makes a maneuver, which is due to a limited sample rate on the radar combined with some latency in the PDAF tracking system.

| Crossing from starboard: Experiments 2.1-2.2
Crossing from starboard is a more complex scenario than the head-on scenario. We performed two experiments with the OSD1 approaching on collision course from starboard. The scenarios were constructed such that the desired trajectory coincides with the obstacle trajectory, resulting in a collision with a relative bearing of -90° if the desired trajectory is followed. In such a scenario, the ownship is deemed the give-way vessel and should avoid collision by preferably maneuvering to starboard and passing abaft of the stand-on vessel.
In Experiment 2.1, shown in Figure 25, we avoided collision with the OSD1 by maneuvering to port and passing in front of the obstacle. This can be considered as suboptimal with respect to the preferred action being passing abaft of the obstacle. Passing in front in a crossing situation is, however, not strictly forbidden by Rule 15. Furthermore, the minimum distance to the obstacle is 214.0 m, meaning that the obstacle is only slightly inside the margin region. With this in mind, we consider this maneuver to be safe with similar arguments as for Experiment 1.1.
In Experiment 2.2, shown in Figure 26, we avoided collision by passing abaft of the OSD1, as preferred by COLREGs. In this experiment, the minimum distance to the obstacle was 106.2 m, significantly closer than when we passed in front of the obstacle. This is still only slightly inside the margin region, remembering that the elliptical COLREGs penalty function with the tuning in Table 2 is smaller abaft an obstacle than in front of an obstacle.
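Minimum distances such as those reported above can be obtained from logged trajectories. The sketch below shows one way of doing this, assuming that the ownship and obstacle tracks are sampled at the same time stamps. The track format `(t, north, east)` and the function name are assumptions made for illustration, not the post-processing used for the experiments.

```python
import math

def minimum_distance(own_track, obstacle_track):
    """Return (minimum distance, time of minimum) for two time-aligned tracks.

    Each track is a list of (t, north, east) tuples with identical time stamps.
    """
    if len(own_track) != len(obstacle_track) or not own_track:
        raise ValueError("tracks must be non-empty and of equal length")
    best_distance = math.inf
    best_time = None
    for (t, own_n, own_e), (_, obs_n, obs_e) in zip(own_track, obstacle_track):
        distance = math.hypot(own_n - obs_n, own_e - obs_e)
        if distance < best_distance:
            best_distance, best_time = distance, t
    return best_distance, best_time
```

The result is limited by the sampling interval of the tracks, so the true closest point of approach may lie between two samples.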

| Crossing from port: Experiment 4
The last scenario we tested was a crossing from port, which may be the most complex of the experiments presented in this article. This situation was generated similarly to the crossing from starboard situation, but with a relative bearing of 90° instead of -90°. Here, COLREGs deems the ownship as the stand-on vessel, while the OSD1 is deemed the give-way vessel. However, the OSD1 keeps its speed and course, requiring the ownship to avoid collision. In such a situation, COLREGs recommends the ownship to avoid maneuvering to port, favoring a starboard maneuver.

| Experiment summary
The BC-MPC algorithm has been tested in four different scenarios, […] avoids the use of logic, which provides the algorithm with increased robustness toward obstacle estimate noise. This is an important property when using tracking systems based on exteroceptive sensors such as, for example, radar.

| SIMULATION RESULTS
To complement the experimental results presented in the previous section, in this section we present simulation results in more complex situations. The simulations include multiobstacle scenarios where multiple COLREGs rules apply simultaneously, also with obstacles that maneuver in accordance with the rules.

| Simulation setup
The simulations are performed with the same tuning parameters as the experiments, shown in Table 2. To focus on the algorithm performance itself, we present the algorithm with noise-free measurements of the obstacle position, course and speed during the simulations. To challenge the algorithm, we present it with four multiobstacle scenarios:

1. Head on and crossing from starboard.

2. Head on and crossing from port.

3. Head on and crossing from starboard with an extra obstacle.

4. Simultaneous crossing from starboard and port.

5.2 | Head on and crossing from starboard: Simulation 1
In this scenario, shown in Figure 29, the ownship faces a simultaneous head-on and crossing from starboard situation, which both require the ownship to maneuver to starboard. With respect to COLREGs, the crossing obstacle has a stand-on obligation with respect to the ownship, and a give-way obligation with respect to the head-on obstacle. In this situation, the crossing obstacle should maneuver toward starboard, and pass behind the head-on obstacle, which should maneuver to starboard in accordance with the head-on situation with the ownship. In Figure 29a, the obstacles do not maneuver, and the BC-MPC algorithm chooses a maneuver to starboard to avoid the head-on obstacle and pass behind the crossing obstacle. In Figure 29b, the obstacles maneuver in accordance with COLREGs, and the ownship makes a starboard maneuver and passes behind the crossing obstacle. The maneuver is, however, somewhat smaller than when the obstacles do not maneuver, which is caused by the head-on obstacle cooperating in achieving the required clearance.

| Head on and crossing from port: Simulation 2
In this scenario, shown in Figure 30, the ownship faces a simultaneous head-on and crossing from port situation. This situation is more complex than Simulation 1, since the crossing obstacle requires the ownship to stand on in accordance with Rule 17, while the head-on obstacle requires the ownship to maneuver to starboard in accordance with Rule 14. It is, however, dangerous to ignore a head-on obligation to stand on, and the algorithm should therefore prioritize the head-on situation. The head-on obstacle should give way to the crossing obstacle and maneuver to starboard in accordance with the head-on situation with the ownship, while the crossing obstacle should give way to the ownship. In Figure 30a, the obstacles do not maneuver, and the ownship applies a clear and large maneuver to starboard to avoid the head-on obstacle, and avoid collision with the crossing obstacle. When the obstacles maneuver, shown in Figure 30b, the BC-MPC algorithm evaluates the predicted clearance given how the obstacles maneuver, and chooses to stand on. It is clear that the head-on obstacle performs a large maneuver to pass behind the crossing obstacle, which combined with the crossing obstacle's maneuver makes it safe for the ownship to stand on.
Notice, however, that time delays in estimating the obstacles' position, speed and course would delay the ownship in detecting that the obstacles maneuver. This could, depending on the amount of time delay, make the BC-MPC algorithm initiate a maneuver to starboard, as when the obstacles did not maneuver.

| Head on and crossing from starboard with an extra obstacle: Simulation 3
In this scenario, shown in Figure 31, the ownship faces a head-on obstacle, and one crossing obstacle from starboard. In addition, there is another vessel approaching the ownship with an opposing course on a parallel path. The head-on obstacle has a stand-on obligation with respect to the crossing vessel, and a head-on obligation with respect to the ownship. The crossing obstacle has a give-way obligation with respect to the head-on obstacle, and a stand-on obligation with respect to the ownship. The ownship is in a head-on situation with the head-on obstacle, and also has to give way to the crossing obstacle. The third obstacle is considered to have sufficient clearance to the ownship and the two other obstacles to not be considered to be in a collision situation. In […] pass behind the crossing obstacle. Following this, the ownship makes a port maneuver to avoid interfering with the third obstacle. This is an example of a situation where including future maneuvers in the search space is beneficial, since the algorithm has to plan for the port maneuver already when making the starboard maneuver in order to see the full picture. Notice that the ownship has a slow convergence toward the desired trajectory in Figure 31a, which is due to the transitional cost term introducing a cost that is just too large for the algorithm to change to a trajectory with a faster convergence. When the obstacles maneuver, the BC-MPC algorithm chooses a similar, but smaller maneuver, as shown in Figure 31b.

| Simultaneous crossing from starboard and port: Simulation 4
In this scenario, shown in Figure 32, the ownship faces a simultaneous crossing from starboard and port. The obstacles are in head-on situations with each other. With respect to the ownship, the port obstacle has a give-way obligation, while the starboard obstacle has a stand-on obligation. The ownship is obliged to stand on with respect to the port obstacle, and give way to the starboard obstacle. In Figure 32a, the obstacles do not maneuver, and the ownship makes a large maneuver to starboard to pass behind the obstacle crossing from starboard and avoid interfering with the obstacle crossing from port. This simulation is unrealistic since the obstacles collide with each other, but it does nevertheless provide insight into the performance of the BC-MPC algorithm. When the obstacles maneuver, as shown in Figure 32b, the ownship still maneuvers to starboard and passes behind the obstacle crossing from starboard. The maneuver is, however, performed with two subsequent turns, where the second starboard turn is made when the starboard obstacle turns to port to pass parallel to the other crossing obstacle.

| Simulation summary
Key points and numbers from the simulations are presented in Table 4.
To present some insight into how the BC-MPC algorithm performs in situations with multiple obstacles, and when multiple […] The BC-MPC algorithm managed to solve all the scenarios satisfactorily, while complying with Rules 13-15 of COLREGs. In the situations where the ownship was given both stand-on and give-way obligations, the give-way obligation was prioritized, except in Simulation 2 when the obstacles maneuvered. The specific reason for this was that the head-on obstacle made a large avoidance maneuver to fulfill its give-way obligation with respect to a crossing obstacle, which allowed the ownship to achieve a sufficient clearance while obeying the stand-on obligation.
When the obstacles maneuver in accordance with COLREGs, the BC-MPC algorithm generally chooses smaller maneuvers. As shown in Table 4, the minimum distance to head-on obstacles is approximately the same both when the obstacles maneuver and do not maneuver, except for Simulation 2 where the head-on obstacle makes a large maneuver. There is not a clear trend on how the minimum distance to crossing vessels is influenced when obstacles maneuver, but the number of simulations is anyhow too small to draw any statistical conclusions. Nevertheless, this indicates that the BC-MPC algorithm has an understanding of the joint responsibility in achieving the required clearance since it achieves approximately the same clearance regardless of whether the obstacles maneuver or not.

| CONCLUSION AND FURTHER WORK
We […] The authors have continued the work on the BC-MPC algorithm, specifically on including static obstacles and providing smoother trajectories with clearer maneuvers, which will be published in Eriksen […]