The reset disambiguation policy for navigating stochastic obstacle fields



We consider a stochastic shortest path problem in the presence of a dynamic learning capability. Specifically, a spatial arrangement of possible obstacles needs to be traversed as swiftly as possible, and the status of the obstacles may be disambiguated (at a cost) en route. No efficiently computable optimal policy is known, and many similar problems have been proven intractable. In this article, we adapt a policy that is optimal for a related problem and prove that this policy is also optimal for a restricted class of instances of our problem. For instances outside this class, the policy is generally suboptimal; nonetheless, it is both effective and efficiently computable. Examples and simulations are provided for a mine countermeasures application. Central to our approach is the Tangent Arc Graph, a polynomially sized topological superimposition of exponentially many visibility graphs. © 2011 Wiley Periodicals, Inc. Naval Research Logistics, 2011