We present novel parallel algorithms for collision detection and separation distance computation for rigid and deformable models that exploit the computational capabilities of many-core GPUs. Our approach uses thread and data parallelism to perform fast hierarchy construction, updating, and traversal using tight-fitting bounding volumes such as oriented bounding boxes (OBB) and rectangular swept spheres (RSS). We also describe efficient algorithms to compute a linear bounding volume hierarchy (LBVH) and update it using refitting methods. Moreover, we show that tight-fitting bounding volume hierarchies offer improved performance on GPU-like throughput architectures. We use our algorithms to perform discrete and continuous collision detection, including self-collisions, as well as separation distance computation between non-overlapping models. In practice, our approach (gProximity) can perform these queries in a few milliseconds on a PC with an NVIDIA GTX 285 card, on models composed of tens or hundreds of thousands of triangles used in cloth simulation, surgical simulation, virtual prototyping, and N-body simulation. Overall, we observe more than an order of magnitude performance improvement over prior GPU-based algorithms.
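
To make the LBVH idea concrete: a linear BVH is commonly built by sorting primitives along a Morton (Z-order) curve, so that primitives close together in space end up close together in the sorted array. The following is a minimal sequential sketch of that ordering step only. It assumes triangle centroids already normalized to the unit cube and 10 bits of quantization per axis. The function names `expand_bits`, `morton3d`, and `sort_by_morton` are hypothetical, and the code is not the gProximity GPU implementation.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def expand_bits(v: int) -> int:
    """Spread the low 10 bits of v so that two zero bits follow each original bit."""
    v &= 0x3FF
    v = (v | (v << 16)) & 0x030000FF
    v = (v | (v << 8)) & 0x0300F00F
    v = (v | (v << 4)) & 0x030C30C3
    v = (v | (v << 2)) & 0x09249249
    return v


def quantize(c: float) -> int:
    """Map a coordinate in [0, 1] to an integer in [0, 1023] (assumed 10-bit grid)."""
    return int(min(max(c * 1024.0, 0.0), 1023.0))


def morton3d(p: Point) -> int:
    """Interleave the quantized x, y, z bits into a 30-bit Morton code."""
    x, y, z = (quantize(c) for c in p)
    return (expand_bits(x) << 2) | (expand_bits(y) << 1) | expand_bits(z)


def sort_by_morton(centroids: Sequence[Point]) -> List[int]:
    """Return primitive indices ordered along the Z-order curve."""
    return sorted(range(len(centroids)), key=lambda i: morton3d(centroids[i]))


if __name__ == "__main__":
    # Hypothetical centroids, already normalized to the unit cube.
    centroids = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.5, 0.5, 0.5)]
    print(sort_by_morton(centroids))
```

On a GPU, the per-primitive code computation is independent and maps naturally to one thread per primitive, with the sort handled by a parallel radix sort. The sketch above keeps everything sequential for readability.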