The human medial temporal lobe (MTL) is an important part of the limbic system, and its substructures play key roles in learning, memory, and neurodegeneration. The MTL includes the hippocampus (HC), amygdala (AG), parahippocampal cortex (PHC), entorhinal cortex, and perirhinal cortex. These structures are complex in shape and have low between-structure intensity contrast, which makes them difficult to segment manually in magnetic resonance images. This article presents a new segmentation method that combines active appearance modeling with patch-based local refinement to automatically segment specific substructures of the MTL, including the HC, AG, PHC, and entorhinal/perirhinal cortex, from MRI data. Appearance modeling, which relies on eigendecomposition to analyze statistical variations in image intensity and shape across the study population, is used to capture the global shape characteristics of each structure of interest with a generative model. Patch-based local refinement, which uses nonlocal means to compare local image intensity properties, is then applied along the structure borders to improve structure delimitation. Combining nonlocal regularization with global shape constraints in this manner can allow more accurate segmentation of the structures. Validation experiments against manually defined labels demonstrate that the new segmentation method is computationally efficient, robust, and accurate. In a leave-one-out validation on 54 normal young adults, the method yielded a mean Dice κ of 0.87 for the HC, 0.81 for the AG, 0.73 for the anterior parts of the parahippocampal gyrus (entorhinal and perirhinal cortex), and 0.73 for the posterior parahippocampal gyrus. Hum Brain Mapp 35:377–395, 2014. © 2012 Wiley Periodicals, Inc.
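
The Dice κ values reported above quantify the voxel overlap between an automatic segmentation and a manually defined label: twice the number of voxels the two share, divided by the sum of their sizes. The following is a minimal sketch of that computation in Python with NumPy, assuming both labels are binary masks on the same voxel grid. The function name `dice_kappa` and the toy masks are illustrative assumptions and are not taken from the article.

```python
import numpy as np


def dice_kappa(auto_mask, manual_mask):
    """Dice overlap 2|A ∩ M| / (|A| + |M|) between two binary masks."""
    a = np.asarray(auto_mask, dtype=bool)
    m = np.asarray(manual_mask, dtype=bool)
    if a.shape != m.shape:
        raise ValueError("masks must be defined on the same voxel grid")
    total = int(a.sum()) + int(m.sum())
    if total == 0:
        # Both masks are empty; treat this as perfect agreement (a convention).
        return 1.0
    overlap = int(np.logical_and(a, m).sum())
    return 2.0 * overlap / total


# Toy 3D example (hypothetical data, not from the study).
auto = np.zeros((4, 4, 4), dtype=bool)
manual = np.zeros((4, 4, 4), dtype=bool)
auto[1:3, 1:3, 1:3] = True    # 8 voxels labeled automatically
manual[1:3, 1:3, 1:4] = True  # 12 voxels labeled manually, containing the 8 above

kappa = dice_kappa(auto, manual)
print(f"Dice kappa = {kappa:.2f}")  # 2 * 8 / (8 + 12) = 0.80
```

In a leave-one-out setting such as the one described, a value like this would be computed per subject and per structure and then averaged. A value of 1 indicates identical masks and 0 indicates no shared voxels.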