A model is proposed for recent ground-based observations of auroral roar emissions detected at 2Ωe and 3Ωe, where Ωe is the local electron cyclotron frequency in the source region, which lies between 200 and 500 km above the Earth's surface. Electron cyclotron maser emission is a likely mechanism for these emissions because it naturally produces coherent radiation at harmonics of Ωe. A theory for auroral roar emissions has already been proposed, whereby maser-generated second-harmonic (X2) and third-harmonic (X3) x mode radiation is amplified in the source region by multiple reflections off the walls of the density cavity in which it is produced. After many reflections, the X2 and X3 waves propagate along the density cavity to a ground-based observer. However, ray-tracing calculations presented here demonstrate that maser-generated X2 and X3 radiation is highly likely to be reabsorbed at lower altitudes and thus to be undetectable at the ground. An indirect maser mechanism is proposed instead, in which maser-generated z mode waves at Ωe grow to high levels in the source region and then undergo repeated nonlinear wave-wave coalescence to produce second- and third-harmonic waves that propagate directly to the ground. The z mode waves must satisfy the necessary kinematic constraints to produce observable second- and third-harmonic radiation. The dependence of the z mode maser on the temperature and functional form of the unstable electron distribution is discussed, along with the conditions required for the coalescence processes to proceed and produce the observed levels of radiation.
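
For orientation, the kinematic constraints on a wave-wave coalescence can be written in their generic three-wave form: frequency and wave vector are both conserved when two waves merge into a third. The sketch below is illustrative only. The subscripts 1 to 4 are labels introduced here, not notation from the paper, and the specific pairing shown (two z mode waves near Ωe giving a product near 2Ωe, followed by a further coalescence with another z mode wave to reach about 3Ωe) is an assumed reading of "repeated coalescence". In addition, each product wave must itself satisfy the dispersion relation of a mode that can propagate to the ground.

```latex
% Generic three-wave coalescence: waves 1 and 2 merge into wave 3.
% Subscripts are illustrative labels (an assumption, not the paper's notation).
\begin{align}
  \omega_1 + \omega_2 &= \omega_3, &
  \mathbf{k}_1 + \mathbf{k}_2 &= \mathbf{k}_3 .
\end{align}

% Assumed illustrative case: two z mode waves near \Omega_e coalesce.
\begin{equation}
  \omega_1 \approx \omega_2 \approx \Omega_e
  \quad\Longrightarrow\quad
  \omega_3 \approx 2\,\Omega_e .
\end{equation}

% Assumed second step: the product coalesces with another z mode wave (label 4).
\begin{equation}
  \omega_3 + \omega_4 \approx 2\,\Omega_e + \Omega_e = 3\,\Omega_e ,
  \qquad
  \mathbf{k}_3 + \mathbf{k}_4 = \mathbf{k}_5 .
\end{equation}
```

The wave vector condition is what restricts which z mode waves can contribute: only pairs whose wave vectors sum to that of an allowed harmonic wave can coalesce.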