By recording auditory electrical brain potentials, we investigated whether the basic sound parameters (frequency, duration and intensity) are differentially encoded among speech vs. music sounds by musicians and non-musicians during different attentional demands. To this end, a pseudoword and an instrumental sound of comparable frequency and duration were presented. The accuracy of neural discrimination was tested by manipulations of frequency, duration and intensity. Additionally, the subjects’ attentional focus was manipulated by instructions to ignore the sounds while watching a silent movie or to attentively discriminate the different sounds. In both musicians and non-musicians, the pre-attentively evoked mismatch negativity (MMN) component was larger to slight changes in music than in speech sounds. The MMN was also larger to intensity changes in music sounds and to duration changes in speech sounds. During attentional listening, all subjects more readily discriminated changes among speech sounds than among music sounds as indexed by the N2b response strength. Furthermore, during attentional listening, musicians displayed larger MMN and N2b than non-musicians for both music and speech sounds. Taken together, the data indicate that the discriminative abilities in human audition differ between music and speech sounds as a function of the sound-change context and the subjective familiarity of the sound parameters. These findings provide clear evidence for top-down modulatory effects in audition. In other words, the processing of sounds is realized by a dynamically adapting network considering type of sound, expertise and attentional demands, rather than by a strictly modularly organized stimulus-driven system.