
Identifying fragments of natural speech from the listener's MEG signals

Authors

  • Miika Koskinen,

    Corresponding author
    1. Brain Research Unit, MEG Core, and Advanced Magnetic Imaging Centre, Low Temperature Laboratory, Aalto University, Finland
    • P.O. Box 13000, FI-00076 AALTO, Finland
  • Jaakko Viinikanoja,

    1. Helsinki Institute for Information Technology HIIT, Aalto University, Finland
    2. Department of Information and Computer Science, Aalto University, Finland
  • Mikko Kurimo,

    1. Department of Information and Computer Science, Aalto University, Finland
  • Arto Klami,

    1. Helsinki Institute for Information Technology HIIT, Aalto University, Finland
    2. Department of Information and Computer Science, Aalto University, Finland
  • Samuel Kaski,

    1. Helsinki Institute for Information Technology HIIT, Aalto University, Finland
    2. Department of Information and Computer Science, Aalto University, Finland
    3. HIIT, Department of Computer Science, University of Helsinki, Finland
  • Riitta Hari

    1. Brain Research Unit, MEG Core, and Advanced Magnetic Imaging Centre, Low Temperature Laboratory, Aalto University, Finland

Abstract

Identifying the electrophysiological brain signatures of the continuous natural speech a subject is listening to remains a challenge for current signal analysis approaches. To relate magnetoencephalographic (MEG) brain responses to the physical properties of such speech stimuli, we applied canonical correlation analysis (CCA) and a Bayesian mixture of CCA analyzers to extract MEG features related to the speech envelope. Seven healthy adults listened to news for an hour while their brain signals were recorded with whole-scalp MEG. We found shared signal time series (canonical variates) between the MEG signals and speech envelopes at 0.5–12 Hz. By splitting the test signals into equal-length fragments of 2 to 65 s (corresponding to between 703 and 21 pieces of the total speech stimulus), we obtained better than chance-level identification for speech fragments longer than 2–3 s that were not used in model training. The analysis approach thus allowed identification of segments of natural speech by partially reconstructing the continuous speech envelope (i.e., the intensity variations of the speech sounds) from MEG responses, provided a means to empirically assess the time scales obtainable in speech decoding with the canonical variates, and demonstrated accurate identification of the heard speech fragments from the MEG data. Hum Brain Mapp, 2013. © 2012 Wiley Periodicals, Inc.
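
The identification scheme described in the abstract can be illustrated with a minimal sketch on synthetic data. This is not the authors' pipeline: the simulated "MEG" channels, the noise level, the three-lag envelope features, the fragment length in samples, and all function names (`fit_cca`, `identify_fragments`, `make_envelope`, `simulate_meg`) are assumptions made for illustration, and the Bayesian mixture of CCA analyzers is not covered. The sketch fits a plain regularized CCA on training data, projects held-out data onto the first canonical pair, and assigns each MEG fragment to the envelope fragment with which its canonical variate correlates most strongly.

```python
import numpy as np


def inv_sqrt(C):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(C)
    return V @ np.diag(w ** -0.5) @ V.T


def fit_cca(X, Y, reg=1e-6):
    """First canonical pair of X and Y via whitening followed by an SVD."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # A small ridge term keeps the covariance matrices invertible.
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n
    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky)
    # Projection weights, means, and the training canonical correlation.
    return Kx @ U[:, 0], Ky @ Vt[0], mx, my, s[0]


def identify_fragments(u, v, frag_len):
    """Match each fragment of u to the best-correlated fragment of v."""
    n_frag = len(u) // frag_len
    Uf = u[: n_frag * frag_len].reshape(n_frag, frag_len)
    Vf = v[: n_frag * frag_len].reshape(n_frag, frag_len)
    Uz = (Uf - Uf.mean(axis=1, keepdims=True)) / Uf.std(axis=1, keepdims=True)
    Vz = (Vf - Vf.mean(axis=1, keepdims=True)) / Vf.std(axis=1, keepdims=True)
    R = Uz @ Vz.T / frag_len  # Pearson correlation for every fragment pair
    return float(np.mean(R.argmax(axis=1) == np.arange(n_frag)))


# Synthetic stand-ins (assumptions): a smoothed-noise "envelope" and 20
# "MEG channels" that each mix the envelope with independent sensor noise.
rng = np.random.default_rng(0)
n_train, n_test, n_chan, frag_len = 8000, 4000, 20, 200
mix = rng.standard_normal(n_chan)


def make_envelope(n):
    env = np.convolve(rng.standard_normal(n), np.ones(5) / 5, mode="same")
    return (env - env.mean()) / env.std()


def simulate_meg(env):
    return np.outer(env, mix) + rng.standard_normal((len(env), n_chan))


def lagged(env, n_lags=3):
    # np.roll wraps around at the edges, which is acceptable for a toy example.
    return np.column_stack([np.roll(env, k) for k in range(n_lags)])


env_train, env_test = make_envelope(n_train), make_envelope(n_test)
wx, wy, mx, my, rho = fit_cca(simulate_meg(env_train), lagged(env_train))

# Canonical variates of held-out data, then fragment identification.
u_test = (simulate_meg(env_test) - mx) @ wx
v_test = (lagged(env_test) - my) @ wy
accuracy = identify_fragments(u_test, v_test, frag_len)
chance = 1.0 / (n_test // frag_len)
print(f"train canonical correlation: {rho:.2f}")
print(f"identification accuracy: {accuracy:.2f} (chance {chance:.2f})")
```

Shortening `frag_len` in this toy setting reduces the number of samples behind each correlation estimate, which is the same trade-off between fragment duration and identification accuracy that the abstract reports for real MEG data.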
