Singing Voice Separation

Singing voice separation (SVS) from polyphonic music audio is a difficult task, even for a human listener. Our SVS system achieves state-of-the-art performance, and the extracted singing voices can be used for various purposes:

  • By running pitch tracking on the separated singing voices, we can build databases for query by singing/humming (QBSH) from polyphonic music on sites such as YouTube.
  • The separated singing voices can serve as the ground truth for singing scoring based on pitch, vibrato, and other paralinguistic effects.
  • The separated singing voices can be aligned with lyrics to create synchronized text for karaoke.
  • The same concept can be applied to speech enhancement: human speech is extracted from a noisy environment to improve downstream processing such as speech recognition.
  • The same concept can be applied to extract a specific instrument (such as guitar) from the original polyphonic music.

Our SVS system ranked first in MIREX 2015.
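
To make the pitch-tracking application in the list above more concrete, here is a minimal sketch of frame-level pitch estimation by autocorrelation, applied to a signal that is assumed to be an already separated voice. It is not the tracker used in our system. The function names (estimate_pitch_hz, hz_to_midi), the 80 to 1000 Hz search range, and the synthetic 200 Hz test tone are all assumptions made for illustration. The conversion to MIDI semitones reflects a common way of storing melodies for QBSH, not a confirmed detail of our database.

```python
import math


def estimate_pitch_hz(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency of one frame by autocorrelation.

    fmin/fmax are an assumed search range for a singing voice.
    Returns the pitch in Hz, or None if the frame carries no energy.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]  # remove DC offset
    if sum(s * s for s in x) == 0.0:
        return None  # silent frame: no pitch
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(n - 1, int(sample_rate / fmin))
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag is None:
        return None
    return sample_rate / best_lag


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a (fractional) MIDI note number."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


# Hypothetical input: a synthetic 200 Hz tone standing in for a separated voice.
sample_rate = 8000
frame = [math.sin(2.0 * math.pi * 200.0 * i / sample_rate) for i in range(1024)]
pitch = estimate_pitch_hz(frame, sample_rate)
print(pitch, hz_to_midi(pitch))  # pitch is close to 200 Hz
```

A real tracker would process overlapping frames, detect unvoiced segments, and smooth the resulting pitch contour. The sketch only shows why a clean separated voice makes the per-frame estimate straightforward.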