We now propose a method that applies a harmonics-plus-noise model
to each input source in cases where DASSS suggests that more than
one source is present. This method has not yet been implemented;
we plan to pursue it in the weeks to come.
We will require that the spectral content of each source be
either:
- harmonic, meaning that the spectral components lie at
integer multiples of a single fundamental frequency, or
- bandlimited and noise-like, meaning that the spectral
components follow a spectrally smooth curve.
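The two requirements above can be made concrete with a minimal sketch. Everything here is an assumption for illustration, not part of the proposed method: the function names, the relative tolerance `tol`, and the use of a maximum bin-to-bin step in dB as a crude proxy for "spectrally smooth".

```python
def harmonic_deviation(freq_hz, f0_hz):
    """Distance from freq_hz to the nearest integer multiple of f0_hz,
    expressed as a fraction of f0_hz (0.0 means exactly on a harmonic)."""
    k = max(1, round(freq_hz / f0_hz))  # harmonic number, at least 1
    return abs(freq_hz - k * f0_hz) / f0_hz


def is_harmonic(peaks_hz, f0_hz, tol=0.03):
    """True if every peak lies within tol * f0_hz of a harmonic of f0_hz.
    The tolerance value is a hypothetical choice."""
    return all(harmonic_deviation(f, f0_hz) <= tol for f in peaks_hz)


def is_spectrally_smooth(mags_db, max_step_db=3.0):
    """True if adjacent bin magnitudes never differ by more than
    max_step_db. This is only one possible reading of 'smooth'."""
    return all(abs(b - a) <= max_step_db
               for a, b in zip(mags_db, mags_db[1:]))


print(is_harmonic([220.0, 440.5, 661.0], 220.0))        # True
print(is_harmonic([220.0, 330.0], 220.0))               # False
print(is_spectrally_smooth([-20.0, -21.0, -22.5]))      # True
```

A real implementation would likely need a frequency-dependent tolerance, since measured partials drift from exact integer multiples at higher harmonic numbers.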
To introduce this source model into the current paradigm, we will
allow the calculation of source components via
equation 7 only when the sources satisfy the above
requirements.
This decision clearly requires estimates of the fundamental
frequency and bandlimited noise shape of each source. To determine
the former, we can use any of a variety of pitch detection
techniques on the source signals estimated using the single source
approach of section 4. Although music signals are
characterized by large overlap of spectral components, even a few
spectral peaks may yield a reliable pitch estimate. Furthermore,
the pitch detection data may be smoothed over frames, or
considered across source estimates so as to seek at least some
specified number of pitches. Similar techniques may be employed to
determine the overall noise shape.
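As an illustration of how a few spectral peaks can yield a pitch estimate, and of smoothing over frames, the following sketch uses a simple grid search and a running median. It is one of many possible pitch detection techniques, not the one the text commits to; the search range, grid step, window width, and function names are all assumptions, and the peak list is assumed to be non-empty.

```python
from statistics import median


def harmonic_error(peaks_hz, f0_hz):
    """Mean distance of the peaks from their nearest harmonics of f0_hz,
    expressed as a fraction of f0_hz."""
    total = 0.0
    for f in peaks_hz:
        k = max(1, round(f / f0_hz))  # nearest harmonic number
        total += abs(f - k * f0_hz) / f0_hz
    return total / len(peaks_hz)


def estimate_f0(peaks_hz, f0_min=60.0, f0_max=1000.0, step=0.5):
    """Grid search for the candidate f0 that best explains the peaks.
    Ties go to the highest candidate, so that an exact fit at f0 is
    preferred over the equally exact fits at f0/2, f0/3, and so on."""
    best_f0, best_err = None, float("inf")
    n_steps = int((f0_max - f0_min) / step)
    for i in range(n_steps + 1):
        cand = f0_min + i * step
        err = harmonic_error(peaks_hz, cand)
        if err <= best_err:  # '<=' keeps the highest candidate on ties
            best_f0, best_err = cand, err
    return best_f0


def smooth_track(f0_track, width=3):
    """Running median over frames; the window is clipped at the edges."""
    half = width // 2
    return [median(f0_track[max(0, i - half): i + half + 1])
            for i in range(len(f0_track))]


# Two peaks suffice here, even though the fundamental itself is absent.
print(estimate_f0([440.0, 660.0]))                           # 220.0
# A single-frame octave error is removed by the median.
print(smooth_track([220.0, 220.0, 440.0, 220.0, 220.0]))
```

The tie-breaking rule is a design choice for this sketch: subharmonics of the true pitch fit the peaks equally well, so some preference for the higher candidate is needed.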
Given these estimates for each source, we will develop a scoring
function to determine how likely it is that a particular bin
belongs to a given source. If a source's score for that bin exceeds
a certain threshold, we will include that source in the inverse or
pseudo-inverse calculation for the bin, similar to equation 7.
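Since the scoring function has yet to be developed, the following is only a hypothetical sketch of what a harmonic-only version might look like. The Gaussian shape, the width `sigma_hz`, the threshold value, and the function names are all assumptions; a noise-shape term would need to be added for the noise-like component.

```python
import math


def harmonic_score(bin_hz, f0_hz, sigma_hz=10.0):
    """Score in (0, 1] that decays with the distance in Hz from the
    bin's centre frequency to the nearest harmonic of f0_hz."""
    k = max(1, round(bin_hz / f0_hz))  # nearest harmonic number
    dev_hz = abs(bin_hz - k * f0_hz)
    return math.exp(-0.5 * (dev_hz / sigma_hz) ** 2)


def active_sources(bin_hz, f0s_hz, threshold=0.5):
    """Indices of the sources whose score for this bin reaches the
    threshold; only these would enter the inverse calculation."""
    return [i for i, f0 in enumerate(f0s_hz)
            if harmonic_score(bin_hz, f0) >= threshold]


f0s = [220.0, 330.0]  # hypothetical pitch estimates for two sources
print(active_sources(440.0, f0s))  # [0]     2nd harmonic of source 0 only
print(active_sources(660.0, f0s))  # [0, 1]  shared by both sources
print(active_sources(500.0, f0s))  # []      near no harmonic of either
```

In this sketch, a bin claimed by two sources would be handled by the two-source calculation, a bin claimed by one source would be assigned to it alone, and a bin claimed by none would be left out of the harmonic part of the model.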
We note that the harmonics-plus-noise model is only minimally
constrained, and that most speech and non-percussive instrument
sounds may be modeled this way. We currently place no constraints
on the smoothness of the harmonics' spectral envelope or the pitch
range of the fundamental frequency. These characteristics tend to
be specific to the instrument or to the speaker's gender, and have
been exploited with significant success for pitch detection in [4].
Nonetheless, we maintain the generality, flexibility, and
simplicity of the model by avoiding such considerations for the
present.
Aaron S. Master
2003-03-27