
Algorithm

Thus far, we have developed the theory behind the DUET and DASSS methods and how their underlying models may allow us to deal with the two-source case. We now explicitly describe how the Bayesian approach for DASSS data from the previous section may be used to demix music:

  1. Run the DUET system on the musical excerpt to estimate the mixing parameters $(a_i,\delta_i), i \in \{1,2,\ldots,N\}$ for each of the $N$ sources.
  2. For every frequency $\omega$ of interest, calculate each $\alpha_{u,v}$ using the closed-form expression in equation 8:

    \begin{eqnarray*}
\alpha_{u,v} \equiv (1-\ensuremath{\frac{a_v}{a_u}}e^{j\omega(\delta_u - \delta_v)}).
\end{eqnarray*}



  3. Choose a resolution for possible magnitude ratios $r$, and then numerically calculate a histogram distribution on $\hat{Y}_{i \neq (u\vert v)}$ for each possible $(u,v,r)$ combination by using equation 23:

    \begin{eqnarray*}
\hat{Y}_{i \neq (u\vert v)} &=& S_u(\alpha_{iu} + r e^{j\theta}\alpha_{iv}).
\end{eqnarray*}



    Divide the bin labels by whatever constant is necessary to achieve $Y_{i=v} = 1$. Use a constant number of bins in each histogram, regardless of range. This will make scores comparable later.
  4. Establish values for $P(u,v,r)$ for each frequency using prior musical knowledge. Allow ``NULL'' as one of the sources.$^{3}$
  5. Perform STFT processing on the musical excerpt. For each time-frequency point, do the following:
    1. For each possible source $i \in \{1,2,\ldots,N\}$, calculate $\hat{Y}_i$ values using equation 4:

      \begin{eqnarray*}
Y_i &\equiv& X_1 - \ensuremath{\frac{1}{a_i}}e^{+j\omega\delta_i} X_2.
\end{eqnarray*}



    2. For each possible $(u,v)$ combination, solve for the optimal value of $r$ using equation 25:

      \begin{eqnarray*}
r &=& \ensuremath{\frac{\vert S_v\vert}{\vert S_u\vert}}= \ensuremath{\frac{\vert\hat{Y}_{i=v}\vert}{\vert\alpha_{uv}\vert \vert S_u\vert}}.
\end{eqnarray*}



    3. For each possible $(u,v)$ combination, normalize $\hat{Y}_{i \neq (u\vert v)}$ by dividing by $\hat{Y}_{i=v}$ as per equation 26:

      \begin{eqnarray*}
\hat{Y}_{i \neq (u\vert v)} = \ensuremath{\frac{\hat{Y}_{i \neq (u\vert v)}}{\vert\hat{Y}_{i=v}\vert}}.
\end{eqnarray*}



    4. For each $(u,v,r^{\mathrm{optimal}}_{uv})$ combination, look up the stored histogram for $Y_{i \neq (u\vert v)}$. To calculate $P(D\vert u,v,r^{\mathrm{optimal}}_{uv})$, record the histogram value for the bin whose label is nearest the value $\hat{Y}_{i \neq (u\vert v)}$.
    5. For each $(u,v,r^{\mathrm{optimal}}_{uv})$ combination, calculate $P(u,v,r^{\mathrm{optimal}}_{uv}\vert D)$ using Bayes' rule (as in equation 27):

      \begin{eqnarray*}
P(u,v,r\vert D) = \ensuremath{\frac{p(D\vert u,v,r)p(u,v,r)}{p(D)}}.
\end{eqnarray*}



    6. From the ($N$ choose 2) values of $P(u,v,r^{\mathrm{optimal}}_{uv}\vert D)$, record the $(u,v,r)$ combination that scores most highly. Optionally, record other high scoring combinations.
    7. Estimate $\hat{S}_u$ and $\hat{S}_v$ with equation 10. Calculate the corresponding $\hat{r}$ to ensure agreement with the assumed $r$. If there is disagreement, try the other high-scoring combinations, or decide to set $\hat{S}_u = \hat{S}_v = 0$.
  6. Perform inverse FFT (IFFT) and constant-overlap-add (COLA) processing on the estimated $\hat{S}_u$ and $\hat{S}_v$ signals to recover time-domain estimates of the original sources.
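
As a minimal sketch of steps 2 and 5.1, the following NumPy code evaluates equation 8 and equation 4 at a single frequency. The function names `alpha_matrix` and `y_values`, the numerical parameter values, and the anechoic mixing model used to synthesize $X_1$ and $X_2$ (here $X_1 = \sum_k S_k$ and $X_2 = \sum_k a_k e^{-j\omega\delta_k} S_k$) are assumptions made for illustration and are not taken from the steps above.

```python
import numpy as np


def alpha_matrix(a, delta, omega):
    """Equation 8: alpha[u, v] = 1 - (a_v / a_u) * exp(j * omega * (delta_u - delta_v))."""
    a = np.asarray(a, dtype=float)
    delta = np.asarray(delta, dtype=float)
    ratio = a[np.newaxis, :] / a[:, np.newaxis]            # ratio[u, v] = a_v / a_u
    phase = np.exp(1j * omega * (delta[:, np.newaxis] - delta[np.newaxis, :]))
    return 1.0 - ratio * phase


def y_values(X1, X2, a, delta, omega):
    """Equation 4: Y_i = X1 - (1 / a_i) * exp(+j * omega * delta_i) * X2, for every i."""
    a = np.asarray(a, dtype=float)
    delta = np.asarray(delta, dtype=float)
    return X1 - (1.0 / a) * np.exp(1j * omega * delta) * X2


# Hypothetical mixing parameters for N = 3 sources (illustrative values only).
a = np.array([0.8, 1.0, 1.3])
delta = np.array([-1.0, 0.0, 2.0])
omega = 0.7

# Hypothetical source spectra at one time-frequency point; source 1 is silent.
S = np.array([1.0 + 0.5j, 0.0, -0.3 + 0.2j])

# ASSUMPTION: anechoic two-channel mixing model, used only to synthesize test data.
X1 = S.sum()
X2 = (a * np.exp(-1j * omega * delta) * S).sum()

alpha = alpha_matrix(a, delta, omega)
Y = y_values(X1, X2, a, delta, omega)
```

Under the assumed mixing model, each $Y_i$ equals $\sum_k \alpha_{i,k} S_k$, and because $\alpha_{i,i} = 0$ the contribution of source $i$ is cancelled in $Y_i$. With only sources 0 and 2 active in the example, `Y[0]` depends on `S[2]` alone.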


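The histogram lookup and Bayes' rule scoring in steps 5.4 through 5.6 can be sketched as follows. This is only an illustration under stated assumptions: the helper names `lookup_likelihood` and `posteriors`, the toy histograms, the observed values, and the priors are all hypothetical, and $p(D)$ is taken to be the sum of $p(D\vert u,v,r)\,p(u,v,r)$ over the candidate combinations considered, which the steps above do not specify. The ``NULL'' source is not modeled here.

```python
import numpy as np


def lookup_likelihood(hist, bin_centers, y_norm):
    """Return the histogram value of the bin whose center is nearest y_norm.

    bin_centers may be real or complex; distance is measured with abs().
    """
    idx = int(np.argmin(np.abs(np.asarray(bin_centers) - y_norm)))
    return float(hist[idx])


def posteriors(likelihoods, priors):
    """Bayes' rule over a finite candidate set.

    ASSUMPTION: the evidence p(D) is the sum of likelihood * prior over the
    candidates passed in.
    """
    joint = np.asarray(likelihoods, dtype=float) * np.asarray(priors, dtype=float)
    evidence = joint.sum()
    if evidence == 0.0:
        return np.zeros_like(joint)
    return joint / evidence


# Hypothetical candidate (u, v) pairs for N = 3 sources.
candidates = [(0, 1), (0, 2), (1, 2)]

# Hypothetical stored histograms, one per candidate, sharing a constant bin count.
bin_centers = np.array([0.25, 0.75, 1.25, 1.75])
hists = [
    np.array([0.70, 0.20, 0.08, 0.02]),
    np.array([0.10, 0.30, 0.50, 0.10]),
    np.array([0.05, 0.15, 0.25, 0.55]),
]

# Hypothetical normalized observations and priors P(u, v, r) for each candidate.
observed = [1.2, 1.3, 1.1]
priors = [0.5, 0.2, 0.3]

likelihoods = [lookup_likelihood(h, bin_centers, y) for h, y in zip(hists, observed)]
post = posteriors(likelihoods, priors)
best = candidates[int(np.argmax(post))]
```

Sorting `post` in descending order instead of taking only the maximum would give the optional list of other high-scoring combinations mentioned in step 5.6.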
Aaron S. Master 2003-11-01