On nonnegative matrix factorization algorithms for signal-dependent noise with application to electromyography data

Karthik Devarajan, Vincent C K Cheung

Abstract

Nonnegative matrix factorization (NMF) by the multiplicative updates algorithm is a powerful machine learning method for decomposing a high-dimensional nonnegative matrix V into two nonnegative matrices, W and H, where V ≈ WH. It has been successfully applied in the analysis and interpretation of large-scale data arising in neuroscience, computational biology, and natural language processing, among other areas. A distinctive feature of NMF is its nonnegativity constraints, which allow only additive linear combinations of the data, thus enabling it to learn parts that have distinct physical representations in reality. In this letter, we describe an information-theoretic approach to NMF for signal-dependent noise based on the generalized inverse Gaussian model. Specifically, we propose three novel algorithms in this setting, each based on multiplicative updates, and prove monotonicity of the updates using the EM algorithm. In addition, we develop algorithm-specific measures to evaluate their goodness of fit on data. Our methods are demonstrated using experimental data from electromyography studies, as well as simulated data, in the extraction of muscle synergies, and compared with existing algorithms for signal-dependent noise.
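The factorization V ≈ WH by multiplicative updates can be sketched with the standard Frobenius-norm (Gaussian-noise) update rules — a minimal illustration for context only, not the signal-dependent-noise algorithms proposed in the paper:

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=300, eps=1e-9, seed=0):
    """Frobenius-norm multiplicative-update NMF: V ≈ W @ H, all entries >= 0."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps   # nonnegative random initialization
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # coefficient update
        W *= (V @ H.T) / (W @ H @ H.T + eps)  # basis-vector update
    return W, H
```

Because the updates are multiplicative, nonnegativity is preserved automatically, and each step is non-increasing in the reconstruction error; the paper establishes the analogous monotonicity for its generalized-inverse-Gaussian updates via the EM algorithm.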

Figures

Figure 1
Illustration of the mean-variance relationship for the frog EMG data. Plot of the logarithm of the estimated standard deviation against the logarithm of the estimated mean for moving windows across time, for each behavior of selected muscles and frogs. Each panel displays the mean-variance relationship for a particular behavior. A, intact jump; B, deafferented jump; C, intact swim; D, deafferented swim. In each panel, the black solid line represents a linear fit to the data; estimates of the slope, the root mean squared error (RMSE), and the adjusted R² are listed at the top of each panel.
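The log-SD versus log-mean fit described in this caption can be sketched as follows. The window length and the gamma-noise check in the test are illustrative assumptions, not the paper's exact settings; a slope near 1 indicates standard deviation proportional to mean, i.e. signal-dependent noise:

```python
import numpy as np

def mean_variance_slope(x, win=50):
    """Slope of log(SD) vs. log(mean) over non-overlapping windows of a
    rectified signal. Slope ~1 suggests a constant coefficient of variation
    (gamma-like, signal-dependent noise); slope ~0 suggests additive noise."""
    n = len(x) // win
    segs = np.asarray(x)[:n * win].reshape(n, win)  # split into windows
    m, s = segs.mean(axis=1), segs.std(axis=1)
    keep = (m > 0) & (s > 0)                        # log requires positives
    slope, _ = np.polyfit(np.log(m[keep]), np.log(s[keep]), 1)
    return slope
```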
Figure 2
Selecting the number of muscle synergies for the JG algorithm using AIC. To determine the model order, the number of muscle synergies extracted was successively increased from 1 to 13; at each number of synergies, the AIC was calculated using equation (37). A, Plot of AIC against the number of muscle synergies extracted for both the intact (black solid) and deafferented (dotted) jump (4 frogs; mean ± SD). B, Plot of AIC for the intact (black solid) and deafferented (dotted) swim. The model order with minimum AIC was 3 for jump and 4 for swim (*).
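The rank-selection procedure — fit at each candidate rank and keep the rank minimizing AIC — can be sketched generically. The `gaussian_aic` formula below is a standard Gaussian-likelihood stand-in for illustration, not the algorithm-specific equation (37) of the paper, and `_nmf` is a minimal Frobenius-norm fitter:

```python
import numpy as np

def _nmf(V, rank, n_iter=300, eps=1e-9, seed=0):
    # Minimal Frobenius-norm multiplicative-update NMF (illustrative only)
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def gaussian_aic(V, W, H):
    """Generic Gaussian AIC for an NMF fit: goodness-of-fit term plus a
    penalty of 2 per fitted parameter (a stand-in for equation (37))."""
    n, m = V.shape
    rss = np.sum((V - W @ H) ** 2)
    k = W.size + H.size              # number of fitted parameters
    return n * m * np.log(rss / (n * m)) + 2 * k

def select_rank(V, ranks):
    """Fit NMF at each candidate rank; return the AIC-minimizing rank."""
    aics = [gaussian_aic(V, *_nmf(V, r)) for r in ranks]
    return ranks[int(np.argmin(aics))], aics
```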
Figure 3
The number of muscle synergies selected for the different NMF algorithms. For each behavior (A, intact and deafferented jump; B, intact and deafferented swim) and each algorithm, the number of muscle synergies selected for each frog was determined by selecting the rank with minimum AIC. Note that the selected numbers for all behaviors and algorithms were quite consistent across animals.
Figure 4
Signal-dependent noise NMFs outperformed the Gaussian NMF. In this application, we are primarily interested in each algorithm’s ability to identify structures shared between the intact and deafferented data sets; thus, our measures of algorithm performance are based on quantifying the similarity between the intact and deafferented muscle synergies. For both the scalar-product (A and C) and principal-angle (B and D) measures, overall the seven NMFs based on signal-dependent noise outperformed the Gaussian NMF in their ability to extract features shared between data sets. In each graph, the level of similarity achieved by the Gaussian algorithm (black) is marked by a horizontal black dotted line for ease of visual inspection.
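Quantifying similarity between intact and deafferented synergy sets requires pairing the columns of the two W matrices first. One way to do this — an illustrative assumption, since the caption does not specify the matching procedure — is to normalize the columns and search permutations for the maximum mean scalar product:

```python
import itertools
import numpy as np

def match_synergies(W1, W2):
    """Match columns of W2 to columns of W1 by maximizing the mean
    normalized scalar product over all permutations (brute force is
    fine for the small synergy counts used here)."""
    U1 = W1 / np.linalg.norm(W1, axis=0)   # unit-norm synergy vectors
    U2 = W2 / np.linalg.norm(W2, axis=0)
    S = U1.T @ U2                          # pairwise scalar products
    best_perm, best_score = None, -np.inf
    for perm in itertools.permutations(range(W2.shape[1])):
        score = S[np.arange(len(perm)), perm].mean()
        if score > best_score:
            best_perm, best_score = perm, score
    return np.array(best_perm), best_score
```

The returned score lies in [0, 1] for nonnegative synergies, with 1 indicating identical (up to scale) synergy sets.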
Figure 5
Muscle synergies extracted by the JG algorithm were physiologically interpretable. A, Intact (black) and deafferented (white) muscle synergies for swimming (frog 2) returned by the Gaussian algorithm. The scalar product similarity between each synergy pair is indicated above the pair. The intact and deafferented synergies for pair 4 were totally dissimilar. B, Intact (black) and deafferented (white) muscle synergies for swimming (frog 2) returned by the JG algorithm. Here, even in the least similar pair (pair 4, scalar product = 0.62), the sets of muscles found to be active in the intact and deafferented synergies were still identical. C, The correlation coefficient between the activation of muscle synergy 3 (the extension synergy) and those of muscle synergies 1, 2, and 4, respectively, before (black) and after (white) deafferentation. Note that the correlation between synergies 3 and 4 increased 5-fold after deafferentation. This suggests that sensory feedback is essential in triggering or maintaining the activation of synergy 4 during the flexion phase of the swim cycle.
Figure 6
Comparison of results using NMF algorithms derived from the same noise distribution. We performed a comparison of the muscle synergies extracted by different NMF algorithms from the same EMG data set in order to understand the effects of the NMF-noise distribution and the cost function employed on the muscular compositions of the extracted muscle synergies. A, In each frog, the set of muscle synergies extracted by each algorithm was matched to the set returned by the gamma-based KLGH algorithm (*), and their similarity was quantified by the scalar product values averaged across the synergy set. Shown in the plot are values averaged across frogs (N = 4; mean ± SD). Values for the KLGH were 1.0 by definition. In this comparison, scalar product values from the gamma algorithms tended to be higher than those from the Gaussian or IG-based algorithms. This difference is especially obvious for the intact jump and deafferented jump data sets. B, Same as A, except that the comparison was performed by matching synergies of each algorithm to synergies returned by the IG-based KLIGH algorithm (*). In this comparison, scalar product values from IG-based algorithms tended to be higher than those from the Gaussian or gamma-based algorithms. Again, this difference is especially obvious for the intact jump and deafferented jump data sets.
Figure 7
Both the noise distribution and the cost function employed for formulating the NMF update rules could influence the muscular compositions of the extracted muscle synergies. Here we show the muscle synergies extracted from one particular data set (frog 2, deafferented jump) by different NMF algorithms. The results returned by the four gamma-based algorithms were almost identical (as suggested by Fig. 6). However, the gamma synergies were clearly different from the Gaussian and IG-based synergies, and the synergies returned by the three IG-based algorithms also differed somewhat from one another. Thus, both the noise distribution and the cost function used for deriving the NMF update rules could influence the structures of the basis vectors extracted.
Figure 8
Gaussian NMF outperformed the signal-dependent noise NMFs in data sets corrupted by Gaussian noise. We evaluated the performance of each algorithm in simulated data sets (N = 10) generated by known W (15 × 5 matrix) and H (5 × 5000 matrix), but corrupted by random Gaussian noise at different signal-to-noise ratios (SNR). A, Performance of NMF algorithms in identifying the basis vectors (W). Performance of each algorithm in each data set was quantified by the scalar product between the extracted vectors and the original vectors, averaged across the 5 basis vectors in the W matrix. Shown in the plot are mean scalar product values, defined as above, averaged across 10 simulated data sets. The Gaussian NMF algorithm outperformed all IG-based NMF algorithms and 2 of the gamma-based NMF algorithms (KLGd and JG) over a wide range of SNR (*; Student’s t-test; p < 0.05). B, Performance of NMF algorithms in identifying the coefficients (H). Performance of each algorithm in each data set was quantified by the Pearson’s correlation coefficient (ρ) between the extracted coefficients and the original coefficients (over a total of 5 × 5000 = 25,000 values). Shown in the plot are ρ values averaged across the 10 simulated data sets. The Gaussian NMF algorithm outperformed all of the gamma- and IG-based NMF algorithms over almost all tested SNR (*; p < 0.05).
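The simulation protocol shared by Figures 8–10 — corrupt a known W · H product with noise at a target SNR, then score recovery of H by Pearson correlation — can be sketched with the helpers below. Clipping the corrupted matrix at zero to keep it nonnegative is an assumption; the paper's exact noise protocol may differ:

```python
import numpy as np

def add_gaussian_noise(V, snr_db, rng):
    """Corrupt V with white Gaussian noise at a target SNR in dB,
    clipping at zero to keep the matrix nonnegative (an assumption)."""
    sig_pow = np.mean(V ** 2)
    noise_pow = sig_pow / (10.0 ** (snr_db / 10.0))
    noisy = V + rng.normal(0.0, np.sqrt(noise_pow), V.shape)
    return np.clip(noisy, 0.0, None)

def coefficient_score(H_true, H_est):
    """Pearson correlation between flattened true and estimated H
    (over all rank x samples coefficient values)."""
    return np.corrcoef(H_true.ravel(), H_est.ravel())[0, 1]
```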
Figure 9
Gamma-based NMF algorithms outperformed the Gaussian NMF algorithm in data sets corrupted by gamma noise. We evaluated the performance of each algorithm in simulated data sets (N = 10) generated by known W (15 × 5 matrix) and H (5 × 5000 matrix), but corrupted by random gamma noise at different signal-to-noise ratios (SNR). A, Performance of NMF algorithms in identifying the basis vectors (W). Performance of each algorithm in each data set was quantified by the scalar product between the extracted vectors and the original vectors, averaged across the 5 basis vectors in the W matrix. Shown in the plot are mean scalar product values, defined as above, averaged across 10 simulated data sets. Gamma-based algorithms outperformed the Gaussian algorithm (but not the IG-based algorithms) at moderate noise magnitudes (*; Student’s t-test; p < 0.05); at high noise magnitudes, the gamma algorithms performed better than both Gaussian- and IG-based NMF algorithms (+; p < 0.05). B, Performance of the NMF algorithms in identifying the coefficients (H). Performance of each algorithm in each data set was quantified by the Pearson’s correlation coefficient (ρ) between the extracted coefficients and the original coefficients (over a total of 5 × 5000 = 25,000 values). Shown in the plot are ρ values averaged across the 10 simulated data sets. Gamma-based algorithms outperformed the Gaussian algorithm, but not the IG-based algorithms, at moderate noise levels (*; p < 0.05).
Figure 10
The inverse Gaussian NMF algorithms outperformed the Gaussian- and gamma-based NMF algorithms in data sets corrupted by inverse Gaussian noise. We evaluated the performance of each algorithm in simulated data sets (N = 10) generated by known W (15 × 5 matrix) and H (5 × 5000 matrix), but corrupted by random inverse Gaussian (IG) noise at different signal-to-noise ratios (SNR). A, Performance of NMF algorithms in identifying the basis vectors (W). Performance of each algorithm in each data set was quantified by the scalar product between the extracted vectors and the original vectors, averaged across the 5 basis vectors in the W matrix. Shown in the plot are mean scalar product values, defined as above, averaged across 10 simulated data sets. At moderate noise levels, IG-based algorithms clearly outperformed both the Gaussian and gamma algorithms (*; Student’s t-test; p < 0.05); at high and low noise levels, IG-based algorithms still performed better than the Gaussian (but not the gamma) algorithm (+; p < 0.05). B, Performance of the NMF algorithms in identifying the coefficients (H). Performance of each algorithm in each data set was quantified by the Pearson’s correlation coefficient (ρ) between the extracted coefficients and the original coefficients (over a total of 5 × 5000 = 25,000 values). Shown in the plot are ρ values averaged across the 10 simulated data sets. IG-based algorithms outperformed the Gaussian NMF over a wide range of SNR (*; p < 0.05).

Source: PubMed
