Clustering Signed Networks with the Geometric Mean of Laplacians


Location: Hamburg, Germany

We define a signed network $G^{\pm}$ as a pair of graphs $G^{\pm}=(G^+, G^-)$, where $G^+=(V,E^+)$ encodes positive (friendship) relations and $G^-=(V,E^-)$ encodes negative (enmity) ones. In this talk we discuss major drawbacks of popular extensions of spectral clustering to signed networks. Our analysis shows that, in expectation under the Stochastic Block Model (SBM), existing approaches fail to recover the ground-truth clusters even when either $G^+$ or $G^-$ is noise-free.
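
As a minimal illustration of this setting (not taken from the talk), the positive and negative graphs are often obtained by splitting a single signed adjacency matrix into its positive and negative parts; the toy matrix `A_signed` below is a made-up example for this sketch.

```python
import numpy as np

# Hypothetical signed adjacency matrix: positive entries are friendships (G^+),
# negative entries are enmities (G^-), zeros mean no edge.
A_signed = np.array([
    [ 0.,  1.,  1., -1.],
    [ 1.,  0.,  1., -1.],
    [ 1.,  1.,  0., -1.],
    [-1., -1., -1.,  0.],
])

A_pos = np.where(A_signed > 0,  A_signed, 0.0)   # adjacency matrix of G^+
A_neg = np.where(A_signed < 0, -A_signed, 0.0)   # adjacency matrix of G^-
```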

This problem arises because existing approaches merge the information from $G^+$ and $G^-$ through some form of arithmetic mean of the Laplacian or adjacency matrices of the positive and negative parts. We propose instead to use the geometric mean of the Laplacian matrix $L$ of $G^+$ and the signless Laplacian $Q$ of $G^-$, defined by $$ L \# Q = L^{1/2} \left( L^{-1/2} Q L^{-1/2} \right)^{1/2} L^{1/2}. $$
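
For small graphs this formula can be evaluated densely as a reference. The sketch below continues the toy example above; the eigendecomposition-based helper `psd_power` and the small diagonal shift `eps` (which makes $L$ strictly positive definite, as the geometric mean requires positive definite arguments) are assumptions of this sketch, not part of the talk.

```python
import numpy as np

def psd_power(M, p):
    """p-th power of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * w**p) @ V.T

def geometric_mean(A, B):
    """Geometric mean A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}."""
    A_half, A_inv_half = psd_power(A, 0.5), psd_power(A, -0.5)
    return A_half @ psd_power(A_inv_half @ B @ A_inv_half, 0.5) @ A_half

eps = 1e-6  # small shift so that both matrices are strictly positive definite
L = np.diag(A_pos.sum(axis=1)) - A_pos + eps * np.eye(len(A_pos))   # Laplacian of G^+
Q = np.diag(A_neg.sum(axis=1)) + A_neg + eps * np.eye(len(A_neg))   # signless Laplacian of G^-

LQ = geometric_mean(L, Q)
```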

We show that, in expectation under the SBM, the extremal eigenvectors of the matrix $L \# Q$ recover the ground-truth clusters in any reasonable clustering setting, outperforming existing approaches.
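
Continuing the sketch, clustering then amounts to standard spectral clustering with $L \# Q$ in place of a single Laplacian: take the eigenvectors belonging to the $k$ smallest eigenvalues (the relevant extremal end here, by analogy with ordinary spectral clustering) and run k-means on their rows. The choice $k = 2$ and the use of `kmeans2` are illustrative.

```python
from scipy.cluster.vq import kmeans2

k = 2                                    # illustrative number of clusters
w, V = np.linalg.eigh(LQ)                # eigenvalues in ascending order
embedding = V[:, :k]                     # eigenvectors of the k smallest eigenvalues
_, labels = kmeans2(embedding, k, minit='++')
# labels[i] is the cluster assigned to node i (label names are arbitrary)
```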

We propose a numerical method that efficiently computes a few extremal eigenvectors of the geometric mean $L \# Q$ without ever forming $L \# Q$ itself. Our algorithm is based on the inverse power method and the extended Krylov subspace technique, and it applies to the geometric mean $A \# B$ of a generic pair of positive definite matrices $A$ and $B$.
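
The sketch below only illustrates the overall shape of such an inverse-power-type iteration for a generic positive definite pair; it forms $A \# B$ densely (via the `geometric_mean` helper above) and solves the inner system directly, which are exactly the steps the actual algorithm avoids by approximating the action of $(A \# B)^{-1}$ with extended Krylov subspace techniques.

```python
def smallest_eigvec_inverse_power(A, B, tol=1e-10, max_iter=200):
    """Inverse power iteration for the eigenvector of the smallest eigenvalue of A # B.

    Dense stand-in only: A # B is formed explicitly and the inner system is solved
    by a direct method, whereas the actual algorithm approximates the action of
    (A # B)^{-1} with extended Krylov subspaces and never forms A # B.
    """
    M = geometric_mean(A, B)                  # dense reference matrix
    x = np.random.default_rng(0).standard_normal(M.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(max_iter):
        y = np.linalg.solve(M, x)             # apply (A # B)^{-1} to the current iterate
        y /= np.linalg.norm(y)
        if y @ x < 0:                         # resolve the sign ambiguity
            y = -y
        if np.linalg.norm(y - x) < tol:       # converged
            break
        x = y
    return y, y @ M @ y                       # eigenvector and Rayleigh quotient
```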