Estimating the Clustering Coefficient Using Sample Complexity Analysis


In this work we present a sampling algorithm for estimating the local clustering of each vertex of a graph. Let $G$ be a graph with $n$ vertices, $m$ edges, and maximum degree $\Delta$. We present an algorithm that, given $G$ and fixed constants $0 < \varepsilon, \delta, p < 1$, outputs the values for the local clustering coefficient within $\varepsilon$ error with probability $1 - \delta$, for every vertex $v$ of $G$, provided that the (exact) local clustering of $v$ is not ’too small.’ We use VC dimension theory to give a bound for the number of edges required to be sampled by the algorithm. We show that the algorithm runs in time $\mathcal{O}(\Delta \lg \Delta + m)$. We also show that the running time drops to, possibly, sublinear time if we restrict $G$ to belong to some well-known graph classes. In particular, for planar graphs the algorithm runs in time $\mathcal{O}(\Delta)$. In the case of bounded-degree graphs the running time is $\mathcal{O}(1)$ if a bound for the value of $\Delta$ is given as a part of the input, and $\mathcal{O}(n)$ otherwise.

In LATIN 2022 - 15th Latin American Theoretical Informatics Symposium
André Vignatti
André Vignatti
Associate Professor