The Subspace Enumeration Method

In this post, we briefly explain the main idea of [CDKP23], which uses a novel spectral method named "subspace enumeration" to solve the extensively studied task of approximately counting the number of independent sets on a given bipartite graph \(G\).

The purpose of this post is only to sketch the proof (and its novel idea) in an easy-to-understand way. For more details, please refer to the original paper.


Overview

The problem setting we will discuss about in this post is \(d\)-regular (and connected WLOG) bipartite graph \(G = (X, Y, E)\) where \(d = \delta n\).

The \(d\)-regular condition is often required in approximate counting problems, and here we will additionally utilize its spectral properties.

First we describe some common approaches towards counting the number of independent sets \(i(G)\) on \(G\). For a possible independent set \(I\), enumerating its \(X\)-part \(I_X := I \cap X\) gives \[ i(G) = \sum_{A \subseteq X} 2^{|Y \backslash N(A)|} = 2^{|Y|} \sum_{A \subseteq X} 2^{-|N(A)|}. \]

Since there might be many subsets \(A \subseteq X\) having the same neighborhood, it seems a more efficient way to enumerate \(B \subseteq Y\) and compute the contribution of those having \(B\) as (exact) neighborhood together. Denote the collection of these subsets by \(\mathcal{N}_B\), then \(\left\langle \mathcal{N}_B, \subseteq \right\rangle\) is a join-semilattice, implying there exists a greatest element \(A_B \in \mathcal{N}_B\) s.t. every \(A \subseteq X\) with \(N(A) = B\) is a subset of \(A_B\). We call such subsets closed, and the minimum closed subset containing \(A\) the closure of \(A\), denoted by \([A]\). Now we enumerate closed subsets (of \(X\)) instead of subsets of \(Y\), which may contain many "useless" ones (i.e. \(\mathcal{N}_B = \varnothing\)): \[ i(G) = 2^{|Y|} \sum_{A \subseteq X\ \text{closed}} \mathcal{D}_A \cdot 2^{-|N(A)|}, \quad \mathcal{D}_A := |\mathcal{N}_{N(A)}|. \]

Trying to proceed further, we will soon find that two vertices \(x, y \in A\) may be "correlated" (i.e., have common neighbors), which leads to a difference between \(|N(A)|\) and \(\sum_{u \in A} |N(u)|\). A natural attempt to resolve the term \(|N(A)|\) is to consider sets of vertices with no correlation between separately. Formally, we say two vertices \(u, v \in V\) are 2-linked if \(\exists w \in V: u \sim w \sim v\). A 2-linked subset \(A \subseteq X\) is called a polymer, and polymers are compatible if they are not 2-linked (i.e., not correlated). Let \(\mathcal{P}(X)\) be all polymers in \(2^X\). With this characterization of correlated vertices, we can uniquely partition \(A\) into compatible polymers (also abbreviated as components in below, denoted by \(\mathcal{K}(A)\)) \(A_1, \ldots, A_k\) s.t. \(|N(A)| = \sum_{i \in [k]} |N(A_i)|\). Observing that \(A\) is closed iff all its components are closed, \[ \begin{aligned} i(G) &= 2^{|Y|} \sum_{\substack{A \subseteq X\ \text{with} \\ \text{all components closed}}} \mathcal{D}_A \cdot 2^{-|N(A)|} = 2^{|Y|} \sum_{k \geq 0} \sum_{\substack{\left\lbrace A_1, \ldots, A_k \right\rbrace \subseteq \mathcal{P}(X) \\ \text{compatible and all closed}}} \prod_{i \in [k]} \mathcal{D}_{A_i} \cdot 2^{-|N(A_i)|}, \\ \mathcal{D}_A &:= |\mathcal{N}_{N(A)}| = \prod_{K \in \mathcal{K}(A)} |\mathcal{N}_{N(K)}|. \end{aligned} \]

A standard step then is to make a discussion on the expansion of (the components of) \(A\). We call a subset \(A \subseteq X\) \(t\)-expanding if \(|N(A)| = |[A]| + t\), and correspondingly \(t\)-contracting if \(|N(A)| < |[A]| + t\). The subsets with stronger expansion tend to have (exponentially) smaller weights, so we can only enumerate the \(t_0\)-contracting polymers (which contributes the main part), then approximate the sum of \(t_0\)-expanding polymers' weights. With the "threshold parameter" \(t_0\) properly picked to ensure a provably reasonable deduction, \[ \begin{aligned} i(G) &= 2^{|Y|} \sum_{\substack{A \subseteq X\ \text{with} \\ \text{all components closed} \\ \text{and $t_0$-contracting}}} \mathcal{D}_A \cdot 2^{-|N(A)|} \cdot \Xi_A, \\ \Xi_A &:= \sum_{k \geq 0} \sum_{\substack{\left\lbrace B_1, \ldots, B_k \right\rbrace \subseteq \mathcal{P}(X \backslash N^2(A)) \\ \text{compatible and all not $t_0$-contracting}}} 2^{-\sum_{i \in [k]} |N(B_i)|}. \end{aligned} \]

With an additional notation \(\mathcal{A}(G)\) denoting the collection of all \(A \subseteq X\) with all components closed and \(t_0\)-contracting, we have the final form \[ i(G) = 2^{Y} \sum_{A \in \mathcal{A}(G)} \mathcal{D}_A \cdot 2^{-|N(A)|} \cdot \Xi_A. \]

Now the task of computing \(i(G)\) splits into three parts:

  1. enumerate \(A \in \mathcal{A}(G)\),
  2. (approximately) compute \(\mathcal{D}_A\) for all possible \(A \in \mathcal{A}(G)\),
  3. (approximately) compute \(\Xi_A\) for all possible \(A \in \mathcal{A}(G)\).

The last two parts are mostly identical to that in [JPP21] and do not introduce much novelty, so here we omit the details (maybe I will write notes about [JPP21] and complete these parts in the future) and focus on the first one. To enumerate \(A \in \mathcal{A}(G)\) efficiently, the subspace enumeration technique is introduced. As a brief outline, the proof consists of three pieces:

  1. every \(A \in \mathcal{A}(G)\) corresponds to a small cut,
  2. a collection \(\mathcal{C}^{\rm cut}\) of "principal" cuts can be efficiently constructed using subspace enumeration s.t. every small cut is close to some "principal" cut,
  3. given a "principal" cut \(C \in \mathcal{C}^{\rm cut}\), we can enumerate those small cuts being close to \(C\).

By the term cut we refer to a subset \(C \subseteq V = X \cup Y\), with its value \(|\nabla(C)| := |(C \times (V \backslash C)) \cap E|\) defined as the number of edges with precisely one endpoint in \(C\).

The first part holds naturally by setting cut \(S_A = A \cup N(A)\), where \(|\nabla(S_A)| \leq t d\) since every \(A \in \mathcal{A}(G)\) is \(t\)-contracting.

The Subspace Enumeration Method

The second part starts from a spectral observation: \(|\nabla(C)|\) can be written as \(\vec{c}^T L \vec{c}\), where \(\vec{c}\) denotes the indicator vector of \(C\) with ground set \(V\), and \(L = dI-A\) denotes the Laplacian matrix of \(G\). For a \(d\)-regular \(G\), \[ \operatorname{Tr}(A^2) = Nd = \sum_{i \in [N]} \lambda_i^2, \] thus by an averaging argument, at most \(4N/d\) eigenvalues of \(A\) have absolute value \(\geq d/2\). By the symmetry of the spectrum of \(A\) about 0, at most \(4n/d\) eigenvalues of \(L\) are \(\leq d/2\). We denote the precise amount by \(k\), and let \(E = \left\lbrace \vec{e}_1, \ldots, \vec{e}_N \right\rbrace\) be an orthonormal basis of eigenvectors of \(L\) with corresponding eigenvalues sorted: \[ 0 = \mu_1 \leq \cdots \leq \mu_k \leq \frac{d}{2} < \mu_{k+1} \leq \cdots \leq \mu_{N-k} < \frac{3d}{2} \leq \mu_{N-k+1} \leq \cdots \leq \mu_N = 2d. \]

Abusing notations slightly, for a set (or cut) labeled by a capital letter, we use the same letter in lowercase for its indicator vector and its coordinates over \(E\) (e.g., \(\vec{s}= \sum_{i \in [N]} s_i \vec{e}_i\) is the indicator vector of set \(S\)). Let \(U = \operatorname{span}(\left\lbrace \vec{e}_1, \ldots, \vec{e}_k \right\rbrace)\), intuitively for every small cut \(S\) with \(|\nabla(S)| \leq td\), \(\vec{s}\) is close to its projection \(\vec{u}\) onto \(U\). (Here we take \(d\) as a "unit" of measure because \(G\) is \(d\)-regular, and of course \(t \geq 1\) since even the smallest cut of \(G\) will have value \(\geq d\).) To be precise, \[ td \geq |\nabla(S)| = \vec{s}^T L \vec{s}= \sum_{i \in [N]} \mu_i s_i^2 \geq \frac{d}{2} \sum_{i=k+1}^N s_i^2 = \frac{d}{2} \bigl\lVert \vec{s}-\vec{u} \bigr\rVert_2^2, \] which implies \(\left\lVert \vec{s}-\vec{u} \right\rVert_2 \leq \sqrt{2t}\).
Now the goal becomes constructing a collection of cuts "dominating" \(U\). In [CDKP23], a brute-force method is chosen since (WLOG) a cut \(S\) has \(L^2\)-norm \(\left\lVert \vec{s} \right\rVert_2 \leq \sqrt{n}\): taking an \(\epsilon\)-net \[ \mathcal{E}:= \left\lbrace \vec{p}= \sum_{i \in [k]} x_i \vec{e}_i : x_1, \ldots, x_k \in \frac{\epsilon}{\sqrt{k}} \mathbb{Z}, \left\lVert \vec{p} \right\rVert_2 \leq \sqrt{n} \right\rbrace \] is acceptable by bounding its size with \(|\mathcal{E}| \leq (2\sqrt{nk} / \epsilon)^k = n^{O(1/\delta)}\). But it's not sufficient since the lattice points in \(\mathcal{E}\) may be not an indicator vector (i.e. 0-1 vector). Fortunately, simply rounding them to the nearest 0-1 vector (breaking ties arbitrarily is OK, and we choose \(1/2 \mapsto 1\) for consistency) will not introduce much error: suppose the nearest lattice point to \(\vec{u}\) is \(\vec{p}\) and \(p\) is rounded to \(\vec{c}\), we know \(\vec{c}\) is closer to \(\vec{p}\) than \(\vec{s}\) (noticing that \(\vec{s}\) is also an 0-1 vector), thus \(\left\lVert \vec{c}-\vec{p} \right\rVert_2 \leq \left\lVert \vec{s}-\vec{p} \right\rVert_2 \leq \sqrt{2t} + \epsilon\) by the triangle inequality, and furthur \(\left\lVert \vec{c}-\vec{s} \right\rVert_2 \leq 2\sqrt{2t} + 2\epsilon\). Let \(C\) be the cut corresponding to \(\vec{c}\) and pick \(\epsilon= \sqrt{2}\), \[ \left\lvert S \triangle C \right\rvert = \left\lVert \vec{s}-\vec{c} \right\rVert_2^2 \leq 8 (\sqrt{t} + 1)^2 \leq 32t. \] Also \(C\) is a small cut: \[ |\nabla(C)| \leq |\nabla(S)| + d\left\lvert S \triangle C \right\rvert \leq td + 32td = 33td. \]

Here we finsh the second part and state the lemma:

Lemma 2.1 ([CDKP23], 5). Let \(G = (V, E)\) be a \(d\)-regular bipartite graph on \(N = 2n\) vertices. There is a set \(\mathcal{C}^{\rm cut} \subseteq 2^V\) s.t.

  1. \(|\mathcal{C}^{\rm cut}| \leq n^{O(1/\delta)}\),
  2. \(\forall t \geq 1, S \subseteq V: \bigl( |\nabla(S)| \leq td \implies \exists C \in \mathcal{C}^{\rm cut}: |S \triangle C| \leq 32t \wedge |\nabla(C)| \leq 33td \bigr)\).

Moreover, \(\mathcal{C}^{\rm cut}\) can be constructed in time \(n^{O(1/\delta)}\).

This lemma is somehow similar to the graph container method: set of representatives "dominating" all objects with certain properties exists and can even be constructed.

Bounded (and Algorithmic) Dominance of "Principal" Cuts

In this section we show the third part. Let \(C_X := C \cap X\) be the \(X\)-part of "principal" cut \(C\) and \(C_Y\) likewise, our goal is to enumerate all small cuts \(S_A\) corresponding to some \(A \in \mathcal{A}(G)\) and is "dominated" by \(C\) (in particular, \(|S_A \triangle C| \leq 32t\) as stated in Lemma 2.1). For convience we relax the condition and get \(\max(|A \triangle C_X|, |N(A) \triangle C_Y|) \leq 32t\).

A straightforward observation is that every vertex in \(C_Y\) having \(>32t\) neighbors inside \(C_X\) is in \(N(A)\) (more precisely, \(N(A_X)\) where \(A_X := A \cap C_X\)), thus we can reduce the range of \(A\) by enumerating subsets \(S'_Y\) of \(S_Y := \left\lbrace v \in C_Y: |N(v) \cap C_X| \leq 32t \right\rbrace\) and exclude \(N(S'_Y)\) from \(C_X\).

Now we have a restriction on \(A_X\). Moreover, since \(|\nabla(C)|\) is small, the number of neighbors of \(A_X\) outside \(C_Y\) is upper bounded, and \(|N(A_X) \cap C_Y|\) is lower bounded: \[ |N(A_X) \cap C_Y| \geq |A_X| - \frac{|\nabla(C)|}{d} \geq |A| - |A \triangle C_X| - \frac{|\nabla(C)|}{d} = |A| - 64t \] For those possible \(u \in A \backslash C_X\), if \(N(u) \backslash C_Y\) is large, then the disjoint union \((N(u) \backslash C_Y) \cup (N(A_X) \cap C_Y)\) is also provably large. A bound is therefore implied by the \(t\)-contracting property of \(A\): \[ |A| + t \geq |N(A)| \geq \Bigl\lvert (N(u) \backslash C_Y) \cup (N(A_X) \cap C_Y) \Bigr\rvert \geq |N(u) \backslash C_Y| + |A| - 64t, \] that is, \(|N(u) \backslash C_Y| \leq 65t\).

Conclude the two parts, we can enumerate \(A\):

  1. inside \(C_X\), we enumerate \(S'_Y \subseteq S_Y\) and set \(A_X = C_X \backslash N(S'_Y)\); (if extra vertices are excluded, then either \(C_Y \backslash S_Y\) are not completely covered, or it's the case enumerating other values of \(S'_Y\).)
  2. outside \(C_X\), we enumerate \(S'_X \subseteq S_X := \left\lbrace u \in X \backslash C_X: |N(u) \backslash C_Y| \leq 65t \right\rbrace\) and set \(A = A_X \cup S'_X\).

The number of such \(A\)'s is thus bounded by \(2^{|S_X| + |S_Y|}\). Noticing that the vertices in the two sets all have lower bounded contributions to \(|\nabla(C)|\), we can hence bound the amount by choosing \(t\) suitably: \[ 33td \geq |\nabla(C)| \geq |S_X|(d - 65t) + |S_Y|(d - 32t) \geq \frac{d}{2} \bigl( |S_X| + |S_Y| \bigr) \] where we choose \(t < d/130\) to derive the last inequality.
Finally we have \(2^{|S_X| + |S_Y|} \leq 2^{66t}\) and state the following lemma with more general parameters:

Lemma 2.2 (modified from [CDKP23], 6). Fix any \(c, t \geq 1\). Given a cut \(C\) with \(|\nabla(C)| \leq (c+1)td\), the number of closed \(t\)-contracting subsets \(A \subseteq X\) s.t. \(\max(|A \triangle C_X|, |N(A) \triangle C_Y|) \leq ct\) is at most \(2^{\frac{c+1}{1 - (2c+1) t/d}t}\). Moreover, these sets can be enumerated in time \(2^{\frac{c+1}{1 - (2c+1) t/d} t} n^{O(1)}\).

By picking \(t \leq d/(4c+2)\) we have the bound \(2^{2(c+1)t}\), and inserting \(c = 32\) results in the bound we need.

Combining the three parts shown above, we reach the end:

Lemma 2.3 ([CDKP23], 2). For any \(d\)-regular bipartite graph \(G = (X, Y, E)\) and \(t_0 \leq 2^{-8} d\), the set \(\mathcal{A}(G)\) has size \(n^{O(1/\delta)} \cdot 2^{66t_0}\) and can be enumerated in the same time.

References

  • [CDKP23] Charlie Carlson, Ewan Davies, Alexandra Kolla, and Aditya Potukuchi. Approximately Counting Independent Sets in Dense Bipartite Graphs via Subspace Enumeration. SODA 2023, [CoRR abs/2307.09533]
  • [JPP21] M. Jenssen, W. Perkins, and A. Potukuchi. Approximately counting independent sets in bipartite graphs via graph containers. Random Structures & Algorithms, 63(1):215–241, 2023. doi:10.1002/rsa.21145, [CoRR abs/2109.03744].