In this post, we briefly explain the main idea of [CDKP23], which uses a novel spectral method named "subspace enumeration" to solve the extensively studied task of approximately counting the number of independent sets on a given bipartite graph .
The purpose of this post is only to sketch the proof (and its novel idea) in an easy-to-understand way. For more details, please refer to the original paper.
Overview
The problem setting we will discuss about in this post is -regular (and connected WLOG) bipartite graph where .
The -regular condition is often required in approximate counting problems, and here we will additionally utilize its spectral properties.
First we describe some common approaches towards counting the number of independent sets on . For a possible independent set , enumerating its -part gives
Since there might be many subsets having the same neighborhood, it seems a more efficient way to enumerate and compute the contribution of those having as (exact) neighborhood together. Denote the collection of these subsets by , then is a join-semilattice, implying there exists a greatest element s.t. every with is a subset of . We call such subsets closed, and the minimum closed subset containing the closure of , denoted by . Now we enumerate closed subsets (of ) instead of subsets of , which may contain many "useless" ones (i.e. ):
Trying to proceed further, we will soon find that two vertices may be "correlated" (i.e., have common neighbors), which leads to a difference between and . A natural attempt to resolve the term is to consider sets of vertices with no correlation between separately. Formally, we say two vertices are 2-linked if . A 2-linked subset is called a polymer, and polymers are compatible if they are not 2-linked (i.e., not correlated). Let be all polymers in . With this characterization of correlated vertices, we can uniquely partition into compatible polymers (also abbreviated as components in below, denoted by ) s.t. . Observing that is closed iff all its components are closed,
A standard step then is to make a discussion on the expansion of (the components of) . We call a subset -expanding if , and correspondingly -contracting if . The subsets with stronger expansion tend to have (exponentially) smaller weights, so we can only enumerate the -contracting polymers (which contributes the main part), then approximate the sum of -expanding polymers' weights. With the "threshold parameter" properly picked to ensure a provably reasonable deduction,
With an additional notation denoting the collection of all with all components closed and -contracting, we have the final form
Now the task of computing splits into three parts:
- enumerate ,
- (approximately) compute for all possible ,
- (approximately) compute for all possible .
The last two parts are mostly identical to that in [JPP21] and do not introduce much novelty, so here we omit the details (maybe I will write notes about [JPP21] and complete these parts in the future) and focus on the first one. To enumerate efficiently, the subspace enumeration technique is introduced. As a brief outline, the proof consists of three pieces:
- every corresponds to a small cut,
- a collection of "principal" cuts can be efficiently constructed using subspace enumeration s.t. every small cut is close to some "principal" cut,
- given a "principal" cut , we can enumerate those small cuts being close to .
By the term cut we refer to a subset , with its value defined as the number of edges with precisely one endpoint in .
The first part holds naturally by setting cut , where since every is -contracting.
The Subspace Enumeration Method
The second part starts from a spectral observation: can be written as , where denotes the indicator vector of with ground set , and denotes the Laplacian matrix of . For a -regular , thus by an averaging argument, at most eigenvalues of have absolute value . By the symmetry of the spectrum of about 0, at most eigenvalues of are . We denote the precise amount by , and let be an orthonormal basis of eigenvectors of with corresponding eigenvalues sorted:
Abusing notations slightly, for a set (or cut) labeled by a capital letter, we use the same letter in lowercase for its indicator vector and its coordinates over (e.g., is the indicator vector of set ). Let , intuitively for every small cut with , is close to its projection onto . (Here we take as a "unit" of measure because is -regular, and of course since even the smallest cut of will have value .) To be precise, which implies .
Now the goal becomes constructing a collection of cuts "dominating" . In [CDKP23], a brute-force method is chosen since (WLOG) a cut has -norm : taking an -net is acceptable by bounding its size with . But it's not sufficient since the lattice points in may be not an indicator vector (i.e. 0-1 vector). Fortunately, simply rounding them to the nearest 0-1 vector (breaking ties arbitrarily is OK, and we choose for consistency) will not introduce much error: suppose the nearest lattice point to is and is rounded to , we know is closer to than (noticing that is also an 0-1 vector), thus by the triangle inequality, and furthur . Let be the cut corresponding to and pick , Also is a small cut:
Here we finsh the second part and state the lemma:
Lemma 2.1 ([CDKP23], 5). Let be a -regular bipartite graph on vertices. There is a set s.t.
- ,
- .
Moreover, can be constructed in time .
This lemma is somehow similar to the graph container method: set of representatives "dominating" all objects with certain properties exists and can even be constructed.
Bounded (and Algorithmic) Dominance of "Principal" Cuts
In this section we show the third part. Let be the -part of "principal" cut and likewise, our goal is to enumerate all small cuts corresponding to some and is "dominated" by (in particular, as stated in Lemma 2.1). For convience we relax the condition and get .
A straightforward observation is that every vertex in having neighbors inside is in (more precisely, where ), thus we can reduce the range of by enumerating subsets of and exclude from .
Now we have a restriction on . Moreover, since is small, the number of neighbors of outside is upper bounded, and is lower bounded: For those possible , if is large, then the disjoint union is also provably large. A bound is therefore implied by the -contracting property of : that is, .
Conclude the two parts, we can enumerate :
- inside , we enumerate and set ; (if extra vertices are excluded, then either are not completely covered, or it's the case enumerating other values of .)
- outside , we enumerate and set .
The number of such 's is thus bounded by . Noticing that the vertices in the two sets all have lower bounded contributions to , we can hence bound the amount by choosing suitably: where we choose to derive the last inequality.
Finally we have and state the following lemma with more general parameters:
Lemma 2.2 (modified from [CDKP23], 6). Fix any . Given a cut with , the number of closed -contracting subsets s.t. is at most . Moreover, these sets can be enumerated in time .
By picking we have the bound , and inserting results in the bound we need.
Combining the three parts shown above, we reach the end:
Lemma 2.3 ([CDKP23], 2). For any -regular bipartite graph and , the set has size and can be enumerated in the same time.
References
- [CDKP23] Charlie Carlson, Ewan Davies, Alexandra Kolla, and Aditya Potukuchi. Approximately Counting Independent Sets in Dense Bipartite Graphs via Subspace Enumeration. SODA 2023, [CoRR abs/2307.09533]
- [JPP21] M. Jenssen, W. Perkins, and A. Potukuchi. Approximately counting independent sets in bipartite graphs via graph containers. Random Structures & Algorithms, 63(1):215–241, 2023. doi:10.1002/rsa.21145, [CoRR abs/2109.03744].