Probability and Stochastic Analysis - University of Bonn

# Seminar on random constraint satisfaction

## Tuesdays 10-12, N0.008

Seminar schedule
DateName
30.5Julia Rosenzweig
30.5Marcus Rockel
20.6Albert Solá Vilalta
27.6Lukas Polten
27.6Steffen Böhmer
4.7Malte Grändorf
11.7Andreas Lietz
18.7Falko Hegerfeld

A constraint satisfaction problem is a kind of combinatorial problem with "hard" constraints. Rather than trying to give a general definition, let's consider some examples:

1. Graph coloring: you're given a graph, and a paintbox containing q different colors. Your task is to color the vertices of the graph so that no edge of the graph has two endpoints of the same color. If you succeed, the result is called a proper coloring of the graph.

2. Independent sets: given a graph, an independent set is a subset of the vertices, no two of which are connected by an edge. Independent sets are always easy to find (for example, any set of size one is an independent set), so the point is usually to find a large one.

3. k-SAT, or k-CNF: you're given a boolean formula, in conjunctive normal form. That means you have a bunch of variables (n, say) that can be true or false. A k-clause is a boolean formula of the form "x_1 or x_2 or ... or (not x_k)", where x_1 through x_k are some of the variables. A k-CNF formula is a formula of the form "C_1 and C_2 and ... and C_m", where C_1 through C_m are k-clauses. The main question is whether such a formula is satisfiable, meaning that there is some assignment to the variables that makes the whole formula true. (This example might seem a bit artificial compared to the previous two, but satisfiability of k-CNFs is a fundamentally important problem in theoretical computer science. For example, satisfiability of 3-CNFs is the "original" NP-complete problem.)

A random contraint satisfaction problem is just one where we choose the instance at random. For example, what's the probability that a random graph is q-colorable? What does a random independent set look like? Given a random CNF formula, what's the probability that it's satisfiable?

Why random constraint satisfaction? Constraint satisfaction problems are often computationally hard to solve in the worst case, but hard instances don't tend to pop up in practice. To understand this phenomemon, computer scientists started to study random instances. It turned out that random instances display striking behavior, much of which is not understood. For example, here's a famous open problem:

The random 3-SAT conjecture: There is a constant c (approximately 4.2) such that for any eps > 0,

• a random 3-CNF formula with n variables and (c + eps) n clauses is unsatisfiable with high probability, and
• a random 3-CNF formula with n variables and (c - eps) n clauses is satisfiable with high probability.

## Contents of this course

The students in this seminar course will read and present research articles (published between the early 2000's up until last year) on random constraint satisfaction problems. Some of these papers are quite technical, or require some background; these may be read in small teams.

We will try to cover at least the following three themes:

The second moment method is a classical method for proving the existence of combinatorial structures. Unfortunately, the "vanilla" method doesn't tend to do much for random constraint satisfaction problems. Nevertheless, some variants (such as the "weighted" second moment method [1] and the "conditional" second moment method [2]) have turned out useful.

Fourier analysis of boolean functions turns out to be surprisingly powerful [3, 4] for proving the existence of "sharp thresholds," like the one in the random 3-SAT conjecture above. One nice thing about these arguments is that they are very general, and don't require you to delve much into the gory details of each specific problem.

The cavity method is a heuristic, coming from statistical physics, for predicting the behavior of large random constraint satisfaction problems. Essentially, you pretend that there is rather more independence in the system than there really is. Recent work (e.g. [5, 6]) has made important strides in rigorously confirming some of these predictions.

## Prerequisites

The main prerequisite for this course is a strong background in (discrete) probability, and some familiarity with graph theory or combinatorics.

## References

This is a list of references for the discussion above, and also a list of papers that you could possibly present on. If a paper is crossed out, it means that someone has already asked to present it. This list is not exhaustive. If you're looking for something a little bit different, or if someone has already taken a topic that you're interested in, email me. I can suggest more papers.

[1] Achlioptas, Dimitris, and Yuval Peres. "The threshold for random k-SAT is 2^k log 2 - O(k)." Journal of the American Mathematical Society 17.4 (2004): 947-973.

[2] Weitz, Dror. "Counting independent sets up to the tree threshold." Proceedings of the thirty-eighth annual ACM symposium on Theory of computing. ACM, 2006.

[3] Friedgut, Ehud, and Jean Bourgain. "Sharp thresholds of graph properties, and the k-sat problem." Journal of the American mathematical Society 12.4 (1999): 1017-1054.

[4] Achlioptas, Dimitris, and Ehud Friedgut. "A sharp threshold for k-colorability." Random Structures and Algorithms 14.1 (1999): 63-70.

[5] Ding, Jian, Allan Sly, and Nike Sun. "Proof of the satisfiability conjecture for large k." Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing. ACM, 2015.

[6] Sly, Allan, Nike Sun, and Yumeng Zhang. "The number of solutions for random regular NAE-SAT." Foundations of Computer Science (FOCS), 2016 IEEE 57th Annual Symposium on. IEEE, 2016.