Topics in Probabilistic Methods for Discrete Mathematics

**Hemanshu Kaul**

I am teaching a semester-long graduate seminar course based on this course proposal in Fall 2005. Participants include faculty members from Mathematics, Computer Science, and the Business School, and graduate students from Mathematics, Computer Science, Electrical and Computer Engineering, Mechanical and Industrial Engineering, and Physics.

**Introduction.** Although combinatorial problems have arisen in all
fields of mathematics throughout history, discrete mathematics has
only come to prominence as a mathematical discipline over the past
century. It has been beneficial to adapt techniques from more mature
areas of mathematics to tackle various combinatorial problems. A
fruitful area of collaboration has been with probability. 'The
probabilistic method' is now a well-established part of the graduate
curriculum in combinatorics.

Over the past two decades additional techniques with a probabilistic flavor have been developed for applications to combinatorial problems, both algorithmic and existential. A traditional first course in probabilistic combinatorics gives only the briefest hint of these areas from probability and information theory, such as Concentration of Measure, Entropy, and Rapidly Mixing Markov Chains. At UIUC, the courses on 'Methods in Combinatorics' and 'Applied Probability' do not teach these topics except for some large deviation inequalities. Moreover, the focus in Probability courses tends to be on the abstract development of the theory, which can obscure the applicability of these methods in discrete mathematics. This course will introduce graduate students to these methods and provide applications in graph theory, combinatorics, combinatorial optimization, and theoretical computer science. This course should be of interest to graduate students in Combinatorics, Probability, Operations Research, Theoretical Computer Science, and ECE. The topics are discussed in more detail below.

**Concentration of Measure.** Inequalities
for concentration of measure are vital tools in probabilistic
combinatorics, probabilistic analysis of algorithms, randomized
algorithms, and stochastic combinatorial optimization. They show
that the probability of a random variable being far from its mean
(or median) is exponentially small, and they give bounds on
probabilities of rare events. The topics include the following:
introduction to concentration of measure in metric spaces and its
relation to isoperimetric inequalities, Chernoff--Hoeffding bounds
for sums of random variables and their generalizations, McDiarmid's
method of bounded differences for Lipschitz bounded functions and
its variants (leading to the Azuma--Hoeffding Martingale
Inequality), and isoperimetric inequalities under Hamming metric
(leading to Talagrand's convex distance isoperimetric inequality and
its variants). The focus is on developing the themes underlying the
various methods and illustrating the final results through
applications in graph theory, combinatorial optimization, and
theoretical computer science. The main references, in addition to the instructor's lecture notes, include:

- N. Alon, J. Spencer, *The Probabilistic Method*, 2nd ed., (Academic Press, 2000). (esp. Chapter 7)
- M. Habib, C. McDiarmid, J. Ramirez-Alfonsin, B. Reed, *Probabilistic Methods for Algorithmic Discrete Mathematics*, (Springer, 1998). (esp. McDiarmid, Concentration, 195--248)
- S. Janson, On concentration of probability, In *Contemporary Combinatorics*, ed. B. Bollobas, Bolyai Society Mathematical Studies 10 (2002), 289--301.
- C. McDiarmid, On the method of bounded differences, In *Surveys in Combinatorics*, LMS lecture note series 141 (1989), 148--188.
- J.M. Steele, *Probability Theory and Combinatorial Optimization*, (SIAM, 1997).
- M. Talagrand, Concentration of measure and isoperimetric inequalities in product spaces, *Publ. Math. IHES* 81 (1995), 73--205.
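
As a small numerical illustration of the Chernoff--Hoeffding bounds mentioned above, the following sketch compares the exact upper tail of a sum of n fair coin flips with Hoeffding's bound exp(-2t^2/n). The function names and the values n = 100 and t = 10, 20, 30 are illustrative assumptions, not part of the course material.

```python
from math import comb, exp


def exact_upper_tail(n: int, t: float) -> float:
    """Exact P(X - n/2 >= t) for X ~ Binomial(n, 1/2)."""
    threshold = n / 2 + t
    # Count the outcomes with at least `threshold` heads among 2^n equally likely ones.
    favourable = sum(comb(n, k) for k in range(n + 1) if k >= threshold)
    return favourable / 2 ** n


def hoeffding_bound(n: int, t: float) -> float:
    """Hoeffding's bound exp(-2 t^2 / n) for a sum of n independent [0, 1]-valued variables."""
    return exp(-2 * t * t / n)


if __name__ == "__main__":
    n = 100  # illustrative choice
    for t in (10, 20, 30):
        print(f"t={t}: exact={exact_upper_tail(n, t):.3e}, "
              f"bound={hoeffding_bound(n, t):.3e}")
```

The bound is not tight, but it decays exponentially in t^2/n, which is the behaviour the concentration inequalities in this part of the course are designed to capture.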

**Entropy.** Entropy of a random variable
measures the amount of uncertainty in the random variable or the
amount of information obtained when the random variable is revealed.
In the last decade, entropy has been applied to provide short and
elegant proofs for various counting and covering problems in graphs
and set systems. After introducing the elementary properties of the
entropy function and Shearer's Entropy Lemma, the focus will be on
combinatorial applications giving bounds on various extremal
problems. Examples include Bregman's bound on the permanent of a
0,1-matrix, bounds on the size of an intersecting family of graphs
and on the number of copies of a fixed subgraph, Dedekind's problem
on the number of monotone Boolean functions, and covering a complete
r-uniform hypergraph with a small number of r-partite
hypergraphs. We will also consider Friedgut's generalization of
Shearer's Lemma, leading to a common generalization of classical
inequalities such as those of Cauchy--Schwarz and Holder. The main references, in addition to the instructor's lecture notes, include:

- I. Csiszar, J. Korner, *Information Theory*, (Academic Press, 1981).
- E. Friedgut, Hypergraphs, entropy and inequalities, *The American Mathematical Monthly* 111 (2004), 749--760.
- E. Friedgut, J. Kahn, On the number of copies of one hypergraph in another, *Israel Journal of Mathematics* 105 (1998), 251--256.
- D. Galvin, P. Tetali, On weighted graph homomorphisms, *DIMACS-AMS Special Volume* 63 (2004), 97--104.
- J. Radhakrishnan, Entropy and counting, In *Computational Mathematics, Modelling and Algorithms*, ed. J.C. Mishra, (Narosa Publishers, New Delhi, 2003).
- G. Simonyi, Graph entropy - a survey, In *Combinatorial Optimization*, DIMACS Series Discrete Math. Theoret. Comput. Sci. 20 (A.M.S., 1995), 399--441.
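
For orientation, one standard formulation of the entropy function and of Shearer's Entropy Lemma is sketched below. The notation ($X_F$, $\mathcal{F}$, $k$) is chosen here for illustration and may differ from that used in the lecture notes.

```latex
% Entropy of a discrete random variable X
H(X) = -\sum_{x} \Pr[X = x] \, \log_2 \Pr[X = x]

% Shearer's Lemma: let X = (X_1, \dots, X_n) and let \mathcal{F} be a family
% of subsets of \{1, \dots, n\} such that every index i lies in at least k
% members of \mathcal{F}. Writing X_F = (X_i)_{i \in F},
k \, H(X_1, \dots, X_n) \;\le\; \sum_{F \in \mathcal{F}} H(X_F)
```

Taking $\mathcal{F}$ to be the $n$ singletons and $k = 1$ recovers the subadditivity of entropy, $H(X_1, \dots, X_n) \le \sum_i H(X_i)$.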

**Rapidly Mixing Markov Chains.** Over the
past decade the Markov chain Monte Carlo (MCMC) method has emerged
as a powerful methodology for approximate counting, computing
multidimensional volumes and integrals, and combinatorial
optimization. The MCMC method reduces these problems to sampling
over an underlying set (of solutions or combinatorial structures)
with respect to a given distribution. This sampling is done by a Markov
chain on the underlying set that converges to the required
(stationary) distribution. The primary step in the rigorous analysis
of such an MCMC algorithm is to show that the Markov chain is
rapidly mixing, i.e., it has a high rate of convergence to its
stationary distribution. This analysis tends to be an interesting
mix of probability and combinatorics. The topics will include the
equivalence of (approximate) counting and (almost) uniform sampling,
the relation between Fully Polynomial Randomized Approximation
Schemes (FPRAS) and rapid mixing of Markov chains, and the study of
various methods for bounding the mixing rates of combinatorially
defined Markov chains. These methods, including coupling,
conductance, and canonical paths, will be used in applications of
the MCMC method to problems such as the Knapsack problem, proper
colorings of a graph, linear extensions of a poset, and the permanent
of a 0,1-matrix. The main references, in addition to the instructor's lecture notes, include:

- M. Jerrum, Mathematical foundations of the Markov chain Monte Carlo method, In *Probabilistic Methods for Algorithmic Discrete Mathematics*, (Springer, 1998), 116--165.
- M. Jerrum, *Counting, Sampling and Integrating: Algorithms and Complexity*, Lectures in Mathematics (ETH Zurich, 2003).
- M. Jerrum, A. Sinclair, The Markov chain Monte Carlo method: an approach to approximate counting and integration, In *Approximation Algorithms for NP-hard Problems*, ed. D. Hochbaum, (PWS, 1996), 482--520.
- L. Lovasz, Random walks on graphs: a survey, In *Combinatorics, Paul Erdos is Eighty Vol. 2*, ed. D. Miklos, V. T. Sos, T. Szonyi, (Bolyai Society, 1996), 353--398.
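
To make the sampling step described above concrete, here is a minimal sketch of a single-site update chain on the proper q-colorings of a graph (a chain of the kind often called Glauber dynamics). All function names, the 5-cycle example, and the step count are illustrative assumptions. In particular, the 1000 steps used below are an arbitrary choice and not a mixing-time bound; proving such bounds is the subject of the course.

```python
import random


def greedy_coloring(adj, q):
    """Build a proper q-coloring greedily; assumes q exceeds the maximum degree."""
    coloring = {}
    for v in adj:
        used = {coloring[u] for u in adj[v] if u in coloring}
        coloring[v] = next(c for c in range(q) if c not in used)
    return coloring


def is_proper(adj, coloring):
    """Check that no edge has both endpoints of the same color."""
    return all(coloring[u] != coloring[v] for v in adj for u in adj[v])


def update_step(adj, coloring, q, rng):
    """Pick a uniform vertex and a uniform color; recolor only if the coloring stays proper."""
    v = rng.choice(list(adj))
    c = rng.randrange(q)
    if all(coloring[u] != c for u in adj[v]):
        coloring[v] = c


def sample_coloring(adj, q, steps, seed=0):
    """Run the chain for `steps` steps from a greedy start and return the final state."""
    rng = random.Random(seed)
    coloring = greedy_coloring(adj, q)
    for _ in range(steps):
        update_step(adj, coloring, q, rng)
    return coloring


if __name__ == "__main__":
    # Hypothetical example: the 5-cycle with q = 5 colors.
    cycle = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
    result = sample_coloring(cycle, q=5, steps=1000)
    print(result, is_proper(cycle, result))
```

Because each proposed move is as likely as its reverse, the uniform distribution on proper colorings is stationary for this chain. Whether the chain reaches it quickly depends on q and the maximum degree, which is exactly what techniques such as coupling are used to analyze.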