March 18

Rapid Mixing at the Uniqueness Threshold

Abstract: Over the past decades, a fascinating computational phase transition has been identified in sampling from Gibbs distributions. However, the computational complexity at the critical point remains poorly understood, as previous algorithmic and hardness results all required a constant slack from this threshold. In this talk, I will present our recent result resolving this open question at the critical threshold, thus completing the picture of the computational phase transition. Specifically, we show that for the critical hardcore model, the mixing time of Glauber dynamics is always polynomial and in the worst case super-linear in the number of vertices. Similar results are established for the critical Ising model.

Based on joint work with Xiaoyu Chen, Yitong Yin, and Xinyuan Zhang.

Location: TBD
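Glauber dynamics, the chain whose critical mixing time the talk analyzes, is easy to state. Below is a minimal Python sketch for the hardcore model, assuming a graph given as an adjacency dict; the graph and step count are illustrative, and the critical fugacity is λ_c(Δ) = (Δ−1)^(Δ−1)/(Δ−2)^Δ.

```python
# A minimal sketch of single-site Glauber dynamics for the hardcore model
# (illustrative graph and step count; not code from the paper).
import random

def glauber_step(adj, config, lam):
    """Resample one uniformly random vertex from its conditional distribution."""
    v = random.choice(list(adj))
    if any(config[u] for u in adj[v]):
        config[v] = 0                                  # an occupied neighbor forces v out
    else:
        config[v] = int(random.random() < lam / (1 + lam))

def sample_hardcore(adj, lam, steps):
    """Run the chain from the empty independent set for `steps` updates."""
    config = {v: 0 for v in adj}
    for _ in range(steps):
        glauber_step(adj, config, lam)
    return config

# Example: K4 (so Delta = 3) at the critical fugacity lambda_c(3) = 2^2 / 1^3 = 4.
adj = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
print(sample_hardcore(adj, lam=4.0, steps=10_000))
```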

March 11

Overparametrized systems: from Smale's 17th problem to two-layer neural networks

Abstract: Training modern machine learning models requires optimizing highly non-convex risk functions, and yet simple gradient-based methods are able to find global minima for very high-dimensional problems. Randomness and overparametrization are likely to have something to do with this magic. In the 90s, Stephen Smale conjectured that a system of n random polynomial equations in n complex unknowns can be solved in average-case polynomial time. I will present recent progress on the real-variables version of this problem and characterize the overparametrization that is required in that case to make the problem efficiently solvable. I will then show that closely related ideas can be used to analyze two-layer neural networks. In that case, reaching a global minimum (in the absence of regularization) can be problematic from a statistical viewpoint, leading to overfitting.

Based on joint papers with Eliran Subag and with Pierfrancesco Urbani.

Location: TBD
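The phenomenon in the opening sentences is easy to reproduce in miniature. The sketch below (my own toy, not the talk's analysis; all sizes and the step size are illustrative) runs plain gradient descent on the square loss of a two-layer ReLU network whose width m far exceeds the sample count n, and the non-convex loss is nevertheless driven toward zero.

```python
# Toy demonstration: gradient descent on an overparametrized two-layer ReLU
# network interpolates random targets despite non-convexity. The second layer
# is held fixed for simplicity; all parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 50, 10, 2000                     # m >> n: the overparametrized regime
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)                 # random targets (pure interpolation)

W = rng.standard_normal((m, d)) / np.sqrt(d)       # trained first layer
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)   # fixed second layer

for step in range(5001):
    H = np.maximum(X @ W.T, 0.0)           # (n, m) hidden activations
    resid = H @ a - y
    # Gradient of (1/2n) * ||resid||^2 with respect to W.
    G = ((resid[:, None] * (H > 0)) * a[None, :]).T @ X / n
    W -= 2.0 * G
    if step % 1000 == 0:
        print(step, float(0.5 * np.mean(resid ** 2)))
```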

March 04

Pseudorandom Correlation Generators

Abstract: Correlated secret randomness is an important resource for many cryptographic applications. We initiate the study of pseudorandom correlation generators (PCGs), an analog of pseudorandom generators for the case of multi-party correlations, which offer the ability to securely generate large sources of correlated randomness from short correlated seeds using only local computation. In this talk, we survey the state of the art in PCGs, including constructions and usage within secure computation.

Location: TBD
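To fix the syntax: a PCG is a pair of algorithms, Gen producing two short seeds and Expand stretching each seed locally into one side of a long correlated output. The sketch below is a toy, deliberately insecure schematic of that interface only (the target correlation, Beaver-style multiplication triples, and the hash-based PRG are my illustrative choices); a real PCG must additionally guarantee that each seed reveals nothing beyond that party's own output.

```python
# Toy, INSECURE schematic of the PCG interface: Gen -> two short seeds,
# Expand -> long correlated outputs via local computation only. The target
# correlation is Beaver-style multiplication triples over a prime field.
import hashlib, os

MOD = 2**61 - 1                                    # a prime field

def prg(seed: bytes, i: int) -> int:
    """Hash-based PRG: the i-th pseudorandom field element under `seed`."""
    h = hashlib.sha256(seed + i.to_bytes(8, "big")).digest()
    return int.from_bytes(h[:16], "big") % MOD

def gen():
    """Gen(1^lambda) -> (k0, k1), two short correlated seeds."""
    common = os.urandom(16)                        # toy: both seeds share everything,
    return b"0" + common, b"1" + common            # which is exactly what a real PCG avoids

def expand(sigma: int, key: bytes, n: int):
    """Expand(sigma, k_sigma) -> party sigma's shares of n triples (a, b, ab)."""
    common, out = key[1:], []
    for i in range(n):
        a, b = prg(common, 5 * i), prg(common, 5 * i + 1)
        a0, b0, c0 = (prg(common, 5 * i + j) for j in (2, 3, 4))
        if sigma == 0:
            out.append((a0, b0, c0))
        else:
            out.append(((a - a0) % MOD, (b - b0) % MOD, (a * b - c0) % MOD))
    return out

k0, k1 = gen()
for (a0, b0, c0), (a1, b1, c1) in zip(expand(0, k0, 4), expand(1, k1, 4)):
    assert (c0 + c1) % MOD == ((a0 + a1) * (b0 + b1)) % MOD   # shares form a valid triple
```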

February 25

Good Locally Testable Codes

Abstract: An error-correcting code is locally testable (LTC) if there is a random tester that reads only a small number of bits of a given word and decides whether the word is in the code, or at least close to it. A long-standing problem asks if there exists such a code that also satisfies the gold standards of coding theory: constant rate and constant distance. Unlike the classical situation in coding theory, random codes are not LTCs, so this problem is a challenge of a new kind. We construct such codes based on what we call (Ramanujan) Left/Right Cayley square complexes. These objects seem to be of independent group-theoretic interest. The codes built on them are 2-dimensional versions of the expander codes constructed by Sipser and Spielman (1996). The main result and lecture will be self-contained, but we hope also to explain how the seminal work of Howard Garland (1972) on the cohomology of quotients of the Bruhat-Tits buildings of p-adic Lie groups has led to this construction (even though it is not used in the end).

Based on joint work with I. Dinur, S. Evra, R. Livne, and S. Mozes.

Location: TBD
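The local-testing access pattern is worth seeing concretely. The sketch below assumes a code given by sparse parity checks (as in Sipser-Spielman expander codes) and uses the tiny [7,4] Hamming code purely as a stand-in; it illustrates the query model, not the construction in the talk.

```python
# Minimal sketch of a local tester for a code given by sparse parity checks:
# read only the bits of a few random checks and accept iff all are satisfied.
# The [7,4] Hamming code below is a stand-in, not an LTC family.
import random

def local_test(word, checks, repetitions=10):
    """Each check is a list of positions whose bits must XOR to 0."""
    for _ in range(repetitions):
        check = random.choice(checks)
        if sum(word[i] for i in check) % 2 != 0:   # reads only |check| bits
            return False
    return True

checks = [[0, 2, 4, 6], [1, 2, 5, 6], [3, 4, 5, 6]]   # Hamming [7,4] parity checks
codeword  = [0, 0, 0, 0, 0, 0, 0]
corrupted = [1, 0, 0, 0, 0, 0, 0]                     # violates the first check
print(local_test(codeword, checks))    # always True
print(local_test(corrupted, checks))   # False with probability 1 - (2/3)^10
```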

February 18

On the Complexity of Neural Computation in Superposition

Abstract: Recent advances in neural network interpretability suggest that superposition, the ability of a network to represent many more features than it has neurons, is a key mechanism underlying how neural networks compute. This work explores the theoretical foundations of computing in superposition, focusing on explicit, provably correct algorithms and their efficiency. We introduce a lower bound technique that provides the first general lower bounds for neural network size. Using this technique, we show that for a broad class of problems, including permutations and pairwise logical operations, a neural network requires at least Omega(sqrt(m' log m')) neurons and Omega(m' log m') parameters, where m' is the number of output features being computed. This establishes a limit on how much superposition a neural network can exhibit. Conversely, we prove a nearly matching upper bound: fundamental logical operations, such as computing a set of m' pairwise AND operations in parallel, can be computed using O(sqrt(m') log m') neurons and O(m' log^2 m') parameters. These results have several implications. They demonstrate an exponential gap between computing in superposition (the focus of this work) and merely representing the features in superposition, which requires only O(log m') neurons by the Johnson-Lindenstrauss Lemma. Moreover, our lower bound implies that any of the methods used in practice to compress neural networks must produce a model with at least Omega(m' log m') parameters, regardless of the initial dense network size. We hope that our results open a path for applying complexity-theoretic techniques to future research on neural network interpretability.

Joint work with Nir Shavit.

Location: TBD
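The representation side of that exponential gap is classical and fits in a few lines. Below is a sketch (parameters illustrative) of packing m' one-hot features into d = O(log m') dimensions with a random sign matrix, Johnson-Lindenstrauss style, and reading a sparse set of them back by inner products. The talk's lower bounds say that computing on features packed this way costs far more neurons.

```python
# Sketch of *representing* m' features in superposition via random projection
# (Johnson-Lindenstrauss); dimensions and threshold are illustrative.
import numpy as np

rng = np.random.default_rng(1)
m_feats, d = 10_000, 400                  # d ~ C * log(m') dimensions
E = rng.choice([-1.0, 1.0], size=(d, m_feats)) / np.sqrt(d)   # feature directions

active = [17, 4242, 9001]                 # sparse set of active features
x = E[:, active].sum(axis=1)              # superposed d-dimensional representation

scores = E.T @ x                          # ~1 on active features, ~0 elsewhere
decoded = np.flatnonzero(scores > 0.5)
print(sorted(decoded) == sorted(active))  # True with high probability
```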

February 11

"Local-to-Global" Theorems on High Dimensional Expanders

Abstract: Expansion in graphs is a well-studied topic, with a wealth of applications in many areas of mathematics and the theory of computation. High dimensional expansion is a generalization of expansion from graphs to higher dimensional objects, such as simplicial complexes or partially ordered sets. High dimensional expanders are still far from understood, but one fascinating new aspect is how global properties emerge from local ones. This phenomenon has already led to progress on longstanding questions in the areas of error-correcting codes and probabilistically checkable proofs (PCPs). I will describe the two key definitions: cosystolic expansion and link expansion. I will then describe how these relate to some of the exciting new applications.

Location: TBD
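To make "link expansion" concrete, here is a toy computation of my own (not from the talk): for a 2-dimensional complex given by its triangles, the link of a vertex v is the graph on its neighbors with an edge {u, w} whenever {v, u, w} is a triangle, and local spectral expansion asks every link's random walk to have small second eigenvalue.

```python
# Toy illustration of "link expansion" for a 2-dimensional simplicial complex
# given by its triangles: a good local spectral expander has small lambda_2
# in every vertex link.
import numpy as np
from itertools import combinations

def link_edges(triangles, v):
    """Edges of the link of v: the pairs completing a triangle with v."""
    return [tuple(sorted(set(t) - {v})) for t in triangles if v in t]

def lambda2(edges):
    """Second-largest eigenvalue of the random walk on the graph `edges`."""
    verts = sorted({u for e in edges for u in e})
    idx = {u: i for i, u in enumerate(verts)}
    A = np.zeros((len(verts), len(verts)))
    for u, w in edges:
        A[idx[u], idx[w]] += 1
        A[idx[w], idx[u]] += 1
    P = A / A.sum(axis=1, keepdims=True)   # random-walk matrix
    return np.sort(np.linalg.eigvals(P).real)[-2]

# The complete complex on 5 vertices: every link is K4, with lambda_2 = -1/3.
triangles = list(combinations(range(5), 3))
print(max(lambda2(link_edges(triangles, v)) for v in range(5)))
```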

December 10

Coboundary Expansion Inside Chevalley Coset Complex HDXs

Refreshments at 4:00 PM

Abstract: Recent major results in property testing and PCPs were unlocked by moving to high-dimensional expanders (HDXs) constructed from C_d-type buildings, rather than the long-known A_d-type ones. At the same time, these building quotient HDXs are not as easy to understand as the more elementary (and more symmetric/explicit) coset complex HDXs constructed by Kaufman-Oppenheim [KO18] (of A_d-type) and O'Donnell-Pratt [OP22] (of B_d-, C_d-, and D_d-type). Motivated by these considerations, we study the B_3-type generalization of a recent work of Kaufman-Oppenheim [KO21], which showed that the A_3-type coset complex HDXs have good 1-coboundary expansion in their links, and thus yield 2-dimensional topological expanders.

The crux of Kaufman-Oppenheim's proof of 1-coboundary expansion was: (1) identifying a group-theoretic result by Biss and Dasgupta on small presentations for the A_3-unipotent group over F_q; (2) "lifting" it to an analogous result for an A_3-unipotent group over polynomial extensions F_q[x]. For our B_3-type generalization, the analogue of (1) appears not to hold. We manage to circumvent this with a significantly more involved strategy: (1) obtaining a computer-assisted proof of vanishing 1-cohomology of B_3-type unipotent groups over F_5; (2) developing significant new "lifting" technology to deduce the required quantitative 1-cohomology results for B_3-type unipotent groups over F_{5^k}[x].

Joint work with Noah G. Singer (Carnegie Mellon).

Location: 32-G449 (Kiva/Patil)
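For readers unfamiliar with the objects: in the easier A_3 case, the unipotent group over F_q is simply the group of 4x4 upper unitriangular matrices, generated by the simple root subgroups x_i(t) = I + t·E_{i,i+1}. The toy enumeration below (my own illustration, unrelated to the computer-assisted cohomology computation in the talk) confirms |U| = q^6 over F_5, one factor of q per positive root; the B_3-type groups in the talk are the analogous unipotent subgroups of odd orthogonal groups.

```python
# Toy: enumerate the A_3-type unipotent group over F_5 (4x4 upper
# unitriangular matrices) by BFS from the simple root subgroups, and check
# |U| = 5^6 (one factor of q per positive root of A_3).
import numpy as np

q, n = 5, 4
gens = []
for i in range(n - 1):
    for t in range(1, q):
        E = np.eye(n, dtype=int)
        E[i, i + 1] = t                 # the root element x_i(t)
        gens.append(E)

identity = np.eye(n, dtype=int)
seen = {identity.tobytes()}
frontier = [identity]
while frontier:
    nxt = []
    for M in frontier:
        for g in gens:
            P = (M @ g) % q
            key = P.tobytes()
            if key not in seen:
                seen.add(key)
                nxt.append(P)
    frontier = nxt

print(len(seen), q ** 6)                # 15625 15625
```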

December 03

A New Approach to Optimal Spectral Gaps

Refreshments at 4:00 PM

Abstract: It was conjectured by Alon in the 1980s that random d-regular graphs have the largest possible spectral gap (up to negligible error) among all d-regular graphs. This conjecture was proved by Friedman in 2004 in a major tour de force. In recent years, deep generalizations of Friedman's theorem, such as strong convergence of random permutation matrices due to Bordenave and Collins, have played a central role in a series of breakthrough results on random graphs, geometry, and operator algebras.

In joint work with Chen, Garza-Vargas, and Tropp, we recently discovered a surprisingly simple new approach to such results that is almost entirely based on soft arguments. This approach makes it possible to address previously inaccessible questions: for example, it enables a sharp understanding of the large deviation probabilities in Friedman's theorem, and establishes optimal spectral gaps of random Schreier graphs that require many fewer random bits than ordinary random regular graphs. I will aim to explain some of these results and some of the intuition behind the proofs.

Location: 32-124
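Friedman's theorem is easy to probe numerically. The sketch below (sizes illustrative) samples a random 2k-regular graph in the permutation model, built from the same random permutation matrices that appear in Bordenave-Collins strong convergence, and compares its largest nontrivial eigenvalue with the Ramanujan bound 2√(d−1).

```python
# Empirical check of Friedman's theorem in the permutation model: a random
# 2k-regular (multi)graph is the union of k random permutations and their
# inverses; its nontrivial eigenvalues concentrate near 2*sqrt(d-1).
import numpy as np

rng = np.random.default_rng(0)
n, k = 2000, 2
d = 2 * k                                   # degree of the resulting graph

A = np.zeros((n, n))
for _ in range(k):
    perm = rng.permutation(n)
    A[np.arange(n), perm] += 1              # permutation edge i -> perm(i)
    A[perm, np.arange(n)] += 1              # and its inverse (loops/multi-edges allowed)

eigs = np.sort(np.linalg.eigvalsh(A))       # eigs[-1] == d, the trivial eigenvalue
lam = max(eigs[-2], -eigs[0])               # largest nontrivial eigenvalue in magnitude
print(f"lambda = {lam:.3f}  vs  2*sqrt(d-1) = {2 * np.sqrt(d - 1):.3f}")
```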

November 26

Recent Advances in Differential Privacy under Continual Observation

Refreshments at 4:00 PM

Abstract: Differential privacy is one of the most popular definitions of privacy, used both in research and in industrial applications. The multi-query setting, where many queries have to be answered over a static data set, is already well understood for many algorithmic questions. The dynamic setting, where updates to the data set are interleaved with queries over the current data set, was introduced in 2010 by Dwork, Naor, Pitassi, and Rothblum, who called it differential privacy under continual observation. While there was not much work on this model in the 2010s, it has received significant attention in recent years, partially due to its use in private machine learning. I will survey the state of the art in the continual observation setting and explain its motivation as well as the main algorithmic techniques that have led to the recent advances.

Location: 32-G449
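The foundational technique in this setting is the binary-tree mechanism from the Dwork-Naor-Pitassi-Rothblum paper mentioned above: each dyadic block of the stream is noised once, and every prefix count is assembled from O(log T) noisy blocks, so each update touches only O(log T) noise terms. A compact sketch, with illustrative parameters:

```python
# Sketch of the binary-tree mechanism for epsilon-DP continual counting.
import math, random

class TreeCounter:
    def __init__(self, T, epsilon):
        self.levels = max(1, math.ceil(math.log2(T)))
        # Each item lies in at most levels+1 blocks; basic composition then
        # gives epsilon-DP with Laplace scale (levels+1)/epsilon per block.
        self.scale = (self.levels + 1) / epsilon
        self.noisy = {}                     # (level, index) -> noisy block sum
        self.stream = []

    def _laplace(self):
        return random.expovariate(1 / self.scale) - random.expovariate(1 / self.scale)

    def _block(self, level, index):
        """Noisy sum of stream[index*2^level : (index+1)*2^level], noised once."""
        if (level, index) not in self.noisy:
            lo = index * 2 ** level
            self.noisy[(level, index)] = sum(self.stream[lo:lo + 2 ** level]) + self._laplace()
        return self.noisy[(level, index)]

    def append(self, x):
        self.stream.append(x)

    def release(self):
        """Noisy count of all items so far, from complete dyadic blocks."""
        t, total, start = len(self.stream), 0.0, 0
        for level in reversed(range(self.levels + 1)):
            size = 2 ** level
            if t - start >= size:           # start is always size-aligned here
                total += self._block(level, start // size)
                start += size
        return total

tc = TreeCounter(T=1024, epsilon=1.0)
for _ in range(300):
    tc.append(random.randint(0, 1))
print(tc.release(), sum(tc.stream))         # noisy vs exact count
```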

November 19

Vizing's Theorem in Near-Linear Time

Refreshments at 4:00 PM

Abstract: In his classic result, Vizing (1964) proved that any graph of maximum degree ∆ can be edge colored using at most ∆+1 different colors. Vizing's original proof is algorithmic and shows that such an edge coloring can be found in O(mn) time, where m and n denote the number of edges and vertices respectively. This was subsequently improved to Õ(m√n) independently by [Arjomandi '82; Gabow et al. '85]. That bound remained the state of the art for four decades, and was only recently improved to Õ(min{n², mn^{1/4}}) [Assadi '24; Bhattacharya et al. '24].

In this talk, I will present a completely new approach to edge coloring that leads to the first near-linear time algorithm (in fact, O(m log ∆) time) for Vizing's theorem.

Based on a recent joint work with Sepehr Assadi, Sayan Bhattacharya, Martín Costa, Shay Solomon, and Tianyi Zhang.

Location: 32-G449
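For contrast with the ∆+1 guarantee, the trivial greedy baseline below (my own illustration, not the talk's algorithm) gives each edge the smallest color free at both endpoints; since an edge conflicts with at most 2(∆−1) previously colored edges, 2∆−1 colors always suffice, and the running time is already near-linear. The hard part, which the new result solves, is getting down to ∆+1 colors at near-linear speed.

```python
# Greedy edge coloring: at most 2*Delta - 1 colors in O(m * Delta) time.
# This is the easy baseline, NOT Vizing's Delta+1 algorithm.
from collections import defaultdict

def greedy_edge_coloring(edges):
    """Assign each edge the smallest color unused at both endpoints."""
    used = defaultdict(set)             # vertex -> colors on incident edges
    coloring = {}
    for u, v in edges:
        c = 0
        while c in used[u] or c in used[v]:
            c += 1
        coloring[(u, v)] = c
        used[u].add(c)
        used[v].add(c)
    return coloring

# Example: K4 has Delta = 3; greedy is guaranteed at most 5 colors, Vizing
# guarantees 4, and this edge order happens to use the optimal 3.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(greedy_edge_coloring(edges))
```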

October 08

Learning to Defer in Content Moderation: The Human-AI Interplay

Refreshments at 4:00 PM

Abstract: Ensuring successful content moderation is vital for a healthy online social platform, where it is necessary to responsively remove harmful posts without jeopardizing non-harmful content. Due to the high volume of online posts, human-only moderation is operationally challenging, and platforms often employ a human-AI collaboration approach. A typical heuristic estimates the expected harmfulness of incoming posts and uses fixed thresholds to decide whether to remove the post and whether to send it for human review. This disregards the uncertainty in the machine learning estimation, the time-varying nature of human review capacity and post arrivals, and the selective sampling in the dataset (humans only review posts filtered by the admission algorithm).

We introduce a model to capture this human-AI interplay. Our algorithm observes contextual information for posts, makes classification and admission decisions, and schedules posts for human review. Only admitted posts receive human reviews of their harmfulness. These reviews help train the machine learning algorithms but are delayed due to congestion in the human review system. We propose a near-optimal learning algorithm that balances the classification loss from a selectively sampled dataset, the idiosyncratic loss of non-reviewed posts, and the delay loss from congestion in the human review system. To the best of our knowledge, this is the first result on online learning in contextual queueing systems, and hence our analytical framework may be of independent interest.

This talk is based on joint work with Wentao Weng (Ph.D. student at MIT EECS). A preprint of the corresponding paper can be found here: https://arxiv.org/pdf/2402.12237. This work has been selected as a finalist in the 2024 INFORMS Junior Faculty Interest Group (JFIG) Paper Competition.

Location: 32-G449
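The fixed-threshold heuristic that the paper improves upon is worth pinning down, since it is the baseline throughout. A sketch, with purely illustrative thresholds and a stand-in classifier:

```python
# Sketch of the fixed-threshold triage heuristic described in the abstract:
# remove confidently harmful posts, queue borderline ones for human review,
# publish the rest. Thresholds and the toy classifier are illustrative; the
# paper's algorithm replaces these fixed thresholds with decisions that track
# estimation uncertainty, review-queue congestion, and selective sampling.
def triage(post, predict_harm, remove_above=0.9, review_above=0.5):
    """Route one post given a model's harmfulness estimate in [0, 1]."""
    p = predict_harm(post)
    if p >= remove_above:
        return "remove"                  # confidently harmful: take down now
    if p >= review_above:
        return "human_review"            # uncertain: consume scarce review capacity
    return "publish"                     # confidently benign: never gets a human label

# Toy usage with a stand-in classifier.
posts = ["spam spam spam", "meeting at 4pm", "buy pills now"]
predict = lambda text: min(1.0, 0.45 * text.count("spam") + 0.6 * ("pills" in text))
print([triage(p, predict) for p in posts])   # ['remove', 'publish', 'human_review']
```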

September 24

Near Optimal Alphabet-Soundness Tradeoff PCPs

Refreshments at 4:00 PM

Abstract: We show a nearly optimal alphabet-soundness tradeoff for NP-hardness of 2-Prover-1-Round Games (2P1R). More specifically, we show that for all ε > 0 and all sufficiently large M, it is NP-hard to decide whether a 2P1R instance with alphabet size M has value nearly 1 or at most M^{-1+ε}. 2P1R games are equivalent to 2-query PCPs and are widely used in obtaining hardness of approximation results. As such, our result implies the following: (1) hardness of approximating Quadratic Programming within a factor of nearly log(n), (2) hardness of approximating d-bounded degree 2-CSP within a factor of nearly d/2, and (3) improved hardness of approximation results for various k-vertex-connectivity problems. For the first two applications, our results nearly match the performance of the best known algorithms.

Joint work with Dor Minzer.

Location: 32-G449
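To unpack the object in question: in a 2P1R game the verifier samples a question pair, each prover answers from an alphabet of size M without communicating, and the value is the best acceptance probability over prover strategies. A brute-force toy (the instance is made up for illustration; real hardness instances are of course enormous):

```python
# Exact value of a tiny 2-Prover-1-Round Game by enumerating all
# deterministic prover strategies (exponential, illustration only).
from itertools import product

def game_value(questions, accept, Qx, Qy, M):
    """Max acceptance probability over strategies fx, fy: question -> answer."""
    best = 0.0
    for fx in product(range(M), repeat=Qx):        # prover 1's strategy
        for fy in product(range(M), repeat=Qy):    # prover 2's strategy
            wins = sum(accept(x, y, fx[x], fy[y]) for x, y in questions)
            best = max(best, wins / len(questions))
    return best

# Toy instance over alphabet M = 2: answers must agree and equal x XOR y.
questions = [(x, y) for x in range(2) for y in range(2)]
accept = lambda x, y, a, b: a == b == x ^ y
print(game_value(questions, accept, Qx=2, Qy=2, M=2))   # 0.5
```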

September 17

Learning in Strategic Environments: from Calibrated Agents to General Information Asymmetry

Abstract: In this talk I will discuss learning in principal-agent games where there is information asymmetry between what the principal and the agent know about each other's chosen actions. I will introduce a generalization of the standard Stackelberg Games (SGs) framework: Calibrated Stackelberg Games (CSGs). In CSGs, a principal repeatedly interacts with an agent who (contrary to standard SGs) does not have direct access to the principal's action but instead best-responds to calibrated forecasts about it. I will show that in CSGs, the principal can achieve utility that converges to the optimal Stackelberg value of the game (i.e., the value they could achieve had the agent known the principal's strategy all along), both in finite and continuous settings, and that no higher utility is achievable. Finally, I will discuss a meta-question: when learning in strategic environments, can agents overcome uncertainty about their preferences to achieve outcomes they could have achieved absent any uncertainty? And can they do this solely through interactions with each other?

Based on joint works with Nivasini Ananthakrishnan (UC Berkeley), Nika Haghtalab (UC Berkeley), and Kunhe Yang (UC Berkeley).

Location: 32-G449
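The interaction pattern is simple to sketch. Below, a naive empirical-frequency forecaster stands in for a genuinely calibrated one, and the 2x2 payoff matrix is made up; the point is only the information flow: the agent never observes the principal's mixed strategy, only a forecast of the principal's action, and best-responds to that forecast each round.

```python
# Sketch of the repeated interaction in a Calibrated Stackelberg Game, with a
# made-up payoff matrix and an empirical-frequency forecaster standing in for
# a calibrated one (the model in the talk requires calibration).
import numpy as np

rng = np.random.default_rng(0)
agent_payoff = np.array([[1.0, 0.0],    # agent_payoff[principal_action, agent_action]
                         [0.0, 1.0]])   # illustrative: agent wants to match
principal_mix = np.array([0.7, 0.3])    # the strategy the agent never observes

history = []
for t in range(1000):
    # Forecast of the principal's action from past plays only.
    forecast = np.bincount(history, minlength=2) / max(1, len(history))
    agent_action = int(np.argmax(forecast @ agent_payoff))  # best response to forecast
    principal_action = rng.choice(2, p=principal_mix)
    history.append(principal_action)

print("empirical forecast after 1000 rounds:", np.bincount(history, minlength=2) / 1000)
```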