Terrence Tao
On product representations of squares
I’ve just uploaded to the arXiv my paper “On product representations of squares“. This short paper answers (in the negative) a (somewhat obscure) question of Erdös. Namely, for any , let be the size of the largest subset of with the property that no distinct elements of multiply to a square. In a paper by Erdös, Sárközy, and Sós, the following asymptotics were shown for fixed :
- .
- .
- .
- for .
- for .
- for .
In the end, the argument turned out to be relatively simple; no advanced results from additive combinatorics, graph theory, or analytic number theory were required. I found it convenient to proceed via the probabilistic method (although the more combinatorial technique of double counting would also suffice here). The main idea is to generate a tuple of distinct random natural numbers in which multiply to a square, and which are reasonably uniformly distributed throughout , in that each individual number is attained by one of the random variables with a probability of . If one can find such a distribution, then if the density of is sufficienly close to , it will happen with positive probability that each of the will lie in , giving the claim.
When , this strategy cannot work, as it contradicts the arguments of Erdös, Särközy, and Sós. The reason can be explained as follows. The most natural way to generate a triple of random natural numbers in which multiply to a square is to set
for some random natural numbers . But if one wants all these numbers to have magnitude , one sees on taking logarithms that one would need which by elementary linear algebra forces so in particular each of the would have a factor comparable to . However, it follows from known results on the “multiplication table problem” (how many distinct integers are there in the multiplication table?) that most numbers up to do not have a factor comparable to . (Quick proof: by the Hardy–Ramanujan law, a typical number of size or of size has factors, hence typically a number of size will not factor into two factors of size .) So the above strategy cannot work for .However, the situation changes for larger . For instance, for , we can try the same strategy with the ansatz
Whereas before there were three (approximate) equations constraining three unknowns, now we would have four equations and six unknowns, and so we no longer have strong constraints on any of the . So in principle we now have a chance to find a suitable random choice of the . The most significant remaining obstacle is the Hardy–Ramanujan law: since the typically have prime factors, it is natural in this case to choose each to have prime factors. As it turns out, if one does this (basically by requiring each prime to divide with an independent probability of about , for some small , and then also adding in one large prime to bring the magnitude of the to be comparable to ), the calculations all work out, and one obtains the claimed result.
Erratum for “An inverse theorem for the Gowers U^{s+1}[N]-norm”
The purpose of this post is to report an erratum to the 2012 paper “An inverse theorem for the Gowers -norm” of Ben Green, myself, and Tamar Ziegler (previously discussed in this blog post). The main results of this paper have been superseded with stronger quantitative results, first in work of Manners (using somewhat different methods), and more recently in a remarkable paper of Leng, Sah, and Sawhney which combined the methods of our paper with several new innovations to obtain quite strong bounds (of quasipolynomial type); see also an alternate proof of our main results (again by quite different methods) by Candela and Szegedy. In the course of their work, they discovered some fixable but nontrivial errors in our paper. These (rather technical) issues were already implicitly corrected in this followup work which supersedes our own paper, but for the sake of completeness we are also providing a formal erratum for our original paper, which can be found here. We thank Leng, Sah, and Sawhney for bringing these issues to our attention.
Excluding some minor (mostly typographical) issues which we also have reported in this erratum, the main issues stemmed from a conflation of two notions of a degree filtration
of a group , which is a nested sequence of subgroups that obey the relation for all . The weaker notion (sometimes known as a prefiltration) permits the group to be strictly smaller than , while the stronger notion requires and to equal. In practice, one can often move between the two concepts, as is always normal in , and a prefiltration behaves like a filtration on every coset of (after applying a translation and perhaps also a conjugation). However, we did not clarify this issue sufficiently in the paper, and there are some places in the text where results that were only proven for filtrations were applied for prefiltrations. The erratum fixes this issues, mostly by clarifying that we work with filtrations throughout (which requires some decomposition into cosets in places where prefiltrations are generated). Similar adjustments need to be made for multidegree filtrations and degree-rank filtrations, which we also use heavily on our paper.In most cases, fixing this issue only required minor changes to the text, but there is one place (Section 8) where there was a non-trivial problem: we used the claim that the final group was a central group, which is true for filtrations, but not necessarily for prefiltrations. This fact (or more precisely, a multidegree variant of it) was used to claim a factorization for a certain product of nilcharacters, which is in fact not true as stated. In the erratum, a substitute factorization for a slightly different product of nilcharacters is provided, which is still sufficient to conclude the main result of this part of the paper (namely, a statistical linearization of a certain family of nilcharacters in the shift parameter ).
Again, we stress that these issues do not impact the paper of Leng, Sah, and Sawhney, as they adapted the methods in our paper in a fashion that avoids these errors.
Notes on the B+B+t theorem
A recent paper of Kra, Moreira, Richter, and Robertson established the following theorem, resolving a question of Erdös. Given a discrete amenable group , and a subset of , we define the Banach density of to be the quantity
where the supremum is over all Følner sequences of . Given a set in , we define the restricted sumset to be the set of all pairs where are distinct elements of .Theorem 1 Let be a countably infinite abelian group with the index finite. Let be a positive Banach density subset of . Then there exists an infinite set and such that .
Strictly speaking, the main result of Kra et al. only claims this theorem for the case of the integers , but as noted in the recent preprint of Charamaras and Mountakis, the argument in fact applies for all countable abelian in which the subgroup has finite index. This condition is in fact necessary (as observed by forthcoming work of Ethan Acklesberg): if has infinite index, then one can find a subgroup of of index for any that contains (or equivalently, is -torsion). If one lets be an enumeration of , and one can then check that the set
has positive Banach density, but does not contain any set of the form for any (indeed, from the pigeonhole principle and the -torsion nature of one can show that must intersect whenever has cardinality larger than ). It is also necessary to work with restricted sums rather than full sums : a counterexample to the latter is provided for instance by the example with and . Finally, the presence of the shift is also necessary, as can be seen by considering the example of being the odd numbers in , though in the case one can of course delete the shift at the cost of giving up the containment .Theorem 1 resembles other theorems in density Ramsey theory, such as Szemerédi’s theorem, but with the notable difference that the pattern located in the dense set is infinite rather than merely arbitrarily large but finite. As such, it does not seem that this theorem can be proven by purely finitary means. However, one can view this result as the conjunction of an infinite number of statements, each of which is a finitary density Ramsey theory statement. To see this, we need some more notation. Observe from Tychonoff’s theorem that the collection is a compact topological space (with the topology of pointwise convergence) (it is also metrizable since is countable). Subsets of can be thought of as properties of subsets of ; for instance, the property a subset of of being finite is of this form, as is the complementary property of being infinite. A property of subsets of can then be said to be closed or open if it corresponds to a closed or open subset of . Thus, a property is closed and only if if it is closed under pointwise limits, and a property is open if, whenever a set has this property, then any other set that shares a sufficiently large (but finite) initial segment with will also have this property. Since is compact and Hausdorff, a property is closed if and only if it is compact.
The properties of being finite or infinite are neither closed nor open. Define a smallness property to be a closed (or compact) property of subsets of that is only satisfied by finite sets; the complement to this is a largeness property, which is an open property of subsets of that is satisfied by all infinite sets. (One could also choose to impose other axioms on these properties, for instance requiring a largeness property to be an upper set, but we will not do so here.) Examples of largeness properties for a subset of include:
- has at least elements.
- is non-empty and has at least elements, where is the smallest element of .
- is non-empty and has at least elements, where is the element of .
- halts when given as input, where is a given Turing machine that halts whenever given an infinite set as input. (Note that this encompasses the preceding three examples as special cases, by selecting appropriately.)
Theorem 1 is then equivalent to the following “almost finitary” version (cf. this previous discussion of almost finitary versions of the infinite pigeonhole principle):
Theorem 2 (Almost finitary form of main theorem) Let be a countably infinite abelian group with finite. Let be a Følner sequence in , let , and let be a largeness property for each . Then there exists such that if is such that for all , then there exists a shift and contains a -large set such that .
Proof of Theorem 2 assuming Theorem 1. Let , , be as in Theorem 2. Suppose for contradiction that Theorem 2 failed, then for each we can find with for all , such that there is no and -large such that . By compactness, a subsequence of the converges pointwise to a set , which then has Banach density at least . By Theorem 1, there is an infinite set and a such that . By openness, we conclude that there exists a finite -large set contained in , thus . This implies that for infinitely many , a contradiction.
Proof of Theorem 1 assuming Theorem 2. Let be as in Theorem 1. If the claim failed, then for each , the property of being a set for which would be a smallness property. By Theorem 2, we see that there is a and a obeying the complement of this property such that , a contradiction.
Remark 3 Define a relation between and by declaring if and . The key observation that makes the above equivalences work is that this relation is continuous in the sense that if is an open subset of , then the inverse image
is also open. Indeed, if for some , then contains a finite set such that , and then any that contains both and lies in .
For each specific largeness property, such as the examples listed previously, Theorem 2 can be viewed as a finitary assertion (at least if the property is “computable” in some sense), but if one quantifies over all largeness properties, then the theorem becomes infinitary. In the spirit of the Paris-Harrington theorem, I would in fact expect some cases of Theorem 2 to undecidable statements of Peano arithmetic, although I do not have a rigorous proof of this assertion.
Despite the complicated finitary interpretation of this theorem, I was still interested in trying to write the proof of Theorem 1 in some sort of “pseudo-finitary” manner, in which one can see analogies with finitary arguments in additive combinatorics. The proof of Theorem 1 that I give below the fold is my attempt to achieve this, although to avoid a complete explosion of “epsilon management” I will still use at one juncture an ergodic theory reduction from the original paper of Kra et al. that relies on such infinitary tools as the ergodic decomposition, the ergodic theory, and the spectral theorem. Also some of the steps will be a little sketchy, and assume some familiarity with additive combinatorics tools (such as the arithmetic regularity lemma).
— 1. Proof of theorem —
The proof of Kra et al. proceeds by establishing the following related statement. Define a (length three) combinatorial Erdös progression to be a triple of subsets of such that there exists a sequence in such that converges pointwise to and converges pointwise to . (By , we mean with respect to the cocompact filter; that is, that for any finite (or, equivalently, compact) subset of , for all sufficiently large .)
Theorem 4 (Combinatorial Erdös progression) Let be a countably infinite abelian group with finite. Let be a positive Banach density subset of . Then there exists a combinatorial Erdös progression with and non-empty.
Let us see how Theorem 4 implies Theorem 1. Let be as in Theorem 4. By hypothesis, contains an element of , thus and . Setting to be a sufficiently large element of the sequence , we conclude that and . Setting to be an even larger element of this sequence, we then have and . Setting to be an even larger element, we have and . Continuing in this fashion we obtain the desired infinite set .
It remains to establish Theorem 4. The proof of Kra et al. converts this to a topological dynamics/ergodic theory problem. Define a topological measure-preserving -system to be a compact space equipped with a Borel probability measure as well as a measure-preserving homeomorphism . A point in is said to be generic for with respect to a Følner sequence if one has
for all continuous . Define an (length three) dynamical Erdös progression to be a tuple in with the property that there exists a sequence such that and .Theorem 4 then follows from
Theorem 5 (Dynamical Erdös progression) Let be a countably infinite abelian group with finite. Let be a topological measure-preserving -system, let be a -generic point of for some Følner sequence , and let be a positive measure open subset of . Then there exists a dynamical Erdös progression with and .
Indeed, we can take to be , to be , to be the shift , , and to be a weak limit of the for a Følner sequence with , at which point Theorem 4 follows from Theorem 5 after chasing definitions. (It is also possible to establish the reverse implication, but we will not need to do so here.)
A remarkable fact about this theorem is that the point need not be in the support of ! (In a related vein, the elements of the Følner sequence are not required to contain the origin.)
Using a certain amount of ergodic theory and spectral theory, Kra et al. were able to reduce this theorem to a special case:
Theorem 6 (Reduction) To prove Theorem 5, it suffices to do so under the additional hypotheses that is ergodic, and there is a continuous factor map to the Kronecker factor. (In particular, the eigenfunctions of can be taken to be continuous.)
We refer the reader to the paper of Kra et al. for the details of this reduction. Now we specialize for simplicity to the case where is a countable vector space over a finite field of size equal to an odd prime , so in particular ; we also specialize to Følner sequences of the form for some and . In this case we can prove a stronger statement:
Theorem 7 (Odd characteristic case) Let for an odd prime . Let be a topological measure-preserving -system with a continuous factor map to the Kronecker factor, and let be open subsets of with . Then if is a -generic point of for some Følner sequence , there exists an Erdös progression with and .
Indeed, in the setting of Theorem 5 with the ergodicity hypothesis, the set has full measure, so the hypothesis of Theorem 7 will be verified in this case. (In the case of more general , this hypothesis ends up being replaced with ; see Theorem 2.1 of this recent preprint of Kousek and Radic for a treatment of the case (but the proof extends without much difficulty to the general case).)
As with Theorem 1, Theorem 7 is still an infinitary statement and does not have a direct finitary analogue (though it can likely be expressed as the conjunction of infinitely many such finitary statements, as we did with Theorem 1). Nevertheless we can formulate the following finitary statement which can be viewed as a “baby” version of the above theorem:
Theorem 8 (Finitary model problem) Let be a compact metric space, let be a finite vector space over a field of odd prime order. Let be an action of on by homeomorphisms, let , and let be the associated -invariant measure . Let be subsets of with for some . Then for any , there exist such that
The important thing here is that the bounds are uniform in the dimension (as well as the initial point and the action ).
Let us now give a finitary proof of Theorem 8. We can cover the compact metric space by a finite collection of open balls of radius . This induces a coloring function that assigns to each point in the index of the first ball that covers that point. This then induces a coloring of by the formula . We also define the pullbacks for . By hypothesis, we have , and it will now suffice by the triangle inequality to show that
Now we apply the arithmetic lemma of Green with some regularity parameter to be chosen later. This allows us to partition into cosets of a subgroup of index , such that on all but of these cosets , all the color classes are -regular in the Fourier () sense. Now we sample uniformly from , and set ; as is odd, is also uniform in . If lies in a coset , then will lie in . By removing an exceptional event of probability , we may assume that neither of these cosetgs , is a bad coset. By removing a further exceptional event of probability , we may also assume that is in a popular color class of in the sense that since the set of exceptional that fail to achieve this only are hit with probability . Similarly we may assume that Now we consider the quantity which we can write as Both factors here are -uniform in their respective cosets. Thus by standard Fourier calculations, we see that after excluding another exceptional event of probabitiy , this quantity is equal to By (1), (2), this expression is . By choosing small enough depending on , we can ensure that and , and the claim follows.Now we can prove the infinitary result in Theorem 7. Let us place a metric on . By sparsifying the Følner sequence , we may assume that the grow as fast as we wish. Once we do so, we claim that for each , we can find such that for each , there exists that lies outside of such that
Passing to a subsequence to make converge to respectively, we obtain the desired Erdös progression.Fix , and let be a large parameter (much larger than ) to be chosen later. By genericity, we know that the discrete measures converge vaguely to , so any point in the support in can be approximated by some point with . Unfortunately, does not necessarily lie in this support! (Note that need not contain the origin.) However, we are assuming a continuous factor map to the Kronecker factor , which is a compact abelian group, and pushes down to the Haar measure of , which has full support. In particular, thus pushforward contains . As a consequence, we can find such that converges to , even if we cannot ensure that converges to . We are assuming that is a coset of , so now converges vaguely to .
We make the random choice , , where is drawn uniformly at random from . This is not the only possible choice that can be made here, and is in fact not optimal in certain respects (in particular, it creates a fair bit of coupling between , ), but is easy to describe and will suffice for our argument. (A more appropriate choice, closer to the arguments of Kra et al., would be to in the above construction by , where the additional shift is a random variable in independent of that is uniformly drawn from all shifts annihilated by the first characters associated to some enumeration of the (necessarily countable) point spectrum of , but this is harder to describe.)
Since we are in odd characteristic, the map is a permutation on , and so , are both distributed according to the law , though they are coupled to each other. In particular, by vague convergence (and inner regularity) we have
and where denotes a quantity that goes to zero as (holding all other parameters fixed). By the hypothesis , we thus have for some independent of .We will show that for each , one has
outside of an event of probability at most (compare with Theorem 8). If this is the case, then by the union bound we can find (for large enough) a choice of , obeying (3) as well as (4) for all . If the grow fast enough, we can then ensure that for each one can find (again for large enough) in the set in (4) that avoids , and the claim follows.It remains to show (4) outside of an exceptional event of acceptable probability. Let be the coloring function from the proof of Theorem 8 (with ). Then it suffices to show that
where and . This is a counting problem associated to the patterm ; if we concatenate the and components of the pattern, this is a classic “complexity one” pattern, of the type that would be expected to be amenable to Fourier analysis (especially if one applies Cauchy-Schwarz to eliminate the averaging and absolute value, at which point one is left with the pattern ).In the finitary setting, we used the arithmetic regularity lemma. Here, we will need to use the Kronecker factor instead. The indicator function of a level set of the coloring function is a bounded measurable function of , and can thus be decomposed into a function that is measurable on the Kronecker factor, plus an error term that is orthogonal to that factor and thus is weakly mixing in the sense that tends to zero on average (or equivalently, that the Host-Kra seminorm vanishes). Meanwhile, for any , the Kronecker-measurable function can be decomposed further as , where is a bounded “trigonometric polynomial” (a finite sum of eigenfunctions) and . The polynomial is continuous by hypothesis. The other two terms in the decomposition are merely meaurable, but can be approximated to arbitrary accuracy by continuous functions. The upshot is that we can arrive at a decomposition
(analogous to the arithmetic regularity lemma) for any , where is a bounded continuous function of norm at most , and is a bounded continuous function of norm at most (in practice we will take much smaller than ). Pulling back to , we then have Let be chosen later. The trigonometric polynomial is just a sum of characters on , so one can find a subgroup of of index such that these polynomial are constant on each coset of for all . Then lies in some coset and lies in the coset . We then restrict to also lie in , and we will show that outside of an exceptional event of proability , which will establish our claim because will ultimately be chosen to dependon .The left-hand side can be written as
The coupling of the constraints and is annoying (as is an “infinite complexity” pattern that cannot be controlled by any uniformity norm), but (perhaps surprisingly) will not end up causing an essential difficulty to the argument, as we shall see when we start eliminating the terms in this sum one at a time starting from the right.We decompose the term using (5):
By Markov’s inequality, and removing an exceptional event of probabiilty at most , we may assume that the have normalized norm on both of these cosets . As such, the contribution of to (6) become negligible if is small enough (depending on ). From the near weak mixing of the , we know that for all , if we choose large enough. By genericity of , this implies that From this and standard Cauchy-Schwarz (or van der Corput) arguments we can then show that the contribution of the to (6) is negligible outside of an exceptional event of probability at most , if is small enough depending on . Finally, the quantity is independent of , and in fact is equal up to negligible error to the density of in the coset . This density will be except for those which would have made a negligible impact on (6) in any event due to the rareness of the event in such cases. As such, to prove (6) it suffices to show that outside of an event of probability . Now one can sum in to simplify the above estiamte to If is such that is small compared with , then by genericity (and assuming large enough), the probability that will similarly be small (up to errors), and thus have a negligible influence on the above sum. As such, the above estimate simplifies to But the left-hand side sums to one, and the claim follows.
Two announcements: AI for Math resources, and erdosproblems.com
This post contains two unrelated announcements. Firstly, I would like to promote a useful list of resources for AI in Mathematics, that was initiated by Talia Ringer (with the crowdsourced assistance of many others) during the National Academies workshop on “AI in mathematical reasoning” last year. This list is now accepting new contributions, updates, or corrections; please feel free to submit them directly to the list (which I am helping Talia to edit). Incidentally, next week there will be a second followup webinar to the aforementioned workshop, building on the topics covered there. (The first webinar may be found here.)
Secondly, I would like to advertise the erdosproblems.com website, launched recently by Thomas Bloom. This is intended to be a living repository of the many mathematical problems proposed in various venues by Paul Erdős, who was particularly noted for his influential posing of such problems. For a tour of the site and an explanation of its purpose, I can recommend Thomas’s recent talk on this topic at a conference last week in honor of Timothy Gowers.
Thomas is currently issuing a call for help to develop the erdosproblems.com website in a number of ways (quoting directly from that page):
- You know Github and could set a suitable project up to allow people to contribute new problems (and corrections to old ones) to the database, and could help me maintain the Github project;
- You know things about web design and have suggestions for how this website could look or perform better;
- You know things about Python/Flask/HTML/SQL/whatever and want to help me code cool new features on the website;
- You know about accessibility and have an idea how I can make this website more accessible (to any group of people);
- You are a mathematician who has thought about some of the problems here and wants to write an expanded commentary for one of them, with lots of references, comparisons to other problems, and other miscellaneous insights (mathematician here is interpreted broadly, in that if you have thought about the problems on this site and are willing to write such a commentary you qualify);
- You knew Erdős and have any memories or personal correspondence concerning a particular problem;
- You have solved an Erdős problem and I’ll update the website accordingly (and apologies if you solved this problem some time ago);
- You have spotted a mistake, typo, or duplicate problem, or anything else that has confused you and I’ll correct things;
- You are a human being with an internet connection and want to volunteer a particular Erdős paper or problem list to go through and add new problems from (please let me know before you start, to avoid duplicate efforts);
- You have any other ideas or suggestions – there are probably lots of things I haven’t thought of, both in ways this site can be made better, and also what else could be done from this project. Please get in touch with any ideas!
I for instance contributed a problem to the site (#587) that Erdős himself gave to me personally (this was the topic of a somewhat well known photo of Paul and myself, and which he communicated again to be shortly afterwards on a postcard; links to both images can be found by following the above link). As it turns out, this particular problem was essentially solved in 2010 by Nguyen and Vu.
(Incidentally, I also spoke at the same conference that Thomas spoke at, on my recent work with Gowers, Green, and Manners; here is the video of my talk, and here are my slides.)
Marton’s conjecture in abelian groups with bounded torsion
Tim Gowers, Ben Green, Freddie Manners, and I have just uploaded to the arXiv our paper “Marton’s conjecture in abelian groups with bounded torsion“. This paper fully resolves a conjecture of Katalin Marton (the bounded torsion case of the Polynomial Freiman–Ruzsa conjecture (first proposed by Katalin Marton):
Theorem 1 (Marton’s conjecture) Let be an abelian -torsion group (thus, for all ), and let be such that . Then can be covered by at most translates of a subgroup of of cardinality at most . Moreover, is contained in for some .
We had previously established the case of this result, with the number of translates bounded by (which was subsequently improved to by Jyun-Jie Liao), but without the additional containment . It remains a challenge to replace by a bounded constant (such as ); this is essentially the “polynomial Bogolyubov conjecture”, which is still open. The result has been formalized in the proof assistant language Lean, as discussed in this previous blog post. As a consequence of this result, many of the applications of the previous theorem may now be extended from characteristic to higher characteristic.
Our proof techniques are a modification of those in our previous paper, and in particular continue to be based on the theory of Shannon entropy. For inductive purposes, it turns out to be convenient to work with the following version of the conjecture (which, up to -dependent constants, is actually equivalent to the above theorem):
Theorem 2 (Marton’s conjecture, entropy form) Let be an abelian -torsion group, and let be independent finitely supported random variables on , such that
where denotes Shannon entropy. Then there is a uniform random variable on a subgroup of such that
where denotes the entropic Ruzsa distance (see previous blog post for a definition); furthermore, if all the take values in some symmetric set , then lies in for some .
As a first approximation, one should think of all the as identically distributed, and having the uniform distribution on , as this is the case that is actually relevant for implying Theorem 1; however, the recursive nature of the proof of Theorem 2 requires one to manipulate the separately. It also is technically convenient to work with independent variables, rather than just a pair of variables as we did in the case; this is perhaps the biggest additional technical complication needed to handle higher characteristics.
The strategy, as with the previous paper, is to attempt an entropy decrement argument: to try to locate modifications of that are reasonably close (in Ruzsa distance) to the original random variables, while decrementing the “multidistance”
which turns out to be a convenient metric for progress (for instance, this quantity is non-negative, and vanishes if and only if the are all translates of a uniform random variable on a subgroup ). In the previous paper we modified the corresponding functional to minimize by some additional terms in order to improve the exponent , but as we are not attempting to completely optimize the constants, we did not do so in the current paper (and as such, our arguments here give a slightly different way of establishing the case, albeit with somewhat worse exponents).
As before, we search for such improved random variables by introducing more independent random variables – we end up taking an array of random variables for , with each a copy of , and forming various sums of these variables and conditioning them against other sums. Thanks to the magic of Shannon entropy inequalities, it turns out that it is guaranteed that at least one of these modifications will decrease the multidistance, except in an “endgame” situation in which certain random variables are nearly (conditionally) independent of each other, in the sense that certain conditional mutual informations are small. In particular, in the endgame scenario, the row sums of our array will end up being close to independent of the column sums , subject to conditioning on the total sum . Not coincidentally, this type of conditional independence phenomenon also shows up when considering row and column sums of iid independent gaussian random variables, as a specific feature of the gaussian distribution. It is related to the more familiar observation that if are two independent copies of a Gaussian random variable, then and are also independent of each other.
Up until now, the argument does not use the -torsion hypothesis, nor the fact that we work with an array of random variables as opposed to some other shape of array. But now the torsion enters in a key role, via the obvious identity
In the endgame, the any pair of these three random variables are close to independent (after conditioning on the total sum ). Applying some “entropic Ruzsa calculus” (and in particular an entropic version of the Balog–Szeméredi–Gowers inequality), one can then arrive at a new random variable of small entropic doubling that is reasonably close to all of the in Ruzsa distance, which provides the final way to reduce the multidistance.
Besides the polynomial Bogolyubov conjecture mentioned above (which we do not know how to address by entropy methods), the other natural question is to try to develop a characteristic zero version of this theory in order to establish the polynomial Freiman–Ruzsa conjecture over torsion-free groups, which in our language asserts (roughly speaking) that random variables of small entropic doubling are close (in Ruzsa distance) to a discrete Gaussian random variable, with good bounds. The above machinery is consistent with this conjecture, in that it produces lots of independent variables related to the original variable, various linear combinations of which obey the same sort of entropy estimates that gaussian random variables would exhibit, but what we are missing is a way to get back from these entropy estimates to an assertion that the random variables really are close to Gaussian in some sense. In continuous settings, Gaussians are known to extremize the entropy for a given variance, and of course we have the central limit theorem that shows that averages of random variables typically converge to a Gaussian, but it is not clear how to adapt these phenomena to the discrete Gaussian setting (without the circular reasoning of assuming the polynoimal Freiman–Ruzsa conjecture to begin with).
AI Mathematical Olympiad – Progress Prize Competition now open
The first progress prize competition for the AI Mathematical Olympiad has now launched. (Disclosure: I am on the advisory committee for the prize.) This is a competition in which contestants submit an AI model which, after the submissions deadline on June 27, will be tested (on a fixed computational resource, without internet access) on a set of 50 “private” test math problems, each of which has an answer as an integer between 0 and 999. Prior to the close of submission, the models can be tested on 50 “public” test math problems (where the results of the model are public, but not the problems themselves), as well as 10 training problems that are available to all contestants. As of this time of writing, the leaderboard shows that the best-performing model has solved 4 out of 50 of the questions (a standard benchmark, Gemma 7B, had previously solved 3 out of 50). A total of $ ($1.048 million) has been allocated for various prizes associated to this competition. More detailed rules can be found here.
Talks at the JMM
Earlier this year, I gave a series of lectures at the Joint Mathematics Meetings at San Francisco. I am uploading here the slides for these talks:
- “Machine assisted proof” (Video here)
- “Translational tilings of Euclidean space” (Video here)
- “Correlations of multiplicative functions” (Video here)
I also have written a text version of the first talk, which has been submitted to the Notices of the American Mathematical Society.