Arithmetic subsequences in permutations

Rosie Zhao and I have just uploaded our paper “Arithmetic subsequences in a random ordering of an additive set” to the arXiv. In its simplest form, the problem we consider is the following. Given an ordering of the numbers $1$ through $n$, what is the length of the longest arithmetic subsequence that is embedded in this permutation? For example, in the ordering $(2,7,1,6,3,4,5)$, the longest arithmetic subsequence is $(2,3,4,5)$, which has length $4$. If we store the permutation in an array, the problem of finding the length of the longest arithmetic subsequence is a popular programming interview question that can be solved efficiently using dynamic programming (it is a medium-difficulty problem on LeetCode). But if we take the ordering to be random, with each of the $n!$ possibilities occurring with equal probability, then the length $L_n$ of the longest arithmetic subsequence becomes a random variable. Our paper studies the asymptotic behaviour of $L_n$ as $n$ gets large.
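For concreteness, here is a minimal sketch of one standard quadratic-time dynamic programme for the deterministic problem. The function name `longest_arithmetic_subsequence` is illustrative, and this is not necessarily the code used for the paper. For each position $i$ and each step size $r$, it stores the length of the longest arithmetic subsequence that ends at position $i$ with step $r$.

```python
def longest_arithmetic_subsequence(seq):
    """Length of the longest arithmetic subsequence of a list of distinct integers.

    Illustrative sketch: O(len(seq)^2) time and space.
    """
    if len(seq) < 2:
        return len(seq)
    # best[i][r] = length of the longest arithmetic subsequence that
    # ends at position i and has step size r (r may be negative).
    best = [dict() for _ in seq]
    longest = 2  # any two entries form a progression of length 2
    for i, x in enumerate(seq):
        for j in range(i):
            r = x - seq[j]
            # Extend the best progression ending at j with the same step,
            # or start a new two-term progression (seq[j], x).
            length = best[j].get(r, 1) + 1
            if length > best[i].get(r, 0):
                best[i][r] = length
            if length > longest:
                longest = length
    return longest


print(longest_arithmetic_subsequence([2, 7, 1, 6, 3, 4, 5]))  # 4, from (2, 3, 4, 5)
```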

Update. (28 Sep 2021) Our paper was accepted to Integers: Electronic Journal of Combinatorial Number Theory! There were some small errors in the original lemmas, which have been corrected in both the published and arXiv versions as well as in the blog post below.

The integer case

Let $[1,n]$ denote the discrete interval $\{1,2,\ldots,n\}$. We regard an arithmetic progression in a sequence to be a subsequence

\[(a, a+r, a+2r, \ldots, a+(k-1)r),\]

where $a$ is called the base point, $r$ is the step size, and $k$ is the length. Traditionally, arithmetic progressions are studied as subsets of some larger set, and one is interested in finding out if sets contain long arithmetic progressions as subsets. In our case, we start with the arithmetic progression $[1,n]$, scramble its elements into a sequence, and look for a long subsequence. For small $n$, using the dynamic programming techniques mentioned above, one can exhaustively compute the distribution of the longest arithmetic subsequence $L_n$. Letting $f_n(k)$ denote the number of permutations of $[1,n]$ whose longest arithmetic subsequence is of length $k$, we have the following table:

Notice that each row sums to $n!$, since there are $n!$ orderings of $[1,n]$, and we can compute probabilities by simply dividing; for example, the probability that $L_9 = 3$ is $168368/(9!)\approx 46.4\%$. The first main result of our paper is the following theorem:

Theorem 4. Let $L_n$ denote the length of the longest arithmetic subsequence in an ordering of $[1,n]$, chosen uniformly at random. There exists a function $\psi(n)$ with $\psi(n)\sim 2\log n/\log\log n$ such that the probability that $L_n$ is not in the interval $[\psi(n)-6, \psi(n)+1]$ tends to $0$ as $n$ approaches infinity.
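To experiment with the distribution of $L_n$, the sketch below tabulates the exact counts $f_n(k)$ by brute force for small $n$ and estimates the distribution by sampling for larger $n$. All function names, the number of trials and the seed are illustrative assumptions, not taken from the paper. Since the theorem is asymptotic, small experiments like this only give a rough feel for the concentration.

```python
import random
from collections import Counter
from itertools import permutations
from math import factorial


def las_length(seq):
    """Length of the longest arithmetic subsequence of distinct integers."""
    best = [dict() for _ in seq]  # best[i][r]: longest run ending at i with step r
    longest = min(len(seq), 2)
    for i, x in enumerate(seq):
        for j in range(i):
            r = x - seq[j]
            best[i][r] = best[j].get(r, 1) + 1
            longest = max(longest, best[i][r])
    return longest


def exact_distribution(n):
    """Counter mapping k to f_n(k), by enumerating all n! orderings of [1, n]."""
    return Counter(las_length(p) for p in permutations(range(1, n + 1)))


def sampled_distribution(n, trials=200, seed=0):
    """Counter of observed values of L_n over uniformly random orderings."""
    rng = random.Random(seed)
    ordering = list(range(1, n + 1))
    counts = Counter()
    for _ in range(trials):
        rng.shuffle(ordering)
        counts[las_length(ordering)] += 1
    return counts


for n in range(1, 7):
    row = exact_distribution(n)
    assert sum(row.values()) == factorial(n)  # each row sums to n!
    print(n, dict(sorted(row.items())))

print(dict(sorted(sampled_distribution(200).items())))
```

The exact enumeration is only practical for small $n$; at $n=9$ it already loops over $362880$ orderings.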

(In fact, we proved a slightly more general form of this theorem in the paper, considering $[1,n]^d\subseteq {\bf Z}^d$ for $d\geq 1$ and finding a function $\psi(n,d)\sim 2d\log n/\log\log n$.) To prove the theorem, we let $N_k$ denote the number of arithmetic subsequences of length $k$ in the random ordering. By the union bound, we have \({\bf P}\{L_n\geq k\}\leq {\bf E}\{N_k\}\), which goes to $0$ if $k\geq \psi(n)+1$. Then by the second moment method, it can be shown that

\[{\bf P}\{L_n < k\} = {\bf P}\{N_k= 0\} \le { {\bf V}\{N_k\}\over {\bf E}\{ {N_k} \}^2 } = {k^5\over P_n(k)/k!},\]

where $P_n(k)$ is the number of arithmetic sequences of length $k$ that use elements of the set $[1,n]$. This approaches $0$ when $k<\psi(n)-6$. By a counting argument, one can show that

\[{\bf E}\{N_k\} \sim {n^2\over k!(k-1)},\]

which can be inverted to show that $\psi(n)\sim 2\log n/\log\log n$.
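To see where the $2\log n/\log\log n$ comes from, here is a heuristic sketch of the inversion (not the rigorous argument from the paper). Taking logarithms and using Stirling's approximation $\log k! = k\log k - k + O(\log k)$, we get

\[\log {\bf E}\{N_k\} = 2\log n - k\log k + O(k).\]

The expectation therefore passes from large to small around the point where $k\log k\approx 2\log n$. Substituting $k = 2\log n/\log\log n$ gives $\log k = \log\log n - \log\log\log n + \log 2\sim \log\log n$, so

\[k\log k \sim {2\log n\over \log\log n}\cdot\log\log n = 2\log n,\]

which matches the leading term of $\psi(n)$.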

Cyclic groups

If we consider addition in the cyclic group ${\bf Z}/n{\bf Z}$, then $L_n$ can only be larger, since any arithmetic progression in $[1,n]\subseteq {\bf Z}$ is also a progression in ${\bf Z}/n{\bf Z}$, but not the other way around. For example, the sequence $(0,2,6,1,3,5,4)$ has the $4$-term arithmetic subsequence $(0,6,5,4)$ when addition is taken modulo $7$ (the base point is $0$ and the step size is $6$), but it has no arithmetic subsequence of length $4$ when the addition is ordinary integer addition. The distribution of $L_n$ for small $n$ in the cyclic case looks like this ($g_n(k)$ is the cyclic analogue of $f_n(k)$ from Table 1):
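The example $(0,2,6,1,3,5,4)$ can be checked with a small variation on the usual dynamic programme, in which step sizes are reduced modulo $n$. This is a sketch under the assumption that the input is an ordering of distinct residues; the function name and the `modulus` parameter are illustrative.

```python
def longest_arithmetic_subsequence(seq, modulus=None):
    """Longest arithmetic subsequence of a sequence of distinct elements.

    With modulus=None, steps are ordinary integer differences.
    With modulus=n, steps are taken in Z/nZ.
    """
    best = [dict() for _ in seq]  # best[i][r]: longest run ending at i with step r
    longest = min(len(seq), 2)
    for i, x in enumerate(seq):
        for j in range(i):
            r = x - seq[j]
            if modulus is not None:
                r %= modulus  # step size as an element of Z/nZ
            best[i][r] = best[j].get(r, 1) + 1
            longest = max(longest, best[i][r])
    return longest


ordering = [0, 2, 6, 1, 3, 5, 4]
print(longest_arithmetic_subsequence(ordering, modulus=7))  # 4, e.g. (0, 6, 5, 4)
print(longest_arithmetic_subsequence(ordering))             # 3, e.g. (2, 3, 4)
```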

We prove the following analogue of Theorem 4 in the cyclic case:

Theorem 7. For a positive integer $n$, let $L_n$ denote the length of the longest arithmetic subsequence in an ordering of ${\bf Z}/n{\bf Z}$, chosen uniformly at random. There exists a function $\chi(n)$ with $\chi(n)\sim 2\log n/\log\log n$ such that $L_n$ is in the interval $[\chi(n)-6, \chi(n)+1]$ with probability tending to $1$ as $n\to\infty$.

Note that while $\psi(n)$ from Theorem 4 and $\chi(n)$ from Theorem 7 have the same leading asymptotic term, we always have $\psi(n)<\chi(n)$. A closer look at Table 2 also reveals some curiosities regarding $L_n$ in the cyclic case. Firstly, \({\bf E}\{L_n\}\) does not increase monotonically with $n$, as it does in the integer case. For example, we have \({\bf E}\{L_7\} = 4.25\) and \({\bf E}\{L_8\}\approx 4.136\). This is because there are “more ways” to form arithmetic subsequences in permutations of ${\bf Z}/n{\bf Z}$ when $n$ is prime. We also notice that permutations without arithmetic subsequences of length $3$ exist only when $n$ is a power of $2$. This can be formulated as the following theorem:

Theorem 8. The number $g_n(2)$ of orderings of ${\bf Z}/n{\bf Z}$ that do not contain any arithmetic subsequence of length $3$ equals $2^{n-1}$ if $n=2^m$ for some $m\geq 1$, and is zero otherwise. Any ordering of ${\bf Z}/2^m{\bf Z}$ that contains no progression of length $3$ consists of $2^{m-1}$ elements of the same parity, followed by the remaining $2^{m-1}$ elements of the opposite parity.
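For small $n$, the count in Theorem 8 can be checked by brute force. The sketch below (function names are illustrative) looks, for every pair of positions $i<j$, for the element that would extend the pair to a three-term progression modulo $n$, and checks whether it appears later in the ordering.

```python
from itertools import permutations


def has_three_term_progression(seq, n):
    """True if the ordering seq of Z/nZ contains a 3-term arithmetic subsequence."""
    position = {x: i for i, x in enumerate(seq)}
    for j in range(1, len(seq)):
        for i in range(j):
            # (seq[i], seq[j], third) is a progression with step seq[j] - seq[i].
            third = (2 * seq[j] - seq[i]) % n
            if position[third] > j:
                return True
    return False


def count_progression_free(n):
    """Number of orderings of Z/nZ with no 3-term arithmetic subsequence."""
    return sum(
        1
        for p in permutations(range(n))
        if not has_three_term_progression(p, n)
    )


for n in range(2, 9):
    print(n, count_progression_free(n))
```

According to Theorem 8, the counts for $n=2,\ldots,8$ should be $2, 0, 8, 0, 0, 0, 128$.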

The proof is by induction in the case that $n$ is a power of $2$; for $n$ not a power of $2$, we reduce to the case where $n$ is an odd prime.

Possible noncommutative generalisations

The first and second moment methods we used did not rely on the fact that the groups we considered (${\bf Z}^d$ and ${\bf Z}/n{\bf Z}$) are abelian. One could extend these results to random orderings of other finite groups, such as the dihedral group $D_n$ or the symmetric group $S_n$. In the noncommutative case, one needs to specify whether the step size is multiplied by the base point on the left or on the right, but if the sequence contains every element of the (finite) group, then the distribution of $L_n$ will be the same for both choices.
