On Fibonacci numbers and the golden ratio

Some friends, one of whom is a high school student, asked me to explain the phenomenon that the ratio of successive Fibonacci numbers tends toward the golden ratio.  Some of the explanations I have seen involve fairly heavy use of linear algebra; others involve generating functions and the use of some analysis.  I wanted to give an explanation that I would have had a prayer of understanding when I was a high school student.  I’m not sure whether I have succeeded—there are a great many things I found hard to understand as a high school student (and now as well)—so I would be happy to hear about alternatives or other suggestions.  As I see it, there are five main steps in the reasoning.  This proof is the one in the Wikipedia article on the Fibonacci numbers, but I have expanded the explanation quite a bit.  I highly recommend Peter Cameron’s series of posts about the Fibonacci numbers (see I, II, III, IV, V, and VI).  Part 3 of that series is closely related to the topic of this post, but at a somewhat higher level.

Step 1: As is often the case in mathematics, in order to understand a phenomenon, it is helpful to consider a more general phenomenon.  The sequence of Fibonacci numbers is just one of infinitely many “Fibonacci-like” sequences that obey the Fibonacci rule F(n) = F(n-1) + F(n-2).  The Fibonacci numbers themselves are generated under the assumptions F(0) = 1, F(1) =1, but if you choose F(0) and F(1) to be any other random numbers that you like, and then apply the Fibonacci rule to generate F(2), F(3), and so on, you get another Fibonacci-like sequence.
An example of such a sequence is the Lucas numbers, which are generated from the starting pair F(0) = 2, F(1) = 1.  The resulting sequence is 2, 1, 3, 4, 7, 11, 18, 29, 47, 76, 123, 199, \ldots
You can try generating some random Fibonacci-like sequences, and you will find, unless you are extremely unlucky (or perhaps I should say lucky), that the ratio of successive terms tends toward the golden ratio, just as ratios of Fibonacci numbers themselves do.  What I mean by “extremely unlucky” is that there do exist choices of starting pair for which you get a ratio different from the golden ratio, but these choices are very rare, and you are unlikely to pick one of them by chance.
Step 2: If you multiply every number in a Fibonacci-like sequence by the same value, the resulting sequence is also Fibonacci-like.  For example, if you multiply every Fibonacci number by 5, you get the sequence 5, 5, 10, 15, 25, 40, 65, 105, \ldots, which is Fibonacci-like since it obeys the Fibonacci rule, F(n) = F(n-1) + F(n-2).  Furthermore, if you add two Fibonacci-like sequences together, term-by-term, the resulting sequence is Fibonacci-like.  For example, adding the Fibonacci and Lucas sequences gives

(1+2, 1+1, 2+3, 3+4, 5+7, 8+11, 13+18, 21+29, \ldots) = (3, 2, 5, 7, 12, 19, 31, 50, \ldots),

which is Fibonacci-like.
Make sure you understand why both of these facts are true.  You can, of course, combine the two operations together.  For example, you could multiply the Fibonacci numbers by 5 and the Lucas numbers by 12, and then add the resulting sequences term-by-term, and the result will be another Fibonacci-like sequence.  In this example you would get (29, 17, 46, 63, 109, \ldots).
Step 3: Any Fibonacci-like sequence can be expressed as a combination of the Fibonacci sequence and the Lucas sequence.  If we let f = (1, 1, 2, 3, 5, 8, 13, 21, \ldots) be the Fibonacci sequence and we let \ell = (2, 1, 3, 4, 7, 11, 18, 29, \ldots) be the Lucas sequence, then any other Fibonacci-like sequence equals x\cdot f + y\cdot\ell for some suitable choice of the numbers x and y.
For example, consider the Fibonacci-like sequence with starting pair 4, 9.  The sequence this pair produces is (4, 9, 13, 22, 35, 57, 92, \ldots).  How can we express this in terms of Fibonacci and Lucas sequences as x\cdot f + y\cdot\ell?  The sequence x\cdot f is (x, x, 2x, 3x, 5x, 8x, 13x, \ldots) and the sequence y\cdot\ell is (2y, y, 3y, 4y, 7y, 11y, 18y, \ldots).  The sequence x\cdot f + y\cdot\ell is therefore (x+2y, x+y, 2x+3y, 3x+4y, 5x+7y, \ldots).  To find x and y, we require that the terms in this sequence match our sequence (4, 9, 13, 22, 35, \ldots).  So we have to solve the simultaneous linear equations

\begin{aligned}x+2y&=4,\\ x+y&=9,\\ 2x+3y&=13,\\ 3x+4y&=22,\\ &\vdots\end{aligned}

In fact, we can ignore all the equations from the third on, since if the first two equations are satisfied, all the higher ones will automatically hold.  (Make sure you understand why.)  So we carry out the algebra to find the x and y that will make the first two equations hold true.  You should try this.  The answer is x = 14, y = -5.  You can then verify that if you multiply the Fibonacci numbers by 14, multiply the Lucas numbers by −5, and add the resulting sequences term-by-term, you do indeed get our  sequence (4, 9, 13, 22, 35, \ldots).
Since the same algebraic procedure can be carried out for any Fibonacci-like sequence whatsoever, any Fibonacci-like sequence is a combination of the Fibonacci and Lucas sequences.

There is nothing sacred about using the Fibonacci and Lucas numbers as a basis, however.  Any two Fibonacci-like sequences—as long as they are not multiples of each other—can be used as a basis for all other Fibonacci-like sequences.  So if A and B are Fibonacci-like sequences, and B is not simply equal to c\cdot A for some number c, then any Fibonacci-like sequence can be written as x\cdot A + y\cdot B.

Step 4: To understand why the ratio of successive terms in a Fibonacci-like sequence (usually) tends toward the golden ratio, I will choose the basis sequences A and B in a very special way. Since we are interested in ratios of successive terms in Fibonacci-like sequences, we ask ourselves, is there a Fibonacci-like sequence for which the ratio of successive terms doesn’t merely tend toward some fixed ratio, but exactly equals the same ratio for all pairs of successive terms in the sequence?

A sequence in which the ratio of successive terms is always the same is called a geometric sequence.  If the initial term is a and the ratio between successive terms is r, then the geometric sequence is (a, ar, ar^2, ar^3, ar^4, \ldots).  For example, if the intial term is a=3 and the ratio between successive terms is r=2, then the geometric sequence is (3, 6, 12, 24, 48, 96, \ldots).

Could a Fibonacci-like sequence be a geometric sequence?  For this to be the case, we need for the Fibonacci rule, F(n) = F(n-1) +F(n-2), to hold for the terms of the geometric sequence.  This means that a system of simultaneous equations must be satisfied:

\begin{aligned} ar^2&=ar+a,\\ ar^3&=ar^2+ar,\\ ar^4&=ar^3+ar^2,\\ &\vdots\end{aligned}

If the first of these equations holds, then all the others do to, since they are obtained by multiplying both sides of the first equation by r or r^2 or r^3 or a higher power of r.
So can we find a ratio r such that ar^2 = ar + a?  We can get rid of the factor of a, so we need to solve r^2 = r + 1, or r^2 - r - 1 = 0.  The quadratic formula gives two solutions which we will call \phi and \psi:

\begin{aligned}\phi&=\frac{1 + \sqrt{5}}{2},\\ \psi&=\frac{1 - \sqrt{5}}{2}.\end{aligned}

You will recognize \phi as the golden ratio.  Since \phi and \psi are the roots of r^2 - r - 1, you can write

r^2 -r - 1 = (r - \phi)(r - \psi)=r^2-(\phi+\psi)r+\phi\psi,

from which you can conclude that \phi+\psi=1 and \psi = -1 / \phi.  You can also check these directly, the second one by writing -1 / \phi = -2 / (1 + \sqrt{5}) and then rationalizing the denominator.
What this means is that for a Fibonacci-like sequence to be a geometric sequence, the ratio between successive terms has to be either \phi or \psi.  Choosing the initial term to be 1, the two geometric, Fibonacci-like sequences are

\begin{aligned}A &= (1, \phi, \phi^2, \phi^3, \phi^4, \ldots),\\ B &= (1, \psi, \psi^2, \psi^3, \psi^4, \ldots).\end{aligned}

Of course, neither if these is the Fibonacci or Lucas sequence, or any other familiar sequence.  The numbers in these sequences are not whole numbers, after all.  Rounded to three decimal places, these sequences are

\begin{aligned}A &=(1, 1.618, 2.618, 4.236, 6.854, 11.090, 17.944, 29.034, 46.979, 76.013, 122.992, \ldots),\\ B &= (1, -0.618, 0.382, -0.236, 0.146, -0.090, 0.056, -0.034, 0.021, -0.013, 0.008, \ldots).\end{aligned}

Among all Fibonacci-like sequences, these two sequences are very special: they are geometric.  If you multiply either of them by a number, the resulting sequence will still be geometric.  But as soon as you start forming other combinations, by adding sequences together, the result will no longer be geometric (although, as discussed in Step 2, it will remain Fibonacci-like).

Step 5: Just as in Step 3, where we expressed Fibonacci-like sequences using the Fibonacci and Lucas sequences as a basis, we can express any Fibonacci-like sequence using the geometric sequences A and B as a basis, that is, we can write any Fibonacci-like sequence as x\cdot A + y\cdot B.

As an example, let’s try to express the Lucas sequence in terms of A and B.  We need to find numbers x and y so that

x\cdot A + y\cdot B = (2, 1, 3, 4, 7, 11, 18, 29, \ldots).

So we need to solve the simultaneous equations

\begin{aligned}x + y &= 2\\ x \phi + y \psi &= 1.\end{aligned}

I’ll leave it for you to carry out the algebra, but you should find that x = 1, y = 1.  This means that the nth Lucas number, which we call \ell(n), equals \phi^n + \psi^n.  (Verify this.  You should find that \ell(0) = 2, \ell(1) = 1, \ell(2) = 3, and so on.)  This is a remarkable formula.  The Lucas numbers are whole numbers, whereas \phi and \psi are irrational numbers.  The formula says that if you add the nth powers of \phi and \psi together, you always get a whole number.
Now here’s the important point.  The nth power of \phi is huge once n reaches  appreciable size.  This is because \phi = 1.618\ldots is a number bigger than 1, and so \phi^n grows exponentially.  On the other hand, the nth power of \psi is tiny since \psi = -0.618\ldots is smaller than 1 in magnitude, which means that \psi^n decays exponentially.  Since \phi^n + \psi^n is a whole number, but \psi^n is tiny, we see that \phi^n will be very close to a whole number once n reaches appreciable size.
In other words, although the powers of the golden ratio are irrational numbers, these powers will be very nearly whole numbers once the power gets large.  Since the tiny term \psi^n is alternately positive and negative, the dominant term \phi^n will be alternately smaller than and bigger than the whole number it approximates.
At last the Fibonacci numbers!  We will write the sequence of Fibonacci numbers using the geometric sequences A and B as a basis.  The algebra is similar to the algebra for the Lucas sequence.  We require

\begin{aligned}x + y &= 1,\\ x \phi + y \psi &= 1.\end{aligned}

If you solve for x and y and rationalize the denominators you should find that

\displaystyle x = \frac{5 + \sqrt{5}}{10}\quad\text{ and }\quad y =\frac{5 - \sqrt{5}}{10}.

You can also divide numerators and denominators by \sqrt{5}, and you will find that

\displaystyle x = \frac{\sqrt{5} + 1}{2\sqrt{5}} = \frac{\phi}{\sqrt{5}}\quad\text{ and }\quad y = \frac{\sqrt{5} - 1}{2\sqrt{5}} = \frac{-\psi}{\sqrt{5}}.

Hence the nth Fibonacci number, which we call f(n), can be written as

\displaystyle f(n)=\frac{\phi^{n+1} - \psi^{n+1}}{\sqrt{5}}.

Let’s now compute the ratio of the nth and (n−1)st Fibonacci numbers:

\begin{aligned}\frac{f(n)}{f(n-1)} &= \frac{(\phi^{n+1} - \psi^{n+1}) / \sqrt{5}}{(\phi^n - \psi^n) / \sqrt{5}} = \frac{\phi^{n+1} - \psi^{n+1}}{\phi^n - \psi^n}\\ &= \frac{\phi - \psi^{n+1}/\phi^n}{1 - \psi^n/\phi^n} = \phi\,\frac{1 - (\psi/\phi)^{n+1}}{1 - (\psi/\phi)^n}.\end{aligned}

Observe that \psi / \phi is smaller than 1 in magnitude, so (\psi / \phi)^n is tiny when n gets large.  Hence the ratio of successive Fibonacci numbers, which equals \phi times a fraction whose numerator and denominator both get closer and closer to 1 as n gets large,  approaches \phi as n gets large.

Some final remarks.  I said that, with some rare exceptions, the ratio of successive terms in any Fibonacci-like sequence—call it g—will be close to \phi.  Let me explain this.  Using the sequences A and B as a basis, so that our sequence is expressed as g=x\cdot A + y\cdot B for some numbers x and y, we see that the nth term in our Fibonacci-like sequence g is g(n)=x \phi^n + y \psi^n.  Take the ratio of the (n+1)st and nth terms:

\displaystyle \frac{g(n+1)}{g(n)} = \frac{x \phi^{n+1} + y \psi^{n+1}}{x \phi^n + y \psi^n} = \phi\,\frac{x + y (\psi/\phi)^{n+1}}{x + y(\psi/\phi)^n}.

If x is not zero, then numerator and denominator can be divided by x to give

\displaystyle\phi\,\frac{1 + \frac{y}{x} \left(\frac{\psi}{\phi}\right)^{n+1}}{1 + \frac{y}{x} \left(\frac{\psi}{\phi}\right)^n},

which tends to \phi as n gets large, since (\psi/\phi)^n becomes tiny.  On the other hand, if x is zero, we cannot divide numerator and denominator by x.  In fact, the ratio g(n+1)/g(n) is simply \psi rather than \phi.  So this is the special case where the limiting ratio is not the golden ratio.  In this situation, the limiting ratio instead is −1 divided by the golden ratio.
We could say that any Fibonacci-like sequence is a mixture of the sequence A of powers of \phi and the sequence B of powers of \psi.  If there is no A in the mixture, then the ratio of successive terms is simply \psi.  If there is no B in the mixture, then the ratio of successive terms is simply \phi.  If the mixture contains both A and B, then since \phi^n gets huge while \psi^n gets tiny, the B contribution can be ignored for n large, and so the ratio of successive terms tends toward \phi.  An “unlucky choice” of initial terms would be 1, \psi, or more generally a, a \psi, which would generate a Fibonacci-like sequence that is a multiple of B, and therefore contains no A in the mixture.

If you have read this far, you may have noticed that linear algebra did make an appearance after all.  The use of the word “basis” is a clue that the idea of a vector space is lurking.  More specifically, the Fibonacci-like sequences form a two-dimensional vector space.  The focus on Fibonacci-like sequences that are also geometric sequences is a way of sneaking in the matrix diagonalization proof described in Peter Cameron’s third post.

Hadamard matrices: the construction of Scarpis

One of the earliest Hadamard matrix constructions to be discovered is that of Umberto Scarpis.  His construction appeared in

U. Scarpis, Sui determinanti di valore massimo, Rendiconti della R. Istituto Lombardo di Scienze e Lettere 31 (1898) 1441–1446.

Scarpis’s paper followed Hadamard’s by five years. Hadamard in turn had been preceded by Sylvester, who showed that, given a \{1,-1\} matrix H of size n with orthogonal rows, the matrix

\begin{bmatrix}H & H\\ H & -H\end{bmatrix}=\begin{bmatrix}1 & 1\\ 1 & -1\end{bmatrix}\otimes H

is  a \{1,-1\} matrix of size 2n with orthogonal rows.  More generally, the Kronecker product H_1\otimes H_2 is an mn-by-mn Hadamard matrix for given Hadamard matrices H_1 and H_2 of sizes m and n.  Using the 1-by-1 Hadamard matrix H = [1] as a starting point, Sylvester’s construction produces Hadamard matrices of size 2^k for every k ≥ 0.

The results from Hadamard’s paper that pertain to the present discussion are that

  • {+1,−1} matrices with orthogonal rows are the only real matrices with elements of unit modulus that attain his determinant bound;
  • the size of such a matrix is required to be 1, 2, or a multiple of 4;
  • in addition to Sylvester’s examples, such matrices exist of sizes 12 and 20, and hence of sizes that are multiples of powers of 2, 12, and 20.

Scarpis may have been motivated by an examination of the structure of Hadamard’s 12-by-12 example.   As in Sylvester’s construction, Scarpis constructs a larger Hadamard matrix from a smaller Hadamard matrix.  Specifically, if an n-by-n Hadamard matrix can be found, where n−1 is prime, then an n(n−1)-by-n(n−1) Hadamard matrix can be constructed.  Paley later proved that for any n ≡ 0 (mod 4) with n−1 a prime (or more generally, a prime power) an n-by-n Hadamard matrix can indeed be found.  Together, the results of Scarpis and Paley produce Hadamard matrices of sizes 12, 56, 132, 380, 552, and so on.

In this post I will describe Scarpis’s construction in the form in which he originally presented it, with only minor modifications in indexing and notation.

Scarpis proceeds as follows.  The given n-by-n Hadamard matrix, H, is assumed normalized so that its first row and column consist entirely of ones, and columns are assumed to be ordered so that row 2 consists of alternating elements +1, −1, +1, −1, etc.  Let C be the core of H, obtained by removing its first row and column and let H_2 be the matrix that results from removing row 2 of H.  Let p=n-1, which is prime by assumption.

Scarpis defines row vectors j, a_0, …, a_{p-1} of length p.  Here j is the all-ones vector and a_i is  row i+1 of -C.

Scarpis then defines p-by-np matrices

M=H_2\otimes j

and

M_r=\begin{bmatrix}  a_r & -a_0 & a_r & -a_{2r} & \cdots & a_{(p-2)r} & -a_{(p-1)r}\\ a_r & -a_1 & a_{r+1} & -a_{2r+1} & \cdots & a_{(p-2)r+1} & -a_{(p-1)r+1}\\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\ a_r & -a_{p-1} & a_{r+p-1} & -a_{2r+p-1} & \cdots & a_{(p-2)r+p-1} & -a_{(p-1)r+p-1}\end{bmatrix}

where r\in\{0,1,\ldots,p-1\} and index arithmetic is performed mod p.  Scarpis’s Hadamard matrix is then

\begin{bmatrix}M\\ M_0\\ M_1\\ \vdots\\ M_{p-1}\end{bmatrix}.

To verify that the construction works, one must verify (1) that rows of M are orthogonal, (2) that rows of M_r are orthogonal, (3) that rows of M are orthogonal to rows of M_r, and (4) that rows of M_r are orthogonal to rows of M_s for r\ne s.

  1. Orthogonality of rows of M follows from orthogonality of rows of H_2.
  2. Orthogonality of rows of M_r is a consequence of the relations a_r\cdot a_r=p and a_s\cdot a_t=-1 for s\ne t.
  3. Note that the alternation of signs in the columns of M_r matches that of row 2 of H, which is the row omitted in the definition of H_2.  Hence row 2 of H is orthogonal to all rows of H_2.  Since j\cdot a_s=1 for all s, orthogonality of any row of M with any row of M_r follows.
  4. The inner product of row u of M_r with row v of M_s, where r,s,u,v\in\{0,1,\ldots,p-1\}, r\ne s, is a_r\cdot a_s+\sum_{i=0}^{p-1}a_{ir+u}\cdot a_{is+v}.  In the summation over i, there is a unique i for which ir+u=is+v\pmod{p}, namely i=(v-u)/(r-s), where arithmetic is performed in \mathbf{F}_p.  (This is the only place where primality of p is used.)  Hence the expression for the inner product contains p terms equal to −1 and one term equal to p, yielding inner product 0.

In Paley’s famous paper on the construction of Hadamard matrices using quadratic residues in finite fields, he remarks (see Lemma 5) that an 1892-by-1892 Hadamard matrix can be constructed by applying Scarpis’s construction to the 44-by-44 Hadamard matrix obtained using Paley’s first construction, and that size 1892 (which is 44×43) cannot be obtained using the methods of Sylvester and Paley alone.  See

R. E. A. C. Paley, On orthogonal matrices, J. Mathematics and Physics 12 (1933) 311–320.

I will make one remark about Scarpis’s construction.  The matrix M consists of n p-by-p rank-1 submatrices.  Furthermore, the first p columns of each of the M_r form a p-by-p rank-1 submatrix.  These rank-1 blocks are, in some sense, as big as rank-1 blocks in Hadamard matrices are allowed to get.  The proposition that follows contains the precise statement.

Proposition: Let

H=\begin{bmatrix}J_{ab} & X\\Y & Z\end{bmatrix}

be an n-by-n Hadamard matrix containing an a-by-b all-ones block J_{ab}.  Then ab\le n, and equality is equivalent to either of the following conditions.

  • Every column sum of X is zero.
  • Every row sum of Y is zero.

If a is odd and a > 1, then (a+1)b\le n with equality if and only if the column sums of X all equal ±1.
Proof: Let x_1,\ldots,x_{n-b} denote the column sums of X.  Evaluate j_a^THH^Tj_a in two different ways:

\begin{aligned}j_a^THH^Tj_a&=\left\lvert\begin{bmatrix}a & \ldots a & x_1 & \ldots & x_{n-b}\end{bmatrix}\right\rvert^2=ba^2+x_1^2+\ldots+x_{n-b}^2\\ &=j_a^TnIj_a=na.\end{aligned}

Hence ba^2\le na, or ab\le n, with equality if and only if the column sums of X are all zero.   A similar analysis of the transposed matrix H^T establishes equivalence with the condition on the column sums of Y.

If a is odd, then the column sums of X have magnitude at least 1, so the inequality can be strengthened to
ba^2+(n-b)\le na, which implies that (a+1)b\le n when a> 1.  Equality is equivalent to x_j^2=1 for all 1\le j\le n-b.

If we restrict our attention to square (k-by-k) submatrices of Hadamard matrices of size n(n−1), where n ≡ 0 (mod 4), then k may not exceed n − 1.  Hence Scarpis’s construction contains maximal square rank-1 blocks.  Hadamard matrices containing large rank-1 blocks appear to be useful in constructing large-determinant matrices of size congruent to 1 (mod 4), and Scarpis’s Hadamard matrix of size 56 can, in fact, can be used to construct the current record-determinant matrix of size 57.  This record was first constructed by Jean-Charles Meyrignac by improving a matrix found by Bertram Felgenhauer during Lars Backstrom’s programming  contest.  Felgenhauer’s construction method is described in a post to the contest message board.

Binary matrices: determinant bounds and Gram matrices

This post provides some background for a series of planned follow-up posts on recent progress in constructing maximal- or large-determinant binary matrices.

How big can the determinant of a matrix be if its elements are restricted to the set {0, 1}?  How big can the determinant of a matrix be if its elements are restricted to the set {1, −1} instead?  These two questions are equivalent.  For consider an n×n matrix with elements taken from {1, −1} that has no elements −1 in its first row or column.  A series of determinant-preserving transformations produces an (n−1)×(n−1) matrix with elements taken from {0, 1}, multiplied by the scalar (-2)^{n-1}:

\left[\begin{array}{c|ccc}1 & 1 & 1 & 1\\ \hline 1 & 1 & 1 & -1\\1 & -1 & -1 & -1\\1 & 1 & -1 & -1\end{array}\right]\rightarrow\left[\begin{array}{c|ccc}1 & 1 & 1 & 1\\ \hline 0 & 0 & 0 & -2\\ 0 & -2 & -2 & -2\\ 0 & 0 & -2 & -2\end{array}\right]\rightarrow(-2)^{n-1}\begin{bmatrix}0 & 0 & 1\\1 & 1 & 1\\ 0 & 1 & 1\end{bmatrix}.

In the first transformation, the first row was subtracted from each subsequent row; in the second transformation, the first row and column were deleted and the factor −2 was pulled out of each remaining row.
A {1,  −1} matrix with no element −1 in row or column 1 is called normalized.  Any {1, −1} matrix may be brought to normalized form by multiplying rows and columns by −1 as necessary.  Normalization changes the determinant by at most a factor of −1.  As a consequence, the maximal determinant of an n×n matrix with elements in {1,  −1} equals the maximal determinant of an (n−1)×(n−1) matrix with elements in {0, 1}, multiplied by 2^{n-1}.

There are certain advantages to working with {1,  −1} matrices.  Every row and column has vector norm \sqrt{n}.  By Hadamard’s determinant bound, the maximal determinant is therefore n^{n/2}.  This bound is attained only when rows are pairwise orthogonal.  Since when n is odd the inner product of two rows is odd, the bound is never attained when n is odd and greater than 1.  We can go beyond this observation with the help of the following result:

Lemma 1.  Let i, j, k be {1, −1} vectors of length n.  Then i·j + i·k + j·k ≡ −n (mod 4).

Proof.  Define

  • a = the number of coordinate positions in which ij, and k have the same sign;
  • b = the number of coordinate positions in which i and j have the same sign and k has opposite sign;
  • c = the number of coordinate positions in which i and k have the same sign and j has opposite sign;
  • d = the number of coordinate positions in which j and k have the same sign and i has opposite sign.

Then

\begin{aligned}a+b+c+d&=n,\\ a+b-c-d&=i\cdot j,\\ a-b+c-d&=i\cdot k,\\ a-b-c+d&=j\cdot k.\end{aligned}

Adding the four equations and subtracting n from both sides yields 4an = i·j + i·k + j·k, which establishes the result.

A consequence of Lemma 1 is that if j and k are both orthogonal to i then j·k is congruent to −n (mod 4).  If n ≡ 2 (mod 4), this means that j and k are not orthogonal.   In other words, when n ≡ 2 (mod 4) it is impossible to have three pairwise orthogonal rows.  Therefore Hadamard’s bound cannot be attained when n > 2 and n ≡ 2 (mod 4).

This leaves 1, 2, and multiples of 4 as the matrix sizes in which Hadamard’s bound might be attained.  Matrices that do so are known as Hadamard matrices.  It is believed that Hadamard matrices exist of every allowed size.  This is the famous Hadamard conjecture.  (It is not clear that Hadamard actually made this conjecture; it certainly appears in Paley’s work on Hadamard matrices.  Interestingly, Wojtas attributes the conjecture to Sylvester, whose work preceded Hadamard’s by 26 years, but I have not been able to locate any statement concerning Hadamard matrices of non-power-of-2 sizes in Sylvester’s work.)

Can improvements on Hadamard’s bound be obtained when n > 2 and n congruent to 1, 2, or 3 (mod 4)?  The answer is yes, and such improvements were achieved by Barba, Ehlich, and Wojtas.  Let R be an n×n matrix with elements  in {1, −1}.  The key idea is to consider the Gram matrix G formed by the columns of R, that is, the matrix G=R^TR.  One could equally well consider the Gram matrix G′ formed by the rows of R, that is G'=RR^T.  The two Gram matrices are not equal in general—or even equivalent—and in some contexts it is useful to consider both of them at once.  (We define G and H to be equivalent if there is a permutation, applied both to rows and columns, followed by a set of negations, applied both to rows and columns, that transforms G into H.)

The Gram matrix is symmetric, positive-semidefinite, and has diagonal elements equal to n.  The columns of R may be partitioned into a set A consisting of those columns of R whose inner product with column 1 is congruent to n (mod 4), and a set B, possibly empty, consisting of those columns of R whose inner product with column 1 is congruent to n+2 (mod 4).  (Clearly A is non-empty since column 1 itself is in A.)  Lemma 1 implies that any two columns in A have inner product congruent to n (mod 4), that any two columns in B have inner product congruent to n (mod 4), and that any column of A has inner product congruent to n+2 (mod 4) with any column of B .  We may therefore assume, by reordering columns of R as necessary, that G can be partitioned as

G=\begin{bmatrix}S & U\\ U^T & T\end{bmatrix},

where S and T are symmetric, positive-semidefinite matrices with diagonal elements equal to n and off-diagonal elements congruent to n (mod 4), and U is a matrix with elements congruent to n + 2 (mod 4).

When n is even, this structure of G is independent of the normalization of the columns of R.  When n is odd, however, negating the columns of R in the set B results in a matrix for which all elements of G are congruent to n (mod 4).  Hence, by normalizing the columns of R in a suitable manner, we may always assume B to be empty.  The same normalization may also be carried out for the rows of R, so that the elements of G′ can be assumed congruent to n (mod 4) as well.  With such normalization in rows and columns, R is said to be parity normalized.

We will use the term candidate Gram matrix to denote an n×n symmetric, positive-semidefinite matrix ((g_{ij})) such that g_{ii}=n and g_{ij}+g_{ik}+g_{jk}\equiv -n\pmod{4}.  That is, a candidate Gram matrix satisfies the properties of actual Gram matrices described in the preceding paragraphs, including having elements compatible with Lemma 1.  When n is odd, any candidate Gram matrix is equivalent to a candidate Gram matrix all of whose elements satisfy g_{ij}\equiv n\pmod{4}.  When n is even, any candidate Gram matrix is equivalent to a partitioned matrix with blocks S, T, U satisfying the properties described above.  Unless otherwise stated, we will impose these additional requirements from now on.

There are properties of actual Gram matrices that we have not imposed on candidate Gram matrices.  For example, actual Gram matrices must have perfect-square determinant.  They must also satisfy criteria implied by the Hasse–Minkowski theory of rational equivalence of quadratic forms, which will be discussed in a future post.  The criteria that we have imposed are the ones that are locally true, that is, that are true of all of the principal minors of the matrix.  Having perfect-square determinant is not a local property.

One property of actual Gram matrices that candidate Gram matrices automatically do satisfy is that their determinant is divisible by 4^{n-1}.   This follows for n odd by subtracting row 1 of the candidate Gram matrix from each subsequent row and observing that each row except the first is divisible by 4.  It follows for n even by subtracting the first row of each partition from each subsequent row in that partition, and then observing that that the first row in each partition is divisible by 2, while each subsequent row in each partition is divisible by 4.

Ehlich and Wojtas independently showed in 1964 that when n ≡ 2 (mod 4), the candidate Gram matrix with maximal determinant has S and T both of size n/2, U = 0, and S and T both equivalent to the matrix (n-2)I_{n/2}+2J_{n/2}, where I_j is the j×j identity matrix and J_j is the j×j all-ones matrix.  In other words, the candidate Gram matrix with maximal determinant is equivalent to

G_2(n)=I_2\otimes\left((n-2)I_{n/2}+2J_{n/2}\right).

I plan to write a follow-up post describing this result in detail. Ehlich also found that a necessary condition for a decomposition G_2(n)=R^TR to exist, with R a {1, −1} matrix,  is that n − 1 be the sum of two squares.  In addition, he produced examples of such matrices R for all n up to 38, excluding 22 and 34, for which examples cannot exist since they do not satisfy the sum-of-two-squares condition.  Ehlich and Wojtas’s result establishes for n ≡ 2 (mod 4) that

\lvert\det R\rvert\le\sqrt{\det G_2(n)}=(2n-2)(n-2)^{(n-2)/2}.

In 1933, Guido Barba published a strengthening of Hadamard’s bound for odd n.  I have not translated his article from the Italian in order to study its arguments in depth, but Barba appears to state that, among symmetric, positive-semidefinite matrices with odd elements and diagonal elements equal to n, the determinant is maximized by matrices equivalent to G_1(n)=(n-1)I_n+J_n.  This implies that

\lvert\det R\rvert\le\sqrt{\det G_1(n)}=\sqrt{2n-1}\,(n-1)^{(n-1)/2}

for any {1, −1} matrix R of odd size n.  Ehlich, apparently unaware of Barba’s result, reproduced it in his 1964 paper.  Wojtas, on the other hand, cites Barba’s paper and its result.

There can be no {1, −1} matrix R satisfying R^TR=G_1(n) if 2n − 1 is not a perfect square.  Equivalently, there can be no such R if n is not the sum of two consecutive squares.  Supposing that n=q^2+(q+1)^2, solutions to R^TR=G_1(n) are equivalent to symmetric balanced incomplete block designs with parameters (v,k,\lambda)=(n,q^2,q(q-1)/2).  Such designs are known for all odd prime q, due to a construction of Brouwer, and for a finite number of other sizes.  The smallest q for which a design is not known is 6, which corresponds to n = 85.  Observe that the values of n for which Barba’s bound is attainable are all congruent to 1 (mod 4).

 That the matrix G_1(n) does not have all elements congruent to n (mod 4) when n is congruent to 3 (mod 4) suggests that Barba’s bound can be improved in this case.  In a sequel to his paper on the n ≡ 2 (mod 4) bound, Ehlich proves that this is in fact so.  He shows that among candidate Gram matrices of size congruent to 3 (mod 4),  the determinant is maximized by particular instances of matrices of a special form that we call Ehlich block form.  Let D(a, b, c, …) denote the block-diagonal matrix whose diagonal blocks are all-ones matrices of sizes a, b, c, …  Then the matrix E(a, b, c, …) = (n − 3)I − J + 4D(a, b, c, …) is said to be of Ehlich block form.  It has diagonal elements n, and off-diagonal elements 3 or −1, with the 3s appearing in blocks along the diagonal of the matrix.  For example,

E(2,2,2,1)=\begin{bmatrix}7 & 3 & -1 & -1 & -1 & -1 & -1\\ 3 & 7 & -1 & -1 & -1 & -1 & -1\\ -1 & -1 & 7 & 3 & -1 & -1 & -1\\ -1 & -1 & 3 & 7 & -1 & -1 & -1\\ -1 & -1 & -1 & -1 & 7 & 3 & -1\\ -1 & -1 & -1 & -1 & 3 & 7 & -1\\ -1 & -1 & -1 & -1 & -1 & -1 & 7\end{bmatrix}

is the Gram matrix of the determinant-maximizing {1, −1} matrix of size 7.  Furthermore, Ehlich proved that the optimal block sizes are determined by the following rules

  • there are s blocks of sizes r=\lfloor n/s\rfloor or r+1, with v = nsr of the latter, and u = sv of the former;
  • the number of blocks, s, is given by the following table.
    n 3 7 11 15 ≤ n ≤ 59 63 ≤ n
    s 3 5 5 or 6 6 7

     

This result implies the bound

\lvert\det R\rvert\le(n-3)^{(n-s)/2}(n-3+4r)^{u/2}(n+1+4r)^{v/2}\sqrt{1 - \frac{ur}{n-3+4r} - \frac{v(r+1)}{n+1+4r}}.

The methods that Ehlich used to obtain this result are well worth examining in detail, and I will do so in a subsequent post.

In size 7, Ehlich’s optimal candidate Gram matrix, E(2,2,1,1,1), satisfies \det E(2,2,1,1,1)/4^6=84, whereas the Gram matrix, E(2,2,2,1), of the actual maximal determinant {1, −1} matrix satisfies \det E(2,2,2,1)/4^6=81.  Clearly there can be no decomposition R^TR=E(2,2,1,1,1) since 84 is not a perfect square.  Cohn has determined the conditions under which Ehlich’s optimal candidate Gram matrix has perfect-square determinant; Tamura has carried out a detailed analysis using the Hasse–Minkowski criterion of the conditions under which Ehlich’s optimal candidate Gram matrix might decompose as G=R^TR.  One consequence is that, apart from n = 3, the smallest n for which a decomposition is possible is 511.  At present, Ehlich’s bound is not known to be attained for any n > 3.  A future post will discuss these analyses.

It is interesting to see how the improved bounds of Barba, Ehlich and Wojtas, and Ehlich compare with the Hadamard bound, h(n)=n^{n/2}.  We compute that

  • the Barba bound for n ≡ 1 (mod 4) is asymptotically \sqrt\frac{2}{e}\,h(n)\approx 0. 8577639\,h(n);
  • the Ehlich–Wojtas bound for n ≡ 2 (mod 4) is asymptotically \frac{2}{e}\,h(n)\approx 0. 7357589\,h(n);
  • the Ehlich bound for n ≡ 3 (mod 4) is asymptotically \frac{2\cdot11^3}{7^3}\sqrt\frac{1}{7e^3}\,h(n)\approx 0. 6545204\,h(n).

The Gram matrices of maximal-determinant matrices can be quite a bit more complicated than are the candidate Gram matrices used to obtain the upper bounds.  Consider, for example, n = 9, 11, 17, 19, 21, and 37.  Current record-determinant matrices also provide interesting examples, such as n = 22, 23, 29, and 33.  All known maximal-determinant matrices have the property that their Gram matrices G=R^TR and G'=RR^T are equivalent.  The smallest example of a record-determinant matrix for which this is not the case is n = 45.

References

  1. G. Barba, Intorno al teorema di Hadamard sui determinanti a valore massimo, Giorn. Mat. Battaglini 71 (1933) 70–86.
  2. A. E. Brouwer, An infinite series of symmetric designs, Math. Centrum Amsterdam Report ZW 202/83 (1983).
  3. J. H. E. Cohn, Almost D-optimal designs, Utilitas Math. 57 (2000) 121–128.
  4. H. Ehlich, Determinantenabschätzungen für binäre Matrizen, Math. Z. 83 (1964) 123–132.
  5. H. Ehlich, Determinantenabschätzungen für binäre Matrizen mit N≡3 mod 4, Math. Z. 84 (1964) 438–447.
  6. H. Ehlich and K. Zeller, Binäre Matrizen, Z. Angew. Math. Mech. 42 (1962) T20–T21.
  7. J. Hadamard, Résolution d’une question relative aux déterminants, Bull. Sci. Math. 17 (1893) 240–246.
  8. R. E. A. C. Paley, On orthogonal matrices, J. Mathematics and Physics 12 (1933) 311–320.
  9. J. J. Sylvester, Thoughts on inverse orthogonal matrices, simultaneous sign successions, and tesselated pavements in two or more colours, with applications to Newton’s rule, ornamental tile-work, and the theory of numbers, London Edinburgh and Dublin Philos. Mag. and J. Sci. 34 (1867) 461–475.
  10. H. Tamura, D-optimal designs and group divisible designs, Journal of Combinatorial Designs 14 (2006) 451–462.
  11. J. Williamson, Determinants whose elements are 0 and 1, Amer. Math. Monthly 53 (1946) 427–434.
  12. M. Wojtas, On Hadamard’s inequality for the determinants of order non-divisible by 4, Colloq. Math. 12 (1964) 73-83.

The Hadamard maximal determinant problem: website up-to-date

Since early 2002, Bruce Solomon and I have maintained an online database of record-determinant {1,−1}-matrices.  Around that time Bruce was setting lots of new records by means of computer search techniques that grew out of his collaboration with Roland Dowdeswell, Michael Neubauer, and Kagan Tumer, using some key ideas that had been developed by Warren Smith in his 1988 Princeton University PhD dissertation.  Trevor Welsh was also at that time establishing many new records for larger sizes congruent to 3 (mod 4), and, a short while later, I succeeded in proving maximality of Smith’s conjectured 15×15 maximal-determinant matrix.  The website was a way to keep the community apprised of the rapid progress that was then being made.   (There had been an earlier site, developed by Roland Dowdeswell, that was connected to a permanently running search program.  That site appears to have been taken offline.)  Eventually progress slowed, and the day-to-day maintenance of the site fell to me.

In 2005 there was another spate of new records, set by the participants in the programming contest organized by Lars Backstrom.  I regret that, while I continued to perform sporadic updating of the site, the records for certain matrix sizes have remained out-of-date since that time.

The reason for this post is to announce that, finally, the page has been brought up-to-date.  The sizes that are affected by this revision are n=45 in the 1 (mod 4) column, n=70, 78, and 94 in the 2 (mod 4) column, and n=35, 39, 43, and 63 in the 3 (mod 4) column.  My apologies to anyone who may have worked to improve any of the records that were out-of-date.  I will strive to keep the page more current in the future.  Even though the main page is up-to-date, many of the subsidiary pages for particular n are still out of date.  In particular, there are a number of classification results that have not yet made it to those pages.

I have plans to write a series of follow-up posts explaining the developments that led to these improvements.   Here’s a brief summary:

  • the methods using “doubly three-normalized” Hadamard matrices described in Orrick, W. P. and Solomon, B., Large-determinant sign matrices of order 4k+1, Discrete Math. 307 (2007) 226–236.  (arXiv preprint) were used to find the new record for n=45 by Brent, Orrick, Osborne, and Zimmermann.  The result was reported, but not described in detail, in http://arxiv.org/abs/1112.4160.
  • the results for n=70, 78, and 94 have been available at Tomas Rokicki’s website since 2005.  All are constructed from pairs of bordered, circulant matrices and were found by Rokicki or by Orrick and Rokicki.  These results have not been published, but a paper exists in draft form.  Additional information about the structure of the record for n=70, which is similar to the structure of the records for n=22, 34, and 106, can be found in the REU report of Adam Vollrath.
  • the results for n=35, 39, 43, and 63 were found and reported to the site by Hiroki Tamura in January 2011.  They are related to the work in Tamura, Hiroki, D-optimal designs and group divisible designs, Journal of Combinatorial Designs 14 (2006) 451–462.  Similar results for n=27, 31, and 59 were reported to the site by Tamura back in August 2005 (and added to the site at that time).  Apparently none of these results have yet been published, but a paper may be in the works.
Follow

Get every new post delivered to your Inbox.