This web page documents my current state of knowledge about a combinatorial problem that occurred to me in the summer of 2013 and intrigued me by having no obviously simple answer. Several people have been kind enough to help me gather data and thoughts on the problem, so I'm putting up a web page collating all of that.
I have so far not heard of any previous work on the problem, although there is a strong possibility that such work exists and I just haven't heard about it. Please send details, if you have any!
The problem is a one-dimensional dissection. Let m and n be positive integers. Suppose we have m matchsticks, each of length n. We want to turn them into n sticks of length m, by cutting them into shorter fragments and then gluing the fragments back together.
The catch is that we don't like short fragments, and we like them less the shorter they are. So we want all the fragments to be nice and large. Specifically, we want to find a dissection whose shortest fragment length is as long as we can make it.
Obviously, one possible dissection is to cut up every stick into n pieces of length 1, and then stick them all back together in groups of m, giving a shortest fragment of 1. If m and n have a common factor, we can improve on this by cutting them into larger pieces, still all the same size. In general, this gives us a minimum fragment of gcd(m,n). (Another way to achieve this same result would be to glue all the sticks into one long stick and then cut that into n equal pieces. That dissection uses fewer fragments, but the shortest fragment length is still gcd(m,n), and we don't care about the details of the dissection other than that.)
In many cases, however, you can do better than that. The case that first brought this problem to my attention was the one with n=6 and m=5. The trivial gcd solution gives you a minimum fragment of 1, but in fact we can dissect 6 sticks into 5 sticks with minimum fragment 2, by cutting up every length-5 stick as (2 + 3) and then reassembling as two lots of (2 + 2 + 2) and three lots of (3 + 3).
More interestingly still, the minimum fragment need not always even be an integer. (That was the aspect that surprised me enough to make the question really stick in my mind.) For example, consider n=5 and m=4: obviously if you use integer-sized fragments then you can't make the biggest fragment any larger than 1 (each length-5 stick would need a fragment of odd length, and you can't cut an odd piece from a length-4 stick without generating a piece of length 1). However, if you don't insist on integer fragments, the dissection of 5 into 4 can be done with shortest fragment 3/2: cut four of the length-4 sticks slightly unevenly into 3/2 + 5/2 and the last one evenly into 2 + 2, then make two length-5 sticks as 5/2 + 5/2 and the other two as 3/2 + 3/2 + 2.
To begin with, we'll establish the convention that m < n. (The problem is obviously symmetric in interchange of the two variables, and the case m = n requiring no cuts at all is so uninteresting that we may as well ignore it.)
We've established that the minimum fragment size need not be an integer, so an obvious next question is, need it even be rational?
The answer is yes, and we can prove it as follows. Consider an arbitrary dissection of m sticks into n containing at least one irrational fragment length. The lengths of all the fragments generate a finite-dimensional vector space over ℚ; let { e_{1}, …, e_{k} } be a basis for that space in which e_{1} is rational. Then the shortest irrational fragment length must have a nonzero coefficient of some basis element other than e_{1}, say e_{i}.
Now we construct a new dissection based on the old one, in which we adjust every fragment length by adding ε times its original value's coefficient of e_{i}, for some constant ε. Doing this must preserve the property that the fragments making up each input and output stick sum to the right total length (proof: the original e_{i} coefficients must have summed to zero in order to generate an integer stick length previously, so adding a constant multiple of all those e_{i} coefficients doesn't change the overall sum). So now we choose |ε| small enough that no two original fragment lengths change their ordering and no fragment length becomes negative or more than the entire stick size. (We can do this, since those constraints give us finitely many upper bounds all of which are strictly positive, so we can certainly find an ε smaller than all of them and still nonzero. The only interesting case is when a fragment was the whole of its stick – in which case it has a zero coefficient of e_{i} anyway, so we aren't changing it at all and no bound is placed on ε by the requirement for it not to overflow.) Finally, we choose the sign of ε so that it lengthens rather than shortening the smallest irrational fragment.
So now we've constructed a new dissection of basically the same shape as the old one, in which every fragment length is adjusted by at most a small amount, with the properties that:
Also, if we were to strengthen our optimisation criterion by imposing a tie-breaking condition in which dissections with the same shortest fragment were compared by looking at the next shortest fragment and so on, then any dissection optimal under that stronger condition would have to have every fragment length rational (because, again, we could always lengthen the shortest irrational one, without changing any of the rational fragment lengths shorter than it). This implies that not only must the minimum fragment size always be rational, but also there must always be an all-rational dissection which attains it. So we can rule irrationals completely out of consideration, which is a relief!
A problem with searching for dissections is, once you've found one, how do you prove there isn't a completely different one that does better?
One way to solve that problem is to find a genuinely exhaustive search strategy, one of which is described in the next section. But exhaustive searches are expensive, so it would be nice to have some other tools in our toolbox as well, such as strategies for proving that no solution for a given n,m can be better than a given value. Then, if you can find an upper bound proof and a dissection attaining that bound, you know you don't have to look any further.
One such upper bound proof, largely due to writinghawk, works by counting the pieces in the dissection. For this proof, we start by assuming that every length-m stick (i.e. every shorter stick) is cut into at least two pieces; this doesn't lose generality, because if any stick remains whole in a dissection, we can just cut it in half and put both the pieces into the same n-stick. (And this does not decrease the minimum fragment length unless all m-sticks were whole, i.e. m divides n, which is a special case with such an obvious answer that we don't need this proof anyway.)
So, let's suppose that every piece is at least length s. Also, every piece must be at most length m − s (because it must have been part of some m-stick, which had a piece of at least s cut off it).
That means that each stick of length n has to be cut into at least n/(m−s) and at most n/s pieces (otherwise some piece would have to be too long or too short). Because the number of pieces also has to be an integer, that really means it's at least ⌈n/(m−s)⌉ and at most ⌊n/s⌋. Summing those expressions over all n-sticks tells us that the total number of pieces P in the whole dissection must satisfy the inequality m⌈n/(m−s)⌉ ≤ P ≤ m⌊n/s⌋.
By similarly considering the number of pieces that a stick of length m is cut into, we derive a second inequality n⌈m/(m−s)⌉ ≤ P ≤ n⌊m/s⌋. (This is identical to the first inequality, but with n and m swapped everywhere they appear except in the divisor m−s).
So when s becomes big enough that either of those inequalities is self-inconsistent (with its LHS strictly larger than its RHS), or when the two are incompatible (because they define ranges for P that do not meet), there can be no dissection with s that large. This allows us to establish an upper bound on s for a given instance of the problem, which is often (but, results below indicate, not always) actually achievable.
A second upper-bound proof, due to Tom Womack, applies in a restricted set of cases, some of which are not solved optimally by the above proof. Specifically: if m/3 is an integer and is also a multiple of n−m, then no solution can be better than m/3.
Proof: suppose the shortest fragment is m/3 + ε for some ε > 0. Then if any segment of stick less than twice that length arises while you're dissecting, it must remain whole. In particular, any fragment shorter than 2m/3 + 2ε must be monolithic in this way.
So consider a stick of length m from which we cut a piece of length m/3+ε. The other piece of that stick has length 2m/3−ε and hence is monolithic; so it must have been cut as a whole from a longer stick of length n. Let n = m+k, because we'll want to talk about k a lot. Now we reason as follows:
We've done some computer searching to find the answers, or the probable answers, to small cases. We've got two different search strategies:
Iterate over adjacency matrices. This strategy relies on the idea that we can assume, without loss of generality, that each input stick contributes at most one piece to each output stick. (This is the reverse of the assumption made in the proof of writinghawk's upper bound above! But fortunately they don't have to both be true at the same time.) So you could draw up an n × m matrix indicating which input sticks contributed to which output sticks. Given such a matrix, the problem of finding an optimal dissection with that matrix can be written as a linear programming problem. (One variable per dissection fragment is the obvious way, but then the objective function ‘take the minimum’ isn't linear. But if you add an extra variable showing the minimum size itself, and then make the other variables say by how much each other piece exceeds the minimum fragment, then it is linear.) So, in principle, we can just iterate over all the possible 2^{nm} adjacency matrices, and solve a linear programming problem for each one! Of course this takes exponential time, but it turns out reasonably practicable in small cases as long as you prune branches of the search tree early and often if you can easily show they can't give a result as good as the best one you already have.
ILP over possible stick partitions. A completely different approach is to enumerate the various ways in which a stick can be partitioned into smaller pieces. (There are infinitely many, of course, but if you pick a denominator d and constrain to multiples of 1/d then that's not too bad.) This approach requires us to make a guess at the minimum fragment, so that we can enumerate partitions which use pieces no smaller than that. Once we have a list of the partition types for the input and output sticks, we express them as vectors showing how many pieces of what lengths they use, and then solve an integer-linear-programming problem to find combinations of them for which all the piece counts add up. Again, ILP is a computationally hard problem, but this algorithm seems to work OK in small cases because if you start your minimum fragment too big and lower it until you get a solution, you never have too many possible partitions to worry about.
The first of these strategies is genuinely exhaustive: assuming no bugs in the search program, the algorithm must find the best possible dissection. The second, however, leaves open the possibility that a better dissection might exist using a denominator larger than any we had considered. Examination of the data we have so far strongly suggests the conjecture that no denominator larger than n is ever needed in an optimal dissection (though denominators larger than m can certainly occur), but we have no proof of that, and so results generated by the second search program cannot be considered reliable unless an upper bound proof confirms that they can't be beaten.
Disregarding that weakness of the ILP approach, the two search algorithms complement each other well, because once n gets a bit larger we find that the matrix iteration algorithm works best for small m, while the ILP approach works best for large m (i.e. not much smaller than n). So running both has allowed us to collect a reasonable amount of data.
FIXME: link to git repo.
The table below shows the best minimum fragment size for all the n,m pairs I currently know the answer to, and the best upper and lower bounds I have for a few values I don't know.
The table cells are colour-coded to indicate our confidence in each number. Green indicates that the correctness of the result depends only on our upper-bound proofs; greenish-yellow indicates that the correctness depends on our exhaustive search program not having a bug in it; yellow indicates that the correctness depends on our conjecture that denominators are no larger than n (and also on a search program not having a bug); red indicates that we actually don't know the value and have listed a range of possible answers.
Each table cell links to a section further down the page giving an explicit dissection and discussing which (if any) upper bound proofs apply in that case.
Looking at the above table, there are clear patterns in the m=3 and m=4 columns. The sequence of fractional values 5/4, 4/3, 11/8, 7/5, 17/12, … shows an obvious regularity if you rewrite it as 3/2 − 1/4, 3/2 − 1/6, 3/2 − 1/8, 3/2 − 1/10, 3/2 − 1/12, …. And the fractions in the m=4 column tend upwards to 2 in a similar harmonic series too. The upper bound proofs seem to be tight everywhere in these columns as well (assuming the two uncertain cases in the m=4 column go as expected), so it may be tractable to establish a rigorous proof completely solving the problem for these two values of m.
There's also a nice pattern down the diagonal n=m+1. Our current bounding proofs are not reliably tight in that region, but it might be worth trying anyway to prove that that pattern holds along the whole diagonal, in the hope that in the process we can discover another useful bounding proof!
The patterns in columns 3 and 4 suggest a more ambitious conjecture to me. Each of those columns has obviously different behaviour depending on the value of n mod m (e.g. in the m=3 column the sequence of fractions is interrupted every third cell because something obviously different happens when n is a multiple of 3), but if you split up each column into m subcolumns by taking every mth cell, the pattern in each subcolumn is much simpler, being either constant or harmonic. So the more ambitious conjecture isf that perhaps a similarly simple pattern (of some sort) might hold in every column, if you split it into subcolumns by the value of n mod m. I don't think we have enough data here to say anything with confidence about that idea, though.
The case n=29,m=19 is especially intriguing for the fact that the largest denominator used in the whole dissection is 28, though the denominator of the minimum fragment is only 4. (And that dissection was found by the ILP search program, which tries small denominators first, so I think there probably is no dissection with the same minimum fragment and less silly other fragment lengths.)
FIXME: credit Ian, Tom, writinghawk, Jack for deAOCisation, anyone else?
The output below is generated by a script from a collection of data files and simple on-the-fly algorithms, so that I can easily re-run it if and when I acquire more data. So it will look a bit stilted and formulaic; sorry about that.