Sunday, April 12, 2009

A further remark on the orb puzzle

Here is a further remark on the orb puzzle also called the egg puzzle and (in)famously also called, because of interview forums, the 2 orb puzzle or the 2 egg puzzle.

Back here, in India, the object that guys have been using, since time immemorial, is an egg.

NOTE: In what follows, I would be using egg (and not orb) as the object of experiment. The title was chosen as "A further remark on the orb puzzle" as a homage to the mind of whoever first came up with this remarkable puzzle. Now you can gear yourself up for the further remarks, which to my best knowledge have not been accumulated elsewhere.

If you want to know what the puzzle and how to solve it, you can read my previous post.

Now that you have settled (assumption being you are aware of the puzzle and its solution) you can start reading what follows.

Somewhere, while doing the above question, my thoughts turned to a strange corner. I had solved the problem in its full generality and had also tried cases with different number of eggs and floors.

With 3 eggs, it took you 9 turns to cross 100 for the first time. With 4 eggs, this number was 8.

With 5 eggs, the number came out to be 7.

I started wondering, could I beat binary search with this method?

I just wanted to see the number of floors you scale when you were allowed 7 drops. Could it exceed 128?

I tried with 6 eggs. Answer? 126.

7 eggs. 127.

Using more than 7 eggs, as expected, gives 127 because max drops is 7. And thus, for the following number of eggs, I had

8? Again 127
20? Again 127.

I then had a look at floors scaled with 6 drops. The answer was 63. 63 appeared the first time (with 6 being the fixed number of drops) when I used 6 eggs.

127 appeared (with 7 being the fixed number of drops) the first time when I allowed myself at least 7 eggs. Wherefrom, it remained 7 regardless of the number of eggs.

Hmmm..Something was brewing in the mathematical universe. The number of floors scaled with a certain number of drops had an upper bound which could not be pushed by upping the number of eggs.

What was going on?

I found the answer in a pdf by Umesh Nair. Though he does not address this issue directly, the treatment he invoked for solving this question could be readily extended to solve my problem.

In particular, he made the interesting observation that if you are allowed a total of 'd drops' then you could not scale more than 2^d floors.

Once I read this statement, it was pretty obvious!

Can you see why?

The answer lies, as Mr. Nair points out, in the innocous fact that for a 'particular setting' ('e eggs', 'n floors', the 'break floor' being the b-th floor) there exists a unique sequence of drops which locates the 'break floor'.

You change the number of 'break floor', you see a different drop sequence emerge. There is, thus, a one-to-one mapping between drop-sequence and a 'particular setting'.

And therefore, the number of drop sequences equals the number of floors you scale for a particular setting.

And obviously, you have scenarios where you do not end up using all of your 'd drops' for a particular setting.

For example, even though in case of 2 eggs and 100 floors, you are allowed a total of 14 drops, if the 'break floor is 1' you can locate it in 2 drops! (not all 14 drops are needed).

Therefore, some drop-sequences are smaller in size. Which means that with maxmium 'd drops' you can really not scale unbounded heights as you increase the number of eggs. Further that bound is 2^d (as remarked above) because some drop sequences being smaller, terminate the entire sub-drop tree that could have been rooted at that position.

Because for the 2 egg, 100 floor, 1st floor-break-floor setting, drop sequence 1st-at-14 and 2nd-at-1 terminates; you cant have, in this particular case (where 1is the break-floor) any other trials after completing your 2nd drop.

This is pretty interesting analysis!

Hats off to Mr. Nair for his work on this problem!

He has made a few other remarks as well. For example, he has converted the original recurrence (which you can also find in my previous post) to a combinatorial summnation.

I will write about that later.

So far, so good.

But that still does not answer why this egg method converges with binary search when number of floors and number of eggs are in binary-search-tandem.

(That is floors are 2^e - 1 when number of eggs is e). In case, you are not sure it does, think a little bit about it; i will not answer it here apart from mentioning that you should be able to make it out from what has been said above.

Call it the tandem-issue.

Intuitively, we would expect someone not familiar with the recursive solution this puzzle boasts of, to come up with the binary search strategy only, given number of eggs and floors are in binary-search-tandem. (In fact, guys usually come up with binary search even when the values are not in that tandem - but that is not what I intend to discuss).

The fact that one comes up, on an intuitive level, with binary search given the tandem complimented with binary search information-theoretic content, maybe reason enough for someone to point out that this solves the tandem-issue.

But such a reasoning is not complete. We need to answer why in binary-search-tandem settings, does the recurrence become as good as binary-search itself. We should not invoke reverse reasoning. That is, we should not say that binary search is what we can use in tandem-issues and since it is known to be best, thats what we do.

We have also got the responsibility to answer why our-recursive-way will also proceed the binary-search way. Sure, it could have figured out a different way requiring the same number of drops differently.

Then why did not it do so? Why another mathematical cataclysm?

Now that we are clear on the problem, we solve it.

We just need to show that T(d, d) = 2^d - 1.

Try it on your own. I will answer it soon in this space only.

For now, its goodbye

Friday, April 10, 2009

A few algorithmic puzzles

Hello There,

Today I will be telling you about a few algorithmic puzzles.

The talk basically involves a problem with its solution. So, if you are a reader who wants to try these out before peeking out at the solution, beware.

Also, I would say that unfortunately, one of these puzzles has been made too popular through interview forums and other discussions, which makes the solving techniques used for these kinds of puzzles look futile.

It is my opinion that interview forums should operate a little differently.

Too much digression, now I will come straight to the point.
Below, I present the first of the two questions that I will be tackling in this blog.

Prob.1) Given 2 special eggs and a building 100 floors high, can you dear reader, find the minimum number of drops in which you can find the height after which these 2 eggs start breaking.

Oh yes, the eggs are special in the sense that they wont start breaking unless dropped from or above the threshold height.

Sol.) The key fact here is to realise that you need to prove the minimality of your solution for which the most general approach that you can use is a recursive one.

Infact, in my opinion, for handling discrete minimization/maximization problems, the best way is to establish a recursive equation. Once you have done that, establishing the lower/upper bound is as easy (hard) as solving the recurrence.

Okay, did you try this problem on your own so far. If yes, did you arrive at the first wrong answer most people do (19). Or did you arrive at the correct answer 14?

Whatever you did, probably reading the explanation for 14 below won't hurt. (I will not explain how people arrive at 19).

Well, the first thing we see is that, its a minimization problem. A good thing about these problems is that, they can always be inverted. You can recast it into a maximization problem which maybe (and in most cases, in my experience is) easier to answer.

There is actually a reason for this as well. Maximization problems tie well with out intuition of "overflow". The analogy is that a bucket is said to hold a 'maximum quantity' when adding anymore quantity leads to 'quantity spillage'.

In mathematical terms, you try to show that a set S contains maximum elements with Property P when adding anymore quantity of elements with Property P will spill one or more quantities. (However, there maybe a few subtle issues involved.) More on this later.

A small, but powerful technique.

In this case, the corresponding maximization problem is to determine the maximum height you can scale using 2 eggs and 'd drops.'

Vary 'the number of drops' till the height scaled is greater than or equal to the number of floors in the question.

Well, the above discussion suggests a not-too-hard/easy-to-find recurrence for this problem.

Try one last time before I give it away in its full grandeur.

Let us assume that T(e, d) indicates the height you scale when you have 'e eggs' and 'd drops'

Hmm. I have selected the notation. What remains is notion. As Gauss remarked, it is notion not notation which is important.

But a notation is also good enough as a starting point. Now - the notion.

Well, its recurrence, right? (Notation says so)

Then the corresponding notion is to setup on its RHS another invocation(s) of the same. (Notion)

Clearly it involves using on RHS some quantity smaller than the quantity you begin with so that the recursion terminates.

SO, what can that be? Well you pick an intelligent height, and go for the first drop. And then you have d-1 drops left. Hold it. You can see your recursion now.

After 1st intelligent drop, you can scale T(e, d-1) floors. The intelligent choice must be so placed, that if the egg breaks, then the culprit floor among the floors below is locat-able with 'e-1 eggs' and 'd-1 drops'. That is the intelligent drop is one level above T(e-1, d-1)!

Thus, T(e, d) = T(e, d-1) + [T(e-1, d-1) + 1]

And for recursion termination?

T(1, d) = d.

NOTE: I have not highlighted (maybe even not told)in this problem where does that 'overflow analogy' is. But rest assured, you can find it.

Will post it later, and then remove this note.

Prob. 2) Well, now we come to the second problem. It is from a competition called felicity (organised by IIIT Hyderabad).

The problem asks you to 'maximize' the number of elements in a set S. All of the elements in S come from a set A of natural numbers which contains numbers from 1 to 2000.

But you need to be sure that you do not include a number thrice (or one-third) and five-times (or one-fifth) of some number you already picked. These elements thrice (or one third) and five times (or one-fifth) are called respectively the first-kin and the second-kin of the number picked.

You need to determine the maximal size the set S can be.

Go on, try it! Below I produce the solution.

Sol.) The key fact here is that it is (indeed) a direct maximization problem. We can try to use the 'overflow analogy' as I demonstrate below. As of now I have not tried a recursive solution to this problem myself, but anyways I have put the 'overflow analogy' to good use here.

See if you can arrive at the solution using what has been said this far (which by the way is no hint..but its okay to try with the assurance that you may need to invoke this process, no matter how insignificant the 'place of invoking' is).

Well, if you tried and did not get it (or maybe did not try it at all) here is the solution again in full grandeur.

Ask yourself, when does overflow begin?
The answer is when maximum cardinality is reached, right?
And how exactly do I arrange matters to create it?

It is this question which puts your creativity on trial.
You can begin with a few cases simpler to analyze. For example limit the size of the set A to 10. How big set S can you create now?

The first choice, guys (usually) come up with. is S = {1 2 X 4 X X 7 8 9 X}.
(X indicates the absence of an element.)

Have we achieved true maximality?
Sadly, no.

Hmmm... these guys say that 'overflow analogy' says that placing any absent element into the set S would cause spill and therefore, they claim that they have achieved true maximality.

And this is where we change all the rules. This is precisely the subtlety I talked about earlier. Call it the subtle issue.

You can see that you can make S bigger by removing 1 and keeping 3 and 5 in the set. Thus, the notion of 'quantity spillage' in these problems is of a different kind. Here, your bucket holds maximum quantity when there are no left out quantities keeping which back spills less.
For example, consider one second answer : S = {X 2 3 4 5 X 7 8 X X}. Hell, its no good than the first one.
This configuration (in second answer) is bad as we can allow 'overflow of just one quantity (2) by adding two quantities (6, 10).

A little experimentation shows true maximality is achieved when S is the set with precisely the above modification= {X X X X 4 5 6 7 8 9 10}

You can see that adding any other element leads to 'spillage'. Also, notice that no quantity 'spills less than it contributes'. That is, you can't (in this particular example) add two quantities spilling out just one.

Thus, we have achieved true maximality.

Similarly, for A = {1, 20} the maximal S is
{1(stop) 7,8,9,10,11,12,13,14,15,16,17,18,19,20}.

But, hold your horses. Do you see it?
See what?
An Aha! insight my friend.

The maximal solution outlined above has contiguous elements! Maybe its 2 or 3 sections but within each section, elements are contiguous.

We will prove that you can always arrange matters so that this condition holds true.

First, notice that your examples above for 10 and 20 elements have one thing in common.
Both of them end at the last element.

So we can try beginning in general with the last element. We will create a decreasing sequence of numbers till we hit n/3. We list all the numbers starting from n down to floor(n/3) + 1. Call these numbers Range-1-start and Range-1-end respectively. You can see that Range-1-start and Range-1-end hold a decreasing sequence of numbers.

Next, we make another sequence starting from (Range-2-start) ceiling(Range-1-start/5) - 1 and continue down till we hit floor(Range-2-start/3) + 1.

This way, you go on till you run out of elements.

Creating this sequence for 1st 300 numbers goes like this.

first-sequence: all numbers from 300 down to 101 (inclusive)
second-sequence: all numbers from 20 down to 7. (inclusive, 21 and others are not there as including them would result in throwing away elements of the first sequence).
third sequence - Containing just the number 1.

This way, you can achieve maximality for any n. It seems you have already proved it for its clear enough.

In case you have not, here is one proof. Including any discarded element will cause spill of one or more than one entries and you don't have the bad situation of more elements getting added at expense of fewer getting discarded in the above construction.

Consider the case when A holds 300 elements. Discarded elements are between 21 and 100 (inclusive) and also between 2 and 6 (again inclusive). Putting 'boundary elements' looks like a dangerous idea; we will analyze it later (It is related with the subtle issue). Putting non-boundary elements leads to spillage of two and is definitely wasteful.

Now back to 'boundary elements' (and neighbors). Putting these back spills an element from the first sequence which cannot lead to the subtle issue as the second-kin of the spilled element from the first set is not in any of the two sequences. Putting it back will lead to one other spillage. So it is also wasteful. And there, finally you have seen true maximality.

Also, I will help you with an extample of one boundary element. For example in the case of A holding 300 elements, if you add 100 to the set S, you spill 300. Hmm. can subtle issue arise? And can it lead to inclusion of two elements at the cost of one 300?

No, it won't. For after inclusion of 100, 300's first-kin, you cannot include its second-kin, 50. Including it will spill 10.

This is the general idea behind the proof which can be easily made more rigorous.

NOTE: Some of my friends told me that it was difficult to make the connection between consecutive-ness of elements and the maximality of the set S. I agree.

Also, they say that 'I got lucky' by guessing the pattern with 2 examples which normally requires many turns. And they also say what if the pattern did not hold for long. I could have got stuck.

True. But my answer is that I did not stop at 2 trials (examples for 10 and 20). I tried a total of 20 trials! (From |A| = 10 to |A| = 30)!!

Thats what mathematical problem solving is about. This is what research is about and this is what interview forums do not do. And yes, this is as far from this research as anything might be but my primary attack is not at its unimaginable light-yearish distance from research but it is primarily again at interview forums which do not promote students to think.

They make you more of an analyst and less of a designer.
And the subject was called 'Design and Analysis of algorithms'. Absurd

Enough for now.

There are a few other problems, I would later include in this post itself. Right now, I want to rest.