Martin Gardner, the first Mathemagician, the first editor of the Mathematical Games column in Scientific American, and the author of over 70 thought-provoking books on topics mostly related to mathematics, physics, logic, and philosophy, passed away on May 22 this year.

Given that he was 95, it had to happen sooner or later, but his passing still came as a blow to me. I am a huge admirer of Gardner's books. I love trying out new puzzles, and to me Gardner's books were a godsend at a time when all I could see around me were books by George Summers, Shakuntla Devi and Ravi Narula. I did not find those books to be much good: I could solve most of their problems when I tried them, and most of the time I did not feel like trying them at all, as they were too boring -- to me, they looked like problems of the grocery-bill variety. I was of the belief that good puzzles ought to have some sort of life in them. Essentially, what I was looking for were questions that required an Aha! moment, as Mr. Gardner put it.

Then I ran across several of his books - Aha! Insight, Aha! Gotcha, Puzzles and Diversions, Knotted Doughnuts, Penrose Tiles to Trapdoor Ciphers, The Colossal Book of Mathematics and a few others. These books made mathematics look like a sport. I am not saying they made mathematics look easy -- they made mathematics look fun. I, too, joined the group of people whose lives were affected by Gardner. As Conway, Berlekamp and Guy put it in their Winning Ways for your Mathematical Plays, Gardner truly brought more mathematics to more millions than anyone else.

If you have heard of a good puzzle that you think requires an Aha! moment, chances are that Martin Gardner first popularized it in one of his columns or books.

His passing certainly shocked me. But Martin Gardner has left an undying legacy behind him. You should definitely seek out his books if you have not read them yet. I promise, it will be an Aha! experience.

I salute you, Mr. Gardner. Farewell, Sir.

## Thursday, June 3, 2010

## Tuesday, February 2, 2010

### Analyzing Riffle Shuffles

Hello Everybody,


I am writing this narrative to describe some properties of the riffle shuffle that we analysed in our Markov Chains class recently. It is in three parts, separated by rows of asterisks. If you want to skip to the technical material, read the last part - the one that follows the final row of stars.

"No, no, no, it's not possible." I checked the board again for the fourth (or fifth) time. But there it was. Sitting on the board was a masterpiece of reasoning which produced that result.

I was still in the *strong defiance* stage. The three stages of digesting (rather, assimilating) a hard-to-believe truth that someone in some corner of the world once pointed out huddled through my mind: *Strong Defiance, Mild Reluctance... finally Admittance*.

The result was so awe-inspiring that I immediately decided I was going to blog about it. I told Prof Randall that I was about to blog on the topic. She gave a Dumbledoric smile. But I knew I had undertaken some responsibility. I started searching my mind for the way the lecture began.

*"No, this is not where I want to start"*, I told myself. After some pondering over the issue, I knew where to start. Enjoy the rest of the post.

**********************************************************************************

It started rather innocuously. Prof Randall was teaching something about hypercubes - and I drifted. I suddenly recalled a movie called Hypercube and my attention was partly swayed. When I came back to myself, I had skipped a few sentences. *"Damn, not again"*, I cursed myself. I cannot understand why the word hypercube triggers imagery of sci-fi movies in my brain, even in lectures.

But it was okay. I was able to follow the other parts of the lecture, including the hypercube part itself. A few moments later, a doubt grabbed me. I wanted to get it clarified and lo -

*I forgot my doubt*. This has happened to me before as well. But today was a different day: two hated things happened in the same lecture. The irony is that I cannot even call it ironic, as it was too moronic.

As it were, I was sitting there, partly distracted by the latest turn of events, when a voice from behind (a question that Prateek Bhakta asked) brought me to my senses and made me attend to the rest of the lecture, which finished with a wonderful demonstration of some curious properties of riffle shuffles. That demonstration is the heart of today's post and forms the last part.

**********************************************************************************

So, what in Merlin's name is a Riffle Shuffle? Well, it is a way to shuffle a deck of cards which, as we will shortly see, randomizes the order of the cards in the deck very quickly. Our analysis here uses some parts of the lecture but also draws heavily from other sources, which I will mention towards the end of this post.

Let us begin the tour with a fairly simple question. You have a deck of cards labelled $1, 2, \ldots, n$ from top to bottom on a table. Let us call this initial ordering of the cards the identity permutation.

When can you claim that your shuffles have randomized the order of the cards in the deck? When indeed?

Put simply, this question asks "how many shuffles suffice?"

At a certain level, you might feel that this question asks you to define randomness, for which we do not have a very comfortable mathematical notion. But still, in some sense, we can answer this question. We rely on what we call *total variation distance* to save the day.

Informally, the idea is to estimate the "distance" between two probability distributions over all the possible permutations of the cards in the deck.

The total variation distance between two probability distributions $P$ and $Q$ is given by $||P - Q|| = \frac{1}{2} \cdot \sum_{\pi \in S_n} \vert P(\pi) - Q(\pi) \vert$, where $\pi$ is a permutation and $S_n$ is the group of all possible permutations.
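To make the definition concrete, here is a small Python sketch (the function name `total_variation` is mine, not from the lecture) that computes this distance for distributions over $S_3$ stored as dictionaries:

```python
from itertools import permutations

def total_variation(p, q):
    """Total variation distance: half the L1 distance between two
    probability distributions given as {outcome: probability} dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

# Example on S_3: the uniform distribution vs. a distribution
# concentrated entirely on the identity permutation.
s3 = list(permutations(range(3)))          # all 6 permutations
uniform = {pi: 1 / len(s3) for pi in s3}
point = {s3[0]: 1.0}                       # all mass on the identity
print(total_variation(uniform, point))     # 5/6 ≈ 0.8333
```

Shuffling drives the total variation distance between the deck's distribution and `uniform` towards zero.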

You pick one of these distributions to be the uniform distribution and the other to be the distribution you generate after $k$ shuffles of your scheme. You hope to develop a scheme for which $k$ is not inordinately large (of course, quantities of the order of $52!$ are an abomination, but so are $200$, $100$ or even $50$ shuffles).

Hmmmm... what should we do? What indeed? The first idea might be to invoke what we call the coupon collector. That problem involves something like throwing an $n$-sided die, and asks you to estimate the expected number of throws after which you can claim that every face has turned up at least once. The answer can be obtained by summing the expectations of geometric random variables.
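As a quick sanity check, here is a Python sketch (names are mine) comparing the closed form $n \cdot H_n$ with a simulation for a six-sided die:

```python
import random

def coupon_collector_expected(n):
    # E[throws] = n * H_n: once i faces have been seen, the next new
    # face is a geometric variable with success probability (n - i)/n,
    # so it takes n/(n - i) throws in expectation.
    return sum(n / (n - i) for i in range(n))

def coupon_collector_trial(n, rng):
    seen, throws = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        throws += 1
    return throws

rng = random.Random(0)
n = 6
avg = sum(coupon_collector_trial(n, rng) for _ in range(20000)) / 20000
print(coupon_collector_expected(n))  # 14.7 for a six-sided die
print(avg)                           # simulation lands close to 14.7
```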

Here, in card shuffling, the idea is to keep removing the top card and inserting it someplace in the deck at random. The *stopping rule* will be to say that you are done when you place the $n^{th}$ card someplace in the deck.

The idea for this is again intuitively obvious. There are two things that you will need a proof for: one, that it really achieves small total variation distance from the uniform distribution; two, that it does so within the stated time bound.

That the first holds can be understood this way: the order of the cards below the original bottom card is truly random. When $i$ cards are already below the original bottom card, it takes $\frac{n}{i+1}$ moves in expectation for the next card to land in the desired position (i.e., anywhere below the bottom card). Summing over $i$ explains the roughly $n \log n$ moves you obtain if you shuffle this way.
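Summing those expectations for a standard deck gives the concrete number; a quick check in Python (the function name is mine):

```python
def top_to_random_expected(n):
    # sum of n/(i+1) over i = 0..n-1, i.e. n times the n-th harmonic number
    return sum(n / (i + 1) for i in range(n))

print(top_to_random_expected(52))  # ≈ 236 moves for a 52-card deck
```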

But you are not happy with this. With $n = 52$ cards, this works out to more than $200$ shuffles. But we are not out of ideas yet. Not so fast. We will arrive at our target in fewer than 10 shuffles. Below we show how.

This technique is called the Riffle Shuffle (henceforth RS). It involves cutting the deck into two halves, interleaving them in some fashion (so that you obtain at most two interleaved increasing sequences), and repeating this process a few times. If you label the cards in the left hand all with a $0$ and those in the right hand all with a $1$, the two increasing sequences are those that are $0$-identified and those that are $1$-identified. This method arrives at a distribution close to uniform remarkably fast.
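Here is a sketch of one such shuffle in Python, following the standard Gilbert-Shannon-Reeds model (binomial cut, then drop cards with probability proportional to packet size); the function name is my own:

```python
import random

def riffle_shuffle(deck, rng):
    """One riffle shuffle under the Gilbert-Shannon-Reeds model."""
    n = len(deck)
    t = sum(rng.randrange(2) for _ in range(n))      # binomial(n, 1/2) cut
    left, right = deck[:t], deck[t:]
    out, i, j = [], 0, 0
    while i < len(left) or j < len(right):
        # Drop the next card from a packet with probability
        # proportional to that packet's remaining size.
        if rng.randrange((len(left) - i) + (len(right) - j)) < len(left) - i:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out

rng = random.Random(1)
print(riffle_shuffle(list(range(1, 11)), rng))
```

Starting from the identity, the output of a single shuffle always consists of at most two interleaved increasing sequences, exactly as described above.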

Let's ask a simple question first. How many distinct RSs can you perform on a deck of $n$ cards? It turns out that if you have $t$ cards in one of your hands, the number of possible RSs other than the identity is $\binom{n}{t} - 1$ (why?). Adding over all $t$ and counting the identity once, you get $2^n - n$ as the total number of possible RSs.

Let's also spare a look at the probability distribution on $S_n$ induced by a single RS. It is fairly simple to work out, and I will leave it to the reader to think about why it comes out the way it does:

$Rif(\pi) = \frac{n+1}{2^n}$ if $\pi$ is the identity permutation;

$Rif(\pi) = \frac{1}{2^n}$ if $\pi$ consists of exactly two increasing sequences;

$Rif(\pi) = 0$ otherwise.
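You can verify this distribution by brute force for a small deck. Each shuffle corresponds to a $0/1$ word of length $n$ (the number of $0$s fixes the cut), and every word occurs with probability $\frac{1}{2^n}$. A Python sketch (names are mine):

```python
from itertools import product
from collections import Counter

def shuffle_from_word(deck, word):
    """Apply the riffle shuffle encoded by a 0/1 word: cut off one card
    per 0-bit, then deal left-packet cards at the 0 positions and
    right-packet cards at the 1 positions, keeping relative order."""
    t = word.count(0)
    left, right = iter(deck[:t]), iter(deck[t:])
    return tuple(next(left) if b == 0 else next(right) for b in word)

n = 4
deck = tuple(range(1, n + 1))
counts = Counter(shuffle_from_word(deck, w) for w in product((0, 1), repeat=n))

print(counts[deck])   # 5  = n + 1 words yield the identity
print(len(counts))    # 12 = 2^n - n distinct permutations
print({c for p, c in counts.items() if p != deck})  # {1}: each other permutation comes from one word
```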

We are doing well. Now it's time for the leap. *"No, no, no, it's not possible"*, I checked the board again. The thing is, you can study what we call the inverse riffle shuffle (henceforth IRS) to analyze the number of riffle shuffles needed to randomize the deck. Both have the same mixing time! And that is amazing. Too amazing.

So, let me define the IRS and tell you about the curious properties that make this beautiful concept easier to understand. (The material below largely draws from the wonderful Proofs from THE BOOK by Martin Aigner and Günter M. Ziegler.)

An inverse shuffle takes a subset of the cards in the deck, removes them from the deck, and places them on top of the remaining cards - while maintaining the relative order within both parts of the deck. Such a move is determined by the chosen subset: take all subsets with equal probability. Equivalently, assign a label $0$ or $1$ to each card, randomly and independently with probability $\frac{1}{2}$ each, and move the cards labelled $0$ to the top.
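In Python, the labelling description is only a few lines; this sketch (function name mine) performs one inverse riffle shuffle:

```python
import random

def inverse_riffle_shuffle(deck, rng):
    """Label each card 0 or 1 independently and uniformly, then move
    the 0-labelled cards to the top, preserving relative order."""
    labels = [rng.randrange(2) for _ in deck]
    zeros = [card for card, b in zip(deck, labels) if b == 0]
    ones = [card for card, b in zip(deck, labels) if b == 1]
    return zeros + ones

print(inverse_riffle_shuffle(list(range(1, 11)), random.Random(2)))
```

Run on the identity deck, the output is always two increasing runs stacked on top of each other - the mirror image of the forward shuffle.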

That the IRS is really the RS run backwards is easy to check: for instance, an inverse shuffle yields the identity permutation precisely when all the $0$-cards already lie above the $1$-cards.

So analyzing IRSs is good enough! Also, as informally noted above, the inverse shuffles give a probability distribution $Rif(\bar{\pi})$ such that $Rif(\bar{\pi}) = Rif(\pi^{-1})$. Further, the uniform distribution is pretty cool in that it satisfies $U(\pi) = U(\pi^{-1})$.

So, the following two quantities are the same:

$||Rif(\pi) - U|| = ||Rif(\bar{\pi}) - U||$.

Finally, we note when we should stop iterating the IRS so that we can state with confidence that the deck is randomized. Suppose you can do this after $k$ shufflings. Each shuffling, the IRS way, is governed by the bit you assign to each card. So you can proceed by assuming that you assign a random $k$-bit string to each card; each card's fate is then decided by the string it started out with. Two cards might still remain in the same relative order if they started out with the same bit string. This is avoided if all cards get distinct bit strings. So the problem now reduces to the Birthday Paradox. As I said, this was where I went through those stages -
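The birthday computation itself is easy to carry out. A Python sketch (names mine) computes the probability that all $52$ cards receive distinct $k$-bit strings, and finds the smallest $k$ for which this probability exceeds $\frac{1}{2}$:

```python
def prob_all_distinct(n, k):
    """Probability that n independent uniform k-bit strings are all
    distinct (the birthday paradox with 2^k 'days')."""
    m = 2 ** k
    p = 1.0
    for i in range(n):
        p *= (m - i) / m
    return p

n, k = 52, 1
while prob_all_distinct(n, k) < 0.5:
    k += 1
print(k)                                  # 11
print(round(prob_all_distinct(n, k), 3))  # ≈ 0.52
```

This $k \approx 2 \log_2 n$ behaviour is what the distinct-strings stopping argument delivers; a sharper analysis due to Bayer and Diaconis shows that about seven ordinary riffle shuffles already suffice for a real deck.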

*Strong Defiance, Mild Reluctance... finally Admittance*. I invite you to show why the reduction works. For the benefit of others, you can leave the proof as a comment.

Since it's 6 o'clock in the morning and I want to catch some rest, I will not do that now.

Have a Good Day

-Akash

### Some fundamentals of Markov Chains

Hello All,

This is a placeholder post. I will replace it with the contents that should be here soon.

Till then you might want to see the $7^{th}$ Chapter of Mitzenmacher-Upfal book


## Monday, January 11, 2010

### Some snapshots of Markov chains and Monte Carlo Methods

Sorry for choosing such a bad name for today's post. The problem is that this was the first lecture on the subject, and I did not really know what else to call it.

Anyway, this semester, spring 2010, I am enrolled in a course called Markov Chains and Monte Carlo Method (henceforth MCMC) here at Georgia Tech. Our instructor is the wonderful Professor Dana Randall, and she has an amazing knack for surprising you in each of her lectures. I do not understand how she communicates her lectures this nicely - which, in essence, is in my opinion an art of explaining things effectively, an art I am really horrible at. I guess to explain this *nicely* you have to understand *nicer*.

Anyway, enough belaboring; let me get to the point - but wait, let me get some other things out of the way first.

(The section below definitely needs some rework; I hope I will get around to doing that.)

The exposition in this series of lectures by Prof Randall will be mainly technique-driven. That is, we will see the techniques that have evolved from working with Markov chains to solve important problems in graph theory, statistical physics, molecular biology and other curious areas. We will begin our tour with some classical counting problems and see the power of Markov chains as they solve these problems for us, developing the appropriate context and motivation along the way.

Last but not least, since I do not fully trust my memory, I may skip some of what the Professor taught in class owing to my RAM (Memento-style temporary memory - well, not exactly Memento-style) and thus do a poor job of presenting the lecture's material (with high probability); sometimes I may add a few extra things from my own knowledge that could make the material a little easier to follow (with vanishingly small probability).

Now the lecture.

So, as I said above, we begin the tour with some classical counting problems.

So, let me just mention the following problems. Can you decide in polynomial time whether an arbitrary undirected graph has a perfect matching? If you can, can you explicitly find such a matching? Going further, can you count the number of perfect matchings in an arbitrary undirected graph? Lastly, can you return a random perfect matching from the graph (assuming the graph has some finite number $n$ of perfect matchings, this question asks you to select any one uniformly at random)?
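For a tiny graph you can answer all four questions by brute force. The Python sketch below (all names mine) enumerates the perfect matchings of a 4-cycle, which decides existence, exhibits one, counts them, and lets you sample one uniformly:

```python
import random

def perfect_matchings(n, edges):
    """Enumerate all perfect matchings of a graph on vertices 0..n-1
    (brute force -- fine for tiny graphs only)."""
    edge_set = {tuple(sorted(e)) for e in edges}
    results = []

    def extend(matching, remaining):
        if not remaining:
            results.append(frozenset(matching))
            return
        v = min(remaining)                  # match the smallest free vertex
        for u in remaining:
            if u != v and tuple(sorted((u, v))) in edge_set:
                extend(matching + [tuple(sorted((u, v)))], remaining - {u, v})

    extend([], frozenset(range(n)))
    return results

# The 4-cycle 0-1-2-3-0:
ms = perfect_matchings(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
print(len(ms) > 0)                  # True: a perfect matching exists
print(len(ms))                      # 2: the two "parallel" pairings
print(random.Random(0).choice(ms))  # one matching, chosen uniformly
```

Of course, the whole point of the course is that brute force dies immediately on large graphs, which is where Markov chains come in.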

If I tell you how to solve the $3^{rd}$, the first two problems become trivial. Also, if I tell you the answer to the second one, the first one becomes trivial. No sweat.

Thus, the first problem is no harder than the second one and the second one is no harder than the third one. What about the other way round?

It turns out that if you have access to an oracle which can answer the first question (viz., whether the graph has a perfect matching or not), you can answer the $2^{nd}$ question as well. And given oracle access to a device that solves the $3^{rd}$ question, you can answer the $4^{th}$ one. We will see in future posts how to do all of this (if I get around to writing some).

But let me remind you about the problem of counting perfect matchings in a bipartite graph. It's "cute" in the sense that it has a very nice linear-algebraic interpretation. It turns out that solving this counting problem is as hard as computing the permanent of a matrix. You know the permanent? It's just the determinant with all the minus signs gone. And surprisingly, the permanent is hard to compute, whereas the determinant (which involves some zig-zagging through signs) is easy. In fact, computing the permanent is $\#P$-complete.
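To see the "minuses gone" description in code, here is a naive Python permanent (exponential time, as one would expect for a $\#P$-complete problem; names are mine). For a bipartite graph, the permanent of its biadjacency matrix counts its perfect matchings:

```python
from itertools import permutations

def permanent(M):
    """Naive permanent: the determinant's sum over permutations,
    but with every sign forced to +1. Exponential time."""
    n = len(M)
    total = 0
    for sigma in permutations(range(n)):
        prod = 1
        for i, j in enumerate(sigma):
            prod *= M[i][j]
        total += prod
    return total

# Biadjacency matrix of the 4-cycle split into sides {0, 2} and {1, 3}:
# each left vertex is adjacent to both right vertices.
M = [[1, 1],
     [1, 1]]
print(permanent(M))  # 2: the 4-cycle has two perfect matchings
```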

Thus, given the above information, you can see that counting the number of perfect matchings in an arbitrary general graph must also be $\#P$-complete.

Now, let's see how the knowledge that a graph has a perfect matching helps in constructing one. Hmmm... So I tell you that I have a graph which has a perfect matching. What good is this information in finding one?

Pretty good, it turns out. We capture the essence of the above discussion by saying that graphs are self-reducible with respect to the perfect matching property. We will prove this later.

Now, I will turn to a special bipartite graph, namely the chessboard. It turns out that on a chessboard, the number of perfect matchings is easily computable. First, let's look at the following famous problem. You have an $n \times n$ chessboard. Can you tile it using $2 \times 1$ dominoes?

You can easily see that this is not possible for odd $n$; for even $n$ it clearly is. Too trivial. Now I ask: is it always the case that you can tile an arbitrary connected figure which has an even number of squares? The answer is no. Consider the traditional chessboard with two opposite corners removed. The removed squares are the same colour, say black. The number of white squares being greater, a domino tiling is no longer possible.
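Both claims are easy to check by machine for small boards. The recursive counter below (names mine) covers the lexicographically smallest free square with a horizontal or vertical domino and recurses; it finds $36$ tilings of the $4 \times 4$ board and none once two same-coloured corners are removed:

```python
def count_tilings(cells):
    """Count domino tilings of a region given as a frozenset of (x, y)
    squares. The smallest free square must be covered by a domino whose
    other half is a larger square, so trying only (x+1, y) and (x, y+1)
    enumerates each tiling exactly once."""
    if not cells:
        return 1
    x, y = min(cells)
    total = 0
    for other in ((x + 1, y), (x, y + 1)):
        if other in cells:
            total += count_tilings(cells - {(x, y), other})
    return total

board = frozenset((x, y) for x in range(4) for y in range(4))
print(count_tilings(board))      # 36 tilings of the 4x4 board

# Remove two opposite corners; (0, 0) and (3, 3) have the same colour
# since (x + y) is even for both.
mutilated = board - {(0, 0), (3, 3)}
print(count_tilings(mutilated))  # 0: no tiling exists
```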

Thus, it looks like the sufficient condition for tilability is an elusive beast. Not if Prof Randall has her way. Somewhere, in the course of her graduate studies, she made the following cute observation.

Consider the following special "marked" dominoes (along with an unmarked one).

Call these figures (a), (b), (c) and (d). The marked tiles (a), (b) and (c) respectively connect $(x, y)$ to $(x+1, y-1)$, $(x+1, y+1)$ and $(x+2, y)$.

Then Prof Randall makes a nice series of observations beginning with a very simple idea. She first says that if you can tile the board with unmarked dominoes only, then you can trivially put marks on them and obtain a marked tiling.

But a marked tiling is not what she is after. She is looking for a meaningful marking - one which can tell her something about the number of perfect matchings in this kind of graph. In short, she establishes a bijection between domino tilings and perfect matchings. The mental gymnastics she went through for this are not entirely clear to me, but I can try to impart some intuition by "cooking up" the following explanation. Possibly she arrived at the result by some other inventive idea, but the description below should help.

**Note:** *The discussion below uses the words domino and tile interchangeably.*

First, let us assume we are on an $n \times n$ grid where $n$ is even. (We will discuss the general case of $R \subseteq \mathbb{Z}^2$ later.) Further, to assist our discussion, imagine that the squares are coloured black and white as they are on a chessboard. Also, think of your dominoes as all being transparent. Now, put a red dot at the center of the half of each domino which lies over a black square. Let us identify this red dot over the black half as the starting point of the tile.

You can now imagine the tiles as being some sort of "teleporter" carrying the red dot from one tile to the other. They can carry it "horizontally through" the tile as shown in (c), or "diagonally" as in (a) and (b).

[I am not providing any diagrams, but should you need one, you can look here on page 9.]

Thus, you obtain a triangular lattice whose edges correspond to the directions the "teleporter" carried the red dot in. If you can chalk out a simply connected path for the red dot, starting at some left-boundary tile (source) and ending at some right-boundary tile (sink), you have basically figured out a tiling for the squares traversed on the original board.

Thus, figuring out a simply connected path on the triangular lattice corresponds to figuring out a tiling for the corresponding portion of the chessboard! Figure out a series of such paths, each beginning at some source $s_i$ and ending at some sink $t_i$, such that all the paths are non-intersecting, and you win.

(If paths intersect, your tiles overlap - that's why we need them all to be disjoint.) The series of paths from the $s_i$'s to the $t_i$'s is called a routing. Thus, the number of domino tilings of the chessboard is the same as the number of routings! Also, notice that the source-sink pairs $(s_i, t_i)$ are respected by all tilings. That is, if one routing has the set of sources $(s_1, s_2, \ldots, s_k)$ and the set of sinks $(t_1, t_2, \ldots, t_k)$, all other routings will have the same sets of sources and sinks. It's easy to see why: if it were not so, paths would intersect, leading to pathologies (not to mention other disasters like earthquakes, vampires, stinking socks, filth, etc.).

You might have noticed that, under the assumption that the region was tilable, the number of sources equals the number of sinks. Figure out why on your own. (It's too trivial.)

The same analysis extends to any other tilable region $R \subseteq \mathbb{Z}^2$.

But we have not yet said when a region is tilable. Relax, we are coming to that as well. We will do it using some tools from group theory. Our proof will be constructive: if a tiling exists, it will find one.

Allow me to relax a bit. I will resume posting soon.
