Imagine that a particle is walking in a two-dimensional space, starting at the origin $(0, 0)$. At every time-step (or “epoch”) it takes a vertical step. At every step, the particle either moves up by $+1$, or down by $-1$. This walk is “unbiased” in the sense that the up/down steps are equiprobable.

In this post, we will discuss some natural questions about this “unbiased” or “simple” random walk. For example, how long will it take for the particle to return to zero? What is the probability that it will ever reach $+1$? When will it touch a given level $r$ for the first time? Contents of this post are a summary of the chapter “Random Walks” from the awesome “An Introduction to Probability Theory and Its Applications” (Volume I) by William Feller. Also, this is an excellent reference.
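Before diving into the questions, it may help to see the walk concretely. Below is a minimal Python sketch of the unbiased walk; the function name and the seeded generator are my own, not from the text.

```python
import random

def simple_walk(n, rng=None):
    """Return the positions S_0, S_1, ..., S_n of an unbiased simple walk."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    s, path = 0, [0]
    for _ in range(n):
        s += rng.choice((+1, -1))  # up or down, each with probability 1/2
        path.append(s)
    return path

print(simple_walk(10))
```

Every sampled path starts at the origin and moves by exactly one unit per epoch.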

**Edit: the Biased Case.** The biased walk follows

$$\Pr[X_i = +1] = p, \qquad \Pr[X_i = -1] = q = 1 - p.$$

The lazy walk follows

$$\Pr[X_i = +1] = p, \qquad \Pr[X_i = -1] = q, \qquad \Pr[X_i = 0] = r, \qquad p + q + r = 1.$$

In the exposition, we’ll first state the unbiased case and when appropriate, follow it up with the biased case.

**Definitions.** Let $S_n$ be the position of the particle at time $n$ if it starts at the origin. The probability that $S_n = k$ is

$$p_{n,k} := \Pr[S_n = k] = \binom{n}{\frac{n+k}{2}} \, 2^{-n},$$

where the binomial coefficient is taken to be zero unless $n + k$ is even and $|k| \le n$.
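As a quick sanity check, the formula can be evaluated exactly with Python’s `math.comb`; `p` below is my own shorthand for $p_{n,k}$.

```python
from math import comb

def p(n, k):
    """P[S_n = k] for the unbiased walk: C(n, (n+k)/2) / 2^n when n+k is even."""
    if (n + k) % 2 != 0 or abs(k) > n:
        return 0.0
    return comb(n, (n + k) // 2) / 2**n

print(p(4, 0))  # C(4,2)/2^4 = 6/16 = 0.375
# the probabilities over all reachable levels sum to one
print(sum(p(10, k) for k in range(-10, 11)))
```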

To enrich our vocabulary, we also define

$$\Pr_k[\,\cdot\,]$$

for the probability of an event when the walk starts at height $k$.

Define

$$u_n := \Pr[S_n = 0].$$

Clearly, $u_0 = 1$, and $u_n = 0$ whenever $n$ is odd.

Define

$$N_n(a, b) := \text{the number of paths of length } n \text{ from } a \text{ to } b.$$

The *hitting time*, or the *first passage time*, is defined as

$$T_k := \min\{\, n \ge 1 : S_n = k \,\}.$$

In particular, $T_0$ is the time to hit the origin for the first time.

**Expressing $u_{2n}$ without factorials.** For simple random walks,

$$u_{2n} = \binom{2n}{n} \, 2^{-2n} \sim \frac{1}{\sqrt{\pi n}}.$$

Notice that the right-hand side does not have factorials. The trick, in short, is to observe that

- a path of length $2n$ that ends at the origin contains exactly $n$ up moves and $n$ down moves; and
- since the up/down probabilities are equal, the quantity of interest is $\binom{2n}{n} 2^{-2n}$. Stirling’s approximation does the rest.
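A few lines of Python make the quality of the Stirling estimate visible; `u` is my own name for $u_{2n}$.

```python
from math import comb, pi, sqrt

def u(n):
    """u_{2n} = P[S_{2n} = 0] = C(2n, n) / 4^n."""
    return comb(2*n, n) / 4**n

# the ratio of the exact value to 1/sqrt(pi n) tends to 1
for n in (10, 100, 1000):
    print(n, u(n), 1 / sqrt(pi * n), u(n) * sqrt(pi * n))
```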

**The Reflection Principle.** Consider a simple (possibly lazy) random walk. Let $a, b$ be positive integers. The following two quantities are equal:

- the number of paths of length $n$ from $a$ to $b$ that touch or cross the origin, and
- the number of paths of length $n$ from $-a$ to $b$.

In short,

$$N_n(a, b) - N_n^{+}(a, b) = N_n(-a, b),$$

where $N_n^{+}(a, b)$ counts the paths from $a$ to $b$ that never touch or cross the origin. Since there is a bijection between the paths and the probabilities,

$$\Pr_a[\,S_n = b \text{ and } S_t = 0 \text{ for some } t \le n\,] = \Pr_{-a}[\,S_n = b\,].$$

This also implies that

$$\Pr_a[\,S_n < 0\,] = \Pr_a[\,S_n > 0 \text{ and } S_t = 0 \text{ for some } t \le n\,].$$

That is, the probability that $S_n$ is negative after $n$ steps is the same as the probability that $S_n$ is positive _but_ the path has touched or crossed the origin.

*Proof:* For every path that starts at $a$ and touches zero for the first time at epoch $t$, there is another, symmetric, path that starts at $-a$ and touches zero for the first time at epoch $t$: mirror the initial segment up to $t$ and keep the rest.
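The bijection can be checked by brute force for small paths; the enumeration below is my own sketch and counts $\pm 1$ paths directly.

```python
from itertools import product

def count_paths(n, start, end, avoid_zero=False):
    """Count length-n paths of +/-1 steps from start to end; with avoid_zero,
    count only those that never touch the origin along the way."""
    total = 0
    for steps in product((+1, -1), repeat=n):
        pos, ok = start, True
        for s in steps:
            pos += s
            if avoid_zero and pos == 0:
                ok = False
                break
        if ok and pos == end:
            total += 1
    return total

# paths from a to b that avoid the origin = N_n(a, b) - N_n(-a, b)
n, a, b = 8, 2, 2
print(count_paths(n, a, b, avoid_zero=True),
      count_paths(n, a, b) - count_paths(n, -a, b))
```

Because the steps are $\pm 1$, a path starting above zero cannot go negative without touching zero, so checking `pos == 0` suffices.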

**An Application of the Reflection Principle: the Ballot Theorem.** The reflection principle can be used to give a closed-form expression for the probability that a path stays strictly above the origin all along.

**The Ballot Theorem.** The number of paths of length $n$ from the origin to $b > 0$ that never touch or cross zero after the first step is

$$\frac{b}{n} \, N_n(0, b).$$

**A Return to the Origin.** Suppose a walk, starting at the origin, returns to the origin at epoch $2n$. It is clear that this path takes $n$ steps up and $n$ steps down, in some order. The probability that this happens is

$$u_{2n} = \binom{2n}{n} \, 2^{-2n} \sim \frac{1}{\sqrt{\pi n}}$$

as $n \to \infty$.

**The First Return to the Origin.** Let $f_{2n}$ denote the probability that the path returns to the origin for the first time at epoch $2n$. Suppose a (possibly not the first) return to the origin happens at epoch $2n$. Then, the first return could have happened at any of the epochs $2, 4, \dots, 2n$. Thus

$$u_{2n} = f_2 \, u_{2n-2} + f_4 \, u_{2n-4} + \cdots + f_{2n} \, u_0.$$
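This renewal equation can be inverted numerically to recover the $f_{2n}$ from the $u_{2n}$; the snippet below (my own) also checks the recovered values against the known closed form $f_{2n} = u_{2n}/(2n-1)$.

```python
from math import comb

def u(n):
    """u_{2n} = C(2n, n) / 4^n."""
    return comb(2*n, n) / 4**n

N = 12
f = [0.0] * (N + 1)  # f[n] stands for f_{2n}
for n in range(1, N + 1):
    # u_{2n} = sum_{k=1}^{n} f_{2k} u_{2n-2k}  =>  solve for f_{2n}
    f[n] = u(n) - sum(f[k] * u(n - k) for k in range(1, n))

print([round(x, 6) for x in f[1:5]])  # f_2 = 1/2, f_4 = 1/8, ...
```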

**The Relation between $f_{2n}$ and $u_{2n}$ for simple or lazy simple random walks.**

$$f_{2n} = \frac{u_{2n}}{2n - 1}.$$

Using the bound $u_{2n} \sim 1/\sqrt{\pi n}$ and integrating in $n$, we get

$$\Pr[T_0 > 2n] = \sum_{k > n} f_{2k} \sim \frac{1}{\sqrt{\pi n}}.$$

**No Return to the Origin.** For simple random walks, the probability that a return to the origin happens at epoch $2n$ is the same as the probability that a return never occurs in $2n$ steps:

$$\Pr[\,S_1 \ne 0,\ S_2 \ne 0,\ \dots,\ S_{2n} \ne 0\,] = \Pr[\,S_{2n} = 0\,] = u_{2n}.$$

The above two observations, together, imply the following.

For a simple (possibly lazy) random walk,

$$\Pr[T_0 < \infty] = \lim_{n \to \infty} \bigl(1 - \Pr[T_0 > 2n]\bigr) = 1;$$

that is, a return to the origin is certain.

**The First Return to the Origin (Revisited).** The above interpretation gives

$$f_{2n} = u_{2n-2} - u_{2n}.$$

This is equivalent to saying that the non-returning paths up to epoch $2n - 2$ are of two kinds: those that return at $2n$ (for the first time), and those that still do not return at $2n$. Using $u_{2n} \sim 1/\sqrt{\pi n}$, we can show that

$$f_{2n} \sim \frac{1}{2\sqrt{\pi} \, n^{3/2}}$$

as $n \to \infty$.

Integrating in $n$ for large $n$, we get that for a simple walk,

$$\mathbb{E}[T_0] = \sum_{n \ge 1} 2n \, f_{2n} = \infty.$$

Contrast this with what we already know for a simple walk:

$$\Pr[T_0 < \infty] = \sum_{n \ge 1} f_{2n} = 1.$$
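Numerically, the two sums behave just as claimed: the $f_{2n}$ add up to one, while the partial sums of $2n\,f_{2n}$ grow without bound. In this sketch of mine, the recurrence $u_{2n} = u_{2n-2}\cdot\frac{2n-1}{2n}$ avoids huge binomial coefficients.

```python
N = 5000
u = [1.0]  # u[n] stands for u_{2n}
for n in range(1, N + 1):
    u.append(u[-1] * (2*n - 1) / (2*n))  # u_{2n} = u_{2n-2} * (2n-1)/(2n)

f = [0.0] + [u[n-1] - u[n] for n in range(1, N + 1)]  # f_{2n} = u_{2n-2} - u_{2n}

total = sum(f)                                    # -> 1 as N grows: return is certain
mean = sum(2*n * f[n] for n in range(1, N + 1))   # partial sums of E[T_0] keep growing
print(total, mean)
```

The first sum telescopes to $1 - u_{2N}$, so it creeps toward $1$ at rate $1/\sqrt{\pi N}$; the second behaves like $\sum n^{-1/2}$ and diverges.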

**First Return to the Origin after Starting from One.** The reflection principle can be used to show that

$$\Pr_1[\,T_0 = 2n + 1\,] = \frac{C_n}{2^{2n+1}},$$

where $C_n = \frac{1}{n+1}\binom{2n}{n}$ is the $n$th Catalan number.

**The Last Return to the Origin.** The probability that, up to and including epoch $2n$, the last visit to the origin occurs at epoch $2k$ is given by

$$\alpha_{2k,\,2n} = u_{2k} \, u_{2n-2k},$$

where $0 \le k \le n$.

The first factor is the probability that a return occurs at epoch $2k$, and the second factor is the probability that a path of length $2n - 2k$ never returns to the origin. (Recall that $u_{2m}$ is also the probability of no return in $2m$ steps.)

Using $u_{2m} \sim 1/\sqrt{\pi m}$ and writing $x = k/n$, we have

$$\alpha_{2k,\,2n} \approx \frac{1}{\pi \sqrt{k(n-k)}} = \frac{1}{n} \cdot \frac{1}{\pi \sqrt{x(1-x)}}.$$

Below is the plot of $\frac{1}{\pi\sqrt{x(1-x)}}$ (in blue) and its integral from zero to $x$ (in green). We remark that the integral equals $\frac{2}{\pi}\arcsin\sqrt{x}$.
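The discrete law behind the plot is easy to verify in Python: the $\alpha_{2k,2n}$ sum to one, and their partial sums track $\frac{2}{\pi}\arcsin\sqrt{x}$. The names below are my own.

```python
from math import comb, pi, asin, sqrt

def u(n):
    return comb(2*n, n) / 4**n

def alpha(k, n):
    """P[last visit to the origin within 2n steps happens at epoch 2k]."""
    return u(k) * u(n - k)

n = 500
print(sum(alpha(k, n) for k in range(n + 1)))  # the alphas sum to one

x = 0.25
discrete = sum(alpha(k, n) for k in range(int(x * n) + 1))
print(discrete, 2 / pi * asin(sqrt(x)))  # both close to 1/3
```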

**Staying on One Side for a Long Time.** An integration of the above shows that, for a fixed $x \in (0, 1)$ and sufficiently large $n$,

$$\Pr[\,\text{the last visit to the origin occurs before epoch } 2xn\,] \approx \frac{2}{\pi}\arcsin\sqrt{x}.$$

The curve of $\frac{2}{\pi}\arcsin\sqrt{x}$ looks like the following: steep near $x = 0$ and $x = 1$, and flat around $x = 1/2$.

This means an unbiased random walk is “more likely” to diverge either early or late (than diverging in the middle sections). In particular, the symmetry of the above curve implies that *the probability that a path of length $2n$ spends a time $2k$ on one side, either at the beginning or at the end, is*

$$\alpha_{2k,\,2n} = \alpha_{2n-2k,\,2n} \approx \frac{1}{\pi\sqrt{k(n-k)}}.$$

**Number of Sign-Changes.** The number of lead-changes in an $n$-trial game is proportional to $\sqrt{n}$. This is surprising because ordinarily, one would think that the number of side-changes is proportional to the length of the path. Specifically, suppose there are exactly $r$ sign-changes in a path of length $2n + 1$. The corresponding probability is

$$\xi_{r,\,2n+1} = 2\, p_{2n+1,\,2r+1} = 2 \Pr[\,S_{2n+1} = 2r + 1\,].$$

Observe that this probability decreases as $r$ increases with $n$ fixed. This means that a path is more likely to experience few sign-changes.

Suppose $r = x\sqrt{n}$ and $n$ is large. We know that if $k = x\sqrt{n}$ for some positive real $x$, then $\Pr[S_n \le k] \to \mathcal{R}(x)$, where $\mathcal{R}(x)$ is the area of the standard normal distribution in the interval $(-\infty, x]$.

The probability that there are at most $x\sqrt{n}$ sign-changes in $n$ epochs tends to $2\mathcal{R}(2x) - 1$.
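The sign-change law can be confirmed by exhausting all paths of a small odd length. In this check of mine, a sign-change is counted whenever two consecutive nonzero positions have opposite signs, which matches counting crossings through zero.

```python
from itertools import product
from math import comb

def p(n, k):
    if (n + k) % 2 != 0 or abs(k) > n:
        return 0.0
    return comb(n, (n + k) // 2) / 2**n

n = 11  # an odd path length; 2^11 paths in total
counts = [0] * (n + 1)
for steps in product((+1, -1), repeat=n):
    pos, last_sign, changes = 0, 0, 0
    for s in steps:
        pos += s
        if pos != 0:
            sign = 1 if pos > 0 else -1
            if last_sign != 0 and sign != last_sign:
                changes += 1
            last_sign = sign
    counts[changes] += 1

# exactly r sign-changes has probability 2 p_{n, 2r+1}
for r in range(n + 1):
    assert abs(counts[r] / 2**n - 2 * p(n, 2*r + 1)) < 1e-12
print(counts[:4])
```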

**Supremum.** Let $r \ge 1$. The probability that a path of length $n$ has a maximum of at least $r$ equals

$$\Pr[M_n \ge r] = \Pr[S_n = r] + 2\Pr[S_n > r].$$

This probability, obtained via the reflection principle, is the same as the probability that the said path touched or crossed the line $y = r$.

**Maximum.** Let $M_n$ be a random variable denoting the maximum of a path of length $n$. Let $\delta = 1$ if $n, r$ have different parity, and zero otherwise. Then

$$\Pr[M_n = r] = p_{n,\,r+\delta}.$$
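The maximum law can be validated by enumeration; the check below (my own) uses the equivalent form $\Pr[M_n = r] = p_{n,r} + p_{n,r+1}$, in which exactly one term is nonzero.

```python
from itertools import product
from collections import Counter
from math import comb

def p(n, k):
    if (n + k) % 2 != 0 or abs(k) > n:
        return 0.0
    return comb(n, (n + k) // 2) / 2**n

n = 10
dist = Counter()
for steps in product((+1, -1), repeat=n):
    pos, peak = 0, 0
    for s in steps:
        pos += s
        peak = max(peak, pos)
    dist[peak] += 1

# P[M_n = r] = p_{n,r} + p_{n,r+1} for every r >= 0
for r in range(n + 1):
    assert abs(dist[r] / 2**n - (p(n, r) + p(n, r + 1))) < 1e-12
print({r: dist[r] for r in range(4)})
```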

**First Passage Time/Hitting Time.** For a simple random walk, the probability that the first passage through $r$ occurs at epoch $n$ is given by

$$\phi_{n,r} = \frac{r}{n} \, p_{n,r}.$$

Moreover, as $r \to \infty$, the probability that the first passage through $r$ occurs before epoch $t r^2$ (for fixed $t > 0$) tends to

$$2\left(1 - \mathcal{R}\!\left(\tfrac{1}{\sqrt{t}}\right)\right).$$

This means, the **waiting time** for the first passage through $r$ scales as $r^2$.
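Here, too, a brute-force check for a small level confirms the $\frac{r}{n} p_{n,r}$ formula; the enumeration is my own.

```python
from itertools import product
from math import comb

def p(n, k):
    if (n + k) % 2 != 0 or abs(k) > n:
        return 0.0
    return comb(n, (n + k) // 2) / 2**n

def phi(n, r):
    """P[first passage through r happens exactly at epoch n]."""
    return r / n * p(n, r)

r = 2
for n in range(1, 13):
    first_at_n = 0
    for steps in product((+1, -1), repeat=n):
        pos = 0
        for i, s in enumerate(steps, 1):
            pos += s
            if pos == r:
                first_at_n += (i == n)  # first hit of r exactly at epoch n
                break
    assert abs(first_at_n / 2**n - phi(n, r)) < 1e-12
print(phi(4, 2))  # (2/4) * C(4,3)/16 = 0.125
```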

**The Ascent/Descent Generating Functions.** Let us define

$$\phi_n := \Pr[\,\text{the first ascent (first passage through } +1\text{) occurs at epoch } n\,].$$

That is, $\phi_n = \Pr[T_1 = n]$.

Analogously, define

$$\psi_n := \Pr[\,\text{the first descent (first passage through } -1\text{) occurs at epoch } n\,] = \Pr[T_{-1} = n].$$

The generating function for the probabilities $\phi_n$ is

$$\Phi(s) = \sum_{n \ge 1} \phi_n s^n = \frac{1 - \sqrt{1 - 4pq s^2}}{2qs},$$

where $p + q = 1$ and $0 \le s \le 1$. Likewise, the generating function for the probabilities $\psi_n$ is

$$\Psi(s) = \frac{1 - \sqrt{1 - 4pq s^2}}{2ps}.$$

Suppose $p < q$ (so that the walk has a stronger down-pull), and in particular, let $p = \frac{1-\epsilon}{2}$ and $q = \frac{1+\epsilon}{2}$ for some $\epsilon > 0$ so that $q - p = \epsilon$. Then

$$\Phi(1) = \frac{1 - (q - p)}{2q} = \frac{p}{q} = \frac{1 - \epsilon}{1 + \epsilon} < 1.$$

On the other hand, if $p \ge q$ (so that the walk has a stronger up-pull) then $\Phi(1) = 1$; this means that the walk eventually makes an ascent with certainty.

When $p \ge q$, we know that the walk will eventually make the first ascent. But how long does it take to make it? Observe that

$$\mathbb{E}[T_1] = \Phi'(1).$$

Thus

$$\mathbb{E}[T_1] = \frac{1}{p - q},$$

which is infinite in the unbiased case $p = q$.
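The series coefficients of the ascent generating function are $\phi_{2m+1} = C_m\,p^{m+1}q^m$, with $C_m$ the Catalan numbers, which makes both claims easy to test numerically. The iterative Catalan ratio in this sketch of mine avoids huge intermediate binomials.

```python
def first_ascent_probs(p, M):
    """phi_{2m+1} = C_m p^(m+1) q^m for m = 0..M, via C_{m+1} = C_m 2(2m+1)/(m+2)."""
    q, t, out = 1 - p, p, []
    for m in range(M + 1):
        out.append(t)
        t *= 2 * (2*m + 1) / (m + 2) * p * q
    return out

# down-pull (p < q): the total ascent probability is p/q < 1
down = sum(first_ascent_probs(0.4, 2000))
print(down)  # ~ 0.4/0.6

# up-pull (p > q): ascent is certain and takes 1/(p - q) epochs on average
up = first_ascent_probs(0.6, 2000)
print(sum(up))                                       # ~ 1
print(sum((2*m + 1) * t for m, t in enumerate(up)))  # ~ 1/0.2 = 5
```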

On the same note,

$$\mathbb{E}[T_{-1}] = \Psi'(1) = \frac{1}{q - p} \quad \text{when } q > p.$$

**The $r$th Return to the Origin.** An $r$th return to the origin at epoch $n$ has the same probability as a first passage through $r$ at epoch $n - r$, which equals $\phi_{n-r,\,r}$ defined above.

**Time Spent on the Positive Side.** The number of paths of length $2n$ such that $S_{2n} = 0$ and exactly $2k$ of its sides are above the axis is independent of $k$, and equal to

$$\frac{1}{n+1}\binom{2n}{n},$$

where $0 \le k \le n$.
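This equidistribution (often called the Chung–Feller theorem) can be confirmed by enumerating all paths of a modest length. In the sketch below (my own), a side counts as “above the axis” when at least one of its endpoints is positive.

```python
from itertools import product
from math import comb

n = 5
target = comb(2*n, n) // (n + 1)  # the claimed common count, a Catalan number
counts = [0] * (n + 1)            # counts[k]: paths with exactly 2k sides above

for steps in product((+1, -1), repeat=2*n):
    prev, above = 0, 0
    for s in steps:
        cur = prev + s
        if prev > 0 or cur > 0:   # this side lies above the axis
            above += 1
        prev = cur
    if prev == 0:                 # keep only paths with S_{2n} = 0
        counts[above // 2] += 1

print(counts, target)  # every count equals the target, independent of k
```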