A Mathematical Introduction to Markov Chains
From the Preface:
I have taught various undergraduate courses on Markov chains and
stochastic processes at Virginia Tech over the years. In every
instance I used a different published text and was always disappointed
in their emphases and coverage. So I began to write short
sections of notes to bring out the ideas and organize the material the
way I thought appropriate. After my retirement in 2016 I decided
to put those notes together, fill in some of the gaps, and try to turn
them into a coherent treatment. This document is the current
status of that effort.
So far I have organized the material in an order that makes logical
sense to me and developed proofs based on that
organization. Here are some of the ideas that have guided
my selection and organization of the material.
- This is a topic in mathematics.
Although Markov chains are used in many applications, and specific
applications help to illustrate the ideas, I want the mathematics of
Markov chains to be the focus. Students should see topics from
their previous mathematics courses at work here: linear algebra,
infinite series, elementary real analysis and differential equations.
- There are some ideas that unify the different Markov processes considered here. One is the central role of the generator. For discrete time chains this is the matrix A=P-I.
For a countably infinite state space this is an infinite matrix (viewed
an operator on sequences). For Markov jump processes this is the
operator A of equation (11.18). And for Brownian motion it is 1/2 d2/dx2.
One of the common roles it plays is to identify families of martingales
... Although I won't try to develop semigroup theory per se, or
the full martingale problem equivalence, I do hope students will
recognize that there are some central ideas that are common to the
different types of Markov processes discussed and which hold the whole
business together conceptually.
- Many problems about Markov processes can be reduced to solving a
system of equations for functions of the state variable which involve
the generator. Calculation of hitting probabilities, mean hitting
times, determining recurrence vs. transience, and explosion
vs. non-explosion, are all considered in this way.
- One of my disappointments with some published texts is the
reliance on a merely intuitive understanding of the Markov property and
conditional expectations. Although the measure-theoretic
foundations of such things cannot be developed at the undergraduate
level, I want students to recognize that there is some underlying
mathematical structure which makes those parts of our reasoning
rigorous, even if we don't always draw it out fully. So I try to
explain in Chapter 3 how the whole conception of probability theory is
founded on the Kolmogorov model of an underlying probability space with
random variables as (measurable) functions. I present in that
chapter some important working tools (dominated and monotone
convergence theorems, the strong law, ...) but with no attempt to
prove them. In the chapters which follow I will use those tools
but will sweep some measure-theoretic technicalities under the rug
(such as qualifications that functions be measurable or "except on a set
of probability 0"). I readily acknowledge that as a consequence
my treatment falls short of the level of rigor expected in advanced
texts. ...
- Contemporary software makes it possible to carry out calculations
and perform simulations to exhibit various properties. Using Matlab for instance students can easily perform Markov chain
calculations which are larger than they could attempt by hand.
Facility with such calculations should be part of a contemporary
mathematics education, so I want this text to both presume and
cultivate those skills. The m-files referred in the text to are
available from a web page accompanying this text.
But now some candid acknowledgment of what this document is not.
I hesitate to call it a "book" for several reasons. There are
additional topics which ought to be included ... and some chapters
which need to be expanded and supplied with more problems. Many
ideas could use better introduction. The use of Matlab is rather thin
in the latter chapters. The difficulty level is uneven and
probably strays beyond what most undergraduates can handle in
places. Surely there are inconsistencies in my notation.
There are also certain to be many typos, misspellings, even
mathematically incorrect statements (though I hope not many) which
further editing would improve. Whether I will eventually
improve all these things only time will tell. I hope that what I
have written out so far at least provides an organization and
development of the material that others may find useful in some way,
even if it is not quite suitable for a course text (yet).
Prerequisites, in addition to the standard freshman-sophomore calculus
and differential equations courses, would be a real analysis or
advanced calculus course which covers the connections between
continuity and sequential convergence, the properties of infinite
series and power series, a linear algebra course which includes the
study of diagonalization of matricies and eigenvalues, and (for Chapter
11) a course on differential equations which includes the matrix
exponential. Although not strictly necessary it would also be
helpful if the student has encountered some basic ideas about random
variables previously. An appendix provides a brief synopsis of
some supporting mathematical topics that may have escaped
students' backgrounds...
You are welcome to download the document and associated Matlab files
from the following links. I just ask that you do not redistribute
or post it yourself, but refer anyone else who might want to use it to
this page.
Download A Mathematical Introduction to Markov Chains: click here for the most recent version (dated on the title page).
Download Matlab m-files: click here for the directory and then download the specific ones you want.