A Mathematical Introduction to Markov Chains

by Martin Day

From the Preface:

I have taught various undergraduate courses on Markov chains and stochastic processes at Virginia Tech over the years. In every instance I used a different published text and was always disappointed in their emphases and coverage. So I began to write short sections of notes to bring out the ideas and organize the material the way I thought appropriate. After my retirement in 2016 I decided to put those notes together, fill in some of the gaps, and try to turn them into a coherent treatment. This document is the current status of that effort.

So far I have organized the material in an order that makes logical sense to me and developed proofs based on that organization. Here are some of the ideas that have guided my selection and organization of the material.

This is a topic in mathematics. Although Markov chains are used in many applications, and specific applications help to illustrate the ideas, I want the mathematics of Markov chains to be the focus. Students should see topics from their previous mathematics courses at work here: linear algebra, infinite series, elementary real analysis and differential equations.
There are some ideas that unify the different Markov processes considered here. One is the central role of the generator. For discrete time chains this is the matrix A=P-I. For a countably infinite state space this is an infinite matrix (viewed an operator on sequences). For Markov jump processes this is the operator A of equation (11.18). And for Brownian motion it is 1/2 d²/dx². One of the common roles it plays is to identify families of martingales ... Although I won't try to develop semigroup theory per se, or the full martingale problem equivalence, I do hope students will recognize that there are some central ideas that are common to the different types of Markov processes discussed and which hold the whole business together conceptually.
Many problems about Markov processes can be reduced to solving a system of equations for functions of the state variable which involve the generator. Calculation of hitting probabilities, mean hitting times, determining recurrence vs. transience, and explosion vs. non-explosion, are all considered in this way.
One of my disappointments with some published texts is the reliance on a merely intuitive understanding of the Markov property and conditional expectations. Although the measure-theoretic foundations of such things cannot be developed at the undergraduate level, I want students to recognize that there is some underlying mathematical structure which makes those parts of our reasoning rigorous, even if we don't always draw it out fully. So I try to explain in Chapter 3 how the whole conception of probability theory is founded on the Kolmogorov model of an underlying probability space with random variables as (measurable) functions. I present in that chapter some important working tools (dominated and monotone convergence theorems, the strong law, ...) but with no attempt to prove them. In the chapters which follow I will use those tools but will sweep some measure-theoretic technicalities under the rug (such as qualifications that functions be measurable or "except on a set of probability 0"). I readily acknowledge that as a consequence my treatment falls short of the level of rigor expected in advanced texts. ...
Contemporary software makes it possible to carry out calculations and perform simulations to exhibit various properties. Using Matlab for instance students can easily perform Markov chain calculations which are larger than they could attempt by hand. Facility with such calculations should be part of a contemporary mathematics education, so I want this text to both presume and cultivate those skills. The m-files referred in the text to are available from a web page accompanying this text.

But now some candid acknowledgment of what this document is not. I hesitate to call it a "book" for several reasons. There are additional topics which ought to be included ... and some chapters which need to be expanded and supplied with more problems. Many ideas could use better introduction. The use of Matlab is rather thin in the latter chapters. The difficulty level is uneven and probably strays beyond what most undergraduates can handle in places. Surely there are inconsistencies in my notation. There are also certain to be many typos, misspellings, even mathematically incorrect statements (though I hope not many) which further editing would improve. Whether I will eventually improve all these things only time will tell. I hope that what I have written out so far at least provides an organization and development of the material that others may find useful in some way, even if it is not quite suitable for a course text (yet).

Prerequisites, in addition to the standard freshman-sophomore calculus and differential equations courses, would be a real analysis or advanced calculus course which covers the connections between continuity and sequential convergence, the properties of infinite series and power series, a linear algebra course which includes the study of diagonalization of matricies and eigenvalues, and (for Chapter 11) a course on differential equations which includes the matrix exponential. Although not strictly necessary it would also be helpful if the student has encountered some basic ideas about random variables previously. An appendix provides a brief synopsis of some supporting mathematical topics that may have escaped students' backgrounds...

You are welcome to download the document and associated Matlab files from the following links. I just ask that you do not redistribute or post it yourself, but refer anyone else who might want to use it to this page.

Download A Mathematical Introduction to Markov Chains: click here for the most recent version (dated on the title page).

Download Matlab m-files: click here for the directory and then download the specific ones you want.