Tutorial 11: simulations of Markov models

Objectives

use conditional statements
understand how to use for loops to generate string of Markov states
perform matrix multiplication
generate a sequence of vectors by matrix multiplication
visualize the distribution of an array of states

Simulating Markov transitions

The behavior of a Markov model is random, so the next state cannot be predicted from the current state exactly. However, one can use a random number generator to produce a string of states, given its transition probabilities and an initial value. This is called a computer simulation, which does not give exact results but are rather numerical experiments, producing data that are consistent with a given model and its assumptions.

To simulate one time step for a two-state Markov model, we need to use a conditional statement (if or else) to determine the current state of the variable. The following script illustrates this decision:

In Markov models, the probabilities of the next state depend on the current state. For example, if the transition probability from 0 to 1 is 0.5, and the transition probability from 1 to 0 is 0.2, that means to simulate the next state, we need to use different probability distributions: (0.5, 0.5) for state 0, and (0.2, 0.8) for state 1. This simulation can be done using the function sample() with the option prob to specify the probability distribution on the states of the model. Here is an example of using sample to randomly choose one value out of the vector 0:1 with probabilities 0.9 and 0.1.

The code below implements this simulation, using probability variables P0to1 (for transition from 0 to 1), and P1to0 (for transition from 1 to 0). Then, if the current state is 0, P(0) is 1-P0to1, and P(1) is P0to1; and if the current state 1, P(0) is P1to0 and P(1) is 1-P1to0. If you run the code multiple times you will see that the transitions are random: sometimes the model remains in the same state, and other times it jumps to the other.

Exercises

Modify the script below to generate the new state with transition probability 0.2 from state 0 to 1 and transition probability 0.8 from state 2 to 1, and with initial state 2.

Modify the script below by placing everything starting with the if statement inside a for loop that repeats 20 times.

The previous script replaced the value of state with a new value every iteration. Copy that script and modify it by pre-allocating a vector of states state_vec prior to the for loop by using the function rep() and the initial value to create a vector of sufficient length (21). Inside the loop, check the current value of state_vec (index i) in the conditional statements, and assign the new value to the next index of state_vec. After the loop is done, use table() as shown below to report how many of the states have values 1 and 2.

Matrix multiplication

The easiest way to perform the cumbersome calculations for matrix multiplication is to outsource them to a computer. R provides a special operation symbol just for this purpose, which is an asterisk surrounded by percent signs: %*%. To illustrate we will multiply the matrix A and the vector b, shown below:

\[ A = \left(\begin{array}{cc}3& 1 \\ -5 & 0\end{array}\right); \; b = \left(\begin{array}{c} 10 \\ -2 \end{array}\right) \]

To perform this operation in R, we must first define the matrix and the vector, and then perform multiplication:

The probability distribution vector for a Markov model advances one time step at a time by multiplication with the transition matrix of the model. For example, let us take the QR cell state model from section 12.2 and assign the transition probability from Q to R (PQtoR) to be 0.05 and the probability from R to Q (PRtoQ) to be 0.1. The script below assigns the transition matrix and multiplies it by the initial vector prob0=(0.5, 0.5) (which means that initially the model is in states 1 and 2 with equal probability 0.5).

The element M[i,j] (i-th row and j-th column) contains the probability of transition from state number j to state number i. Take care to enter the transition probabilities in the correct order, as the matrix() function by default places the element by column (first fills the first column, then the second) as you can see in the script above. The result shows that after 1 time step, the probability distribution vector changes from (0.5,0.5) to (0.525, 0.475). What about taking many time steps?

Computers can perform repetitive operations much better than humans, so we will take advantage of their arithmetic proficiency. Since each time step involves multiplication of the current probability vector by the transition matrix, this can be done automatically with a for loop. The only difficulty is that, while the transition matrix remains the same, the probability vector needs to be updated. There are two ways of handing this: 1) replace the old vector with the new, with the disadvantage that the previous vectors all get over-written in memory; 2) save all of the probability vectors in a rectangular matrix, which means we can plot all the probability vectors over time and see their evolution. The following script takes the first approach:

Very important information: to access only one row of a matrix, use the first (row) index and leave the second index blank; similarly, to access only one column of a matrix, use the second (colum) index and leave the first index blank. For example:

Note the the print function prints both the row and the column as row vectors.

Exercises

Find and fix the error in the following matrix assignment for the matrix A:

\[ A = \left(\begin{array}{cc}0 & 1 \\ -5 & 10\end{array}\right) \]

Copy the correct assignment of matrix A from the exercise above, assign a vector \(b=(0,1)\), multiply A by b and assign the result to the vector b, then print it out.

Copy the code from the exercise above and use a for loop to repeat the matrix multiplication 10 times, each time assigning the result to vector b, then print out the resulting vector.

Copy the code from the exercises and modify it to save of all the vectors that were generated by matrix multiplication, by pre-allocating the matrix bs with two rows and 11 columns with zeros and assign the vector b to the first column. Use the for loop to assign the next column of the matrix bs as the product of the matrix A and the current column of the matrix bs, then print out the whole array bs.

Barplots for histograms and arrays

Visualizing a large number of values can be done by using the function table() to count the frequencies of different values and then using barplot() to plot those frequencies, resulting in a nice-looking histogram. This script visualizes a vector of length 100 of 0s and 1s, generated by rbinom():

A matrix can be visualized using barplot() as well, with each column represented by a bar with colors representing values of the different rows. For example, the 2 by 4 matrix below is visualized like this:

If you generate a matrix of probability distribution vectors for a Markov model, you can use barplot() to make a plot of the distributions over time (as shown in Chapter 13).