Tutorial 10: simulations of Markov models

Objectives

  • use conditional statements
  • understand how to use for loops to generate string of Markov states
  • perform matrix multiplication
  • generate a sequence of vectors by matrix multiplication
  • visualize the distribution of an array of states

Simulating Markov transitions

The behavior of a Markov model is random, so the next state cannot be predicted from the current state exactly. However, one can use a random number generator to produce a string of states, given its transition probabilities and an initial value. This is called a computer simulation, which does not give exact results but are rather numerical experiments, producing data that are consistent with a given model and its assumptions.

Below is a code to simulate one time step for a two-state Markov model. It generates a new state according to two transition probabilities, given an initial state. In the code below, the initial state is set to 1 and the transition probabilities are 0.6 and 0.4. The code uses a conditional statement to check what the initial state (in.state) is, and based on this use either transition probability from 1 to 2 (trans1to2) or transition probability from 2 to 1 (trans2to1) with a random number to make a random transition. A second conditional statement is used to assign new.state to a new state if the random number is less than the transition probability, and otherwise to leave new.state the same as in.state. The code also prints an error message if the initial state is neither 1 nor 2; it’s good practice to cover all possibilities, and not to assume that variables are set to the values that you expect. If this code is run multiple times, you will see that the transitions are random: sometimes the model remains in the same state, and other times it jumps to the other.

Exercises

  1. Modify the script below to generate the new state with transition probability 0.2 from state 1 to 2 and transition probability 0.8 from state 2 to 1, and with initial state 2.
  1. Modify the script below by placing everything starting with the decider assignment inside a for loop that repeats 20 times.
  1. The previous script replaced the value of state with a new value every iteration. Copy that script and modify it by pre-allocating a vector of states state_vec prior to the for loop by using the function rep() and the initial value to create a vector of sufficient length (21). Inside the loop, check the current value of state_vec (index i) in the conditional statements, and assign the new value to the next index of state_vec. After the loop is done, use table() as shown below to report how many of the states have values 1 and 2.

Matrix multiplication

The easiest way to perform the cumbersome calculations for matrix multiplication is to outsource them to a computer. R provides a special operation symbol just for this purpose, which is an asterisk surrounded by percent signs: %*%. To illustrate we will multiply the matrix A and the vector b, shown below:

\[ A = \left(\begin{array}{cc}3& 1 \\ -5 & 0\end{array}\right); \; b = \left(\begin{array}{c} 10 \\ -2 \end{array}\right) \] To perform this operation in R, we must first define the matrix and the vector, and then perform multiplication:

The probability distribution vector for a Markov model advances one time step at a time by multiplication with the transition matrix of the model. For example, let us take the same transition matrix as in section 11.2, and multiply it by the initial vector prob0=(0.5, 0.5) (which means that initially the model is in states 1 and 2 with equal probability 0.5).

The element M[i,j] (i-th row and j-th column) contains the probability of transition from state number j to state number i. Take care to enter the transition probabilities in the correct order, as the matrix() function by default places the element by column (first fills the first column, then the second) as you can see in the script above. The result shows that after 1 time step, the probability distribution vector changes from (0.5,0.5) to (0.525, 0.475). What about taking many time steps?

Computers can perform repetitive operations much better than humans, so we will take advantage of their arithmetic proficiency. Since each time step involves multiplication of the current probability vector by the transition matrix, this can be done automatically with a for loop. The only difficulty is that, while the transition matrix remains the same, the probability vector needs to be updated. There are two ways of handing this: 1) replace the old vector with the new, with the disadvantage that the previous vectors all get over-written in memory; 2) save all of the probability vectors in a rectangular matrix, which means we can plot all the probability vectors over time and see their evolution. The following script takes the first approach:

Very important information: to access only one row of a matrix, use the first (row) index and leave the second index blank; similarly, to access only one column of a matrix, use the second (colum) index and leave the first index blank. For example:

Note the the print function prints both the row and the column as row vectors.

Exercises

  1. Find and fix the error in the following matrix assignment for the matrix A:

\[ A = \left(\begin{array}{cc}0 & 1 \\ -5 & 10\end{array}\right) \]

  1. Copy the correct assignment of matrix A from the exercise above, assign a vector \(b=(0,1)\), multiply A by b and assign the result to the vector b, then print it out.
  1. Copy the code from the exercise above and use a for loop to repeat the matrix multiplication 10 times, each time assigning the result to vector b, then print out the resulting vector.
  1. Copy the code from the exercises and modify it to save of all the vectors that were generated by matrix multiplication, by pre-allocating the matrix bs with two rows and 11 columns with zeros and assign the vector b to the first column. Use the for loop to assign the next column of the matrix bs as the product of the matrix A and the current column of the matrix bs, then print out the whole array bs.

Barplots for histograms and arrays

Visualizing a large number of values can be done by using the function table() to count the frequencies of different values and then using barplot() to plot those frequencies, resulting in a nice-looking histogram. This script visualizes a vector of length 100 of 0s and 1s, generated by rbinom():

A matrix can be visualized using barplot() as well, with each column represented by a bar with colors representing values of the different rows. For example, the 2 by 4 matrix below is visualized like this:

barplot() can also be used to visualize the frequencies contained in an matrix array of values. Let us generate a matrix containing multiple strings of states, one in each row, with initial states all in column 1, and each successive column containing the states at that time step (this is not how the actual matrix of states is calculated in lab 7, this is just to generate a state matrix for illustration.)

The following script counts the number of each state (1 and 2) in each column, and plots those frequencies as bars. You will use this script to plot the frequencies of your simulations in part 3 of R lab 7.