# Sampling Multivariates and Processes [10/08]

Prepared by Zhu Jin, Srinivasan Rajappa & Yuhui Wu

### Stochastic Kinetic Model

Stochastic Kinetic Model is a highly multivariate jump process used to model chemical reaction networks, particularly those in biochemical and cellular systems.

This kind of model represents temporal evolution of a system among M populations interacting with one another through a set of $v$ events.

• V events happen with rates $h_k(x_t,c_k)$ where $k=1,…, v$. $c_k$ is rate constant ($k$ is a index of events) and $x_t$ represents M population
• Each event k changes population by $Δ_t$

The  Gillespie algorithm is as following:

1. At time $t=0$, the system is initialized into the initial state:
$(x(1),⋯x(M))←(x(1)0,⋯x(M)0)$.
2. Sample time to the next event no matter what event it is:
$τ∼Exponential(h_0(x,c) = \sum_{k=1}^v h_k(x,c_k))$.
3. sample the specific event on condition that there is event happening (the minimum of a set of exponential distributions is exponential distribution with the rate which is the sum of the event rates of those individual distributions. On condition that we have event which happens at the minimum of the times of the distribution then we can sample what specific event is happening), and this is a categorical distribution:
$k∼categorical(h_1h_0,…,h_vh_0)$.
4. Update time and the states of the populations: $t←t+τ, x←x+Δk$.
5. Repeat steps 2-5 until the termination condition is satisfied.

The reason that this algorithm sample many events, take the minimum and then move on is that the exponential distribution is used, which is a memory-less distribution, and at any time of the next event, there is no memory about what was done in the past.

This kind of models that involves different populations and that involves a set of events. It is common in both examples that involves a set of events. The events happen with different event rates and the rate involves a different rate constants. The rates also involve the current state of the system. The state of the system is expressed in terms of the different populations.

Susceptible-infectious-susceptible dynamics:

• $S+I→2I, rate = \alpha\cdot |S|\cdot |I|$
• $I→S, rate = \beta\cdot |I|$

For example, there are two different populations in susceptible dynamics – susceptible population and infectious population. These two populations interact with each other through two events – infection and recovery. The probability of getting infected is $R$ when an individual from susceptible population interact with one from infectious population. Since system has $S$ number of susceptible individuals and I number of infectious individuals then the total number of combinations that we can have if we choose one susceptible individual and one infectious individual is going to be $S*I$, the susceptible population times the infectious population. The total rate of the whole system is going to be alpha times susceptible population times the infectious population. It depends on how we look at it. If we look at it at the individual level then the rate is alpha but if we look at it at the system level then the rate is going to be larger than alpha. Similarly we have another event called recovery. The same happens here and also we know that after an infection event called the susceptible in the population is decreased by one and the infectious population is increased by one. After the recovery event the infectious population is decreased by one and the susceptible in the population is increased by one. This can actually be calculated from the chemical equations. Like we have before the reaction we have one I and after the reaction we have two Is. This means that I has increased by one. Here before the reaction we have one S and after the reaction we have zero $S$. This means that $S$ is decreased by one. The rate is defined per time united and also could be time unit or a function of time.

A lot of sampling-based algorithms as has been mentioned, but there is a challenge that condition on the observations, the sampling does not work so easy, because the sampling algorithm does not assume that there are observations. Thus, the Likelihood of Finite State Continuous Time Markov Process with Complete Observation is introduced.

### Sample rate constants from complete path stochastic kinetic dynamics

In a system from time $1$ to time $T$, there are $n$ events happened: $v_1,v_2,…,v_n$. Those events have happened at corresponding times $t_1,t_2,…,t_n$. At each time this event $vi$ happened with rate computed from the time of the previous event as

$$h_{v_i}(x_{t_{i−1}},c_{v_i})$$

and rate of all events is

$$h_0(x{t_{i−1}},c)=∑_k h_k(x_{t_{i−1}},c)$$

changing state from $x_0$ at time $t_0=0$ to $x_1,x_2,…,x_n.$

The state of this system is always $t_{i-1}$, and between $t_{i-1}$ to $t_i$ the rate of any event to happen is $h_0$ which is the sum of the rate for the individual events. The probability for this set of events and the corresponding happening times is going to be the probability distribution that the time to next event is from $t_i$ to $t_{i-1}$.

So the probability density function is

$$h_0(x_{t_{i−1}})exp(−h_0(x_{t_{i−1}},c)⋅(t_i−t_{i−1}))$$

This is a category distribution which suppose that there is an event happening then what is the event that happened? It’s probability is:

$$h_{v_i}(x_{t_{i−1}},c_{v_i})h_0(x_{t_{i−1}},c)$$

After the $n$th event and between the time of the $n$th event to this $t$, there is no further event, the probability of this is

$$exp(−h_0(x_{t_n},c)⋅(t_T−t_n))$$

Those are the three parts in the likelihood function. If the $h_0$ in the pdf function of the next event with $h_0$ of the multinomial distribution is cancelled, all of the $h_{v_i}$ are left over, as:

$$\prod_{i} h_{v_i}(x_{t_{i−1}},c_{v_i})$$

And then if the exponential distributions is merged, there will be $H_0$ from time $0$ to time $T$:

$$exp(−\sum_i h_0(x_{t_i},c)⋅(t_{i+1}−t_i))$$

So in the likelihood there are information about the jumps ($\prod_{i} h_{v_i}(x_{t_{i−1}},c_{v_i})$) and also the nonjumps ($exp(−\sum_i h_0(x_{t_i},c)⋅(t_{i+1}−t_i))$).

If $$h_k(x_t,c_k)=c_kg_k(x_t),$$

there is
$$L(c;x)=\prod_ic_{v_i}g_{v_i}(x_{t_{i−1}})exp(−\sum_i\sum_kc_kg_k(x_{t_i})⋅(t_{i+1}−t_i))$$$$∝\prod_{k∈events}c^{rk}_k exp(−\sum_{k∈events}c_k∫_t dtg_k(x_t))$$ where $$r_k=\sum_i 1(v_i=k)$$

Hence MLE of $c_k$ is the expected number of event $k$ to happen in unit time

$$c_k=r_k∫_tdtg_k(x_t)$$

### Sample from a proposal process with piecewise constant/linear event rate

An inhomogeneous or non-homogeneous Poisson process is similar to an ordinary Poisson process. The average rate of arrivals $\lambda(t)$ varies over time or in other words it can be understood that rate parameter of the process is a function of time.

In the example of Predator-Prey model, the population of predator/prey observed over time can be exhibiting a rate similar to $\lambda(t)$. In such situation where random phenomena occurs we can account this to Inhomogeneous-Poisson process.

Property of Independent Increments: Suppose that we have a random process $X=\{X_t: t \in T\}$ such that index set $T$ is either $N$ or $[0,\infty)$. Here X has independent increments if for $t_1, t_2, \cdots, t_n \in T$ with $t_1\le t_2 \cdots \le t_n$, the increments $X_{t_1}, X_{t_2}-X_{t_1}, \cdots , X_{t_n}-X_{t_n-1}$ are all independent.

A process that produces random points is a non-homogeneous Poisson process with rate $\lambda(t)$ if the counting process $N$ satisfies the following properties:

1. If $\{A_i:i \in I \}$ is a countable, disjoint collection of measurable subsets of $[0,\infty)$ then $\{N(A_i): i \in I\}$ is a collection of independent random variables.
2. If $A \subseteq[0,\infty)$ is measurable then $N(A)$ has the Poisson distribution with parameter m(A).

Some other properties include the following: For finding the number of events between $(t,t+dt]$ we can apply the same properties of poisson process for $\lambda(t)$. i.e.
$N_t(dt) \sim \mbox{Poisson}(\lambda(t) dt)$

This property gives an insight that sum of independent Poisson random variables is a Poisson with rate equaling sum of individual rates at any time $t$.

Now having said this, in contrast homogeneous Poisson process have event observations really close to one another. If not then this means that the rate cannot be assumed to be a constant. So if we apply the same notion for a non-homogeneous Poisson process then we may have several plausible paths between event rate at $t_1$ and event rate at $t_2$. The path refers to the variable rate that may be assumed at will.

Thus this brings us to objective of finding the best fit for a non-homogeneous Poisson process that will estimate the variable rate between two major observations. This engenders the requirement of sampling the random variable. The technique adopted is as follows:

1. First samples from the average of distribution is taken.
2. Later the time is adjusted by stretching there by accounting an accurate depiction of various time involved.

The sampling is done according to a uniform distribution. The time stretching exploits the fact that sum of Poisson random variable is again a Poisson random variable. This makes a sound basis for asserting the samples on a path between two observations.

This helps to understand the inhomogeneous in terms of homogeneous Poisson distribution. However, we have to mathematically understand in terms of $h$ function.

Assuming Y(t) is inhomogeneous Poisson process where $\lambda(t) = (1-t)\cdot h_0 + t\cdot h_1$ where $t \in (0,1)$ and assuming $X(s)$ is a homogeneous Poisson process with rate $(h_0 + h_1) / 2$ here $s$ is a state in the state space.

On solving using cumulative hazard function:

$M(s) = -\log(T_X > s) = s\cdot (h_0+h_1)/2$
and , $\delta(t) = h_0 \cdot t + t^2 (h_1-h_0)/2$

on inspection $\delta(t) = M(s) \Rightarrow t = (\sqrt{h_0^2 + (h_1^2 – h_0^2).s} – h_0)/(h_1 – h_0)$

Thus we can infer the following:

$X(\sqrt{h_0^2 + (h_1^2 – h_0^2).s} – h_0)/(h_1 – h_0))$ has distribution of $Y(t)$

The above proof shows the complete observation, complete path based on the two stochastic connected processes. This also helped to find probability path based on our sampling methods. Based on an accept reject filter we may be able to filter the correct probability path. This forms the basis for next sample.

Thus we are able to accept a sample for the event rates between two different observations within a time period.

### Metropolis-Hastings Sampling from Discrete Time Observations of Finite State Continuous Time Markov Process

Sample from finite state continuous time Markov process and discrete observations:

• $X_t\in{1,\dots,k}$
• State transition kernel $q_{i,j}=lim_{t\to 0} {(P(X_{t+\Delta t}=j|X_t=i)-1(i\equiv j)\over \Delta t}$ and initial state $X_0$
• Observation model $P(Y_t|X_t)$
• Observations $Y_0, Y_1, \dots, Y_T$

Plan:

• Initialise sample path to be consistent with the observed data. Repeat the following
• Sample rate constants from their full conditionals given the current sample path.
• For each of the T intervals, propose a new sample path consistent with its end-points and accept/reject it with a Metropolis-Hastings step
• Output the current rate constants.

Sample between Observations Assuming Homogeneous Event Rates
Now we begin to introduce the conception of Homogeneous Poisson process, the basic form of Poisson process is: Counting process $\{N(t),t\ge0\}$ is defined as Poisson process with rate $\lambda$, if:

• $N(0)=0$
• Independent increments (the numbers of occurrences counted in disjoint intervals are independent of each other)
• for any $s,t\ge0$, $P\{N(t+s)-N(s)=n\} = {(\lambda t)^n e^{-\lambda t}\over n!}, n=0,1,2, …$

Consider a homogeneous Poisson process with rate $\lambda$ on the finite interval $[0,T]$. Then the number of events happening in this interval has Poisson distribution, $R\sim{\rm Poisson}(\lambda T)$. In addition, the event times conditioned on $R$ have uniform distribution $U(0,T)$.
For any given $\lambda,p>0$, let $N\sim{\rm Poisson}(\lambda)$ and $X|N\sim{\rm Binomial}(N,p)$. Then $X\sim{\rm Poisson}(\lambda p)$
$P(X=k) = \sum_{i=k}^\infty P(X=k|N=i)P(N=i)\\ = \sum_{i=k}^\infty {i!\over k! (i-k)!} p^k {(1-p)}^{i-k}\times {\lambda^i e^{-\lambda}\over i!}\\ = \dots = {{(\lambda p)}^ke^{-\lambda p}\over k!}$
The following graph shows that Poisson process’s sample path: $T_{n}, n=1,2,3,…$ is the nth event’s happening time, let $T_{0}=0$. $X_{n}, n=1,2,3,…$ is the time interval between (n-1)th event and nth event.
Consider a homogeneous Poisson process with rate $\lambda$ on the finite interval $[0,T]$ Then the number of events happening in this interval has Poisson distribution, $R\sim{\rm Poisson}(\lambda T)$. In addition, the event times conditioned on $R$ have uniform distribution $U(0,T)$.
Given $N(t)=n$, the $n$ jumps $T_1,\cdots,T_n$ are uniformly distributed on interval $(0,t)$.
First, considering situation when n=1, which means event only happens 1 time in time interval [0,t], for $s\le t$,
$\mbox{P}(T_{1}\le s|N(t)=1)=\frac{\mbox{P}(N(s)=1)\mbox{P}(N(t)-N(s)=0)}{\mbox{P}(N(t)=1)}=\frac{\lambda se^{-\lambda s}e^{-\lambda (t-s)}}{\lambda te^{-\lambda t}}=\frac{s}{t}$
Which indicates that event happens uniformly for n=1, now we consider situation for n$\ge$2:
$\mbox{P}(t_{i}<T_{i}\le t_{i}+h_{i},i=1,\cdots,n|N(t)=n)\\$
$=\frac{\mbox{P}(\mbox{exactly 1 event in }(t_{i},t_{i}+h_{i}],i=1,\cdots,n,\mbox{ no events elsewhere in }(0,t))}{\mbox{P}(N(t)=n)}\\$
$= \frac{\lambda h_{1}\exp(-\lambda h_{1})\cdots\lambda h_{n}\exp\left(-\lambda h_{n}\right)\cdot\exp\left(-\lambda(t-h_{1}-\cdots-h_{n}\right)}{\exp\left(-\lambda t\right)(\lambda t)^{n}/n!}\\ = \frac{n!}{t^{n}}h_{1}\cdot h_{2}\cdot\dots\cdot h_{n}.$
Taking derivative over intervals $h_1,\cdots,h_n$, we get the desired results.