
In this blog, we will introduce some discrete distributions and their properties. Let's first consider counting distributions, which are useful for counting the number of claims in an insurance setting.

A counting distribution is a discrete distribution on the non-negative integers, representing the number of times a specific event (or set of events) occurs.

More formally, we can write $p_k = \Pr[N=k], k \in \mathbb{N}.$

And we can define the probability generating function (pgf) of a counting distribution as $P(z)=P_{N}(z) = \mathbb{E}[z^N] = \sum_{k=0}^{\infty}z^k p_k.$

The probability generating function can provide us with the mean and variance of the distribution, as we can see from the following:

$$\frac{d}{dz}\left(\sum_{k=0}^{\infty}z^k p_k\right) = \sum_{k=0}^{\infty}kz^{k-1} p_k$$

By setting $z=1$, then we will have $$\mathbb{E}[N]=\sum_{k=0}^{\infty}k p_k = \frac{d}{dz}\left(\sum_{k=0}^{\infty}z^k p_k\right)\bigg|_{z=1}$$

And if we take another derivative, we will have

$$\frac{d^2}{dz^2}\left(\sum_{k=0}^{\infty}z^k p_k\right) = \sum_{k=0}^{\infty}k(k-1)z^{k-2} p_k$$

If we set $z=1$, then we will have

$$\mathbb{E}[N(N-1)] = \sum_{k=0}^{\infty}k(k-1) p_k = \frac{d^2}{dz^2}\left(\sum_{k=0}^{\infty}z^k p_k\right)\bigg|_{z=1}$$

We can also see that we can recover the probabilities from the probability generating function.

Specifically,

$$\sum_{k=0}^{\infty}z^k p_k \bigg|_{z =0} = p_0$$

$$\frac{d}{dz}\left(\sum_{k=0}^{\infty}z^k p_k \right) \bigg|_{z =0}= \left(\sum_{k=1}^{\infty}kz^{k-1} p_k \right) \bigg|_{z =0} = p_1$$

$$\frac{1}{n!}\,\frac{d^n}{dz^n}\left(\sum_{k=0}^{\infty}z^k p_k \right) \bigg|_{z =0} = p_n$$

We can also have $$P^{(m)}(z) = \mathbb{E}\left[\frac{d^m}{dz^m}z^N\right] = \mathbb{E}\left[N(N-1)\cdots(N-m+1)\,z^{N-m}\right]\\ = \sum_{k=m}^{\infty}\frac{k!}{(k-m)!}\,z^{k-m}p_k = m!\, p_m+\sum_{k=m+1}^{\infty}\frac{k!}{(k-m)!}\,z^{k-m}p_k$$

And so $P^{(m)}(0) = m!\, p_m \implies p_m =\frac{P^{(m)}(0)}{m!}.$
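To make this concrete, here is a minimal sketch (assuming SymPy is available) that differentiates a pgf symbolically and recovers the probabilities; the Poisson pgf $\exp(\lambda(z-1))$, derived later in this post, is used as the test case.

```python
# A minimal sketch (SymPy assumed): differentiate a pgf at z = 0 and divide by m!
# to recover p_m. The Poisson pgf exp(lam*(z-1)), derived below, is the test case.
import sympy as sp

z, lam = sp.symbols("z lam", positive=True)
P = sp.exp(lam * (z - 1))  # Poisson pgf

for m in range(4):
    p_m = sp.diff(P, z, m).subs(z, 0) / sp.factorial(m)
    expected = sp.exp(-lam) * lam**m / sp.factorial(m)  # Poisson pmf at m
    assert sp.simplify(p_m - expected) == 0
    print(m, sp.simplify(p_m))
```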

Poisson distribution.

Insurance often deals with count data, such as the number of insurance claims, accidents, or losses, which are discrete non-negative values. The Poisson distribution is specifically designed to model the number of events occurring in a fixed interval of time or space, making it a natural choice for handling such count data.

For the Poisson distribution, the probability mass function of the number of claims is $$p_k = \Pr[N= k] = \frac{\exp(-\lambda) \lambda^{k}}{k!}, \qquad k = 0, 1, 2, \ldots$$

Now we can show some properties of the Poisson distribution. First, we compute the pgf and mgf:

$$P_{N}[z] = \mathbb{E}[z^N] = \sum_{k=0}^{\infty} z^k \exp(-\lambda)\frac{\lambda^{k}}{k!} = \exp(-\lambda)\sum_{k=0}^{\infty} \frac{\left(\lambda z\right)^{k}}{k!} = \exp(\lambda z-\lambda ) = \exp(\lambda (z-1))$$

And the mgf:

$$M_N[z] = \mathbb{E}[\exp(Nz)] = \sum_{k=0}^{\infty} \exp(z)^k \exp(-\lambda) \frac{\lambda^k}{k!}= \exp(-\lambda)\sum_{k=0}^{\infty} \frac{(\exp(z)\lambda)^k}{k!}= \exp(\lambda(\exp(z)-1))$$

More specifically, we have the following relationships for pgf and mgf:

$$M_{N}[z] = \mathbb{E}[\exp(Nz)] = \mathbb{E}[\exp(z)^N] = P_{N}[\exp(z)].$$

and

$$P_N[z] = \mathbb{E}[z^N] = \mathbb{E}[\exp(N\ln z)] = M_N[\ln z].$$

Note that the above relations apply to both discrete and continuous distributions, since the derivation does not depend on whether the expectation is a sum or an integral.

We can differentiate the moment generating function to get the first and second moments:

$$\frac{d}{dz}e^{\lambda(\exp(z)-1)}=e^{\lambda(\exp(z)-1)} \lambda\exp(z)$$

and

$$\frac{d^2}{dz^2}e^{\lambda(\exp(z)-1)}=e^{\lambda(\exp(z)-1)} (\lambda\exp(z))^2 + e^{\lambda(\exp(z)-1)} \lambda\exp(z)$$

And so $$\mathbb{E}[N] = \lambda, \mathbb{E}[N^2] = \lambda^2 + \lambda$$

so the variance will be $\operatorname{V}[N] = \mathbb{E}[N^2] - \mathbb{E}[N]^2 = \lambda = \mathbb{E}[N],$ which is nice since the variance equals the mean (equidispersion).
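As a quick numerical sanity check (a sketch assuming NumPy), we can simulate Poisson claim counts and confirm that the sample mean and sample variance are close:

```python
# Sketch (NumPy assumed): simulate Poisson claim counts and check equidispersion.
import numpy as np

rng = np.random.default_rng(0)
lam = 2.5                                 # hypothetical claim rate
claims = rng.poisson(lam, size=1_000_000)

print(claims.mean())  # ~ 2.5
print(claims.var())   # ~ 2.5, since V[N] = E[N] = lam
```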

In insurance, it is crucial to accurately assess and predict the potential financial risk associated with insuring a large number of policies or individuals. When the variance is equal to the mean, it means that the level of uncertainty or volatility in the claim frequency is consistent with the average number of claims. This stability in risk assessment allows insurers to make more reliable predictions and set appropriate premiums to cover potential losses.

If we have a number of independent groups of claims, each following a Poisson distribution, then the total number of claims also follows a Poisson distribution. Conversely, if the total number of claims is Poisson and each claim is assigned independently to one of several categories with fixed probabilities, then each category's claim count is again Poisson. The following theorems make this precise.

$\textbf{Theorem}$ We can state that if we have $n$ independent Poisson variables with parameters $\lambda_i$, denoted as $N_i$, then their sum $N= \sum_{i}N_i$ follows a Poisson distribution with a parameter equal to the sum of the individual parameters, which is $\lambda = \sum_i \lambda_i$.

$\textbf{Proof.}$ We can consider by looking into the pgf of our distribution. $P_{N}[z] = \mathbb{E}[z^N] = \mathbb{E}[z^{\sum_i N_i}] = \prod_i \mathbb{E}[z^{ N_i}] = \prod_i\exp(\lambda_i (z-1)) = \exp((\sum_i\lambda_i) (z-1)) = \exp(\lambda (z-1)) $

So $N$ follows the Poisson distribution with $\lambda$ as parameter.
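A quick simulation sketch (NumPy/SciPy assumed) illustrates the theorem: summing independent Poisson counts with rates $\lambda_i$ behaves like a single Poisson count with rate $\sum_i \lambda_i$. The rates below are arbitrary.

```python
# Sketch (NumPy/SciPy assumed): the sum of independent Poisson variables is Poisson
# with rate equal to the sum of the individual rates.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
lams = [0.5, 1.2, 2.3]                               # hypothetical individual rates
total = sum(rng.poisson(l, size=500_000) for l in lams)

for k in range(6):
    print(k, round(np.mean(total == k), 4), round(stats.poisson.pmf(k, sum(lams)), 4))
```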

$\textbf{Theorem}$ When we have a Poisson random variable $N$ with a mean of $\lambda$, and these events can be categorized into $m$ types with probabilities $p_1, \ldots, p_m$, independently of each other, then the numbers of events $N_1, \ldots, N_m$ corresponding to each event type $1, \ldots, m$ are mutually independent Poisson random variables. Their means are given by $\lambda p_1, \ldots, \lambda p_m$, respectively.

$\textbf{Proof.}$

$$\Pr[N_1 =n_1, N_2 =n_2, ..., N_m = n_m] = \Pr[N_1=n_1, N_2 = n_2, ..., N_m =n_m | N = n] \times \Pr[N=n] \\ = \left({n \choose n_1, n_2, ... , n_m} \prod_i p_i^{n_i}\right) \times \exp(-\lambda)\frac{\lambda^n}{n!}\\= \frac{n!}{\prod_i n_i!} \prod p_i^{n_i} \times\frac{\prod_i \exp(-p_i\lambda)\lambda^{n_i}}{n!} = \prod_{i} \exp(-p_i\lambda)\frac{{(\lambda p_i)}^{n_i}}{n_i!} $$

We also notice the marginal probability:

$$\Pr[N_i = n_i] = \sum_{n=n_i}^{\infty} \Pr[N_i = n_i| N=n] \times \Pr[N=n] \\= \sum_{n=n_i}^{\infty} {n\choose n_i}p_i^{n_i}(1-p_i)^{n-n_i} \exp(-\lambda)\frac{\lambda^n}{n!} \\ = \exp(-\lambda) \frac{(\lambda p_i)^{n_i}}{n_i!}\sum_{n=n_i}^{\infty} \frac{ (\lambda - \lambda p_i)^{n-n_i}}{(n-n_i)!}\\ =\exp(-\lambda) \frac{(\lambda p_i)^{n_i}}{n_i!}\sum_{n=0}^{\infty} \frac{ (\lambda - \lambda p_i)^{n}}{n!} \\= \exp(-\lambda) \frac{(\lambda p_i)^{n_i}}{n_i!} \exp(\lambda - \lambda p_i) \\= \exp(-\lambda p_i)\frac{(\lambda p_i)^{n_i}}{n_i!}$$

And so $\Pr[N_1 =n_1, N_2 =n_2, ..., N_m = n_m]= \prod_i \Pr[N_i= n_i].$

$\textbf{Example.}$ A study indicates that the expected number of claims per individual policy is $3$, and that the number of claims follows a Poisson distribution. The goal is to exclude a particular medical procedure from coverage, which has historically accounted for approximately $5\%$ of the claims. We'll determine the new frequency distribution.

$\textbf{Solution}.$ By the theorem above, we have $\lambda = 3$, $p_1 = 0.05$ so $\lambda_1 = 3 \times 0.05 = 0.15$, and $p_2 =0.95$ so $\lambda_2 = 0.95 \times 3=2.85$. Since we are removing claims of type $1$, the remaining claim count is Poisson with mean $2.85$.
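The thinning in this example can be checked by simulation (a sketch assuming NumPy/SciPy): draw Poisson$(3)$ claim counts, keep each claim independently with probability $0.95$, and compare the remaining counts with Poisson$(2.85)$.

```python
# Sketch (NumPy/SciPy assumed): thin Poisson(3) claims, keeping each claim with
# probability 0.95, and compare the remaining counts with Poisson(2.85).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
total = rng.poisson(3.0, size=500_000)   # total claims per policy
kept = rng.binomial(total, 0.95)         # claims left after excluding the procedure

print(kept.mean())                       # ~ 2.85
for k in range(5):
    print(k, round(np.mean(kept == k), 4), round(stats.poisson.pmf(k, 2.85), 4))
```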

Now, let's consider a more general scenario inspired by the above theorem: when a random variable $N$ is drawn from some distribution and its events are classified into $m$ types with probabilities $p_1, \ldots, p_m$, independently of each other, do the counts $N_1, \ldots, N_m$ of each event type $1, \ldots, m$ remain mutually independent, and what distributions do they follow?

We leave it as an exercise for the reader to prove this for certain distributions, or to give counterexamples, using the moment generating function or the probability generating function.

Negative Binomial Distribution.

The negative binomial distribution can serve as an alternative to the Poisson distribution: it has two parameters rather than one, making it more flexible.

$$\Pr[N=k] = {k+r-1 \choose k} \left(\frac{1}{1+\beta}\right)^r\left(\frac{\beta}{1+\beta}\right)^k.$$

Now, the moment generating function will be

$$M_{X}(t) = \mathbb{E}[\exp(Xt)] = \sum_{k=0}^{\infty} {k+r-1 \choose k} \left(\frac{1}{1+\beta}\right)^r\left(\frac{\beta}{1+\beta}\right)^k \exp(t)^k \\ =\left(\frac{1}{1+\beta}\right)^r\sum_{k=0}^{\infty} {k+r-1 \choose k} \left(\frac{e^t\beta}{1+\beta}\right)^k $$

$${k+r-1\choose k} = \frac{(k+r-1)\times (k-1+r-1) \times\ldots\times(r+1)\times r }{k!}\\= (-1)^k\frac{(-r)\times(-r-1) \times \ldots \times (-r -(k-2)) \times(-r-(k-1)) }{k!} = (-1)^k { -r \choose k}$$

And so the moment generating function is

$$M_{X}(t) = \left(\frac{1}{1+\beta}\right)^r\sum_{k=0}^{\infty} (-1)^k { -r \choose k}\left(\frac{e^t\beta}{1+\beta}\right)^k = \left(\frac{1}{1+\beta}\right)^r\sum_{k=0}^{\infty} { -r \choose k}(1)^{-r-k}\left(-\frac{e^t\beta}{1+\beta}\right)^k $$

By Newton's generalized binomial theorem (the series converges for $\frac{\beta}{1+\beta}e^t < 1$), we will have

$$M_X(t) = \left(\frac{1}{1+\beta}\right)^r \left(1 - \frac{\beta}{1+\beta}\exp(t)\right)^{-r}=(1+\beta - \beta \exp(t))^{-r}=(1-\beta(\exp(t)-1))^{-r}$$

And the mean and second moment can be calculated through the first and second derivatives evaluated at $t=0$.

$$\frac{d}{dt}M_X(t) = \frac{d}{dt}(1-\beta(\exp(t)-1))^{-r} \\ = (-r)(1-\beta(\exp(t)-1))^{-r-1} \times(-\beta \exp(t)) \\ = r\beta (1-\beta(\exp(t) -1))^{-r-1}\exp(t)$$

$$\frac{d^2}{dt^2}M_X(t) = r\beta\frac{d}{dt} (1-\beta(\exp(t) -1))^{-r-1}\exp(t)\\=r\beta \times \left( (r+1)\beta(1-\beta(\exp(t)-1) )^{-r-2}\exp(2t)+(1-\beta(\exp(t) -1))^{-r-1}\exp(t)\right)$$

and so the mean and variance of $X$, will be

$$\mathbb{E}[X] = r \beta , \operatorname{V}[X] = r\beta ((r+1)\beta+1) - r^2\beta^2 = r\beta (r\beta+\beta+1) -r^2\beta^2= r\beta(\beta+1).$$
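The mean $r\beta$ and variance $r\beta(1+\beta)$ can be cross-checked against SciPy's negative binomial, which is parameterized by $n = r$ and success probability $p = 1/(1+\beta)$; the sketch below uses arbitrary parameter values.

```python
# Sketch (SciPy assumed): map (r, beta) to scipy's nbinom(n=r, p=1/(1+beta))
# and check that the mean is r*beta and the variance is r*beta*(1+beta).
from scipy import stats

r, beta = 2.0, 1.5                           # hypothetical parameters
dist = stats.nbinom(r, 1.0 / (1.0 + beta))

print(dist.mean(), r * beta)                 # both ~ 3.0
print(dist.var(), r * beta * (1 + beta))     # both ~ 7.5
```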

A special case of negative binomial distribution is when we set our $r=1$, in this case, we will have

$$\Pr[N=k] = \left(\frac{1}{1+\beta}\right)\left(\frac{\beta}{1+\beta}\right)^k.$$

And so our distribution becomes a geometric distribution. One good thing about geometric distribution is its memoryless properties, as we shall see shortly.

Memoryless property of the geometric distribution:

Consider the following conditional probability:

$$\Pr[N\ge k+t | N\ge k] = \frac{\Pr[N\ge k+t]}{\Pr[N\ge k]} $$

Notice that for the geometric distribution,

$$\Pr[N < n] = \sum_{k=0}^{n-1} \left(\frac{1}{1+\beta}\right) \left(\frac{\beta}{1+\beta}\right)^k = \left(\frac{1}{1+\beta}\right) \frac{1-\left(\frac{\beta}{1+\beta}\right)^n}{1-\left(\frac{\beta}{1+\beta}\right)} = 1-\left(\frac{\beta}{1+\beta}\right)^n$$

So

$$\Pr[N\ge n] = 1 - \Pr[N < n] = \left(\frac{\beta}{1+\beta}\right)^n$$

$$\Pr[N\ge k+t | N\ge k] = \frac{\Pr[N\ge k+t]}{\Pr[N\ge k]} = \frac{\left(\frac{\beta}{1+\beta}\right)^{k+t}}{\left(\frac{\beta}{1+\beta}\right)^k} = \left(\frac{\beta}{1+\beta}\right)^t = \Pr[N\ge t]. $$

This shows that the geometric distribution is memoryless. The memoryless property matters in actuarial science because it has significant implications for certain types of insurance and financial risk models: when a distribution is memoryless, past events do not influence future events. This is particularly relevant in scenarios where the timing of certain events is important, such as insurance claims or financial transactions.
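A small simulation sketch (NumPy assumed; the geometric here counts failures starting at $0$, matching the pmf above) confirms $\Pr[N\ge k+t \mid N\ge k] = \Pr[N\ge t]$ for arbitrarily chosen $k$ and $t$:

```python
# Sketch (NumPy assumed): check Pr[N >= k+t | N >= k] = Pr[N >= t] for the geometric
# distribution that counts failures (support 0, 1, 2, ...).
import numpy as np

rng = np.random.default_rng(3)
beta = 2.0
p = 1.0 / (1.0 + beta)                    # success probability
N = rng.geometric(p, size=1_000_000) - 1  # shift so the support starts at 0

k, t = 3, 2
lhs = np.mean(N[N >= k] >= k + t)         # Pr[N >= k+t | N >= k]
rhs = np.mean(N >= t)                     # Pr[N >= t]
print(lhs, rhs, (beta / (1 + beta)) ** t) # all ~ 4/9
```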

The reason this matters is the following. Let's consider the expected value of a non-negative random variable; we want to express it in terms of the survival function. (We work the continuous case; the discrete case is analogous.)

$$\mathbb{E}[X]= \int_{0}^{\infty} xf(x)\, dx$$

By integration by parts with $u = x$ and $dv = f(x)\,dx$, we have $du = dx$ and $v = -S(x)$, where $S(x) = \Pr[X \ge x]$, and so

$$\mathbb{E}[X]= \big[-x\,S(x)\big]_{0}^{\infty} + \int_{0}^{\infty}S(x) \, dx = \int_{0}^{\infty}S(x) \, dx,$$

since the boundary term vanishes when the mean is finite.

Let's look at the example of the uniform distribution. We know the mean of $U(0,1)$ is $1/2$; its cumulative distribution function is $F(x) = x$ on $(0,1)$, so $S(x) = 1-x$, and the integration gives $$\int_0^{1} 1-x \,dx = x - \frac{1}{2}x^2 \bigg|_{x=0}^1 = \frac{1}{2}.$$
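Here is a small numeric sketch (NumPy assumed) of the survival-function formula, integrating $S(x)$ on a grid for the uniform example above and for an exponential distribution with mean $2$:

```python
# Sketch (NumPy assumed): E[X] equals the integral of the survival function S(x)
# for non-negative X, checked on a grid for U(0,1) and an Exponential with mean 2.
import numpy as np

x = np.linspace(0, 50, 200_001)
S_unif = np.clip(1 - x, 0, 1)    # survival function of U(0,1)
S_exp = np.exp(-x / 2.0)         # survival function of an Exponential with mean 2

print(np.trapz(S_unif, x))       # ~ 0.5
print(np.trapz(S_exp, x))        # ~ 2.0
```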

One good thing about this formula is that it applies to conditional distributions as well: replacing $S(x)$ by a conditional survival function gives the corresponding conditional expectation, and constants such as $S_X(k)$ in the denominator simply pass through the integral.

And so if a distribution is memoryless, meaning $S_{X|X\ge k}(k+t)=\frac{S_X(k+t)}{S_{X}(k)}=\Pr[X \ge k+t \mid X \ge k] = \Pr[X\ge t] =S_X(t)$, then

$$\mathbb{E}[X - k\mid X\ge k] = \int_{0}^{\infty}S_{X|X\ge k}(k+t) \, dt = \int_{0}^{\infty}\frac{S_X(k+t)}{S_{X}(k)} \, dt = \int_{0}^{\infty}S_X(t) \, dt = \mathbb{E}[X]$$

So the expected amount remaining beyond $k$ does not depend on the previous observation.

In the context of insurance, insurance companies can take into account the variability in risk levels among policyholders by considering the risk parameter $\lambda$ as a random variable drawn from a distribution.

In the standard Poisson distribution, $\lambda$ is typically viewed as a fixed constant representing the average rate of claim occurrences for a specific insured individual. However, in reality, the riskiness of policyholders can vary significantly. Some drivers may have a higher likelihood of accidents due to factors such as age, driving experience, location, or driving habits, while others may have a lower risk profile.

To capture this inherent variability, insurers can adopt a more sophisticated approach by treating $\lambda$ as a random variable. This involves considering a probability distribution for $\lambda$, denoted as $u(\lambda)$. The distribution $u(\lambda)$ represents the likelihood of different risk levels or expected claim rates for policyholders in the insured population.

We can find that by randomizing over $\lambda$, we will get the negative binomial distribution. Specifically, let's consider the following:

$$p_k = \Pr[N= k] = \mathbb{E}_{\Lambda}[\Pr[N=k \mid \Lambda]] \\ =\int_0^{\infty} \Pr[N=k \mid \Lambda =x]\, u(x) \, dx \\ = \int_0^{\infty} \exp(-x) \frac{x^k}{k!} u(x)\, dx = \frac{1}{\Gamma(k+1)} \int_{0}^{\infty} x^k \exp(-x)u(x)\, dx$$

If $\Lambda$ follows a gamma distribution with shape $a$ and scale $\theta$, then we will have $$u(x) = \frac{1}{\Gamma(a)\theta^a}x^{a-1} \exp(-x/\theta) $$ and so

$$p_k = \frac{1}{\Gamma(k+1)} \int_{0}^{\infty} x^k \exp(-x)\frac{1}{\Gamma(a)\theta^a}x^{a-1} \exp(-x/\theta)\, dx\\ = \frac{1}{\Gamma(k+1) \Gamma(a) \theta^{a}} \int_0^{\infty} x^{k+a-1}\exp\left(-\frac{\theta +1}{\theta}x\right)\, dx$$

Let $u = \frac{\theta +1}{\theta}x \implies \frac{\theta}{\theta +1 }du = dx $

So we will have

$$p_k = \frac{\left(\frac{\theta}{\theta+1}\right)^{k+a}}{\Gamma(k+1) \Gamma(a) \theta^{a}} \int_0^{\infty} u^{k+a-1}\exp\left(-u\right)\, du = \frac{\Gamma(k+a)\left(\frac{\theta}{\theta+1}\right)^{k+a}}{\Gamma(k+1) \Gamma(a) \theta^{a}} = \frac{\Gamma(k+a)}{\Gamma(k+1) \Gamma(a)} \left(\frac{\theta}{\theta+1}\right)^{k} \left(\frac{1}{\theta+1}\right)^{a} = {k+a-1\choose k} \left(\frac{\theta}{\theta+1}\right)^{k} \left(\frac{1}{\theta+1}\right)^{a},$$

which is the negative binomial pmf with $r = a$ and $\beta = \theta$.
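This mixture can be verified by simulation (a sketch assuming NumPy/SciPy): draw $\Lambda$ from a gamma distribution with shape $a$ and scale $\theta$, then draw $N \mid \Lambda$ from a Poisson, and compare the resulting counts with the negative binomial with $r = a$ and $\beta = \theta$. The parameter values below are arbitrary.

```python
# Sketch (NumPy/SciPy assumed): Poisson counts whose rate is Gamma(a, theta)
# distributed follow a negative binomial with r = a and beta = theta.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
a, theta = 2.0, 1.5                               # hypothetical gamma shape/scale

lam = rng.gamma(shape=a, scale=theta, size=500_000)
N = rng.poisson(lam)                              # mixed Poisson counts

nb = stats.nbinom(a, 1.0 / (1.0 + theta))         # r = a, beta = theta
for k in range(6):
    print(k, round(np.mean(N == k), 4), round(nb.pmf(k), 4))
```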

We can see that the Poisson distribution is a limiting case of the negative binomial distribution by sending $r \to \infty$ while holding $\lambda = r\beta$ fixed. Specifically, let's consider the moment generating function of the negative binomial distribution.

$$\lim_{r \to \infty}\left(1-\frac{\lambda}{r}(\exp(t)-1)\right)^{-r} \\= \lim_{r \to \infty} \exp\left(-r \ln\left(1-\frac{\lambda}{r}(\exp(t)-1)\right)\right) \\= \lim_{r \to \infty} \exp\left(-r \ln\left(\frac{r-\lambda(\exp(t)-1)}{r}\right)\right) \\= \lim_{r\to\infty}\exp\left(r \ln\left(\frac{r}{r-\lambda(\exp(t)-1)}\right)\right)$$

$$= \lim_{r\to\infty}\exp\left(\frac{\ln(r)- \ln(r-\lambda (e^t-1))}{1/r}\right)\\=\exp\left(\lim_{r \to \infty} \frac{\ln(r)- \ln(r-\lambda (e^t-1))}{1/r}\right)\\=\exp\left(-\lim_{r \to \infty} \frac{\frac{1}{r}- \frac{1}{r-\lambda (e^t-1)}}{1/r^2}\right)\\= \exp\left(\lim_{r \to \infty} \frac{r^2}{r-\lambda (e^t-1)} -r\right)\\= \exp\left(\lim_{r \to \infty} \frac{r^2-r^2 +\lambda r(e^t-1)}{r-\lambda (e^t-1)} \right) \\= \exp(\lambda (e^t-1)),$$ which is the mgf of the Poisson distribution.
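A numeric sketch of this limit (SciPy assumed): hold $\lambda = r\beta$ fixed, let $r$ grow, and the negative binomial pmf approaches the Poisson pmf.

```python
# Sketch (SciPy assumed): hold lambda = r*beta fixed; as r grows, the negative
# binomial pmf nbinom(r, r/(r+lambda)) approaches the Poisson(lambda) pmf.
from scipy import stats

lam = 2.0
for r in (5, 50, 5000):
    nb = stats.nbinom(r, r / (r + lam))  # beta = lam / r
    diffs = [round(nb.pmf(k) - stats.poisson.pmf(k, lam), 6) for k in range(4)]
    print(r, diffs)
```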

The binomial distribution.

The binomial distribution is also good to have in the realm of actuarial science since, unlike the negative binomial or Poisson, its variance is less than its mean. It also has an upper limit on the number of events, which helps when modeling counts that have a natural maximum, e.g. the number of family members covered by a particular plan or the number of accidents happening within a year. For a single individual, the binomial distribution coincides with the Bernoulli distribution, with moment generating function $f(z)=(1-q) + q \exp(z)=1 + q (\exp(z)-1)$, and for a group of $n$ independent individuals we will have the moment generating function $f(z) = (1 + q (\exp(z)-1))^n$, which is the mgf of the binomial distribution with parameters $n$ and $q$. The pmf of the binomial distribution is the following:

$$p_{k}=\Pr[N=k] = {n\choose k}q^k(1-q)^{n-k}$$.

Now, let's consider the mean and the variance.

$$\frac{d}{dz} (1+q(\exp(z)-1))^n \\= nq \exp(z)(1+q(e^z-1))^{n-1} $$

$$\frac{d^2}{dz^2}(1+q(\exp(z)-1))^n \\= nq \frac{d}{dz}\exp(z)(1+q(e^z-1))^{n-1} \\= nq ((n-1)q \exp(2z)(1+q(e^z-1))^{n-2} +\exp(z)(1+q(e^z-1))^{n-1})$$

And so the mean is $nq$ and the variance is $nq((n-1)q +1) -n^2q^2 = nq(nq -q + 1) -n^2 q^2 = nq(1-q) \le nq.$

The $(a,b,0)$-class.

A discrete distribution on $0,1,\ldots$ belongs to the $(a,b,0)$-class if it satisfies the following recursive relation

$$p_{k} = \left(a + \frac{b}{k}\right)p_{k-1}$$

for $p_k = \Pr[N=k]$, $k=1,2,\ldots$; $p_0$ is then determined by requiring the probabilities to sum to $1$.

All of the previously discussed discrete distributions fall into the $(a,b,0)$-class.

We can transform the above formula for $p_k$ into the following:

$$k\frac{p_k}{p_{k-1}} = ak+b.$$

For the Poisson distribution, we have

$$\Pr[N=k] =\exp(-\lambda)\frac{\lambda^k}{k!}$$

and so we will have

$$k \frac{p_k}{p_{k-1}}=k\frac{\exp(-\lambda) \frac{\lambda^k}{k!}}{\exp(-\lambda) \frac{\lambda^{k-1}}{(k-1)!}} = \lambda$$

and $p_0 = \exp(-\lambda).$

And for negative binomial distribution, we have

$$\Pr[N=k] = { r + k -1\choose k}\left(\frac{1}{1+\beta}\right)^r \left(\frac{\beta}{1+\beta}\right)^k$$

and so

$$k\frac{p_k}{p_{k-1}} = k \frac{{ r + k -1\choose k}\left(\frac{1}{1+\beta}\right)^r \left(\frac{\beta}{1+\beta}\right)^k}{{ r + k -2\choose k-1}\left(\frac{1}{1+\beta}\right)^r \left(\frac{\beta}{1+\beta}\right)^{k-1}}\\= k \frac{\frac{(r+k-1) \times (r+k-2) \times \ldots \times (r+1)\times r}{k!}}{\frac{(r+k-2) \times (r+k-3) \times \ldots \times (r+1)\times r }{(k-1)!}} \left(\frac{\beta}{1+\beta}\right) \\= (r+k-1)\left(\frac{\beta}{1+\beta}\right) \\= \left(\frac{\beta}{1+\beta}\right)k +\left(\frac{\beta}{1+\beta}\right)(r-1)$$

and $p_0 = (1+\beta)^{-r}.$

Now consider the binomial distribution.

$$\Pr[N=k] = {n \choose k} q^k (1-q)^{n-k} \\= \frac{n!}{k! (n-k)!}q^k (1-q)^{n-k}.$$

$$k\frac{p_{k}}{p_{k-1}} = k \frac{\frac{n!}{k!(n-k)!}q^k (1-q)^{n-k}}{\frac{n!}{(k-1)!(n-k+1)!}q^{k-1} (1-q)^{n-k+1}} = (n-k+1)\left(\frac{q}{1-q}\right) = -\frac{q}{1-q}k +(1+n)\left(\frac{q}{1-q}\right)$$

When $k = n+1$, we have

$$-\frac{q}{1-q}k +(1+n)\left(\frac{q}{1-q}\right) = 0,$$

so $p_{n+1} = 0$, and hence $p_{n+2}=p_{n+3} = \cdots = 0$, which satisfies the recursive relation trivially.

$p_0 = (1-q)^n.$

$$ \begin{equation} \begin{array}{|c|c|c|c|} \hline \text { Distribution } & a & b & p_0 \\ \hline \text { Poisson } & 0 & \lambda & \exp(-\lambda) \\ \hline \text { Negative Binomial } & \frac{\beta}{1+\beta} & \frac{\beta}{1+\beta}(r-1) & (1+\beta)^{-r} \\ \hline \text { Binomial } & -\frac{q}{1-q} & (n+1)\frac{q}{1-q} & (1-q)^n \\ \hline \text { Geometric } & \frac{\beta}{1+\beta} & 0 & (1+\beta)^{-1} \\ \hline \end{array} \end{equation} $$
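The table can be checked with a short sketch (NumPy/SciPy assumed): generate probabilities from the $(a,b,0)$ recursion and compare them with the standard pmfs. The parameter values below are arbitrary.

```python
# Sketch (NumPy/SciPy assumed): generate probabilities from the (a, b, 0) recursion
# p_k = (a + b/k) p_{k-1} and compare them with the standard pmfs.
import numpy as np
from scipy import stats

def ab0_pmf(a, b, p0, kmax):
    p = [p0]
    for k in range(1, kmax + 1):
        p.append((a + b / k) * p[-1])
    return np.array(p)

lam, (r, beta), (n, q) = 1.7, (2.0, 1.5), (6, 0.3)   # hypothetical parameters
ks = np.arange(6)
checks = {
    "Poisson": (ab0_pmf(0.0, lam, np.exp(-lam), 5),
                stats.poisson.pmf(ks, lam)),
    "Negative Binomial": (ab0_pmf(beta / (1 + beta), beta / (1 + beta) * (r - 1),
                                  (1 + beta) ** (-r), 5),
                          stats.nbinom.pmf(ks, r, 1 / (1 + beta))),
    "Binomial": (ab0_pmf(-q / (1 - q), (n + 1) * q / (1 - q), (1 - q) ** n, 5),
                 stats.binom.pmf(ks, n, q)),
}
for name, (recursive, reference) in checks.items():
    print(name, np.allclose(recursive, reference))   # all True
```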

Truncated (censored), modified distributions.

These distributions are useful in various real-world scenarios where the original distribution is not appropriate or does not adequately represent the data.

Truncated distributions arise when some data points or observations are removed or censored from the dataset. This can happen when certain values fall outside a specific range or are not recorded due to some experimental or practical limitations. Truncated distributions allow us to analyze the remaining data without the influence of the excluded values. These distributions are crucial in situations where the full range of data is not available, and it helps in making more accurate inferences and predictions.

Modified distributions are employed when the original distribution does not fit the data well, especially in specific regions. By adjusting the probabilities at certain points or altering the parameters, modified distributions can better represent the data in these regions. Such modifications are useful in cases where the original distribution fails to capture certain patterns or characteristics in the dataset.

We can define the $(a,b,1)$-class as distributions satisfying $$k\frac{p_k}{p_{k-1}} =ak+b$$

for $k=2, 3,...$

Let's differentiate between two situations: one where $p_0 = 0$ and the other where $p_0 > 0$.

We'll use the abbreviations ZT and ZM for zero-truncated and zero-modified, respectively.

Let's derive some properties regarding the probability generating function and probability function for $(a,b,1)$-class with respect to $(a,b,0)$-class.

Let $P(z)=\sum_{k=0}^{\infty} p_k z^k$ denote the pgf of a member of the $(a, b, 0)$ class. Let $P^M(z)=\sum_{k=0}^{\infty} p_k^M z^k$ denote the pgf of the corresponding member of the $(a, b, 1)$ class.

Recall that for $k \ge 2$, $$k \frac{p_{k}^M}{p_{k-1}^M}=k \frac{p_{k}}{p_{k-1}} = ak+b,$$

and so $$\frac{p_{k}^{M}}{p_{k-1}^M} = \frac{p_{k}}{p_{k-1}} \implies p_{k}^{M} = \frac{p_{k-1}^M}{p_{k-1}}\,p_{k} = \cdots = \frac{p_{1}^M}{p_{1}}\,p_{k}.$$

Writing $c = \frac{p_{1}^M}{p_{1}}$, we get $$p_{k}^{M} =c\, p_{k}$$ for $k \ge 1$ (the case $k=1$ holds by the definition of $c$).

Now, the probability generating function will be $$P^M(z) = p_0^{(M)} + \sum_{k=1}^{\infty}p_k^{(M)}z^k =p_0^{(M)} + c\sum_{k=1}^{\infty}p_kz^k = p_0^{(M)} + c\, (P(z)-p_0)$$

And if we set $z=1$, then we will have

$$1=\sum_{k=0}^{\infty}p_k^{(M)}=P^{M}(1)= p_0^{(M)} + c(1-p_0) \implies \frac{p_k^{(M)}}{p_k}=c = \frac{1-p_0^{(M)}}{1-p_0}$$

So our probability generating function will be

$$P^M(z) = p_0^{(M)} + \frac{1-p_0^{(M)}}{1-p_0} (P(z) - p_0) = \frac{1-p_0^{(M)}}{1-p_0}P(z)+p_0^{(M)} -\frac{(1-p_0^{(M)})\,p_0}{1-p_0} \\= \frac{1-p_0^{(M)}}{1-p_0}P(z) +\frac{p_0^{(M)}-p_0}{1-p_0}\\ =\frac{1-p_0^{(M)}}{1-p_0}P(z) +1 - \frac{1-p_0^{(M)}}{1-p_0}.$$

And $$p_k^{(M)}= \frac{1-p_0^{(M)}}{1-p_0}p_k.$$

And for the zero truncated case, we will have pgf as

$$P^{T}(z) =\frac{1}{1-p_0}P(z) - \frac{p_0}{1-p_0}$$

and $$p^{T}_k = \frac{p_k}{1-p_0}.$$

So if we want to express the pgf and probability of zero-modified in terms of zero truncated, we will have

$$p_k^{(M)}= (1-p_0^{(M)})\frac{1}{1-p_0}p_k = (1-p_0^{(M)})\,p_k^T$$

and

$$P^M(z) = p_0^{(M)} + \frac{1-p_0^{(M)}}{1-p_0} (P(z) - p_0) = p_0^{(M)}+ (1-p_0^{(M)})\,P^T(z)$$
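These relations translate directly into a short helper; the sketch below (NumPy/SciPy assumed, with `zero_modify` a hypothetical name rather than a library function) rescales a base pmf into its zero-modified version and checks that the result still sums to one.

```python
# Sketch: rescale a base pmf into its zero-modified version. `zero_modify` is a
# hypothetical helper, not a library function.
import numpy as np
from scipy import stats

def zero_modify(base_pmf, p0_mod):
    """Return p_k^M = p0_mod at k=0 and (1 - p0_mod)/(1 - p_0) * p_k for k >= 1."""
    p0 = base_pmf[0]
    out = (1.0 - p0_mod) / (1.0 - p0) * np.asarray(base_pmf, dtype=float)
    out[0] = p0_mod                      # overwrite the mass at zero
    return out

base = stats.poisson.pmf(np.arange(50), 1.7)   # Poisson(1.7) as the base distribution
zm = zero_modify(base, p0_mod=0.3)

print(zm[:4])
print(zm.sum())   # ~ 1 (support truncated at 49, so the tail is negligible)
```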

Let's analyze a negative binomial random variable with parameters $\beta$ and $r$. We aim to find the first four probabilities for this random variable and then calculate the corresponding probabilities for the zero-truncated and zero-modified versions, with the first probability set to $p_0^M$.

Recall that

$$ \begin{equation} \begin{array}{|c|c|c|c|} \hline \text { Distribution } & a & b & p_0 \\ \hline \text { Poisson } & 0 & \lambda & \exp(-\lambda) \\ \hline \text { Negative Binomial } & \frac{\beta}{1+\beta} & \frac{\beta}{1+\beta}(r-1) & (1+\beta)^{-r} \\ \hline \text { Binomial } & -\frac{q}{1-q} & (n+1)\frac{q}{1-q} & (1-q)^n \\ \hline \text { Geometric } & \frac{\beta}{1+\beta} & 0 & (1+\beta)^{-1} \\ \hline \end{array} \end{equation} $$

So $$p_k^{(M)} = \frac{1-p^{M}_{0}}{1-p_0}p_k \\= \frac{1-p^{M}_{0}}{1-(1+\beta)^{-r}} p_k \\=\frac{1-p^{M}_{0}}{1-(1+\beta)^{-r}} {k+r -1 \choose k} \left(\frac{1}{1+\beta}\right)^r\left(\frac{\beta}{1+\beta}\right)^k\\= \frac{1 -p_0^M}{(1+\beta)^r -1} {k+r-1 \choose k}\left(\frac{\beta}{1+\beta}\right)^k $$

and we can compute the pgf by the following

$$p^{(M)}(z) \\= p_0^{M} + \frac{1-p_0^M}{1-p_0} (P(z) - p_0) \\= p_0^M+ \frac{1-p_0^M }{1-(1+\beta)^{-r}}[(1-\beta(z-1))^{-r} - (1+\beta)^{-r}] \\= p_0^M + \frac{1-p_0^M }{(1+\beta)^r-1} \left( \frac{(1+\beta)^r- (1-\beta(z-1))^r}{(1-\beta(z-1))^r}\right)\\= p_0^M +\frac{1-p_0^M }{(1+\beta)^r-1} \left( \left(\frac{1+\beta}{1-\beta(z-1)}\right)^r-1\right) $$

A zero-inflated distribution is one with $p_0^M > p_0$; when we zero-inflate a Poisson distribution, the variance becomes larger than the mean.

By using a zero-truncated or zero-modified distribution, we can allow $-1 < r< 0$ in the case of the negative binomial. We call these distributions "extended" truncated negative binomial (ETNB); they satisfy $p_0=0$, the recursive relation holds for $k \ge 2$, and the probabilities sum to a finite value (so they can be normalized).

When $r$ goes to zero, we will have the following

$$p_k^{T} \\= \lim_{r \to 0}\frac{1}{(1+\beta)^r -1} {k+r-1 \choose k}\left(\frac{\beta}{1+\beta}\right)^k \\= \lim_{r \to 0}\frac{1}{(1+\beta)^r -1} \frac{(k+r-1)\times (k+r-2) \times \ldots \times (r+1)\times r}{k!}\left(\frac{\beta}{1+\beta}\right)^k \\= \lim_{r\to 0}\frac{\sum_{n=0}^{k-1} \frac{(k+r-1) \times (k+r -2) \times ... \times(r+1) \times r}{(r+n)}}{k!\ln(1+\beta) (1+\beta)^r}\left(\frac{\beta}{1+\beta}\right)^k\\= \lim_{r\to 0}\frac{ \frac{(k+r-1) \times (k+r -2) \times ... \times(r+1)\times r}{r}}{k!\ln(1+\beta) (1+\beta)^r}\left(\frac{\beta}{1+\beta}\right)^k \\= \frac{ [\beta/(1+\beta)]^k }{k\ln(1+\beta) }$$

We can also see that, in this limit, the pgf of the zero-truncated distribution becomes

$$P^{T}(z) \\= \lim_{r \to 0}\frac{1}{(1+\beta)^r-1} \left( \left(\frac{1+\beta}{1-\beta(z-1)}\right)^r-1\right) \\= \lim_{r \to 0}\frac{\ln\left(\frac{1+\beta}{1-\beta(z-1)}\right)\left(\frac{1+\beta}{1-\beta(z-1)}\right)^r}{\ln(1+\beta)(1+\beta)^r} \\= \frac{\ln\left(\frac{1+\beta}{1-\beta(z-1)}\right)}{\ln(1+\beta)} \\= 1 - \frac{\ln(1-\beta(z-1))}{\ln(1+\beta)}, $$

which is the pgf of the logarithmic distribution.
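As a numerical cross-check (SciPy assumed), the ETNB probabilities at a small value of $r$ should be close to the logarithmic pmf above, which SciPy exposes as `logser` with parameter $p = \beta/(1+\beta)$.

```python
# Sketch (SciPy assumed): the ETNB probabilities with r close to 0 approach the
# logarithmic distribution, exposed by scipy as logser with p = beta/(1+beta).
from scipy import stats, special

beta, r = 1.5, 1e-6
p = beta / (1 + beta)

def etnb_pmf(k, r, beta):
    # p_k^T = C(k+r-1, k) * (beta/(1+beta))^k / ((1+beta)^r - 1)
    coef = special.gamma(k + r) / (special.gamma(k + 1) * special.gamma(r))
    return coef * (beta / (1 + beta)) ** k / ((1 + beta) ** r - 1)

for k in range(1, 5):
    print(k, etnb_pmf(k, r, beta), stats.logser.pmf(k, p))
```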

For $-1<r <0$, there is another limiting case, when $\beta \to \infty$, called the Sibuya distribution. It has probability generating function $P(z) = 1-(1-z)^{-r}.$ However, it does not have any finite moments.

Distributions with no moments, also known as undefined or infinite moment distributions, are problematic for insurance pricing because they do not have finite expected values or variances. In the context of insurance, the moments of a distribution are essential for assessing risks and determining appropriate premiums, so a claim-count model without moments makes it very difficult to assess the risk.

We can derive the Sibuya distribution as follows:

$$P^{T}(z) = \lim_{\beta \to \infty}\frac{1}{(1+\beta)^r-1} \left( \left(\frac{1+\beta}{1-\beta(z-1)}\right)^r-1\right) = \lim_{\beta \to \infty}\left(1- \left(\frac{1+\beta}{1-\beta(z-1)}\right)^r\right) = 1 - \left(\lim_{\beta \to \infty}\frac{1+\beta}{1-\beta(z-1)}\right)^r=1 - (1-z)^{-r}.$$

The following problem is from Loss Models: From Data to Decisions by Stuart A. Klugman, Harry H. Panjer, and Gordon E. Willmot (2019), Chapter 6, last section. The solution is my own.

Determine the probabilities for an ETNB distribution with $r=-\frac{1}{2}$ and $\beta=1$. Do this both for the truncated version and for the modified version, with $p_0^{M}=0.6$ set arbitrarily.

Recall that

$$ \begin{equation} \begin{array}{|c|c|c|c|} \hline \text { Distribution } & a & b & p_0 \\ \hline \text { Poisson } & 0 & \lambda & \exp(-\lambda) \\ \hline \text { Negative Binomial } & \frac{\beta}{1+\beta} & \frac{\beta}{1+\beta}(r-1) & (1+\beta)^{-r} \\ \hline \text { Binomial } & -\frac{q}{1-q} & (n+1)\frac{q}{1-q} & (1-q)^n \\ \hline \text { Geometric } & \frac{\beta}{1+\beta} & 0 & (1+\beta)^{-1} \\ \hline \end{array} \end{equation} $$

So we will have $a = \frac{1}{2}$, $b = \frac{1}{2}\left(-\frac{3}{2}\right) = -\frac{3}{4}$. Note that substituting into the $(a,b,0)$ formula would give $p_0 = (1+\beta)^{-r} = 2^{\frac{1}{2}} = \sqrt{2} > 1$, which is not a valid probability; this is why this distribution only exists in its truncated (or zero-modified) form.

And also recall $$p_k^{(M)}= \frac{1 -p_0^M}{(1+\beta)^r -1} {k+r-1 \choose k}\left(\frac{\beta}{1+\beta}\right)^k $$

Then we will have $p_1^T = \frac{1}{(1+\beta)^r-1}{r \choose 1}\left(\frac{\beta}{1+\beta}\right) = \frac{1}{1/\sqrt{2}-1}(-\frac{1}{2})\left(\frac{1}{2}\right)=\frac{1}{4} (2 + \sqrt 2) \approx 0.853553390$.

And so $p_2^T = \left(a+\frac{b}{2}\right)p_1^T = \left(\frac{1}{2} -\frac{3}{8}\right) \cdot \frac{1}{4} (2 + \sqrt 2) = \frac{1}{32} (2 + \sqrt 2) \approx 0.1066942$

and $p_3^T = \left(a+\frac{b}{3}\right) p_2^T = \left(\frac{1}{2} - \frac{1}{4}\right) p_2^T = \frac{1}{128}(2+\sqrt2)\approx 0.0266735$

and we will have the following for the zero modified by simply multiplying $(1-0.6) = 0.4 = \frac{2}{5}$ and we get

$p_1^M = \frac{1}{10}(2+\sqrt2) \approx 0.3414214$

and

$p_2^M = \frac{1}{80}(2+\sqrt2)$ and $p_3^M = \frac{1}{320}(2+\sqrt 2).$
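The arithmetic in this example can be reproduced in a few lines of plain Python (a sketch using the recursion with $a = \tfrac12$ and $b = -\tfrac34$):

```python
# Sketch: reproduce the ETNB example with r = -1/2, beta = 1, p0M = 0.6 using the
# (a, b) recursion a = 1/2, b = -3/4.
r, beta = -0.5, 1.0
a = beta / (1 + beta)        # 1/2
b = a * (r - 1)              # -3/4

# zero-truncated probabilities
pT = {1: r * (beta / (1 + beta)) / ((1 + beta) ** r - 1)}   # C(r, 1) = r
for k in (2, 3):
    pT[k] = (a + b / k) * pT[k - 1]

p0M = 0.6
pM = {k: (1 - p0M) * v for k, v in pT.items()}              # multiply by 1 - p0M

print(pT)   # {1: 0.8535..., 2: 0.1066..., 3: 0.0266...}
print(pM)   # {1: 0.3414..., 2: 0.0426..., 3: 0.0106...}
```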