# 3. Random Number Generation¶

Random number generation is essential to simulation. Before we discuss how to simulate different queuing models, we need to first describe how to generate random numbers in simulation, particularly in simulus.

In this tutorial, we will use the `scipy.stats`

module to generate random numbers to simulate the queuing models. Python’s native `random`

module can also be used for this purpose. (In fact, `simulus`

uses the `random`

module as its core random number generator.) But the `scipy.stats`

module contains a huge number of probability distributions (more than 120 of them!), as well as many useful statistical functions, which are extremely useful for simulation tasks, and of course it would
be more than sufficient for our tutorial.

The `scipy.stats`

module uses the pseudo-random number generator provided by `numpy`

. Both `numpy`

and Python’s `random`

module use the Mersenne Twister as the core generator. This generator produces 53-bit precision floats and has a period of \(2^{19937}-1\). It has been proven that this generator has very good random properties.

Before we start this section of the tutorial, run the following cell so that we set up the environment by importing the necessary packages.

```
[1]:
```

```
# for random distributions, random number generators, statistics
import random
import numpy as np
import scipy.stats as stats
# for simulation and for qmodels
import simulus, qmodels
# for data visualization
import matplotlib.pyplot as plt
%matplotlib inline
# for animation inside the notebook
import ipywidgets as widgets
from ipywidgets import interact
```

## 3.1. Random Variables in scipy.stats¶

The `scipy.stats`

module contains a large number of probability distributions. Let’s first examine a few. One of the most common distribution used in simulation is the exponential distribution. The function `expon(scale)`

creates an exponential random variable and the parameter ‘scale’ is the mean of the distribution.

```
[2]:
```

```
x = stats.expon(scale=2)
```

We can easily find the mean, the median, and the standard deviation of the random variable, using the following functions:

```
[3]:
```

```
x.mean(), x.median(), x.std()
```

```
[3]:
```

```
(2.0, 1.3862943611198906, 2.0)
```

We can also use the `stats(moments)`

method to find the mean(‘m’), variance (‘v’), skew (‘s’), and kurtosis(‘k’), where the argument ‘moments’ specifies which we would like to have. In the following, we print them all:

```
[4]:
```

```
m, v, s, k = x.stats(moments='mvsk')
print('mean=%g, var=%g, skew=%g, kurtosis=%g' % (m, v, s, k))
```

```
mean=2, var=4, skew=2, kurtosis=6
```

We can plot the probability density function (pdf) and the cumulative density function (cdf) of the random variable. In the following, we first define a couple of functions to get ready for plotting:

```
[5]:
```

```
def plot_rv(rv, title, xmin=None, xmax=None):
# find the range (ppf() is the inverse cdf)
if xmin is None: xmin = rv.ppf(0.01)
if xmax is None: xmax = rv.ppf(0.99)
xs = np.linspace(xmin, xmax, 100)
# get the data points for pdf and cdf
ys = rv.pdf(xs)
zs = rv.cdf(xs)
plt.fill_between(xs, ys, color='#7fc97f', alpha=0.7)
plt.plot(xs, ys, color='#7fc97f', lw=3, alpha=0.9, label='pdf')
plt.fill_between(xs, zs, color='#beaed4', alpha=0.7)
plt.plot(xs, zs, color='#beaed4', lw=3, alpha=0.9, label='cdf')
plt.title(title)
plt.xlim(xmin, xmax)
plt.ylim(0)
plt.legend()
plt.show()
```

The above function takes a random variable, a string title of the plot, and the optional min and max data range for plotting. If min and max are ignored, we choose the 1% and 99% of the distribution as the range for plotting. The following function creates an exponentially distributed random variable with the given scale (mean) and uses the above function to plot it.

```
[6]:
```

```
def plot_expon(mean):
rv = stats.expon(scale=mean)
plot_rv(rv, "exponential (mean=%g)" % mean)
```

Now we are ready to plot the exponential distribution:

```
[7]:
```

```
plot_expon(2.0)
```

Using ipywidgets, we can dynamically change the ‘scale’ parameter (using a slider) and plot the distribution on the fly. In the following, we can move the slider to change the scale value and therefore alter the pdf and cdf curves.

```
[8]:
```

```
slider = widgets.FloatSlider(min=0.1, max=5, value=2)
interact(plot_expon, mean=slider)
None
```

Let’s do the same for two other distributions, gamma and normal, just for fun:

```
[9]:
```

```
def plot_gamma(a, scale):
rv = stats.gamma(a=a, scale=scale)
plot_rv(rv, "gamma (a=%g, scale=%g)" % (a,scale))
slider_a = widgets.FloatSlider(min=0.1, max=5, value=2.5)
slider_scale = widgets.FloatSlider(min=0.1, max=5, value=0.4)
interact(plot_gamma, a=slider_a, scale=slider_scale)
None
```

```
[10]:
```

```
def plot_norm(mean, stdev):
rv = stats.norm(loc=mean, scale=stdev)
plot_rv(rv, "normal (mean=%g, stdev=%g)" % (mean, stdev))
slider_mean = widgets.FloatSlider(min=-2, max=2, value=0)
slider_stdev = widgets.FloatSlider(min=0.1, max=2, value=1)
interact(plot_norm, mean=slider_mean, stdev=slider_stdev)
None
```

## 3.2. Generating Random Variates¶

The values obtained from the random variables (such as the mean, the standard deviation, pdf and cdf) are all deterministic as the probability distribution. Once a random variable is created, one can use the `rvs(size)`

method to draw random samples from the distribution, where the ‘size’ argument (as a positional argument) specifies the number of samples.

Random samples depend on the random number generator. To get repeatable results, we can first set a random seed. Recall that the `scipy.stats`

module uses the pseudo-random number generator provided in the `numpy`

module. Therefore, we should set the random seed using the `numpy.random.seed()`

function (as opposed to `random.seed()`

, which is used for the Python’s `random`

module).

```
[11]:
```

```
# x is a random variable (with a probability distribution)
x = stats.expon(scale=2.0)
# set the random seed, and get the first 1000 random
# samples; we show the first 3 random numbers
np.random.seed(13579)
xs1 = x.rvs(1000)
print('the first 1000 samples: %r...' % xs1[:3])
# set another random seed, and get another 1000 random
# samples; again we show the first 3 random numbers
np.random.seed(24680)
xs2 = x.rvs(1000)
print('the second 1000 samples: %r...' % xs2[:3])
```

```
the first 1000 samples: array([0.36717359, 0.71748813, 0.24954749])...
the second 1000 samples: array([0.82463564, 0.88634089, 1.43507555])...
```

The random seed determines the random sequence of the random number generator. If we reuse the same random seed, we should be able to get the same random sequence.

```
[12]:
```

```
np.random.seed(13579) # reuse the seed from previous
xs1 = x.rvs(1000)
print('repeat the first 1000 samples: %r...' % xs1[:3])
np.random.seed(24680) # reuse the seed from previous
xs2 = x.rvs(1000)
print('repeat the second 1000 samples: %r...' % xs2[:3])
```

```
repeat the first 1000 samples: array([0.36717359, 0.71748813, 0.24954749])...
repeat the second 1000 samples: array([0.82463564, 0.88634089, 1.43507555])...
```

We can investigate whether the random numbers are indeed drawn from the expected random distribution. To do that, we can plot a histogram of the random numbers and compare that with the true distribution.

```
[13]:
```

```
plt.figure(figsize=(10,5))
xmin, xmax = x.ppf(0.001), x.ppf(0.999)
xs = np.linspace(xmin, xmax, 100)
ys = x.pdf(xs)
axs = plt.subplot(1, 2, 1)
axs.hist(xs1, alpha=0.5, bins='auto', density=True)
axs.plot(xs, ys, 'r-')
axs.set_xlim(xmin, xmax)
axs.set_title("histogram of xs1")
axs = plt.subplot(1, 2, 2)
axs.hist(xs2, alpha=0.5, bins='auto', density=True)
axs.plot(xs, ys, 'r-')
axs.set_xlim(xmin, xmax)
axs.set_title("histogram of xs2")
plt.show()
```

## 3.3. Repeatable Random Sequences in Simulation¶

### 3.3.1. Global Random Seed¶

Simulus uses the Python’s `random`

module as the default random number generator. In order to obtain repeatable results, one needs to set the random seed using `random.seed()`

before calling any simulus functions (such as creating a simulator).

The following example a simple instance of simulus that uses the default random number generator in the `random`

module. The example generates a series of random integers between 0 and 99.

```
[14]:
```

```
random.seed(13579) # the global random seed
sim = simulus.simulator()
x = [random.randint(0,99) for _ in range(10)]
print(x)
```

```
[34, 3, 21, 85, 92, 15, 63, 35, 70, 40]
```

The random seed in the above example (13579) is called *the global random seed*. One can set the global random seed before the first simulator in simulus is created. (Simulus uses the random number generator defined in the `random`

module to create a namespace when the first simulator is created. The namespace will partially determine the random sequence of the random number generators attached to the simulators. We will discuss this momentarily.)

The above example can run repeatedly and the same random sequence (34, 3, 21, …) will be generated each time. This is obvious because we use the same seed each time we run the example. However, if we are using the `scipy.stats`

module, we also need to set the random seed for the `numpy.random`

module. We can do this by getting a random (32-bit) integer from the `random`

module, and use it as the seed for the `numpy.random`

module. In doing so, the simulation can run repeatedly with the
same results as long as we use the same global random seed.

```
[15]:
```

```
random.seed(13597) # the global random seed
s = random.randrange(2**32)
print("numpy.random's seed=%d" % s)
np.random.seed(s)
sim = simulus.simulator()
x = stats.randint(0, 99).rvs(10)
print(x)
```

```
numpy.random's seed=4121229328
[84 97 77 42 51 90 51 82 78 56]
```

### 3.3.2. Simulator-Specific Random Sequence¶

In simulus, the global random seed determines all the random sequences of the simulation. This would satisfy most of the simulation needs in terms of repeatability. However, there are cases this may not be sufficient.

For example, we may want to have a unique random sequence specific to a simulator. This can be important if we run our model in parallel with multiple simulators, say, one on each CPU core. So if a machine has 24 cores, we may have 24 simulators running simultaneously on the machine. And if run simulus for distributed simulation on 100 such machines in a cluster, we may have a simulation with 2400 simulators, each dealing with a smaller part of a large model. In this scenario, each simulator should have its own random sequence in order to get repeatable results.

A random sequence from a simulator is uniquely determined by the global random seed and the name of the simulator. Simulus guarantees that as long as we choose the same global random seed and use the same name for a simulator, we can obtain the same random sequence, regardless of where we run the simulator (even if on a different CPU core and on a different machine in the cluster).

The following example shows how to use the simulator-specific random number generator (using `sim.rng().randint`

as opposed of using `random.randint`

). This example again generates a series of random integers between 0 and 99. The sequence would be dependent on the simulator’s name ‘myname’. If the simulator’s name is fixed, the random sequence then would only change if we change the global random seed next time we run the example.

```
[16]:
```

```
random.seed(13579) # the global random seed
sim = simulus.simulator('myname')
x = [sim.rng().randint(0,99) for _ in range(10)] # simulator-specific random sequence
print(x)
```

```
[42, 20, 56, 40, 73, 31, 95, 94, 67, 22]
```

### 3.3.3. Separate Random Streams¶

Even within a simulator, we may need to use separate random sequences, a.k.a. *random streams* — for example, one for the customer inter-arrival time and the other for the service time. Having multiple random streams would make it easier to debug the models, since one can keep the same random sequence for each part of the model, even if we have to change the other parts.

In the following example, we create two `numpy`

random number generators (‘rng1’ and ‘rng2’) using two 32-bit random integers (‘s1’ and ‘s2’) from the simulator-specific random sequence. We attach the random number generators to the random variables, ‘rv1’ and ‘rv2’, by setting the `random_state`

of the corresponding random variables.

```
[17]:
```

```
random.seed(13579) # the global random seed
sim = simulus.simulator('myname')
s1 = sim.rng().randrange(2**32)
rng1= np.random.RandomState(s1)
print("create rng1 with seed=%d" % s1)
s2 = sim.rng().randrange(2**32)
rng2= np.random.RandomState(s2)
print("create rng2 with seed=%d" % s2)
rv1 = stats.randint(0, 99)
rv1.random_state = rng1
rv2 = stats.randint(0, 99)
rv2.random_state = rng2
print('stream1: %r' % rv1.rvs(10))
print('stream2: %r' % rv2.rvs(10))
```

```
create rng1 with seed=691080952
create rng2 with seed=4265267818
stream1: array([32, 67, 27, 38, 69, 12, 92, 97, 38, 69])
stream2: array([59, 19, 96, 87, 97, 89, 40, 6, 95, 37])
```

Now if we change one random stream, say, by replacing the random variable ‘rv1’ with a geometric distribution and also by changing the number of random samples drawn from the distribution (from 10 to 20), the other random stream would remain the same.

```
[18]:
```

```
random.seed(13579) # the global random seed
sim = simulus.simulator('myname')
s1 = sim.rng().randrange(2**32)
rng1= np.random.RandomState(s1)
print("create rng1 with seed=%d" % s1)
s2 = sim.rng().randrange(2**32)
rng2= np.random.RandomState(s2)
print("create rng2 with seed=%d" % s2)
rv1 = stats.geom(0.25)
rv1.random_state = rng1
rv2 = stats.randint(0, 99)
rv2.random_state = rng2
print('stream1: %r' % rv1.rvs(20))
print('stream2 (should be same): %r' % rv2.rvs(10))
```

```
create rng1 with seed=691080952
create rng2 with seed=4265267818
stream1: array([1, 2, 6, 1, 6, 1, 3, 4, 1, 8, 1, 4, 3, 2, 7, 3, 2, 1, 2, 2])
stream2 (should be same): array([59, 19, 96, 87, 97, 89, 40, 6, 95, 37])
```

## 3.4. Using Python Generator Functions¶

The practice of simulation can be different from pure statistical analysis in that simulation tends to use the random numbers one at a time. For example, we draw one random number to represent the time it takes for the next customer to arrive at queue, and then we draw another to represent the time it takes for the customer to be served, although the two random numbers are drawn from different probability distributions.

This can be accomplished easily using generator functions. Generator functions in Python are functions that can be paused and resumed on the fly, each time returning an object. The object can be inside an iteration. For random number generation, the returned objects are random numbers from a probability distribution.

To create a generator, one simply define a function that uses the ‘yield’ statement instead of ‘return’. In the following example, we define a generator function `exp_generator()`

which returns one random number at a time drawn from an exponential distribution with the given mean. We create two generator instances each with a separate random stream: one for the inter-arrival time and the other for the service time. The two random streams are “attached” with the simulator named ‘myname’. As a
result, they are independent, unique, and repeatable, as long as we use the same global random seed.

```
[19]:
```

```
def expon(mean, seed):
rv = stats.expon(scale=mean)
rv.random_state = np.random.RandomState(seed)
while True:
for x in rv.rvs(100):
yield x
random.seed(13579) # the global random seed
sim = simulus.simulator('myname')
inter_arrival_time = expon(1.2, sim.rng().randrange(2**32))
service_time = expon(0.8, sim.rng().randrange(2**32))
for i in range(5):
print("%d: iatime=%g, svctime=%g" %
(i, next(inter_arrival_time), next(service_time)))
print('...')
```

```
0: iatime=0.0836329, svctime=0.659787
1: iatime=0.612976, svctime=0.0816539
2: iatime=1.75473, svctime=1.05767
3: iatime=0.0635947, svctime=3.02943
4: iatime=1.89737, svctime=1.33523
...
```

## 3.5. Source Code¶

The following is the source code for random number generator functions:

```
[20]:
```

```
# %load '../qmodels/rng.py'
"""Random number generators as Python generator functions."""
import numpy as np
import scipy.stats as stats
__all__ = ['expon', 'gamma', 'norm', 'truncnorm']
batch_size = 100
def expon(mean, seed):
'''Returns a generator function for an i.i.d random variate from an
exponential distribution with the given mean.'''
rv = stats.expon(scale=mean)
rv.random_state = np.random.RandomState(seed)
while True:
for x in rv.rvs(batch_size):
yield x
def gamma(a, scale, seed):
'''Returns a generator function for an i.i.d. random variate from a
gamma distribution, where a is the shape parameter (when a is an
integer, gamma reduces to the Erlang distribution; and when a = 1
to the exponential distribution).'''
rv = stats.gamma(a, scale)
rv.random_state = np.random.RandomState(seed)
while True:
for x in rv.rvs(batch_size):
yield x
def norm(mean, stdev, seed):
'''Returns a generator function for an i.i.d. random variate from a
normal distribution with teh given mean and standard deviation.'''
rv = stats.norm(loc=mean, scale=stdev)
rv.random_state = np.random.RandomState(seed)
while True:
for x in rv.rvs(batch_size):
yield x
def truncnorm(a, b, seed):
'''Returns a generator function for an i.i.d. random variate from a
normal distribution truncated to the range between a and b.'''
rv = stats.truncnorm(a, b)
rv.random_state = np.random.RandomState(seed)
while True:
for x in rv.rvs(batch_size):
yield x
```

```
[ ]:
```

```
```