## Introduction:

The goal in analysing output
data from running a simulation model is to make a valid statistical inference
about the initial and long-term average behaviour of the system based on the
sample average from N replicate simulation runs.

Output Analysis is the analysis of data generated by a simulation run to predict system performance or compare the performance of two or more system designs. In stochastic simulations, multiple runs are always necessary. The output of a single run can be viewed as a sample of size 1.

Output analysis is needed because output data from a simulation exhibit random variability when random number generators are used; that is, two different random number streams will produce two sets of output that will (probably) differ. The statistical tool mainly used is the confidence interval for the mean.

For most simulations, the
output data are correlated and the processes are non-stationary. The
statistical (output) analysis determines:

a. The
estimate of the mean and variance of random variables.

b. The
number of observations required to achieve a desired precision of these
estimates.
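As a rough sketch of both quantities, the following Python snippet estimates the mean and variance from a set of observations, builds a normal-approximation confidence interval, and estimates how many observations a target precision would need. The function names and the `pilot` values are illustrative assumptions, and the observations are assumed to be independent (for example, averages from separate runs).

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(samples, confidence=0.95):
    """Return (mean, half_width) of a normal-approximation confidence interval."""
    n = len(samples)
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)  # sample standard deviation (n - 1 divisor)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return mean, z * std / math.sqrt(n)


def required_observations(samples, target_half_width, confidence=0.95):
    """Estimate how many independent observations give the target half-width."""
    std = statistics.stdev(samples)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((z * std / target_half_width) ** 2)


# Made-up pilot observations, used only to exercise the functions.
pilot = [4.2, 5.1, 3.8, 4.9, 5.6, 4.4, 5.0, 4.7]
mean, half_width = mean_confidence_interval(pilot)
print(f"mean = {mean:.3f} +/- {half_width:.3f}")
print("observations needed for +/- 0.2:", required_observations(pilot, 0.2))
```

With so few observations, a Student t quantile would normally replace the normal quantile; the normal version is used here only to keep the sketch within the standard library.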

## Nature of the Problem:

Once a stochastic variable has
been introduced into a simulation model, almost all the system variables
describing the system behaviour also become stochastic. The values of most of
the system variables will fluctuate as the simulation proceeds so that no one
measurement can be taken to represent the values of a variable.

Instead, many observations of the variable must be made in order to form a statistical estimate of its true value. Some statement must also be made about the probability that the true value falls within a given interval about the estimated value. Such a statement defines a confidence interval; without it, simulation results are of little value to the system analyst.

A large body of statistical methods has been developed over the years to analyse results in science, engineering, and other fields where experimental observations are made. Because a simulation run is itself an experimental measurement of a system, these statistical methods can be adapted to analyse simulation results.

**The newly developing statistical methodology is concerned with the following:**

1. To ensure that the statistical estimates are consistent, meaning that as the sample size increases the estimate tends to the true value.

2. To control bias in measures of both mean values and variances. Bias causes the distribution of an estimate to differ significantly from the true population statistic, even though the estimate may be consistent.

3. To develop sequential testing methods that determine how long a simulation should be run in order to obtain a given confidence in its results.
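One simple form of sequential testing can be sketched as follows: keep adding independent runs until the confidence interval half-width falls below a target. Everything here is an assumption made for illustration: `run_once` is a hypothetical stand-in for a real simulation run, and the stopping rule uses a normal approximation rather than any specific published procedure.

```python
import math
import random
import statistics
from statistics import NormalDist


def run_once(rng):
    """Hypothetical stand-in for one simulation run: returns one output value."""
    return statistics.fmean([rng.expovariate(1.0) for _ in range(200)])


def sequential_estimate(target_half_width, confidence=0.95,
                        min_runs=10, max_runs=10_000, seed=1):
    """Add runs one at a time until the half-width target (or max_runs) is reached."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    results = [run_once(rng) for _ in range(min_runs)]
    while True:
        half_width = z * statistics.stdev(results) / math.sqrt(len(results))
        if half_width <= target_half_width or len(results) >= max_runs:
            return statistics.fmean(results), half_width, len(results)
        results.append(run_once(rng))


mean, half_width, runs = sequential_estimate(target_half_width=0.01)
print(f"{runs} runs: mean = {mean:.4f} +/- {half_width:.4f}")
```

The `min_runs` floor is there so that the first variance estimate is not based on only a handful of values; the cap `max_runs` guards against a target that is never reached.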

## Estimation Method:

Statistical methods are commonly used on the random variable. Usually, a random variable is drawn from an infinite population with a finite mean μ and finite variance σ². These random variables are independently and identically distributed (i.e. IID variables).

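Let x_i (i = 1, 2, …, n) be IID random variables. According to the central limit theorem, their sum is approximately normally distributed with mean nμ and variance nσ². Applying the transformation

$$
z = \frac{\sum_{i=1}^{n} x_i - n\mu}{\sigma\sqrt{n}} = \frac{\bar{x}(n) - \mu}{\sigma/\sqrt{n}},
\qquad
\bar{x}(n) = \frac{1}{n}\sum_{i=1}^{n} x_i
$$

gives an approximately standard normal variate. The quantity x̄(n) is the sample mean:

a. It can be shown to be a consistent estimator for the mean of the population from which the sample is drawn.

b. Since the sample mean is a sum of random variables, it is itself a random variable. So, a confidence interval about its computed value needs to be established.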

c. The probability density
function on the standard normal variable (Z) is shown in the figure below.
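The confidence interval mentioned in point (b) follows from the standard normal variate. As a short worked derivation, writing x̄(n) for the sample mean of n IID observations with mean μ and variance σ², and z_{α/2} for the standard normal quantile that leaves probability α/2 in the upper tail (notation assumed here for illustration):

$$
P\left(-z_{\alpha/2} \le \frac{\bar{x}(n) - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) \approx 1 - \alpha
\quad\Longrightarrow\quad
\bar{x}(n) - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{x}(n) + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
$$

For a 95% confidence level, z_{α/2} ≈ 1.96. In practice σ is usually unknown and is replaced by the sample standard deviation s, where

$$
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\bigl(x_i - \bar{x}(n)\bigr)^2 .
$$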

## Simulation Run Statistics:

On every simulation run, some statistic is measured based on some assumption. For example, in establishing a confidence interval it is assumed that the observations are mutually independent and that the distribution from which they are drawn is stationary. But many statistics of interest in simulation do not meet these conditions.

Let us illustrate the problems that arise in measuring statistics from a simulation run with the example of a single-server system.

Consider a system with the following characteristics:

a. Arrivals follow a Poisson distribution, so the inter-arrival time is exponentially distributed.

b. The service time has an exponential distribution.

c. The queuing discipline is FIFO.

d. The system has a single server.

Then, in a simulation run, the simplest way to estimate the mean waiting time is to accumulate the waiting times of n successive entities and divide the total by n. This gives the sample mean, denoted by x̄(n).
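A minimal sketch of this estimate for the single-server system described above is given below. It uses the Lindley recursion, a standard shortcut for single-server FIFO queues, instead of an event-by-event simulation. The function name and the rates 0.8 and 1.0 are assumptions chosen for illustration.

```python
import random


def mm1_waiting_times(n, arrival_rate, service_rate, seed=0):
    """Waiting times in queue of the first n entities of a single-server FIFO queue.

    The system starts empty and idle, so the first entity waits 0.
    """
    rng = random.Random(seed)
    waits = []
    wait = 0.0
    for _ in range(n):
        waits.append(wait)
        service = rng.expovariate(service_rate)       # exponential service time
        interarrival = rng.expovariate(arrival_rate)  # exponential inter-arrival time
        # Lindley recursion: the next entity waits for whatever work is left
        # when it arrives, and never a negative amount.
        wait = max(0.0, wait + service - interarrival)
    return waits


waits = mm1_waiting_times(n=10_000, arrival_rate=0.8, service_rate=1.0)
sample_mean = sum(waits) / len(waits)
print(f"sample mean waiting time over {len(waits)} entities: {sample_mean:.3f}")
```

For these assumed rates, standard queueing theory gives a steady-state mean wait in queue of λ/(μ(μ − λ)) = 0.8/(1.0 × 0.2) = 4. A single run will generally not reproduce this value exactly, which is the point of the discussion that follows.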

The first problem, taken up under replication of runs, is that successive waiting times are not independent. The second problem is that the distribution may not be stationary. A simulation run is started with the system in some initial state, frequently the idle state, in which no service is being given and no entities are waiting, so the early arrivals have a higher probability of obtaining service quickly. A sample mean that includes the early arrivals will therefore be biased. As the length of the simulation run is extended and the sample size increases, the effect of the bias diminishes. This is shown in the figure below.

## Replication of Runs:

One problem in measuring statistics in a simulation run is that the results are dependent, whereas the analysis requires independent results. One way of obtaining independent results is to repeat the simulation.

Repeating the experiment with different random numbers for the same sample size n gives a set of independent determinations of the sample mean x̄(n).

Even though the distribution of the sample mean depends upon the degree of autocorrelation, these independent determinations of the sample mean can be used to estimate the variance of the distribution.
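The idea can be sketched in Python as below: each replication uses its own random number seed and returns one value of x̄(n), and the spread of those values estimates the variance of the sample mean. The queue model, the rates, the run length of 2,000 entities, and the choice of 20 replications are all illustrative assumptions.

```python
import math
import random
import statistics
from statistics import NormalDist


def replication_mean(n, seed, arrival_rate=0.8, service_rate=1.0):
    """Sample mean waiting time from one single-server FIFO run started empty."""
    rng = random.Random(seed)  # a different seed gives an independent run
    wait, total = 0.0, 0.0
    for _ in range(n):
        total += wait
        wait = max(0.0, wait + rng.expovariate(service_rate)
                   - rng.expovariate(arrival_rate))
    return total / n


# One sample mean per replication; observations inside a run are correlated,
# but the replication means are independent of each other.
means = [replication_mean(n=2_000, seed=s) for s in range(20)]
grand_mean = statistics.fmean(means)
variance_of_mean = statistics.variance(means) / len(means)
half_width = NormalDist().inv_cdf(0.975) * math.sqrt(variance_of_mean)
print(f"estimate = {grand_mean:.3f} +/- {half_width:.3f} from {len(means)} replications")
```

With only 20 replications, a Student t quantile would usually be preferred over the normal quantile used here. Replication addresses the dependence between observations but not the initial bias, since every run still starts from the empty state.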

## Elimination of Initial Bias:

There are two general approaches that can be used to remove the initial bias:

1. The system can be started in a more representative state rather than in the empty state.

2. The first part of the simulation run can be ignored.

In the first approach, it is necessary to know the steady-state distribution for the system, from which the initial state is then selected. In a simulation study, particularly of an existing system, there may be information available on the expected conditions that makes it feasible to select a better initial condition and thus eliminate the initial bias.

The second approach is the most common. In this method, the initial section of the run, whose results have a high bias, is eliminated. First, the run is started from an idle state and stopped after a certain period of time (the time by which the bias has fallen to a satisfactory level). The entities existing in the system at that time are left as they are, and this point becomes the restart point for the repeated simulation runs.

Then the run is restarted, with statistics being gathered from the point of the restart. This approach has the following difficulties:

1. No simple rule can be given for deciding how long an interval should be eliminated. For this, we have to use pilot runs starting from the idle state to judge how long the initial bias remains. This can be done by plotting the measured statistics against the run length.

2. Another disadvantage of eliminating the first part of the simulation run is that the estimate of variance will be based on less information, which affects the establishment of confidence limits. This in turn increases the size of the confidence interval.
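The second approach can be sketched as follows: each run starts from the idle state, and the first `warmup` observations are discarded before the mean is computed. The queue model, the rates, the run length, and the warm-up length of 200 are assumptions for illustration; in practice the warm-up length would come from pilot runs as described above.

```python
import random
import statistics


def waits_from_idle(n, seed, arrival_rate=0.8, service_rate=1.0):
    """Waiting times of n entities in a single-server FIFO queue started idle."""
    rng = random.Random(seed)
    waits, wait = [], 0.0
    for _ in range(n):
        waits.append(wait)
        wait = max(0.0, wait + rng.expovariate(service_rate)
                   - rng.expovariate(arrival_rate))
    return waits


def truncated_mean(waits, warmup):
    """Mean of the observations left after the first `warmup` are discarded."""
    return statistics.fmean(waits[warmup:])


runs = [waits_from_idle(n=1_000, seed=s) for s in range(200)]
full = statistics.fmean([truncated_mean(r, 0) for r in runs])
trimmed = statistics.fmean([truncated_mean(r, 200) for r in runs])
print(f"mean using all observations:       {full:.3f}")
print(f"mean after discarding 200 per run: {trimmed:.3f}")
```

Each trimmed run contributes 800 observations instead of 1,000, which illustrates the second difficulty: less data remains for estimating the variance.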
