# What is sample variance of sample variance, and what is theoretical sampling distribution?

I am trying to work through some things in R and I am having trouble understanding some of the instructions.

I generated $1000$ samples of size $5$ from the standard normal distribution, and I calculated the mean of their sample variances. Now I want to know what the sample variance of my sample of sample variances is. But I am not sure I really understand what this means, nor how to implement it in R.

Further, I am asked to overlay the histogram I generated from my sample with a histogram of the theoretical density of the sampling distribution. What does this mean? I.e., what is meant by the theoretical density of the sampling distribution of the sample variance?

I know all my samples come from the standard normal distribution, where $\sigma^{2}=1$,

and I know that the mean $\bar{X}=(X_{1}+…+X_{1000})/1000$ would be $N(0,\frac{\sigma^{2}}{1000})$. Is this at all what is being referred to?

I would appreciate any help and advice. Thank you.

#### Answer

For a random sample of size $n$ from a normal population with
variance $\sigma^2$, the distribution of the sample variance $S^2$ is given by
$(n-1)S^2/\sigma^2 \sim \chi^2(n-1)$. I’m guessing that you are
asked to provide an illustration of this relationship
using R. Consider the following simulation.

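    m <- 1000;  n <- 5;  x <- rnorm(m*n)
    DTA <- matrix(x, nrow=m)   # each row is a sample of size n
    v <- apply(DTA, 1, var)    # sample variances of the m rows
    # sigma^2 = 1 here, so (n-1)*v plays the role of (n-1)*S^2/sigma^2
    hist((n-1)*v, prob=TRUE, col="wheat", ylim=c(0,.2))
    curve(dchisq(x, df=n-1), add=TRUE, lwd=2, col="blue")  # theoretical Chisq(n-1) density
    lines(density((n-1)*v), lwd=2, col="darkgreen")        # kernel density estimate
    mean(v)
    ## 1.003081
    var(v)
    ## 0.4881987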


This may not be exactly what you are being asked for, but it may
point you in the right direction. I have overlaid the theoretical
density curve on the histogram. A density is a curve, so I’m not
sure what kind of histogram of it could be superimposed.

Probably the most important message here is that
the relevant chi-squared distribution has df = n-1, not df = n.
You can try superimposing the density of $\mathsf{Chisq}(5)$ and you’ll
see it doesn’t fit the histogram at all well.
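The df point can also be checked numerically rather than by eye. The following is only a minimal sketch: the seed, the larger value of m, and the names w, ks4 and ks5 are my own choices for illustration, and the Kolmogorov-Smirnov distance is just one convenient way to measure fit.

```r
set.seed(2024)                       # arbitrary seed, only for reproducibility
m <- 10000;  n <- 5
DTA <- matrix(rnorm(m*n), nrow=m)    # each row is a sample of size n
w <- (n-1) * apply(DTA, 1, var)      # (n-1)*S^2/sigma^2 with sigma^2 = 1

# The mean of Chisq(k) is k, so mean(w) should be near n-1 = 4, not n = 5
mean(w)

# Kolmogorov-Smirnov distance of w from each candidate distribution
ks4 <- unname(ks.test(w, "pchisq", df=n-1)$statistic)
ks5 <- unname(ks.test(w, "pchisq", df=n)$statistic)
c(df4=ks4, df5=ks5)                  # the df = n-1 distance should be much smaller

# Visual check: both candidate densities over the histogram
hist(w, prob=TRUE, col="wheat", ylim=c(0,.22))
curve(dchisq(x, df=n-1), add=TRUE, lwd=2, col="blue")        # df = n-1
curve(dchisq(x, df=n),   add=TRUE, lwd=2, col="red", lty=2)  # df = n, shifted right
```

The dashed red curve for df = n should sit visibly to the right of the histogram, while the blue curve for df = n-1 should follow it.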

*Addendum:* I don’t know if you know about density
estimators, but for good measure, I also superimposed a
density estimator (smoothed histogram) in green. For this
particular simulation run the theoretical curve and the
density estimator agree pretty well, but if you run the program
several times you will get some cases in which the agreement
isn’t so good. (If you use m = 10,000, results will be more
stable.)
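To get a rough feel for how much the agreement depends on m, one could repeat the simulation several times and record the largest gap between the density estimator and the theoretical density. This is a sketch under my own assumptions: the helper max.gap, the comparison grid, the seed, and the choice of 20 repetitions are all hypothetical and not part of the original program.

```r
set.seed(101)                        # arbitrary seed
n <- 5
grid <- seq(0.5, 12, by=0.1)         # points at which the two curves are compared

# Largest gap between the kernel density estimate and the Chisq(n-1) density
max.gap <- function(m) {
  w <- (n-1) * apply(matrix(rnorm(m*n), nrow=m), 1, var)
  d <- density(w)
  est <- approx(d$x, d$y, xout=grid)$y   # estimate interpolated onto the grid
  max(abs(est - dchisq(grid, df=n-1)), na.rm=TRUE)
}

gap.1k  <- replicate(20, max.gap(1000))    # 20 runs with m = 1000
gap.10k <- replicate(20, max.gap(10000))   # 20 runs with m = 10,000
summary(gap.1k)
summary(gap.10k)
```

On average the gaps for m = 10,000 should be smaller and less variable than those for m = 1000, which is what "more stable" means here.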

Please let me know if you can make sense of this to finish your project. What is the variance of $\mathsf{Chisq}(4)$? If you don’t know,
look at the Wikipedia article on ‘Chi-squared distribution’.

Addendum per comment from @Quality: Here $n = 5$ and $\sigma^2 = 1$, so $(n-1)S^2/\sigma^2 = 4S^2 \sim \mathsf{Chisq}(4),$ which has variance $2(4) = 8$. Hence $V(4S^2) = 16\,V(S^2) = 8$, or $V(S^2) = 8/16 = 1/2$. Also, v in the program
represents $S^2,$ so it is not surprising that var(v) returns
$0.488 \approx 0.5$ within simulation error. (Because variances
are on a squared-unit scale, the margin of simulation error is numerically larger for variances than for means. Several additional
runs of the program gave values between 0.47 and 0.59. Use m=10^6 for a slower run with better accuracy.)
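The same argument for general $n$ and $\sigma$ gives $V(S^2) = 2\sigma^4/(n-1)$, and the run-to-run spread of var(v) can be looked at directly. The sketch below assumes normal data as above; the helper run, the seed, and the choice of 50 repetitions are hypothetical names and values of my own.

```r
set.seed(7)                          # arbitrary seed
n <- 5;  sigma <- 1
2 * sigma^4 / (n-1)                  # theoretical V(S^2)
## 0.5

# One simulation run: the variance of m sample variances
run <- function(m) var(apply(matrix(rnorm(m*n, sd=sigma), nrow=m), 1, var))

vv <- replicate(50, run(1000))       # 50 independent runs with m = 1000
range(vv)                            # spread of var(v) from run to run
mean(vv)                             # should be close to 0.5
```

Increasing m inside run should narrow the range, at the cost of a slower simulation.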