Date: 2023.08.29
This class focuses on modes of convergence.
almost sure convergence
- defined by the probability of a limit: $P(\lim_{n\to\infty} X_n = X) = 1$
convergence in probability
- defined by the limit of a probability: $\lim_{n\to\infty} P(|X_n - X| > \varepsilon) = 0$
convergence in distribution / weak convergence
- defined by CDFs or by probabilities of sets
Modes of Convergence
Probability Space
Define a probability space by the triple $(\Omega, \mathcal{F}, P)$: the sample space $\Omega$, the $\sigma$-algebra of events $\mathcal{F}$, and the probability measure $P$.
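As a concrete instance (a standard textbook example, not specific to this lecture), a single fair coin flip:
$$\Omega = \{H, T\}, \qquad \mathcal{F} = \{\emptyset, \{H\}, \{T\}, \Omega\}, \qquad P(\{H\}) = P(\{T\}) = \tfrac{1}{2}.$$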
Sure convergence
- means $X_n(\omega) \to X(\omega)$ for every single $\omega \in \Omega$
almost sure convergence
Define it as $X_n \xrightarrow{a.s.} X$.
Suppose we have a random variable $X$ and a sequence $X_1, X_2, \ldots$; then almost sure convergence means:
$$P\left(\lim_{n\to\infty} X_n = X\right) = 1.$$
Note: the strongest of the three modes; the sequence converges to $X$ everywhere except possibly on a set of probability zero, i.e. "nearly everywhere".
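A quick illustration (a standard example, assuming the uniform space $([0,1], \mathcal{B}, \lambda)$; not from the lecture):
$$X_n(\omega) = \omega^n \xrightarrow{a.s.} 0,$$
since $\omega^n \to 0$ for every $\omega \in [0,1)$ and convergence fails only at the single point $\omega = 1$, which has probability zero.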
Convergence in probability
Define it as: $X_n$ converges in probability to $X$, written $X_n \xrightarrow{p} X$, when
$$\lim_{n\to\infty} P\left(|X_n - X| > \varepsilon\right) = 0 \quad \text{for every } \varepsilon > 0,$$
and the key is the limit of a probability (whereas almost sure convergence is the probability of a limit).
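A standard example separating this from almost sure convergence (again on $([0,1], \mathcal{B}, \lambda)$; my own illustration): the "sliding indicator" sequence
$$X_1 = \mathbf{1}_{[0,1]},\; X_2 = \mathbf{1}_{[0,\frac{1}{2}]},\; X_3 = \mathbf{1}_{[\frac{1}{2},1]},\; X_4 = \mathbf{1}_{[0,\frac{1}{4}]},\; \ldots$$
has $P(|X_n| > \varepsilon) \to 0$, so $X_n \xrightarrow{p} 0$; but every $\omega$ is hit by infinitely many of these intervals, so $X_n(\omega)$ converges for no $\omega$ and there is no almost sure convergence.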
Properties
Convergence in probability → Convergence in distribution
Convergence in distribution → Convergence in probability when X = c (a constant)
[continuous mapping theorem] for any continuous function $g$: $X_n \xrightarrow{p} X \Rightarrow g(X_n) \xrightarrow{p} g(X)$ (a simulation sketch follows this list)
For $X_n \xrightarrow{p} X$ and $Y_n \xrightarrow{p} Y$, we can have $X_n + Y_n \xrightarrow{p} X + Y$ and $X_n Y_n \xrightarrow{p} XY$
- in some special cases this carries over to convergence in distribution (e.g. when one limit is a constant, which is the setting of Slutsky's theorem below)
- if $X_n$ and $Y_n$ are independent, we also have $X_n + Y_n \xrightarrow{d} X + Y$ whenever $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{d} Y$ (with $X$, $Y$ independent)
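A minimal simulation sketch of the continuous mapping property (my own illustration, assuming NumPy; here $X_n$ is the sample mean of $n$ Exp(1) draws, so $X_n \xrightarrow{p} 1$ and $g(X_n) = X_n^2 \xrightarrow{p} 1$):

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 0.05
for n in [10, 100, 1000, 10_000]:
    # 2000 Monte Carlo replications of the sample mean of n Exp(1) draws
    means = rng.exponential(1.0, size=(2000, n)).mean(axis=1)
    # continuous mapping g(x) = x^2: g(X_n) should converge in probability to g(1) = 1
    p_hat = np.mean(np.abs(means**2 - 1.0) > eps)
    print(f"n={n:6d}  P(|g(X_n) - g(1)| > {eps}) ~= {p_hat:.3f}")
```

The printed probabilities shrink toward 0 as $n$ grows, which is exactly the definition of convergence in probability.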
Convergence in distribution
Other names are "converges weakly" and "converges in law". Define it as $X_n \xrightarrow{d} X$ when $\lim_{n\to\infty} F_{X_n}(x) = F_X(x)$ at every point $x$ where $F_X$ is continuous.
Properties
Portmanteau lemma
the following two statements are equivalent:
- $X_n \xrightarrow{d} X$
- $E[f(X_n)] \to E[f(X)]$ for every bounded continuous function $f$
Continuous mapping theorem
for any continuous function $g$: $X_n \xrightarrow{d} X \Rightarrow g(X_n) \xrightarrow{d} g(X)$
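A standard use (my own illustrative example): applying $g(x) = x^2$ to a CLT limit,
$$\sqrt{n}\,\frac{\bar X_n - \mu}{\sigma} \xrightarrow{d} N(0,1) \quad\Longrightarrow\quad n\,\frac{(\bar X_n - \mu)^2}{\sigma^2} \xrightarrow{d} \chi^2_1.$$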
Other: Convergence in mean
Define: $X_n$ converges in $r$-th mean ($r \ge 1$) towards the random variable $X$, written $X_n \xrightarrow{L^r} X$, if and only if:
$$\lim_{n\to\infty} E\left[\,|X_n - X|^r\,\right] = 0.$$
So some special cases are:
- when $r = 1$, we say $X_n$ converges in mean to $X$
- when $r = 2$, we say $X_n$ converges in mean square to $X$
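Two quick facts worth recording here (standard, my own addition): by Markov's inequality, $P(|X_n - X| \ge \varepsilon) \le E|X_n - X|^r / \varepsilon^r$, so convergence in mean implies convergence in probability; the converse fails, e.g.
$$P(X_n = n) = \tfrac{1}{n},\quad P(X_n = 0) = 1 - \tfrac{1}{n} \;\Longrightarrow\; X_n \xrightarrow{p} 0 \ \text{ but }\ E|X_n - 0| = 1 \not\to 0.$$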
Theorem: from Marginal to Joint distribution
For $X_n \xrightarrow{p} X$ and $Y_n \xrightarrow{p} Y$, marginal convergence gives joint convergence: $(X_n, Y_n) \xrightarrow{p} (X, Y)$.
But when it comes to convergence in distribution, we need extra assumptions:
- For $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{d} c$ with $c$ a constant, we can have $(X_n, Y_n) \xrightarrow{d} (X, c)$; this is the setting of Slutsky's theorem, illustrated below.
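A standard consequence (Slutsky's theorem; my own illustrative example): combining $\sqrt{n}\,(\bar X_n - \mu) \xrightarrow{d} N(0, \sigma^2)$ with the sample standard deviation $S_n \xrightarrow{p} \sigma$ gives the usual t-statistic limit
$$\frac{\sqrt{n}\,(\bar X_n - \mu)}{S_n} \xrightarrow{d} N(0, 1).$$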
Stochastic Order Notation
Big O: stochastic boundedness
$X_n = O_p(a_n)$ iff for every $\varepsilon > 0$ there exist a finite $M > 0$ and a finite $N$ such that
$$P\left(\left|\frac{X_n}{a_n}\right| > M\right) < \varepsilon \quad \text{for all } n > N,$$
which means $X_n / a_n$ is stochastically bounded.
Small o: convergence in probability
$X_n = o_p(a_n)$ iff
$$\lim_{n\to\infty} P\left(\left|\frac{X_n}{a_n}\right| \ge \varepsilon\right) = 0$$
for every positive $\varepsilon$, which means that $a_n$ eventually grows much faster than $X_n$; equivalently, $X_n / a_n \xrightarrow{p} 0$.
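A standard instance (my own example): for i.i.d. $X_i$ with finite variance, the CLT gives
$$\bar X_n - \mu = O_p\!\left(n^{-1/2}\right), \qquad \text{and in particular} \qquad \bar X_n - \mu = o_p(1),$$
which is just the weak LLN restated in this notation.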
Theorems
LLNs
This section covers several variants of the LLN (Law of Large Numbers).
Here we use notation as follows:
- $X_1, X_2, \ldots$ is a sequence of random variables
- Denote $\mu = E[X_i]$ as the mean, and $\sigma^2 = \mathrm{Var}(X_i)$ as the variance
- i.i.d.: independent and identically distributed
- Different convergences: $\bar X_n \xrightarrow{p} \mu$ (weak) and $\bar X_n \xrightarrow{a.s.} \mu$ (strong), where $\bar X_n = \frac{1}{n}\sum_{i=1}^n X_i$
Theorem (Bernoulli, 1713)
Given $X_i \overset{\text{i.i.d.}}{\sim} \mathrm{Bernoulli}(p)$, then $\bar X_n \xrightarrow{p} p$, as simulated below.
- Note: a special case of the (weak) LLN.
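A minimal simulation sketch (my own, assuming NumPy): the running proportion of heads in fair coin flips settles near $p = 0.5$.

```python
import numpy as np

rng = np.random.default_rng(1)
flips = rng.integers(0, 2, size=100_000)                 # i.i.d. Bernoulli(0.5)
running_mean = np.cumsum(flips) / np.arange(1, flips.size + 1)
for n in [10, 100, 1000, 10_000, 100_000]:
    print(f"n={n:6d}  sample mean = {running_mean[n - 1]:.4f}")
```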
Forms of Theorems
when we want to pin down a specific theorem, we should specify:
Assumptions
- independence of the variables
- whether the distribution is allowed to change across $i$ (identically distributed or not)
Conclusions
- convergence in probability (weak) or almost sure (strong)
- shape of the tails (heavy-tailed or not)
Theorem (Chebyshev weak LLN)
For independent (not necessarily i.i.d.) $X_i$ with $E[X_i] = \mu$ and $\mathrm{Var}(X_i) \le c < \infty$ for all $i$: for every $\varepsilon > 0$ and $\delta > 0$ there exists $N$ such that
$$P\left(|\bar X_n - \mu| \ge \varepsilon\right) \le \delta \quad \text{for all } n > N, \qquad \text{i.e. } \bar X_n \xrightarrow{p} \mu.$$
- Assumption about the variance can be relaxed to: $\frac{1}{n^2}\sum_{i=1}^n \mathrm{Var}(X_i) \to 0$
- Note: as long as the random variables are not too unusual (no infinite variance), averaging converges to the expectation.
Proof
Given $\varepsilon > 0$, we know (Chebyshev's inequality):
$$P\left(|\bar X_n - \mu| \ge \varepsilon\right) \le \frac{\mathrm{Var}(\bar X_n)}{\varepsilon^2}.$$
Use the property (by independence):
$$\mathrm{Var}(\bar X_n) = \frac{1}{n^2}\sum_{i=1}^n \mathrm{Var}(X_i).$$
From the bounded variance $\mathrm{Var}(X_i) \le c$, we can know:
$$P\left(|\bar X_n - \mu| \ge \varepsilon\right) \le \frac{c}{n\,\varepsilon^2}.$$
According to the definition of convergence in probability, when $n \to \infty$ the bound goes to $0$, so we ensure that $\bar X_n \xrightarrow{p} \mu$. A quick numeric check of the bound follows.
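A minimal numeric check (my own, assuming NumPy; Exp(1) draws, so $\mu = 1$ and $\mathrm{Var}(X_i) = 1 \le c = 1$):

```python
import numpy as np

rng = np.random.default_rng(3)
n, eps, c = 400, 0.1, 1.0
# 20,000 Monte Carlo replications of the sample mean of n Exp(1) draws
means = rng.exponential(1.0, size=(20_000, n)).mean(axis=1)
p_hat = np.mean(np.abs(means - 1.0) >= eps)
print(f"empirical P(|mean - mu| >= {eps}) = {p_hat:.4f}")
print(f"Chebyshev bound c/(n*eps^2)      = {c / (n * eps**2):.4f}")
```

The empirical probability sits well below the (loose) Chebyshev bound, consistent with the proof.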
Theorem (Kolmogorov's 2nd strong LLN)
Given $X_i$ i.i.d., then
$$\bar X_n \xrightarrow{a.s.} \mu$$
iff $E[X_i]$ exists and equals $\mu$ for all $i$.
- Note: for an i.i.d. sequence, finite variance is not required for all vars; but if the mean does not exist, there is no a.s. convergence (see the counterexample below).
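A standard counterexample (my own addition): for i.i.d. standard Cauchy $X_i$, the mean does not exist, and in fact
$$\bar X_n \sim \mathrm{Cauchy}(0, 1) \quad \text{for every } n,$$
so the sample mean never settles down; there is no a.s. (or even in-probability) limit.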
CLTs
This section covers several variants of the CLT (Central Limit Theorem).
From the LLN we know:
$$\bar X_n - \mu \xrightarrow{p} 0.$$
Now we want to know the "shape" of such convergence, which is about the asymptotic distribution/density.
An intuitive approach is to add a scaler $a_n$ to enlarge the vanishing term: $a_n(\bar X_n - \mu)$.
The CLT implies that for a special scaler, $a_n = \sqrt{n}$,
we have the magical property that a special limiting distribution appears:
$$\sqrt{n}\,(\bar X_n - \mu) \xrightarrow{d} N(0, \sigma^2).$$
Theorem (Lindeberg-Lévy CLT)
Given:
- $X_1, X_2, \ldots$ are i.i.d.
- $E[X_i] = \mu$ and $\mathrm{Var}(X_i) = \sigma^2 < \infty$
then we have:
$$\sqrt{n}\,(\bar X_n - \mu) \xrightarrow{d} N(0, \sigma^2).$$
- Note: a special scaler results in a special distribution, and it is robust for all random vars: the normal limit appears whatever the underlying distribution of the $X_i$ (simulated below).
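A minimal simulation sketch (my own, assuming NumPy): standardized means of heavily skewed Exp(1) draws already match $N(0,1)$ quantiles closely.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 500, 10_000
# Exp(1) has mu = 1, sigma = 1 and is heavily skewed, yet the CLT applies
z = np.sqrt(n) * (rng.exponential(1.0, size=(reps, n)).mean(axis=1) - 1.0)
print("simulated quantiles:", np.round(np.quantile(z, [0.025, 0.5, 0.975]), 3))
print("N(0,1) quantiles   : [-1.96, 0.0, 1.96]")
```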
Theorem (Cramér–Wold, vector-form)
The above theorem can be easily generalized to vector form.
The following are equivalent (for random vectors $X_n, X$ in $\mathbb{R}^k$):
- $X_n \xrightarrow{d} X$
- $t^\top X_n \xrightarrow{d} t^\top X$ for all $t \in \mathbb{R}^k$
So $t^\top X_n$ is a linear combination of the entries of the random vector: vector convergence reduces to scalar convergence of all linear combinations.
Theorem (multi-var form)
- $X_1, X_2, \ldots$ are i.i.d. random vectors in $\mathbb{R}^k$
- $E[X_i] = \mu$ and $\mathrm{Cov}(X_i) = \Sigma$ (finite)
then we have:
$$\sqrt{n}\,(\bar X_n - \mu) \xrightarrow{d} N(0, \Sigma).$$
Theorem (Berry-Esseen)
let $F_n(x) = P\left(\frac{\sqrt{n}\,(\bar X_n - \mu)}{\sigma} \le x\right)$, which is our targeted CDF.
let $X_i$ be i.i.d. with finite 3rd moment $\rho = E|X_i - \mu|^3 < \infty$; then there exists a constant $C$ such that:
$$\sup_{x} \left|F_n(x) - \Phi(x)\right| \le \frac{C\,\rho}{\sigma^3 \sqrt{n}}.$$
- Note: the distance between our target CDF and the normal CDF $\Phi$ is uniformly bounded and shrinks at rate $1/\sqrt{n}$; the constant is small (known results give $C < 0.5$ in the i.i.d. case).
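A worked instance (my own arithmetic, assuming fair Bernoulli draws): for $X_i \sim \mathrm{Bernoulli}(\tfrac{1}{2})$,
$$\sigma^2 = \tfrac{1}{4}, \qquad \rho = E\left|X_i - \tfrac{1}{2}\right|^3 = \tfrac{1}{8}, \qquad \frac{\rho}{\sigma^3} = 1,$$
so the bound is $C/\sqrt{n}$: roughly $0.005$ at $n = 10{,}000$ with $C \approx 0.5$.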