Date: 2023.08.29

This class focuses on modes of convergence:

- almost sure convergence: defined by the probability of a limit
- convergence in probability: defined by a limit of probabilities
- convergence in distribution (weak convergence): defined by the CDF, or by probabilities of sets

## Modes of Convergence

### Probability Space

Define a probability space by $(\Omega, \mathcal{F}, P)$: the sample space $\Omega$, the $\sigma$-field of events $\mathcal{F}$, and the probability measure $P$.

### Sure convergence

$X_n$ converges surely to $X$ if $X_n(\omega) \to X(\omega)$ for every single outcome $\omega \in \Omega$. This notion is rarely used: insisting on convergence even on probability-zero sets adds nothing probabilistically.

### Almost sure convergence

Denote it as $X_n \xrightarrow{a.s.} X$.

Suppose we have random variables $X_1, X_2, \ldots$ and $X$ on the same probability space; then almost sure convergence means:

$$P\left(\lim_{n \to \infty} X_n = X\right) = 1$$

**Note**: this is the strongest of the three modes; "almost sure" means the convergence holds everywhere except on a set of probability zero, so the limit is pinned down up to a null set.

### Convergence in probability

Denote it as $X_n \xrightarrow{p} X$: $X_n$ converges in probability to $X$ if, for every $\varepsilon > 0$,

$$\lim_{n \to \infty} P\left(|X_n - X| > \varepsilon\right) = 0$$

The key contrast with almost sure convergence is the order of limit and probability: here we take a limit of probabilities, while a.s. convergence takes the probability of a limit.
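As a quick numerical illustration of convergence in probability (a simulation sketch added here, not from the original notes: the Bernoulli coin-flip example, $\varepsilon = 0.05$, and the sample sizes are my choices), we can estimate $P(|\bar{X}_n - p| > \varepsilon)$ and watch it shrink as $n$ grows:

```python
import random

random.seed(0)

def prob_far(n, p=0.5, eps=0.05, reps=1000):
    """Monte Carlo estimate of P(|Xbar_n - p| > eps) for n Bernoulli(p) draws."""
    far = sum(abs(sum(random.random() < p for _ in range(n)) / n - p) > eps
              for _ in range(reps))
    return far / reps

# Convergence in probability: the chance of being eps-far from p shrinks with n.
for n in (10, 100, 1000):
    print(n, prob_far(n))
```

For $n = 10$ the sample mean is frequently far from $p$; by $n = 1000$ the estimated probability is close to zero, matching the definition.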

#### Properties

Convergence in probability → convergence in distribution.

Convergence in distribution → convergence in probability when the limit $X = c$ is a constant.

**Continuous mapping theorem**: for any continuous function $g$, if $X_n \xrightarrow{p} X$, then we have $g(X_n) \xrightarrow{p} g(X)$.

- applying $g$ to several sequences jointly (e.g. $g(X_n, Y_n)$) also works for convergence in distribution, but only in some special cases, since marginal convergence in distribution does not imply joint convergence
- if $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{d} Y$ with $X_n$ and $Y_n$ independent, we have $X_n + Y_n \xrightarrow{d} X + Y$

### Convergence in distribution

Other names are **weak convergence** or **convergence in law**. Denote it as $X_n \xrightarrow{d} X$: the CDFs satisfy $F_n(x) \to F(x)$ at every point $x$ where $F$ is continuous.

#### Properties

**Portmanteau lemma**: the following two statements are equivalent:

- $X_n \xrightarrow{d} X$
- $E[f(X_n)] \to E[f(X)]$ for every bounded continuous function $f$

**Continuous mapping theorem**: for any continuous function $g$, $X_n \xrightarrow{d} X$ implies $g(X_n) \xrightarrow{d} g(X)$.
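The Portmanteau characterization can be checked on a fully deterministic example (my own illustration, not from the notes): let $X_n$ be uniform on $\{1/n, \ldots, n/n\}$, so $X_n \xrightarrow{d} X \sim \mathrm{Uniform}(0,1)$, and test $E[f(X_n)] \to E[f(X)]$ for the bounded continuous $f(x) = x^2$ on $[0,1]$:

```python
def E_f_Xn(n):
    """E[f(X_n)] for X_n uniform on {1/n, ..., n/n} and f(x) = x**2."""
    return sum((k / n) ** 2 for k in range(1, n + 1)) / n

# X_n converges in distribution to X ~ Uniform(0, 1), and E[f(X)] = 1/3,
# so by the Portmanteau lemma E[f(X_n)] must approach 1/3:
for n in (10, 100, 1000):
    print(n, E_f_Xn(n))
```

The exact value is $E[f(X_n)] = \frac{(n+1)(2n+1)}{6n^2}$, which indeed tends to $1/3 = \int_0^1 x^2\,dx$.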

### Other: Convergence in mean

$X_n$ converges in $r$-th mean ($r \ge 1$) towards the random variable $X$ if and only if:

$$\lim_{n \to \infty} E\left[|X_n - X|^r\right] = 0$$

So some special cases are:

- when $r = 1$, we say $X_n$ converges in **mean** to $X$
- when $r = 2$, we say $X_n$ converges in **mean square** to $X$
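For the sample mean of i.i.d. draws, convergence in mean square is easy to see directly, since $E[(\bar{X}_n - \mu)^2] = \mathrm{Var}(\bar{X}_n) = \sigma^2 / n$. A minimal Monte Carlo check (my own sketch; the Uniform(0,1) example and sample sizes are assumptions):

```python
import random, statistics

random.seed(1)

def mse_of_mean(n, reps=5000):
    """Monte Carlo estimate of E[(Xbar_n - mu)^2] for Uniform(0,1) draws (mu = 0.5)."""
    errs = [(statistics.fmean(random.random() for _ in range(n)) - 0.5) ** 2
            for _ in range(reps)]
    return statistics.fmean(errs)

# Theory: E[(Xbar_n - mu)^2] = Var(Xbar_n) = (1/12) / n for Uniform(0,1),
# so the mean-square error vanishes and Xbar_n converges to mu in mean square.
for n in (10, 100):
    print(n, mse_of_mean(n), (1 / 12) / n)
```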

### Theorem (from marginal to joint distribution)

For $X_n \xrightarrow{p} X$ and $Y_n \xrightarrow{p} Y$, we get joint convergence for free: $(X_n, Y_n) \xrightarrow{p} (X, Y)$.

But when it comes to convergence in distribution, we need extra assumptions:

- For $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{d} c$ with $c$ a constant, we can have $(X_n, Y_n) \xrightarrow{d} (X, c)$

## Stochastic Order Notation

### Big O: stochastic boundedness

$X_n = O_p(a_n)$ iff for every $\varepsilon > 0$ there exist a finite $M > 0$ and a finite $N$ such that

$$P\left(\left|\frac{X_n}{a_n}\right| > M\right) < \varepsilon \quad \text{for all } n > N,$$

which means $X_n / a_n$ is stochastically bounded.

### Small o: convergence in probability

$X_n = o_p(a_n)$ iff $X_n / a_n \xrightarrow{p} 0$, i.e.

$$\lim_{n \to \infty} P\left(\left|\frac{X_n}{a_n}\right| \ge \varepsilon\right) = 0$$

for every positive $\varepsilon$, which means that $a_n$ eventually grows much faster than $X_n$.
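The two symbols are easy to see side by side on the sample mean: $\bar{X}_n - \mu = o_p(1)$ (it vanishes), while $\sqrt{n}(\bar{X}_n - \mu) = O_p(1)$ (it stays stochastically bounded without vanishing). A small simulation sketch (the Uniform(0,1) example, seed, and sizes are my choices):

```python
import random, statistics, math

random.seed(2)

def scaled_errors(n, reps=500):
    """Average |Xbar_n - mu| over reps simulations of n Uniform(0,1) draws (mu = 0.5),
    returned both raw and scaled by sqrt(n)."""
    devs = [abs(statistics.fmean(random.random() for _ in range(n)) - 0.5)
            for _ in range(reps)]
    raw = statistics.fmean(devs)
    return raw, math.sqrt(n) * raw

# Xbar_n - mu = o_p(1): the raw deviation shrinks toward 0.
# sqrt(n) * (Xbar_n - mu) = O_p(1): the scaled deviation stays bounded.
for n in (100, 2500):
    print(n, scaled_errors(n))
```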

### Theorems

Some useful algebra for these symbols:

- $o_p(1) + o_p(1) = o_p(1)$ and $O_p(1) + O_p(1) = O_p(1)$
- $O_p(1) \cdot o_p(1) = o_p(1)$
- $X_n = o_p(1)$ implies $X_n = O_p(1)$ (convergence in probability implies stochastic boundedness)

## LLNs

This section covers several variants of the LLN (Law of Large Numbers).

Here we use notation as follows:

- $X_1, X_2, \ldots$ is a sequence of random variables
- Denote $\mu_i = E[X_i]$ as the mean of each variable and $\sigma_i^2 = \mathrm{Var}(X_i)$ as its variance; write $\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i$ for the sample mean
- i.i.d.: independent and identically distributed
- Different modes of convergence give different laws: convergence in probability (weak laws) vs. almost sure convergence (strong laws)

### Theorem (Bernoulli, 1713)

Given $X_1, X_2, \ldots$ i.i.d. $\mathrm{Bernoulli}(p)$, then

$$\bar{X}_n \xrightarrow{p} p$$

- Note: a special case of the LLN.

### Forms of Theorems

When we want to pick a specific theorem, we should specify:

**Assumptions**

- how independent the variables are (independent vs. i.i.d.)
- whether the distribution is allowed to change across $i$

**Conclusions**

- convergence in probability (weak law) or a.s. (strong law)
- shape of the tails (fat or not fat, i.e. which moments must be finite)

### Theorem (pre Chebyshev weak LLN)

For $X_1, X_2, \ldots$ independent (not necessarily i.i.d.), suppose $E[X_i] = \mu_i$ and $\mathrm{Var}(X_i) = \sigma_i^2 \le c < \infty$ for all $i$; then:

$$\bar{X}_n - \frac{1}{n} \sum_{i=1}^n \mu_i \xrightarrow{p} 0$$

- The assumption on the variances can be relaxed to: $\frac{1}{n^2} \sum_{i=1}^n \sigma_i^2 \to 0$
- Note: when the random variables are not unusual (no infinite variances), averaging converges to the expectation.

**Proof**

Given $\varepsilon > 0$, we use Chebyshev's inequality:

$$P\left(\left|\bar{X}_n - E[\bar{X}_n]\right| \ge \varepsilon\right) \le \frac{\mathrm{Var}(\bar{X}_n)}{\varepsilon^2}$$

From independence and the bounded variances, we know:

$$\mathrm{Var}(\bar{X}_n) = \frac{1}{n^2} \sum_{i=1}^n \sigma_i^2 \le \frac{c}{n}$$

so the bound is at most $\frac{c}{n \varepsilon^2}$. According to the definition of convergence in probability, letting $n \to \infty$ sends this bound to $0$, which ensures that $\bar{X}_n - E[\bar{X}_n] \xrightarrow{p} 0$. $\blacksquare$
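The proof's variance bound can be checked numerically on an independent but non-identically-distributed sequence (a simulation sketch; the alternating $\mathrm{Uniform}(0, a_i)$ design, seed, and sizes are my own choices, picked so that $\sigma_i^2 = a_i^2/12 \le 4/12 = c$ is uniformly bounded):

```python
import random, statistics

random.seed(3)

def centered_mean(n):
    """Xbar_n minus the average of the mu_i for X_i ~ Uniform(0, a_i), a_i in {1, 2}."""
    xs, mus = [], []
    for i in range(n):
        a = 1 if i % 2 == 0 else 2     # non-identical distributions, bounded variance
        xs.append(random.uniform(0, a))
        mus.append(a / 2)              # E[Uniform(0, a)] = a / 2
    return statistics.fmean(xs) - statistics.fmean(mus)

# Chebyshev's bound: P(|Xbar_n - avg mu_i| >= eps) <= c / (n * eps^2) -> 0,
# so the centered mean should shrink roughly like sqrt(c / n).
for n in (100, 10000):
    print(n, abs(centered_mean(n)))
```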

### Theorem (Kolmogorov's 2nd strong LLN)

Given $X_1, X_2, \ldots$ i.i.d., then

$$\bar{X}_n \xrightarrow{a.s.} \mu$$

iff $E[X_i]$ exists and equals $\mu$ **for all $i$**

- for an i.i.d. sequence, finite variance is not required; but if the mean does not exist, there is no a.s. convergence
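The classic case where the mean does not exist is the Cauchy distribution: the sample mean of $n$ standard Cauchy draws is itself standard Cauchy for every $n$, so it never settles down. A quick contrast with a finite-mean case (a simulation sketch; the seed and sample sizes are my choices):

```python
import random, statistics, math

random.seed(4)

def mean_of_cauchy(n):
    """Sample mean of n standard Cauchy draws (generated as tan of a uniform angle)."""
    return statistics.fmean(math.tan(math.pi * (random.random() - 0.5)) for _ in range(n))

def mean_of_normal(n):
    return statistics.fmean(random.gauss(0, 1) for _ in range(n))

# Normal: the mean exists, so sample means settle near 0 as n grows (strong LLN).
print([round(mean_of_normal(10 ** k), 3) for k in (2, 3, 4)])
# Cauchy: the mean does not exist; the sample mean is standard Cauchy at every n,
# so the values below keep jumping around instead of converging.
print([round(mean_of_cauchy(10 ** k), 3) for k in (2, 3, 4)])
```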

## CLTs

This section covers several variants of the CLT (Central Limit Theorem).

From the LLN we know:

$$\bar{X}_n - \mu \xrightarrow{p} 0$$

Now we want to know the "shape" of such convergence, which is about the asymptotic distribution/density.

An intuitive approach is to add a **scaling factor** $a_n$ to enlarge the vanishing term:

$$a_n \left(\bar{X}_n - \mu\right)$$

The CLT implies that for a special scaling factor:

$$a_n = \sqrt{n}$$

we have a magical property: the scaled term settles into a special distribution $N(0, \sigma^2)$:

$$\sqrt{n}\left(\bar{X}_n - \mu\right) \xrightarrow{d} N(0, \sigma^2)$$

### Theorem (Lindeberg-Lévy CLT)

Given:

- $X_1, X_2, \ldots$ are i.i.d.
- $E[X_i] = \mu$ and $\mathrm{Var}(X_i) = \sigma^2 < \infty$

then we have:

$$\sqrt{n}\left(\bar{X}_n - \mu\right) \xrightarrow{d} N(0, \sigma^2)$$

- Note: a special scaling factor results in a special limiting distribution, and it is
**robust for all random vars**: the limit is normal no matter what distribution the $X_i$ follow, as long as the variance is finite.
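The robustness is easy to see by simulation (a sketch added here; the Uniform(0,1) source distribution, $n = 30$, seed, and replication count are my choices): standardized means of decidedly non-normal draws should already look standard normal.

```python
import random, statistics, math

random.seed(5)

def standardized_mean(n, reps=20000):
    """Draws of sqrt(n) * (Xbar_n - mu) / sigma for Uniform(0,1) samples
    (mu = 1/2, sigma^2 = 1/12)."""
    mu, sigma = 0.5, math.sqrt(1 / 12)
    return [math.sqrt(n) * (statistics.fmean(random.random() for _ in range(n)) - mu) / sigma
            for _ in range(reps)]

zs = standardized_mean(30)
# If the CLT holds, zs should look standard normal: about 68.3% within one sd.
print(round(sum(abs(z) <= 1 for z in zs) / len(zs), 3))
```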

### Theorem (Cramér–Wold, vector-form)

The above theorem can be easily generalized to vector form.

The following are equivalent for random vectors $X_n, X \in \mathbb{R}^k$:

- $X_n \xrightarrow{d} X$
- $t' X_n \xrightarrow{d} t' X$ for all $t \in \mathbb{R}^k$

So $t' X_n$ is a linear combination of the vector of random vars, and joint convergence in distribution reduces to convergence of every one-dimensional linear combination.

### Theorem (multi-var form)

- $X_1, X_2, \ldots$ are i.i.d. random vectors
- $E[X_i] = \mu$ and $\mathrm{Var}(X_i) = \Sigma$ with $\Sigma$ finite

then we have:

$$\sqrt{n}\left(\bar{X}_n - \mu\right) \xrightarrow{d} N(0, \Sigma)$$

### Theorem (Berry-Esseen)

Let $F_n$ be the CDF of $\frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma}$, and let $\Phi$ be the standard normal CDF, which is our target CDF.

Let $X_1, X_2, \ldots$ be i.i.d. with finite third moment $\rho = E\left[|X_i - \mu|^3\right] < \infty$; then there exists a constant $C$ such that:

$$\sup_x \left|F_n(x) - \Phi(x)\right| \le \frac{C \rho}{\sigma^3 \sqrt{n}}$$

- Note: the distance to our target CDF is bounded, and generally we can find a small $C$.
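The $1/\sqrt{n}$ rate can be checked by simulation (my own sketch; the Bernoulli(1/2) example is chosen because there $\rho/\sigma^3 = 1$, and the seed, grid, and replication counts are assumptions): estimate $\sup_x |F_n(x) - \Phi(x)|$ from the empirical CDF of many standardized sums.

```python
import random, math

random.seed(6)

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def sup_dist(n, reps=10000):
    """Approximate sup_x |F_n(x) - Phi(x)| for standardized Bernoulli(1/2) sums,
    using the empirical CDF of `reps` replications."""
    mu, sigma = 0.5, 0.5
    zs = sorted(math.sqrt(n) * (sum(random.random() < 0.5 for _ in range(n)) / n - mu) / sigma
                for _ in range(reps))
    # Compare both sides of each empirical-CDF jump with Phi (handles lattice ties).
    return max(max(abs((i + 1) / reps - phi(z)), abs(i / reps - phi(z)))
               for i, z in enumerate(zs))

# Berry-Esseen: sup |F_n - Phi| <= C * rho / (sigma^3 * sqrt(n)); with rho/sigma^3 = 1
# here, the distance should shrink roughly like 1/sqrt(n):
for n in (4, 100):
    print(n, round(sup_dist(n), 3))
```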