
Lecture Note: Econ 703 (week 1)


Date: 2023.08.29

This class focuses on modes of convergence.

  • almost sure convergence

    • defined by $P(\lim \cdot)$: the probability of the limit event
  • convergence in probability

    • defined by $\lim P(\cdot)$: the limit of probabilities
  • convergence in distribution / weak convergence

    • defined via the CDF, or via set probabilities $P(X_n \in A)$

Modes of Convergence

Probability Space

Define a probability space $(\Omega, \mathcal{F}, P)$, and random variables $X_n : \Omega \to \mathbb{R}$.

Sure convergence

$$\lim_{n\to\infty} X_n(\omega) = X(\omega), \quad \text{for every } \omega \in \Omega$$

almost sure convergence

Write this as $X_n \xrightarrow{a.s.} X$.

Suppose we have random variables $X_n$ and $X$; then almost sure convergence means:

$$P\left(\lim_{n\to\infty} X_n = X\right) = 1 \quad \text{or} \quad P\left(\lim_{n\to\infty} d(X_n, X) = 0\right) = 1$$

Note: this is the strongest of the three modes; it is pointwise convergence on $\Omega$ except possibly on a set of probability zero, i.e. "almost everywhere". A simulated illustration follows below.
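A minimal simulation sketch (assuming NumPy; the Bernoulli(0.5) example and seed are illustrative choices, not from the lecture): by the strong LLN, running means of i.i.d. coin flips converge almost surely, so essentially every simulated sample path should settle at 0.5.

```python
import numpy as np

# The strong LLN says P(lim X̄_n = 0.5) = 1 for i.i.d. Bernoulli(0.5)
# draws, so each simulated running-mean path should settle near 0.5.
rng = np.random.default_rng(0)
n, paths = 100_000, 5
draws = rng.integers(0, 2, size=(paths, n))            # fair coin flips
running_means = draws.cumsum(axis=1) / np.arange(1, n + 1)

for p in range(paths):
    print(f"path {p}: running mean at n={n} is {running_means[p, -1]:.4f}")
```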

Convergence in probability

Write $X_n \xrightarrow{p} X$ for "$X_n$ converges in probability to $X$"; here the key is the limit of probabilities (contrast the $P(\lim)$ above).

$$\lim_{n\to\infty} P\left(|X_n - X| \geq \epsilon\right) = 0, \quad \text{for every } \epsilon > 0$$
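A minimal Monte Carlo sketch of this definition (assuming NumPy; the Exponential(1) example, $\epsilon = 0.1$, and replication counts are illustrative): estimate $P(|\bar{X}_n - \mu| \geq \epsilon)$ for growing $n$ and watch it shrink toward 0.

```python
import numpy as np

# Monte Carlo estimate of P(|X̄_n - mu| >= eps) for i.i.d. Exponential(1)
# draws (mu = 1); convergence in probability means this shrinks to 0.
rng = np.random.default_rng(0)
mu, eps, reps = 1.0, 0.1, 2_000

for n in [10, 100, 1_000, 5_000]:
    means = rng.exponential(mu, size=(reps, n)).mean(axis=1)
    prob = np.mean(np.abs(means - mu) >= eps)
    print(f"n={n:>5}: P(|mean - mu| >= {eps}) ~ {prob:.4f}")
```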

Properties

  • Convergence in probability → convergence in distribution

  • Convergence in distribution → convergence in probability when X = c (a constant)

  • [continuous mapping theorem] for any continuous function g (a simulated check follows after this list):

    $$X_n \xrightarrow{p} X \implies g(X_n) \xrightarrow{p} g(X)$$
  • For $X_n \xrightarrow{p} X$ and $Y_n \xrightarrow{p} Y$, we have $X_n + Y_n \xrightarrow{p} X + Y$

    • in some special cases this also applies to convergence in distribution (see the joint-distribution theorem below)
    • if $X_n$, $Y_n$ are independent, we have $X_n Y_n \xrightarrow{p} XY$
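A minimal sketch of the continuous mapping theorem (assuming NumPy; the choice $g(x) = x^2$, normal draws, and threshold 0.5 are illustrative): since $\bar{X}_n \xrightarrow{p} \mu$, we expect $\bar{X}_n^2 \xrightarrow{p} \mu^2$.

```python
import numpy as np

# Continuous mapping: X̄_n →p mu, so for the continuous g(x) = x**2
# we expect g(X̄_n) →p g(mu) = mu**2.
rng = np.random.default_rng(0)
mu, reps = 2.0, 10_000

for n in [10, 100, 1_000]:
    means = rng.normal(mu, 1.0, size=(reps, n)).mean(axis=1)
    prob = np.mean(np.abs(means**2 - mu**2) >= 0.5)
    print(f"n={n:>5}: P(|X̄_n² - mu²| >= 0.5) ~ {prob:.4f}")
```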

Convergence in distribution

Also called weak convergence or convergence in law:

$$F_{X_n}(x) \to F_X(x) \quad \text{at every continuity point } x \text{ of } F_X$$

Properties

  • Portmanteau lemma

    the following two statements are equivalent:

    $$X_n \xrightarrow{d} X \iff \lim_{n\to\infty} P(X_n \in A) = P(X \in A) \text{ for every set } A \text{ with } P(X \in \partial A) = 0$$
  • Continuous mapping theorem

    for any continuous function g:

    $$X_n \xrightarrow{d} X \implies g(X_n) \xrightarrow{d} g(X)$$
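To make convergence in distribution concrete, here is a minimal simulation sketch (assuming NumPy and SciPy; the uniform-maximum example is a standard illustration, not from the lecture): if $U_1, \ldots, U_n$ are i.i.d. Uniform(0,1), then $n(1 - \max_i U_i) \xrightarrow{d} \text{Exponential}(1)$.

```python
import numpy as np
from scipy import stats

# Convergence in distribution: n * (1 - max U_i) →d Exponential(1)
# for i.i.d. Uniform(0,1) draws; compare simulated draws against the
# Exp(1) CDF with a Kolmogorov-Smirnov test.
rng = np.random.default_rng(0)
n, reps = 1_000, 5_000
u_max = rng.uniform(size=(reps, n)).max(axis=1)
z = n * (1.0 - u_max)                  # approximately Exp(1) for large n

print(stats.kstest(z, "expon"))        # large p-value => close to Exp(1)
```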

Other: Convergence in mean

We say $X_n$ converges in the $r$-th mean to the random variable $X$ if and only if:

$$\lim_{n\to\infty} E\left(|X_n - X|^r\right) = 0$$

Some special cases:

  • when $r=1$, we say $X_n$ converges in mean to $X$
  • when $r=2$, we say $X_n$ converges in mean square to $X$
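One standard relation connecting this to the earlier modes (a textbook fact, not stated in the lecture): convergence in the $r$-th mean implies convergence in probability, by Markov's inequality applied to $|X_n - X|^r$. For every $\epsilon > 0$,

$$P\left(|X_n - X| \geq \epsilon\right) = P\left(|X_n - X|^r \geq \epsilon^r\right) \leq \frac{E\left(|X_n - X|^r\right)}{\epsilon^r} \longrightarrow 0.$$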

Theorem: from marginal to joint distribution

For $X_n \xrightarrow{p} X$ and $Y_n \xrightarrow{p} Y$, the joint vector satisfies $(X_n, Y_n) \xrightarrow{p} (X, Y)$

But when it comes to convergence in distribution, we need an extra assumption:

  • For $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{d} c$ (a constant), we have $(X_n, Y_n) \xrightarrow{d} (X, c)$

Stochastic Order Notation

Big O: stochastic boundedness

$X_n = O_p(r_n)$ iff for every $\varepsilon > 0$ there exist a finite $M > 0$ and an $N$ such that:

$$P\left(\left|\frac{X_n}{r_n}\right| > M\right) < \varepsilon, \quad \forall\, n > N$$

which means $X_n / r_n$ is stochastically bounded.

Small o: convergence in probability

$X_n = o_p(r_n)$ iff

$$\lim_{n\to\infty} P\left(\left|\frac{X_n}{r_n}\right| \geq \epsilon\right) = 0$$

for every positive $\epsilon$, i.e. $X_n / r_n \xrightarrow{p} 0$: asymptotically, $r_n$ grows much faster than $X_n$.

Theorems

  • $o_p(1) + o_p(1) = o_p(1)$
  • $o_p(1) + O_p(1) = O_p(1)$ (a proof sketch follows after this list)
  • $o_p(1) \cdot O_p(1) = o_p(1)$
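A quick sketch of the middle identity, using only the definitions above: suppose $X_n = o_p(1)$ and $Y_n = O_p(1)$. Fix $\varepsilon > 0$; pick $M$ and $N$ such that $P(|Y_n| > M) < \varepsilon/2$ for all $n > N$, and note $P(|X_n| > 1) < \varepsilon/2$ for all large $n$. Then

$$P\left(|X_n + Y_n| > M + 1\right) \leq P\left(|X_n| > 1\right) + P\left(|Y_n| > M\right) < \varepsilon$$

for all large $n$, so $X_n + Y_n = O_p(1)$.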

LLNs

This section covers several variants of the LLN (Law of Large Numbers).

Here we use the following notation:

  • $\{X_i\}$ is a sequence of random variables $\{X_1, X_2, \ldots, X_n\}$
  • $E$ denotes the expectation, and $\mathrm{var}(X_i)$ the variance
  • i.i.d.: independent and identically distributed
  • modes of convergence: $\xrightarrow{d}$, $\xrightarrow{p}$
  • $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ is the sample mean

Theorem (Bernoulli, 1713)

Given $\{X_i\} \overset{i.i.d.}{\sim} \text{Bernoulli}(p)$, then $\bar{X}_n \xrightarrow{p} p$

  • Note: a special case of the LLN.

Forms of Theorems

When we look for a specific theorem, we should check:

Assumptions

  • whether the variables are independent
  • whether the distributions are allowed to differ across $i$

Conclusions

  • convergence in probability (weak) or almost sure (strong)
  • shape of the tails (fat/not fat)

Theorem (pre-Chebyshev weak LLN)

Let $\{X_i\}$ be independent (not necessarily i.i.d.) with $E(X_i) = \mu < \infty$, and suppose there exists $0 < M < \infty$ such that $\mathrm{var}(X_i) \leq M$ for all $i$. Then:

$$\bar{X}_n \xrightarrow{p} \mu$$
  • The variance assumption can be relaxed to: $\frac{1}{n^2}\sum_i \mathrm{var}(X_i) = o(1)$
  • Note: as long as the random variables $X_i$ are not unusual (e.g., no infinite variances), averaging converges to the expectation.

Proof

  • Given $E(X_i) = \mu$, we know $E(\bar{X}_n) = \mu$

  • Use Chebyshev's inequality: $P\left(|\bar{X}_n - E(\bar{X}_n)| \geq \epsilon\right) \leq \frac{\mathrm{var}(\bar{X}_n)}{\epsilon^2}$

    From independence and the bounded variances, $\mathrm{var}(\bar{X}_n) = \frac{1}{n^2}\sum_i \mathrm{var}(X_i) \leq \frac{M}{n}$, so:

    $$P\left(|\bar{X}_n - E(\bar{X}_n)| \geq \epsilon\right) \leq \frac{\mathrm{var}(\bar{X}_n)}{\epsilon^2} \leq \frac{1}{\epsilon^2}\frac{M}{n}$$
  • By the definition of convergence in probability, as $n \to \infty$ we get $\lim_{n\to\infty} P\left(|\bar{X}_n - E(\bar{X}_n)| \geq \epsilon\right) \leq \lim_{n\to\infty} \frac{1}{\epsilon^2}\frac{M}{n} = 0$ (a simulated check follows below)
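A minimal simulation sketch of the proof's bound (assuming NumPy; the Uniform(0,1) example with $\mu = 0.5$, $M = \mathrm{var}(X_i) = 1/12$, and $\epsilon = 0.05$ is an illustrative choice): the empirical probability should sit below the Chebyshev bound $M/(n\epsilon^2)$.

```python
import numpy as np

# Compare the empirical P(|X̄_n - mu| >= eps) with the Chebyshev bound
# M / (n * eps**2) for i.i.d. Uniform(0,1) draws (mu = 0.5, var = 1/12).
rng = np.random.default_rng(0)
mu, M, eps, reps = 0.5, 1 / 12, 0.05, 5_000

for n in [50, 200, 1_000]:
    means = rng.uniform(size=(reps, n)).mean(axis=1)
    empirical = np.mean(np.abs(means - mu) >= eps)
    bound = M / (n * eps**2)
    print(f"n={n:>5}: empirical {empirical:.4f} <= bound {bound:.4f}")
```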

Theorem (Kolmogorov's 2nd strong LLN)

Given $\{X_i\}$ i.i.d., then

$$\bar{X}_n \xrightarrow{a.s.} \mu$$

iff $E(X_i)$ exists and equals $\mu$ for all $i$

  • Note: for an i.i.d. sequence, a.s. convergence does not require all variances to be finite; existence of the mean suffices (a heavy-tailed simulation follows below)
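A minimal simulation sketch of the infinite-variance case (assuming NumPy; the Pareto example with shape $a = 1.5$ is an illustrative choice): such draws have a finite mean ($a/(a-1) = 3$ for scale 1) but infinite variance, yet the running mean still converges.

```python
import numpy as np

# Pareto draws with shape a = 1.5 have finite mean a/(a-1) = 3 (scale 1)
# but infinite variance; Kolmogorov's strong LLN still gives X̄_n → 3 a.s.
# (convergence is slow because of the fat tail).
rng = np.random.default_rng(0)
a, n = 1.5, 2_000_000
x = 1.0 + rng.pareto(a, size=n)        # classical Pareto(a), scale 1

for m in [1_000, 100_000, n]:
    print(f"n={m:>9}: running mean = {x[:m].mean():.4f}  (target 3.0)")
```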

CLTs

This section covers several variants of the CLT (Central Limit Theorem).

From the LLN we know:

$$(\bar{X}_n - \mu) \xrightarrow{p} 0$$

Now we want to know the "shape" of this convergence, which is about the asymptotic distribution/density.

An intuitive approach is to multiply by a scaling factor $f(n)$ to blow the term up:

$$f(n)\,(\bar{X}_n - \mu) \xrightarrow{d} \; ?$$

The CLT says that for the special scaling factor

$$f(n) = \sqrt{n}$$

we have the remarkable property that there is a nondegenerate limiting distribution $Z$:

$$\sqrt{n}\,(\bar{X}_n - \mu) \xrightarrow{d} Z$$
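A minimal simulation sketch of why $\sqrt{n}$ is the right scaling (assuming NumPy; the Exponential(1) example and exponents are illustrative): the spread of $n^a(\bar{X}_n - \mu)$ collapses for $a < 1/2$, stabilizes at $a = 1/2$, and blows up for $a > 1/2$.

```python
import numpy as np

# Spread of n**a * (X̄_n - mu) for X_i ~ Exponential(1) (mu = 1):
# it shrinks for a < 1/2, is stable at a = 1/2, and grows for a > 1/2.
rng = np.random.default_rng(0)
reps, mu = 5_000, 1.0

for n in [100, 2_500]:
    dev = rng.exponential(mu, size=(reps, n)).mean(axis=1) - mu
    for a in [0.25, 0.5, 0.75]:
        print(f"n={n:>5}, a={a}: sd of n^a*(mean - mu) = {n**a * dev.std():.3f}")
```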

Theorem (Lindeberg-Lévy CLT)

Given:

  • {Xi} are i.i.d
  • $E(X_i) = \mu$ and $\mathrm{var}(X_i) = \sigma^2 < \infty$

then we have:

$$\sqrt{n}\left(\frac{\bar{X}_n - \mu}{\sigma}\right) \xrightarrow{d} N(0, 1)$$
  • Note: a special scaling factor yields a universal limiting distribution, robust to the underlying distribution of the $X_i$ (a simulated check follows below)
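A minimal simulation sketch (assuming NumPy and SciPy; the skewed Exponential(1) example, where $\mu = \sigma = 1$, is an illustrative choice): standardized sample means should look standard normal even though the underlying draws are far from normal.

```python
import numpy as np
from scipy import stats

# Lindeberg-Lévy check: standardized means of skewed Exponential(1)
# draws (mu = sigma = 1) should be approximately N(0, 1).
rng = np.random.default_rng(0)
n, reps, mu, sigma = 1_000, 5_000, 1.0, 1.0

z = np.sqrt(n) * (rng.exponential(mu, size=(reps, n)).mean(axis=1) - mu) / sigma
print(stats.kstest(z, "norm"))     # large p-value => close to N(0, 1)
```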

Theorem (Cramér–Wold, vector-form)

The above theorem can be easily generalized to vector form.

The following are equivalent:

  • $X_n \xrightarrow{d} X$
  • $\lambda' X_n \xrightarrow{d} \lambda' X$ for all $\lambda \in \mathbb{R}^k$

Here $\lambda' X_n$ is a linear combination of the components of the random vector.
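This is the standard way to obtain the multivariate CLT below: for any $\lambda \in \mathbb{R}^k$, the scalars $\lambda' X_i$ are i.i.d. with mean $\lambda'\mu$ and variance $\lambda' V \lambda$, so the scalar Lindeberg-Lévy CLT gives

$$\sqrt{n}\,\lambda'\left(\bar{X}_n - \mu\right) \xrightarrow{d} N(0, \lambda' V \lambda) \overset{d}{=} \lambda' Z, \quad Z \sim N(0, V),$$

and Cramér-Wold upgrades this family of scalar limits to $\sqrt{n}(\bar{X}_n - \mu) \xrightarrow{d} N(0, V)$.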

Theorem (multi-var form)

  • $\{X_i\}$ are i.i.d. random vectors
  • $E(X_i) = \mu$ and $\mathrm{var}(X_i) = V$

then we have:

$$\sqrt{n}\left(\bar{X}_n - \mu\right) \xrightarrow{d} N(0, V)$$

Theorem (Berry-Esseen)

Let $J_n(t) = P\left(\sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \leq t\right)$, which is our target CDF.

Let $\{X_i\}$ be i.i.d. with finite third moment; then there exists a constant $c$ such that:

$$\sup_t \left|J_n(t) - \Phi(t)\right| \leq \frac{c\, E\left(|X - E(X)|^3\right)}{\mathrm{var}(X)^{3/2}\,\sqrt{n}}$$
  • Note: the distance between the target CDF and the normal CDF $\Phi$ is uniformly bounded and shrinks at rate $1/\sqrt{n}$, and generally the constant $c$ can be taken to be small (a simulated check follows below)
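A minimal simulation sketch of the rate (assuming NumPy and SciPy; the Exponential(1) example is an illustrative choice, and the KS statistic is only a Monte Carlo proxy for $\sup_t |J_n(t) - \Phi(t)|$): the estimated sup distance should shrink roughly like $1/\sqrt{n}$.

```python
import numpy as np
from scipy import stats

# Berry-Esseen check: the sup distance between the CDF of the
# standardized mean and Phi should shrink roughly like 1/sqrt(n)
# (X_i ~ Exponential(1), so mu = sigma = 1).
rng = np.random.default_rng(0)
reps, mu, sigma = 50_000, 1.0, 1.0

for n in [4, 16, 64, 256]:
    z = np.sqrt(n) * (rng.exponential(mu, size=(reps, n)).mean(axis=1) - mu) / sigma
    d = stats.kstest(z, "norm").statistic    # empirical sup |J_n - Phi|
    print(f"n={n:>4}: sup distance ~ {d:.4f},  1/sqrt(n) = {1/np.sqrt(n):.4f}")
```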
