# Hahn-Banach theorem

This is article on the Hahn-Banach theorem. I write this article down to serve both as a note for me as well as a tutorial for the interested readers. This article is rather technical and if you know the basic set theory and linear algebra, I believe you can read this article without difficulty.

This theorem, named after Hans Hahn and Stefan Banach, is a central tool in functional analysis. By using this theorem, any linear functional defined on a vector subspace of some vector space can be extended to the whole space.

# Linear Functional and Sublinear Function

So, what is a linear functional then?

A Linear Functional L on a real vector space V is a function L:V→ℝ, which satisfies the following properties:

∀v, w∈V, c∈ℝ

LF1. L(v+w)=L(v)+L(w)

LF2. c·L(v)=L(c·v)

A Sublinear Function p on a real vector space V is a function p:V→ℝ, which satisfies the following properties:

∀v, w∈V, c∈ℝ

SF1. p(v+w) ≤ p(v)+p(w)

SF2. c·p(v) = p(c·v) where c > 0

# Zorn’s Lemma

Two axioms are used in the proof of the Hahn-Banach theorem: **Greatest Lower Bound Property**** and ****Zorn’s Lemma****.**

**Greatest-lower-bound property — Any non-empty set of real numbers that has an lower bound must have a greatest lower bound in real numbers.**

**Least-upper-bound property — Any non-empty set of real numbers that has an upper bound must have a least upper bound in real numbers.**

**Zorn’s lemma — Suppose a non-empty partially ordered set P has the property that every non-empty chain has an upper bound in P. Then the set P contains at least one maximal element.**

First, I have to introduce the definition of Partial Order, Total Order and Chain.

A Partial Order ≤ on a set X is a binary relation that satisfies the following properties:

PO1. ∀x ∈ X, x ≤ x

PO2. ∀x, y ∈ X, x ≤ y and y ≤ x ⇒ x = y

PO3. ∀x, y, z ∈ X, x ≤ y and y ≤ z ⇒ x ≤ z

A Total Order on a set X is a Partial Order in which any two elements are comparable:

∀x, y ∈ X, x ≤ y or y ≤ x

A Chain on a set X is a Total Ordered subset of X.

# Hahn-Banach Theorem

Below is the description of Hahn-Banach Theorem:

HA. Let V be the vector space and S⊆V be the linear subspace in V. Let l:S→ℝ be a linear functional defined on S. Let p:V→ℝbe a sublinear function defined on V such that l(v) ≤ p(v) ∀v∈S. Then, 1) there is a linear functional L:V→ℝsuch that 2) L(v)=l(v) ∀ v∈S and 3) L(v) ≤ p(v) ∀v∈V.

**Proof**

If S=V, the statement is true.

So, consider S⊂V.

⇒ ∃z∈V\S

**1)**We can use z to extend the linear subspace S and the linear functional l.

Consider the set S’ extended from S by:

S’ = { y+c·z | y∈S, c∈ℝ }

∀f, g ∈S’

⇒ f = y1+c1·z, g = y2+c2·z for some y1, y2 ∈ S and c1, c2 ∈ ℝ

⇒ f+g = (y1+y2) + (c1+c2)·z ⇒ f+g ∈ S’

∀f∈S’, d∈ℝ

⇒ f = y+c·z for some y∈S, c∈ℝ

⇒ d·f = d·(y+c·z) = d·y+(d·c)·z ⇒ d·f ∈ S’

⇒ S’ is a linear subspace of V

Assume L’:S’→ℝ be the extended linear functional defined on S’ s.t.

L’(v) = l(v) ∀ v∈S

L’(v) ≤ p(v) ∀ v∈S’

Then, ∀ v∈S’

⇒ v = y+c·z for some y ∈ S and c ∈ ℝ

⇒ L’(v) = L’(y+c·z) = L’(y)+c·L’(z) = l(y)+c·L’(z) **[ By LF1, LF2 and definition ]**

We need to prove that L’ is well defined. If we can prove that v is uniquely represented by y+c·z, then L’ is well defined as l is already well defined.

Suppose on the contrary, v = y1+c1·z = y2+c2·z where y1,y2∈S and c1,c2∈ℝ

⇒ y1-y2 = (c2-c1)·z ∈S

⇒ y1-y2 = 0, c2-c1 = 0 **[ Since z **∉ **S ]**

⇒ y1=y2 and c1=c2 ⇒ v is uniquely defined in S’

⇒ L’ is well defined

Next, we need to check if our assumption is correct. Denote q = L’(z), then

∀v∈S’, L’(v)=l(y)+c·q for some y∈S and c∈ℝ

=> l(y)+c·q ≤ p(y+c·z) **[ By our assumption ]**

=> c·q ≤ p(y+c·z) — l(y) — — — (*)

If c = 0, L’(v) = l(v) ∀ v∈S. If c **≠ **0, let’s check if there exists q∈ℝsuch that (*) is satisfied. **Case 1: c>0**

q ≤ p(y/c+z)-l(y/c)

⇒ q ≤ p(y+z)-l(y) **[ Since y/c ∈ S and replace y/c by y ]****Case 2: c<0**

q ≥ 1/c·p(y+c·z) +(-1/c)·l(y) = -(-1/c)·p(y+c·z) +(-1/c)·l(y) = -p(-y/c-z)+l(-y/c) **[ By SF2 ]**

⇒ q ≥ -p(y-z)+l(y) **[ Since -y/c ∈ S and replace -y/c by y ]**

Let M = { p(y+z)-l(y) | y∈S }, N = { -p(y-z)+l(y) | y∈S }

∀ m∈M, n∈N

⇒ m = p(y1+z)-l(y1), n = -p(y2-z)+l(y2) for some y1, y2 ∈ S

⇒ m-n = p(y1+z)-l(y1)+p(y2-z)-l(y2) = p(y1+z)+p(y2-z)-(l(y1)+l(y2))

≥ p(y1+y2)-l(y1+y2) ≥ 0 **[ Since y1+y2 ∈ S and by SF1 ]**

⇒ n ≤ m

⇒ sup N ≤ q ≤ inf M **[ Since m ≥ q ≥ n, inf M and sup N exist by ****Greatest Lower Bound Property**** and ****Least Upper Bound Property** **]**

⇒ ∃q∈ℝ s.t. (*) holds ⇒ every S’ has an extended linear functional L’ s.t. L’(v)=l(v) ∀ v∈S’ and L’(v) ≤ p(v) ∀v∈V.

Now, we find that there is a Finally, we consider the family of all possible extended linear functional L’ and use the Zorn’s Lemma to “merge” all these function.

**2)**Let’s find out what the set P is.

Consider Ω = { (S’[i], L’[i]) | i=1,2,.. } be the set of ordered pair (S’[i], L’[i]) where S’[i] is the linear subspace containing S and L’[i]:S’[i]→ℝ is the corresponding extended linear function with L’[i](v)=l(v) ∀ v∈S and L’[i](v) ≤ p(v) ∀ v∈S’[i].

Consider P = (Ω, ≤) with the order ≤ on Ω s.t. :

(S’[i], L’[i]) ≤ (S’[j], L’[j]) iff S’[i] ⊆ S’[j] and L’[j](S’[i])=L’[i](S’[i])

We need to prove that ≤ is a Partial Order.**PO1:**

∀ (S’[i],L’[i]) ∈ Ω, S’[i] ⊆ S’[i], L’[i](S’[i])=L’[i](S’[i]) => (S’[i], L’[i]) ≤ (S’[i], L’[i])**PO2:**

∀ (S’[i],L’[i]),(S’[j], L’[j]) ∈ Ω,

(S’[i], L’[i]) ≤ (S’[j], L’[j]) and (S’[j], L’[j]) ≤ (S’[i], L’[i])

⇒ S’[i] ⊆ S’[j] and S’[j] ⊆ S’[i]

L’[j](S’[i])=L’[i](S’[i]) and L’[i](S’[j])=L’[j](S’[j])

⇒ S’[i]=S’[j] and L’[j](S’[j]) = L’[j](S’[i]) = L’[i](S’[i]) = L’[i](S’[j])

⇒ (S’[i], L’[i]) = (S’[j], L’[j])**PO3:**

∀ (S’[i],L’[i]), (S’[j], L’[j]), (S’[k], L’[k]) ∈ Ω,

(S’[i], L’[i]) ≤ (S’[j], L’[j]) and (S’[j], L’[j]) ≤ (S’[k], L’[k])

⇒ S’[i] ⊆ S’[j] and S’[j] ⊆ S’[k] and L’[j](S’[i])=L’[i](S’[i]) and L’[k](S’[j]) = L’[j](S’[j])

⇒ S’[i] ⊆ S’[k] and L’[k](S’[i])=L’[j](S’[i])=L’[i](S’[i])

⇒ (S’[i], L’[i]) ≤ (S’[k], L’[k])

Ω ≠ Ø **[ Since ∃z∈V\S, extended linear subspace exists. ]**

Therefore, P=(Ω,≤) is a non-empty Partial Ordered Set.

**3)** For every non-empty chain C = {(S’[i],L’[i])} on Ω, we need to find its upper bound. Consider the ordered pair (S’c, L’c) with

S’c = ⋃ S’[i]

L’c: S’c→ℝ where L’c(v) = L’[i](v) for some (S’[i], L’[i]) ∈ C where v ∈ S’[i]

We need to prove that the (S’c, L’c) ∈ Ω and also is a upper bound of the chain C, that is, we need to prove the following conditions are satisfied: **3.1)** S’c is a linear subspace of V**3.2)** S ⊆ S’c**3.3)** L’c is well defined**3.4)** L’c(v) ≤ p(v) ∀ v∈S’c**3.5)** L’c(v) = l(v) ∀ v∈S**3.6)** L’c is a linear functional defined on S’c**3.7)** (S’[i], L’[i]) ≤ (S’c, L’c) ∀(S’[i], L’[i]) ∈ P

**3.1)** S’c is a linear subspace of V:

∀ f, g ∈ S’c, d∈ℝ

⇒ f∈S’[i] for some (S’[i], L’[i]), g∈S’[j] for some (S’[j], L’[j])

⇒ f, g∈S’[j] **[ Suppose (S’[i], L’[i]) ≤ (S’[j], L’[j]) as C is totally ordered ]**

⇒ f+g∈S’[j] ⊆ ⋃ S’[i] **[ Since S’[j] is a linear subspace ]**

⇒ f+g∈S’c **[ Same result when (S’[j], L’[j]) ≤ (S’[i], L’[i]) ]**

∀ f∈S’c, d∈ℝ

⇒ f∈S’[i] for some (S’[i], L’[i])

⇒ d·f∈S’[i] **[ Since S’[i] is a linear subspace ]**

⇒ d·f∈S’c

**3.2)** ∀ S’[k], S ⊆ S’[k] ⊆⋃ S’[i]

⇒ S ⊆ S’c

**3.3)** L’c is well defined:

Suppose v∈S’[i] and v∈S’[j]

⇒ (S’[i], L’[i]) ≤ (S’[j], L’[j]) ⇒ L’[j](v)=L’[i](v) **[ Since C is totally ordered ]**

⇒ L’c(v)=L’[j](v)=L’[i](v) **[ Same result when (S’[j], L’[j]) ≤ (S’[i], L’[i]) ]**

**3.4)** ∀ v∈S’c ⇒ v∈S’[i] for some (S’[i], L’[i]) ⇒ L’c(v)=L’[i](v) ≤ p(v)

**3.5)** ∀ v∈S ⇒ v∈S’[i] for some (S’[i], L’[i]) ⇒ L’c(v) = L’[i](v) = l(v)

**3.6)** L’c is linear functional:

∀ f, g ∈ S’c

⇒ f∈S’[i] for some (S’[i], L’[i]), g∈S’[j] for some (S’[j], L’[j])

⇒ f, g ∈S’[j] **[ Suppose (S’[i], L’[i]) ≤ (S’[j], L’[j]) as C is totally ordered ]**

⇒ L’c(f) + L’c(g) = L’[i](f)+L’[j](g) = L’[j](f)+L’[j](g) = L’[j](f+g) = L’c(f+g) **[ Since f+g ∈** **S’[j] ]**

∀f ∈ S’c, d∈ℝ

⇒ f ∈ S’[i] for some (S’[i], L’[i])

⇒ L’c(d·f) = L’[i](d·f) = d·L’[i](f) = d·L’c(f)

**3.7)** (S’[i], L’[i]) ≤ (S’c, L’c) ∀(S’[i], L’[i]) ∈ P:

∀S’[k] ⇒ S’[k] ⊆⋃ S’[i]

∀v∈S’[i] ⇒ L’c(v) = L’[i](v) **[ By the definition of L’c ]**

Therefore, every non-empty chain has an upper bound in P. By the **Zorn’s lemma, **set P = (Ω, ≤) contains at least one maximal element (Smax, Lmax). If Smax = V, then Hahn-Banach Theorem will be proven.

**4)** Now, suppose Smax ≠ V

⇒ Smax ⊂ V ⇒ ∃z∈V\Smax

⇒ By 1), there exists set S’ which is an extended from Smax by z and extended linear functional L’ defined on S’

⇒ ∃(S’, L’) ∈ Ω s.t. (Smax, Lmax) ≤ (S’, L’)

⇒ contradicting (Smax, Lmax) is the maximal element

Reference:

“https://en.wikipedia.org/wiki/Vector_space”

“https://en.wikipedia.org/wiki/Linear_subspace”

“https://en.wikipedia.org/wiki/Functional_analysis”

“https://en.wikipedia.org/wiki/Hahn_Banach_theorem”

“https://www.proofwiki.org/wiki/Definition:Linear_Functional”

“https://en.wikipedia.org/wiki/Infimum_and_supremum”