Proofs involving the Moore–Penrose inverse

Let A be an m-by-n matrix over a field 𝕂, where 𝕂 is either the field ℝ of real numbers or the field ℂ of complex numbers. Then there is a unique n-by-m matrix A⁺ over 𝕂 such that:

AA⁺A = A,
A⁺AA⁺ = A⁺,
(AA⁺)^* = AA⁺,
(A⁺A)^* = A⁺A.

A⁺ is called the Moore-Penrose inverse of A. Notice that A is also the Moore-Penrose inverse of A⁺. That is, (A⁺)⁺ = A.
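As an informal numerical illustration (not part of the proofs), the four criteria can be checked for a concrete matrix. The sketch below assumes NumPy is available; the helper name `satisfies_moore_penrose` is hypothetical, and `numpy.linalg.pinv` supplies the candidate pseudoinverse. The explicit `rcond` cutoff is an assumption chosen so that the numerically zero singular value of the rank-deficient example is discarded.

```python
import numpy as np

def satisfies_moore_penrose(A, X):
    """Return True if X satisfies the four Moore-Penrose criteria for A (up to rounding)."""
    AX = A @ X
    XA = X @ A
    return bool(
        np.allclose(A @ X @ A, A)            # A A+ A = A
        and np.allclose(X @ A @ X, X)        # A+ A A+ = A+
        and np.allclose(AX.conj().T, AX)     # (A A+)* = A A+
        and np.allclose(XA.conj().T, XA)     # (A+ A)* = A+ A
    )

# A rank-deficient 3-by-2 complex matrix (second column is 2i times the first).
A = np.array([[1, 2j], [2, 4j], [0, 0]], dtype=complex)
A_plus = np.linalg.pinv(A, rcond=1e-12)

ok_forward = satisfies_moore_penrose(A, A_plus)   # A+ is the pseudoinverse of A
ok_reverse = satisfies_moore_penrose(A_plus, A)   # A is the pseudoinverse of A+
```

Because the criteria are symmetric under swapping A and A⁺, the same helper checks both directions.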

Useful lemmas

These results are used in the proofs below. In the following lemmas, A is a matrix with complex elements and n columns, B is a matrix with complex elements and n rows.

Lemma 1: A*A = 0 ⇒ A = 0

The assumption says that all elements of A*A are zero. Therefore,

$$0 = \operatorname{Tr}(A^*A) = \sum_{j=1}^n (A^*A)_{jj} = \sum_{j=1}^n \sum_{i=1}^m (A^*)_{ji}A_{ij}=\sum_{i=1}^m \sum_{j=1}^n |A_{ij}|^2$$

Therefore, all A_ij equal 0, i.e. A = 0.
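The trace identity used in Lemma 1 can be spot-checked numerically. This is only a sketch, assuming NumPy and an arbitrary random 3-by-4 complex matrix; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))

trace_form = np.trace(A.conj().T @ A)    # Tr(A* A)
entry_form = np.sum(np.abs(A) ** 2)      # sum over i, j of |A_ij|^2
```

Both quantities equal the squared Frobenius norm of A, which vanishes only when every entry of A is zero.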

Lemma 2: A*AB = 0 ⇒ AB = 0

$\begin{alignat}{3} & 0\,&& = A^*AB &&&\\ \Rightarrow\,& 0\,&& = B^*A^*AB &&& \\ \Rightarrow\,& 0\,&& = (AB)^*(AB) &&& \\ \Rightarrow\,& 0\,&& = AB &&& (\text{by Lemma 1}) \end{alignat}$

Lemma 3: ABB* = 0 ⇒ AB = 0

This is proved in a manner similar to the argument of Lemma 2 (or by simply taking the Hermitian conjugate).

Existence and uniqueness

Proof of uniqueness

Suppose that B and C are two n-by-m matrices over 𝕂 satisfying the Moore-Penrose criteria. Observe then that

AB = ACAB = (AC)^*(AB)^* = C^*(ABA)^* = C^*A^* = (AC)^* = AC.

Analogously we conclude that BA = CA. The proof is completed by observing that then

B = BAB = BAC = CAC = C.

Proof of existence

The proof proceeds in stages.

1-by-1 matrices

For any x ∈ 𝕂, we define $x^+ := \begin{cases} x^{-1}, & \mbox{if }x \neq 0 \\ 0, & \mbox{if }x = 0 \end{cases}$

It is easy to see that x⁺ is a pseudoinverse of x (interpreted as a 1-by-1 matrix).

Square diagonal matrices

Let D be an n-by-n matrix over 𝕂 with zeros off the diagonal. We define D⁺ as an n-by-n matrix over 𝕂 with (D⁺)_ij := (D_ij)⁺ as defined above. We write simply D_ij⁺ for (D⁺)_ij = (D_ij)⁺.

Notice that D⁺ is also a matrix with zeros off the diagonal.

We now show that D⁺ is a pseudoinverse of D:

(DD⁺D)_ij = D_ijD_ij⁺D_ij = D_ij ⇒ DD⁺D = D
(D⁺DD⁺)_ij = D_ij⁺D_ijD_ij⁺ = D_ij⁺ ⇒ D⁺DD⁺ = D⁺
$(DD^+)^*_{ij} = \overline{(DD^+)_{ji}} = \overline{D_{ji}D^+_{ji}} = (D_{ji}D^+_{ji})^* = D_{ji}D^+_{ji} = D_{ij}D^+_{ij} \Rightarrow (DD^+)^* = DD^+$
$(D^+D)^*_{ij} = \overline{(D^+D)_{ji}} = \overline{D^+_{ji}D_{ji}} = (D^+_{ji}D_{ji})^* = D^+_{ji}D_{ji} = D^+_{ij}D_{ij} \Rightarrow (D^+D)^* = D^+D$

General non-square diagonal matrices

Let D be an m-by-n matrix over 𝕂 with zeros off the main diagonal, where m and n are unequal. That is, D_ij = d_i for some d_i ∈ 𝕂 when i = j and D_ij = 0 otherwise.

Consider the case where n > m. Then we can rewrite D = [D₀ 0_{m × (n−m)}] by stacking where D₀ is a square diagonal m-by-m matrix, and 0_{m × (n−m)} is the m-by-(n-m) zero matrix. We define $D^+\equiv\begin{bmatrix}D_0^+\\\mathbf{0}_{(n-m)\times m}\end{bmatrix}$ as an n-by-m matrix over 𝕂, with D₀⁺ the pseudoinverse of D₀ defined above, and 0_{(n−m) × m} the (n-m)-by-m zero matrix. We now show that D⁺ is a pseudoinverse of D:

1. By multiplication of block matrices, DD⁺ = D₀D₀⁺ + 0_{m × (n−m)}0_{(n−m) × m} = D₀D₀⁺, so by property 1 for square diagonal matrices D₀ proven in the previous section, DD⁺D = D₀D₀⁺[D₀ 0_{m × (n−m)}] = [D₀D₀⁺D₀ 0_{m × (n−m)}] = [D₀ 0_{m × (n−m)}] = D.
2. Similarly, $D^+D=\begin{bmatrix}D_0^+D_0 & \mathbf{0}_{m\times(n-m)}\\\mathbf{0}_{(n-m)\times m} & \mathbf{0}_{(n-m)\times(n-m)}\end{bmatrix}$, so by property 2 for square diagonal matrices, $D^+DD^+=\begin{bmatrix}D_0^+D_0 & \mathbf{0}_{m\times(n-m)}\\ \mathbf{0}_{(n-m)\times m} & \mathbf{0}_{(n-m)\times(n-m)}\end{bmatrix}\begin{bmatrix}D_0^+\\\mathbf{0}_{(n-m)\times m}\end{bmatrix}=\begin{bmatrix}D_0^+D_0D_0^+\\\mathbf{0}_{(n-m)\times m}\end{bmatrix}=D^+.$
3. By 1 and property 3 for square diagonal matrices, (DD⁺)^* = (D₀D₀⁺)^* = D₀D₀⁺ = DD⁺.
4. By 2 and property 4 for square diagonal matrices, $(D^+D)^*=\begin{bmatrix}(D_0^+D_0)^* & \mathbf{0}_{m\times(n-m)}\\\mathbf{0}_{(n-m)\times m} & \mathbf{0}_{(n-m)\times(n-m)}\end{bmatrix}=\begin{bmatrix}D_0^+D_0 & \mathbf{0}_{m\times(n-m)}\\\mathbf{0}_{(n-m)\times m} & \mathbf{0}_{(n-m)\times(n-m)}\end{bmatrix}=D^+D.$

Existence for D such that m > n follows by swapping the roles of D and D⁺ in the n > m case and using the fact that (D⁺)⁺ = D.
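The diagonal constructions above can be sketched in code as follows. This is a minimal illustration assuming NumPy; the helper names `scalar_pinv` and `diagonal_pinv` are hypothetical, and the input is assumed to already have zeros off the main diagonal.

```python
import numpy as np

def scalar_pinv(x):
    """Pseudoinverse of a 1-by-1 matrix: 1/x if x is nonzero, otherwise 0."""
    return 1.0 / x if x != 0 else 0.0

def diagonal_pinv(D):
    """Pseudoinverse of an m-by-n matrix D with zeros off the main diagonal.

    The result is n-by-m, with each diagonal entry replaced by its scalar
    pseudoinverse and all other entries zero.
    """
    m, n = D.shape
    D_plus = np.zeros((n, m), dtype=D.dtype)
    for i in range(min(m, n)):
        D_plus[i, i] = scalar_pinv(D[i, i])
    return D_plus

# A 2-by-3 example (the n > m case) with one zero diagonal entry.
D = np.array([[4.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
D_plus = diagonal_pinv(D)
```

Applying `diagonal_pinv` to `D_plus` returns `D` again, which mirrors the use of (D⁺)⁺ = D to handle the m > n case.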

Arbitrary matrices

The singular value decomposition theorem states that there exists a factorization of the form

A = UΣV^* where:

U is an m-by-m unitary matrix over 𝕂.

Σ is an m-by-n matrix over 𝕂 with nonnegative real numbers on the diagonal and zeros off the diagonal.

V is an n-by-n unitary matrix over 𝕂.

Define A⁺ as VΣ⁺U^*.

We now show that A⁺ is a pseudoinverse of A:

AA⁺A = UΣV^*VΣ⁺U^*UΣV^* = UΣΣ⁺ΣV^* = UΣV^* = A
A⁺AA⁺ = VΣ⁺U^*UΣV^*VΣ⁺U^* = VΣ⁺ΣΣ⁺U^* = VΣ⁺U^* = A⁺
(AA⁺)^* = (UΣV^*VΣ⁺U^*)^* = (UΣΣ⁺U^*)^* = U(ΣΣ⁺)^*U^* = U(ΣΣ⁺)U^* = UΣV^*VΣ⁺U^* = AA⁺
(A⁺A)^* = (VΣ⁺U^*UΣV^*)^* = (VΣ⁺ΣV^*)^* = V(Σ⁺Σ)^*V^* = V(Σ⁺Σ)V^* = VΣ⁺U^*UΣV^* = A⁺A
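The construction A⁺ = VΣ⁺U^* translates directly into a short routine. The following is a sketch under stated assumptions: it uses NumPy's `numpy.linalg.svd`, the function name `pinv_via_svd` is hypothetical, and the tolerance `tol` for treating a singular value as zero is an arbitrary choice made for floating-point arithmetic rather than part of the proof.

```python
import numpy as np

def pinv_via_svd(A, tol=1e-12):
    """Build A+ = V Sigma+ U* from a full singular value decomposition of A."""
    m, n = A.shape
    U, s, Vh = np.linalg.svd(A, full_matrices=True)   # A = U Sigma V*, Vh = V*
    Sigma_plus = np.zeros((n, m))
    for i, sigma in enumerate(s):
        if sigma > tol:                 # scalar pseudoinverse of each diagonal entry
            Sigma_plus[i, i] = 1.0 / sigma
    return Vh.conj().T @ Sigma_plus @ U.conj().T

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
A_plus = pinv_via_svd(A)

# A rank-one example whose pseudoinverse is known in closed form: R+ = R^T / 25.
R = np.array([[1.0, 2.0], [2.0, 4.0]])
R_plus = pinv_via_svd(R)
```

For the random matrix, the result agrees with `numpy.linalg.pinv` up to rounding.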

Basic properties

(A^*)⁺ = A^{+^*}

Here A^{+^*} denotes (A⁺)^*. The proof works by showing that A^{+^*} satisfies the four criteria for the pseudoinverse of A^*. Since this amounts to just substitution, it is not shown here.

The proof of this relation is given as Exercise 1.18c in Ben-Israel & Greville.

Identities

A⁺ = A⁺ A^{+^*} A^*

A⁺ = A⁺AA⁺ and AA⁺ = (AA⁺)^* imply that A⁺ = A⁺(AA⁺)^* = A⁺A^{+^*}A^*.

A⁺ = A^* A^{+^*} A⁺

A⁺ = A⁺AA⁺ and A⁺A = (A⁺A)^* imply that A⁺ = (A⁺A)^*A⁺ = A^*A^{+^*}A⁺.

A = A^{+^*} A^* A

A = AA⁺A and AA⁺ = (AA⁺)^* imply that A = (AA⁺)^*A = A^{+^*}A^*A.

A = A A^* A^{+^*}

A = AA⁺A and A⁺A = (A⁺A)^* imply that A = A(A⁺A)^* = AA^*A^{+^*}.

A^* = A^* A A⁺

This is the conjugate transpose of A = A^{+^*}A^*A above.

A^* = A⁺ A A^*

This is the conjugate transpose of A = AA^*A^{+^*} above.
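The six identities of this section can be spot-checked numerically. The sketch below assumes NumPy and uses a randomly generated rank-deficient complex matrix; the dictionary name `identities` and the explicit `rcond` cutoff passed to `numpy.linalg.pinv` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
# A 4-by-3 complex matrix of rank 2, built as a product of thin factors.
left = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
right = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
A = left @ right

Ap = np.linalg.pinv(A, rcond=1e-10)   # A+
Ah = A.conj().T                       # A*
Aph = Ap.conj().T                     # (A+)*

identities = {
    "A+ = A+ (A+)* A*": np.allclose(Ap, Ap @ Aph @ Ah),
    "A+ = A* (A+)* A+": np.allclose(Ap, Ah @ Aph @ Ap),
    "A = (A+)* A* A": np.allclose(A, Aph @ Ah @ A),
    "A = A A* (A+)*": np.allclose(A, A @ Ah @ Aph),
    "A* = A* A A+": np.allclose(Ah, Ah @ A @ Ap),
    "A* = A+ A A*": np.allclose(Ah, Ap @ A @ Ah),
}
```

A numerical check of this kind is no substitute for the derivations, but it is a quick way to catch a misplaced adjoint when transcribing the identities.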

Reduction to the Hermitian case

The results of this section show that the computation of the pseudoinverse is reducible to its construction in the Hermitian case. It suffices to show that the putative constructions satisfy the defining criteria.

A⁺ = A^* (A A^*)⁺

This relation is given as exercise 18(d) in Ben-Israel & Greville, for the reader to prove, "for every matrix A". Write D = A^*(AA^*)⁺. Observe that

$\begin{alignat}{3} & AA^*\, &&= AA^*(A A^*)^+ AA^* &&&\\ \Leftrightarrow\,& AA^*\,&&= ADAA^* &&& \\ \Leftrightarrow\,& \quad 0\,&&=(AD-I)AA^* &&& \\ \Leftrightarrow\,& \quad 0\,&&=ADA-A &&& (\text{by Lemma 3}) \\ \Leftrightarrow\,& \quad \!A\,&& = ADA &&& \end{alignat}$

Similarly, (AA^*)⁺AA^*(AA^*)⁺ = (AA^*)⁺ implies that A^*(AA^*)⁺AA^*(AA^*)⁺ = A^*(AA^*)⁺ i.e. DAD = D.

Additionally, AD = AA^*(AA^*)⁺ so AD = (AD)^*.

Finally, DA = A^*(AA^*)⁺A implies that (DA)^* = A^*((AA^*)⁺)^*A = A^*((AA^*)⁺)A = DA.

Therefore D = A⁺.

A⁺ = (A^* A)⁺ A^*

This is proved in an analogous manner to the case above, using Lemma 2 instead of Lemma 3.
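Both reductions can be illustrated numerically. This sketch assumes NumPy and a random 3-by-5 complex matrix, for which A^*A is a singular 5-by-5 Hermitian matrix; the explicit `rcond` cutoff is an assumption needed so that its numerically zero eigenvalues are discarded.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 5)) + 1j * rng.standard_normal((3, 5))
Ah = A.conj().T

A_plus = np.linalg.pinv(A)
via_AAh = Ah @ np.linalg.pinv(A @ Ah, rcond=1e-10)   # A* (A A*)+
via_AhA = np.linalg.pinv(Ah @ A, rcond=1e-10) @ Ah   # (A* A)+ A*
```

Both `via_AAh` and `via_AhA` agree with `A_plus` up to rounding, so only pseudoinverses of Hermitian matrices are needed.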

Products

For the first three proofs, we consider products C = AB.

A has orthonormal columns

If A has orthonormal columns i.e. A^*A = I then A⁺ = A^*. Write D = B⁺A⁺ = B⁺A^*. We show that D satisfies the Moore-Penrose criteria.

CDC = ABB⁺A^*AB = ABB⁺B = AB = C,

DCD = B⁺A^*ABB⁺A^* = B⁺BB⁺A^* = B⁺A^* = D,

(CD)^* = D^*B^*A^* = A(B⁺)^*B^*A^* = A(BB⁺)^*A^* = ABB⁺A^* = CD,

(DC)^* = B^*A^*D^* = B^*A^*A(B⁺)^* = (B⁺B)^* = B⁺B = B⁺A^*AB = DC.

Therefore D = C⁺.

B has orthonormal rows

If B has orthonormal rows i.e. BB^* = I then B⁺ = B^*. Write D = B⁺A⁺ = B^*A⁺. We show that D satisfies the Moore-Penrose criteria.

CDC = ABB^*A⁺AB = AA⁺AB = AB = C,

DCD = B^*A⁺ABB^*A⁺ = B^*A⁺AA⁺ = B^*A⁺ = D,

(CD)^* = D^*B^*A^* = (A⁺)^*BB^*A^* = (A⁺)^*A^* = (AA⁺)^* = AA⁺ = ABB^*A⁺ = CD,

(DC)^* = B^*A^*D^* = B^*A^*(A⁺)^*B = B^*(A⁺A)^*B = B^*A⁺AB = DC.

Therefore D = C⁺.

A has full column rank and B has full row rank

Since A has full column rank, A^*A is invertible so (A^*A)⁺ = (A^*A)⁻¹. Similarly, since B has full row rank, BB^* is invertible so (BB^*)⁺ = (BB^*)⁻¹.

Write D = B⁺A⁺ = B^*(BB^*)⁻¹(A^*A)⁻¹A^*. We show that D satisfies the Moore-Penrose criteria.

CDC = ABB^*(BB^*)⁻¹(A^*A)⁻¹A^*AB = AB = C,

DCD = B^*(BB^*)⁻¹(A^*A)⁻¹A^*ABB^*(BB^*)⁻¹(A^*A)⁻¹A^* = B^*(BB^*)⁻¹(A^*A)⁻¹A^* = D,

CD = ABB^*(BB^*)⁻¹(A^*A)⁻¹A^* = A(A^*A)⁻¹A^* = (A(A^*A)⁻¹A^*)^* ⇒ (CD)^* = CD,

DC = B^*(BB^*)⁻¹(A^*A)⁻¹A^*AB = B^*(BB^*)⁻¹B = (B^*(BB^*)⁻¹B)^* ⇒ (DC)^* = DC.

Therefore D = C⁺.
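A numerical sketch of the full-rank case, together with a small example showing that (AB)⁺ = B⁺A⁺ can fail when the rank hypotheses are dropped. It assumes NumPy; the random matrices have the required ranks only almost surely, and the names `A2` and `B2` for the counterexample are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))   # 5-by-3, full column rank (almost surely)
B = rng.standard_normal((3, 4))   # 3-by-4, full row rank (almost surely)

# AB is 5-by-4 with rank 3, so an explicit cutoff discards the zero singular value.
pinv_of_AB = np.linalg.pinv(A @ B, rcond=1e-10)
B_plus_A_plus = np.linalg.pinv(B) @ np.linalg.pinv(A)
full_rank_case_holds = bool(np.allclose(pinv_of_AB, B_plus_A_plus))

# Counterexample: A2 lacks full column rank and B2 lacks full row rank.
A2 = np.array([[1.0, 1.0]])       # 1-by-2
B2 = np.array([[1.0], [0.0]])     # 2-by-1
pinv_of_product = np.linalg.pinv(A2 @ B2)                    # (A2 B2)+ = [[1.0]]
product_of_pinvs = np.linalg.pinv(B2) @ np.linalg.pinv(A2)   # B2+ A2+ = [[0.5]]
```

In the counterexample, A2 B2 is the 1-by-1 matrix [1], whose pseudoinverse is 1, while B2⁺ = [1 0] and A2⁺ = [1/2 1/2]^T multiply to 1/2.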

Conjugate transpose

Here, B = A^*, and thus C = AA^* and D = A^{+^*}A⁺. We show that indeed D satisfies the four Moore-Penrose criteria.

CDC = AA^*A^{+^*}A⁺AA^* = A(A⁺A)^*A⁺AA^* = AA⁺AA⁺AA^* = AA⁺AA^* = AA^* = C

DCD = A^{+^*}A⁺AA^*A^{+^*}A⁺ = A^{+^*}A⁺A(A⁺A)^*A⁺ = A^{+^*}A⁺AA⁺AA⁺ = A^{+^*}A⁺AA⁺ = A^{+^*}A⁺ = D

(CD)^* = (AA^*A^{+^*}A⁺)^* = A^{+^*}A⁺AA^* = A^{+^*}(A⁺A)^*A^* = A^{+^*}A^*A^{+^*}A^* = (AA⁺)^*(AA⁺)^* = AA⁺AA⁺ = A(A⁺A)^*A⁺ = AA^*A^{+^*}A⁺ = CD

(DC)^* = (A^{+^*}A⁺AA^*)^* = AA^*A^{+^*}A⁺ = A(A⁺A)^*A⁺ = AA⁺AA⁺ = (AA⁺)^*(AA⁺)^* = A^{+^*}A^*A^{+^*}A^* = A^{+^*}(A⁺A)^*A^* = A^{+^*}A⁺AA^* = DC

Therefore D = C⁺. In other words:

(AA^*)⁺ = A^{+^*}A⁺ and, since (A^*)^* = A,

(A^*A)⁺ = A⁺A^{+^*}
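The two resulting formulas can be checked on an example. This is a sketch assuming NumPy and a random 3-by-2 complex matrix, so that AA^* is a singular 3-by-3 matrix; the `rcond` cutoff is an assumption for handling its numerically zero eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))
Ah = A.conj().T
A_plus = np.linalg.pinv(A)
A_plus_h = A_plus.conj().T            # (A+)*

lhs_outer = np.linalg.pinv(A @ Ah, rcond=1e-10)   # (A A*)+
rhs_outer = A_plus_h @ A_plus                     # (A+)* A+

lhs_inner = np.linalg.pinv(Ah @ A)                # (A* A)+
rhs_inner = A_plus @ A_plus_h                     # A+ (A+)*
```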

Projectors and subspaces

Define P = AA⁺ and Q = A⁺A. Observe that P² = AA⁺AA⁺ = AA⁺ = P. Similarly Q² = Q, and finally, P = P^* and Q = Q^*. Thus P and Q are orthogonal projection operators. Orthogonality follows from the relations P = P^* and Q = Q^*. Indeed, consider the operator P: any vector decomposes as

x = Px + (I−P)x

and for all vectors x and y satisfying Px = x and (I−P)y = y, we have

x^*y = (Px)^*(I−P)y = x^*P^*(I−P)y = x^*P(I−P)y = 0.

It follows that PA = AA⁺A = A and A⁺P = A⁺AA⁺ = A⁺. Similarly, QA⁺ = A⁺ and AQ = A. The orthogonal components are now readily identified.

If y belongs to the range of A then for some x, y = Ax and Py = PAx = Ax = y. Conversely, if Py = y then y = AA⁺y so that y belongs to the range of A. It follows that P is the orthogonal projector onto the range of A. I − P is then the orthogonal projector onto the orthogonal complement of the range of A, which equals the kernel of A^*.

A similar argument using the relation QA^* = A^* establishes that Q is the orthogonal projector onto the range of A^* and (I−Q) is the orthogonal projector onto the kernel of A.

Using the relations P(A⁺)^* = P^*(A⁺)^* = (A⁺P)^* = (A⁺)^* and P = P^* = (A⁺)^*A^* it follows that the range of P equals the range of (A⁺)^*, which in turn implies that the range of I − P equals the kernel of A⁺. Similarly QA⁺ = A⁺ implies that the range of Q equals the range of A⁺. Therefore, we find,

$$\begin{alignat}{2} \operatorname{Ker}(A^+) &= \operatorname{Ker}(A^*). \\ \operatorname{Im}(A^+) &= \operatorname{Im}(A^*). \\ \end{alignat}$$
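The projector properties derived in this section lend themselves to a numerical sanity check. The sketch below assumes NumPy and a random real 4-by-3 matrix of rank 2; the dictionary name `checks` is illustrative, and the `rcond` cutoff is an assumption for discarding the numerically zero singular value.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))   # 4-by-3, rank 2
A_plus = np.linalg.pinv(A, rcond=1e-10)

P = A @ A_plus      # orthogonal projector onto the range of A
Q = A_plus @ A      # orthogonal projector onto the range of A*
I_m, I_n = np.eye(4), np.eye(3)

checks = {
    "P is idempotent": np.allclose(P @ P, P),
    "P is Hermitian": np.allclose(P.conj().T, P),
    "Q is idempotent": np.allclose(Q @ Q, Q),
    "Q is Hermitian": np.allclose(Q.conj().T, Q),
    "range of I - P lies in Ker(A*)": np.allclose(A.conj().T @ (I_m - P), 0),
    "range of I - P lies in Ker(A+)": np.allclose(A_plus @ (I_m - P), 0),
    "range of I - Q lies in Ker(A)": np.allclose(A @ (I_n - Q), 0),
}

# The trace of an orthogonal projector equals the dimension of its range.
rank_P = int(round(np.trace(P)))
rank_Q = int(round(np.trace(Q)))
```

Here both traces equal the rank of A, consistent with P and Q projecting onto the ranges of A and A^* respectively.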

Additional properties

Least-squares minimization

It is shown here that, for any m × n matrix A and any vector x, ∥Ax − b∥₂ ≥ ∥Az − b∥₂, where z = A⁺b. This lower bound need not be zero as the system Ax = b may not have a solution (e.g. when the matrix A does not have full rank or the system is overdetermined).

To prove this, we first note that (stating the complex case), using the fact that P = AA⁺ satisfies PA = A and P = P^*, we have

$$\begin{alignat}{2} A^*(Az - b) & = A^*(A A^+ b - b)\\ & = A^*(P b - b) \\ & = A^*P^* b - A^*b \\ & = (PA)^* b - A^*b \\ & = 0 \end{alignat}$$ so that (c.c. stands for the complex conjugate of the previous term in the following)

$$\begin{alignat}{2} \|Ax -b\|_2^2 &= \|Az -b\|_2^2 + (A(x-z))^*(Az-b) + \text{c.c.} + \|A(x - z)\|_2^2 \\ &= \|Az -b\|_2^2 + (x-z)^*A^*(Az-b) + \text{c.c.} + \|A(x - z)\|_2^2 \\ &= \|Az -b\|_2^2 + \|A(x - z)\|_2^2\\ & \ge \|Az -b\|_2^2 \end{alignat}$$ as claimed.

If A is injective i.e. one-to-one (which implies m ≥ n), then the bound is attained uniquely at z.
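The least-squares property can be illustrated on a small overdetermined system. This is a sketch assuming NumPy, with random real data; the comparison against `numpy.linalg.lstsq` and the random perturbations are illustrative checks, not part of the proof.

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 3))   # overdetermined: 6 equations, 3 unknowns
b = rng.standard_normal(6)

z = np.linalg.pinv(A) @ b
residual_z = np.linalg.norm(A @ z - b)

# Orthogonality used in the proof: A*(Az - b) = 0.
normal_equation_residual = np.linalg.norm(A.T @ (A @ z - b))

# No perturbed candidate attains a smaller residual than z.
candidates = [z + 0.1 * rng.standard_normal(3) for _ in range(100)]
never_better = all(np.linalg.norm(A @ x - b) >= residual_z for x in candidates)

# The same minimizer as returned by NumPy's least-squares solver.
z_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
```

Since this A is injective (almost surely, for random entries), the minimizer is unique and the two computed solutions coincide up to rounding.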

Minimum-norm solution to a linear system

The proof above also shows that if the system Ax = b is satisfiable i.e. has a solution, then necessarily z = A⁺b is a solution (not necessarily unique). We show here that z is the smallest such solution (its Euclidean norm is uniquely minimum).

To see this, note first, with Q = A⁺A, that Qz = A⁺AA⁺b = A⁺b = z and that Q^* = Q. Therefore, assuming that Ax = b, we have

$$\begin{alignat}{2} z^*(x-z) & = (Qz)^*(x-z)\\ &=z^*Q(x-z)\\ &=z^*(A^+ A x - z) \\ &=z^*(A^+ b - z) \\ &=0. \end{alignat}$$

Thus

$$\begin{alignat}{2} \|x\|_2^2 &= \|z\|_2^2 + 2\operatorname{Re}\bigl(z^*(x-z)\bigr) + \|x-z\|_2^2 \\ &= \|z\|_2^2 + \|x-z\|_2^2 \\ &\ge \|z\|_2^2 \end{alignat}$$ with equality if and only if x = z, as was to be shown.
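A small underdetermined system illustrates the minimum-norm property. The sketch assumes NumPy and random real data; other solutions are generated as z + (I − Q)w, which solve the system because A(I − Q) = 0.

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((2, 4))   # underdetermined; full row rank almost surely
b = rng.standard_normal(2)

A_plus = np.linalg.pinv(A)
z = A_plus @ b                    # candidate minimum-norm solution
Q = A_plus @ A                    # orthogonal projector onto the range of A*

# Other solutions of Ax = b, obtained by adding kernel components (I - Q) w.
others = [z + (np.eye(4) - Q) @ rng.standard_normal(4) for _ in range(100)]

z_solves = bool(np.allclose(A @ z, b))
all_solve = all(np.allclose(A @ x, b) for x in others)
z_is_shortest = all(np.linalg.norm(x) >= np.linalg.norm(z) for x in others)
```

Each alternative solution differs from z by a vector orthogonal to z, so its norm can only be larger.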
