Digging into the Conjugation in Complex Inner Products
On the apparent inconsistency between the complex inner product and Hermitian quadratic forms in where the conjugate is placed.
There is a phenomenon in linear algebra textbooks that is difficult to understand at first glance.
By common mathematical convention, the standard inner product $\langle x, y \rangle$ on the $n$-dimensional complex space $\mathbb{C}^n$ is defined as:

$$\langle x, y \rangle = \sum_{i=1}^{n} x_i \overline{y_i}$$

At the same time, a Hermitian quadratic form is defined as:

$$f(x) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} \overline{x_i} x_j$$

These two definitions, which are supposed to be closely related, take conjugates on different elements: for the inner product it is the latter, $y_i$; for the quadratic form it is the former, $x_i$.
It is of course possible to use another set of definitions, where the terms of $\langle x, y \rangle$ and $f(x)$ each take the conjugate on the other element, but the “inconsistency” seen above is still present.
After wrestling with this strange question for a long time, I had some wonderful thoughts, which are recorded in detail here.
Unless otherwise specified, we establish the following convention:
- $x_i$, $y_i$ stand for the $i$-th components of the vectors $x$ and $y$ in $\mathbb{C}^n$, respectively;
- $a_{ij}$ stands for the element at the $i$-th row and the $j$-th column of the positive definite Hermitian matrix $A$.
The Simple Explanation
In fact, $\langle x, y \rangle = y^H I x$, and $f(x) = x^H A x$ (where $^H$ denotes the conjugate transpose); hence they are both specialisations of the generalised quadratic form $y^H A x$.
As for the difference in the “position” of the conjugation, it is merely an illusion created by the way it is written. It is easy to see this in the following form.
$$\langle x, y \rangle = \sum_{i=1}^{n} \sum_{j=1}^{n} \delta_{ij} \overline{y_i} x_j, \qquad f(x) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} \overline{x_i} x_j$$

(where the Kronecker delta $\delta_{ij}$ equals the element at the $i$-th row and the $j$-th column of the identity matrix $I$.)
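The claim that both definitions are special cases of one form can be checked numerically. The following is a minimal sketch in plain Python, assuming the conjugate sits on the second argument of the inner product as described above; the function names and the sample matrix and vectors are made up for illustration.

```python
def standard_inner(x, y):
    """Standard inner product: sum of x_i * conj(y_i)."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))


def quadratic_form(A, x):
    """Hermitian quadratic form: sum over i, j of a_ij * conj(x_i) * x_j."""
    n = len(x)
    return sum(A[i][j] * x[i].conjugate() * x[j]
               for i in range(n) for j in range(n))


def generalised_form(A, x, y):
    """Generalised form y^H A x: sum over i, j of a_ij * conj(y_i) * x_j."""
    n = len(x)
    return sum(A[i][j] * y[i].conjugate() * x[j]
               for i in range(n) for j in range(n))


# Illustrative data (assumed, not from the text).
identity = [[1, 0], [0, 1]]
A = [[2, 1j], [-1j, 3]]      # Hermitian: A[1][0] is the conjugate of A[0][1]
x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]

# With A = I the generalised form is the standard inner product.
print(generalised_form(identity, x, y) == standard_inner(x, y))
# With y = x the generalised form is the quadratic form.
print(generalised_form(A, x, x) == quadratic_form(A, x))
```

Both comparisons hold for this data; the entries are small integers, so the floating-point arithmetic is exact here.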
The Detailed Explanation
The form $\langle x, y \rangle = \sum_{i=1}^{n} x_i \overline{y_i}$ is not randomly chosen. There is a more fundamental reason behind the mathematicians’ preference for it. [The physics convention seems to be the opposite, but the reason for this (the Dirac notation of quantum mechanics) is irrelevant to our topic here.]
Why Is There the Conjugate and Conjugate Symmetry?
First, why is there a conjugate in the standard inner product?
Looking back at the dot product on real space, it is essentially a generalisation of the (squared) Euclidean norm $\lVert x \rVert^2 = \sum_{i=1}^{n} x_i^2$ to the case of two vectors.
On the complex space there needs to be a similar operation satisfying positive definiteness: $\langle x, x \rangle \ge 0$, with equality only when $x = 0$. In the case of one dimension, the squared modulus of a complex number, $|z|^2 = z\overline{z}$, is clearly a first choice. Generalising it to $n$ dimensions, the definition $\langle x, x \rangle = \sum_{i=1}^{n} x_i \overline{x_i}$ arises. The conjugation comes from here.
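A small counterexample shows why the conjugate is needed for positive definiteness. This is a sketch in plain Python; the function names and the sample vector are chosen only for illustration.

```python
def square_without_conjugate(x):
    """Real-style 'norm squared' carried over naively: sum of x_i * x_i."""
    return sum(xi * xi for xi in x)


def square_with_conjugate(x):
    """Sum of x_i * conj(x_i), i.e. the sum of squared moduli |x_i|^2."""
    return sum(xi * xi.conjugate() for xi in x)


x = [1 + 0j, 1j]   # a nonzero vector in C^2

# Without the conjugate, 1*1 + i*i = 1 - 1, so a nonzero vector gets "length" zero.
print(square_without_conjugate(x) == 0)
# With the conjugate, |1|^2 + |i|^2 = 2, which is positive as required.
print(square_with_conjugate(x) == 2)
```

Without the conjugate the result need not even be real, so it cannot serve as a length at all.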
From another perspective, why does the inner product satisfy the conjugate symmetry $\langle x, y \rangle = \overline{\langle y, x \rangle}$, rather than the plain symmetry $\langle x, y \rangle = \langle y, x \rangle$?
Let us return to the standard inner product and try to separate the real part of the expression from the imaginary part.
Writing $x_k = a_k + b_k\mathrm{i}$ and $y_k = c_k + d_k\mathrm{i}$ with real $a_k, b_k, c_k, d_k$:

$$\langle x, y \rangle = \sum_{k=1}^{n} (a_k + b_k\mathrm{i})(c_k - d_k\mathrm{i}) = \sum_{k=1}^{n} (a_k c_k + b_k d_k) + \mathrm{i} \sum_{k=1}^{n} (b_k c_k - a_k d_k)$$

It can be observed that the real part $\sum_{k} (a_k c_k + b_k d_k)$ is the sum of dot products on $\mathbb{R}^2$, while the imaginary part $\sum_{k} (b_k c_k - a_k d_k)$ equals the sum of (scalar) cross products $y_k \times x_k$ on $\mathbb{R}^2$, giving the “rotation angle in the $n$-dimensional complex space from $y$ to $x$”.

For instance, when the arguments (polar angles) of each component are equal for both vectors, $\operatorname{Im}\langle x, y \rangle = 0$; when the argument of each component of $x$ is that of the corresponding component of $y$ rotated counterclockwise by $90$ degrees, $\operatorname{Im}\langle x, y \rangle$ reaches its maximal possible value $\sum_{k} |x_k| |y_k|$.
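The split into real and imaginary parts can be verified directly. Below is a minimal sketch in plain Python, assuming each complex component is read as a vector in the plane; `dot2` and `cross2` are hypothetical helper names, and the vectors are arbitrary sample data.

```python
def standard_inner(x, y):
    """Standard inner product with the conjugate on the second argument."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))


def dot2(u, v):
    """Dot product of two complex numbers viewed as vectors in R^2."""
    return u.real * v.real + u.imag * v.imag


def cross2(u, v):
    """Scalar (z-component) cross product u x v of two vectors in R^2."""
    return u.real * v.imag - u.imag * v.real


x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
ip = standard_inner(x, y)

# Real part: componentwise dot products. Imaginary part: cross products y_k x x_k.
print(ip.real == sum(dot2(xk, yk) for xk, yk in zip(x, y)))
print(ip.imag == sum(cross2(yk, xk) for xk, yk in zip(x, y)))

# Rotating every component of y by 90 degrees (multiplying by i) gives a
# purely imaginary inner product of size sum |x_k| * |y_k|.
y2 = [2 + 0j, 1j]
x2 = [1j * yk for yk in y2]
ip2 = standard_inner(x2, y2)
print(ip2.real == 0, ip2.imag == sum(abs(a) * abs(b) for a, b in zip(x2, y2)))
```

Swapping the two arguments flips the sign of every cross product and leaves the dot products unchanged, which is exactly conjugate symmetry.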
And in the more general definition of the inner product, it is natural to expect the real and imaginary parts of the result to have corresponding properties respectively. Conjugate symmetry is derived from the anticommutative property of the cross product, or more generally, the anticommutative property for “rotation angles in complex spaces”.
Seen this way, conjugate symmetry is indeed a natural property to ask for.
Why Is the Conjugate Taken on the Second Vector?
To solve this problem, one has to explore another layer of the nature of inner products, and it is never wrong to start with the simplest dot product. What is the nature of the dot product?
We can of course say that it represents “the projection of $x$ on $y$, multiplied by the length of $y$”, but this is not deep enough.
3Blue1Brown explains one viewpoint in depth in a video: the dot product $x \cdot y$ is the application to $x$ of a linear transformation on $\mathbb{R}^n$ defined by $y$. This transformation turns any vector $v$ into the scalar value $v \cdot y$.
From this perspective, an inner product function and a vector together also determine a transformation. That is to say, an “inner product” is an operator that maps vectors onto linear transformations. More abstractly, this is an instance of currying: a binary function $g(x, y)$ can be seen as an operator that maps an argument $y$ to a unary function, $x \mapsto g(x, y)$.
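The currying view translates directly into code. Here is a minimal sketch in plain Python, assuming the standard inner product with the conjugate on the second argument; `inner_with` is a hypothetical name, and the vectors and scalar are arbitrary sample data.

```python
def inner_with(y):
    """Curried inner product: fix y and return the map x -> <x, y>."""
    def functional(x):
        return sum(xi * yi.conjugate() for xi, yi in zip(x, y))
    return functional


y = [2 - 1j, 1j]
f_y = inner_with(y)          # a function of x alone

u = [1 + 2j, 3 - 1j]
v = [1j, 2 + 0j]
c = 2 - 3j

# The returned function is linear in its argument:
# f_y(c*u + v) equals c*f_y(u) + f_y(v).
combined = [c * ui + vi for ui, vi in zip(u, v)]
print(f_y(combined) == c * f_y(u) + f_y(v))
```

The scalar `c` passes through unconjugated because the conjugation was absorbed into the function when `y` was fixed.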
A positive definite Hermitian matrix $A$ defines such an operator, which maps a vector $y$ onto a transformation whose matrix in the standard basis is $y^H A$. Applying the transformation to $x$ results in the previously seen form $y^H A x$. This is actually a general form of inner products on $\mathbb{C}^n$ (the Hermitian form).
Hence, the reason why the conjugate is taken for $y$ is that we expect $x$, as the element being transformed, to keep its original form, with all computations put inside the linear transformation $y^H A$. As for why $x$ is written before $y$, it’s probably because it’s more intuitive to first specify the element being transformed and then specify the transformation in this binary operation.
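The asymmetry between the two arguments can be made concrete. The sketch below, in plain Python, builds the row vector $y^H A$ first and only then applies it to $x$; the function names and the sample Hermitian matrix are assumptions made for illustration.

```python
def functional_from(A, y):
    """Coefficients of the row vector y^H A: entry j is sum_i conj(y_i) * a_ij."""
    n = len(y)
    return [sum(y[i].conjugate() * A[i][j] for i in range(n))
            for j in range(n)]


def apply_functional(row, x):
    """Apply the row vector to x with no conjugation: sum_j row_j * x_j."""
    return sum(rj * xj for rj, xj in zip(row, x))


A = [[2, 1j], [-1j, 3]]      # sample Hermitian matrix (assumed)
x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
c = 2 - 3j

row = functional_from(A, y)
value = apply_functional(row, x)

# Scaling x scales the result by c: x is used in its original form.
print(apply_functional(row, [c * xj for xj in x]) == c * value)
# Scaling y scales the result by conj(c): the conjugation lives in the transformation.
scaled_row = functional_from(A, [c * yi for yi in y])
print(apply_functional(scaled_row, x) == c.conjugate() * value)
```

All conjugation happens inside `functional_from`, so the map applied to `x` is an ordinary linear one.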