What is convolution? A brief explanation.
Updated: Aug 27, 2019
As convolutional neural networks have grown popular, the term “convolution” has become familiar to many people. But what exactly is convolution? This article answers that question for readers who want to expand their mathematical knowledge.
The definition of convolution
If you have two functions, f(x) and g(x), and you’d like to generate a third function from them, there are several operations to choose from. For instance, function composition produces a new function equal to f(g(x)). Similarly, “convolution” is a mathematical operation that generates a new function out of two existing functions. It is defined as follows:
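For the continuous case:
$$(f * g)(x) = \int_{-\infty}^{\infty} f(\tau)\, g(x - \tau)\, d\tau$$
For the discrete case (discrete convolution):
$$(f * g)[n] = \sum_{m=-\infty}^{\infty} f[m]\, g[n - m]$$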
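The discrete definition can be tried out directly. The following is a minimal sketch in plain Python, assuming both sequences are finite and equal to zero outside their index range; the function name `convolve` is chosen here for illustration and is not part of the original article.

```python
def convolve(f, g):
    """Full discrete convolution of two finite sequences.

    Computes (f * g)[n] = sum over m of f[m] * g[n - m],
    treating both sequences as zero outside their index range.
    """
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for m, fm in enumerate(f):
        for k, gk in enumerate(g):
            # f[m] * g[k] contributes to output index n = m + k,
            # i.e. f[m] is paired with g[n - m].
            out[m + k] += fm * gk
    return out


result = convolve([1, 2, 3], [1, 1])
print(result)  # [1, 3, 5, 3]
```

Swapping the two arguments gives the same result, which reflects the fact that convolution is commutative.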
We now know the definition of “convolution”, but what does it really mean in the real world? The next section discusses this question.
The physical meaning of math
Before we go further with convolution, we’d like to propose a concept: in many cases, a mathematical equation does not have a single, ultimate physical meaning. Take “1+1 = 2” as an example: the equation can mean many things. For instance, one orange plus one orange equals two oranges, or one person plus one person equals two people. Note that these are situations where “1+1 = 2” is applicable, but there are also situations where “1+1 = 2” does not hold. For example, one drop of water plus another drop does not necessarily equal two drops of water (they may merge into a larger drop). In other words, any situation that matches the description “1+1 = 2” is a potential physical meaning of the equation, so our interpretation of the equation changes depending on which situation we’re referring to.
Similarly, “convolution” can be understood in many ways, depending on the area it’s applied to. In the rest of this article, we introduce two important applications of convolution, one in signal processing and one in image processing. As you are about to see, their physical meanings are quite distinct.
Example of application 01 – Signal Processing
Suppose you have a linear time-invariant (LTI) system that turns an input signal, described by the function x(t), into an output signal, described by y(t) (y(t) is also known as the system’s response to x(t)). The response y(t) is then equal to the convolution of the input signal with a special kind of response called the impulse response. In other words, writing h(t) for the impulse response:
$$y(t) = x(t) * h(t) = \int_{-\infty}^{\infty} x(\tau)\, h(t - \tau)\, d\tau$$
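The discrete-time version of this statement can be checked numerically. The sketch below assumes a hypothetical LTI system (a two-point averager, chosen only for illustration and not taken from the article): it measures the system’s impulse response by feeding in a unit impulse, then compares the system’s direct output with the convolution of the input and that impulse response.

```python
def convolve(x, h):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            out[i + j] += xi * hj
    return out


def system(x):
    """Hypothetical LTI system: y[n] = 0.5 * x[n] + 0.5 * x[n - 1]."""
    # Pad with a zero on each side so x[-1] and x[len(x)] read as 0.
    padded = [0.0] + list(x) + [0.0]
    return [0.5 * padded[n] + 0.5 * padded[n - 1] for n in range(1, len(padded))]


# Impulse response: the system's output for a unit impulse.
h = system([1.0])
print(h)  # [0.5, 0.5]

x = [2.0, 4.0, 6.0]
direct = system(x)             # output computed by the system itself
via_convolution = convolve(x, h)  # output predicted from the impulse response
print(direct)           # [1.0, 3.0, 5.0, 3.0]
print(via_convolution)  # [1.0, 3.0, 5.0, 3.0]
```

The two outputs agree, which is what the relation between the response and the impulse response predicts for this particular system.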
To understand why this relation holds, some background is necessary: (1) what an impulse and an impulse response are, (2) what the sifting property is, and (3) what a linear time-invariant system is. These topics are discussed below:
(1) What are an impulse and an impulse response?
If you have a linear time-invariant system that processes signals (i.e., it takes in an input signal and generates a corresponding response signal), how can you gain a full understanding of the system?
Ideally, we could feed signals of all possible frequencies into the system at the same time and observe the response it produces. Having observed how the system reacts to every frequency, we would fully understand how it behaves.
An impulse, represented by the function δ (the Dirac delta function, which will be introduced later), is exactly the kind of signal described above, and it allows us to fully understand an LTI system. As we can see from Figure 2 (a), δ’s value remains 1 over the entire frequency domain, meaning that it contains all frequencies in equal measure. Since these frequencies are all applied to the system at the same time, it’s natural to expect that δ in the time domain is nonzero only at a single time point (in this case, t = 0), where its value goes to infinity (as shown in Figure 2 (b)). Also notice that after we apply an impulse to an LTI system, the system produces a corresponding response; this response is called the impulse response (as shown in Figure 2 (c)).
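The claim that an impulse contains every frequency equally has a simple discrete-time analogue. The sketch below assumes a discrete unit impulse (value 1 at index 0, rather than the infinitely tall continuous δ) and uses a direct, unoptimized discrete Fourier transform written for illustration; the names `dft` and `impulse` are not from the article.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a finite sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# Discrete unit impulse of length 8: nonzero only at index 0.
impulse = [1.0] + [0.0] * 7

spectrum = dft(impulse)
magnitudes = [abs(c) for c in spectrum]
print(magnitudes)  # every frequency bin has magnitude 1
```

Every frequency bin comes out with the same magnitude, mirroring the flat spectrum described for Figure 2 (a).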
As mentioned previously, an impulse can be described by a special function called the Dirac delta function (denoted by “δ”), which is defined as follows:
$$\delta(t) = \begin{cases} \infty, & t = 0 \\ 0, & t \neq 0 \end{cases}$$
This function has many intriguing properties. Two that are particularly important for the discussion that follows are the sifting property (introduced in the next section) and the fact that the “area” under the graph of δ(t), which is concentrated at t = 0, is equal to 1. Since the geometric meaning of an integral is the area under the curve of a function, one can anticipate that (see Figure 3):
$$\int_{-\infty}^{\infty} \delta(t)\, dt = 1$$
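One way to build intuition for the unit area is to approximate δ by a rectangular pulse of width w and height 1/w, then let w shrink. This is a common approximation rather than the article’s own construction; the helper names `pulse` and `integrate` below are hypothetical, and the integral is estimated with a simple midpoint Riemann sum.

```python
def pulse(t, width):
    """Rectangular pulse centered at 0 with the given width and height 1/width."""
    return 1.0 / width if abs(t) < width / 2 else 0.0


def integrate(func, a, b, steps):
    """Midpoint Riemann sum of func over [a, b]."""
    dt = (b - a) / steps
    return sum(func(a + (i + 0.5) * dt) for i in range(steps)) * dt


areas = {}
for width in (1.0, 0.1, 0.01):
    # The pulse gets narrower and taller, but its area stays the same.
    areas[width] = integrate(lambda t: pulse(t, width), -1.0, 1.0, 20_000)
    print(f"width={width}, height={1.0 / width}, area={areas[width]:.6f}")
```

As the width decreases the height grows without bound, yet the estimated area stays at 1, which is the behavior the integral of δ captures in the limit.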
(2) What is the sifting property (note that it’s “sifting”, not “shifting”)?
Put simply, given that x(t) is any function and δ stands for the Dirac delta function, the sifting property states that:
$$\int_{-\infty}^{\infty} x(\tau)\, \delta(t - \tau)\, d\tau = x(t)$$
Figure 4 illustrates why this equation holds.
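The sifting property also has a discrete counterpart that is easy to verify by hand or in code. The sketch below assumes the discrete unit impulse (1 at index 0, 0 elsewhere) in place of the continuous δ, and replaces the integral with a sum; the names `delta` and `sift` are chosen here for illustration.

```python
def delta(n):
    """Discrete unit impulse: 1 at n == 0, otherwise 0."""
    return 1 if n == 0 else 0


def sift(x, n):
    """Evaluate the sum over k of x[k] * delta(n - k)."""
    # delta(n - k) is zero for every k except k == n,
    # so the sum "sifts out" the single value x[n].
    return sum(x[k] * delta(n - k) for k in range(len(x)))


x = [3, 1, 4, 1, 5, 9]
recovered = [sift(x, n) for n in range(len(x))]
print(recovered)  # [3, 1, 4, 1, 5, 9]
```

Each sum keeps only the one term where the impulse is nonzero, so the original sequence is recovered value by value.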