Today, machine learning algorithms are used to identify and detect patterns in audio, images and videos, all of which can be represented as signals that vary over time and space. Time domain analysis represents a signal as a function of time, i.e. it shows how the amplitude of the signal changes over time; a pure tone, for example, appears as a sinusoidal wave. The number of independent variables needed to fully describe a signal is referred to as its degrees of freedom. In time domain analysis, the degrees of freedom correspond to the number of samples in the signal.
When signals are used in AI applications, it is often more useful to analyze them with respect to frequency rather than time, which is the purpose of frequency domain analysis. The frequency domain is used in image processing, speech recognition, feature extraction for machine learning, and time series analysis to identify trends and patterns in data.
The frequency domain representation of a time signal is a mathematical description of the signal in terms of its frequency components. In other words, it is a way of analyzing a signal by examining the different frequencies that make it up; a pure sinusoid appears as a distinct impulse at its frequency. In the frequency domain, a signal is represented as a sum of sinusoidal waves with different frequencies, amplitudes, and phases. In frequency domain analysis, the degrees of freedom are related to the number of frequency components that make up the signal. The relative strengths of these frequency components can reveal important information about the signal, such as its bandwidth, dominant frequencies, and harmonic content. Key design parameters of a frequency domain analysis include the sampling rate, the windowing function, the signal length and the choice of Fourier transform algorithm.
Time and frequency are inversely related: a signal that is short in time is spread over a wide range of frequencies, while a signal confined to a narrow band of frequencies extends over a long time. When the desired signal and the undesired signal occupy different frequency bands, they are separable in the frequency domain and a filter can be used to reject the undesired signal. Analyzing signals in the frequency domain can also be more computationally efficient than analyzing them in the time domain, because operations such as filtering, convolution, and correlation reduce to element-wise multiplication of spectra.
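The efficiency argument rests on the convolution theorem: convolving two sequences in the time domain gives the same result as multiplying their spectra. The following is a minimal sketch in Python with NumPy; the function name `fft_convolve`, the random test signal and the 25-tap moving-average filter are illustrative assumptions, not part of the original text.

```python
import numpy as np

def fft_convolve(x, h):
    """Linear convolution of two 1-D sequences computed via the FFT."""
    n = len(x) + len(h) - 1          # length of the full linear convolution
    X = np.fft.rfft(x, n=n)          # rfft zero-pads x to length n
    H = np.fft.rfft(h, n=n)          # rfft zero-pads h to length n
    return np.fft.irfft(X * H, n=n)  # multiply spectra, then transform back

rng = np.random.default_rng(0)
x = rng.standard_normal(500)   # arbitrary test signal (assumption)
h = np.ones(25) / 25           # 25-tap moving-average filter (assumption)

direct = np.convolve(x, h)     # time domain convolution
via_fft = fft_convolve(x, h)   # frequency domain multiplication
```

Both results agree to within floating-point error. For short filters the direct convolution is usually fast enough; the FFT route tends to pay off as the filter grows longer.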
How Does Frequency Domain Work?
Suppose we have three signals of different frequencies and different amplitudes as shown below. When such signals are added together and observed from a time domain perspective, it is difficult to tell how many frequency components are present. The frequency domain gives a different view of the same signal.
A signal can be converted from the time domain to the frequency domain via mathematical operators called transforms. The Fourier transform decomposes a signal into its constituent frequencies together with their amplitudes and phases, and so reveals the dominant frequencies that add up to give the signal. This information is typically displayed using a graph called a frequency spectrum, which shows the strength of each frequency component in the signal, with frequency on the x-axis and amplitude on the y-axis. Interestingly, the human auditory system performs a comparable decomposition: different regions of the cochlea in the inner ear respond to different frequencies.
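As a rough illustration of this decomposition, the sketch below builds a signal from three sinusoids and recovers their frequencies and amplitudes with NumPy. The sampling rate, the three frequencies (5, 40 and 120 Hz), the amplitudes and the helper name `dominant_frequencies` are all assumptions chosen for the example.

```python
import numpy as np

def dominant_frequencies(signal, sample_rate, count):
    """Return the `count` strongest frequencies (Hz) and their amplitudes."""
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    # Scale so that a sinusoid of amplitude A appears with height A
    # (valid for bins other than DC and the Nyquist frequency).
    amplitudes = 2.0 * np.abs(spectrum) / n
    strongest = np.argsort(amplitudes)[-count:][::-1]
    return freqs[strongest], amplitudes[strongest]

sample_rate = 1000                       # Hz (assumption)
n = 1000                                 # one second of samples
t = np.arange(n) / sample_rate
signal = (1.00 * np.sin(2 * np.pi * 5 * t)
          + 0.50 * np.sin(2 * np.pi * 40 * t)
          + 0.25 * np.sin(2 * np.pi * 120 * t))

freqs, amps = dominant_frequencies(signal, sample_rate, count=3)
```

Here `freqs` holds 5, 40 and 120 Hz and `amps` holds 1.0, 0.5 and 0.25. The three components are hard to tell apart in a time plot of `signal`, but they show up as three separate peaks in the spectrum.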
The choice of algorithm affects the computational efficiency of frequency domain analysis. For sampled data, the transform being computed is the Discrete Fourier Transform (DFT). Evaluating the DFT directly from its definition takes on the order of N² operations for N samples, whereas the Fast Fourier Transform (FFT) is a family of algorithms that computes the same result in on the order of N log N operations. The Cooley-Tukey algorithm is the most widely used FFT algorithm.
When analyzing signals in the frequency domain, it is important to consider the effects of different types of noise on the signal.
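To make the relationship between the DFT and the FFT concrete, the sketch below evaluates the DFT directly from its definition and compares it with NumPy's FFT. The function name `naive_dft` and the random 256-sample input are assumptions for illustration only.

```python
import numpy as np

def naive_dft(x):
    """Direct O(N^2) evaluation of the DFT definition."""
    x = np.asarray(x, dtype=complex)
    n = len(x)
    k = np.arange(n).reshape(-1, 1)           # output frequency index
    m = np.arange(n).reshape(1, -1)           # input sample index
    basis = np.exp(-2j * np.pi * k * m / n)   # N x N matrix of complex sinusoids
    return basis @ x

rng = np.random.default_rng(0)
x = rng.standard_normal(256)

# The FFT computes the same transform, only faster.
max_difference = np.max(np.abs(naive_dft(x) - np.fft.fft(x)))
```

The two results differ only by floating-point rounding. The FFT is a faster way of computing the DFT, not a different transform.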
Input noise
Input noise refers to the noise that is present in the original signal, before any processing has occurred. Input noise can mask frequency components of a signal, which reduces the quality and accuracy of the frequency domain analysis. In the frequency domain, input noise raises the noise floor, which is the level of noise present across all frequencies. Its effect can be characterized using statistical measures such as the signal-to-noise ratio (SNR), which compares the strength of the signal to the strength of the noise. A low SNR indicates that the noise level is high compared to the signal level, which can make it difficult to distinguish the signal from the noise. One way to mitigate the effects of input noise is to use signal processing techniques, such as filtering or averaging, to remove or reduce the noise. A measure such as the SNR can also serve as an objective function: by evaluating it in the frequency domain, it is possible to optimize the design of a signal or of its processing in order to achieve a desired level of performance.
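The SNR and the benefit of averaging can be illustrated with a short sketch. It assumes that the clean signal is known (which is only true in simulation), that the noise is white Gaussian with a standard deviation of 0.1, and that 16 repeated measurements are available; the function name `snr_db` is hypothetical.

```python
import numpy as np

def snr_db(clean, noisy):
    """Signal-to-noise ratio in decibels, given the clean reference signal."""
    noise = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))

rng = np.random.default_rng(42)
sample_rate = 1000                         # Hz (assumption)
n = 10000
t = np.arange(n) / sample_rate
clean = np.sin(2 * np.pi * 10 * t)         # 10 Hz tone, power 0.5

noise_std = 0.1                            # assumed noise level
trials = clean + noise_std * rng.standard_normal((16, n))   # 16 noisy repeats

single_snr = snr_db(clean, trials[0])              # one noisy measurement
averaged_snr = snr_db(clean, trials.mean(axis=0))  # average of 16 measurements
```

With these assumptions a single measurement has an SNR of roughly 17 dB. Averaging 16 independent measurements reduces the noise power by a factor of about 16, which is a gain of about 12 dB.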
Additive noise
Additive noise is noise that is added to the signal, for example during transmission or processing. In the frequency domain, its effects can be analyzed using spectral density analysis: when the noise is uncorrelated with the signal, the power spectral density of the noisy signal is the sum of the power spectral density of the signal and that of the noise. The power spectral density represents the distribution of power as a function of frequency, and it can be used to identify the frequency components that are most affected by the noise.
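The additivity of power spectral densities can be checked numerically. The sketch below uses a simple periodogram as the PSD estimate; the helper name `periodogram`, the 64 Hz tone and the noise level are assumptions, and in practice an averaged estimator such as Welch's method would usually be preferred.

```python
import numpy as np

def periodogram(x, sample_rate):
    """One-sided periodogram PSD estimate (assumes len(x) is even)."""
    n = len(x)
    psd = np.abs(np.fft.rfft(x)) ** 2 / (sample_rate * n)
    psd[1:-1] *= 2.0   # fold negative frequencies into the positive ones
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    return freqs, psd

sample_rate = 1024                           # Hz (assumption)
n = 4096
t = np.arange(n) / sample_rate
rng = np.random.default_rng(1)
signal = np.sin(2 * np.pi * 64 * t)          # 64 Hz tone, power 0.5
noise = 0.2 * rng.standard_normal(n)         # white noise, power about 0.04

freqs, psd_signal = periodogram(signal, sample_rate)
_, psd_noise = periodogram(noise, sample_rate)
_, psd_noisy = periodogram(signal + noise, sample_rate)

df = sample_rate / n                         # width of one frequency bin
power_signal = psd_signal.sum() * df         # integrate PSD to get total power
power_noise = psd_noise.sum() * df
power_noisy = psd_noisy.sum() * df
```

Integrated over frequency, `power_noisy` is close to `power_signal + power_noise`. The small remaining difference comes from the finite-length cross term between signal and noise, which vanishes on average when the two are uncorrelated.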
Physiological noise
Physiological noise is a type of noise encountered in neuroimaging. It refers to fluctuations in the MR signal that are related to physiological processes such as the cardiac and respiratory cycles. This noise can affect the quality of functional and structural MR images and can interfere with the detection of brain activity. One approach to mitigating its effects is to analyze the MR signal in the frequency domain, decomposing the signal into its frequency components and examining the power spectrum at different frequencies. In particular, physiological noise tends to have strong power in the low-frequency range, typically below 0.1 Hz, and can be separated from the signal of interest using frequency filtering techniques.
Background noise
Background noise refers to the noise that is present in the environment. Its effect in the frequency domain can be characterized using statistical measures, such as the root mean square value or the power spectral density, and can be mitigated using spectral filtering or spectral smoothing.
The baseline signal consists of the low-frequency components found at the lower end of the spectrum. It can affect the interpretation and analysis of the higher frequency components of the signal. For example, in physiological signals such as electroencephalography (EEG) or electrocardiography (ECG), the baseline signal represents the background electrical activity of the brain or heart, respectively. This background activity can obscure the higher frequency components, such as the specific brain or heart activity of interest. To mitigate the effects of the baseline signal, techniques such as high-pass filtering or baseline correction can be used. High-pass filtering removes or attenuates the low-frequency components of the signal while retaining the higher frequency components of interest. Baseline correction subtracts an estimate of the baseline signal from the original signal to isolate the higher frequency components of interest.
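High-pass filtering in the frequency domain can be sketched in a few lines. The example below assumes a synthetic recording made of a 10 Hz component of interest, a slow 0.2 Hz drift and a constant offset; the function name `highpass_fft` and the 1 Hz cutoff are illustrative choices.

```python
import numpy as np

def highpass_fft(signal, sample_rate, cutoff_hz):
    """Remove all frequency components below cutoff_hz (brick-wall filter)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[freqs < cutoff_hz] = 0.0        # zero out the baseline components
    return np.fft.irfft(spectrum, n=len(signal))

sample_rate = 100                            # Hz (assumption)
n = 1000                                     # ten seconds of samples
t = np.arange(n) / sample_rate
activity = np.sin(2 * np.pi * 10 * t)        # component of interest
baseline = 1.5 + 2.0 * np.sin(2 * np.pi * 0.2 * t)   # offset plus slow drift
recorded = activity + baseline

filtered = highpass_fft(recorded, sample_rate, cutoff_hz=1.0)
```

In this idealized case `filtered` matches `activity` almost exactly, because every component falls on an exact frequency bin. On real recordings an abrupt cutoff like this can cause ringing, so filters with a smoother transition are generally used.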
Let’s consider an example of image sharpening. To sharpen an image using the Fourier transform, we first apply the transform to the image. We then filter the image in the frequency domain by multiplying its Fourier transform by a filter function that emphasizes high-frequency components, which correspond to edges and fine details. Finally, we apply the inverse Fourier transform to the filtered result to obtain the sharpened image in the spatial domain. The result is an image that appears clearer and more defined, with enhanced edges and details. However, sharpening also amplifies high-frequency noise and can introduce artifacts, so the filter function and its control parameters must be chosen carefully to achieve the desired result. The procedure consists of the following steps:
1. Multiply the input image by (-1)^(x+y) to center the transform
2. Compute F(u,v), the DFT of input
3. Multiply F(u,v) by a filter H(u,v)
4. Compute the inverse DFT of step 3
5. Obtain the real part of step 4
6. Multiply the result obtained in step 5 by (-1)^(x+y)
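The six steps above can be sketched with NumPy as follows. The filter H(u,v) used here is a Gaussian high-frequency emphasis filter, which is only one possible choice; the function name `sharpen_frequency_domain` and the parameters `boost` and `cutoff` are assumptions. The centering trick in steps 1 and 6 is exact only for images with even dimensions, which the sketch assumes.

```python
import numpy as np

def sharpen_frequency_domain(image, boost=1.0, cutoff=0.1):
    """Sharpen a 2-D grayscale image with even dimensions in six steps."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    y, x = np.indices((rows, cols))
    sign = (-1.0) ** (x + y)

    centered = image * sign                 # step 1: center the transform
    F = np.fft.fft2(centered)               # step 2: DFT of the input

    # Step 3: H = 1 + boost * (1 - Gaussian low-pass), centered on the DC term.
    u = (np.arange(rows) - rows // 2) / rows
    v = (np.arange(cols) - cols // 2) / cols
    dist2 = u[:, None] ** 2 + v[None, :] ** 2
    H = 1.0 + boost * (1.0 - np.exp(-dist2 / (2.0 * cutoff ** 2)))
    G = F * H

    g = np.real(np.fft.ifft2(G))            # steps 4 and 5: inverse DFT, real part
    return g * sign                         # step 6: undo the centering

# Hypothetical 8x8 test image containing a single vertical edge.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
sharpened = sharpen_frequency_domain(image, boost=1.0)
```

Because H equals 1 at zero frequency, the mean brightness of the image is preserved, while every other frequency is amplified, which increases the contrast around the edge. Setting `boost=0` leaves the image unchanged.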
Frequency domain filtering can reduce computational overhead. For large filter kernels, it is faster to perform a 2D Fourier transform, a filter multiplication and an inverse transform than to perform the equivalent convolution in the spatial domain. It also gives control over the whole image, so that different characteristics of the image can be enhanced or suppressed easily. The idea of blurring an image by reducing its high-frequency components, or sharpening it by increasing the magnitude of its high-frequency components, is easy to understand. The underlying principle is Fourier’s theorem, which states that any reasonably well-behaved function that periodically repeats itself can be expressed as a sum of sines and cosines of different frequencies and amplitudes.
Frequency Domain Features (FDF)
Frequency domain features are specific characteristics of a signal or system that can be extracted from its frequency domain representation. These features are used in signal processing, machine learning, and other fields to analyze and classify signals based on their frequency content. In machine learning, discriminative features are those that are most relevant for distinguishing between different classes or categories of signals. The frequency domain is a common source of discriminative features, as it provides information about the energy present at different frequencies in a signal. For example, in speech recognition, different phonemes are characterized by different frequency patterns, and these patterns can be extracted as discriminative features. Similarly, in image processing, the frequency domain can be used to identify distinctive spatial patterns that distinguish between different objects or scenes. A common technique is to apply the Fourier transform or another frequency domain transform to obtain a set of frequency components, and then to apply feature selection or feature extraction algorithms to identify the most relevant features for a given classification task.
Frequency domain features include:
Power spectral density (PSD): Measures the distribution of power across different frequency bands in a signal and is used to analyze the frequency content of a signal and to identify dominant frequencies.
Spectral centroid: Calculates the center of mass of a signal’s frequency spectrum, which provides information about the average frequency of the signal.
Spectral flatness: Measures the degree to which a signal’s power is spread evenly across its frequency spectrum. Signals with a higher spectral flatness have more even power distribution across different frequencies.
Spectral entropy: Measures the degree of randomness or unpredictability in a signal’s frequency spectrum. Signals with a higher spectral entropy have a more complex frequency content.
Harmonic ratio: Measures the degree to which a signal’s energy is concentrated in harmonics, which are integer multiples of a fundamental frequency. Signals with a higher harmonic ratio have more of their energy in harmonic components.
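Three of the features listed above can be computed directly from the power spectrum. The following is a minimal sketch; the function name `spectral_features`, the small constant added to avoid taking the logarithm of zero, and the normalization of the entropy to the range 0 to 1 are assumptions, and published definitions vary in such details.

```python
import numpy as np

def spectral_features(signal, sample_rate):
    """Spectral centroid, flatness and normalized entropy of a 1-D signal."""
    power = np.abs(np.fft.rfft(signal)) ** 2 + 1e-12   # floor avoids log(0)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    p = power / power.sum()                 # spectrum as a probability distribution

    centroid = np.sum(freqs * p)            # power-weighted mean frequency
    flatness = np.exp(np.mean(np.log(power))) / np.mean(power)   # geometric / arithmetic mean
    entropy = -np.sum(p * np.log2(p)) / np.log2(len(p))          # scaled to 0..1
    return {"centroid_hz": centroid, "flatness": flatness, "entropy": entropy}

sample_rate = 1000                          # Hz (assumption)
t = np.arange(1000) / sample_rate
rng = np.random.default_rng(0)

tone_features = spectral_features(np.sin(2 * np.pi * 50 * t), sample_rate)
noise_features = spectral_features(rng.standard_normal(1000), sample_rate)
```

A pure 50 Hz tone has its centroid at 50 Hz and flatness and entropy near zero, since all of its power sits in one bin. White noise spreads its power across the spectrum, so its flatness and entropy are much higher.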
Node details
A node refers to a computational block or module that extracts specific features from a signal in the frequency domain. A feature is a measurable property of a signal that can be used to characterize or differentiate it from other signals. Some features that can be extracted include spectral centroid, spectral bandwidth, spectral roll-off, spectral flux, and mel-frequency cepstral coefficients (MFCCs). These features can be used in various applications such as speech recognition, music classification, and biomedical signal analysis.
Input ports
An input port is a point at which a signal enters a system or a device in the frequency domain. The input signal must first be transformed from the time domain to the frequency domain using a technique such as the Fourier transform or the Laplace transform. Once the signal is in the frequency domain, it can be analyzed, processed or modified using various frequency domain techniques.
Output ports
A frequency domain output port is a point at which a signal exits a system or device in the frequency domain. For example, in a filter, the frequency domain output port is where the filtered signal is obtained in the frequency domain after the input signal has been processed by the filter.
Extension
Extension refers to the process of extending the frequency domain representation of a signal, typically to sample its spectrum on a finer frequency grid. One technique is zero-padding, which adds zeros to the end of a signal to increase its length before the Fourier transform is taken. Zero-padding reduces the spacing between frequency bins and so produces a smoother, interpolated spectrum, but it does not add new information: the ability to separate two closely spaced frequencies is still set by the duration of the original signal, and the highest frequency that can be analyzed is still set by the sampling rate. Another technique is interpolation, which estimates the values of the spectrum at intermediate frequencies based on its known values at discrete frequencies.
Deep learning algorithms use artificial neural networks. An artificial neural network is made up of multiple processing units called nodes or neurons. The nodes are organized into layers, and the layers are connected to each other by weights. The number of nodes in a given layer depends partly on where in the network the layer resides, partly on the data that the layer will process, and partly on the design choices of the network architect.
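The effect of zero-padding on the frequency grid can be seen in a small experiment. The sketch assumes a 10.5 Hz tone sampled at 100 Hz for one second, so that the true frequency falls halfway between two bins of the unpadded transform; the helper name `peak_frequency` is hypothetical.

```python
import numpy as np

def peak_frequency(x, sample_rate, n_fft):
    """Frequency of the largest spectral peak and the bin spacing, in Hz."""
    spectrum = np.abs(np.fft.rfft(x, n=n_fft))   # n_fft > len(x) zero-pads x
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)], freqs[1] - freqs[0]

sample_rate = 100                        # Hz (assumption)
n = 100                                  # one second of samples
t = np.arange(n) / sample_rate
signal = np.sin(2 * np.pi * 10.5 * t)    # tone halfway between two bins

coarse_peak, coarse_spacing = peak_frequency(signal, sample_rate, n_fft=n)
fine_peak, fine_spacing = peak_frequency(signal, sample_rate, n_fft=10 * n)
```

Without padding the bins are 1 Hz apart and the peak lands on a neighboring bin, 0.5 Hz away from the true frequency. With tenfold zero-padding the bins are 0.1 Hz apart and the peak is located much closer to 10.5 Hz. The width of the spectral peak itself is unchanged, since it depends on the one-second duration of the signal.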
For the input layer, the number of nodes is directly determined by the number of input features for the single sample that will be passed as input to the network.
In the illustration below, the neural network has an input layer with two nodes, which indicates that the input data has two features. For example, each sample in a dataset could represent an individual person with two features, such as the height and weight of the person. The height and weight are passed as input to the network, so the two input features are represented as two nodes in the input layer. If the network is used for classification, the number of nodes in the output layer has to equal the number of output classes. For the hidden layers, there is more freedom in choosing the number of nodes.
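The way layer sizes follow from the data can be shown with a tiny network written in NumPy. This is only a sketch with random, untrained weights: the hidden layer size of 4, the three output classes and the sample values are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs = 2     # fixed by the data: height and weight
n_hidden = 4     # free design choice (assumption)
n_classes = 3    # fixed by the task: hypothetical number of output classes

# Untrained, randomly initialized weights connecting the layers.
W1 = rng.standard_normal((n_inputs, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.standard_normal((n_hidden, n_classes))
b2 = np.zeros(n_classes)

def forward(x):
    """Forward pass: ReLU hidden layer followed by a softmax output layer."""
    hidden = np.maximum(0.0, x @ W1 + b1)
    logits = hidden @ W2 + b2
    exp = np.exp(logits - logits.max())     # subtract max for numerical stability
    return exp / exp.sum()

sample = np.array([1.75, 68.0])             # hypothetical height (m) and weight (kg)
probabilities = forward(sample)
```

The shapes of `W1` and `W2` are determined by the sizes of the layers they connect, and the output contains one probability per class.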
Consider the example of a Convolutional Neural Network (CNN), a type of artificial neural network that is popular for analysing images. Each node in a convolutional layer applies a filter to the image. Because convolution in the spatial domain corresponds to multiplication in the frequency domain, each filter can also be viewed as a frequency domain filter that carries out a specific task in the image processing pipeline. The frequency content of an image refers to the rate at which the gray levels change across space. Rapidly changing brightness values correspond to high-frequency terms, and slowly changing brightness values correspond to low-frequency terms. Filters are able to detect patterns. An image might contain multiple edges, shapes, textures, objects, etc., so some filters detect edges, some detect corners, some detect circles, and so on. The deeper the network is, the more sophisticated these filters become. For example, one node can be used for image smoothing and another node can be used for image sharpening.
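The frequency domain view of such filters can be checked by taking the Fourier transform of two small kernels. The 3x3 smoothing and sharpening kernels below are common hand-designed examples rather than filters learned by a CNN, and the helper name `frequency_response` is an assumption.

```python
import numpy as np

# Two hand-picked 3x3 kernels (illustrative, not learned weights).
box_blur = np.ones((3, 3)) / 9.0                      # smoothing
sharpen = np.array([[0.0, -1.0, 0.0],
                    [-1.0, 5.0, -1.0],
                    [0.0, -1.0, 0.0]])                # sharpening

def frequency_response(kernel, size=64):
    """Magnitude of the 2-D DFT of a kernel zero-padded to size x size."""
    return np.abs(np.fft.fft2(kernel, s=(size, size)))

blur_response = frequency_response(box_blur)
sharpen_response = frequency_response(sharpen)

# Index (0, 0) is zero frequency; (32, 32) is the highest frequency on the grid.
blur_gain_low, blur_gain_high = blur_response[0, 0], blur_response[32, 32]
sharpen_gain_low, sharpen_gain_high = sharpen_response[0, 0], sharpen_response[32, 32]
```

Both kernels pass slowly varying brightness unchanged, with a gain of 1 at zero frequency. At the highest frequency the smoothing kernel attenuates to a gain of 1/9, while the sharpening kernel amplifies with a gain of 9.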
Applications of Frequency Domain
Image processing – An image is a signal and can be represented as a 2D matrix in which each element represents a pixel intensity. This representation of the intensity distribution of an image is called the spatial domain. Any image in the spatial domain can also be represented in the frequency domain. The Discrete Cosine Transform (DCT) is widely used in image and video processing. Like the Fourier transform, the DCT converts a signal from the time or spatial domain to the frequency domain, and it is particularly well suited to signals that have a strong correlation between adjacent samples. The DCT expresses a signal as a set of frequency components, called DCT coefficients, which represent the amount of energy present at each frequency. The coefficients are ordered by frequency, with the lowest frequency component (DC) first and the highest frequency component last. In image and video compression, low-energy DCT coefficients, which typically correspond to high-frequency detail and noise, are identified and discarded. Because these coefficients carry little energy, the signal can be stored and processed without significant loss of quality. DCT coefficients are also used as input features to machine learning algorithms, where they help to identify patterns and classify signals.
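Discarding high-frequency DCT coefficients can be sketched without any library beyond NumPy by building the DCT matrix directly. The function name `dct_matrix`, the smooth 8x8 test block and the choice of keeping a 4x4 corner of coefficients are assumptions for illustration; real codecs such as JPEG use quantization tables rather than a hard cut.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k holds the k-th cosine basis vector."""
    k = np.arange(n).reshape(-1, 1)
    m = np.arange(n).reshape(1, -1)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)                 # scale the DC row for orthonormality
    return C

C = dct_matrix(8)
rows, cols = np.indices((8, 8))
block = 100.0 + 10.0 * rows + 5.0 * cols    # hypothetical smooth 8x8 image block

coeffs = C @ block @ C.T                    # 2-D DCT of the block
mask = np.zeros((8, 8))
mask[:4, :4] = 1.0                          # keep the 16 lowest-frequency coefficients
approx = C.T @ (coeffs * mask) @ C          # inverse 2-D DCT of the kept coefficients

relative_error = np.linalg.norm(block - approx) / np.linalg.norm(block)
```

Although three quarters of the coefficients are discarded, the reconstruction error for this smooth block stays small, because neighboring pixels are strongly correlated and most of the energy sits in the low-frequency coefficients.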
Audio processing – The frequency domain is used extensively in audio signal processing to analyze and manipulate audio signals. In digital audio processing, analog signals are first converted into digital signals through a process called analog-to-digital conversion (ADC). The digital signals can then be processed using the Fourier transform. The Fourier transform can be used to calculate the spectrum of an audio signal, which can then be used for pitch detection, timbre analysis, and other applications. The coefficient of variation (CV) is a statistical measure that is often used to describe the variability of a dataset. It is defined as the ratio of the standard deviation of the dataset to its mean, expressed as a percentage. In the frequency domain, the CV can be calculated for each frequency component, or for a set of frequency components within a certain frequency range. The coefficient of variation provides a measure of the variability of the energy present at each frequency, and can be used to identify frequencies that have high or low levels of variability. In an audio signal, the coefficient of variation of the frequency components can be used to identify frequencies that have high levels of background noise or distortion, as these frequencies will have higher variability than other frequencies. By filtering out these high-variability frequencies, the quality of the signal can be improved.
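The per-frequency coefficient of variation can be sketched by splitting a signal into frames and comparing the magnitude of each frequency bin across frames. The example assumes a steady 50 Hz tone plus a 120 Hz tone that is present only in every other frame; the frame length and the small constant `eps` that guards against division by zero are also assumptions.

```python
import numpy as np

sample_rate = 1000                           # Hz (assumption)
frame_len = 100
n_frames = 10
t = np.arange(n_frames * frame_len) / sample_rate

steady = np.sin(2 * np.pi * 50 * t)                          # present in every frame
gate = np.repeat([1.0, 0.0] * (n_frames // 2), frame_len)    # on, off, on, off, ...
intermittent = gate * np.sin(2 * np.pi * 120 * t)            # present in every other frame

frames = (steady + intermittent).reshape(n_frames, frame_len)
magnitudes = np.abs(np.fft.rfft(frames, axis=1))             # one spectrum per frame
freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)

eps = 1e-12                                  # avoids division by zero in empty bins
cv_percent = 100.0 * magnitudes.std(axis=0) / (magnitudes.mean(axis=0) + eps)

cv_at_50 = cv_percent[np.argmin(np.abs(freqs - 50.0))]
cv_at_120 = cv_percent[np.argmin(np.abs(freqs - 120.0))]
```

The steady component has a CV close to 0%, while the intermittent component has a CV close to 100%, which marks 120 Hz as a highly variable frequency in this signal.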
Speech recognition – In speech recognition technology, machine learning is used to analyze the acoustic signal from human speech, either directly or from an audio file. The frequency domain is used to analyze the frequency content of speech signals, which can then be used for feature extraction and classification. Deep learning models can be trained on these frequency domain features to recognize speech with high accuracy.
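A common way to obtain frequency domain features from audio is a spectrogram, which applies the Fourier transform to short overlapping frames. The sketch below is a minimal version; the function name `log_spectrogram`, the Hann window, the frame length of 256 samples, the hop of 128 samples and the two-tone test signal are assumptions, and speech systems typically go on to derive features such as MFCCs from this kind of representation.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram: one row of frequency features per frame."""
    window = np.hanning(frame_len)                   # tapers the frame edges
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + 1e-10)                 # small constant avoids log(0)

sample_rate = 8000                                   # Hz (assumption)
n = 4096
t = np.arange(n) / sample_rate
# Hypothetical test signal: 500 Hz in the first half, 2000 Hz in the second.
signal = np.where(t < t[n // 2],
                  np.sin(2 * np.pi * 500 * t),
                  np.sin(2 * np.pi * 2000 * t))

features = log_spectrogram(signal)
bin_hz = sample_rate / 256                           # width of one frequency bin
first_peak_hz = np.argmax(features[0]) * bin_hz
last_peak_hz = np.argmax(features[-1]) * bin_hz
```

Each row of `features` describes the frequency content of one short frame, so the change from 500 Hz to 2000 Hz is visible as a shift of the peak between the first and last rows. A matrix of this kind is what a classifier or deep learning model would receive as input.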
Medical imaging – The non-invasive mapping of cerebral oxygen metabolism is an essential tool for understanding human brain function and dysfunction. In-vivo analysis of the brain’s oxygen extraction fraction (OEF) and cerebral metabolic rate of oxygen consumption (CMRO2) typically involves the use of the arterial spin labeling (ASL) signal. However, the estimation of these physiological parameters can be challenging due to parameter uncertainty and the complex nature of the ASL signal. To address these issues, a frequency-domain machine learning method has been developed, which uses regularized non-linear least squares (RNLS) analysis to estimate target parameters such as OEF and CMRO2 from ASL data. The study design includes the analysis of data from healthy and diseased human brains to assess the accuracy and reliability of the parameter estimates in both populations. The machine learning approach incorporates physiological parameters and blood oxygenation parameters into the analysis, improving the accuracy and reliability of the parameter estimates. Regression methods are used to model the relationship between the ASL signal and the target parameters. This approach has the potential to provide valuable insights into human brain function, particularly in diseased brains, where cerebral oxygen metabolism may be altered.
The frequency domain analysis plays a critical role in AI by providing a powerful framework for analyzing and processing data. Complex patterns and trends can be extracted and analyzed, leading to more accurate predictions and better decision-making. Techniques such as Fourier analysis, wavelet transforms, and power spectral density analysis are commonly used in AI applications such as image and speech recognition, natural language processing, and anomaly detection. As AI continues to evolve and become more sophisticated, the importance of the frequency domain will only continue to grow.
Michael Germuska, Hannah Louise Chandler, Thomas Okell, Fabrizio Fasano, Valentina Tomassini, Kevin Murphy, Richard G. Wise. “A Frequency-Domain Machine Learning Method for Dual-Calibrated fMRI Mapping of Oxygen Extraction Fraction (OEF) and Cerebral Metabolic Rate of Oxygen Consumption (CMRO2)”, 31 Mar. 2020, https://www.frontiersin.org/articles/10.3389/frai.2020.00012/full. Accessed 25 Apr. 2023
Introduction
Today, machine learning algorithms are used to identify and detect patterns in audio, images and videos. Audio, images and videos can be represented as signals that vary over time and space. The time domain analysis is a representation of a signal as a function of time, i.e. it shows how a signal changes over time and appear as sinusoidal waves. There are a number of independent variables that fully describes a signal and is referred to as degrees of freedom. In the time domain analysis, the degrees of freedom refer to number of samples in the signal.
In the context of applying signals to AI, frequency domain analysis is required to analyze signals with respect to frequency instead of time domain analysis. Frequency domain is used in image processing, speech recognition, feature extraction techniques in machine learning applications and in time series analysis to identify trends and patterns in data.
Table of contents
What is Frequency Domain?
Frequency Domain of an original time signal is a mathematical representation of a signal or data in terms of its frequency components. In other words, it is a way of analyzing a signal by examining the different frequencies that make it up and they appear as distinct impulses. In the frequency domain, a signal is represented as a sum of sinusoidal waves with different frequencies, amplitudes, and phases. In the frequency domain analysis, the degrees of freedom is related to the number of frequency components that make up the signal. The relative strengths of these frequency components can reveal important information about the signal, such as its bandwidth, dominant frequencies, and harmonic content. Some of the key design parameters associated with the frequency domain analysis include sampling rate, windowing function, signal length and the choice of Fourier Transform algorithm.
There is an inverse relationship between time and frequency domain. The desired signal and the undesired signal are separable in the frequency domain and you can use a filter to reject the undesired signal. Also, analyzing signals in the frequency domain can have better computational efficiency than analyzing them in the time domain. This is because many signal processing operations such as filtering, convolution, and correlation, can be performed more efficiently in the frequency domain.
How Does Frequency Domain Work?
Suppose we have three signals of different frequencies and different amplitudes as shown below. If we observe the signal from a time domain perspective, it is difficult to get an idea how many frequency components are present in the signal. Frequency domain gives a different view to observe the signal.
A signal can be converted from a time domain to a frequency domain via mathematical operators called Transforms. Fourier Transforms gives us the dominant frequencies that add up to give this specific signal. Fourier Transform decomposes a signal into its constituent frequencies and their corresponding amplitudes and phases. This information is typically displayed using a graph called a frequency spectrum, which shows the strength of each frequency component in the signal. The frequency spectrum is plotted with frequency on the x-axis and amplitude on the y-axis. Interestingly, the human brain is capable of performing Fourier transforms as well.
The choice of Fourier Transform Algorithm affects the computational efficiency of the frequency domain analysis. The Fast Fourier Transform (FFT) is the most commonly used algorithm, which allows fast and efficient computation of the Fourier Transform. However, there are also other algorithms such as the Discrete Fourier Transform (DFT) and the Cooley-Tukey algorithm.
When analyzing signals in the frequency domain, it is important to consider the effects of different noises on the signal.
Input noise
Input noise refers to the noise that is present in the original signal, before any processing has occurred. Input noise can mask frequency components of a signal which can reduce the quality and accuracy of the frequency domain analysis. When analyzing the signal in the frequency domain, the presence of input noise can result in an increase in the noise floor, which is the level of noise present across all frequencies. An objective function can be used to characterize the properties of a signal. The effect of input noise in the frequency domain can be characterized using statistical measures, such as the Signal-to-Noise ratio (SNR), which compares the strength of the signal to the strength of the noise. A low SNR indicates that the noise level is high compared to the signal level, which can make it difficult to distinguish the signal from the noise. One way to mitigate the effects of input noise is to use signal processing techniques, such as filtering or averaging, to remove or reduce the noise. By combining the frequency domain and the objective function, it is possible to optimize the design of signal in order to achieve a desired level of performance.
Additive noise
Additive noise is a noise that is introduced into the signal during transmission or processing. In the frequency domain, the effects of additive noise can be analyzed using techniques such as spectral density analysis, which involves calculating the power spectral density of the signal and the noise separately and then adding them together. The power spectral density represents the distribution of the signal’s power as a function of frequency, and it can be used to identify the frequency components that are most affected by the noise.
Physiological noise
Physiological noise is a type of noise that is specific to neuroimaging and refers to fluctuations in the MR signal that are related to various physiological processes, such as cardiac and respiratory cycles. This noise can affect the quality of functional and structural MRI images and can interfere with the detection of brain activity. One approach to mitigating the effects of physiological noise is to analyze the MR signal in the frequency domain. This involves decomposing the signal into its frequency components and analyzing the power spectrum of the signal at different frequencies. In particular, physiological noise tends to have strong power in the low-frequency range, typically below 0.1 Hz, and can be separated from the signal of interest using frequency filtering techniques.
Background noise
Background noise refers to the noise that is present in the environment. The effect of background noise in the frequency domain can be characterized using statistical measures, such as the root mean square value or the power spectral density and can be mitigated using spectral filtering or spectral smoothing.
The baseline signal represents the low-frequency components of the signal that are typically present in the lower end of the spectrum. Baseline signal can have an impact on the interpretation and analysis of the higher frequency components of the signal. For example, in the case of physiological signals, such as electroencephalography (EEG) or electrocardiography (ECG), the baseline signal represents the background electrical activity of the brain or heart, respectively. This background activity can obscure the analysis of the higher frequency components of the signal, such as the specific brain or heart activity of interest. In order to mitigate the effects of baseline signal in the frequency domain, various techniques can be used, such as high-pass filtering or baseline correction. High-pass filtering involves selectively removing or attenuating the low-frequency components of the signal, while retaining the higher frequency components of interest. Baseline correction involves subtracting the baseline signal from the original signal to isolate the higher frequency components of interest.
Let’s consider an example of image sharpening. To perform image sharpening using the Fourier transforms, we first apply the transform to the image. We can then filter the image in the frequency domain to emphasize high-frequency components, which correspond to edges and details. This filtering can be done by multiplying the Fourier transform of the image by a filter function that emphasizes high-frequency components. We then apply the inverse Fourier transform to the filtered image to obtain the sharpened image in the spatial domain. The result is an image that appears clearer and more defined, with enhanced edges and details. However, it’s important to note that image sharpening can also introduce noise or artifacts into the image, so it’s important to carefully choose the filter function and adjust control parameters to achieve the desired result.
1. Multiply the input image by (-1)^(x+y) to center the transform
2. Compute F(u,v), the DFT of input
3. Multiply F(u,v) by a filter H(u,v)
4. Compute the inverse DFT of step 3
5. Obtain the real part of step 4
6. Multiply the result obtained in step 5 by (-1)^(x+y)
The frequency domain filtering is advantageous because of less computational overhead. It is faster to perform 2D Fourier Transform and a filter multiply than to perform a convolution in the spatial domain. It gives you control over the whole images where we can enhance and suppress different characteristics of the image easily. The idea of blurring an image by reducing its high frequency component or sharpening the image by increasing the magnitude of its high frequency component is easy to understand. Fourier transform states that any function that periodically repeats itself can be expressed as the sum of sines and cosines of different frequencies and different amplitudes.
Frequency Domain Features (FDF)
Frequency domain features are specific characteristics of a signal or system that can be extracted from its frequency-domain representation. These features are often used in signal processing, machine learning, and other fields to analyze and classify signals based on their frequency content. In machine learning, discriminative features are those that are most relevant for distinguishing between different classes or categories of signals. The frequency domain is a common source of discriminative features, as it can provide information about the energy present at different frequencies in a signal. For example, in speech recognition, different phonemes are characterized by different frequency patterns, and the frequency domain can be used to extract these patterns as discriminative features. Similarly, in image processing, the frequency domain can be used to identify distinctive spatial patterns in an image, which can be used to distinguish between different objects or scenes. There are many techniques for extracting discriminative features from the frequency domain. One common technique is to use the Fourier transforms or another frequency-domain transform to obtain a set of frequency components, and then to apply feature selection or feature extraction algorithms to identify the most relevant features for a given classification task.
Frequency domain features include:
Node details
A node refers to a computational block or module that extracts specific features from a signal in the frequency domain. A feature is a measurable property of a signal that can be used to characterize or differentiate it from other signals. Some features that can be extracted include spectral centroid, spectral bandwidth, spectral roll-off, spectral flux, and mel-frequency cepstral coefficients (MFCCs). These features can be used in various applications such as speech recognition, music classification, and biomedical signal analysis.
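As a rough sketch of what such a node might compute, the snippet below derives the spectral centroid, bandwidth, and roll-off from the magnitude spectrum of a signal. The function name spectral_features, the 85% roll-off threshold, and the 440 Hz test tone are assumptions chosen for illustration; several slightly different definitions of these features are in use.

```python
import numpy as np

def spectral_features(signal, sample_rate, rolloff_fraction=0.85):
    """Return (centroid, bandwidth, rolloff) in Hz for a 1D signal."""
    magnitude = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    total = magnitude.sum()
    # Centroid: magnitude-weighted mean frequency.
    centroid = (freqs * magnitude).sum() / total
    # Bandwidth: magnitude-weighted spread around the centroid.
    bandwidth = np.sqrt((((freqs - centroid) ** 2) * magnitude).sum() / total)
    # Roll-off: frequency below which the chosen fraction of the
    # total magnitude is contained.
    cumulative = np.cumsum(magnitude)
    rolloff = freqs[np.searchsorted(cumulative, rolloff_fraction * total)]
    return centroid, bandwidth, rolloff

sample_rate = 8000
t = np.arange(sample_rate) / sample_rate      # one second of samples
tone = np.sin(2 * np.pi * 440 * t)            # pure 440 Hz test tone
centroid, bandwidth, rolloff = spectral_features(tone, sample_rate)
```

For a pure tone, the centroid and roll-off both sit at the tone frequency and the bandwidth is close to zero; richer signals such as speech or music spread these values out.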
Input ports
An input port is a point at which a signal enters a system or a device in the frequency domain. The input signal must first be transformed from the time domain to the frequency domain using a technique such as the Fourier transform or the Laplace transform. Once the signal is in the frequency domain, it can be analyzed, processed, or modified using various frequency domain techniques.
Output ports
A frequency domain output port is a point at which a signal exits a system or device in the frequency domain. For example, in a filter, the frequency domain output port is where the filtered signal is obtained in the frequency domain after the input signal has been processed by the filter.
Extension
Extension refers to the process of extending a signal's frequency domain representation, typically to sample the spectrum on a finer frequency grid. This can be performed using techniques such as zero-padding, which involves adding zeros to the end of a signal to increase its length before the Fourier transform is taken. Zero-padding makes the spacing between frequency bins smaller, so spectral peaks can be located more precisely, but it does not add new information: the ability to separate two closely spaced frequencies is still set by the length of the original signal. Another technique is interpolation, which involves estimating the values of the signal or system at intermediate frequencies based on its known values at discrete frequencies.
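The effect of zero-padding can be seen with a short experiment. The sketch below is illustrative only: the helper peak_frequency, the 11.3 Hz test tone, and the padding factor of 8 are assumptions. Note that zero-padding only samples the same underlying spectrum on a finer grid; it does not add information to the signal.

```python
import numpy as np

sample_rate = 100                      # Hz
n = 50                                 # samples, so FFT bins are 2 Hz apart
t = np.arange(n) / sample_rate
signal = np.sin(2 * np.pi * 11.3 * t)  # 11.3 Hz falls between two bins

def peak_frequency(x, n_fft, sample_rate):
    """Frequency of the largest magnitude bin of an n_fft-point FFT."""
    # When n_fft > len(x), NumPy zero-pads x at the end.
    spectrum = np.abs(np.fft.rfft(x, n=n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]

coarse = peak_frequency(signal, n, sample_rate)       # no padding
fine = peak_frequency(signal, 8 * n, sample_rate)     # padded to 400 points
```

Without padding, the peak can only land on a multiple of 2 Hz. With padding, the bins are 0.25 Hz apart, so the located peak falls closer to the true 11.3 Hz.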
Deep learning algorithms use artificial neural networks. An artificial neural network is made up of multiple processing units called nodes or neurons. The nodes are organized into layers, and the layers are connected to each other by weights. The number of nodes in a given layer depends partly on where the layer sits in the network, partly on the data the layer will process, and partly on the design choices of the network architect.
For the input layer, the number of nodes is directly determined by the number of input features for the single sample that will be passed as input to the network.
In the below illustration, the neural network has an input layer with two nodes. This indicates that the input data would have two input features. For example, in a dataset, a sample could represent an individual person and within the sample we could have two features, e.g. height and weight of the person. So the height and weight will be passed as input to the network and we therefore represent the two input features as two nodes in the input layer. If we are using the network for classification tasks, then in the output layer, the number of nodes has to be equal to the number of output classes. With the hidden layers, we have more freedom in choosing the number of nodes.
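The layer sizes described above map directly onto the shapes of the weight matrices. The following sketch assumes a hidden layer of four nodes, three output classes, random untrained weights, and a made-up height and weight sample; none of these values come from a real model.

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs = 2     # two input features: height and weight
n_hidden = 4     # free design choice
n_classes = 3    # one output node per class (illustrative count)

# Weight matrices connect consecutive layers; their shapes follow
# from the number of nodes on each side.
W1 = rng.normal(size=(n_inputs, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, n_classes))
b2 = np.zeros(n_classes)

def forward(x):
    """Forward pass: input layer -> hidden layer (ReLU) -> softmax output."""
    hidden = np.maximum(0.0, x @ W1 + b1)
    logits = hidden @ W2 + b2
    exp = np.exp(logits - logits.max())   # subtract max for numerical stability
    return exp / exp.sum()

sample = np.array([1.75, 68.0])   # hypothetical height (m) and weight (kg)
probs = forward(sample)           # one probability per output class
```

Changing n_hidden only alters the inner dimensions of W1 and W2, whereas n_inputs and n_classes are fixed by the data and the task.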
Consider the example of a Convolutional Neural Network (CNN). A CNN is a type of artificial neural network that is popular for analysing images. Each node in a CNN applies a learned filter to its input by convolution. Because convolution in the spatial domain is equivalent to multiplication in the frequency domain, each filter can also be viewed as a frequency domain filter that carries out a specific task in the image processing pipeline. The frequency content of an image refers to the rate at which the gray levels change across space. Rapidly changing brightness values correspond to high-frequency terms, and slowly changing brightness values correspond to low-frequency terms. Filters are able to detect patterns. An image might have multiple edges, shapes, textures, objects, etc. So one filter could detect edges, another corners, another circles, and so on. The deeper the network, the more sophisticated these filters become. For example, one node can be used for image smoothing and another for image sharpening.
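The link between a convolutional filter and a frequency domain filter can be checked numerically through the convolution theorem. The sketch below assumes circular (wrap-around) borders so that the two computations match exactly, and it uses two hand-written 3x3 kernels for smoothing and sharpening. Real CNNs learn their kernels and usually compute the convolution directly in the spatial domain.

```python
import numpy as np

def circular_convolve2d(image, kernel):
    """Direct spatial-domain convolution with wrap-around borders."""
    out = np.zeros_like(image, dtype=float)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            # Each kernel tap adds a shifted, weighted copy of the image.
            out += kernel[dy, dx] * np.roll(image, shift=(dy, dx), axis=(0, 1))
    return out

def fft_convolve2d(image, kernel):
    """Same operation done as a multiplication in the frequency domain."""
    kernel_spectrum = np.fft.fft2(kernel, s=image.shape)  # zero-pads the kernel
    return np.real(np.fft.ifft2(np.fft.fft2(image) * kernel_spectrum))

smoothing = np.full((3, 3), 1.0 / 9.0)               # box blur (low-pass)
sharpening = np.array([[0.0, -1.0, 0.0],
                       [-1.0, 5.0, -1.0],
                       [0.0, -1.0, 0.0]])            # boosts high frequencies

rng = np.random.default_rng(0)
image = rng.random((32, 32))

smooth_spatial = circular_convolve2d(image, smoothing)
smooth_spectral = fft_convolve2d(image, smoothing)
sharp_spatial = circular_convolve2d(image, sharpening)
sharp_spectral = fft_convolve2d(image, sharpening)
```

Both routes give the same result up to rounding error. Because the kernel is anchored at its top-left corner here, the output is shifted by one pixel compared with a centred kernel.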
Applications of Frequency Domain
Image processing – An image is a signal and can be represented as a 2D matrix in which each element represents a pixel intensity. This representation of an image as a matrix of intensities is called the spatial domain. Any image in the spatial domain can also be represented in the frequency domain. The Discrete Cosine Transform (DCT) is widely used in applications such as image and video processing. Like the Fourier transform, the DCT converts a signal from the time or spatial domain to the frequency domain, but it is particularly well suited to signals that have a strong correlation between adjacent samples. The DCT expresses a signal as a set of frequency components, called DCT coefficients, which represent the contribution of each frequency. The DCT coefficients are ordered by frequency, with the lowest frequency component (DC) at the beginning of the list and the highest frequency component at the end. In image and video processing, low-energy DCT coefficients, which typically correspond to high-frequency detail and noise, are identified and discarded. Because these coefficients carry little of the signal's energy, discarding them causes little visible loss of quality. DCT coefficients are also often used as input features to machine learning algorithms, where they help identify patterns and classify signals.
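The energy compaction behind this idea can be illustrated on a single 8x8 block. The sketch below builds an orthonormal DCT-II matrix by hand so that it depends only on NumPy. The helper dct_matrix, the synthetic gradient block, and the choice to keep a 4x4 corner of coefficients are assumptions for the example, not a description of any particular codec.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th cosine basis function."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (i + 0.5) * k / n)
    mat[0, :] /= np.sqrt(2.0)          # rescale the DC row for orthonormality
    return mat

n = 8
C = dct_matrix(n)

# Smooth synthetic block: neighbouring pixels are strongly correlated.
x, y = np.meshgrid(np.arange(n), np.arange(n))
block = 100.0 + 10.0 * x + 5.0 * y

coeffs = C @ block @ C.T               # 2D DCT: transform rows and columns

# Keep only the low-frequency corner and discard the rest.
keep = 4
mask = np.zeros((n, n))
mask[:keep, :keep] = 1.0
approx = C.T @ (coeffs * mask) @ C     # inverse 2D DCT of the kept coefficients

energy_kept = np.sum((coeffs * mask) ** 2) / np.sum(coeffs ** 2)
max_error = np.max(np.abs(approx - block))
```

For a smooth block like this one, a quarter of the coefficients retains nearly all of the energy. Blocks with sharp edges or heavy texture compact less well.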
Audio processing – The frequency domain is used extensively in audio signal processing to analyze and manipulate audio signals. In digital audio processing, analog signals are first converted into digital signals through analog-to-digital conversion (ADC). The digital signals can then be processed using the Fourier transform, which yields the spectrum of the audio signal for use in pitch detection, timbre analysis, and other applications. The coefficient of variation (CV) is a statistical measure that is often used to describe the variability of a dataset. It is defined as the ratio of the standard deviation of the dataset to its mean, often expressed as a percentage. In the frequency domain, the CV can be calculated for each frequency component, or for a set of frequency components within a certain frequency range, by measuring how the energy at that frequency varies over time. It can therefore be used to identify frequencies that have high or low levels of variability. In an audio signal, frequencies dominated by background noise or distortion tend to show higher variability than frequencies carrying a steady component. Filtering out these high-variability frequencies can improve the quality of the signal.
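One way to compute a per-frequency CV is to split the signal into short frames, take the power spectrum of each frame, and compare the standard deviation and mean of each bin across frames. The sketch below follows that approach. The function cv_per_bin, the frame length, and the synthetic tone-plus-noise signal are assumptions for illustration, and the frames are not windowed or overlapped as they often would be in practice.

```python
import numpy as np

def cv_per_bin(signal, frame_len):
    """Coefficient of variation (%) of the power in each frequency bin."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # one spectrum per frame
    mean = power.mean(axis=0)
    std = power.std(axis=0)
    return 100.0 * std / mean

sample_rate = 1000
frame_len = 100                        # bins are 10 Hz apart
rng = np.random.default_rng(0)
t = np.arange(20000) / sample_rate
tone = np.sin(2 * np.pi * 50 * t)      # steady 50 Hz component (bin 5)
noise = 0.1 * rng.normal(size=t.size)  # white background noise
cv = cv_per_bin(tone + noise, frame_len)

tone_bin_cv = cv[5]                    # low: energy is steady across frames
noise_cv = np.median(cv)               # high: noise power fluctuates frame to frame
```

The bin that holds the steady tone shows a CV of a few percent, while noise-only bins fluctuate by roughly their own mean, which is the contrast the text describes.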
Speech recognition – In speech recognition technology, machine learning is used to analyze the acoustic signal from live human speech or from an audio file. The frequency domain is used to analyze the frequency content of speech signals, which can then be used for feature extraction and classification. Deep learning models can be trained on these frequency-domain features to recognize speech with high accuracy.
Medical imaging – The non-invasive mapping of cerebral oxygen metabolism is an essential tool for understanding human brain function and dysfunction. In-vivo analysis of the brain’s oxygen extraction fraction (OEF) and cerebral metabolic rate of oxygen consumption (CMRO2) typically involves the use of the arterial spin labeling (ASL) signal. However, estimating these physiological parameters can be challenging because of parameter uncertainty and the complex nature of the ASL signal. To address these issues, a frequency-domain machine learning method has been developed, which uses regularized non-linear least squares (RNLS) analysis to estimate target parameters such as OEF and CMRO2 from ASL data. The study design includes the analysis of data from both healthy and diseased human brains to assess the accuracy and reliability of the parameter estimates in both populations. The machine learning approach incorporates physiological parameters and blood oxygenation parameters into the analysis, improving the accuracy and reliability of the parameter estimates. Regression methods are used to model the relationship between the ASL signal and the target parameters. This approach has the potential to provide valuable insights into human brain function, particularly in diseased brains, where cerebral oxygen metabolism may be altered.
Conclusion
Frequency domain analysis plays a critical role in AI by providing a powerful framework for analyzing and processing data. It allows complex patterns and trends to be extracted and analyzed, which can lead to more accurate predictions and better decision-making. Techniques such as Fourier analysis, wavelet transforms, and power spectral density analysis are commonly used in AI applications such as image and speech recognition, natural language processing, and anomaly detection. As AI continues to evolve and become more sophisticated, the frequency domain is likely to remain an important tool.
References
Iman. “Frequency concept in an image!”, https://www.youtube.com/watch?v=xrTor1uw5iI. Accessed 18 Apr. 2023.
Cadence PCB Solutions. “Time Domain Analysis vs Frequency Domain Analysis: A Guide and Comparison”, https://resources.pcb.cadence.com/blog/2020-time-domain-analysis-vs-frequency-domain-analysis-a-guide-and-comparison. Accessed 20 Apr. 2023.
deeplizard. “Convolutional Neural Networks (CNNs) explained”, https://www.youtube.com/watch?v=YRhxdVk_sIs. Accessed 20 Apr. 2023.
“Image Enhancement in the frequency domain”, https://www.corsi.univr.it/documenti/OccorrenzaIns/matdid/matdid642638.pdf. Accessed 22 Apr. 2023.
J.P. Hornak. “The Basics of MRI”, 1996-2020, https://www.cis.rit.edu/htbooks/mri/chap-5/chap-5. Accessed 25 Apr. 2023.
Michael Germuska, Hannah Louise Chandler, Thomas Okell, Fabrizio Fasano, Valentina Tomassini, Kevin Murphy, Richard G. Wise. “A Frequency-Domain Machine Learning Method for Dual-Calibrated fMRI Mapping of Oxygen Extraction Fraction (OEF) and Cerebral Metabolic Rate of Oxygen Consumption (CMRO2)”, 31 Mar. 2020, https://www.frontiersin.org/articles/10.3389/frai.2020.00012/full. Accessed 25 Apr. 2023.