傅里叶变换算法及其 python 实现

Understanding the FFT Algorithm by Python

Last updated on Dec 16, 2023 4 min read Tutorial

此文基于自己最初学习引力波数据处理的时候，研究 Pyhon/MATLAB 等编程语言和程序包中的 fft 具体含义的科学笔记。

首先，我们要清楚频率是什么，以及如何该从一个信号序列中提取出频率信息呢？我们正是要用傅里叶变换 (Fourier Transform) 来提取一段信号的频率信息。那么，从时域的信号 $x(t)$ 变换到频域的 $\tilde{x}(\omega)$，服从如下公式：($\omega = 2\pi f$)

$$ \tilde{x}(\omega) = \int^{+\infty}_{-\infty}e^{-i\omega t}x(t)dt = \int^{+\infty}_{-\infty}e^{-2\pi i f t}x(t)dt \quad. $$

这个方程是可逆的，于是就有了所谓的从频域到时域的逆变换：

$$ x(t)= \int^{+\infty}_{-\infty}e^{i\omega t}\tilde{x}(\omega) \frac{d\omega}{2\pi} = \int^{+\infty}_{-\infty}e^{2\pi ift}\tilde{x}(f) df \quad. $$

更多的数学，暂时不多说，就引用这么一句话（来自一本我特别喜欢的书 Introduction to Signal Processing ¹，本系列文都参考此书中的符号体系）

The physical meaning of $\tilde{x}(\omega)$ is brought out by the inverse Fourier transform, which expresses the arbitrary signal $x(t)$ as a linear superposition of sinusoids of different frequencies.

简而言之，傅里叶变换就是在不损失信息的情况下，用不同的频域 basis 的线性组合来表示一个时域信号，可以看做是一个基底展开。在信号处理中，我们的信号数据都是离散的，于是就要考虑离散傅里叶变换 (discrete Fourier transform)：

$$ \tilde{x}[k]=\sum^{N-1}_{n=0}e^{\frac{-2\pi i}{N}k\cdot n}x[n] \quad,\\ x[n]=\frac{1}{N}\sum^{N-1}_{k=0}e^{\frac{2\pi i}{N}k\cdot n}\tilde{x}[k] \quad. $$

其中，信号采样点的数目是 $N$, ${k,n}=0,\dots,N-1$。显然，根据上述傅里叶变换规则，可以计算出频域里对应的第一个采样点的值，就是时域信号所有采样点的平均和: $\tilde{x}[0]=\sum^{N-1}_{n=0}x[n]$。特别要注意的是，上面的公式仅是在 Python 中各个的 library 中 fft() 的定义，与实际近似对应的连续时域信号是有差别的！它们之间应该是如下关系：($dt=T/N=1/fs$)

$$ \tilde{x}[\omega] = \int^{+\infty}_{-\infty}e^{-i\omega t}x(t)dt = dt\left[\sum^{N-1}_{n=0}e^{\frac{-2\pi i}{N}k\cdot n}x[n] \right] , $$ $$ x[t]= \int^{+\infty}_{-\infty}e^{i\omega t}\tilde{x}(\omega) \frac{d\omega}{2\pi} = \frac{1}{dt}\left[\frac{1}{N}\sum^{N-1}_{k=0}e^{\frac{2\pi i}{N}k\cdot n}\tilde{x}[k]\right]. $$

这其实也可以理解，因为 Library 里定义的傅里叶变换函数是仅依赖于一个有限采样的时域序列 $x[n]$ 就可以计算，并不一定要知道该序列所代表的时长 $T$ 或采样率 $df$ 为何，所以可以使代码得到更好的一般普适性。

接下来，就终于进入到代码环节了。

我们通过对比手撸 FFT 代码和 Python 程序包自带的 fft 函数来探求背后的计算原理，理解程序包提供的函数的真正含义。程序包主要选取的是： numpy.fft 和 scipy.fftpack （pyfftw.FFTW 暂略）

FFT from scratch (`DFT_slow`)

我们先根据定义，纯手撸一个傅里叶变换 DFT_slow 看看效果如何（基于 numpy）。借此还可以验证一些程序包中的 fft 函数的含义。其实很简单，直接利用矩阵与向量的乘积，就可以给出矢量化的定义：

$$ \begin{align} \vec{X} &= M \cdot \vec{x} \\ M_{kn} &= e^{-2\pi i k\cdot n/N} \end{align} $$

import numpy as np
def DFT_slow(x):
    """Compute the discrete Fourier Transform of the 1D array x"""
    x = np.asarray(x, dtype=float)
    N = x.shape[0]
    n = np.arange(N)
    k = n.reshape((N, 1))
    M = np.exp(-2j * np.pi * k * n / N)
    return np.dot(M, x)

上面的代码其实很清楚了。输入一个序列，输出一个序列，对应于上面公式👆不考虑 $dt$ 的信息的。

很快你就知道，其实上面算法的执行效率是很低的。于是就有了下面你的改进版本：

FFT from scratch (`FFT`)

根据对称性，

$$ \vec{X}_{k+i\cdot N} = \vec{X}_{k} \quad, \text{for any integer i}, $$

可以给出一个更高效的递归算法方案。

def FFT(x):
    """A recursive implementation of the 1D Cooley-Tukey FFT"""
    x = np.asarray(x, dtype=float)
    N = x.shape[0]
    
    if N % 2 > 0:
        raise ValueError("size of x must be a power of 2")
    elif N <= 32:  # this cutoff should be optimized
        return DFT_slow(x)
    else:
        X_even = FFT(x[::2])
        X_odd = FFT(x[1::2])
        factor = np.exp(-2j * np.pi * np.arange(N) / N)
        return np.concatenate([X_even + factor[:N // 2] * X_odd,
                               X_even + factor[N // 2:] * X_odd])

FFT from scratch (`FFT_vectorized`)

将上一个算法进一步矢量化，就可以进一步提高代码的执行效率。

def FFT_vectorized(x):
    """A vectorized, non-recursive version of the Cooley-Tukey FFT"""
    x = np.asarray(x, dtype=float)
    N = x.shape[0]

    if np.log2(N) % 1 > 0:
        raise ValueError("size of x must be a power of 2")

    # N_min here is equivalent to the stopping condition above,
    # and should be a power of 2
    N_min = min(N, 32)
    
    # Perform an O[N^2] DFT on all length-N_min sub-problems at once
    n = np.arange(N_min)
    k = n[:, None]
    M = np.exp(-2j * np.pi * n * k / N_min)
    X = np.dot(M, x.reshape((N_min, -1)))

    # build-up each level of the recursive calculation all at once
    while X.shape[0] < N:
        X_even = X[:, :X.shape[1] // 2]
        X_odd = X[:, X.shape[1] // 2:]
        factor = np.exp(-1j * np.pi * np.arange(X.shape[0])
                        / X.shape[0])[:, None]
        X = np.vstack([X_even + factor * X_odd,
                       X_even - factor * X_odd])

    return X.ravel()

Fast FFT from scratch (`FFT_fast`)

快速傅里叶变换(FFT)——有史以来最巧妙的算法？: Bilibili

Verification of correctness for the code

首先，我们要先验证一下我们的手撸算法。

x = np.random.random(1024)
print(np.allclose(DFT_slow(x), np.fft.fft(x)))
print(np.allclose(FFT(x), np.fft.fft(x)))
print(np.allclose(FFT_vectorized(x), np.fft.fft(x)))

from scipy.fftpack import fft
print(np.allclose(np.fft.fft(x), fft(x)))
# -----------Output---------------
True
True
True
True

验证得知，我们定义的 FFT 算法与 np.fft.fft(x) 和 scipy.fftpack.fft(x) 的结果是一致的。

Efficiency of the algorithms

然后，再验证下所有手撸算法和程序包函数的执行效率。

x = np.random.random(1024)
%timeit DFT_slow(x)
%timeit FFT(x)
%timeit FFT_vectorized(x)
%timeit np.fft.fft(x)
%timeit fft(x)
# It may take 30 seconds
# -----------Output---------------
66 ms ± 3.19 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
3.7 ms ± 1.01 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
467 µs ± 72.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
63 µs ± 2.6 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
19.2 µs ± 1.25 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

可以看到程序包的执行速率是远比手撸的代码执行速度快的，在数量级上的效率提升，此后都将只用程序包中的函数计算 FFT。

Reference

Understanding the FFT Algorithm (such informative post by an astronomer — Jake VanderPlas)
Any idea of what a frequency is? Playing with the Fourier Transform (a post from Signal Processing for Dummies created by ELENA CUOCO who is a data scientist at European Gravitational Observatory)
Matched filter and signal-to-noise for a periodic template (posted by Michał Bejger who is an astrophysicist from LIGO-Virgo)
Discrete Fourier transform - Wikipedia
Lecture 7 - The Discrete Fourier Transform (a concise lecture note written by Prof. Stephen Roberts)
Fourier expansions and impedance (a jupyter notebook illustrated that “Every function can be constructed by adding up a bunch of sines and cosines.”)
FINDCHIRP: An algorithm for detection of gravitational waves from inspiraling compact binaries (a pre-print paper filled with direct notations and explanations about FFT, discrete Matched Filter and power spectral estimation)

Introduction to Signal Processing, Sophocles J. Orfanidis ↩︎

Python Signal Processing

傅里叶变换算法及其 python 实现

FFT from scratch (DFT_slow)

FFT from scratch (FFT)

FFT from scratch (FFT_vectorized)

Fast FFT from scratch (FFT_fast)