The Sparse Fourier Transform. Haitham HassaniehЧитать онлайн книгу.

satisfy the so-called ℓ∞/ℓ₂ guarantee. Specifically, let y be the minimizer of ||x̂ − y||₂. For a precision parameter δ = 1/n^O(1), and a constant ∊ > 0, our (randomized) algorithm outputs x̂′ such that:

with probability 1 − 1/n. The additive term that depends on δ appears in all past algorithms [Akavia 2010, Akavia et al. 2003, Gilbert et al. 2002, Gilbert et al. 2005a, Iwen 2010b, Mansour 1992], although typically (with the exception of Iwen [2010a]) it is eliminated by assuming that all coordinates are integers in the range {−n^O(1) … n^O(1)}. In this chapter, we keep the dependence on δ explicit.

The ℓ∞/ℓ₂ guarantee of Equation (3.1) is stronger than the ℓ₂/ℓ₂ guarantee of Equation (1.2). In particular, the ℓ∞/ℓ₂ guarantee with a constant approximation factor C implies the ℓ₂/ℓ₂ guarantee with a constant approximation factor C′, if one sets all but the k largest entries in x̂′ to 0.¹ Furthermore, instead of bounding only the collective error, the ℓ∞/ℓ₂ guarantee ensures that every Fourier coefficient is well approximated.

3.1.2 Techniques

Recall from Chapter 1 that the Sparse Fourier Transform algorithms work by binning² the Fourier coefficients into a small number of buckets. Since the signal is sparse in the frequency domain, each bucket is likely³ to have only one large coefficient, which can then be located (to find its position) and estimated (to find its value). For the algorithm to be sublinear, the binning has to be done in sublinear time. Binning the Fourier coefficient is done using an n-dimensional filter vector G that is concentrated both in time and frequency, i.e., G is zero except at a small number of time coordinates, and its Fourier transform is negligible except at a small fraction (about 1/k) of the frequency coordinates (the “pass” region).

Prior work uses different types of filters. Depending on the choice of the filter G, past algorithms can be classified as iteration-based or interpolation-based.

Iteration-based algorithms use a filter that has a significant mass outside its pass region [Gilbert et al. 2002, Gilbert et al. 2005a, Mansour 1992]. For example, Gilbert et al. [2002] and Gilbert et al. [2005a] set G to the rectangular filter which was shown in Figure 1.3(a), in which case is the Dirichlet kernel⁴, whose tail decays in an inverse linear fashion. Since the tail decays slowly, the Fourier coefficients binned to a particular bucket “leak” into other buckets. On the other hand, Mansour [1992] estimates the convolution in time domain via random sampling, which also leads to a large estimation error. To reduce these errors and obtain the ℓ₂/ℓ₂ guarantee, these algorithms have to perform multiple iterations, where each iteration estimates the largest Fourier coefficient (the one least impacted by leakage) and subtracts its contribution to the time signal. The iterative update of the time signal causes a large increase in runtime. The algorithms in Gilbert et al. [2002] and Mansour [1992] perform this update by going through O(k) iterations, each of which updates at least O(k) time samples, resulting in an O(k²) term in the runtime. The algorithm of Gilbert et al. [2005a] introduced a “bulk sampling” algorithm that amortizes this process but it requires solving instances of a non-uniform Fourier transform, which is expensive in practice.

Interpolation-based algorithms are less common and limited to the design in Iwen [2010b]. This approach uses the aliasing filter presented in Chapter 1, which is a leakage-free filter that allows [Iwen 2010b] to avoid the need for iteration. Recall that in this case, the filter G has Gi = 1 iff i mod n/p = 0 and Gi = 0 otherwise. The Fourier transform of this filter is a “spike train” with period p and hence this filter does not leak; it is equal to 1 on 1/p fraction of coordinates and is zero elsewhere. Unfortunately, however, such a filter requires that p divides n and the algorithm in Iwen [2010b] needs many different values of p. Since in general one cannot assume that n is divisible by all numbers p, the algorithm treats the signal as a continuous function and interpolates it at the required points. Interpolation introduces additional complexity and increases the exponents in the runtime.

3.1.3 Our Approach

The key feature of our algorithm is the use of a different type of filter. In the simplest case, we use a filter obtained by convolving a Gaussian function with a box-car function.⁵ Because of this new filter, our algorithm does not need to either iterate or interpolate. Specifically, the frequency response of our filter is nearly flat inside the pass region and has an exponential tail outside it. This means that leakage from frequencies in other buckets is negligible, and hence, our algorithm need not iterate. Also, filtering can be performed using the existing input samples xi, and hence our algorithm need not interpolate the signal at new points. Avoiding both iteration and interpolation is the key feature that makes this algorithm efficient.

Further, once a large coefficient is isolated in a bucket, one needs to identify its frequency. In contrast to past work which typically uses binary search for this task, we adopt an idea from Porat and Strauss [2010] and tailor it to our problem. Specifically, we simply select the set of “large” bins which are likely to contain large coefficients, and directly estimate all frequencies in those bins. To balance the cost of the bin selection and estimation steps, we make the number of bins somewhat larger than the typical value of O(k). Specifically, we use , which leads to the stated runtime.⁶

3.2 Algorithm

We refer to our algorithm as SFT 1.0 and it is shown in Algorithm 3.1. A key element of this algorithm is the inner loop, which finds and estimates each “large” coefficient with constant probability. In Section 3.2.1 we describe the inner loop, and in Section 3.2.2 we show how to use it to construct the full algorithm.

3.2.1 Inner Loop

Let B be a parameter that divides n, to be determined later. Let G be a (1/B, 1/(2B), δ, ω) flat window function described in Section 2.2.1 for some δ and . We will have δ ≈ 1/nc, so one can think of it as negligibly small.

There are two versions of the inner loop: location loops and estimation loops. Location loops, described as the procedure LOCATIONINNERLOOP in Algorithm 3.1, are given a parameter d, and output a set I ⊂ [n] of dkn/B coordinates that contains each large coefficient with “good” probability. Estimation loops, described as the procedure ESTIMATIONINNERLOOP in Algorithm 3.1, are given a set I ⊂ [n] and estimate x̂I such that each coordinate is estimated well with “good” probability.

Скачать книгу