Nyquist provides functions for FFT and inverse FFT operations on streams of audio data. Because sounds can be of any length, but an FFT operates on a fixed amount of data, FFT processing is typically done in short blocks or windows that move through the audio. Thus, a stream of samples is converted in to a sequence of FFT frames representing short-term spectra.

Nyquist does not have a special data type corresponding to a sequence of FFT frames. This would be nice, but it would mean creating a large set of operations suitable for processing frame sequences. Another approach, and perhaps the most "pure" would be to convert a single sound into a multichannel sound, with one channel per bin of the FFT.

Instead, Nyquist violates its "pure" functional model and resorts to objects
for FFT processing. A sequence of frames is represented by an XLISP object. Whenever you
send the selector `:next` to the object, you get back either NIL, indicating the
end of the sequence, or you get an array of FFT coefficients.

The Nyquist function FFT (mnemonic, isn't it?) returns one of the frame sequence generating objects. You can pass any frame sequence generating object to another function, IFFT, and turn the sequence back into audio.

With FFT and IFFT, you can create all sorts of interesting processes. The main idea is to create intermediate objects that both accept and generate sequences of frames. These objects can operate on the frames to implement the desired spectral-domain processes. Examples of this will follow, but first, let's try something simpler.

[*Special thanks to James Valenti for detailed comments and explanations
about this code. -RBD*]

Just to convince ourselves that this works and to explore the FFT function, let's perform an FFT on a sinusoid. To make the outcome predictable, we will use 32 samples of sinusoid with a period of 8 samples. If we take a 32-point FFT, the spectrum should show a strong 4th harmonic (there are 4 8-sample periods) and zero everywhere else.

Before the test, we will introduce a simple function to create an FFT sequence
generating object, which is called an iterator: The function is called `make-fft1-iterator`,
and creates an object of class `fft1-class` (I am using `fft1`
rather than `fft` because this is a simplified version that will be improved
later). Objects of this class will have the fields `sound`, `length`,
and `skip`. Create the class by sending `class` (the super class
of all classes) a `:new` message to create a new class that will be a
sub-class of `class`:

(setf fft1-class (send class :new '(sound length skip)))Add a method named :next to the

(send fft1-class :answer :next '() '( (snd-fft sound length skip nil)))Add a method named

(send fft1-class :answer :isnew '(snd len skp) '( (setf sound snd) (setf length len) (setf skip skp)))Make a new function named

(defun make-fft1-iterator (sound length skip) (send fft1-class :new (snd-copy sound) length skip))

For test data, create a sinusoid. `lfo` generates a
one second sinusoid.
The `control-srate-abs` transformation changes the sample rate
generated by the `lfo` to `points` and since the duration
is one second, there will be exactly `points` samples
generated. The `cycles` parameter is the frequency of the
sinusoid, so a value of 1 will generate one cycle (in one second) and
a value of 2 will generate 2 cycles in one second, etc.

;; create a 1-second sinusoid with points samples at cycles hz: (defun short-sine (points cycles) (control-srate-abs points (lfo cycles)))In the

(defun fft-test () (let (fft-iter) ;; signal will have 4 cycles in 32 points: (setf fft-iter (make-fft1-iterator (short-sine 32 4) 32 32)) (display "fft-test" (send fft-iter :next))))

Running this prints an array of nearly-zero values except for the 9th element (index 8), which is -1. The layout is as follows:

- the DC component goes in array element 0
- the Cosine part is in elements 2i - 1
- the Sine part is in elements 2i
- the Nyquist frequency component is in the last element

> (load "fft1.lsp") ;loading "fft1.lsp" ;[ gc: total 20640, 3360 free; samples 12 KB, 7KB free ] ;fft-test : (SEND FFT-ITER :NEXT) = #(1.53076e-017 0 0 0 0 0 0 -3.06152e-017 ; -1 0 0 0 0 0 0 3.06152e-017 ; 0 0 0 0 0 0 0 -3.06152e-017 ; 8.9407e-008 0 0 0 0 0 0 1.53076e-017) ;T ;>

Thus, the element at index 8 is the 4th Sine component, indicating a 4th harmonic as expected.

Now, let's try to reverse the FFT and recover the sinusoid. We will use the
`snd-ifft` function, which takes five parameters:

- the starting time of the resulting sound
- the sample rate of the resulting sound
- the iterator object, which will be called to obtain short term spectra (frames)
- the step-size, how many samples to advance for each frame
- the window (or NIL), by which each frame is multiplied after the IFFT

The result returned from snd-ifft is a sound which is assigned here to local
variable ifft-snd. In this case, the sound will have a start time of 0, a sample
rate of 32, the spectral frames (actually just one frame) will be obtained by
sending the `:next` message to `fft-iter`, the skip between successive
frames is 32, and no windowing is applied.

(defun ifft-test () (let (fft-iter ifft-snd) (setf fft-iter (make-fft1-iterator (short-sine 32 4) 32 32)) (setf ifft-snd (snd-ifft 0 32 fft-iter 32 NIL)) (display "fft-ifft" (snd-length ifft-snd 200)) (display "fft-ifft" (snd-samples ifft-snd 200)) ifft-snd )) ; return the sound

Now call the function:

(ifft-test)

Looking at the printed samples, you can see that the result is a close approximation
to the expected sinusoid. Since `ifft-test` returns a sound, you can
call` (play (ifft-test))`, but in this case you won't hear much because
the sound is only 32 samples, and there may also be problems trying to play
a sound that is only sampled at 32 samples per second.

**Exercise: **Modify `short-sine` to produce a cosine and observe the
difference in the FFT. Confirm that the IFFT reproduces the original cosine. (Hint: the
parameters to `LFO` should be: `(lfo cycles 1.0 *sine-table* 90.0)`,
indicating frequency, duration, the waveform, and the initial phase, respectively.)

Let's create an FFT iterator that reads sounds from a file. This is similar
to what was done before, but here, `s-read` is used to read samples instead
of using `lfo` to compute samples:

(defun file-fft1 (filename frame-length skip) (make-fft1-iterator (s-read filename) frame-length skip))

And here is a function to recover the sound file from spectral frames and play
it. The sound will start at time 0, at the sample rate `*sound-srate*`,
using `iterator` to generate successive spectral frames, advancing `skip`
samples between fft's, and no windowing. Note: `*sound-srate*` is a global
variable from Nyquist that is the default sample rate for sounds. It can be
read directly, but you should only modify it by calling `sound-srate-abs`:

(defun play-fft1 (iterator skip) (play (snd-ifft 0 *sound-srate* iterator skip NIL)))

Define a filename for testing. Remember that the backslash (`\`) character
is the *escape* character for XLisp strings, so you need to type two backslashes
to denote one backslash character. Windows will also accept forward slashes
(`/`) as a separator, and the forward slash does not need to be escaped.
Be sure to test this with a mono file at the default sample rate of 44100:

;; a convenient sound file name (change this to one of your mono soundfiles): (setf sfn "D:\\brain\\outro\\soup.wav")Define a function to call

(defun file-test () (play-fft1 (file-fft1 sfn 512 512) 512))

If you do not hear your original file, try playing the original file by typing
`(play-file sfn)`, and make sure the file is a mono file at 44,100 Hz
sample rate (to match `*sound-srate*` used in `play-fft1`). Check
out the file by typing `(sf-info sfn)`.

You can try different skip parameters, e.g. to play the sound at double speed, try this:

(play-fft1 (file-fft1 sfn 512 512) 256)

In the previous example, you may have noticed a lot of messages that look something like the following:

[ gc: total 18640, 3114 free; samples 49KB, 39KB free ]

These messages are printed whenever the garbage collector runs to reclaim memory that was allocated but is no longer in use by the program. The memory found by the garbage collector is reused in future computation to avoid allocating more memory than is necessary. The particular message here says that 18640 "cells" have been allocated. Each cell is 10 bytes, so there are 186,400 bytes, or about 186KB used to store XLisp programs and data. This does not include the XLisp interpreter, Nyquist functions, and system libraries, that take an additional 500KB or so. It also does not include audio samples, which are stored separately. Of the 18640 cells, 3114 were free after garbage collection. Of the 49KB of audio data (samples) allocated by Nyquist, 39KB were free for reuse after after garbage collection.

Nyquist and XLisp do a pretty good job of effective, automatic garbage collection, but
since XLisp was designed to run on small machines, it favors using a minimal amount of
memory, and this may result in frequent garbage collections. FFT processing, in
particular, makes heavy use of XLisp, since every floating point number in every FFT frame
requires the allocation of a 10-byte cell. When you are processing with FFTs, you can
often gain some speed by telling XLisp to allocate more memory cells. Do this by calling `expand`:

(expand 100)

The parameter tells how many additional internal blocks of memory to allocate. Each block is 2000 cells, or about 20KB, so if the parameter is 100, you will allocate about 2MB. After allocating memory, the next garbage collection might look like:

[ gc: total 218640, 204298 free; samples 49KB, 44KB free ]

Notice that now, garbage collection frees a higher proportion of memory than before (204298 out of 218640 cells). Garbage collection is more efficient when there is a higher proportion of memory to free.

On the other hand, you should not allocate more memory than necessary. If you allocate too much, parts of Nyquist and other programs will be paged out of RAM and onto your disk drive. Since Nyquist must access all of its memory for every garbage collection, it is very inefficient to have any of that memory reside on disk.

Should you always expand memory? Almost all non-FFT audio processing uses special data structures that are outside of XLisp and are managed without the use of the garbage collector. Since audio processing normally takes over 95% of the execution time, you normally do not have to worry much about the efficiency of XLisp.

The main purpose of FFT and IFFT functions is to enable processing in the spectral domain. Let's create a very simple example in which certain frequencies are set to zero, creating a filter effect. (Note that this is not a very good way to create filters because of boundary effects.) We need to create an object that will access an FFT iterator to get frames, and that will serve as an iterator to deliver frames. The code is as follows:

(setf fft-hp-class (send class :new '(source bins))) (send fft-hp-class :answer :next '() '( ; get a new frame of fft coefficients from source (let ((frame (send source :next))) ; creates local variable - frame ; a note about let: ; the first list after let names new local variables ; the following lists (below) are expressions to be evaluated ; ; if frame is nil, there are no more frames to process (cond (frame ; if frame is not nil, then do the following (dotimes (i bins) ; set all the bins from 0 to bins-1 (these are low frequency bins) to zero ; the order of coefficients is: ; DC coefficient ; first real ; first imaginary ; second real ; second imaginary ; ... ; if the length is even, then last is the real ; coefficient of the Nyquist frequency ; ; (Note: aref gets the ith element of the array) (setf (aref frame i) 0.0)))) ; end of cond frame))) ; frame is the last expression of the let, so it is the return valueCreate an

A refresher as to what goes on with `file-fft`: `file-fft1` reads a file
into a sound variable and calls `make-fft1-iterator`. Then
`make-fft1-iterator` returns an `fft-1` class by sending
`fft1-class :new` with a copy of the sound. It makes this copy because
it will have `skip` samples removed every invocation of
`fft1-class:next` because this method calls `snd-fft` which
removes the samples every time a new fft frame is calculated.

(send fft-hp-class :answer :isnew '(s b) '( (setf source s) (setf bins b)))This is a function to make and return an instance of the

(defun make-fft-hp (source bins) ; Send the fft-hp-calss a :new message to create a new instance of the class ; and to initialize the instance with the values of source and bins ; Returns an object that responds to the :next message by getting the next ; fft frame, zeroing the low frequencies, and returning the frame. (send fft-hp-class :new source bins))In

(defun hp-test () (play-fft1 (make-fft-hp (file-fft1 sfn 512 512) 11) 512))

Note that `make-fft-hp` takes a parameter, bins, that tells how many
coefficients of the FFT to zero. At 44,100 Hz, the frame rate is 44100/512 = 86.13 Hz, and
this corresponds to the first bin (coefficients at locations 1 and 2). Since we specified
11 for the bins parameter, this will zero the DC component and 5 complex pairs,
representing frequencies up to 86.12 * 5 = 430.7 Hz. (If you are trying this on a laptop
computer with small speakers, or if you are using a sound without low frequencies, you may
want to increase the number of bins in `hp-test` from 11 to 30 to make the effect
really obvious.)

You will notice some raspy sounds in the output. Since the FFT frames are not smoothed by any window or overlapped, there are serious windowing artifacts that are easily heard. We will look at ways to reduce this problem later.

An interesting effect is to let one spectrum modulate another one. By multiplying the amplitude spectrum of one signal onto that of another, one can superimpose the two signals. (Note that adding spectra is simply a way to mix them and is equivalent to addition in the time domain.)

Let's create a bright FM sound to serve as a spectral modulation "carrier". First, create the FM sound. This sound has an initial, middle, and fiinal modulation index as well as a (constant) pitch:

(defun fm-tone (step mi1 mi2 mi3) (let ((hz (step-to-hz step))) (setf mi1 (* mi1 hz)) (setf mi2 (* mi2 hz)) (setf mi3 (* mi3 hz)) (fmosc c4 (partial step (control-srate-abs *sound-srate* (pwl 0 mi1 0.5 mi2 1 mi3 1))))))

The modulated sound will be a cluster of three FM tones. You could use just one tone, but I have had better luck with chords or noise. You want to make sure the signal has plenty of amplitude at many frequencies; otherwise, the spectral modulation will not have anything to modulate.

(defun mod-snd () (sum (fm-tone c3 15 20 15) ;; adjust FM parameters here (fm-tone d3 15 20 15) ;; adjust FM parameters here (fm-tone e3 15 20 15))) ;; adjust FM parameters here

Next, we need a class to do the modulation. As before, objects of this class respond to
the `:next `message. In this case, the `:next` method sends `:next`
to both sources. It then multiplies the coefficients of one by the computed amplitude
spectrum of the other.

(setf fft-modulator-class (send class :new '(src1 src2))) (send fft-modulator-class :answer :isnew '(s1 s2) '( (setf src1 s1) (setf src2 s2))) (send fft-modulator-class :answer :next '() '( (let ((frame1 (send src1 :next)) (frame2 (send src2 :next)) n half_n) (cond ((and frame1 frame2) ; multiply frame2 by the amplitude coefficients of frame1 (setf (aref frame2 0) (* (aref frame2 0) (aref frame1 0))) ;; DC (setf n (- (length frame1) 1)) ; Subtracted 1 because we already took care of DC component (setf half_n (/ n 2)) ; integer divide (dotimes (i half_n) (let* ((i2 (+ i i 2)) (i2m1 (- i2 1)) (amp (sqrt (+ (* (aref frame1 i2m1) (aref frame1 i2m1)) (* (aref frame1 i2) (aref frame1 i2)))))) (setf (aref frame2 i2m1) (* (aref frame2 i2m1) amp)) (setf (aref frame2 i2) (* (aref frame2 i2) amp)))) (cond ((= n (+ half_n half_n 2)) ;; n is even -> nyquist component (setf (aref frame2 n) (* (aref frame2 n) (aref frame1 n))))) frame2) (t nil))))) (defun make-fft-modulator (src1 src2) (send fft-modulator-class :new src1 src2))

The code for `:next` is longer than you might expect, but it is basically
just a vector multiplication. Lisp is not ideal for numerical algorithms like this. The
following code will test the new class:

(defun mod-test () (let ((fs 512)) ;; frame size (play-fft1 (make-fft-modulator (file-fft1 sfn fs fs) (make-fft1-iterator (mod-snd) fs fs)) fs)))

Here, we pass in iterators for the sound stored in a file and for the modulated sound.
This example seems to work best if the sound in the file is a voice. You can also try
stretching the speech by reducing the step parameter to `file-fft` and by making
the modulated sound longer. You might also try different sizes of FFTs using the frame
size (`fs`) parameter. If the FFT frame size is too big, you will resolve
individual harmonics of the voice, and if it is too small, you will not resolve formants.
256 and 512 seem to be good numbers when working at 44,100 Hz sampling rates.

In the previous examples, and especially in the last one, a lot of buzzing and noise is created because the boundaries of the FFT frames are no longer continuous when the spectrum is altered. To reduce the buzzing effect, we can multiply the frame by a smoothing window before reconstructing the time domain signal. We can also use a smoothing window on the input side, before taking the FFT. For now, we will try just a smoothing window on the IFFT side.

The smoothing window will be a raised cosine pulse, although other window shapes could be used. The following code implements a raised cosine. (A raised cosine is a smooth bell-shaped curve that rises from zero to 1 and back again. The curve is one period of a sinusoid or cosine, but "raised" so that its minimum value is at zero.) The lfo function computes a sine curve, so we use an initial phase of 270 degrees to get the proper shape:

(defun raised-cosine () (scale 0.5 (sum (const 1) (lfo (/ 1.0 (get-duration 1)) 1 *sine-table* 270))))

Next, we need a function to create a window of the proper length. The sample rate of
the window does not matter because the FFT and IFFT functions will simply take the proper
number of samples (without any interpolation or resampling). Therefore, we will use the `frame-size`
as the sample rate, insuring that there are `frame-size` samples:

(defun fft-window (frame-size) (control-srate-abs frame-size (raised-cosine)))

Now we can pass in a window to the IFFT. Let's redo the spectral modulation example
with smoothing windows on the IFFT side. First, we will rewrite `play-fft` to use
windowing:

(defun play-fft (iterator frame-size skip) (play (snd-ifft 0 *sound-srate* iterator skip (fft-window frame-size))))

Now, we can rewrite `mod-test:`

(defun mod-test-w () (let ((fs 512)) ;; frame size (play-fft (make-fft-modulator (file-fft1 sfn fs (/ fs 2)) (make-fft1-iterator (mod-snd) fs (/ fs 2))) fs (/ fs 2))))

Notice that we reduced the step size to half the frame size. This is because we want the windows to overlap by 50%, creating a nice cross-fade from one frame to the next.

You might guess that the square windowing on the analysis (FFT) side is injecting high
frequencies into the spectrum. We can introduce windowing there too. We can modify `make-fft-iterator`
to use a window. This will require changes to `fft1-class`, so we will use the name
`fft-class:`

(setf fft-class (send class :new '(sound length skip window))) (send fft-class :answer :next '() '( (snd-fft sound length skip window))) (send fft-class :answer :isnew '(snd len skp) '( (setf sound snd) (setf length len) (setf skip skp) (setf window (fft-window len)) )) (defun make-fft-iterator (sound length skip) (send fft-class :new (snd-copy sound) length skip))

Now, we can rewrite `file-fft` to use windowing:

(defun file-fft (filename frame-length skip) (make-fft-iterator (s-read filename) frame-length skip))

Now, we can rewrite `mod-test` yet again:

(defun mod-test-ww () (let ((fs 512)) ;; frame size (play-fft (make-fft-modulator (file-fft sfn fs (/ fs 2)) (make-fft-iterator (mod-snd) fs (/ fs 2))) fs (/ fs 2))))

This version seems to have less definition for speech input than the previous one. It might be that some windowing artifacts in the right places are actually beneficial! In fact, some simple experiments indicate that, in this example, the speech sounds better when the (mod-snd) sound is analyzed without windowing. This probably creates more noise to be modulated by the speech signal, so more speech information gets through.

We can also stretch the result by taking smaller steps during the analysis (FFT) stage so that more frames are generated. I will also increase the frame size to obtain fewer frames/second.

(defun mod-test-wws () (let ((fs 1024)) ;; frame size (play-fft (make-fft-modulator (file-fft sfn fs (/ fs 16)) (make-fft1-iterator (mod-snd) fs (/ fs 16))) fs (/ fs 2))))

Cort Lippe and Zack Settel have explored cross synthesis and other effects with FFTs extensively, and their work inspired these examples.