FUN WITH SPECTRA Computing Fourier transforms using the Fast Fourier Transform (FFT) algorithm is a standard tool in geophysical analysis. There are many books and articles on Fourier theory and how the FFT works. I am distributing the part of Numerical Recipes that discusses this, which seems as good a treatment as any. You will certainly learn much more about this if you take our Data Analysis class. For brevity in this class I am simply going to focus on what you need to know to use an FFT program and leave the theory for you to pick up elsewhere. We start with a "time series," which is a series of measurements that are made at some regular interval. These measurments are normally stored on the computer in an array. We also need to know the sample interval (the time between sample points, the reciprocal of the digitization rate) and the number of points in the time series. In most cases the time series consists of real (i.e., not complex) numbers and we will assume this is true for the examples shown. The purpose of the FFT is to examine the different frequency waves that may be present in the data. The FFT is a linear transformation that allows us to look at the data in the "frequency domain" rather than the "time domain" of the original data. In fact, the original time series can be expressed exactly as a sum of harmonic waves of different frequency content and phase. The FFT allows us to readily compute what the frequencies and phases of these waves are. The simplest FFT algorithms require that the number of points in the original time series be a power of two. We will assume that this is true here, but you should be aware that alternative FFT methods exist for non-powers of two. Thus, the input to a typical FFT algorithm will consist of an array of data values, the number of time points (n), and the sample interval (dt). For a real time series, the FFT algorithm will then return a complex spectra that will give the amplitude and phase at a series of equally spaced frequency points. There will be n/2 + 1 frequency points and the frequency spacing will be df = 1/(n*dt). The first frequency point is at f = 0.0 The last frequency point is at f = 1/(2*dt) and is called the Nyquist frequency. The Nyquist frequency is the highest frequency that one can resolve in the time series. At the Nyquist frequency, there are two data points in the time series per wavelength. If higher frequencies are present in the data, they will be "aliased" so that they are seen at lower frequencies. From the time series alone, there is no way to discriminate between correctly resolved frequencies and aliased data from higher frequencies. Thus, sometimes filters (anti-aliasing filters) are applied to the data prior to digitization to remove the frequencies above the Nyquist so that this will not be a problem. The FFT algorithm provides the spectra at n/2 + 1 frequency points which may be represented as n/2 + 1 complex numbers. (complex numbers are necessary to represent both the amplitude and phase of each harmonic component). "Wait a minute," the alert reader might cry. We started with n real points in the time domain and now we have n+2 real points in the frequency domain, if we count each complex number as two reals. Have we gained information compared to the original time series? The resolution to this apparent problem comes from the fact that the first and last frequency points always have zero imaginary parts. Thus the total number of independent data points is the same in both the time and frequency domains. In fact, most FFT algorithms return only n/2 frequency points by packing the real part of the Nyquist point into the imaginary part of the zero frequency point. Once we have the frequency domain points we can make an amplitude spectra plot by plotting the absolute values (Cabs function) of the points versus frequency. If we want a power spectrum, then we square the amplitudes. Getting the correct units on the y-axis can be a tricky business (let's save that discussion until your time series analysis class); in many cases, however, we are only interested in the position of the peaks or the shape of the spectrum so we don't need to worry about this. We can also transform back to the time domain by computing an inverse FFT. If your code is working correctly, you should get back exactly the same time series as you started with. Here is a set of routines, adopted from those in Numerical Recipes to do both forward and inverse FFTs. You may obtain these in the /net/fishpub/239 directory under the name fftsubs.f90 if you don't want to cut and paste from the class web site (use anonymous ftp to mahi.ucsd.edu if you don't have a fishnet account). ---------------------------------------------------------------- ! GETSPEC_NR computes the spectrum of a real time series using ! Numerical Recipes FFT routines ! ! Requires: TAPER, REALFT, FOUR1 ! ! Inputs: ts = input time series with values from ts(1) to ts(ntime) ! ntime = number of time points (int) ! dt = sample interval (s) (float) ! itap = 1 to apply Hann taper before computing sprectrum ! = 0 otherwise ! Returns: spec = complex spectrum ! nfreq = number of complex frequency points (int, = 1+ntime/2) ! df = frequency spacing ! ! spec(1) = value at zero frequency ! spec(nfreq) = value at freq. of (nfreq-1)*df (Nyquist frequency) ! ! (November, 2002, Peter Shearer) ! subroutine GETSPEC_NR(ts, ntime, dt, itap, spec, nfreq, df) implicit none real, dimension(*) :: ts real, dimension(ntime) :: a complex, dimension(*) :: spec real :: dt, df integer :: ntime, itap, nfreq, i if (itap == 1) then call TAPER(ts, ntime, a) else do i = 1, ntime a(i) = ts(i) enddo end if call REALFT(a, ntime/2, 1) nfreq = 1 + ntime/2 df = 1./(dt*float(ntime)) spec(1) = cmplx(a(1), 0.) spec(nfreq) = cmplx(a(2), 0.) do i = 2, nfreq-1 spec(i) = cmplx(a(2*i-1), a(2*i)) enddo end subroutine GETSPEC_NR ! GETTS_NR computes a real time series from a complex spectrum ! (stored as in GETSPEC_NR) using Numerical Recipes FFT routines ! ! Requires: REALFT, FOUR1 ! ! Inputs: spec = complex spectrum stored as floats ! nfreq = number of complex frequency points (int, = 1+ntime/2) ! df = frequency spacing ! ! Returns: ts = input time series with values from ts(1) to ts(ntime) ! ntime = number of time points (int) ! dt = sample interval (s) (float) ! ! spec(1) = value at zero frequency ! spec(nfreq) = value at freq. of (nfreq-1)*df (Nyquist frequency) ! ! (November, 2002, Peter Shearer) ! subroutine GETTS_NR(spec, nfreq, df, ts, ntime, dt) real, dimension(*) :: ts real, dimension(ntime+2) :: a complex, dimension(*) :: spec real :: df, dt, scale integer :: nfreq, ntime, n, isign, i ntime = (nfreq-1)*2 dt = 1./(df*float(ntime)) scale = 2.0/float(ntime) a(1) = real(spec(1))*scale a(2) = real(spec(nfreq))*scale do i = 2, nfreq-1 a(2*i-1) = real(spec(i))*scale a(2*i) = aimag(spec(i))*scale enddo call REALFT(a, ntime/2, -1) do i = 1, ntime ts(i) = a(i) enddo end subroutine GETTS_NR ! TAPER multiplies ts1 for Hann taper and puts result in ts2 ! subroutine TAPER(ts1, n, ts2) implicit none real, dimension(n) :: ts1, ts2 real :: pihalf, tsmid, frac, angle, tap integer :: n, i, i2 pihalf = 3.1415927/2. i2 = n/2 +1 tsmid = float(n)/2. + 0.5 do i = 1, i2 frac = float(i)/tsmid angle = pihalf*frac tap = sin(angle)**2 ts2(i) = ts1(i)*tap ts2(n+1-i) = ts1(n+1-i)*tap enddo end subroutine TAPER SUBROUTINE REALFT(DATA,N,ISIGN) REAL*8 WR,WI,WPR,WPI,WTEMP,THETA DIMENSION DATA(*) THETA=6.28318530717959D0/2.0D0/DBLE(N) C1=0.5 IF (ISIGN.EQ.1) THEN C2=-0.5 CALL FOUR1(DATA,N,+1) ELSE C2=0.5 THETA=-THETA ENDIF WPR=-2.0D0*DSIN(0.5D0*THETA)**2 WPI=DSIN(THETA) WR=1.0D0+WPR WI=WPI N2P3=2*N+3 DO 11 I=2,N/2+1 I1=2*I-1 I2=I1+1 I3=N2P3-I2 I4=I3+1 WRS=SNGL(WR) WIS=SNGL(WI) H1R=C1*(DATA(I1)+DATA(I3)) H1I=C1*(DATA(I2)-DATA(I4)) H2R=-C2*(DATA(I2)+DATA(I4)) H2I=C2*(DATA(I1)-DATA(I3)) DATA(I1)=H1R+WRS*H2R-WIS*H2I DATA(I2)=H1I+WRS*H2I+WIS*H2R DATA(I3)=H1R-WRS*H2R+WIS*H2I DATA(I4)=-H1I+WRS*H2I+WIS*H2R WTEMP=WR WR=WR*WPR-WI*WPI+WR WI=WI*WPR+WTEMP*WPI+WI 11 CONTINUE IF (ISIGN.EQ.1) THEN H1R=DATA(1) DATA(1)=H1R+DATA(2) DATA(2)=H1R-DATA(2) ELSE H1R=DATA(1) DATA(1)=C1*(H1R+DATA(2)) DATA(2)=C1*(H1R-DATA(2)) CALL FOUR1(DATA,N,-1) ENDIF RETURN END SUBROUTINE FOUR1(DATA,NN,ISIGN) REAL*8 WR,WI,WPR,WPI,WTEMP,THETA DIMENSION DATA(*) N=2*NN J=1 DO 11 I=1,N,2 IF(J.GT.I)THEN TEMPR=DATA(J) TEMPI=DATA(J+1) DATA(J)=DATA(I) DATA(J+1)=DATA(I+1) DATA(I)=TEMPR DATA(I+1)=TEMPI ENDIF M=N/2 1 IF ((M.GE.2).AND.(J.GT.M)) THEN J=J-M M=M/2 GO TO 1 ENDIF J=J+M 11 CONTINUE MMAX=2 2 IF (N.GT.MMAX) THEN ISTEP=2*MMAX THETA=6.28318530717959D0/(ISIGN*MMAX) WPR=-2.D0*DSIN(0.5D0*THETA)**2 WPI=DSIN(THETA) WR=1.D0 WI=0.D0 DO 13 M=1,MMAX,2 DO 12 I=M,N,ISTEP J=I+MMAX TEMPR=SNGL(WR)*DATA(J)-SNGL(WI)*DATA(J+1) TEMPI=SNGL(WR)*DATA(J+1)+SNGL(WI)*DATA(J) DATA(J)=DATA(I)-TEMPR DATA(J+1)=DATA(I+1)-TEMPI DATA(I)=DATA(I)+TEMPR DATA(I+1)=DATA(I+1)+TEMPI 12 CONTINUE WTEMP=WR WR=WR*WPR-WI*WPI+WR WI=WI*WPR+WTEMP*WPI+WI 13 CONTINUE MMAX=ISTEP GO TO 2 ENDIF RETURN END ---------------------------------------------------------------- GETSPEC_NR, GETTS_NR and TAPER are my own inventions; REALFT and FOUR1 are from Numerical Recipes. Hopefully the documentation for GETSPEC_NR and GETTS_NR is self-explanatory. The GETSPEC_NR function includes an option to multiply the time series by a cos^2 function (i.e., Hanning taper) before computing the FFT. This generally improves the appearence of the spectrum and avoids artifacts that may result from the abrupt truncation of the time series at the end of the data window. A typical call to GETSPECNR would have the form: call GETSPEC_NR(a, n, dt, itap, b, nfreq, df); where a is the data array. You should define the sizes of a and b in the calling program as, for example: integer, parameter :: maxpts = 8192 real, dimension(maxpts) :: a complex, dimension(maxpts/2+1) :: b WARNING: The routines do not check if n is a power of two. This would be a useful improvement to make the routines more robust with respect to user errors in the calling program. ---------------------- GUITAR STRING EXAMPLE ------------------ A properly tuned guitar will have strings with fundamental mode frequencies as follows: E = 82 Hz A = 110 Hz D = 147 Hz G = 196 Hz B = 247 Hz E = 330 Hz There will be overtones (harmonics) at frequencies of n*f0 where f0 is the fundamental (lowest) frequency. Thus, for the A string, we should expect to see peaks at 110 Hz, 220 Hz, 330 Hz, 440 Hz, etc. The relative power between the different harmonics is what determines the tone of the guitar. The shareware program MacCRO (http://pderrin.cjb.net/ maccro.html) uses the audio input on the Mac to simulate an oscilloscope. For our purposes, its most useful feature is the ability to download digitized sounds to a file. For a class demo, my plan is to use the program to digitize the sound of the A string, save the result to a file, and then bring the file over to the Sun where it can be FFTed and the resulting spectrum plotted. If we have time, it will be interesting to compare: (1) string plucked near center vs. near one end (2) sound sampled just after pluck vs. a few seconds later (3) cheap vs. expensive guitar ASSIGNMENT FFT1 Get the file bfo1 from /net/fishpub/239 (use anonymous ftp to mahi.ucsd.edu if you don't have a fishnet account). This is a time series of the 1994 deep Bolivian earthquake as recorded at Black Forest Observatory, Germany (48.3319N,8.3311E). The sample interval is 40 s and there are 24225 points in the file, representing over 11 days of data. The sensor is vertical in orientation so primarily spheroidal modes are recorded. (a) Plot the time series using whatever plotting program you want. Label the x-axis in units of hours. (b) Write a F90 program to read the data and compute the spectrum, both with and without first applying a Hann taper. (c) Make nice plots of the spectra that compare the results with and without the Hann taper. Label the x-axis in units of mHz. Make plots that go from zero to the Nyquist frequency and also some closeups from zero to 2 mHz. (d) Using Table V from the PREM paper (Dziewonski and Anderson, PEPI 25, 297-356, 1981, see me if you want to make a copy), identify and label as many of the peaks below 2 mHz as you can. Example: 0S4 is at 1545.6 s = 0.65 Hz. NOTE: 24225 is not a power of two. You have two choices. You can either truncate the time series at 2^14=16384 or pad the time series with zeros to 2^15=32768. In the latter case, make sure you apply the taper to the 24225-point series and THEN pad with zeros (i.e., you can't just use the taper option in getspec).