Rapid Execution of Fan Beam Image Reconstruction Algorithms Using Efficient Computational Techniques and Special-Purpose Processors

IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. BME-28, NO. 2, FEBRUARY 1981

BARRY K. GILBERT, MEMBER, IEEE, SURENDER K. KENUE, RICHARD A. ROBB, ALOYSIUS CHU, ARNOLD H. LENT, AND EARL E. SWARTZLANDER, JR., SENIOR MEMBER, IEEE

Abstract-Rapid advances during the past ten years of several forms of computer-assisted tomography (CT) have resulted in the development of numerous algorithms to convert raw projection data into cross-sectional images. These reconstruction algorithms are either "iterative," in which a large matrix algebraic equation is solved by successive approximation techniques, or "closed form." Continuing evolution of the closed form algorithms has allowed the newest versions to produce excellent reconstructed images in most applications.
This paper will review several computer software and special-purpose digital hardware implementations of closed form algorithms, either proposed during the past several years by a number of workers or actually implemented in commercial or research CT scanners. The discussion will also cover a number of recently investigated algorithmic modifications which reduce the amount of computation required to execute the reconstruction process, as well as several new special-purpose digital hardware implementations under development in our laboratories at the Mayo Clinic.

Manuscript received March 2, 1980; revised September 1, 1980. This work was supported in part by the U.S. Public Health Service under Grant HL-04664, the National Institutes of Health under Grant RR-00007, the U.S. Air Force under Grant F33615-79-C-1875, and the Fannie E. Rippel Foundation under a research grant.
B. K. Gilbert, R. A. Robb, and A. Chu are with the Biodynamics Research Unit, Department of Physiology and Biophysics, Mayo Foundation, Rochester, MN 55901.
S. K. Kenue was with the Biodynamics Research Unit, Department of Physiology and Biophysics, Mayo Foundation, Rochester, MN 55901. He is now with the Picker Corp., Cleveland, OH.
A. H. Lent was with the Biodynamics Research Unit, Department of Physiology and Biophysics, Mayo Foundation, Rochester, MN 55901. He is now with the New Ventures Division, Technicare Corp., Salem, OH.
E. E. Swartzlander, Jr. is with TRW Defense and Space Systems Group, Huntsville, AL.

INTRODUCTION

RAPID advances during the past ten years of several forms of computer assisted tomography (CT) have resulted in the development of numerous algorithms to convert raw projection data into cross-sectional images. These reconstruction algorithms are either "iterative," in which a large matrix algebraic equation is solved by successive approximation techniques, or "closed form."
Closed form reconstruction algorithms which operate in the frequency domain are generally referred to as Fourier algorithms; those which operate in the spatial domain are referred to as filtered back projection algorithms. Though the iterative algorithms are frequently employed when the signal/noise ratio of the input data is low, or when several of the projections are unavailable, they are in less frequent use for high quality data, since iterative solutions consume large amounts of computer time. Continuing evolution of the closed form algorithms has allowed the newest versions to produce excellent reconstructed images in most applications.

This paper will review several computer software and special-purpose digital hardware implementations of closed form algorithms, either proposed during the past several years by a number of workers or actually implemented in commercial or research CT scanners. The discussion will also cover several new special-purpose digital hardware implementations under development in our laboratories at the Mayo Clinic, Rochester, MN. Alternate implementations, using analog electronics or optical methods [1], though of considerable interest, have been omitted from consideration here in the interests of brevity. For excellent recent overviews of the numerous types of iterative and closed form reconstruction algorithms, see [2]-[4].

0018-9294/81/0200-0098$00.75 © 1981 IEEE

Early X-ray CT scanners used a pencil beam of X-rays, thereby allowing an assumption that all X-rays from a given angle of view were parallel to one another, in turn allowing so-called "parallel beam" reconstruction algorithms to be implemented. In order to permit more rapid scanning and data collection than was feasible for pencil beam systems, the newer X-ray CT scanners now employ a fan beam of X-rays with an edge to edge beam divergence of approximately 30°-45°.
Though the fan beam algorithms were more difficult to derive and frequently are more costly to execute than the parallel beam algorithms, they are today in common use.

Implementation of any of the closed form reconstruction methods requires a careful scrutiny of every stage of the process, including an analysis of the algorithm's numerical characteristics, and selection of the appropriate processor on which to execute it. Section I of this paper will concentrate on the algorithmic aspects of such an implementation. Although a brief description will be presented of various fan beam algorithms currently extant, primary emphasis will be given to the filtered back projection approach. Methods of executing the filtration portion of this algorithm, as well as simplifications thereof to decrease the required amount of computation, will be discussed. The requirement for, and mechanization of, an interpolation of extra values into the filtered projections will then be described. Several optimizations of the back projection computations using approximation methods will be covered, followed by an example of improvements which can be identified in the fundamental numerical operations employed in reconstruction, e.g., the inner product.

Section II of the paper concentrates on physical implementation of the algorithms, whether in software or on specially designed processors. A discussion of several implementations of the filtration step is followed by descriptions both of serial processing ("projection-driven") and of parallel processing ("pixel-driven") approaches. Finally, the impact of advanced technology on processor efficiency will be examined.

I. ALGORITHMIC CONSIDERATIONS
A. Fan Beam Reconstruction by Rebinning

The first attempt to generalize the parallel beam filtered back projection reconstruction algorithm of Ramachandran and Lakshminarayanan [5] to a fan beam energy source is based upon the recognition that projections generated by a fan beam source and collected from around 360° of the object to be reconstructed contain data samples, or ray sums, which could equally well have been formed by energy sources generating parallel beams of penetrating radiation. Judicious rearrangement of ray sums from all fan beam generated projections could thus create a new set of projections with the same characteristics as those generated by parallel sets of penetrating rays. The rearranged projections could then be reconstructed into cross-sectional images by application of parallel beam reconstruction algorithms such as that derived in [5]. These "rebinning" algorithms [6]-[9] have been implemented in software by several workers and also in hardware for at least two special-purpose computers dedicated to the reconstruction task [10].

One of the major deficiencies of rebinning algorithms is the requirement to execute a two-dimensional interpolation of ray sums, in both the angular dimension and between measured ray sums. If no interpolation is carried out, rebinning algorithms generate parallel projections in which the number of ray sums is usually very much less than the number of original ray sums recorded in the fan beam projections. This reduction in sample density results in a decrease in bandwidth of the information content of the parallel projections, and is thus never used in actual practice. In some schemes, additional projections are generated by linear interpolation of projections between two consecutive fan angles; the desired parallel ray sums are then obtained by interpolation from the two closest fan beam ray sums. Linear interpolation between measured ray sums usually suffices, but does degrade the resolution of the parallelized projection data.
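The two-step linear interpolation just described can be sketched as follows. This is a minimal illustration, not the implementation of any of the cited rebinning processors; the function and variable names are hypothetical, and a curved detector with uniformly sampled ray angles is assumed, so that the parallel ray at angle theta and offset t coincides with the fan ray at gamma = arcsin(t/D), beta = theta - gamma.

```python
import numpy as np

def rebin_parallel(fan, betas, gammas, theta, t, D):
    """Estimate one parallel ray sum at angle theta and offset t from
    curved-detector fan beam data by two linear interpolations.

    fan[j, i] is the ray sum recorded at source angle betas[j] for the
    ray at angle gammas[i] within the fan; D is the source-to-center
    distance.  The value is interpolated first along each detector
    (the gamma dimension), then between the two closest source angles
    (the beta dimension).
    """
    gamma = np.arcsin(t / D)        # ray angle within the fan
    beta = theta - gamma            # source angle of the coincident fan ray
    # Interpolate along the detector within every projection, then
    # between projections.  (Interpolating all views is wasteful but
    # keeps the sketch short; a real implementation would touch only
    # the two neighboring source angles.)
    per_view = np.array([np.interp(gamma, gammas, fan[j])
                         for j in range(len(betas))])
    return float(np.interp(beta, betas, per_view))
```

On a smooth test sinogram the two linear interpolations recover the underlying parallel ray sum to within the quadratic interpolation error of the sampling grids.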
Higher order interpolation (e.g., cubic splines) can be used at the expense of more computation. In addition, the filtration and back projection (see Section I-C for a discussion of back projection) of the parallelized projections cannot proceed until all projections have been collected. However, the rebinning algorithms are more computationally efficient than the direct fan beam algorithms during the back projection portion of the procedure, since they omit a weighting operation which must be executed in the direct fan beam algorithms (as described later).

B. Fan Beam Reconstruction by Fourier Techniques

Several workers have presented parallel beam and fan beam reconstruction algorithms which operate partially or totally in the frequency (Fourier) domain. In one such algorithm, the raw projections are first "back projected"; the back projected image is then filtered in the two-dimensional (2-D) Fourier domain by a suitably chosen filter function, e.g., a ramp, or a ramp multiplied by a window function. Fourier filtration is accomplished by taking the forward 2-D fast Fourier transform (FFT) of the back projected image, then multiplying each frequency component of the transformed image by the filter function, and finally by forming the inverse 2-D FFT of the transformed image [11], [12]. Although the use of the FFT technique tends to minimize the amount of computation required to execute the filtration process (see below), this method requires four times more computer memory than the spatial domain techniques.

C. Truly Fan Beam Reconstruction Algorithms

The development in 1975 of a truly fan beam filtered back projection reconstruction algorithm [13], followed by a rigorous derivation thereof [14]-[16], resulted in a conversion by many workers from parallel beam or rebinning approaches to the direct reconstruction of the fan beam projection data.
This algorithm is composed of three stages. First, the unprocessed raw projection vectors are multiplied by a precalculated weighting vector of the same length. Each X-ray projection is then filtered with a carefully formulated finite impulse response (FIR) filter kernel [17], either by direct one-dimensional convolution in the spatial domain, or by means of the FFT or number-theoretic transforms [18], [19] and direct vector multiplication of the data vector and filter function vector in the frequency domain [2], followed by inverse transformation back to the spatial domain. After preweighting and filtration, the projections are "back projected" into the image space to generate the gray scale values of the image picture elements (pixels).

To generate an image pixel by back projection, one ray sum from each of the filtered projections is selected according to a prespecified selection algorithm; all ray sum values are added together to create the gray scale value of the pixel [2]. Stated alternately, since each ray sum represents the sum of the energy absorption coefficients of the pixels through which its associated ray traversed, including appropriate corrections for the actual ray path length through each pixel, back projection is the procedure by which the magnitude of each ray sum is partitioned among those pixels. In addition, if the individual rays in the planar beam diverge in their passage through the structure, as is the case when the energy source radiates in a fan-shaped pattern, the intensity of the unattenuated beam decreases with increasing distance from the source. Hence, a weighting term must be incorporated into the calculation of the contribution of each ray sum to its associated image pixels as a function of the relative locations of the source and the pixels.
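The selection, interpolation, and divergence weighting just described can be sketched, for a single pixel, as follows. This is only an illustrative pixel-driven sketch, not the algorithm of any cited scanner: the function name and geometry conventions (uniform view spacing, fractional ray index measured from the central detector element, linear interpolation between the two neighboring ray sums) are assumptions made for the example.

```python
import numpy as np

def backproject_pixel(Y, betas, alpha, D, r, phi):
    """Accumulate the gray level of the image pixel at polar position
    (r, phi) from filtered fan beam projections (curved detector).

    Y[j, i] is the i-th filtered ray sum of the projection taken at
    source angle betas[j]; alpha is the angular spacing between rays,
    and D is the source-to-center-of-object distance.  For each source
    position the fractional ray index is computed, the two neighboring
    ray sums are linearly interpolated, and the contribution is
    weighted by the view spacing divided by the squared source-to-pixel
    distance factor U.
    """
    n_views, n_rays = Y.shape
    center = (n_rays - 1) / 2.0          # index of the central ray
    dbeta = betas[1] - betas[0]          # uniform view spacing assumed
    total = 0.0
    for j in range(n_views):
        num = r * np.cos(betas[j] - phi)
        den = D + r * np.sin(betas[j] - phi)
        k = np.arctan2(num, den) / alpha + center   # fractional ray index
        i0 = int(np.floor(k))
        if i0 < 0 or i0 + 1 >= n_rays:
            continue                      # ray falls off the detector
        frac = k - i0
        ray_sum = (1.0 - frac) * Y[j, i0] + frac * Y[j, i0 + 1]
        total += dbeta * ray_sum / (num * num + den * den)  # dbeta / U**2
    return total
```

For the central pixel (r = 0) every view selects the central ray exactly and the weight reduces to dbeta / D**2, which gives a simple sanity check.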
The back projection equations for data collected with such a fan-shaped beam are formulated so that the individual ray which passed through each image pixel at a given angular position of the source is identified, its corresponding ray sum is indexed from the array of all measured ray sums, and interpolation between measured ray sums is executed if a measured ray did not traverse the pixel in question (see below). The appropriate weighting factor to correct for the divergence of the beam is then calculated and applied to the ray sum to produce the contribution to the pixel from that source position. The procedure is repeated for all angular positions of the energy source until the ray sums from all angles of view have been back projected to form the gray level of that pixel; the procedure is then repeated for all the other image pixels. Several versions of direct fan beam reconstruction algorithms have appeared during the past few years [3], [13], [14], [16], [20], [21].

The equations for fan beam filtered back projection, assuming a curved X-ray detector strip [14]-[16], are

$$Z_{j,i} = c\,J_i X_{j,i} \qquad \text{preweighting} \tag{1}$$

$$Y_{j,i} = \sum_{l=-N}^{N} a_l Z_{j,i-l} \qquad \text{filtration} \tag{2}$$

$$f(r,\phi) = \sum_{j} W_{j,r,\phi}\, Y_{j,K(j,r,\phi)} \qquad \text{back projection} \tag{3}$$

$$K(j,r,\phi) = \frac{1}{\alpha}\tan^{-1}\left\{\frac{r\cos(\beta_j-\phi)}{D + r\sin(\beta_j-\phi)}\right\} \qquad \text{ray sum index} \tag{4}$$

$$W_{j,r,\phi} = \frac{\beta_{j+1}-\beta_{j-1}}{2U^2}, \qquad U = \left[(r\cos(\beta_j-\phi))^2 + (D + r\sin(\beta_j-\phi))^2\right]^{1/2} \tag{5}$$

in which the $X_{j,i}$ are samples of the unprocessed projection; the letter $c$ is a constant; $J_i$ is a preweighting vector; and the $a_l$ are the elements of a convolution kernel which is used to filter all projections [each projection contains $(2N+1)$ elements]. The $Z_{j,i}$ are elements of a preweighted projection prior to filtration, and the $Y_{j,i}$ are the elements of a projection after filtering.
The value $f(r,\phi)$ is the magnitude of a back projected pixel at location $(r,\phi)$ (the sign of $\phi$ is positive for counterclockwise angular displacements measured from the positive x-axis), and $Y_{j,K}$ represents the $K$th ray sum from the $j$th filtered projection; the subscripts on $Y$, i.e., $[j, K(j,r,\phi)]$, represent the index of ray sum $Y$, or the "ray sum index," referred to the image pixel at location $(r,\phi)$, from the $j$th projection. Although the subscript $i$ is an independent variable, $K$ must be computed from (4). The $W_{j,r,\phi}$ represents a multiplicative weight, applied to the $K$th ray sum from the $j$th filtered projection undergoing back projection into the image pixel at location $(r,\phi)$. $D$ represents the energy source-to-center-of-object distance; $\beta_j$ is the angle between the y-axis and a line connecting the $j$th source position and the center of the object; and $\alpha$ is the angular increment between individual rays in each fan beam.

D. Overview of Mechanizations of the Filtration Step

The linear filtration of the individual projections can be carried out by any viable filtration method. For computed tomography applications, direct convolution filtration has been most frequently employed, followed in popularity by an application of the classical FFT to implement a so-called "FFT-fast convolution" linear filtration algorithm [22]. The number of multiplication and addition operations required for Fourier filtration of vectors containing 256-1024 or more elements is considerably smaller than for direct convolution approaches. The relative cost of arithmetic executed on general-purpose computers has been so great that the reduction in absolute numbers of computations achieved by the FFT-fast convolution method has resulted in a significant reduction in processing cost. In addition, although the FFT is capable of transforming vectors composed of complex elements, the elements of the projection data vectors to be filtered are pure real (i.e., not complex elements).
In such a case, the effective throughput of the FFT algorithm can be improved considerably by "packing" two real vectors together and transforming them simultaneously; such approaches are referred to as "decimation in time" or "decimation in frequency" [23], [24]. Nonetheless, several characteristics of the classical FFT algorithms present their own unique sets of complications. FFT convolution, even of real vectors, requires execution of numerical operations in the domain of complex numbers; further, because these computations require use of trigonometric functions and thereby of irrational numbers, the availability of floating point arithmetic software and hardware capability is desirable (though not absolutely mandatory) to minimize computational roundoff error. In addition, to ensure that aliasing errors [22] do not appear in the data vector during the transformation or inverse transformation processes, a set of zero elements must be included at each end of the vector to be transformed, thereby effectively increasing the bandwidth of the transform operation itself. A variety of FFT algorithms are documented in [24]-[27].

The theory of linear transformations has recently been expanded by the development of number-theoretic and finite-field transforms [18], [19], [28], [29], a class of linear transformations defined to exist only on carefully selected finite fields of numbers, e.g., the subspace of real rational (integer) values. Although these new transforms require a much lower absolute count of arithmetic operations even in comparison to the classical FFT, they appear to require significantly greater numbers of logical operations and memory references, as well as very complicated control sequences to assure correct generation of intermediate results.
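The zero-padded FFT-fast convolution filtration discussed above can be sketched as follows. This is a minimal NumPy sketch under illustrative assumptions (function name and power-of-two padding choice are the author's of this example, not of the cited implementations); the real-input transform exploits the fact that projections are pure real, and the padding to at least the linear convolution length is what prevents the circular wraparound (aliasing) noted in the text.

```python
import numpy as np

def fft_filter(projection, kernel):
    """Filter a real projection with a real FIR kernel via the FFT.

    Both vectors are zero padded to at least the length of their
    linear convolution so that the circular convolution implied by
    the FFT does not wrap around; numpy's rfft/irfft pair exploits
    the real-valued inputs.
    """
    n = len(projection) + len(kernel) - 1   # linear convolution length
    nfft = 1 << (n - 1).bit_length()        # next power of two >= n
    P = np.fft.rfft(projection, nfft)
    K = np.fft.rfft(kernel, nfft)
    return np.fft.irfft(P * K, nfft)[:n]
```

The result matches direct convolution to within floating point roundoff, which is the sense in which the two mechanizations are interchangeable.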
Though the finite field transforms appear slightly faster than the FFT when executed on general-purpose computers, it has yet to be demonstrated conclusively that they are superior to the classical FFT or direct convolution methods in computed tomography applications.

Direct convolution of real vector operands of fixed precision can be carried out entirely within the domain of real integers, thereby eliminating the requirement for floating point processors. The conceptual simplicity of direct convolution, and the straightforward design, preparation, and testing of appropriate software or hardware, makes implementation of the direct convolution approach very attractive. In addition, direct convolution can be simplified by exploiting specific numerical characteristics of the filter function itself. For example, a considerable computational saving can be achieved if the filter function is even, i.e., symmetric about its ordinate, in the spatial domain. In that case half the kernel may be reflected about the ordinate; since the filter weights $a_i$ and $a_{-i}$ have the same numerical value, elements of the projection vector which multiply $a_{-i}$ and $a_i$ are first summed, and thereafter multiplied by $a_i$. Only the order of execution, not the number, of additions is altered, but the requisite number of multiplications is nearly halved. That this approach can always be invoked is assured by the requirement that a linear phase response be maintained by the filter function to prevent phase distortion of the filtered projections, and thereby of the complete reconstructions. Linear phase response is guaranteed only if the filter function is pure real in the spatial frequency domain, and therefore an even function, i.e., symmetric about its ordinate, in the spatial domain [30].

Another simplification of direct convolution filtration is based on the extension of a well-known signal processing concept, that of the half-band filter kernel.
In such an imple- mentation, all elements in the filter vector with even indices, except the central (zeroth)element, are defined to be zero [31]. The resulting restriction in equivalent filtration band- width of thehalf-band function, in comparison tothefull-bandkernel, does not preclude satisfactory image quality provided thatthe bandwidth of the projection data is not too great. As an extension of this concept, it is alsopossible to con-ceptualize a filter kernel in which several consecutive elements are identicallyzero, followed by one or more consecutive non- zero elements, and so on [32]-[34], yieldinga  fractional- band filter kernel [35],with aneven more restricted filtration bandwidth (e.g., a  low-pass filter with an even lower cutoff frequency). The tendency ofsuch fractional-band filter kernels to smooth the reconstructed images excessively can be partially overcome by judicious selection of thevalues of the remaining nonzero elements. Algorithms have been developed recently [32] -[34] for generating fractional-bandkernels which appear to yield satisfactory reconstructed images from high resolution phantoms scanned in research X-ray and ultra- sound CT scanners.Direct convolution filtration can be further simplified by exploitingrecently developed techniques [32], [33], [36] which can convert agiven convolution kernel into one in which the central value is afixed precision irrational number, while all othervalues of the kernel are simple fixedprecision binary multiples of one another, i.e., values such as 2, 16, 64, or - 128. The conversion ofnonbinary filter weights to fixed precision binary numbersmakes it possible to execute the filtration primarily by binary shift and add operations, nearly obviatingthe requirement for execution of multiplications. This technique is so powerful when implemented on binary arithmetic computers that it will be discussed in greater detail. 
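Before turning to the binary kernels in detail, the folding of even-symmetric kernels described earlier can be sketched as follows. This is a minimal illustration with hypothetical names: kernel_half holds $a_0 \dots a_N$ of a kernel with $a_{-i} = a_i$, and data samples that share a weight are summed before the single multiplication, so roughly half the multiplications are saved.

```python
import numpy as np

def folded_inner_product(seg, kernel_half):
    """One output sample of a convolution with an even-symmetric kernel.

    seg is a data window of length 2N+1 and kernel_half holds the
    one-sided weights a_0..a_N.  The two data samples that multiply
    a_{-i} and a_i are summed first and multiplied once, so only N+1
    multiplications are needed instead of 2N+1; the number of
    additions is unchanged.
    """
    N = len(kernel_half) - 1
    total = kernel_half[0] * seg[N]          # central element a_0
    for i in range(1, N + 1):
        total += kernel_half[i] * (seg[N - i] + seg[N + i])
    return total
```

Against the full symmetric kernel the folded form produces an identical inner product, only with a different grouping of the additions.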
It is well known that the integral of a continuous convolution kernel is zero; however, when such a kernel is converted into a sampled version for digital processing, and truncated to an acceptable (i.e., noninfinite) length to allow reasonable execution times, the sum, $E$, of the resulting kernel can no longer be zero. Such nonzero-sum convolution kernels are widely used in computed tomography applications with acceptable results. Satisfactory image quality has been achieved by emulating the characteristics of these nonzero-sum kernels during the formation of a binary valued kernel from an arbitrary convolution kernel.

The binary kernel is generated in the following manner. Let $S$ be a binary approximation to the central element of the truncated nonbinary kernel. The truncated nonbinary kernel is divided by the value of its own central element to yield a normalized kernel $K_n(i)$. The binary kernel $K_B(i) = 2^m$ (except for the central element $i = 0$) is obtained from $K_n(i)$ such that

$$|K_n(i) - 2^m| < |K_n(i) - 2^n|, \qquad m \neq n \tag{6}$$

assuming $m$ and $n$ are integers. The central element $K_B(0)$ is formed by summing the other binary elements. Every element is then multiplied by $S$, which is merely a number of the form $2^m$. The central element is then adjusted (incremented or decremented) by the value $E$. Thus, except for the central element, all other elements are binary numbers of the form $\pm 2^m$. Lastly, a correction must be made to all pixels of the final reconstructed image, consisting of a constant offset to correct its average gray level, followed by a multiplicative scaling of the entire image. It is of interest to note that the generation of a binary kernel frequently results in several adjacent elements with the same numerical value.
Thus, a further simplification in computation can be achieved during the execution of each inner product phase of the convolution by first summing the data values corresponding to the identical kernel elements, and then by executing a single binary shift operation, in contradistinction to the usual case in which a binary shift operation is executed for each data value. For example, the usual convolution formulation for the projection $P(i)$ and kernel $K(i)$ can be rewritten in a somewhat simplified format from that of (2) as

$$(P * K)_j = \sum_{n=0}^{N} K(n-j)\,P(n). \tag{7}$$

When executed using binary kernels, (7) can be rewritten as

$$(P * K_B)_j = P(j)\,K_B(0) + \sum_{n=1}^{L} g_j^n\, 2^n \tag{8}$$

$$g_j^n = \sum_{t=R_n}^{S_n} P(t) \tag{9}$$

where $R_n$ and $S_n$ are the lower and upper indices in the binary kernel of identical terms of power $2^n$, and $L$ is the total number of such groups.

Furthermore, a considerable incremental savings in addition instructions can be realized by a recognition that with increasing distance from the central element of the kernel, i.e., at its symmetric "tail" ends, the number of consecutive kernel elements with identical numerical values increases rapidly. Rather than summing all projection data values to be multiplied by a given kernel element every time the index $j$ is incremented (i.e., every time the projection is indexed past the kernel by one element position), it is less costly to subtract the "oldest" projection sample from the running sum and add a new projection sample to the running sum. This procedure is efficient only if employed for the elements in the tails of the kernel, where many consecutive elements are identical. This procedure is expressed by the relationship

$$g_{j+1}^n = g_j^n - P(R_n) + P(S_n + 1). \tag{10}$$

The algorithm described above was used to generate binary filter kernels whose performance was then tested on simulated and real X-ray and ultrasound projection data.
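Two of the ingredients above can be sketched in a few lines. This is an illustrative sketch, not the published kernel-generation algorithm of [32], [33]: nearest_power_of_two rounds a normalized kernel element to the closest signed power of two in the spirit of (6), and grouped_inner_product sums the data values that multiply a run of identical kernel elements before performing a single multiplication, as in (8) and (9). The names are hypothetical, and the "shift" appears here as an ordinary floating point multiply; on integer hardware each power-of-two multiply would become a binary shift.

```python
import numpy as np

def nearest_power_of_two(x):
    """Round a nonzero value to the signed power of two closest to it
    (ties go to the larger magnitude); zero stays zero."""
    if x == 0.0:
        return 0.0
    m = np.log2(abs(x))
    lo, hi = 2.0 ** np.floor(m), 2.0 ** np.ceil(m)
    p = lo if abs(abs(x) - lo) < abs(abs(x) - hi) else hi
    return float(np.copysign(p, x))

def grouped_inner_product(seg, kernel):
    """Inner product in which data samples that multiply identical
    kernel elements are summed first and multiplied once, so each run
    of equal kernel values costs a single multiplication."""
    total = 0.0
    i = 0
    while i < len(kernel):
        j = i
        while j + 1 < len(kernel) and kernel[j + 1] == kernel[i]:
            j += 1                              # extend run of equal values
        total += kernel[i] * float(np.sum(seg[i:j + 1]))  # one multiply per run
        i = j + 1
    return total
```

With a binary valued kernel such as (-1/4, -1/4, -1, 4, -1, -1/4, -1/4), the grouped form reproduces the ordinary inner product while replacing one multiplication per sample with one per run of identical elements.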
The fractional-band kernels and their binary versions were tested for accuracy of reconstructed density values for X-ray projection data as well as for ultrasound time-of-flight data [32], [33]. Reconstruction of the same projection data sets, both with binary kernels and with the original full-valued kernels, followed by comparison of the resulting images, demonstrated that the binary kernel reconstructions were both stable and accurate. However, several errors in the reconstructed density values were observed near the skull boundaries of very high contrast skull phantoms [37]. The frequency response of the binary kernels is given in Fig. 1, demonstrating the close approximation of the binary kernels to the original kernels.

[Fig. 1. Modulus of Fourier transform of convolution kernels. The kernels shown are Ram-Lak [14] and Shepp-Logan [37] and their binary versions. Since the kernels are symmetric about the ordinate, only half of the Fourier domain is shown. Reproduced with permission from [33].]

If in some applications the errors in reconstructed image density values caused by the binary kernel approximations are intolerable, the following recently developed technique for improvement of the results can be employed. Since the convolution algorithm is both associative and commutative, the convolution of a projection with an arbitrary full-valued kernel can be subdivided into several successive convolutions of the projection with a sequence of different binary kernels. The resulting intermediate convolved projections can then be summed to yield the final filtered projection. For such a technique, the binary approximation to the original kernel is obtained, and then convolved with the preweighted projection (e.g., using binary shift and add instructions). Next, a "residual" convolution kernel is obtained, which is the element-by-element difference between the original kernel and the binary kernel.
A binary approximation to the residual kernel is then generated, the preweighted projection is again convolved, this time with the residual binary kernel, and the result is added to the initial convolved projection generated by the first convolution described. If desired, this successive approximation process can be repeated. In effect, the elements of the full-valued kernel are rewritten as sums and differences of binary numbers. Note that the central element of the kernel is always a nonbinary value and hence requires the execution of a fixed precision multiplication operation for the evaluation of its contribution at each stage of the convolution. This technique