Efficient Image Restoration Algorithms for Near-Circulant Systems

Except where reference is made to the work of others, the work described in this dissertation is my own or was done in collaboration with my advisory committee. This dissertation does not include proprietary or classified information.

Ruimin Pan

Certificate of Approval:

Thomas S. Denney, Professor, Electrical and Computer Engineering
Stanley Reeves (Chair), Professor, Electrical and Computer Engineering
Jitendra K. Tugnait, Professor, Electrical and Computer Engineering
George T. Flowers, Interim Dean, Graduate School

Efficient Image Restoration Algorithms for Near-Circulant Systems

Ruimin Pan

A Dissertation Submitted to the Graduate Faculty of Auburn University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Auburn, Alabama
May 10, 2007

Permission is granted to Auburn University to make copies of this dissertation at its discretion, upon the request of individuals or institutions and at their expense. The author reserves all publication rights.

Signature of Author
Date of Graduation

Vita

Ruimin Pan, daughter of Weizhong Pan and Xuejun Tan, was born in Chongqing, Sichuan Province, P.R. China on June 2nd, 1976. She received her Bachelor degree and Master degree in Electronic Engineering from Southeast University in July 1998 and March 2001, respectively. She entered the Ph.D. program in the Department of Electrical and Computer Engineering at Auburn University in August 2001. Her research interests are in the areas of digital signal/image processing, image restoration, compression and inverse problems.

Dissertation Abstract

Efficient Image Restoration Algorithms for Near-Circulant Systems

Ruimin Pan
Doctor of Philosophy, May 10, 2007
(M.E., Southeast University, P.R. China, March 2001)
(B.E., Southeast University, P.R. China, July 1998)

130 Typed Pages

Directed by Stanley J.
Reeves

The problem of image restoration has been extensively studied for its practical importance as well as its theoretical interest. The restoration problem arises in almost every branch of engineering and applied physics. The goal of image restoration is to recover the original scene from the degraded observations. It seeks to model the degradations, blur and noise, and to apply an inverse procedure to obtain an approximation of the original scene.

Because image restoration requires a huge amount of computation and storage, techniques and algorithms that improve the speed or quality are desirable. A great variety of fast restoration methods have been proposed. The best-known fast restoration algorithms involve the use of fast Fourier transforms (FFTs) to implement shift-invariant deblurring. However, in the face of unknown boundaries, shift-variant smoothing, and other conditions, FFTs cannot be used in a straightforward manner in the inversion process. A group of efficient image restoration algorithms for circulant or near-circulant systems is proposed in this dissertation that overcomes some of these limitations in the direct application of fast transforms. We assume that the system is circulant or near-circulant so that the convolution property of the FFT can be used.

Regularization of the least-squares criterion is an effective approach in image restoration to reduce noise amplification. To avoid the smoothing of edges, edge-preserving regularization using a Huber-Markov random field (HMRF) model is often used to allow realistic edge modeling and provide stable maximum a posteriori (MAP) solutions. However, this approach is computationally demanding because the introduction of a non-Gaussian image prior makes the restoration problem shift-variant. In this case, a direct solution using FFTs is not possible even when the blurring is shift-invariant.
We consider a class of edge-preserving HMRF functions that are convex and have non-quadratic regions that impose less smoothing on edges. We propose a decomposition-enabled edge-preserving image restoration (DEEPIR) algorithm for maximizing the likelihood function. By decomposing the problem into two sub-problems, one shift-invariant and the other shift-variant, our algorithm exploits the sparsity of edges to define an FFT-based iteration that requires few iterations and is guaranteed to converge to the MAP estimate.

The assumption of a circulant system does not always hold. Many problems in signal and image processing require the solution of systems with a Toeplitz-block-Toeplitz (TBT) structure in which both the Toeplitz blocks and the block structure are banded. Some fast algorithms make use of the persymmetry (symmetry about the main antidiagonal) of the Toeplitz blocks, while the bandedness remains unexploited. Other algorithms exploit block bands but not banded blocks (or vice versa).

We present a fast algorithm that exploits both the bandedness of the blocks and the block-bandedness of the block Toeplitz structure by extending the system to a circulant-block-circulant (CBC) system and solving for the original image through the larger system. Since a Toeplitz-block-Toeplitz system has a near-circulant structure, the computation involved in the extending and solving is comparatively small. Other published algorithms for TBT matrices typically involve O(M^5) operations, or O(6M^3) operations under a bandlimited assumption. This method requires O(k^2 M^3) operations for an M^2 x M^2 Toeplitz-block-Toeplitz matrix with bandwidth k, without any assumptions about the system.

The edge-preserving regularization method proposed here also has applications in deblocking JPEG images, where the block discrete cosine transform and quantization cause contouring and blocky artifacts in the compressed image.
By integrating regularization into the decompression of a compressed JPEG image, this algorithm significantly reduces blocky effects in the image. The proposed edge-preserving regularization scheme can be applied to reduce ringing artifacts on edges and to smooth block boundaries in JPEG images. When the image restoration system is a TBT system instead of a CBC system, we can still make use of the FFT by extending and displacing the system to achieve a fast solution. In general, the techniques proposed in this dissertation improve both the quality of restored images and the computational performance.

Acknowledgments

This dissertation would not have been possible without the generous help of my research advisor Stanley J. Reeves. I would like to sincerely thank him for his invaluable guidance, encouragement and inspiration during the course of my research. I am particularly indebted to him for having faith in me and never losing patience in answering my questions and concerns on both technical and nontechnical topics. Our stimulating discussions not only helped to shape the contents of this work, but also gave me a unique insight into the challenging process of conducting independent scientific research.

I would like to thank Dr. Thomas S. Denney for being my committee member and setting up the computer lab where I conducted most of my experiments and simulations. I would like to thank Dr. Jitendra K. Tugnait and Dr. Amnon J. Meir for their advice on the thesis and for being on my committee. I would also like to express my deep gratitude to the Department of Electrical and Computer Engineering, my research advisor and the Air Force Office of Scientific Research for providing indispensable financial support for my graduate studies.

Finally, I would like to especially thank my beloved parents, Weizhong Pan and Xuejun Tan, and my husband, Tao Zhou, for their love and support that have made this work possible. This dissertation is dedicated to them.
Style manual or journal used: Discrete Mathematics (together with the style known as "auphd"). Bibliography follows van Leunen's A Handbook for Scholars.

Computer software used: The document preparation package TeX (specifically LaTeX) together with the departmental style file auphd.sty.

Table of Contents

List of Figures

1 Introduction and Background
  1.1 Digital Image Processing
    1.1.1 Digital Versus Analog
    1.1.2 Digital Image Processing System
    1.1.3 Image Processing Methods
  1.2 Image Restoration
    1.2.1 Image Formation
    1.2.2 Ill-Conditioned Nature of Restoration
    1.2.3 Image Restoration Filters
    1.2.4 Regularized Restoration

2 Huber-Markov Edge-Preserving Regularization
  2.1 Edge-Preserving Regularization
  2.2 Restoration Models
  2.3 Huber-Markov Random Field Model
  2.4 Majorization
  2.5 HMRF Edge-Preserving Regularization
  2.6 Experiment
  2.7 Block-Based Implementation
    2.7.1 Block-based Approximation
    2.7.2 Experiment on Block-based Approximation
    2.7.3 Discussion about Block-based Approximation
    2.7.4 Analysis of Overlapping

3 Fast Algorithm for Solving Block Banded Toeplitz Systems with Banded Toeplitz Blocks
  3.1 Introduction
  3.2 Solution Formulation
  3.3 Generalized Displacement Rank
    3.3.1 Definition and properties
    3.3.2 Choice of Z
  3.4 Modified System Solution
  3.5 Experiments
    3.5.1 Numerical Example
    3.5.2 Regularized Image Restoration
  3.6 Summary and Discussion

4 Deblocking of JPEG-compressed Images
  4.1 JPEG Compression
    4.1.1 Introduction
    4.1.2 JPEG Algorithm
  4.2 Discrete Cosine Transform
    4.2.1 1-D DCT
    4.2.2 2-D DCT
  4.3 Quantization
  4.4 Deblocking JPEG Images
    4.4.1 Blocking Effect
    4.4.2 Deblocking with Regularization
  4.5 Summary

5 Conclusion and Discussion
  5.1 Edge-Preserving Regularization
  5.2 Fast Algorithm for Solving Block Banded Toeplitz Systems with Banded Toeplitz Blocks
  5.3 Deblocking of JPEG-compressed Images
  5.4 Future Work

Bibliography

List of Figures

1.1 Digital image processing system
1.2 Contrast stretching
1.3 Image formation system
1.4 Inverse filter
1.5 Wiener filter
2.1 Ringing effect
2.2 Non-quadratic ρ(·)
2.3 Huber function
2.4 Majorizing function
2.5 Original images
2.6 Restored images
2.7 Experiment on "shape"
2.8 Samford hall
2.9 Overlapping blocks
2.10 Block-based restoration of "cameraman" image
2.11 Normalized E image
3.1 Restoration of Jupiter image
4.1 JPEG-compressed image
4.2 JPEG encoder
4.3 JPEG decoder
4.4 The eight basis vectors for the discrete cosine transform of length eight
4.5 Discrete cosine basis functions for N = 8
4.6 Scalar quantization
4.7 Zigzag scan
4.8 Blocking effect
4.9 Block boundaries
4.10 Vertical boundaries
4.11 Horizontal boundaries
4.12 Deblocking Lena image

Chapter 1

Introduction and Background

1.1 Digital Image Processing

1.1.1 Digital Versus Analog

A digital image is an array of real or complex numbers represented by a finite number of bits [1]. Normally, the term "digital image processing" refers to the processing of a two-dimensional picture by a digital computer. Although an image by nature is usually an analog quantity, there are some compelling reasons why digital image processing is preferable to optical methods [2].

- Increasingly, the natural form in which an image is formed and acquired is not analog but discrete.
  Digital imaging devices like digital cameras and camcorders have a CCD image sensor array that converts light into electrons and transports the charge to an analog-to-digital converter, where the amount of charge is measured and converted to binary form. With the development of such devices, higher-resolution images are available, so that the image quality is close enough to that of analog ones.

- There has been, and will continue to be, an increase in the price/performance capabilities of digital hardware. Computers with great computing power can perform image processing operations in seconds. Increasing sophistication in digital image input/output systems has eliminated what used to be a major source of errors in digital image processing [2]. Modern scanners, digital cameras and displays are not only accurate and reliable but also inexpensive.

- Image processing algorithms have also developed vastly during the last three decades. Fast transform and fast convolution algorithms have led to several orders of magnitude improvement in computation time.

Digital image processing has a broad spectrum of applications, such as remote sensing via satellites, medical imaging, radar, sonar, and acoustic image processing. Image transmission and storage are also made easy with digital methods. In fact, there is almost no area of technical endeavor that is not impacted in some way by digital image processing. In nuclear medicine, the patient is injected with a radioactive isotope that emits gamma rays as it decays. Images are recorded from the emissions and used to locate sites of bone pathology, such as infections or tumors. Multi-spectral imaging is widely used in monitoring environmental conditions such as vegetation, maximum water penetration and soil moisture [3].

1.1.2 Digital Image Processing System

An image processing system consists of a sequence of steps. First, the object is observed and recorded on an imaging system.
A typical example of an imaging system is the human eye. An image is formed on the human retina by the iris-lens portion of the eye; the image is then sent to the brain through the neural system. In a digital system, the image is sampled and quantized so that it can be stored in a digital medium. The digital image, normally a two-dimensional array, is then processed by a computer with certain algorithms and sent to display or recording devices such as printers (Fig. 1.1).

Figure 1.1: Digital image processing system

In general, any two-dimensional function that bears information can be considered an image [1]. An image can represent the luminance of an object, the temperature profile of a region, the absorption characteristics of body tissue or the radar cross section of a target. An image can also be a color image or a monochrome image. Monochrome images are often described using gray levels. Color images have more than one representation; the most commonly used include the RGB, CMYK and YCbCr models. Another classical method of image representation is an orthogonal series expansion, such as the Fourier series.

According to Nyquist's sampling theorem, when sampling a band-limited signal, the sampling frequency must be greater than twice the input signal bandwidth in order to be able to reconstruct the original perfectly from the sampled version. In reality, an image usually is not strictly band-limited. However, by ignoring (or filtering out) the high-frequency components and sampling at a comparatively high rate, the sampled image can be close enough to the original version. After the image is sampled, it is quantized so that it has a finite number of gray levels that can be stored in a computer.
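Uniform quantization to a finite number of gray levels can be sketched in a few lines. This is an illustrative NumPy snippet, not code from the dissertation; the function name and the 2-bit setting are hypothetical choices for the example.

```python
import numpy as np

def quantize(image, bits):
    """Uniformly quantize values in [0, 1] to 2**bits gray levels."""
    levels = 2 ** bits
    # Map [0, 1] to integer codes 0 .. levels-1, then back to gray values.
    codes = np.clip(np.round(image * (levels - 1)), 0, levels - 1)
    return codes / (levels - 1)

# A smooth ramp quantized to 2 bits collapses to 4 gray levels:
ramp = np.linspace(0.0, 1.0, 9)
q = quantize(ramp, 2)
print(q)   # only the levels 0, 1/3, 2/3, 1 survive
```

Rounding to the nearest level bounds the quantization error by half a level spacing, but the discarded fractional part cannot be recovered, which is exactly the non-invertible error mentioned in the text.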
Quantization and sampling may introduce non-invertible errors and noise into the image and should be part of the image model in the processing algorithm.

After an image is digitized, it is processed according to certain image processing requirements. For example, image enhancement techniques could be applied to improve the contrast or brightness of the image or to highlight certain features of interest in it. If the image is blurry because of some optical distortion, it should be restored. With the dramatic growth of the internet, more and more information is transmitted online. To save bandwidth and improve speed, most pictures and graphs are compressed images such as JPEG and GIF images. These images are converted from basic RGB data by all kinds of compression toolkits using compression algorithms such as the discrete cosine transform, Huffman coding or LZW compression. The output of an image processing system can be sent to a display device such as a computer monitor or to recording devices like printers or plotters.

1.1.3 Image Processing Methods

Digital image processing is a general concept referring to a group of different operations on an image. Mathematically, any operation on a two-dimensional matrix is meaningful in some way. But for an image, there are a few special operations besides image restoration that help extract useful information from the array.

Image enhancement is among the simplest and most appealing areas of digital image processing [3]. Typical image enhancement includes contrast manipulation, histogram modification, noise cleaning, edge sharpening and color enhancement. Most acquired images are not in ideal condition; for example, an image might be too dark to show any details. To accentuate certain features in the image for future analysis, brightness or contrast can be improved by amplitude rescaling of each pixel, or the image histogram can be stretched to separate different gray levels (Fig. 1.2).
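The contrast stretching just mentioned amounts to a linear amplitude rescaling of each pixel. A minimal NumPy sketch (illustrative, not the dissertation's code; the function name and sample values are made up):

```python
import numpy as np

def stretch_contrast(image):
    """Linearly rescale pixel amplitudes so they span the full [0, 1] range."""
    lo, hi = image.min(), image.max()
    if hi == lo:                        # flat image: nothing to stretch
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)

# A dark, low-contrast image occupying only [0.4, 0.6] ...
dark = np.array([[0.40, 0.45],
                 [0.55, 0.60]])
print(stretch_contrast(dark))   # ... now spans the full [0, 1] range
```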
In many image processing systems, image enhancement is used as a pre-processing step.

Figure 1.2: Contrast stretching

Image reconstruction is closely related to image restoration. It differs from image restoration in that image reconstruction operates on a set of image projections rather than on a full image. Image reconstruction shares the same objective as restoration, recovering the original image, and ends up solving the same mathematical problem: finding a solution to a set of linear or nonlinear equations [4].

Geometric transformation is another common image processing operation, in which an image is spatially translated, scaled, rotated or nonlinearly warped. Image translation, scaling and rotation are all linear operations on the coordinates, while image warping is a nonlinear mapping of coordinates. It can also be viewed as a spatial distortion or rubber-sheet stretching. Geometric transformations have found uses in medical imaging, computer vision, and computer graphics.

Image analysis refers to the extraction of measurements, data or information from an image by automatic or semi-automatic methods. It is distinguished from other types of image processing, such as enhancement and restoration, in that the ultimate product of an image analysis system is usually numerical output rather than a picture [5]. Edge detection is a commonly used image analysis operation. Discontinuities in an image amplitude attribute such as luminance are fundamentally important primitive characteristics of an image, because they often provide an indication of the physical extent of objects in the image. In edge detection, an edge model is established first; then a certain algorithm is applied to match the edge model with the original image, thus finding the edges. First or second derivatives are often used in edge detection.

Image compression is concerned with minimizing the number of bits required to represent an image.
In broadcast television, remote sensing, teleconferencing and computer communications, fast transmission and small storage are of significant practical and commercial interest. Data compression exploits data redundancy to represent the information with less data. In an image, certain correlations exist among pixels. Similarly, inter-frame redundancy exists among the frames of a video. Additionally, a lower image fidelity criterion requires less data and can still present enough information, since human eyes are relatively tolerant when judging image quality. With the development of fast algorithms, more and more applications of image compression are available. For example, almost all images on the internet are compressed images such as JPEG or GIF images.

Other image analysis operations such as texture definition, image segmentation and registration are also widely used in separating text or different patterns, evaluating image features, and pattern recognition.

1.2 Image Restoration

1.2.1 Image Formation

Any image acquired by optical or electronic means is likely to be degraded by the sensing environment. Sensor noise, blur due to camera misfocus, relative object-camera motion or random atmospheric turbulence all contribute to the degradation. A degraded image is often a blurry image. The output of an image restoration system is called a restored image. The effectiveness of image restoration depends on the extent and the accuracy of the knowledge of the degradation process as well as on the filter design criterion [1].

As in image enhancement, the ultimate goal of image restoration is to improve an image in some sense. The major difference between enhancement and restoration lies in the judgment of the result. Image enhancement is largely a subjective process in that it manipulates the luminance or colors of the image to make certain features available to human eyes.
On the other hand, image restoration recovers an image using a priori knowledge about the degradation, and the result is evaluated using objective criteria like mean square error. Image restoration problems can be quantified precisely, whereas enhancement criteria are difficult to represent mathematically [1].

Figure 1.3: Image formation system

Since image restoration is the inverse procedure of image formation, it is necessary to understand the process of forming an image in order to better understand the process of restoring one. An image formation system is where the object is mapped onto the image plane and an image is formed. Either the object is illuminated by a source of radiant energy or it is itself a source of radiant energy. No matter what kind of imaging system is used, digital or optical, the image g(x,y) on the image plane (Fig. 1.3) is a projection of the object f(ξ,η). It should be noted that images are representations of objects that are indirectly sensed, and that various forms of radiant energy transport are the mechanisms by which the sensing is carried out [2]. There are thus a few general principles of image formation:

- Neighborhood processes. The point (x_0, y_0) on the image plane receives contributions not only from (ξ_0, η_0), but also from all other points in the object plane. That is, all points on the object plane may have an effect on any point on the image plane. It is also reasonable to expect that as the distance from the object point (ξ_0, η_0) to other points in the object plane increases, the effect on the point (x_0, y_0) decreases. In other words, the image formation process is a neighborhood process [2].

- Nonnegativity. An image is formed by the transport of radiant energy. Radiant energy is nonnegative; the smallest possible amount of radiant energy is zero. That is,
$$f(\xi,\eta) \geq 0, \qquad g(x,y) \geq 0$$
for any ξ, η, x and y.

- Superposition. Radiant energy is superposable. Consider two points in the object plane, (ξ_0, η_0) and (ξ_1, η_1).
  They both emit radiant energy and form images at (x_0, y_0) and (x_1, y_1), respectively, on the image plane. The result on the image plane is the superposition of the energy distributions corresponding to all points in the object plane.

Accordingly, if we consider a single point source in the object f(ξ,η), the function that describes the transformation of energy from the object plane to the image plane is
$$g(x,y) = h(x,y,\xi,\eta;f(\xi,\eta)).$$
If the image formation system is linear,
$$h(x,y,\xi,\eta;f(\xi,\eta)) = h(x,y,\xi,\eta)\,f(\xi,\eta).$$
By superposition, the image g(x,y) on the image plane is formed from contributions of all object points in the object plane:
$$g(x,y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(x,y,\xi,\eta;f(\xi,\eta))\,d\xi\,d\eta.$$
For the linear case,
$$g(x,y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(x,y,\xi,\eta)\,f(\xi,\eta)\,d\xi\,d\eta,$$
where the function h(x,y,ξ,η) is the point-spread function (PSF).

Generally, a point-spread function varies with position in both the image and object planes, since it is a function of (x,y) and (ξ,η). Nevertheless, some imaging systems act uniformly across the image and object planes. Such an image formation system is said to have a shift-invariant point-spread function. The function h(·) is then a function only of the difference of the variables in the two coordinate systems [2]:
$$g(x,y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(x-\xi,\,y-\eta;f(\xi,\eta))\,d\xi\,d\eta.$$
For the linear case,
$$g(x,y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(x-\xi,\,y-\eta)\,f(\xi,\eta)\,d\xi\,d\eta = h(x,y) * f(x,y), \tag{1.1}$$
where * represents linear convolution.

For a digital imaging system, the image is sampled and quantized during the process of energy transformation, so the acquired image is a 2-D matrix with positive values. A shift-invariant, linear digital imaging process can be described by
$$g(m,n) = \sum_{k,l} h(m-k,\,n-l)\,f(k,l). \tag{1.2}$$

1.2.2 Ill-Conditioned Nature of Restoration

In (1.2), the output g of a linear shift-invariant system is the convolution of the point-spread function and the original image f.
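The discrete model (1.2) can be implemented directly by letting each object point spread a scaled copy of the PSF over its neighborhood. A small NumPy sketch (illustrative, not the dissertation's code; the 3x3 uniform kernel is a hypothetical misfocus-like blur):

```python
import numpy as np

def blur(f, h):
    """Shift-invariant degradation g(m,n) = sum_{k,l} h(m-k, n-l) f(k,l).

    f: original image, h: point-spread function (small kernel).
    The output is the full linear convolution, size (M+P-1) x (N+Q-1).
    """
    M, N = f.shape
    P, Q = h.shape
    g = np.zeros((M + P - 1, N + Q - 1))
    for k in range(M):
        for l in range(N):
            # each object point f(k,l) spreads a PSF-shaped neighborhood
            g[k:k + P, l:l + Q] += h * f[k, l]
    return g

# A point source is smeared into a copy of the PSF:
point = np.zeros((5, 5))
point[2, 2] = 1.0
psf = np.ones((3, 3)) / 9.0       # 3x3 uniform (misfocus-like) blur
g = blur(point, psf)
```

Because the kernel sums to one, the blur redistributes energy without creating or destroying it, consistent with the nonnegativity and superposition principles above.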
In the presence of additive noise, the linear degradation model becomes
$$g(x,y) = h(x,y) * f(x,y) + n(x,y). \tag{1.3}$$
The values of the noise term n(x,y) are random, and its statistics are assumed to be independent of position. We can express (1.3) in the frequency domain as
$$G(u,v) = H(u,v)\,F(u,v) + N(u,v). \tag{1.4}$$
The convolution in (1.3) becomes multiplication in (1.4) according to the convolution property of the Fourier transform. Equations (1.3) and (1.4) are the spatial-domain and frequency-domain representations of a linear shift-invariant noisy imaging system. The advantage of the frequency-domain representation is that it permits the use of an FFT algorithm. Although real imaging systems may not be linear and shift-invariant, most of them can be approximated by such processes in order to make use of the extensive tools of linear system theory. In fact, nonlinear and shift-variant models introduce difficulties that often have no known solution or are very difficult to solve computationally, even though they are more general and accurate.

In most image restoration systems, the noise is assumed to be independent of the spatial coordinates and uncorrelated with the image itself. There are exceptions, such as X-ray and nuclear-medicine imaging. In this dissertation, we restrict our discussion to the case of independent and uncorrelated noise.

A great variety of methods have been proposed for image restoration problems during the last 30 years. But without common benchmarks or points of reference, it is impossible to pick the optimal method. Even where a few benchmarks have been established, it is still difficult to provide a unified view of image restoration. This is partly due to the ill-conditioned nature of image restoration problems.

In the imaging model above, the image g is a transformation of the original function f.
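The convolution property behind (1.4), namely that circular (circulant-system) convolution in space equals pointwise multiplication of 2-D DFTs, can be checked numerically. A NumPy sketch on a small 8x8 example (illustrative values, not from the dissertation; periodic boundaries assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.random((8, 8))                   # stand-in "original image"
h = np.zeros((8, 8))
h[:3, :3] = 1.0 / 9.0                    # PSF zero-padded to image size

# Direct circular (circulant-system) convolution, O(N^4):
g_direct = np.zeros((8, 8))
for m in range(8):
    for n in range(8):
        for k in range(8):
            for l in range(8):
                g_direct[m, n] += h[(m - k) % 8, (n - l) % 8] * f[k, l]

# Frequency domain: G = H F pointwise, i.e. eq. (1.4) with no noise.
g_fft = np.real(np.fft.ifft2(np.fft.fft2(h) * np.fft.fft2(f)))

print(np.max(np.abs(g_direct - g_fft)))  # agreement to machine precision
```

The FFT route costs O(N^2 log N) instead of O(N^4), which is why circulant or near-circulant structure is so valuable in this dissertation.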
Suppose there is no noise in this system; then
$$g(x,y) = h(x,y) * f(x,y) = H[f(x,y)]. \tag{1.5}$$
Given the blurred image g and knowledge of the transformation H, the problem appears to be to find the inverse transformation H^{-1} such that
$$H^{-1}[g(x,y)] = f(x,y). \tag{1.6}$$
The existence and uniqueness of the inverse transform H^{-1} are therefore important in finding f(x,y). If H^{-1} does not exist, there is no mathematical basis for asserting that f(x,y) can be exactly recovered from g(x,y); the problem is then called singular. Finally, even if there is a unique H^{-1}, it might be ill-conditioned. That is, a trivial perturbation in g(x,y) can produce nontrivial perturbations in f(x,y):
$$H^{-1}[g(x,y) + \delta] = f(x,y) + \epsilon,$$
where δ ≪ ε and ε is not arbitrarily small and is not negligible. In other words, an ill-conditioned problem is one in which inherent data perturbations can produce undesirable effects in the solution by inverse transformation [2].

In actual calculation, small perturbations like computer round-off errors can produce undesirable effects in the solution by inverse transformation. Besides the ill-conditioned nature of the image restoration problem, the existence of noise makes it impossible to find the exact solution f. In conclusion, image restoration is an ill-conditioned problem at best and a singular problem at worst [2].

1.2.3 Image Restoration Filters

Because of the ill-conditioned nature of image restoration problems, the seemingly feasible inverse filter gives an unstable solution. Let G(u,v) be the frequency-domain representation of the blurred image, H(u,v) the point-spread function and F(u,v) the original image. By (1.4),
$$G(u,v) = H(u,v)\,F(u,v) + N(u,v).$$
The inverse filter produces the estimate
$$\hat{F}(u,v) = \frac{G(u,v)}{H(u,v)} = F(u,v) + \frac{N(u,v)}{H(u,v)}. \tag{1.7}$$
Notice that the second term N(u,v)/H(u,v) is actually unknown, since the noise is a random function. Furthermore, if the degradation H(u,v) has zero or very small values, the ratio N(u,v)/H(u,v) can easily dominate the estimate.
In fact, this is frequently the case. Figure 1.4 shows a restoration using the inverse filter.

To handle noise and avoid the zero or small-value problem, Helstrom [6] proposed the application of the Wiener filter to image processing in 1967. The Wiener filter is founded on treating the image and the noise as mutually uncorrelated random processes, and the objective is to find an estimate ^f of the uncorrupted image f such that the mean square error between them is minimized [3]:

^f = argmin_{^f} E{e^t e} = argmin_{^f} E{(f − ^f)^t(f − ^f)}   (1.8)

where e = f − ^f. Let f and g be the column-ordered original and degraded images, and let H be the matrix corresponding to the blur h. Then the blurred image is g = Hf + n and the estimate is ^f = Lg, where L is the matrix solution that restores f from g. The mean square error is

E{e^t e} = E{tr[e e^t]}
         = E{tr[(f − Lg)(f − Lg)^t]}
         = E{tr[(f − L(Hf + n))(f − L(Hf + n))^t]}
         = E{tr[f f^t − L(H f f^t + n f^t) − (f f^t H^t + f n^t)L^t + L(H f f^t H^t + H f n^t + n f^t H^t + n n^t)L^t]}   (1.9)

Since the trace is a linear operation, it interchanges with the expectation operator E, and with E{f n^t} = E{n f^t} = 0,

E{e^t e} = tr[R_f − 2LHR_f + L(H R_f H^t + R_n)L^t],   (1.10)

where R_f = E{f f^t} and R_n = E{n n^t}. Taking the derivative of (1.10) with respect to L and setting it to 0, we have

L = R_f H^t (H R_f H^t + R_n)^{-1}.   (1.11)

Thus the estimate is

^f = Lg = R_f H^t (H R_f H^t + R_n)^{-1} g   (1.12)

In the frequency domain, the solution can be expressed as

^F(ω1,ω2) = H*(ω1,ω2) S_f G(ω1,ω2) / (|H(ω1,ω2)|² S_f + S_n),   (1.13)

where S_f and S_n are the power spectral densities of the original image f and the noise n. Figure 1.5 shows the Lena image restored with a Wiener filter. Because the Wiener filter minimizes the mean square error between the estimate and the original, it is also called the minimum mean square error filter. Notice that when there is no noise in (1.13), i.e., S_n = 0, it reduces to the inverse filter.
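A minimal frequency-domain implementation of (1.13) can be sketched as follows. Treating S_f/S_n as a single constant `snr` is our simplifying assumption for illustration; the true power spectra are rarely known in practice.

```python
import numpy as np

def wiener_restore(g, h, snr):
    """Frequency-domain Wiener filter (1.13), assuming circular convolution.

    `snr` stands in for S_f/S_n and is taken constant across frequency --
    a common simplification when the true power spectra are unknown."""
    H = np.fft.fft2(h, s=g.shape)
    F_hat = np.conj(H) * np.fft.fft2(g) / (np.abs(H) ** 2 + 1.0 / snr)
    return np.real(np.fft.ifft2(F_hat))
```

As S_n/S_f → 0 the filter approaches the inverse filter, consistent with the remark above.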
The Wiener estimate has desirable properties, since it controls ill-conditioning in a fashion that is explicitly determined by the signal-to-noise ratio as a function of spatial frequency through the term S_n/S_f. Although there has been much success in applying the Wiener filter to images from a variety of real-world problems, the results of Wiener estimation do not appear as good visually as those produced by other criteria in low signal-to-noise ratio (SNR) cases [2]. The following factors contribute to this shortcoming:

- The Wiener filter is based on linear assumptions, and there are nonlinearities in image recording and in the human visual system.
- Mean squared error is not the criterion that the human visual system naturally employs. Wiener restoration in low-SNR cases appears too smooth; the human eye is often willing to accept more visual noise in exchange for the additional image structure that is lost in the process [2].
- The Wiener estimate assumes stationary random process models, which are probably insufficient to adequately describe the meaningful structure in images of interest, and it uses only the covariance information of the stationary model [2].

In general, the Wiener filter is optimal only in an average sense, since it minimizes a statistical criterion; it may not be the optimal solution for a specific image. At the same time, it presents another difficulty in addition to the problem of having to know something about the degradation H: the power spectra of the original image and the noise must be known.

On the other hand, the inverse filter can be derived from another approach. The objective function W(f) = ||g − Hf||² aims at minimizing the difference between the noisy blurred image and the noiseless degraded image. W(f) reaches its minimum value of 0 when g = Hf, i.e., when there is no noise. Taking the derivative of W(f) with respect to f and setting it to 0, the minimizer of W(f) is

^f = H^{-1} g,

which is exactly the inverse filter.
As mentioned earlier, the inverse filter does not deal with the noise because it assumes there is none. In order for the filter to do more than simple inversion, a constraint is introduced so that there is more control over the restoration process. First, the cost function is

W(f) = ||g − H^f||² ≤ ε²  subject to  ||L^f||² ≤ E²,

where ε and E are fixed values that serve as the control variables of the filter. L is a high-pass filter, so that it imposes a smoothness condition on the restored image. After the Lagrange multiplier α is applied, the overall objective function becomes

ℓ(^f) = ||g − H^f||² + α||L^f||²,

where α = (ε/E)². By minimizing this objective function, we minimize the data error but bias the solution away from a "rough" restored image. This is called the constrained least-squares filter. It is in the class of Wiener filters but has much wider application than the general Wiener filter because of the control variable α.

1.2.4 Regularized Restoration

As described in the previous discussion, noisy, blurred images are often represented by the following algebraic model:

g = Hf + n,

where the linear degradation H is known and n is random noise uncorrelated with f. Here f and g are column-ordered versions of the original and blurred images. Image restoration is an ill-conditioned problem and sometimes even a singular one. Therefore, the exact original image f cannot be computed from the blurred image g. Instead, restoration algorithms focus on achieving an estimate ^f that is as close as possible to the original image [4]. Among restoration algorithms, the constrained least-squares filter not only avoids rough results but also gives us more control over the restoration procedure. The objective function for the constrained least-squares filter is as follows:

ℓ(^f) = ||g − H^f||² + α||L^f||²   (1.14)

To find the minimizer ^f of (1.14), we take the derivative with respect to f and set it to 0.
∂ℓ/∂f = −2H^t g + 2H^t H f + 2αL^t L f = 0;

thus

^f = (H^t H + αL^t L)^{-1} H^t g   (1.15)

or, in the frequency domain,

^F = H*G / (|H|² + α|L|²).   (1.16)

There are a few interesting facts about regularized restoration.

- When αL^t L = R_n R_f^{-1}, where R_n and R_f are the autocorrelation matrices of the noise and the original image, the regularization filter is actually a Wiener filter (see equation 1.13).
- To gain more control over the restoration and avoid ringing effects, "norms in a weighted Hilbert space" can be used to adapt the restoration process to local properties of the image [7]. In other words,

  ^f = (H^t R H + αL^t S L)^{-1} H^t R g   (1.17)

  where R and S are diagonal matrices containing weight coefficients with prior information about the original image.

Figure 1.4: Inverse filter. (a) Original image (b) Noisy blurred image (c) Restored image using the inverse filter

Figure 1.5: Wiener filter

Chapter 2

Huber-Markov Edge-Preserving Regularization

2.1 Edge-Preserving Regularization

As discussed in previous chapters, regularization yields a numerically well-behaved estimate of the original image. However, many restored images are liable to exhibit ringing (over- and undershoots, super-blacks and whites) [7]. Ringing occurs at the boundaries of the image and in the vicinity of steep intensity transitions such as edges. In linear shift-invariant restoration filters, ringing is generally caused by the poor match between the assumed stationary image model and the actual image data, because shift-invariant regularization penalizes high frequencies at the same level across the image. Figure 2.1 shows the ringing effect in the cameraman image. The ringing near the boundaries of the image results from the intensity jumps at the boundaries, which introduce leakage frequencies. When an FFT is used for fast computation, it assumes that signals are periodic.
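The shift-invariant regularized solution (1.16) itself carries this periodicity assumption when implemented with FFTs. A minimal sketch follows; the discrete Laplacian is our illustrative choice for the smoothness operator L.

```python
import numpy as np

def cls_restore(g, h, alpha):
    """Constrained least-squares (Tikhonov) restoration (1.16) via FFTs,
    assuming circular boundaries. L is taken here to be the discrete
    Laplacian, one common choice of high-pass smoothness operator."""
    lap = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)
    H = np.fft.fft2(h, s=g.shape)
    L = np.fft.fft2(lap, s=g.shape)
    F_hat = np.conj(H) * np.fft.fft2(g) / (np.abs(H) ** 2 + alpha * np.abs(L) ** 2)
    return np.real(np.fft.ifft2(F_hat))
```

With α = 0 this reduces to the inverse filter; increasing α trades data fidelity for smoothness.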
Because of the sudden change at the boundaries, the image spectrum is broadened and the leakage frequencies are amplified, thus introducing the ringing phenomenon. Several methods have been proposed to reduce this kind of ringing artifact [8, 9].

On the other hand, ringing near edges depends on the local structures within the image. To reduce this kind of ringing artifact, different penalties should be applied. In (1.14), the minimization process penalizes the square of local pixel differences, which generates an estimate ^f that is either excessively noisy or overly blurred. This is because the squared difference of pixel values is so large at edges that the penalties there are too high.

Figure 2.1: Ringing effect. (a) blurred cameraman image (b) restored image with ringing effects

One way to avoid this is to reduce the penalties at edge locations. In shift-invariant regularization, the cost function (1.14) is quadratic everywhere in the image. Normally the squared pixel difference, i.e., the value of ||Lf||² in (1.14), is relatively high in the vicinity of edges but much smaller in smooth areas. In order to place a smaller penalty on edges but a normal amount of penalty elsewhere, a non-quadratic cost function is often proposed in edge-preserving regularization algorithms.

In (1.14), the first term ||g − H^f||² is the data-matching term, which minimizes the difference between the noisy image and the noiseless blurred image. The second term ||Lf||² is the constraint that minimizes the roughness of the solution; it depends largely on the local properties of the image. To apply different constraints or penalties, we can use a more general cost function instead of (1.14):

^ℓ(^f) = ||g − H^f||² + ρ(L^f)   (2.1)

In (1.14), ρ(L^f) = ||L^f||², which is too high a penalty near edges. A smaller penalty can be applied at edge locations if ρ(x) takes smaller values for large x.
That is, the function is no longer quadratic outside a certain region but may be, for example, a small constant or a linear function. In Figure 2.2(a) and (b), ρ(x) is flat or a straight line for large values of x. Many different shapes are possible for the penalty, as long as they take smaller values than the quadratic function [10]. Notice that when x is small enough, which is the case in smooth regions of an image, the quadratic penalty is appropriate. Most edge-preserving regularization algorithms exploit the shift-variant property of the penalty function, although different kinds of non-quadratic or half-quadratic cost functions may be used.

2.2 Restoration Models

Shift-invariant image restoration algorithms often introduce ringing effects near sharp intensity transitions. To reduce these effects, one can make use of deterministic a priori knowledge about the original image, or locally regulate the severity of the noise magnification and the ringing phenomenon depending on the edge information in the image [7]. Many papers have also described edge-preserving methods for image reconstruction and image restoration based on the Bayesian formalism with non-Gaussian priors [11, 12, 13]. Most of these methods incorporate the edge information as a constraint in the optimization problem.

Figure 2.2: Non-quadratic ρ(·). (a) Flat ρ(·) (b) Linear ρ(·)

Typical regularization approaches, including the constrained least-squares (CLS) and Tikhonov-Miller formulations, utilize quadratic functionals and compensate for the ill-posedness of the restoration problem by incorporating prior information such as smoothness into the restoration process [14, 15]. Quadratic functions are attractive because they enable the analytic derivation of the corresponding estimators and provide cost-efficient implementation. Robust estimation, however, induces nonquadratic objective functions that can be utilized in the representation of the prior signal statistics [16].
The existence of sharp edges manifests uncertainty regarding the distribution of the signal [17, 18]. To capture the detailed structure around sharp edges, most robust functions employed in these formulations involve limits and/or structural parameters that must be selected appropriately. Such functions have been motivated in stochastic maximum a posteriori (MAP) formulations under Markov random fields with nonquadratic Gibbs distributions [19, 20].

A MAP technique maximizes the conditional probability of a restored image given a certain blurred image. For MAP estimation, the image estimate ^x is given by

^x = argmax_x L(x|y)

where L(x|y) is the log-likelihood function. This function can be computed by using Bayes' formula [21]:

L(x|y) = log Pr(x|y) = log Pr(y|x) + log Pr(x) − log Pr(y).   (2.2)

Since the term log Pr(y) is independent of x, it can be eliminated from the optimization. The MAP estimate of the image then becomes

^x = argmax_x {log Pr(y|x) + log Pr(x)} = argmin_x {−log Pr(y|x) − log Pr(x)}   (2.3)

For a noisy system (y = Ax + n, where n ≠ 0), the conditional density Pr(y|x) is related to the noise density by

Pr(y|x) = Pr(n)|_{n = y − Ax}

Thus, for cases with i.i.d. Gaussian noise, the conditional density has the Gaussian form

Pr(y|x) = (1 / (2πσ²)^{MN/2}) exp(−||y − Ax||² / (2σ²))

where σ² is the variance of the density, M and N denote the dimensions of the image, and ||·|| is the Euclidean norm.

2.3 Huber-Markov Random Field Model

For the second term Pr(x) in (2.3), a Gaussian Markov random field (GMRF) model is commonly chosen as the prior image model. This density has the form

Pr(x) = (1 / ((2π)^{MN/2} |C|^{1/2})) exp(−(1/2) x^t C^{-1} x)

where C is the covariance matrix. By decomposing C^{-1} into a sum of products, the above can be written as

Pr(x) = (1 / ((2π)^{MN/2} |C|^{1/2})) exp(−(1/2) Σ_{c∈C} x^t d_c d_c^t x)

where d_c is a coefficient vector for clique c and is determined by a priori assumptions about the image.
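Up to additive constants, minimizing the MAP criterion (2.3) with a Gaussian noise model and a quadratic (GMRF-style) prior reduces to a penalized least-squares objective. The sketch below is ours: the function name, the stacked clique operator D, and the scalar prior weight `lam` are illustrative assumptions, not notation from the text.

```python
import numpy as np

def neg_log_posterior(x, y, A, D, sigma2, lam):
    """-log Pr(y|x) - log Pr(x), up to additive constants, for i.i.d.
    Gaussian noise and a quadratic prior.  A: blur matrix; D: matrix whose
    rows are the clique coefficient vectors d_c (an assumed arrangement)."""
    data_term = np.sum((y - A @ x) ** 2) / (2.0 * sigma2)   # -log Pr(y|x)
    prior_term = 0.5 * lam * np.sum((D @ x) ** 2)           # -log Pr(x)
    return data_term + prior_term
```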
While the GMRF prior has many analytical advantages, it generally results in estimates ^x that are either excessively noisy or overly blurred. This is because the squared difference of pixel values applies too high a penalty to the edges that often occur in images [17]. Besides Gaussian MRF models, non-Gaussian MRFs are also of interest because they can potentially model both the edges and the smooth regions of images. Early approaches often used an additional unobserved random field, called a line process, that determines the location of edges [22, 23]. Many non-Gaussian MRF methods have focused on MRFs with a simpler Gibbs distribution of the general form

log g(x) = −Σ_{{s,r}∈C} b_{sr} ρ(λ|x_s − x_r|) + constant

where λ is a scaling parameter and ρ is a monotone increasing but nonconvex function [24, 25, 20, 26]. However, analysis of non-Gaussian MRFs shows unstable behavior of the MAP estimate under a nonconvex prior [24]. Not only does the solution often depend substantially on the method used to perform the minimization, it often puts the same penalty on all edges (see Fig. 2.2(a)) [17]. Consequently, the MAP estimate may change abruptly as the magnitude of an edge increases, which may lead to an unnatural quality in the reconstruction.

Recently, convex functions have also been considered for ρ(·) [19, 20, 27]. Green [20] employed a function of the form

ρ(Δ) = 2T² log cosh(Δ/T),

which produces useful Bayesian estimates of emission tomograms. This potential function is approximately quadratic for small Δ and linear for large values. Lange [26] derived several other strictly convex potential functions in a study of convergence of the expectation-maximization algorithm. Stevenson and Delp studied the use of an alternative convex energy function for the problem of surface reconstruction [19]. They chose the Huber function, first introduced in robust statistics [28]. An MRF image model with a Gibbs density function has the form [21]
Pr(x) = (1/Z) exp(−(1/λ) Σ_{c∈C} V_c(x))   (2.4)

where Z is a normalizing constant, λ is the "temperature" parameter of the density, V_c is some function of a local group of points c called a clique, and C denotes the set of all cliques throughout the image [21, 22].

Let us denote the criterion to be minimized (the negative log-likelihood) by M_λ[x,T]. If the original image is corrupted with i.i.d. Gaussian noise, the functional for the noisy image is

M_λ[x,T] = ||y − Ax||² / (2σ²) + (1/λ) Σ_{c∈C} V_c(x)

The selection of Σ_{c∈C} V_c(x) is the key factor in the quality of the estimate. To keep the problem well-posed, we consider only convex functions. A convex surface also has an attainable global minimum, which makes the minimization easier. Otherwise, local minima will be present, and the function must be minimized by a computationally expensive technique such as simulated annealing [21, 22]. The image prior in (2.4) can be rewritten as

Pr(x) = (1/Z) exp(−(1/λ) Σ_{c∈C} ρ(d_c^t x))

where ρ(·) is some function satisfying the following properties:

- Convexity: for all α, x, y ∈ R, ρ[αx + (1−α)y] ≤ αρ(x) + (1−α)ρ(y);
- Symmetry: for all x ∈ R, ρ(x) = ρ(−x);
- Allowance for regions of discontinuity: for |x| large, ρ(x) < x².

The convexity of ρ(·) ensures that the sum of clique functions in the exponential argument remains convex. The smoothness measure d_c^t x has a small magnitude in smooth regions and a large magnitude in discontinuous regions. Edges in an image are thus penalized to a lesser extent by a cost function ρ(d_c^t x) that increases less rapidly than the quadratic cost (d_c^t x)² [21, 29].

Following the discontinuity-preserving stabilizing functionals in Tikhonov regularization, Stevenson and Delp used the Huber-Markov random field, where ρ(·) is defined as

ρ_T(x) = x²,                |x| ≤ T
       = T² + 2T(|x| − T),  |x| > T   (2.5)

Figure 2.3: Huber function

The Huber function is convex and nonquadratic [28] (see Figure 2.3).
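The piecewise definition (2.5) is straightforward to implement; the sketch below also makes it easy to verify that the two pieces join continuously at |x| = T.

```python
import numpy as np

def huber(x, T):
    """Huber function (2.5): quadratic below the threshold T, linear above."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= T,
                    x ** 2,                          # non-edge region
                    T ** 2 + 2 * T * (np.abs(x) - T))  # edge region
```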
The methodology here can be applied to any penalty function that is quadratic in a non-edge region around the origin; however, we restrict the development to the Huber function to keep the discussion specific. From the definition above, we can see that below the threshold in ρ_T(d_c^t x), the quadratic penalty applies. If the value is greater than the threshold, a linearly varying cost is used, and the discontinuities of the original image are not penalized as severely as with a quadratic. This characteristic provides an edge-preserving regularization effect.

We consider an approximately rotationally symmetric operator within a 3 × 3 grid. The exponential kernel of the Huber-Markov random field (HMRF) image model is then

ρ[x,T] = Σ_{c∈C} V_c(x) = Σ_k Σ_l Σ_{m=0}^{3} ρ_T(d_{k,l,m}^t x)

Finite-difference approximations to second-order derivatives are used to define the image roughness measure at pixel x_{k,l}:

d_{k,l,0}^t x = x_{k,l+1} − 2x_{k,l} + x_{k,l−1}
d_{k,l,1}^t x = (1/2)(x_{k−1,l+1} − 2x_{k,l} + x_{k+1,l−1})
d_{k,l,2}^t x = x_{k−1,l} − 2x_{k,l} + x_{k+1,l}
d_{k,l,3}^t x = (1/2)(x_{k−1,l−1} − 2x_{k,l} + x_{k+1,l+1})

These four discrete directional derivatives approximate a rotationally invariant operator, so that approximately the same image is produced by the restoration regardless of image orientation [21].

2.4 Majorization

Using MAP estimation and the Huber-Markov random field image model, we have the functional to minimize:

M_λ[x,T] = ρ[x,T] + (λ/2σ²)||y − Ax||²
         = Σ_{c∈C} V_c(x) + (λ/2σ²)||y − Ax||²
         = Σ_k Σ_l Σ_{m=0}^{3} ρ_T(d_{k,l,m}^t x) + (λ/2σ²)||y − Ax||²   (2.6)

where ρ_T(d_{k,l,m}^t x) = (d_{k,l,m}^t x)² for non-edge pixels (|d_{k,l,m}^t x| ≤ T) and ρ_T(d_{k,l,m}^t x) = T² + 2T(|d_{k,l,m}^t x| − T) for edges (|d_{k,l,m}^t x| > T). As previously mentioned, the HMRF is a convex but non-Gaussian MRF; i.e., it is nonquadratic and convex. The HMRF thus poses a nonquadratic cost function that is not as easy to minimize as one based on the GMRF.
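The four second-difference measures d_{k,l,m}^t x above can be evaluated over a whole image with array shifts. The sketch below assumes circular boundaries for simplicity, which is not a requirement of the model.

```python
import numpy as np

def directional_second_diffs(x):
    """The four 3x3 second-difference roughness measures d_{k,l,m}^t x,
    m = 0..3, computed with circular (wrap-around) boundaries."""
    d0 = np.roll(x, -1, 1) - 2 * x + np.roll(x, 1, 1)        # horizontal
    d1 = 0.5 * (np.roll(x, (1, -1), (0, 1)) - 2 * x
                + np.roll(x, (-1, 1), (0, 1)))                # diagonal
    d2 = np.roll(x, 1, 0) - 2 * x + np.roll(x, -1, 0)        # vertical
    d3 = 0.5 * (np.roll(x, (1, 1), (0, 1)) - 2 * x
                + np.roll(x, (-1, -1), (0, 1)))               # anti-diagonal
    return d0, d1, d2, d3
```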
To simplify the minimization process and guarantee convergence, one can use a quadratic surrogate function with the same minimizer ^x that is easy to minimize; we can then minimize (2.6) by minimizing the surrogate function. The use of surrogate functions is common in minimization problems. Many variations have been proposed, among them the "majorization" approach described by Stoica and Selén [30]. A function g(θ) is said to majorize the function f(θ) at θ_i if

f(θ_i) = g(θ_i)
f(θ) ≤ g(θ) for all θ

and θ_i is called the majorizing point. To minimize (2.6), we define a series of functionals N^i_λ[x,T] such that

N^i_λ[x_i,T] = M_λ[x_i,T]
N^i_λ[x,T] ≥ M_λ[x,T] for all x   (2.7)

Assuming that the minimization of each N^i_λ[x,T] with respect to x is easier than the minimization of M_λ[x,T], we can minimize N^i_λ[x,T] directly and converge quickly to the solution of M_λ[x,T]. This iterative algorithm is called majorization [30]. Let x_i denote the minimizer of N^i_λ(x,T). This point is then used as the majorizing point of the next functional N^{i+1}_λ(x,T), whose minimizer is x_{i+1}. The procedure is repeated until the minimum of N^j_λ(x,T) at some j is close enough to argmin_x M_λ[x,T]. This technique satisfies the descent property

M_λ(x_{i+1},T) ≤ M_λ(x_i,T),

since N^{i+1}_λ(x_{i+1},T) ≤ N^{i+1}_λ(x_i,T). Thus

M_λ(x_{i+1},T) ≤ N^{i+1}_λ(x_{i+1},T) ≤ N^{i+1}_λ(x_i,T) = M_λ(x_i,T).

Equality M_λ(x_{i+1},T) = M_λ(x_i,T) holds only when

N^{i+1}_λ(x_{i+1},T) = N^{i+1}_λ(x_i,T)   (2.8)

and

M_λ(x_{i+1},T) = N^{i+1}_λ(x_{i+1},T),   (2.9)

i.e., when x_i is already the minimum of M_λ[x,T] (2.8) and the majorizing function is M_λ[x,T] itself (2.9). In other words, as long as we have not reached the global minimum argmin_x M_λ[x,T], then M_λ(x_{i+1},T) < M_λ(x_i,T). Convergence of this algorithm is guaranteed.

To form the majorizing functional N^i_λ[x,T] in this application, we need a quadratic function that has the same minimum. One practical choice N^i_ρ(·) for ρ(·) is defined as follows:
N^i_ρ(x) = x²,                              |x_i| ≤ T
         = (T/|x_i|) x² + T|x_i| − T²,      |x_i| > T   (2.10)

where x_i is the majorizing point of the function N^i_ρ(·) at the i-th iteration. The majorizing function is shown in Figure 2.4.

Figure 2.4: Majorizing function

Note that N^i_ρ(x) is quadratic everywhere, with a constant smoothing penalty where |x_i| ≤ T (non-edge regions) and a penalty that varies with x_i elsewhere (edge regions). Let D_m denote the convolution operator that collects d_{k,l,m}^t x over all pixels (k,l). Since a constant term in the objective functional has no effect on the solution, T|x_i| − T² can be ignored, which leaves the modified N^i_λ[x,T] (denoted ~N):

~N^i_λ[x,T] = Σ_k Σ_l Σ_{m=0}^{3} N^i_ρ(d_{k,l,m}^t x) + (λ/2σ²)||y − Ax||²
            = Σ_{m=0}^{3} (D_m x)^t Λ_m D_m x + (λ/2σ²)||y − Ax||²   (2.11)

where the Λ_m are diagonal matrices whose elements are the quadratic scale factors in (2.10) (either 1 or T/|x_i|).

2.5 HMRF Edge-Preserving Regularization

Since the cost function (2.11) has exactly the same structure as the constrained minimization method known as Tikhonov regularization, the minimizer of ~N^i_λ[x,T] is given by [31]:

^x = (A^t A + α Σ_{i=0}^{3} D_i^t Λ_i D_i)^{-1} A^t y.   (2.12)

This system cannot be solved efficiently, even though A and the D_i represent shift-invariant operations, because of the presence of the Λ_i. Suppose instead that we have the matrix

B = (A^t A + α Σ_{i=0}^{3} D_i^t D_i)^{-1}.   (2.13)

B can be computed via FFTs because it is shift-invariant. If we can rewrite the inverse in (2.12) in terms of B, we can invert it efficiently. The Sherman-Morrison-Woodbury matrix inversion formula is

(X + WYZ)^{-1} = X^{-1} − X^{-1} W (Y^{-1} + Z X^{-1} W)^{-1} Z X^{-1}.   (2.14)

Applying this formula to (2.12),

(A^t A + α Σ_{i=0}^{3} D_i^t Λ_i D_i)^{-1}
  = (B^{-1} − α Σ_{i=0}^{3} D_i^t (I − Λ_i) D_i)^{-1}
  = (B^{-1} − α Σ_{i=0}^{3} d_i^t d_i)^{-1}
  = (B^{-1} − α d^t d)^{-1}
  = B + αBd^t [I − α dBd^t]^{-1} dB   (2.15)

In the above expressions, d is the matrix formed by stacking d_0, d_1, d_2, and d_3 row-wise, where d_0, d_1, d_2, and d_3 are matrices formed from the rows of D_0, D_1, D_2, and D_3 corresponding to the diagonal elements of Λ_0, Λ_1, Λ_2, and Λ_3 that are not equal to 1 (scaled so that d_i^t d_i = D_i^t (I − Λ_i) D_i).
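For concreteness, the Sherman-Morrison-Woodbury identity (2.14) can be verified numerically; the matrix sizes and scalings below are arbitrary, chosen only to keep the random matrices well conditioned.

```python
import numpy as np

# Numerical check of (2.14): (X + W Y Z)^{-1}
#   = X^{-1} - X^{-1} W (Y^{-1} + Z X^{-1} W)^{-1} Z X^{-1}
rng = np.random.default_rng(2)
nx, r = 8, 3
X = np.eye(nx) + 0.1 * rng.standard_normal((nx, nx))   # well-conditioned
W = 0.1 * rng.standard_normal((nx, r))
Y = np.eye(r) + 0.1 * rng.standard_normal((r, r))
Z = 0.1 * rng.standard_normal((r, nx))

lhs = np.linalg.inv(X + W @ Y @ Z)
Xi = np.linalg.inv(X)
rhs = Xi - Xi @ W @ np.linalg.inv(np.linalg.inv(Y) + Z @ Xi @ W) @ Z @ Xi
assert np.allclose(lhs, rhs)
```

The payoff in (2.15) is that the bracketed inverse has the dimension of the number of edge rows, not the number of pixels.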
From this, it follows that

^x = (B + αBd^t [I − α dBd^t]^{-1} dB) A^t y = ^x_α + Bd^t ω   (2.16)

where

^x_α = B A^t y   (2.17)

and

ω = α [I − α dBd^t]^{-1} d ^x_α.   (2.18)

Thus the estimate ^x is computed in two steps. The first step, expressed by (2.17), is a shift-invariant restoration. It generates a basic solution with all the artifacts associated with ringing. The basic solution is then corrected by a second term, obtained through a shift-variant computation involving only values corresponding to edges in the image. Generally, a shift-variant method is computationally expensive. But in most cases, if the two-dimensional image has N² pixels, the number of edge pixels is O(N). This makes the shift-variant component a much lower-dimensional problem, so the computation is reduced significantly [32].

By formulating the solution for the estimate ^x as (2.16), a fixed-point iteration is assumed. Notice that the choice of rows in the D_i depends on ^x; in other words, the matrix d is a function of ^x. The iteration process can be described as follows:

1. Given an initial ^x, use the HMRF as the prior image model and determine the Λ_i, and then d, using the Huber function.
2. Compute the next ^x using (2.16).
3. Plug the new ^x in to find the new d, and repeat from the second step.

By majorizing the original HMRF model, the minimization becomes a quadratic problem at each iteration. But it is nonquadratic overall, since the majorizing function varies with the result of each iteration. Similar ideas, such as the half-quadratic method, have been proposed [33, 34]. Half-quadratic regularization formulates the cost function in terms of an auxiliary variable. As in our method, when the auxiliary variable is fixed, the problem becomes a constrained least-squares problem with quadratic regularization, even though it is not quadratic overall [33]. In our method, the auxiliary variables are related to a map of edges in the image.
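The two-step solution (2.16)-(2.18) can be checked on a small dense 1-D problem, where both the direct edge-weighted inverse and the shift-invariant-plus-correction form can be computed explicitly. The toy blur, difference operator, edge set, and surrogate weight w below are all our own illustrative choices.

```python
import numpy as np

# Dense 1-D sketch of (2.16)-(2.18): the edge-weighted system equals a
# shift-invariant solve plus a low-rank correction over the edge rows only.
rng = np.random.default_rng(3)
n, alpha = 32, 0.1
A = np.eye(n) + 0.5 * np.roll(np.eye(n), 1, axis=1)    # toy blur matrix
D = np.roll(np.eye(n), 1, axis=1) - np.eye(n)          # first-difference operator
y = rng.standard_normal(n)

edges = np.array([5, 6, 20])          # rows of D treated as edge cliques
w = 0.3                                # surrogate scale T/|x_i| < 1 on edges
Lam = np.ones(n); Lam[edges] = w       # diagonal of the Huber scale matrix

# Direct shift-variant solution: (A^t A + alpha D^t Lam D)^{-1} A^t y
x_direct = np.linalg.solve(A.T @ A + alpha * D.T @ (Lam[:, None] * D), A.T @ y)

# Two-step solution: B is the shift-invariant inverse; d holds scaled edge rows
B = np.linalg.inv(A.T @ A + alpha * D.T @ D)
d = np.sqrt(1.0 - w) * D[edges]        # so that d^t d = D^t (I - Lam) D
x_alpha = B @ (A.T @ y)                # step 1: shift-invariant restoration (2.17)
omega = alpha * np.linalg.solve(np.eye(len(edges)) - alpha * d @ B @ d.T,
                                d @ x_alpha)            # step 2 weights (2.18)
x_two_step = x_alpha + B @ (d.T @ omega)                # correction (2.16)

assert np.allclose(x_direct, x_two_step)
```

In the full algorithm, B would be applied via FFTs rather than formed explicitly, and the edge set would be re-estimated from ^x at each fixed-point iteration.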
Different smoothing penalties are placed in the cost function according to this map. For each fixed edge map, the problem is quadratic; with varying edge maps, the overall criterion is not.

With majorization, we set up a new minimization criterion that converges to the solution of the original criterion. The new minimization technique using the Sherman-Morrison matrix inversion lemma is applied to improve the computational performance of the algorithm. We can then apply the two-step algorithm, comprising both shift-invariant and shift-variant components, to minimize the Huber criterion. In the next section, we compare our method to conjugate gradients, which is well established as an efficient minimization method. Since this algorithm is based on decomposing the blurred image into two components and applying shift-invariant and shift-variant restorations, we refer to it as decomposition-based edge-preserving image restoration (DEEPIR).

2.6 Experiment

In this section, the performance of DEEPIR is examined by simulation. Experiments were conducted using 128-by-128 images with different shapes and letters, as shown in Figure 2.5. The original image was blurred with a 5×5 averaging blur, a 9×1 motion blur, and a 7×7 Gaussian blur with standard deviation 1, plus 30 dB Gaussian noise. All computation was done on a workstation with an AMD Opteron 280 processor. The blurred images, the corresponding shift-invariant restorations, and the restorations with DEEPIR are shown in Figure 2.6. We observe that the new algorithm considerably reduces the ringing effects at edges. The mean square error (MSE) between the original and the restored image is less than 40% of that of shift-invariant restoration, as shown in Tables 2.1 and 2.2. This new algorithm not only preserves edges better but is also efficient.
Experiments on the above images show a reduced computation time compared to conjugate gradients (CG), as shown in Fig. 2.7.

Figure 2.5: Original images. (a) different shapes (b) letters

Fig. 2.7 shows how the criterion decreases versus time for CG and DEEPIR. Each circle on the DEEPIR curve denotes one iteration; each diamond on the CG curve denotes 40 iterations. Each iteration of DEEPIR takes longer than one of conjugate gradients, but the criterion decreases faster. This indicates that DEEPIR requires more computation per iteration but needs fewer iterations than CG to reach the same level of precision. For conjugate gradients, the curve is smooth, since each iteration takes much less time than a DEEPIR iteration. Table 2.3 gives specific computing times for each image and PSF.

Figure 2.6: Restored images. (a) original image blurred with 5×5 averaging blur and 40 dB Gaussian noise (b) shift-invariant restoration for averaging blur (c) restoration by DEEPIR for averaging blur (d) original image blurred with 9×1 motion blur and 40 dB Gaussian noise (e) shift-invariant restoration for motion blur (f) restoration by DEEPIR for motion blur (g) original image blurred with 7×7 Gaussian blur with standard deviation 1 and 40 dB Gaussian noise (h) shift-invariant restoration for Gaussian blur (i) restoration by DEEPIR for Gaussian blur.

Table 2.1: MSE of shape image

  Blur      MSE of shift-invariant restoration   MSE of DEEPIR
  Gaussian  55                                   6.26
  Motion    86                                   9.33
  Average   165.56                               58.87

Table 2.2: MSE of letter image

  Blur      MSE of shift-invariant restoration   MSE of DEEPIR
  Gaussian  83.66                                11.69
  Motion    103.77                               18.64
  Average   140.93                               55.89

Figure 2.8 is another example, showing the restoration of a picture of Samford Hall at Auburn University. The original 256×256 image was blurred with a 5×5 averaging blur and 40 dB Gaussian noise. The Huber threshold is T = 25, and α = 0.05.
The blurred image, the shift-invariant restoration, and the restoration with DEEPIR are shown in Figure 2.8(b), (c), and (d). Table 2.4 compares MSE and computing time among shift-invariant restoration, CG, and DEEPIR.

2.7 Block-Based Implementation

The computational complexity of DEEPIR grows with the cube of the number of edges, and its memory usage grows with the square of the number of edges. Consequently, the computation or memory requirements may be prohibitive for large images. In DEEPIR, only a system whose size corresponds to the number of edge pixels must be solved. The fewer the edge pixels, the faster the system solution.

Figure 2.7: Experiment on "shape" (criterion vs. computation time for CG and DEEPIR)

Therefore, we introduce a modification to DEEPIR that partitions the system into sub-systems. Instead of solving one big system, we solve a group of smaller systems. To do this, we divide the image into k² equal-size blocks, where k denotes the number of blocks along each coordinate; i.e., k = 1 represents the non-block method, k = 2 means we have 4 blocks (2 on each side of the image), k = 4 represents 16 blocks, and so on [35]. We then group the unknown edge values according to their block membership.

We apply a modified block Gauss-Seidel (BGS) approach to limit the growth in computational resources and speed up the algorithm. Further improvement in computing time is possible with a block approximation method proposed later in this section.
Generally, block iterative methods require more computation per iteration, but this is offset by a fast rate of convergence [36]. The most well-established block iterative methods include block Gauss-Seidel and block Jacobi.

Figure 2.8: Samford Hall. (a) original Samford Hall image (b) blurred image (c) shift-invariant restoration (d) DEEPIR restoration

Table 2.3: Computation Time

  Blur      Letters (s)          Different shapes (s)
            DEEPIR    CG         DEEPIR    CG
  Gaussian  12.17     24.5       4.81      8.89
  Motion    8.23      22.7       4.15      8.33
  Average   6.44      21.3       3.71      8.16

Table 2.4: Mean Square Error and Time

                   MSE     Time (s)
  Shift-invariant  791.7   8.9
  CG               308.1   301.3
  DEEPIR           308.1   196.2

Let Q = I − α dBd^t and b = α d ^x_α. To solve Qω = b for ω, the system Q is partitioned into k × k sub-systems such that each group of edge variables in ω corresponds to one partition of Q:

  Q = [ Q_11  Q_12  ...  Q_1k
        Q_21  Q_22  ...  Q_2k
        ...
        Q_k1  Q_k2  ...  Q_kk ]

It is then split as

  Q = D − (L + U),   (2.19)

where D is the block-diagonal part of Q (with blocks D_11, ..., D_kk), and L and U are its strictly block-lower-triangular and strictly block-upper-triangular parts. The Gauss-Seidel iterative procedure is given by

  D ω^{r+1} = L ω^{r+1} + U ω^r + b.   (2.20)

In block Gauss-Seidel, only the D_ii matrices need to be inverted at each iteration, since L is strictly block-lower triangular. With this partitioning, the D_ii matrices are equivalent to the systems that would arise if only the i-th image block contained edges. To make the algorithm efficient, a few relaxations of (2.20) are applied to obtain ω for a fixed Q. The matrix Q is then updated according to the edge locations in the current estimate of the image. This constitutes a single iteration of DEEPIR in this context. The cameraman image was blurred with a 7×7 circular averaging blur with radius 3 pixels.
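The block Gauss-Seidel sweep (2.19)-(2.20) can be sketched as follows. The test matrix here is a generic diagonally dominant system of our own making, not the DEEPIR Q itself; it simply demonstrates the block update rule.

```python
import numpy as np

def block_gauss_seidel(Q, b, block, iters=50):
    """Block Gauss-Seidel for Q w = b: sweep over equal-size diagonal blocks,
    solving each diagonal block with the latest values of the other blocks,
    per the splitting Q = D - (L + U) in (2.19)-(2.20)."""
    n = Q.shape[0]
    w = np.zeros(n)
    for _ in range(iters):
        for s in range(0, n, block):
            sl = slice(s, min(s + block, n))
            # residual with the diagonal block's own contribution removed
            r = b[sl] - Q[sl, :] @ w + Q[sl, sl] @ w[sl]
            w[sl] = np.linalg.solve(Q[sl, sl], r)   # invert D_ii only
    return w

rng = np.random.default_rng(4)
n = 16
Q = n * np.eye(n) + 0.1 * rng.standard_normal((n, n))   # diagonally dominant
b = rng.standard_normal(n)
w = block_gauss_seidel(Q, b, block=4)
assert np.allclose(w, np.linalg.solve(Q, b))
```

In DEEPIR, only a few such relaxations are run before Q is rebuilt from the updated edge map.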
Gaussian noise was added to a level of 40 dB SNR. We applied three relaxations of (2.20) for each iteration. The regularization parameter was chosen to be α = 0.05. Since the goal of this work is primarily to demonstrate the computational efficiency of the algorithm, no attempt was made to optimize the regularization parameter α. Table 2.5 shows a significant improvement in computation time. Theoretically, the larger the block size, the fewer the iterations needed to achieve convergence [36]. On the other hand, the larger the block size, the more computation is needed for each block and therefore the slower the convergence. As shown in Table 2.5, k = 2 is the most efficient choice for the cameraman image.

Table 2.5: Time and MSE for BGS
k   Time (s)   MSE
1   316.6      897
2   138.0      897
4   208.5      897
8   287.3      897

2.7.1 Block-based Approximation

The modified block Gauss-Seidel has been shown to improve the performance of DEEPIR. However, there is still more room for speed enhancement if a small sacrifice is made in image quality. We divide the edge image into several blocks and apply the edge-preserving regularization to each block independently. That is, we treat the whole image as having edges only in a particular block and apply DEEPIR to get a correction image that is edge-preserving only in that particular block; we repeat this procedure for every part of the image. Then we extract the solution in each block to assemble a full correction image. The correction image is added to the shift-invariant restoration to get an edge-preserving result. Due to the smaller number of edge pixels in each block, all the block systems can be solved independently much more efficiently than the full system at once. Thus we can reduce the computation to a larger extent in general and also deal with large images. The computation in inverting an M×M matrix is O(M³). Suppose the edge image is decomposed into k² blocks, where k is the number of blocks in each coordinate.
If we assume edges are evenly distributed among the blocks, the computation needed to invert the matrices associated with these blocks is k² · O((M/k²)³) = (1/k⁴) · O(M³). Theoretically, the bigger k is, the faster the inversion and, therefore, the more efficient DEEPIR is. Certain limitations on k are necessary to ensure acceptable quality, as discussed in the next section.

To avoid errors on the boundary of each small block, we use overlapping blocks. That is, we apply the edge-preserving regularization to a larger part of the image, then crop the central part and use it as the result. In Figure 2.9, edges inside the dashed square are used, while the shaded part is saved as the result. It should be noted that the approximation method is related to block Gauss-Seidel in that, in the non-overlapping case, it leaves out L and U in (2.19) and treats Q as its block diagonal D. By ignoring L and U, the blocks are considered uncoupled. This is motivated by the fact that edge values in different blocks are very loosely correlated. It does not give us the exact solution, but it gives an efficient one as long as the tradeoff in image quality is still tolerable.

Figure 2.9: Overlapping blocks

2.7.2 Experiment on Block-based Approximation

To test the approximation method, we applied the same blur and noise to the cameraman image as in block Gauss-Seidel. The image shown in Figure 2.10(c) was restored using a standard circular deconvolution algorithm with 2-D FFTs. Figures 2.10(d), (e), and (f) show images restored using the block-based algorithm with four 128×128, sixteen 64×64, and sixty-four 32×32 blocks, respectively. Table 2.6 shows that with more blocks in the image, the mean square error increases while the computation time decreases. But when we continue to reduce the size of each block, the gains in speed diminish. This happens because there is a fixed overhead for restoring each block regardless of size.
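The overlap-and-crop bookkeeping described above can be sketched as follows. This is an illustrative sketch; the helper name, the identity "solver" used in the demo, and the clipping of windows at image borders are our own assumptions.

```python
import numpy as np

def blockwise_with_overlap(img, k, margin, solve_block):
    """Process an n-by-n image in k*k blocks.  Each block is solved on a
    window enlarged by `margin` pixels on every side, and only the
    central part is kept, following the overlapping-block scheme."""
    n = img.shape[0]
    s = n // k
    out = np.zeros_like(img)
    for bi in range(k):
        for bj in range(k):
            r0, c0 = bi * s, bj * s
            # enlarged window, clipped at the image border
            R0, C0 = max(r0 - margin, 0), max(c0 - margin, 0)
            R1, C1 = min(r0 + s + margin, n), min(c0 + s + margin, n)
            patch = solve_block(img[R0:R1, C0:C1])
            # crop the central s-by-s part back out
            out[r0:r0+s, c0:c0+s] = patch[r0-R0:r0-R0+s, c0-C0:c0-C0+s]
    return out

# with an identity "solver", overlapped processing reproduces the image
x = np.arange(64, dtype=float).reshape(8, 8)
y = blockwise_with_overlap(x, k=2, margin=4, solve_block=lambda p: p)
```

In the actual method, `solve_block` would run DEEPIR on the enlarged window; the crop discards the boundary region where the block approximation is least accurate.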
Using bigger blocks produces better image quality but requires more computation, since a bigger edge matrix must be inverted.

Table 2.6: Time and MSE for Block-Based Approx.
k                     Time (s)   MSE
Conjugate Gradients   315.8      896
2                     102.4      937
4                     81.6       982
8                     70.9       990

2.7.3 Discussion of Block-based Approximation

In the above experiments, we used overlapping blocks with an overlapping margin of 4. In image restoration, each restored pixel value in the image affects the restored values of pixels within a certain neighborhood. Similarly, the existence of an edge pixel has a larger impact on the restored image in a neighborhood near the edge pixel. When the size of the affected neighborhood is small enough, the error caused by decomposing the image into blocks is also small. In the block-based method, we only consider one block of edges at a time; i.e., we are ignoring the effect of other edge pixels on this particular edge block. When there is more overlap among the blocks, more of the edge pixels that could possibly affect the restored image nearby are included, thereby reducing the error.

Under certain circumstances, preconditioning can make the CG algorithm more competitive [37]. However, the application of a preconditioner requires further design choices and computational complexity. Furthermore, our method itself can be used as a preconditioner in a CG algorithm by using the block-based method as an approximate solver within an iterative framework. Define the matrix M by the block-based approximation such that the exact solution

x = (AᵀA + α Σ_{i=0}^{3} DᵢᵀΛᵢDᵢ)⁻¹ Aᵀy

is replaced by an approximate solution

x = M Aᵀy,    (2.21)

where M is the block-based approximation of (AᵀA + α Σ_{i=0}^{3} DᵢᵀΛᵢDᵢ)⁻¹. Since M is easy to calculate and includes information about edges, it could help reduce the computational time in CG. Let l be the number of blocks in the block-based restoration and r be the number of iterations. Assuming the number of edges is proportional to n, the computation for an n ×
n image is proportional to 21 r n² log₂ n + (2r/l) n³. Increasing the number of blocks reduces the computation significantly, since it is dominated by the n³r/l term. For l = 1, i.e., the non-block restoration, the computation is proportional to 21 r n² log₂ n + 2 r n³. When the number of blocks changes from 4 to 16 for a 256×256 image, the computation drops by almost one third, from 19.3×10⁶ to 13.1×10⁶ multiplications.

2.7.4 Analysis of Overlapping

Our experiments show that in the block-based approximation the restored image is not as faithful to the original image as in the non-block method. The MSE gets worse as the size of the blocks is decreased. In other words, we see the effect of different levels of approximation with different block sizes. Meanwhile, the overlapping margin also affects the error, because it determines the actual block size used in the calculation. Using a larger overlapping margin reduces the error while increasing the computation.

In order to study the effect of the overlapping margin, we consider a simplified case to examine the effect that one edge pixel has on its neighbors. A simple model function with only one edge pixel is used. The edge-preserving restoration now becomes

\hat{x}_{\alpha,S} = \hat{x}_\alpha + \alpha B l_i^T [I - \alpha l_i B l_i^T]^{-1} l_i \hat{x}_\alpha,    (2.22)

where l_i is a row vector from D corresponding to a particular edge location i. The difference between the image without any edges and the image with one edge pixel is

E = \hat{x}_{\alpha,S} - \hat{x}_\alpha = \alpha B l_i^T [I - \alpha l_i B l_i^T]^{-1} l_i \hat{x}_\alpha.    (2.23)

Since [I - \alpha l_i B l_i^T] is a scalar,

E = \frac{\alpha}{1 - \alpha l_i B l_i^T} B l_i^T l_i \hat{x}_\alpha.    (2.24)

Since l_i is a row vector and \hat{x}_\alpha is the column-ordered image, l_i \hat{x}_\alpha is a scale factor. Now the difference becomes

E = K B l_i^T,    (2.25)

where K = \frac{\alpha}{1 - \alpha l_i B l_i^T} l_i \hat{x}_\alpha is a scale factor. Notice that E is an image showing the difference between the shift-invariant restoration and the edge-preserving restoration.
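The multiplication-count model quoted earlier in this section can be checked directly. Assuming a single DEEPIR iteration (r = 1, our assumption), the model reproduces the quoted figures for a 256×256 image to within rounding:

```python
from math import log2

def deepir_block_cost(n, r, l):
    """Multiplication-count model from the text: 21*r*n^2*log2(n) of
    FFT work plus (2*r/l)*n^3 for solving the l block systems."""
    return 21 * r * n**2 * log2(n) + 2 * r / l * n**3

# changing from l = 4 to l = 16 blocks on a 256x256 image
c4 = deepir_block_cost(256, 1, 4)
c16 = deepir_block_cost(256, 1, 16)
```

With r = 1 this gives about 19.4×10⁶ multiplications for l = 4 and 13.1×10⁶ for l = 16, consistent with the drop of roughly one third quoted in the text.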
In this particular case, we only care about the distribution of the image rather than the actual pixel values, because the purpose of this analysis is to find out how one single edge pixel affects its neighborhood as a function of distance from the edge pixel. The scale factor K will vary in different parts of the image and will tend to be higher where i is an index corresponding to pixels near an edge. However, the values die out only a few pixels away from the edge pixel, so the effect of this edge is still negligible. In this case, a small overlapping margin will not introduce much error.

The distribution of E depends only on distance from the edge pixel. Since l_i is a row vector from a BCCB (block circulant with circulant blocks) matrix, a different edge location gives an l_i with the same elements circularly shifted. Thus, the distribution around the edge pixel will be the same around every location. To illustrate the E image, we calculated a normalized version of E. Figure 2.11 shows the absolute value of the normalized image E. The spreading pattern in Figure 2.11 is confined to a very small neighborhood around the edge location and dies out very quickly away from the center. According to Figure 2.11, the effect between two pixels that are 6 pixels apart is close to zero. Thus a very small overlap can be used to avoid unnecessary computation without neglecting the most important neighbor pixels.

Figure 2.10: Block-based restoration of "cameraman": (a) original image (b) blurred image (c) shift-invariant restoration (d) DEEPIR with k = 2 (e) DEEPIR with k = 4 (f) DEEPIR with k = 8
Figure 2.11: Normalized E image

Chapter 3
Fast Algorithm for Solving Block Banded Toeplitz Systems with Banded Toeplitz Blocks

3.1 Introduction

In many image restoration problems, linear and shift-invariant systems are assumed, such that the blurring procedure can be expressed as

g(x, y) = h(x, y) * f(x, y) + n(x, y),    (3.1)

where * stands for linear convolution and n(x, y) is additive noise that is independent of the signal. Rewriting the above system in matrix form, it becomes

g = Hf + n,    (3.2)

where g, f, and n are the column-ordered versions of the blurred image, the original image, and the noise, each of size n²×1 for an n×n image. To represent the convolution operation, the matrix H has block-Toeplitz-with-Toeplitz-block (BTTB) structure.

Many problems arising in signal and image processing are in fact characterized by matrices with a BTTB structure in which both the Toeplitz blocks and the block structure are banded. This is because most point spread functions (PSFs) h have finite extent and are much smaller than the image. When the image formation equations are written with the original and blurred images ordered as single vectors, either row- or column-wise, the matrix that describes the blurring operation has a banded block-Toeplitz structure with banded Toeplitz blocks [2]. Also, in some non-banded cases, the matrix may be approximated by such a banded structure.

On the other hand, most image restoration algorithms assume circulant systems to simplify the computation by making use of FFTs, whereas a Toeplitz system should be used instead according to the definition of the point spread function. For example, the discussion of edge-preserving restoration methods in the previous chapter assumes a block-circulant-with-circulant-block (BCCB) system. Thus artifacts on image boundaries (not edges) are ignored.
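For small images, the BTTB matrix H in (3.2) can be built explicitly from the PSF. The following sketch is our own construction, O(n⁴) in time and meant only for illustration; it assumes zero boundary conditions and a "same"-size output.

```python
import numpy as np

def bttb_from_psf(psf, n):
    """Build the (n^2)-by-(n^2) BTTB matrix H such that H @ f.ravel()
    equals the 'same'-size linear convolution of an n-by-n image f with
    the (symmetric) PSF, using zero boundary conditions.  Row-major
    ordering is used here for simplicity; the dissertation orders
    pixels column-wise."""
    H = np.zeros((n * n, n * n))
    kr, kc = psf.shape
    cr, cc = kr // 2, kc // 2          # PSF center
    for r in range(n):                 # each input pixel (r, c) ...
        for c in range(n):
            for i in range(kr):        # ... spreads psf into the output
                for j in range(kc):
                    rr, cc2 = r + i - cr, c + j - cc
                    if 0 <= rr < n and 0 <= cc2 < n:
                        H[rr * n + cc2, r * n + c] += psf[i, j]
    return H

psf = np.ones((3, 3)) / 9.0            # 3x3 uniform blur
H = bttb_from_psf(psf, 5)

# blurring a unit impulse reproduces the PSF around its location
f = np.zeros((5, 5)); f[2, 2] = 1.0
g = (H @ f.ravel()).reshape(5, 5)
```

Because the PSF has finite extent, H is banded both in its block structure and within each Toeplitz block, which is exactly the structure the algorithm of this chapter exploits.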
This is often acceptable because FFTs significantly improve computational performance at a small price in image quality; for most images, information on the boundaries is not as important as that in the center of the image. Although the BCCB assumption is used in many algorithms, it is not an exact representation of the original system and can pose problems when there is important information along the image boundaries.

Fast algorithms that help solve Toeplitz systems have been proposed since the late 1960s and are still being studied, not only for their applications in image processing but also for the general-purpose solution of linear systems. It is well known that the solution of a general Toeplitz system can be found efficiently using the Levinson algorithm or the Schur algorithm [38, 39]. These algorithms require O(N²) operations for an N×N matrix, or O(N log² N) if a superfast algorithm is used [40, 41]. (Superfast algorithms are generally only competitive for very large N due to a large scale factor.) Efficient algorithms that exploit the band structure of a banded Toeplitz matrix using the Schur algorithm require O(kN) operations for a bandwidth of k. More efficient methods have also been developed that require O(N log k + k² log(N/k)) operations [42].

The case of block-Toeplitz matrices with Toeplitz blocks is much more problematic. Algorithms that exploit the block-Toeplitz structure usually destroy the Toeplitz structure of the blocks in the process, and vice versa. Thus, general solvers for block-Toeplitz-with-Toeplitz-block (BTTB) systems require O(M⁵) operations for M²×M² matrices with M×M blocks [38, 43, 44]. The banded Toeplitz outer (inner) structure can be exploited in a Schur algorithm to yield an operation count of O(kM⁴). However, this does not fully exploit the inner (outer) banded Toeplitz structure. Since multiplication of a BTTB matrix by a vector can be performed in O(M² log M) operations [45], iterative algorithms can be used.
However, iterative algorithms can take thousands of iterations to converge and are not always parallelizable [46]. Furthermore, iterative algorithms must repeat the bulk of the computation for each new input even if the system is the same. A fast algorithm exploiting the doubly Toeplitz structure with O(6M³) operations has been proposed in [46], but it requires the assumption that the PSF is bandlimited.

In this chapter, we discuss a new algorithm that not only makes use of the BTTB structure but also exploits the bandedness of the system. By exploiting a form of the generalized Schur algorithm in [38, 47], this algorithm requires only O(k²M³) operations without any further assumption about the BTTB system.

3.2 Solution Formulation

We consider a matrix T to be a square block-Toeplitz matrix of order M and block bandwidth k if

T_{M² \times M²} = \begin{pmatrix}
t_0 & t_{-1} & \cdots & t_{1-k} & 0 & \cdots \\
t_1 & t_0 & t_{-1} & \cdots & t_{1-k} & \cdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & \vdots \\
t_{k-1} & t_{k-2} & \cdots & t_0 & t_{-1} & \cdots \\
0 & t_{k-1} & t_{k-2} & \cdots & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & \cdots & t_0
\end{pmatrix}    (3.3)

where each block entry t_i, i = 1−k, 2−k, ..., 0, ..., k−1, is an M×M Toeplitz matrix with bandwidth k. No symmetry is assumed in (3.3); i.e., t_i = t_{−i}, i = 1, ..., k−1, is not required. The algorithm is easily generalized to the case where the block size is not equal to the number of blocks and where the bandwidth in each block is not equal to the block bandwidth. We consider the simpler case for simplicity of presentation. We wish to find x such that

y = Tx.    (3.4)

Here the noise factor is ignored for the purpose of simplifying the description. By focusing on the structure of the blurring operation T, we are able to show the mechanism of the algorithm effectively; the impact of noise can always be added later. Define T_c as the matrix T extended so that the structure of T_c is block-circulant with circulant blocks (BCCB), having dimensions of (at least) (M+k−1)² × (M+k−1)².
T_c = \left(\begin{array}{cccccc|ccc}
c_0 & \cdots & c_{1-k} & 0 & \cdots & 0 & c_{k-1} & \cdots & c_1 \\
c_1 & c_0 & \cdots & c_{1-k} & 0 & \cdots & 0 & c_{k-1} & \cdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & \ddots & \cdots & \ddots & \vdots \\
c_{k-1} & c_{k-2} & \cdots & c_0 & \cdots & c_{1-k} & 0 & \cdots & 0 \\
0 & c_{k-1} & c_{k-2} & \cdots & c_0 & \cdots & c_{1-k} & 0 & \cdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & \ddots & \cdots & \ddots & \vdots \\
0 & \cdots & 0 & c_{k-1} & c_{k-2} & \cdots & c_0 & \cdots & c_{1-k} \\ \hline
c_{1-k} & 0 & \cdots & 0 & c_{k-1} & c_{k-2} & \cdots & c_0 & \cdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & \ddots & \cdots & \ddots & \vdots \\
c_{-1} & \cdots & c_{1-k} & 0 & \cdots & 0 & c_{k-1} & c_{k-2} & \cdots
\end{array}\right)    (3.5)

Each block entry c_i, i = 1−k, 2−k, ..., 0, ..., k−1, is an (M+k−1)×(M+k−1) circulant matrix expanded from the corresponding Toeplitz matrix t_i in the same fashion. Then define a unitary permutation P, i.e., PPᵀ = PᵀP = I, such that in T̃_c = PT_cPᵀ the blocks augmenting the inner Toeplitz structure and the circulant blocks augmenting the outer structure are in the upper left corner [48]. Details of this permutation are shown in the experiment section. Following the procedure in [46], we can rewrite (3.4) as a 2-D circular convolution operation

\begin{pmatrix} z \\ y \end{pmatrix} = \tilde{T}_c \begin{pmatrix} 0 \\ x \end{pmatrix} = \begin{pmatrix} C & D \\ B & T \end{pmatrix} \begin{pmatrix} 0 \\ x \end{pmatrix}.    (3.6)

The vector z is an unknown extension of y. If z were known, the right-hand side could be calculated directly using a 2-D FFT-based deconvolution operation. From (3.6) we can write

\begin{pmatrix} 0 \\ x \end{pmatrix} = \tilde{T}_c^{-1} \begin{pmatrix} z \\ y \end{pmatrix} = \begin{pmatrix} Y & W \\ V & U \end{pmatrix} \begin{pmatrix} z \\ y \end{pmatrix}.    (3.7)

Then we must solve

-Wy = Yz    (3.8)

for z to find the solution to the original problem.

3.3 Generalized Displacement Rank

A BCCB system remains block-circulant with circulant blocks after inversion [49], so

\tilde{T}_c^{-1} = (PT_cP^T)^{-1} = PT_c^{-1}P^T,    (3.9)

where T_c^{-1} is a BCCB system with the same block structure as T_c, and T̃_c^{-1} is the permuted version of T_c^{-1} with the same block structure as T̃_c.
The matrix Y in (3.8) is a (k−1)(2M+k−1) × (k−1)(2M+k−1) matrix with a special structure; that is, Y is permuted so that the blocks augmenting the inner Toeplitz structure to a circulant structure are moved to the lower right corner of Y, while the circulant blocks augmenting the outer block structure are left in the upper left corner. This matrix has a near-BTTB structure: it is not block Toeplitz, but it has Toeplitz blocks, and the blocks are not equal in size. Furthermore, the matrix contains some structure that is not immediately obvious from inspection. This structure is best revealed using the concept of generalized displacement rank [38].

3.3.1 Definition and properties

Let Z be a strictly lower triangular matrix and A an N×N matrix. Then the matrix

\nabla_Z(A) = A - ZAZ^T    (3.10)

is the positive displacement of A with respect to the displacement operator Z, and \nabla_{Z^T}(A) is the negative displacement. Then

\alpha_+(A) = \mathrm{rank}\, \nabla_Z(A)    (3.11)

is the positive displacement rank of A, and

\alpha_-(A) = \mathrm{rank}\, \nabla_{Z^T}(A)    (3.12)

is the negative displacement rank of A. One can show [50] that

\alpha_+(A) = \alpha_-(A^{-1})    (3.13)

and

\alpha_-(A) = \alpha_+(A^{-1}).    (3.14)

Furthermore, we can represent A as

A = \sum_{i=1}^{\alpha_+(A)} L(x_i)\, U(y_i^T),    (3.15)

where L(x_i) (U(y_i^T)) denotes a lower (upper) triangular Toeplitz matrix whose first column (row) is x_i (y_i^T). Likewise,

A = \sum_{i=1}^{\alpha_-(A)} U(\tilde{x}_i^T)\, L(\tilde{y}_i),    (3.16)

where \tilde{x} = [x_N \cdots x_1]^T. Thus, a matrix with displacement rank α can be fully represented by the vectors x_i, y_i, i = 1, ..., α. Furthermore, procedures exist for computing the LU (UL) decomposition of a matrix with positive (negative) displacement rank α in O(αN²) operations [38]. Procedures also exist for converting from representation (3.15) for A to representation (3.16) for A⁻¹ in O(αN²) operations.

3.3.2 Choice of Z

The key to solving (3.8) as efficiently as possible is to represent Y in terms of (3.15) for an appropriate choice of displacement operator Z.
Define Z_c as an (M+k−1)² × (M+k−1)² matrix with (M+k−1)×(M+k−1) identity matrices on the first block subdiagonal and zeros elsewhere:

Z_c = \begin{pmatrix}
0 & 0 & 0 & \cdots & 0 \\
I & 0 & 0 & \cdots & 0 \\
0 & I & 0 & 0 & \cdots \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & I & 0
\end{pmatrix}    (3.17)

where I is an (M+k−1)×(M+k−1) identity matrix. It should be noted that Z_c is a block shift matrix [38]; i.e., for any matrix X, Z_cX shifts the rows of X down by the size of the block, which is M+k−1 in this case, and XZ_c^T shifts the columns of X to the left by the size of the block. Then define Z as

Z = PZ_cP^T.    (3.18)

Theorem 1. For Z defined above, \alpha_+(Y) \le 2(M+k-1).

Proof: Using Z defined above and knowledge of \tilde{T}_c,

\nabla_Z(\tilde{T}_c) = \tilde{T}_c - Z\tilde{T}_cZ^T
 = PT_cP^T - PZ_cP^T PT_cP^T PZ_c^TP^T
 = PT_cP^T - PZ_cT_cZ_c^TP^T
 = PT_cP^T - P \begin{pmatrix} 0 & 0 \\ 0 & T_c(1,1) \end{pmatrix} P^T    (3.19)
 = P \begin{pmatrix} T_{c0} & T_{c2} \\ T_{c1} & 0 \end{pmatrix} P^T
 = P\, \nabla_{Z_c}(T_c)\, P^T.

Let N_1 = (M+k−1)(M+k−2) and N_2 = M+k−1. Z_cT_cZ_c^T shifts T_c to the left and down by N_2. Since T_c is a block-circulant matrix, T_c(1,1) is an N_1×N_1 principal submatrix of T_c obtained by deleting the first block row and the first block column of T_c. Then T_{c0} (N_2×N_2), T_{c2} (N_2×N_1), and T_{c1} (N_1×N_2) are the upper-left, upper, and left parts of T_c, respectively.

Since \tilde{T}_c^{-1} has the same block structure as \tilde{T}_c, \nabla_Z(\tilde{T}_c^{-1}) has the same structure as \nabla_Z(\tilde{T}_c) in (3.19). Following (3.9), \nabla_{Z_c}(T_c^{-1}) can also be expressed as

\begin{pmatrix} T_{col}\{N \times N_2\} & \begin{matrix} T_{row}\{N_2 \times N_1\} \\ 0 \end{matrix} \end{pmatrix}    (3.20)

where N = N_1 + N_2. Let l_i, i = 1, ..., N_2, be the columns of T_{col}\{N \times N_2\}, and let r_i, i = 1, ..., N_2, be 1×N row vectors such that r_i = [0 \; r'_i], where r'_i is the ith row of T_{row}\{N_2 \times N_1\}. It is easy to verify that

\nabla_{Z_c}(T_c^{-1}) = \sum_{i=1}^{N_2} l_i e_i^T + e_i r_i,    (3.21)

where e_i, i = 1, ..., N_2, is the ith column of an N×N identity matrix. Properties of the rank of a matrix show [51] that

\mathrm{rank}(A+B) \le \mathrm{rank}(A) + \mathrm{rank}(B),

so

\mathrm{rank}(\nabla_{Z_c}(T_c^{-1})) \le \sum_{i=1}^{N_2} \mathrm{rank}(l_i e_i^T) + \mathrm{rank}(e_i r_i) \le 2N_2 = 2(M+k-1).    (3.22)

Then we have \alpha_+(\tilde{T}_c^{-1}) \le 2(M+k-1), since elementary operations do not change the rank of a matrix.
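The displacement definitions can be checked numerically in the classical scalar case: with Z the lower shift matrix, a Toeplitz matrix has positive displacement rank at most 2, since A − ZAZᵀ retains only the first row and first column. A minimal sketch (arbitrary numbers, scalar rather than block shift, our own illustration):

```python
import numpy as np

def displacement(A):
    """Positive displacement A - Z A Z^T with Z the lower shift matrix,
    the scalar special case of (3.10)."""
    N = A.shape[0]
    Z = np.eye(N, k=-1)             # strictly lower triangular shift
    return A - Z @ A @ Z.T

# a 6x6 nonsymmetric banded Toeplitz matrix: constant along diagonals
c = np.array([5., 1., 2., 3., 4., 6.])    # first column
r = np.array([5., 7., 8., 9., 0., 0.])    # first row
T = np.empty((6, 6))
for i in range(6):
    for j in range(6):
        T[i, j] = c[i - j] if i >= j else r[j - i]

rank = np.linalg.matrix_rank(displacement(T))
```

Because shifting a Toeplitz matrix down and to the right reproduces all but its first row and column, the displacement is nonzero only there, so the rank is 2; a generic dense matrix would give full rank instead.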
If we let

Z = \begin{pmatrix} Z_a & Z_b \\ Z_c & Z_d \end{pmatrix}    (3.23)

and

\nabla_Z(\tilde{T}_c^{-1}) = \begin{pmatrix} \nabla_y & \nabla_w \\ \nabla_v & \nabla_u \end{pmatrix},    (3.24)

then, because Z is lower triangular,

\nabla_Z(\tilde{T}_c^{-1}) = \begin{pmatrix} \nabla_y & \nabla_w \\ \nabla_v & \nabla_u \end{pmatrix} = \begin{pmatrix} Y & W \\ V & U \end{pmatrix} - \begin{pmatrix} Z_a & Z_b \\ Z_c & Z_d \end{pmatrix} \begin{pmatrix} Y & W \\ V & U \end{pmatrix} \begin{pmatrix} Z_a^T & Z_c^T \\ Z_b^T & Z_d^T \end{pmatrix}.    (3.25)

If we define N_3 = (k−1)(2M+k−1), then

\nabla_y\{N_3 \times N_3\} = Y - Z_aYZ_a^T - Z_bVZ_a^T - Z_aWZ_b^T - Z_bUZ_b^T = Y - Z_aYZ_a^T = \nabla_{Z_a}(Y),    (3.26)

since Z_b = 0. Following (3.20) and (3.19), \nabla_{Z_a}(Y) has a structure similar to that of \nabla_Z(\tilde{T}_c^{-1}):

\nabla_{Z_a}(Y) = \begin{pmatrix} \nabla_{y0}\{N_2 \times N_2\} & \nabla_{y2}\{N_2 \times N_4\} \\ \nabla_{y1}\{N_4 \times N_2\} & 0 \end{pmatrix}    (3.27)

where N_4 = N_3 − N_2, and \nabla_{y0}, \nabla_{y1}, and \nabla_{y2} are the left and upper parts of \nabla_{Z_a}(Y). Similarly to (3.22), we can show that rank(\nabla_{Z_a}(Y)) \le 2(M+k−1) as long as N_3 \ge 2(M+k−1), i.e., as long as the dimension of Y is at least 2(M+k−1).

3.4 Modified System Solution

For a square block-Toeplitz matrix with square blocks, a few fast triangular factorization algorithms exist, such as the Bareiss algorithm [52], the Levinson algorithm [39], and the Schur algorithm [53]. These approaches require matrix permutation and inversion, which restricts them to certain block structures. The generalized Schur algorithm described in [38] avoids those problems by treating block-Toeplitz matrices in the same way as scalar Toeplitz matrices. Since the matrix Y is not a square block-Toeplitz matrix with square blocks, we use a modified version of the generalized Schur algorithm with the defined Z to obtain Y = LU [38]. Then (3.8) can be solved by forward- and back-substitution.

Let (R, S) be a generator of Y; that is,

\nabla_{Z_a}(Y) = RS^T.    (3.28)

Then an LU factorization of Y can be obtained using the following procedure [38]:

for j = 1 : (k−1)(2M+k−1),
1. [pvt, R, S] = MakeProper(R, S): make (R, S) a proper generator to obtain the pvt-th column of R, r_pvt, and the pvt-th column of S, s_pvt.
2. Set the jth column of L to r_pvt and the jth row of U to s_pvt^T.
3. Replace r_pvt with Z_a r_pvt and s_pvt with Z_a^T s_pvt.
A generator is proper with respect to step j above if all the elements of R are zero on and above row j except for a particular element r_{ji}, and all the elements of S are zero on and above row j except for the element s_{ji} in the corresponding location in S. The MakeProper procedure is described below:

1. r^T = first non-zero row of R; s^T = first non-zero row of S.
2. [pvt, FIRST, NEXT] = FindOrdering(r, s).
3. For each k ∈ FIRST, then for each k ∈ NEXT except k = pvt: determine the matrix C(k|pvt) and set R = R C(k|pvt), S = S C^{-T}(k|pvt).
4. Return pvt, R, S.

The procedure FindOrdering divides the indices i, i = 1, ..., 2(M+k−1), for each row into two groups, FIRST and NEXT, so that real spinors exist [38]. Since rank(\nabla_{Z_a}(Y)) \le 2(M+k−1), the procedure is valid for indices i = 1, ..., 2(M+k−1). A complete set of indices can also be used, i.e., i = 1, ..., (k−1)(2M+k−1), but the smallest possible number should be used for the efficiency of the algorithm. The reader is referred to [38] for details on FindOrdering. The generalized Schur algorithm requires 2(M+k−1)(k−1)²(2M+k−1)² operations, while the forward- and back-substitution requires O(k²M²) operations. Assuming M ≫ k, the overall computation is O(k²M³). Storage of O(k²M²) is needed to store the LU decomposition result.

To use this algorithm, an initial generator (R, S) must be available. For the Z defined above, \nabla_{Z_a}(Y) has the form in (3.27). By inspection, the following (R, S) is an appropriate initial generator:

R = \begin{pmatrix} \nabla_{y0}\{N_2 \times N_2\} & I_{N_2 \times N_4} \\ \nabla_{y1}\{N_4 \times N_2\} & 0_{N_4 \times N_4} \end{pmatrix}    (3.29)

and

S^T = \begin{pmatrix} I_{N_2 \times N_2} & 0_{N_2 \times N_4} \\ 0_{N_4 \times N_2} & \begin{matrix} \nabla_{y2}\{N_2 \times N_4\} \\ 0_{(N_4-N_2) \times N_4} \end{matrix} \end{pmatrix}.    (3.30)

It is straightforward to show that \nabla_{Z_a}(Y) = RS^T. To obtain the system solution, we solve −Wy = LUz using forward- and back-substitution and then use a circular deconvolution operation with [z^T y^T]^T as input to obtain [0^T x^T]^T.
Notice that ?Wy can be obtained with circular deconvolution from ~T?1c 2 66 4 0 y 3 77 5 = 2 66 4 Y W V U 3 77 5 2 66 4 0 y 3 77 5 = 2 66 4 Wy Uy 3 77 5 (3.31) 70 3.5 Experiments This fast algorithm applies to any square BTTB system. In this section, we show a small numerical example solving an arbitrary BTTB system and a realistic example in regularized image restoration. 3.5.1 Numerical Example Once a BTTB system is expanded to a BCCB system, it can be solved with FFT?s. However, to acquire the BCCB system, we have to solve for a smaller system using the generalized Schur algorithm described in the previous section. This fast algorithm exploits the bandedness of the BTTB system and turns the problem of solving an M2?M2 system into one that only involves a (k?1)(2M +k?1)?(k?1)(2M +k?1) matrix. The mechanism of expanding and representing a BTTB matrix in FFT?s is described in the following example. Consider a 16?16 BTTB system T = 2 66 66 64 t0 t1 t2 0 t3 t0 t1 t2 t4 t3 t0 t1 0 t4 t3 t0 3 77 77 75 (3.32) where t0 = 2 66 66 64 1 2 3 0 4 1 2 3 5 4 1 2 0 5 4 1 3 77 77 75; (3.33) 71 t1 = 2 66 66 64 6 7 8 0 9 6 7 8 10 9 6 7 0 10 9 6 3 77 77 75; (3.34) ... and t4 = 2 66 66 64 21 22 23 0 24 21 22 23 25 24 21 22 0 25 24 21 3 77 77 75: (3.35) In this case, M = 4, k = 3. T can be expanded to a BCCB system Tc = 2 66 66 66 66 66 64 c0 c1 c2 0 c4 c3 c3 c0 c1 c2 0 c4 c4 c3 c0 c1 c2 0 0 c4 c3 c0 c1 c2 c2 0 c4 c3 c0 c1 c1 c2 0 c4 c3 c0 3 77 77 77 77 77 75 (3.36) where each block ci, i = 0;1;:::;4 is expanded from ti in the same fashion. For example, c0 = 2 66 66 66 66 66 64 1 2 3 0 5 4 4 1 2 3 0 5 5 4 1 2 3 0 0 5 4 1 2 3 3 0 5 4 1 2 2 3 0 5 4 1 3 77 77 77 77 77 75 : (3.37) 72 Now the problem becomes 2 66 4 z y 3 77 5 = PTcPT 2 66 4 0 x 3 77 5 According to the properties of FFT?s, Tc~x = column ordered version of (tc ? ~xI) (3.38) where tc is the convolution kernel corresponding to Tc, in other words, the PSF that Tc stands for. 
Here \tilde{x} and \tilde{x}_I are the column-ordered version and the image form of the expanded original, respectively, and \circledast stands for circular convolution. By inspection, we have

t_c = \begin{pmatrix}
1 & 16 & 21 & 0 & 11 & 6 \\
4 & 19 & 24 & 0 & 14 & 9 \\
5 & 20 & 25 & 0 & 15 & 10 \\
0 & 0 & 0 & 0 & 0 & 0 \\
3 & 18 & 23 & 0 & 13 & 8 \\
2 & 17 & 22 & 0 & 12 & 7
\end{pmatrix}    (3.39)

Since a BCCB matrix can be diagonalized by the 2-D FFT, we obtain the convolution kernel of T_c^{-1} without directly inverting T_c:

t_c^{-1} = 10^{-3} \times \begin{pmatrix}
-0.991 & -6.02 & 4.78 & -3.70 & 10.80 & 0.002 \\
-0.222 & -3.39 & -1.77 & -3.09 & 5.66 & 4.03 \\
0.503 & -2.68 & -3.59 & -3.31 & 4.28 & 5.19 \\
-0.827 & 17.24 & -4.85 & 15.84 & -30.83 & -8.74 \\
1.26 & -4.41 & 2.40 & -4.97 & 8.25 & 1.44 \\
0.535 & -5.12 & 4.22 & -4.75 & 9.63 & 0.284
\end{pmatrix}    (3.40)

Notice that t_c^{-1} is not the inverse of t_c considered as a matrix. Following (3.27), we have

\nabla_{Z_a}(Y) = \begin{pmatrix} \nabla_{y0} & \nabla_{y2} \\ \nabla_{y1} & 0 \end{pmatrix}

where

\nabla_{y0}\{6 \times 6\} = 10^{-3} \times \begin{pmatrix}
-0.991 & 0.535 & 1.26 & -0.827 & 0.503 & -0.222 \\
-0.222 & -0.991 & 0.535 & 1.26 & -0.827 & 0.503 \\
0.503 & -0.222 & -0.991 & 0.535 & 1.26 & -0.827 \\
-0.827 & 0.503 & -0.222 & -0.991 & 0.535 & 1.26 \\
1.26 & -0.827 & 0.503 & -0.222 & -0.991 & 0.535 \\
0.535 & 1.26 & -0.827 & 0.503 & -0.222 & -0.991
\end{pmatrix},

\nabla_{y1}\{14 \times 6\} = 10^{-3} \times \begin{pmatrix}
-6.02 & -5.12 & -4.41 & 17.24 & -2.68 & -3.39 \\
-3.39 & -6.02 & -5.12 & -4.41 & 17.24 & -2.68 \\
-2.68 & -3.39 & -6.02 & -5.12 & -4.41 & 17.24 \\
17.24 & -2.68 & -3.39 & -6.02 & -5.12 & -4.41 \\
-4.41 & 17.24 & -2.68 & -3.39 & -6.02 & -5.12 \\
-5.12 & -4.41 & 17.24 & -2.68 & -3.39 & -6.02 \\
4.78 & 4.22 & 2.40 & -4.85 & -3.59 & -1.77 \\
-1.77 & 4.78 & 4.22 & 2.40 & -4.85 & -3.59 \\
-3.70 & -4.75 & -4.97 & 15.84 & -3.31 & -3.09 \\
-3.09 & -3.70 & -4.75 & -4.97 & 15.84 & -3.31 \\
10.80 & 5.66 & 4.28 & -30.83 & 8.25 & 9.63 \\
9.63 & 10.80 & 5.66 & 4.28 & -30.83 & 8.25 \\
0.002 & 4.03 & 5.19 & -8.74 & 1.44 & 0.284 \\
0.284 & 0.002 & 4.03 & 5.19 & -8.74 & 1.44
\end{pmatrix},

and \nabla_{y2}^T (where \nabla_{y2} is 6 \times 14) = 10^{-3} \times
\begin{pmatrix}
0.002 & 4.03 & 5.19 & -8.74 & 1.44 & 0.284 \\
0.284 & 0.002 & 4.03 & 5.19 & -8.74 & 1.44 \\
1.44 & 0.284 & 0.002 & 4.03 & 5.19 & -8.74 \\
-8.74 & 1.44 & 0.284 & 0.002 & 4.03 & 5.19 \\
5.19 & -8.74 & 1.44 & 0.284 & 0.002 & 4.03 \\
4.03 & 5.19 & -8.74 & 1.44 & 0.284 & 0.002 \\
10.80 & 5.66 & 4.28 & -30.83 & 8.25 & 9.63 \\
9.63 & 10.80 & 5.66 & 4.28 & -30.83 & 8.25 \\
-3.70 & -3.09 & -3.31 & 15.84 & -4.97 & -4.75 \\
-4.75 & -3.70 & -3.09 & -3.31 & 15.84 & -4.97 \\
4.78 & -1.77 & -3.59 & -4.85 & 2.40 & 4.22 \\
4.22 & 4.78 & -1.77 & -3.59 & -4.85 & 2.40 \\
-6.02 & -3.39 & -2.68 & 17.24 & -4.41 & -5.12 \\
-5.12 & -6.02 & -3.39 & -2.68 & 17.24 & -4.41
\end{pmatrix}

The size of \nabla_{Z_a}(Y) is (k−1)(2M+k−1) × (k−1)(2M+k−1) = 20×20. The initial generator (R, S) is then

R_{20 \times 20} = \begin{pmatrix} \nabla_{y0}\{6 \times 6\} & I_{6 \times 14} \\ \nabla_{y1}\{14 \times 6\} & 0_{14 \times 14} \end{pmatrix}    (3.41)

and

S^T_{20 \times 20} = \begin{pmatrix} I_{6 \times 6} & 0_{6 \times 14} \\ 0_{14 \times 6} & \begin{matrix} \nabla_{y2}\{6 \times 14\} \\ 0_{8 \times 14} \end{matrix} \end{pmatrix}    (3.42)

where

I_{6 \times 14} = ( I_{6 \times 6} \;\; 0_{6 \times 8} ).

It is straightforward to verify that \nabla_{Z_a}(Y) = RS^T. Applying the generalized Schur algorithm, we obtain the LU decomposition of Y:

LUz = -Wy.

With forward- and back-substitution, we solve for z. Then we plug z into (3.6), and x can be solved for with FFTs.

3.5.2 Regularized Image Restoration

A blurred image is often modeled as

y = Hx + n,    (3.43)

where H is a BTTB system representing the blurring when linear convolution is assumed; i.e., H is taller than it is wide, and the pixels outside x are considered all zeros. Here x and y are the original and blurred images, and n is noise. Regularization is applied to reduce noise amplification. Using a simple shift-invariant penalty for the regularization, the solution is expressed as

\hat{x} = (H^TH + \alpha L^TL)^{-1}H^Ty,    (3.44)

where L is a BTTB matrix representing 3×3 Laplacian smoothing. H^TH and L^TL are Hermitian BTTB systems, so T = H^TH + \alpha L^TL is also a Hermitian BTTB system. The regularization parameter α is chosen to be 0.001. We considered the case of a 256×256 image with finite support, blurred by a 5×5 uniform PSF; 40 dB Gaussian noise is added.
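For comparison, the circulant (BCCB) approximation of (3.44) can be computed entirely with 2-D FFTs. The sketch below is our own illustration of that approximate baseline, not the exact banded-BTTB solver of this chapter; the toy 2×2 PSF is chosen to keep the transfer function well-conditioned, and the demo is noiseless.

```python
import numpy as np

def psf_otf(psf, shape):
    """Zero-pad the PSF to `shape` and circularly center it at (0, 0),
    giving the transfer function of the circulant approximation."""
    k = np.zeros(shape)
    k[:psf.shape[0], :psf.shape[1]] = psf
    k = np.roll(k, (-(psf.shape[0] // 2), -(psf.shape[1] // 2)), axis=(0, 1))
    return np.fft.fft2(k)

def tikhonov_deblur_fft(y, psf, alpha):
    """x_hat = (H^T H + alpha L^T L)^{-1} H^T y computed with 2-D FFTs
    under a circulant (BCCB) approximation; L is the 3x3 Laplacian."""
    lap = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], float)
    Hf, Lf = psf_otf(psf, y.shape), psf_otf(lap, y.shape)
    Xf = np.conj(Hf) * np.fft.fft2(y) / (np.abs(Hf)**2 + alpha * np.abs(Lf)**2)
    return np.real(np.fft.ifft2(Xf))

# noiseless sanity check: blur circularly, then restore
rng = np.random.default_rng(1)
x = rng.random((64, 64))
psf = np.array([[0.6, 0.2], [0.2, 0.0]])   # well-conditioned toy PSF
y = np.real(np.fft.ifft2(psf_otf(psf, x.shape) * np.fft.fft2(x)))
x_hat = tikhonov_deblur_fft(y, psf, alpha=1e-9)
```

For a noiseless circular blur this recovers the image almost exactly; the point of the chapter is that when the true operator is BTTB rather than BCCB, this FFT shortcut is only an approximation near the boundaries, while the banded solver gives the exact solution of (3.44).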
Thus M = 256, and the one-sided bandwidth of both the inner and outer structures is k = 5. Using the procedure described in the previous section and forward- and back-substitution, we obtain the restored image shown in Figure 3.1. Because we assumed a shift-invariant penalty in the regularization, there are ringing effects along edges in the image. The restored x̂ is the exact solution to (3.44). The computation required to obtain the solution with the described algorithm is 151 seconds on a workstation with an AMD Opteron 280 processor. By comparison, a circular deconvolution with approximate solutions requires 0.19 seconds. The solution using a simple banded generalized Schur decomposition is not available due to memory limitations. We applied the algorithm to images of different sizes, and the computation times are shown in Table 3.1. The new fast algorithm requires O(k²M³) operations, while the general solver needs O(M⁵).

Table 3.1: Computation Time
Image size   General solver (s)   Fast algorithm (s)
32×32        1.3                  0.35
64×64        36.4                 2.8
128×128      1210                 22.9
256×256      N/A                  151.1

The bigger the image, the faster the new algorithm is compared to the general solver. Although this example shows great improvement in solving a symmetric BTTB system, it should be noted that our method applies to non-symmetric BTTB systems as well. Applications of BTTB matrices in signal and image processing problems are not limited to regularized image restoration. Other problems, such as correlation matrices in autoregressive models, also have BTTB structure [54, 55].

Figure 3.1: Restoration of Jupiter image (a) original Jupiter image (b) blurred image (c) restored image

3.6 Summary and Discussion

This new algorithm extends a banded M²×M² BTTB system with bandwidth k to an (M+k−1)²×(M+k−1)² BCCB system and solves it with circular deconvolution. In order to obtain the BCCB system, a smaller system −Wy = Yz needs to be solved.
The generalized Schur algorithm is used to obtain an LU decomposition of Y, and then forward and back substitution are used to solve for z. The overall algorithm is dominated by the cost of computing the LU decomposition of Y, for an overall operation requirement of O(k²M³). Thus, both the block-band structure and the banded block structure are exploited, compared to the full-structure operation count of O(M⁵). A matrix extension is also proposed in [46], although there the augmented part z is estimated under the assumption that the original PSF is bandlimited. Superfast algorithms that require O(8M² log² M²) operations exploit the block Toeplitz structure but are only efficient for huge systems with M ≥ 256 [40, 41]. Compared to [46], [40] and [41], the new algorithm is applicable to arbitrary square BTTB systems without any assumption about the original PSF or the size of the system. Notice that in the BTTB system, the block size does not need to equal the number of blocks. For a matrix with M × M blocks of size p × p, the LU decomposition needs 2(p + k − 1)(k − 1)²(M + p + k − 1)² operations. Compared to 2(M + k − 1)(k − 1)²(2M + k − 1)² in Section 3.4, this can be faster if p < M. This algorithm works best when k is small, i.e., when the Toeplitz structure has a small bandwidth. As k → M, the efficiency approaches the non-banded algorithm efficiency of O(M⁵).

Chapter 4
Deblocking of JPEG-compressed Images

4.1 JPEG Compression

4.1.1 Introduction

JPEG is a commonly used standard method of compression for photographic images [56]. The name JPEG stands for Joint Photographic Experts Group, the committee that created the standard. The group was organized in 1986 and issued a standard in 1992, which was approved in 1994 as ISO 10918-1. JPEG provides for lossy compression of images (although there are variations on the standard baseline JPEG that are lossless).
The file format that employs this compression is commonly also called JPEG; the most common file extension for this format is .jpg, though .jpeg, .jpe, .jfif and .jif are also used [57]. JPEG compression has certain characteristics that distinguish it from other compression mechanisms:

• JPEG is designed for compressing either full-color or gray-scale images of natural, real-world scenes. It works well on photographs, naturalistic artwork, and similar material, but not so well on lettering, simple cartoons, or line drawings, unlike the Graphics Interchange Format (GIF).

• JPEG is "lossy," meaning that the decompressed image is not quite the same as the original. (There are lossless image compression algorithms, such as GIF, but JPEG achieves much greater compression than is possible with lossless methods.) JPEG is designed to exploit known limitations of the human eye, notably the fact that small color changes are perceived less accurately than small changes in brightness. Thus, JPEG is intended for compressing images that will be looked at by humans. If analyzed by machine, the small errors introduced by JPEG may be a significant problem, even if they are invisible to the eye.

• A useful property of JPEG is that the degree of lossiness can be varied by adjusting a compression parameter called the quality factor. This means that the image maker can trade off file size against output image quality. You can make extremely small files if you don't mind poor quality; this is useful for applications such as indexing image archives. Conversely, if you are not happy with the output quality at the default compression setting, you can improve the quality until you are satisfied and accept lesser compression.

• Another important aspect of JPEG is that decoders can trade off decoding speed against image quality by using fast but inaccurate approximations to the required calculations. Some viewers obtain remarkable speedups in this way.
(Encoders can also trade accuracy for speed, but there is usually less reason to make such a sacrifice when writing a file.)

Although JPEG is a lossy compression method, it actually loses far less information than other compression methods such as GIF in the case of a real-world scene. As long as the image is not repeatedly compressed and decompressed, there is very little difference visible to human eyes, given a reasonable quality factor [58]. As mentioned above, a low quality factor can result in poor image quality (Fig. 4.1).

There is now an improved JPEG standard, JPEG2000, that uses state-of-the-art compression techniques based on wavelet technologies. Its architecture should lend itself to a wide range of uses, from portable digital cameras through to advanced pre-press, medical imaging and other key sectors. JPEG2000 was created by the Joint Photographic Experts Group committee with the intention of superseding their original discrete cosine transform-based JPEG standard. Common filename extensions include .jp2 and .j2c [57, 59]. JPEG2000 can operate at higher compression ratios without generating the characteristic blocky and blurry artifacts of the original DCT-based JPEG standard. It also allows more sophisticated progressive downloads. Part of JPEG2000 has been published as an ISO standard, ISO/IEC 15444-1:2000. As of 2006, JPEG2000 is not widely supported in web browsers and hence is not generally used on the World Wide Web [57].

4.1.2 JPEG Algorithm

The JPEG algorithm performs its compression in four steps (Fig. 4.2):

1. The JPEG algorithm first cuts up an image into separate blocks of 8×8 pixels. Since the format is based on luminance/chrominance perception, it does not analyze RGB or CMYK color values but instead converts the image data to a luminance/chrominance color space, such as YUV. This allows for separate compression of these two factors.
Since luminance is more important than chrominance for our visual system, the algorithm retains more of the luminance in the compressed file.

2. The next step in the compression process is to apply a Discrete Cosine Transform (DCT) to each block. The DCT replaces the actual color data for each pixel with values that are relative to the average of the entire matrix being analyzed. This operation does not compress the file; it simply replaces the 8×8 pixel values with an 8×8 matrix of DCT coefficients.

3. Once the data is in the DCT domain, the actual compression can start. First, the compression software looks at the JPEG image quality the user requested and calculates two tables of quantization constants, one for luminance and one for chrominance. Once these tables have been constructed, the constants from the two tables are used to quantize the DCT coefficients. Each DCT coefficient is divided by its corresponding constant in the quantization table and rounded off to the nearest integer. The result of quantizing the DCT coefficients is that smaller, unimportant coefficients are replaced by zeros and larger coefficients lose precision. It is this rounding-off that causes the loss in image quality.

4. The resulting data are a list of streamlined DCT coefficients. The last step in the process is to compress these coefficients using either a Huffman or an arithmetic encoding scheme; usually Huffman encoding is used. This is a second, lossless compression. The compressed data now includes the Huffman code for the DCT coefficients, the quantization table, and the Huffman table.

Decompression works as the inverse of compression. First, the Huffman code is decoded according to the Huffman table. Then the same quantization table is used to dequantize the DCT coefficients. Notice that at this stage, some information has already been lost in the rounding-off during compression.
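The divide-round-multiply round trip of step 3 can be illustrated numerically. This is only a sketch, with a made-up coefficient list and a single uniform step size q = 16 standing in for the full quantization table:

```python
import numpy as np

q = 16.0                                   # illustrative uniform step size
coeffs = np.array([312.0, -57.3, 21.8, 4.9, -3.2, 1.1, 0.4, -0.2])

quantized = np.round(coeffs / q)   # divide and round: the lossy step
dequantized = quantized * q        # "dequantization": only a rough restoration
error = coeffs - dequantized       # information lost to rounding
```

Small coefficients collapse to zero (which is what makes the subsequent entropy coding effective), while large coefficients survive with reduced precision; the rounding error per coefficient is bounded by q/2.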
The so-called "dequantization" is simply a rough restoration of the DCT coefficients to their original values. An inverse block DCT is then applied to the coefficients to obtain the image data (Fig. 4.3).

Among the steps of generating a JPEG-compressed image, the DCT is where the energy is packed into low-frequency coefficients; quantization is where part of the information is lost and the "compression" happens; Huffman coding works efficiently in this case only if quantization gets rid of some "unimportant" coefficients. In the sections that follow, we explain how the DCT and quantization work. Readers interested in entropy coding are referred to [59, 60] for details.

4.2 Discrete Cosine Transform

4.2.1 1-D DCT

The rapid growth of digital imaging applications, including desktop publishing, multimedia, teleconferencing, and high-definition television (HDTV), has increased the need for effective and standardized image compression techniques. Among the emerging standards are JPEG, for compression of still images [61]; MPEG, for compression of motion video; and ITU H.26x, for compression of video telephony and teleconferencing. All three of these standards employ a basic technique known as the discrete cosine transform (DCT) [62]. The discrete cosine transform of a list of real numbers c(n), n = 0, ..., N−1, is the list of length N given by

C(k) = α(k) Σ_{n=0}^{N−1} c(n) cos[π(2n+1)k / (2N)],  0 ≤ k ≤ N−1,   (4.1)

where α(0) = √(1/N) and α(k) = √(2/N) for 1 ≤ k ≤ N−1. Each element of the transformed list C(k) is the inner product of the input list c(n) and a basis vector. The factors α(k) are chosen so that the basis vectors are orthogonal and normalized. The eight basis vectors for N = 8 are shown in Fig. 4.4. The inverse transformation is given by

c(n) = Σ_{k=0}^{N−1} α(k) C(k) cos[π(2n+1)k / (2N)],  0 ≤ n ≤ N−1,   (4.2)

with the same α(k). The DCT is real and orthogonal; that is, for the DCT matrix C in C(k) = Cc(n), C^{−1} = C^T.
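The orthogonality claim C^{−1} = C^T can be checked directly by building the type-II DCT matrix from (4.1); the signal values below are arbitrary:

```python
import numpy as np

def dct_matrix(N):
    """Type-II DCT matrix from (4.1): C[k, n] = alpha(k) cos(pi (2n+1) k / (2N))."""
    n = np.arange(N)
    k = n.reshape(-1, 1)
    alpha = np.where(k == 0, np.sqrt(1.0 / N), np.sqrt(2.0 / N))
    return alpha * np.cos(np.pi * (2 * n + 1) * k / (2 * N))

C = dct_matrix(8)
c = np.array([52.0, 55, 61, 66, 70, 61, 64, 73])   # arbitrary test signal
Ck = C @ c                                          # forward DCT, (4.1)
c_back = C.T @ Ck                                   # inverse via C^T, (4.2)
```

Because C C^T = I, the inversion (4.2) is just multiplication by the transpose; no matrix inverse is ever computed.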
Notice that the DCT is closely related to the discrete Fourier transform (DFT), although it is not the real part of the unitary DFT. In fact, it is possible to compute the DCT via the DFT by symmetrically extending the original sequence [1]. The DCT is often used in signal and image processing, especially for lossy data compression, because it has a strong energy compaction property: most of the signal information tends to be concentrated in a few low-frequency components of the DCT, approaching the Karhunen-Loève transform (which is optimal in the decorrelation sense) for signals based on certain limits of Markov processes [63]. There are eight standard DCT variants, of which four are common. The definition given above is also called the type-II DCT, which is the one used in image compression.

The 1-D DCT is useful in processing one-dimensional signals such as speech waveforms. For analysis of 2-D signals such as images, we need a 2-D version of the DCT, which is explained in detail in the next subsection.

4.2.2 2-D DCT

In JPEG compression, the 2-D block DCT is used for the transformation. The original image is divided into 8×8 blocks, and a DCT is applied to each block. The 2-D DCT is defined as

C(k, l) = α(k)α(l) Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} c(m, n) cos[π(2m+1)k / (2N)] cos[π(2n+1)l / (2N)],   (4.3)

where α(0) = √(1/N) and α(k) = √(2/N) for 1 ≤ k ≤ N−1. The discrete cosine basis functions for N = 8 are shown in Fig. 4.5. For an m×n matrix, the 2-D DCT is computed in a simple way: the 1-D DCT is applied to each row and then to each column of the result. Since the 2-D DCT can be computed by applying 1-D transforms separately to the rows and columns, the 2-D DCT is separable in the two dimensions, like the 2-D DFT [62]. To compute a block DCT (BDCT), we do not actually have to divide the image into blocks.
Since the 2-D DCT is separable, we can partition each row into sequences of length 8, apply the DCT to them, rejoin the resulting lists, and then transpose the whole image and repeat the process. Separability is in fact one of the numerous advantages of the DCT over the Karhunen-Loève transform (KLT), since it makes the implementation of the DCT highly efficient.

4.3 Quantization

Quantization is the procedure in which the DCT coefficients are divided by a specially designed table of numbers and then rounded so that some of the coefficients become zero. The entropy coding that follows quantization can then make use of the redundancy of the information [60]. There are various ways to design a quantization table, and correspondingly, different sorts of quantization algorithms exist. Among these, scalar quantization quantizes the DCT samples individually, while vector quantization quantizes all the samples jointly. A simple example of scalar quantization is a function that maps each element in a subset of the real line to a particular value in that subset (Fig. 4.6). Other quantization algorithms include the Lloyd-Max scalar quantizer [64, 65], the generalized Lloyd algorithm [66], trellis-coded quantization [67], etc. In our discussion of quantization in this work, we assume that simple scalar quantization (Fig. 4.6) is used, to simplify the presentation. For the quantization table, the standard table for luminance (4.4) is used in the experiments:

Q = [ 16  11  10  16  24  40  51  61
      12  12  14  19  26  58  60  55
      14  13  16  24  40  57  69  56
      14  17  22  29  51  87  80  62
      18  22  37  56  68 109 103  77
      24  35  55  64  81 104 113  92
      49  64  78  87 103 121 120 101
      72  92  95  98 112 100 103  99 ]   (4.4)

Notice that in (4.4), the quantization coefficients increase roughly from the upper left corner to the lower right corner.
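The separable, no-explicit-blocks evaluation of the BDCT described in Section 4.2.2 can be sketched as follows; `dct_mat` rebuilds the type-II DCT matrix of (4.1), and the 16×16 random image is only a stand-in:

```python
import numpy as np

def dct_mat(N=8):
    """Type-II DCT matrix of (4.1)."""
    n = np.arange(N)
    k = n.reshape(-1, 1)
    a = np.where(k == 0, np.sqrt(1.0 / N), np.sqrt(2.0 / N))
    return a * np.cos(np.pi * (2 * n + 1) * k / (2 * N))

def bdct_separable(img, B=8):
    """Block DCT without explicitly cutting the image into blocks:
    8-point DCTs along row segments, then the same after transposing."""
    C = dct_mat(B)
    M, N = img.shape
    # transform each length-B segment of every row
    rows = (C @ img.reshape(M * (N // B), B).T).T.reshape(M, N)
    # transpose, repeat on the (former) columns, transpose back
    cols = (C @ rows.T.reshape(N * (M // B), B).T).T.reshape(N, M).T
    return cols

rng = np.random.default_rng(1)
img = rng.random((16, 16))
out = bdct_separable(img)
```

The per-block 2-D DCT, C X C^T for each 8×8 block X, gives the same result; the row/column passes simply reproduce it without dividing the image into blocks.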
It can be shown [68] that the DCT approximately diagonalizes the covariance matrix of a first-order Gauss-Markov random process. In fact, one of the properties of any wide-sense stationary (WSS) random process is that the Fourier coefficients are uncorrelated. That is, as N in (4.3) becomes very large, the DFT, the DCT and other related frequency transforms all diagonalize the source covariance matrix [69]. Thus, these transforms are asymptotically equivalent to the KLT for WSS sources, up to a reordering of the coefficients [59]. One suitable data-independent ordering for the DCT coefficients is the "zig-zag" scan shown in Fig. 4.7. This order is based on the observation that the power density spectra of most images tend to decrease rapidly with increasing spatial frequency; it is employed by the JPEG image compression standard and most video compression standards. Although most images are not well modeled as WSS random processes, the DCT has been found to be a robust approximation to the KLT for natural image sources [59].

4.4 Deblocking JPEG Images

4.4.1 Blocking Effect

It has been shown that the DCT is an asymptotic approximation to the optimal Karhunen-Loève transform when the statistical properties of the image can be described by a first-order Markov model. Furthermore, it has been demonstrated that most images encountered in visual communication applications are modeled extremely well by first-order Markov models [1]. Because the DCT can be computed very efficiently using fast algorithms similar in nature to the well-known fast Fourier transform (FFT), it has been recommended by both the Joint Photographic Experts Group (JPEG) and the Moving Picture Experts Group (MPEG) for compression of still images and sequences of motion images, respectively [70]. Although JPEG-compressed images are good approximations to the uncompressed images, most JPEG images have the problem of visible block boundaries due to coarse quantization of the coefficients [71].
This problem becomes especially serious when the DC or near-DC coefficients are coarsely quantized. Therefore, more bits are usually required for the DC or near-DC coefficients than for other coefficients. An example of the blocking effect in the Lena image is shown in Fig. 4.8. To cope with the blockiness problem without increasing the bit rate, a variety of postprocessing approaches have been proposed:

1. Low-pass filtering on boundary pixels [72, 73, 74, 75].

2. Postprocessing algorithms based on estimation theory. For example, Stevenson [76] proposed a postprocessing algorithm based on the maximum a posteriori probability approach. Yang et al. [77] proposed a postprocessing algorithm based on noise estimation; the algorithm estimates the quantization noise causing the blocking artifacts and subtracts the estimated noise from the quantized images. There are other approaches based on the wavelet representation. Xiong et al. [78] proposed a wavelet-based postprocessing algorithm, which employs an edge classifier using overcomplete wavelet representations. Choi and Kim [79] proposed a postprocessing algorithm in the subband domain; they employed a minimum mean square error filter to remove the blocking artifacts located at the block boundaries of the subbands. Nosratinia [80] proposed a postprocessing algorithm based on the reapplication of JPEG compression: the postprocessed image is obtained by averaging the images obtained from reapplying JPEG compression with various shifts. This algorithm is known to provide an excellent result among current postprocessing algorithms; however, its computational complexity is high due to the multiple reapplications of JPEG compression.

3. Iterative algorithms based on projection onto convex sets (POCS) or constrained least squares have also been proposed. Youla and Webb [81] introduced the theory of POCS to restore images.
To postprocess DCT-encoded images, Zakhor [82] employed the quantization constraint set (QCS) as a convex constraint set, together with the lowpass filter used by Reeve and Lim [72]; the algorithm consists of iterating lowpass filtering and projection onto the QCS. As pointed out by Reeves and Eddins [83], this iterative algorithm is based on the theory of constrained least squares. Based on the POCS theory, Yang et al. [84] introduced a projection operator onto a smoothness constraint set, which smooths the blocking artifacts instead of using the linear lowpass filter. Park and Kim [85], on the other hand, proposed a narrow QCS (NQCS) to improve the performance of QCS-based postprocessing algorithms. They also extended the notion of the NQCS to vector-quantized signals by introducing hypercubes. At high bit rates, employing the NQCS in a POCS-based postprocessing algorithm can preserve details and edges from undesirable blurring without any sophisticated, edge-oriented classifiers or empirically determined thresholds [86].

4. Non-iterative algorithms have several advantages over iterative algorithms. In order to secure the convergence of an iterative algorithm, the operators must meet several requirements; for example, the operators should be projectors if the iteration is based on the POCS theory. In non-iterative algorithms, on the other hand, the operators can be more flexible. Liew and Yan [87] proposed a simple, non-iterative, wavelet-based deblocking algorithm. The algorithm exploits the fact that block discontinuities are constrained by the DC quantization interval of the quantization table, as well as the behavior of the wavelet modulus maxima evolution across scales for singular image structures, to derive appropriate threshold maps at different wavelet scales.
Postprocessing algorithms do not modify the compression system itself and can thus be directly applied to commercial products, such as the JPEG and MPEG compression systems [86]. Since the blocking effect is primarily due to the inability of the DCT to exploit inter-block correlations, most of the above approaches exploit correlations of intensity values in neighboring blocks.

4.4.2 Deblocking with Regularization

Mathematical Model

In this section, we propose a new algorithm that operates within the decompression process. This method makes use of the regularized restoration described in Chapter 2 and is closely related to the lowpass filtering in [72, 73, 74, 75], but it differs in that it is not a postprocessing algorithm. In regularized restoration, the objective is to minimize the cost function

ℓ(f̂) = ‖g − Hf̂‖² + α‖Lf̂‖²,   (4.5)

where L is a high-pass filter that imposes a smoothness condition on the restored image and α is the Lagrangian multiplier. If we model JPEG compression as a degradation of the image, we can describe the degradation process as

y = HQDx,   (4.6)

where y is the compressed data and H, Q and D are the Huffman coding, quantization and BDCT, respectively. To simplify the problem, we can remove the Huffman coding, since it is a lossless procedure:

y = QDx,   (4.7)

where y is now the data before entropy coding, and Q and D are the quantization and BDCT. Furthermore, the distortion introduced by coarse quantization is not linear in reality and can be modeled as noise. Normally, if uniform quantization is assumed, the noise is correlated with the data. However, if the allocated bit rate is high enough, we may assume that the noise and the input are uncorrelated with each other [88]. This assumption does not hold at low bit rates, but it is commonly used for the analysis of quantization error, as the problem becomes extremely complicated otherwise, and we often find that the results carry over to the low bit-rate case as well [89].
Using this assumption, we simplify the degradation process to

y = Dx + n.   (4.8)

Now we have the mathematical equivalent of a standard deblurring problem where we are given the blurred image y and the blurring matrix D; the objective is to find the original image x. We assume that the noise n is uncorrelated with the signal and thus can be added to the model. As described in (4.5), the estimate x̂ should be the minimizer of the function

ℓ(x̂) = ‖y − Dx̂‖² + α‖Lx̂‖².   (4.9)

For this specific case of decompressing a JPEG image, L would be a high-pass filter that affects not only the natural edges in the image but also the block boundaries of each 8×8 block. Since the block boundaries and the natural edges are not correlated, we can separate the penalty into two operators: L_D for natural edges and L_e for block boundaries. The cost function then becomes

ℓ(x̂) = ‖y − Dx̂‖² + α‖L_D x̂‖² + β‖L_e x̂‖²,   (4.10)

where β is another Lagrangian multiplier, controlling the smoothing over the block boundaries. Taking the derivative of ℓ(x̂) with respect to x̂ and setting it to 0, we have

x̂ = (D^t D + αL_D^t L_D + βL_e^t L_e)^{−1} D^t y.   (4.11)

Following the technique in Chapter 2, we can apply the Sherman-Morrison matrix inversion lemma and then divide the solution into two parts: one with the diagonalizable matrix, which we can easily invert in the transform domain, and the other that must be inverted directly but is easy to invert because it is small. Notice that L_e acts on all the block boundaries inside the image (dashed lines in Fig. 4.9). To avoid an overcomplicated L_e matrix, we divide it into two matrices that deal with the vertical and horizontal block boundaries separately (dashed lines in Fig. 4.10 and Fig. 4.11). L_e thus becomes L_ve and L_he, where L_ve is the smoothing matrix on vertical boundaries and L_he is the smoothing matrix on horizontal boundaries. Considering that L_ve has little effect on horizontal boundaries and L_he has little effect on vertical boundaries, the correlation between them can be ignored.
The cost function becomes

ℓ(x̂) = ‖y − Dx̂‖²_S + α‖L_D x̂‖² + β‖L_ve x̂‖² + β‖L_he x̂‖²,   (4.12)

where S is a diagonal weighting matrix. Taking the derivative with respect to x̂ and setting it to 0, we have

x̂ = (D^t SD + αD^t Λ_D D + β(L_ve^t L_ve + L_he^t L_he))^{−1} D^t Sy,   (4.13)

where Λ_D is the diagonal matrix representing the smoothing operation L_D^t L_D in the cosine domain,

Λ_D = D L_D^t L_D D^t,   (4.14)

so that

L_D^t L_D = D^t Λ_D D.   (4.15)

To apply the Sherman-Morrison matrix inversion lemma, we define

A = S + αΛ_D.   (4.16)

Plugging (4.16) into (4.13), we have

x̂ = (D^t AD + β(L_ve^t L_ve + L_he^t L_he))^{−1} D^t Sy.   (4.17)

So far we have been trying to separate the vertical and horizontal smoothing L_ve and L_he, but in (4.17) they still appear together. To simplify the computation, we consider the following: since the DCT is orthogonal, i.e.,

D^t D = DD^t = I,   (4.18)

we have

(D^t A^{1/2}D + βL_ve^t L_ve D^t A^{−1/2}D)(D^t A^{1/2}D + βD^t A^{−1/2}D L_he^t L_he)
= D^t AD + β(L_ve^t L_ve + L_he^t L_he) + β² L_ve^t L_ve D^t A^{−1}D L_he^t L_he.   (4.19)

Considering that β² is very small and L_ve and L_he are almost uncorrelated, the last term β² L_ve^t L_ve D^t A^{−1}D L_he^t L_he is relatively insignificant. Then (4.19) can be rewritten as

(D^t A^{1/2}D + βL_ve^t L_ve D^t A^{−1/2}D)(D^t A^{1/2}D + βD^t A^{−1/2}D L_he^t L_he) ≈ D^t AD + β(L_ve^t L_ve + L_he^t L_he).   (4.20)

Then (4.17) can be rewritten as

x̂ = ((D^t A^{1/2}D + βL_ve^t L_ve D^t A^{−1/2}D)(D^t A^{1/2}D + βD^t A^{−1/2}D L_he^t L_he))^{−1} D^t Sy
  = (D^t A^{1/2}D + βD^t A^{−1/2}D L_he^t L_he)^{−1}(D^t A^{1/2}D + βL_ve^t L_ve D^t A^{−1/2}D)^{−1} D^t Sy.   (4.21)

In (4.21), L_ve and L_he can be handled separately. Applying the Sherman-Morrison matrix inversion lemma to the first inverse, (D^t A^{1/2}D + βD^t A^{−1/2}D L_he^t L_he)^{−1}, in (4.21),

(D^t A^{1/2}D + βD^t A^{−1/2}D L_he^t L_he)^{−1}
= D^t A^{−1/2}D − βD^t A^{−1/2}D D^t A^{−1/2}D L_he^t (I + βL_he D^t A^{−1/2}D D^t A^{−1/2}D L_he^t)^{−1} L_he D^t A^{−1/2}D
= D^t A^{−1/2}D − βD^t A^{−1}D L_he^t (I + βL_he D^t A^{−1}D L_he^t)^{−1} L_he D^t A^{−1/2}D.   (4.22)

Since I + βL_he D^t A^{−1}D L_he^t is a much smaller matrix than A, this inversion is relatively fast.
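The identity applied in (4.22) is an instance of the Sherman-Morrison-Woodbury lemma, (B + βU^t U)^{−1} = B^{−1} − βB^{−1}U^t(I + βU B^{−1} U^t)^{−1} U B^{−1}, in which only a small m×m system is inverted. A generic numeric check is sketched below; the random diagonal B and wide U are made up for illustration, standing in loosely for the diagonal A and the boundary operator L_he:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 12, 3                      # full size n, "small" boundary size m
beta = 0.05

B = np.diag(rng.uniform(1.0, 2.0, n))   # stand-in for the diagonal matrix A
U = rng.standard_normal((m, n))         # stand-in for a boundary-smoothing operator

Binv = np.diag(1.0 / np.diag(B))
# Sherman-Morrison-Woodbury: only an m x m matrix is inverted
small = np.linalg.inv(np.eye(m) + beta * U @ Binv @ U.T)
woodbury = Binv - beta * Binv @ U.T @ small @ U @ Binv

direct = np.linalg.inv(B + beta * U.T @ U)   # full n x n inverse, for comparison
```

The direct n×n inverse costs O(n³), while the right-hand side only inverts an m×m matrix; this is the source of the speedup when the boundary operator has few rows.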
Similarly, we have

(D^t A^{1/2}D + βL_ve^t L_ve D^t A^{−1/2}D)^{−1}
= D^t A^{−1/2}D − βD^t A^{−1/2}D L_ve^t (I + βL_ve D^t A^{−1/2}D D^t A^{−1/2}D L_ve^t)^{−1} L_ve D^t A^{−1/2}D D^t A^{−1/2}D
= D^t A^{−1/2}D − βD^t A^{−1/2}D L_ve^t (I + βL_ve D^t A^{−1}D L_ve^t)^{−1} L_ve D^t A^{−1}D.   (4.23)

The techniques for computing the above expressions are described in Chapter 2.

Experiment

In this section, experiments were conducted using 64×64 and 128×128 sub-images from the Lena image. The 64×64 example is shown in Fig. 4.12. All computation was done on a workstation with an AMD Opteron 280 processor. The computation time for a 64×64 image is 26 seconds; by applying block Gauss-Seidel to the matrices, it can be implemented in 18 seconds. The mean square errors of the compressed JPEG images and the deblocked images are shown in Table 4.1.

Table 4.1: Deblocking JPEG
Image size | Time (s) | MSE of original | MSE of deblocked
64         | 18       | 62              | 58
128        | 119      | 231             | 201

Notice that in Table 4.1, the difference in MSE between the original JPEG image and the deblocked image is not as considerable as in a regular deblurring problem. One of the reasons is that deblocking blurs along block boundaries, thus degrading the image to some extent. At the price of a slightly blurrier image, the visual quality is improved, since the blocking effect is reduced. The trade-off between a blurry image and a blocky image can be controlled by choosing the Lagrangian multiplier β.

4.5 Summary

In this chapter, we developed a new iterative deblocking algorithm for JPEG-compressed images that is designed as part of the decompression process. It makes use of the shift-variant regularization method to smooth across block boundaries and thus reduce the blocking effect. Compared to other deblocking algorithms based on blurring [72, 73, 74, 75], it is not a postprocessing method. By applying the Sherman-Morrison matrix inversion lemma and block Gauss-Seidel, this algorithm is promising in computation time.
Although it is an iterative method, its convergence can be proved in the same fashion as in Chapter 2. The result also shows a smoother image with fewer blocking effects.

Figure 4.1: JPEG-compressed image: (a) original Lena; (b) JPEG-compressed Lena with a low quality factor (Q = 25)
Figure 4.2: JPEG encoder (original image → block DCT → quantizer, with quantization table → entropy encoder, with Huffman table → compressed data)
Figure 4.3: JPEG decoder (compressed data → entropy decoder, with Huffman table → dequantizer, with quantization table → inverse block DCT → reconstructed image)
Figure 4.4: The eight basis vectors for the discrete cosine transform of length eight
Figure 4.5: Discrete cosine basis functions for N = 8
Figure 4.6: Scalar quantization
Figure 4.7: Zigzag scan
Figure 4.8: Blocking effect: (a) original Lena image; (b) blocking effect in the Lena image
Figure 4.9: Block boundaries
Figure 4.10: Vertical boundaries
Figure 4.11: Horizontal boundaries
Figure 4.12: Deblocking the Lena image: (a) JPEG-compressed Lena image; (b) Lena's hat with blocking effect; (c) Lena's hat with reduced blocking effect

Chapter 5
Conclusion and Discussion

Image restoration is a computationally intensive image processing task. Many image processing situations involve solving inverse problems for either block-Toeplitz-with-Toeplitz-block (BTTB) or block-circulant-with-circulant-block (BCCB) systems. In this work, efficient schemes for solving BCCB systems using FFTs are proposed. Since a BTTB system can be extended to a BCCB system, it is a near-circulant system and can be solved using a similar technique. Efficient algorithms for solving BCCB and BTTB systems are proposed in this work to reduce ringing artifacts in the restored image and to reduce computation time. The same techniques are also applied to JPEG images to reduce the blocking effect.
5.1 Edge-Preserving Regularization

In Chapter 2, we proposed a method, DEEPIR. MAP estimation, a Gauss-Markov random field, and the Huber function are used as the optimization criterion in this context. The optimization criterion is then modified with a majorization technique to achieve a more efficient algorithm. Finally, the majorized criterion is minimized by a fast implementation in which the restoration is decomposed into a sum of two independent restorations: one yields an image that comes directly from an FFT restoration; the other involves a set of unknowns whose number equals the number of weights in the matrix that deviate from 1 (the weights associated with edges). By summing the two, the ringing artifacts from the FFTs are canceled. Because the second restoration has a significantly reduced set of unknowns, it can be calculated very efficiently even though no circular convolution structure exists. Since we do not have the original image from which to derive edge information, an iterative method is used. Although fixed-point iterations often do not converge, this algorithm with the Huber function converges monotonically to the minimizer. DEEPIR compares favorably to conjugate gradients in computational time. It can be improved considerably by dividing the system into blocks using a modified block Gauss-Seidel procedure. A block-based approximation technique is also proposed to yield an approximation of the direct solution at a much reduced computational cost.

5.2 Fast Algorithm for Solving Block Banded Toeplitz Systems with Banded Toeplitz Blocks

In Chapter 3, we developed a new algorithm that extends a banded M² × M² BTTB system with bandwidth k to an (M + k − 1)² × (M + k − 1)² BCCB system and solves it with circular deconvolution. In order to obtain the BCCB system, a smaller system Yz = −Wy needs to be solved. The generalized Schur algorithm is used to obtain an LU decomposition of Y, and then forward and back substitution are used to solve for z.
The overall algorithm is dominated by the cost of computing the LU decomposition of Y, for an overall operation requirement of O(k²M³). Thus, both the block-band structure and the banded block structure are exploited, compared to the full-structure operation count of O(M⁵). The new algorithm is applicable to arbitrary square BTTB systems without any assumption about the original PSF or the size of the system. Notice that in the BTTB system, the block size does not need to equal the number of blocks. This algorithm works best when k is small, i.e., when the Toeplitz structure has a small bandwidth. As k approaches M, the efficiency approaches the non-banded algorithm efficiency of O(M⁵).

5.3 Deblocking of JPEG-compressed Images

In Chapter 4, we applied a scheme similar to that in Chapter 2 for deblocking JPEG-compressed images. The method applies the Sherman-Morrison matrix inversion lemma and the block Gauss-Seidel method to reduce computation. It differs from postprocessing algorithms in that it is involved in the decompression process, but it is similar to some postprocessing methods that are based on blurring. The deblocked image looks smoother, and the computation time is promising.

5.4 Future Work

1. The efficient algorithm DEEPIR in Chapter 2 leads to several issues and opportunities. First, we have flexibility in defining the threshold for the regularization weights by adjusting the value of T. The fewer edge weights we allow, the faster the restoration will be. A small T gives a sharp restoration, though it can be noisy if any variation in gray level is mistaken for an edge; a large T reduces the number of edges and thus reduces computation, but results in blurry edges. Second, the block-based method can be used as a preconditioner in CG, since it is considered an approximation to the original edge-preserving regularization.
Finally, we have shown that a similar approach can address the fact that real-world blurring does not conform to circular convolution but rather to linear convolution followed by windowing. This approach can easily be incorporated into the algorithm described above, and it thus provides an efficient method for shift-variant regularization.

2. The deblocking method in Chapter 4 reduces the blocking effect by blurring between blocks. Since we made an approximation in (4.20), the deblocked image might not be the exact reconstruction from the compressed data, which introduces some error. The smoothness (or blockiness) is controlled by the Lagrange multiplier. The computation time can be further reduced with a block implementation as described in Chapter 2.
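The thresholding behavior discussed in item 1 above can be made concrete. In the half-quadratic/majorization view of the Huber penalty, first differences with magnitude at most T fall in the quadratic regime and get weight 1, while larger differences (edges) are down-weighted; the number of weights that deviate from 1 is exactly what governs the cost of DEEPIR's edge-correction step. The sketch below uses the standard Huber weighting w(d) = 1 for |d| <= T and T/|d| otherwise; the function name is illustrative, and the dissertation's precise formulation may differ in detail.

```python
import numpy as np

def huber_edge_weights(img, T):
    """Regularization weights from the Huber penalty on first differences.

    Differences with |d| <= T (smooth regions) get weight 1; larger
    differences (edges) get T/|d| < 1.  Raising T shrinks the set of
    weights != 1, trading edge sharpness for speed.
    """
    dh = np.diff(img, axis=1)            # horizontal first differences
    dv = np.diff(img, axis=0)            # vertical first differences

    def w(d):
        a = np.abs(d)
        return np.where(a <= T, 1.0, T / np.maximum(a, 1e-30))

    return w(dh), w(dv)
```

With a small T, even mild gray-level variations cross the threshold and are treated as edges (sharp but possibly noisy restorations); with a large T, few weights deviate from 1, so the correction system is small and fast but edges blur.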