A Brief History of Nonparametric Texture Synthesis
There has been a lot of work in the area of texture synthesis. The area that I am more familiar with is the area of nonparametric texture synthesis, particularly "synthesis from example". So here I have tried to present a very brief history of nonparametric texture synthesis. For a starting point I have chosen Popat's work.
Popat, '93: Novel cluster-based probability model for texture synthesis, classification, and compression.
This texture synthesis algorithm can be best classified as a nonparametric Markov chain synthesis algorithm. The basis of the algorithm is to order the pixels and then synthesis a new pixel from a nonparametric representation of the conditional probability function derived from samples of the input texture. Popat proposed to use stochastic sampling of the conditional probability function and also to compress the conditional probability function via a set of Gaussian kernels. This compression allowed for fast look ups, but limited the neighbourhood order that could be successfully modelled.
The one flaw in Popat's approach was that it was causal. That is the synthesis was performed in a sequential sequence starting from a "seed" and gradually moving further away. The problem with this type of approach is that if the past pixels start to deviate too far from those seen in the input image, the synthesis algorithm can get lost in a space where only garbage is produced. This severely limits the algorithm for synthesising large areas of texture.
Heeger and Bergen, '95: Pyramid based texture analysis/synthesis.
This paper was one of the first papers to show the possibility of synthesising colour textures to the graphics community. They did it using a combination of Laplacian and Steerable pyramids to deconstruct an input texture. The histograms from each of the pyramid levels was used to reconstruct a similar pyramid. Unfortunately the deconstruction was not orthogonal, which meant that Heeger and Bergen had to use an iterative approach of matching the histograms and expanding and reducing the pyramid.
In the end Heeger and Bergen got some very nice results, but their technique was limited to synthesising basically stochastic homogeneous textures with minimal structure.
Paget and Longstaff, '95, '96, '98: Texture synthesis via a nonparametric Markov random field model.
The problem with the sequential approach of Popat's algorithm is solved here. Paget and Longstaff introduced a top down approach, where the frequency components of a texture are gradually introduced into a synthetic texture from low to high frequency. This overcame a lot of problems inherent in a sequential approach which was prone to having the synthesis algorithm wander into a non recoverable "no man's land."
The noncausal nature of Paget and Longstaff's algorithm not only could synthesise a texture to any size, but also better supported the idea that arbitrary textures could be modelled by a Markov random field. The number and range of textures that could be synthesised by their scheme also showed that nonparametric Markov random field models were the models of choice for natural textures that included both stochastic and deterministic properties. However, the one draw back to their scheme was speed.
- Sampling is not performed over all possible colour values, but only over those values that occur with in the original input texture [Section 7.6.1].
- Given the sampling method of iterative condition modes (ICM) and the large sparse nature of the local conditional probability density function (LCPDF), the sampling algorithm can be computationally approximated as a minimum distance look up algorithm.
De Bonet, '97: Multiresolution sampling procedure for analysis and synthesis of texture images.
De Bonet's method could be considered a variant of Heeger and Bergen pyramid based texture analysis/synthesis method. It overcomes the iterative requirement of Heeger and Bergen's method by enforcing a top down philosophy akin to Paget and Longstaff's approach. That is, restricting the sampling procedure by conditioning it on the previous results from coarser resolutions in the decomposed pyramid. In De Bonet's method texture structure is also better handled than in Heeger and Bergen's method by further restricting the sampling procedure to pixels that fall within a threshold determined by texture features. Although De Bonet's method performed better than Heeger and Bergen's method for a wider variety of textures, the tuning of the threshold parameters is not as intuitive as selecting a neighbourhood size as in Paget and Longstaff's method. Also De Bonet's method was very sensitive to the choice of these threshold parameters, which if chosen incorrectly detrimentally affected the output texture.
Zhu, Wu, and Mumford, '97, '98: FRAME : Filters, random fields and maximum entropy towards a unified theory for texture modelling.
This paper amalgamated the filter technology and MRF technology to produce the so-called FRAME model. It did this by comparing the histograms of both the filter responses from the original texture and that of the synthetic. The synthetic texture was then continually updated with respect to an evolving MRF probability function, that was defined with respect to modulated filter responses. The modulation was defined with respect to differences between the expected filter response using the current MRF probability function and the filter response from the original texture. All this avoided the messy process of trying to reconstruct a texture from arbitrary filter responses and wrapped it all in some nice clean mathematics, but the synthesis/modelling process was very slow.
Simoncelli and Portilla, '98: Texture characterisation via joint statistics of wavelet coefficient magnitudes.
A similar technique to that of Heeger and Bergen, but where Heeger and Bergen updated the complete filter response using histogram equalisation, Simoncelli and Portilla updated each point in the pyramid of filter responses with respect to the correlations using a method similar to projection onto convex sets (POCS). They did this by finding an orthogonal projection from the filter response of the synthetic texture to that of the original. After the projection of all filter responses, the wavelet pyramid was collapsed, further projection was performed, and then the pyramid was reconstructed. This iteration continued until a convergence was reached. Simoncelli and Portilla found that only a few minutes of processing time was required to produce reasonable results. However they still had some failures, and had difficulty maintaining fidelity with textures containing structure.
Now that the mechanics of texture synthesis had basically been worked out, people started to look for more simpler, faster and robust methods that were more accessible to the graphics community.
Efros and Leung, '99: Texture synthesis by non-parametric sampling.
Efros and Leung gave a rendition of Popat's work. However in their case they explicity used the exhaustive nearest neighbour searching. They also initialised the synthetic texture with an arbitrarily placed patch of texture. However this just changed the order in which the pixels were synthesised, and did not change the fact that it was a sequential nonparametric Markov chain type of approach as proposed by Popat, which had inherent stability problems.
Wei and Levoy, '00: Fast texture synthesis using tree-structured vector quantisation.
Wei and Levoy, also used Popat's approach. However they did not change the synthesis order for a very good reason, speed. Keeping the synthesis order as proposed by Popat, meant that the Markov neighbourhood was fairly consistent over the whole synthesis process. This allowed for data compression. Popat had used Gaussian kernels to define a pdf, Wei and Levoy used tree-structured vector quantisation to quickly search for the nearest neighbour. As highlighted by Efros and Leung, synthesis via nearest neighbour gave superior results.
Zhu, Liu and Wu, '00: Julesz ensemble.
This is a cut down version of their previous work, FRAME : Filters, random fields and maximum entropy towards a unified theory for texture modelling. Here they propose a slightly faster algorithm for synthesis, one that does not need to estimate the model parameters. Instead of creating a pdf for a Gibbs ensemble, they directly use the statistics from the filters to do the sampling via the Markov chain Monte Carlo algorithm.
Xu, Guo and Shum, '00: Chaos mosaic: fast and memory efficient texture synthesis.
Xu, Guo and Shum, '00: Patch-based texture synthesis.
These two papers saw the birth of the so-called "patch-based" texture synthesis. Instead of copying one pixel at a time from an input image, they copy whole patches. In this case, these algorithms randomly distribute patches from an input texture onto an output lattice, smoothing the resulting edges between overlapping patches with simple cross-edge filtering.
Liang et.al.,'01: Real-time texture synthesis by patch-based sampling.
This paper highlights the problems with both Efros and Leung's and Wei and Levoy's sequential based nonparametric Markov chain type of approach. It also displays specific failures. It is interesting to note that these failures do not occur in the previous work of Paget and Longstaff. Anyway, Liang et.al. propose a solution for Efros and Leung's and Wei and Levoy's models. Instead of copying one pixel at a time, they copy a whole patch. This also leads to a marked speed increase in synthesis time.
Turk, '01: Texture synthesis on surfaces.
Turk shows how the nonparametric MRF techniques of Paget and Longstaff and Wei and Levoy can be extended to texture synthesis on arbitrary meshes. He uses a similar coarse to fine synthesis approach as presented in both papers. He also supplements the synthesis process with a directional field over the mesh to help maintain a consistent orientation to the texture.
Harrison, '01: A non-hierarchical procedure for re-synthesis of complex textures.
Again another paper that highlights the inadequacies of Efros and Leung's sequential based nonparametric Markov chain texture synthesis algorithm. In this case, the solution that is presented is one in which the synthesis order is manipulated to correspond to particular directional characteristics of the input texture.
Ashikhmin, '01: Synthesising natural textures.
Here we have the first real solution to the time consuming procedure of exhaustive nearest neighbour searching, without loss of quality as in Wei and Levoy's approach. In fact Ashikhmin's method actually gives both an increase in synthesis quality and speed. In his seminal paper, he proposes a new measure of nearest neighbour instead of either the Manhatten or Euclidean distance, as he suggests that these may not be the best measure to test for perceptual similarity. He notes that if we are only taking pixel colours from the input image (and not sampling from a larger distribution), then when we synthesise a colour for a pixel, we can be assured that each of its defined neighbours correspond to a pixel within the input image. Speed can be gained if, instead of doing a exhaustive search, we only sample from those pixels with a corresponding neighbour.
Unfortunately he applied his new technique to Wei and Levoy's algorithm, which meant that his results were not as good as they should have been. Applying his technique to Paget and Longstaff's algorithm, we get an algorithm that is arguable one of the best for synthesising natural textures. In this fast version I have slightly modified Ashikhmin's approach by sampling from all pixels which have at least one of its neighbours the same colour as its respective neighbour in the synthetic texture.
Hertzmann et.al., '01 Image analogies: a general texture transfer framework.
Although Ashikhmin's technique does very well for natural textures, sharp phase discontinuities can occur with textures that contain a high degree of structure. In these cases, the Euclidean distance combined with exhaustive nearest neighbour searching gives a smoother transition between these discontinuities. Hertzmann et.al. recognised this and proposed their algorithm which uses both measures of perceptual similarity. They then used a heuristic measure to decide which sample is used in the synthetic texture.
Efros and Freeman, '01: Image quilting: stitch together patches of input image, texture transfer.
Developed concurrently with Liang et.al's approach, Efros and Freeman take patch-based texture synthesis a step further. Instead of blending overlapping edges with a filter, they propose cutting and joining the respective patches along a boundary for which the difference in pixel values is minimal.
Zelinka and Garland, '02: Towards Real-Time Texture Synthesis with the Jump Map.
What if nearest neighbour comparisons could be avoided during synthesis? This is what Zelinka and Garland try to accomplish by creating a k nearest neighbour lookup table as part of an input texture analysis stage. They then use this table to make random jumps (like as in Schödl el.al's video textures) during their sequential texture synthesis stage. No neighbourhood comparisons are done during synthesis, which makes the algorithm very fast.
Tong et.al., '02: Synthesis of Bidirectional Texture Functions on Arbitrary Surfaces.
Tong et.al also use a k nearest neighbour lookup table, but instead of defining random jump paths like Zelinka and Garland, they simply use the list of k nearest neighbours as a sample base from which to choose the neighbourhood that gives the minimal Euclidean distance measure. When k equals one, this method is comparable to Ashikhmin's method.
Tong et.al. noted that k should be set depending on the type of texture being synthesised. For natural textures where the high frequency component is desired, a low k should be used, and for other textures where better blending is required, then a relatively high k should be used.
In this version of Paget and Longstaff's algorithm, I have implemented Tong et.al's nearest neighbour search strategy. Also the multiresolution is implemented via a Gaussian pyramid of the input image (instead of a decimated quadtree as in Paget and Longstaff's algorithm). Although there is an improvement in blending, the results are not as good for natural textures as in the fast version.
Nealen and Alexa, '03: Hybrid Texture Synthesis.
Further modification of Liang et.al's patch-based texture synthesis. This work was also inspired by Soler, Cani and Angelidis's hierarchical pattern mapping. Nealen and Alexa's algorithm uses adaptive patches to fill a lattice. These patches are chosen with respect to a minimum border error. An error that is quickly calculated via an FFT. Mismatches in the overlapping border above a certain threshold are resynthesised.
Kwatra, Schödl, Essa, Turk and Bobick, '03: Graphcut textures: Image and video synthesis using graph cuts.
An advanced version of Efros and Freeman's image quilting. Kwatra et.al. use a graph cut algorithm called min-cut or max-flow, Kwatra et.al. also uses Soler et.al.'s FFT-based acceleration of patch searching via sum-of-squared differences. Their algorithm is also adept at video texture synthesis.