Pratim Ghosh
We propose an image duplicate detection method for identifying modified copies of the same image in a very large database. The modifications we consider include rotation, scaling and cropping. A compact 12-dimensional descriptor based on the Fourier-Mellin Transform is introduced. The compactness of this descriptor allows efficient indexing over the entire database. Results presented on a 10 million image database demonstrate the effectiveness and efficiency of this descriptor. In addition, we propose extensions to arbitrary shape representations and similar scene detection, and include preliminary results.
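As a rough illustration of how a Fourier-Mellin-style descriptor can be built, the sketch below takes the FFT magnitude, resamples it on a log-polar grid, takes a second FFT magnitude, and keeps a small low-frequency block. It is not the descriptor from the paper: the function name `fmt_descriptor`, the grid sizes, the nearest-neighbour sampling and the 3 x 4 block that yields 12 values are all assumptions made for this example.

```python
import numpy as np

def fmt_descriptor(image, n_radii=32, n_angles=32, keep=(3, 4)):
    """Hypothetical Fourier-Mellin-style descriptor (illustrative only)."""
    img = np.asarray(image, dtype=float)
    # Step 1: the FFT magnitude discards phase, so circular translations vanish.
    mag = np.fft.fftshift(np.abs(np.fft.fft2(img)))
    cy, cx = mag.shape[0] // 2, mag.shape[1] // 2
    # Step 2: log-polar resampling (nearest neighbour). Rotation and scaling of
    # the image become shifts along the angle and log-radius axes.
    max_r = min(cy, cx) - 1
    radii = np.exp(np.linspace(0.0, np.log(max_r), n_radii))
    # Half a turn is enough: the spectrum of a real image is point-symmetric.
    angles = np.linspace(0.0, np.pi, n_angles, endpoint=False)
    ys = np.rint(cy + radii[:, None] * np.sin(angles)[None, :]).astype(int)
    xs = np.rint(cx + radii[:, None] * np.cos(angles)[None, :]).astype(int)
    log_polar = mag[ys, xs]
    # Step 3: a second FFT magnitude suppresses those shifts (approximately
    # for scale, because the log-radius axis is not periodic).
    spectrum = np.abs(np.fft.fft2(log_polar))
    # Step 4: keep a small low-frequency block (assumed 3 x 4 = 12 values).
    block = spectrum[:keep[0], :keep[1]].ravel()
    return block / (np.linalg.norm(block) + 1e-12)

rng = np.random.default_rng(0)
example = rng.random((64, 64))
print(fmt_descriptor(example).shape)
```

Because the result is a short fixed-length vector, it can be compared with a plain Euclidean distance and stored in a standard index structure, which is the property the abstract relies on for scaling to millions of images.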
Tested Video Sequences and Results:
1: South101_tracking Result
2: IVT_dataset Result
3: CAVIAR Dataset: OneLeaveShopReenter2cor Result
4: CAVIAR Dataset: Meet_Split_3rdGuy Result
5: North101_tracking Result
We introduce a robust image segmentation method based on a variational formulation using edge flow vectors. We demonstrate the non-conservative nature of this flow field, a feature that helps in segmenting objects with concavities. A multi-scale version of this method is developed and is shown to improve the localization of object boundaries. We compare and contrast the proposed method with well-known state-of-the-art methods. Detailed experimental results on both synthetic and natural images demonstrate that the proposed approach is quite competitive.
We present an efficient and accurate method for duplicate video detection in a large database using video fingerprints. We have empirically chosen the Color Layout Descriptor, a compact and robust frame-based descriptor, to create fingerprints, which are further encoded by vector quantization. We propose a new non-metric distance measure to find the similarity between the query and a database video fingerprint, and experimentally show its superior performance over other distance measures for accurate duplicate detection. Existing indexing techniques cannot perform efficient search over high-dimensional data with a non-metric distance measure. Therefore, we develop novel search algorithms based on precomputed distances and new dataset pruning techniques, yielding practical retrieval times. We perform experiments with a database of 38000 videos, worth 1600 hours of content. For individual queries with an average duration of 60 sec (about 50% of the average database video length), the duplicate video is retrieved in 0.032 sec on a 2.33 GHz Intel Xeon CPU, with a very high accuracy of 97.5%.
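The interplay between vector quantization, precomputed distances and a non-metric measure can be sketched as follows. The abstract does not define the distance, so the form used here (for each query frame, the distance to its best-matching database frame, summed over the query) is an assumption chosen only because it is asymmetric and therefore non-metric. The names `quantize` and `fingerprint_distance`, the random codebook and the 12-dimensional frame features are likewise hypothetical.

```python
import numpy as np

def quantize(frames, codebook):
    """Map each frame descriptor to the index of its nearest codeword."""
    dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def fingerprint_distance(query_codes, db_codes, table):
    """Assumed non-metric distance: each query frame is matched to its
    closest database frame, and the matching costs are summed."""
    pair_costs = table[np.ix_(query_codes, db_codes)]
    return float(pair_costs.min(axis=1).sum())

rng = np.random.default_rng(0)
codebook = rng.random((16, 12))          # hypothetical 16-word codebook
# Codeword-to-codeword distances are computed once and reused for every query.
table = np.linalg.norm(codebook[:, None, :] - codebook[None, :, :], axis=2)

video = quantize(rng.random((120, 12)), codebook)   # database fingerprint
clip = video[:60]                                   # shorter duplicate query
print(fingerprint_distance(clip, video, table))     # a sub-clip matches exactly
print(fingerprint_distance(video, clip, table))     # the reverse can be non-zero
```

After quantization, every frame-to-frame comparison is a table lookup, which is what makes precomputed distances attractive when the measure itself rules out standard metric indexing.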
We present a model for the automated segmentation of cells from confocal microscopy volumes of biological samples. The segmentation task for these images is exceptionally challenging due to weak boundaries and varying intensity during the imaging process. To tackle this, a two-step pruning process based on the Fast Marching Method is first applied to obtain an over-segmented image. This is followed by a merging step based on an effective feature representation. The algorithm is applied to two different datasets: one from the ascidian Ciona and the other from the plant Arabidopsis. The presented 3D segmentation algorithm shows promising results on these datasets.
We present a method for object tracking over time sequence imagery. The image plane is represented with a 4-connected planar graph where vertices are associated with pixels. On each image, the outer contour of the object is localized by finding the optimal cycle in the graph such that a cost function based on temporal, appearance and shape priors is minimized. Our contribution is the particle filtering-based framework to integrate the shape cue with the temporal and appearance cues. We demonstrate that incorporating the shape prior yields promising performance improvement over temporal and appearance priors on various object tracking scenarios.
Simultaneous registration and segmentation (SRS) provides a powerful framework for tracking an object of interest in an image sequence. State-of-the-art SRS-based tracking methods assume that the illumination remains constant across consecutive frames. However, this assumption does not hold in many natural image sequences due to dynamic light sources and shadows. In this paper, we propose a generalized model for SRS-based tracking that accounts for non-uniform additive illumination changes. More specifically, we introduce two new terms in the SRS energy functional to address this problem. The first term couples the shape-based cue and the intensity-based cue to establish a correspondence between them. The second term, which is complementary to the first, compensates for the illumination change. We demonstrate that the proposed SRS energy functional yields superior performance over state-of-the-art SRS-based methods on various indoor and outdoor image sequences.
We introduce a dynamical model for simultaneous registration and segmentation in a variational framework for image sequences, where the dynamics are incorporated using a Bayesian formulation. A linear stochastic equation relating the tracked object (or a region of interest) is first derived under the assumption that the successive images in the sequence are related by a dense and possibly non-linear displacement field. This derivation allows for a computationally efficient, recursive implementation of the Bayesian formulation in this framework. The contour of the tracked object returned by the dynamical model is not only close to the previously detected shape but is also consistent with the temporal statistics of the tracked object. The performance of the proposed approach is evaluated on real image sequences. With respect to a variety of error metrics, such as F-measure, mean absolute deviation and Hausdorff distance, the proposed approach outperforms the state-of-the-art approach without the dynamical model.
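The three error metrics named above can be computed along the following lines. The abstract does not give their exact definitions, so this sketch assumes common ones: the symmetric Hausdorff distance between two contour point sets, the mean closest-point distance from one contour to the other, and the F-measure on binary region masks. The function names are illustrative.

```python
import numpy as np

def _pairwise(a, b):
    """Euclidean distances between every point of a and every point of b."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    d = _pairwise(a, b)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))

def mean_absolute_deviation(a, b):
    """Mean distance from each point of a to its closest point of b."""
    return float(_pairwise(a, b).min(axis=1).mean())

def f_measure(pred_mask, true_mask):
    """Harmonic mean of precision and recall on binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    tp = np.logical_and(pred, true).sum()
    if tp == 0:
        return 0.0
    precision = tp / pred.sum()
    recall = tp / true.sum()
    return float(2 * precision * recall / (precision + recall))

detected = np.array([[0.0, 0.0], [1.0, 0.0]])
reference = np.array([[0.0, 0.0], [4.0, 0.0]])
print(hausdorff(detected, reference))                # 3.0
print(mean_absolute_deviation(detected, reference))  # 0.5
```

The Hausdorff distance reports the single worst mismatch between the contours, whereas the mean absolute deviation averages over all points, so the two are usually reported together.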
In this paper we demonstrate the effectiveness of reference (or atlas)-based non-rigid registration for the segmentation of medical and biological imagery. In particular, we introduce a segmentation functional that exploits feature information about the reference image, and we minimize it with respect to the parameters of the non-rigid transformation, akin to a region-based maximum likelihood estimation process. The warping transformation is modeled using Thin Plate Splines, which incorporate information about the global rigid motion and the non-rigid local displacements. Extensive experimental evaluations and comparisons with other segmentation techniques on a complex biological dataset are presented. The proposed algorithm outperforms the others in both classification rate and, in particular, localization accuracy.
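To show how a Thin Plate Spline separates a global affine part from local non-rigid displacements, the sketch below fits a standard 2D TPS to matched control points by solving the usual linear system with kernel U(r) = r^2 log r. It covers only the warp model, not the segmentation functional from the paper, and the function names `fit_tps` and `warp_tps` are made up for this example.

```python
import numpy as np

def tps_kernel(r):
    """Radial basis U(r) = r^2 log r, with U(0) defined as 0."""
    out = np.zeros_like(r)
    nz = r > 0
    out[nz] = r[nz] ** 2 * np.log(r[nz])
    return out

def fit_tps(src, dst):
    """Solve for non-rigid weights and affine coefficients mapping src to dst."""
    n = len(src)
    K = tps_kernel(np.linalg.norm(src[:, None, :] - src[None, :, :], axis=2))
    P = np.hstack([np.ones((n, 1)), src])      # columns: 1, x, y
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    rhs = np.zeros((n + 3, 2))
    rhs[:n] = dst
    params = np.linalg.solve(A, rhs)
    return params[:n], params[n:]              # (local weights, affine part)

def warp_tps(points, src, weights, affine):
    """Apply the fitted spline to arbitrary 2D points."""
    U = tps_kernel(np.linalg.norm(points[:, None, :] - src[None, :, :], axis=2))
    P = np.hstack([np.ones((len(points), 1)), points])
    return U @ weights + P @ affine

src = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]], dtype=float)
dst = src.copy()
dst[4] += [0.1, -0.05]                         # displace only the centre point
weights, affine = fit_tps(src, dst)
print(np.allclose(warp_tps(src, src, weights, affine), dst))
```

When the target points are an exact affine image of the source points, the local weights come out as zero and the whole motion is absorbed by the affine coefficients.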
A video "fingerprint" is a feature extracted from the video that should represent the video compactly, allowing faster search without compromising the retrieval accuracy. Here, we use a keyframe set to represent a video, motivated by the video summarization approach. We experiment with different features to represent each keyframe with the goal of identifying duplicate and similar videos. Various image processing operations like blurring, gamma correction, JPEG compression, and Gaussian noise addition are applied on the individual video frames to generate duplicate videos. Random and bursty frame drop errors of 20%, 40% and 60% (over the entire video) are also applied to create more noisy "duplicate" videos. The similar videos consist of videos with similar content but with varying camera angles, cuts, and idiosyncrasies that occur during successive retakes of a video. Among the feature sets used for comparison, for duplicate video detection, Compact Fourier-Mellin Transform (CFMT) performs the best while for similar video retrieval, Scale Invariant Feature Transform (SIFT) features are found to be better than comparable-dimension features. We also address the problem of retrieval of full-length videos with shorter-length clip queries. For identical feature size, CFMT performs the best for video retrieval.
The expression levels of rhodopsin and glial fibrillary acidic protein (GFAP) capture important structural changes in the retina during injury and recovery. Quantitatively measuring these expression levels in confocal micrographs requires identifying the retinal layer boundaries and establishing spatial correspondence between the layers across different images. In this paper, a method to segment the retinal layers using a parametric active contour model is presented. Spatially aligned expression levels across different images are then determined by thresholding the solution to a Dirichlet boundary value problem. Our analysis provides quantitative metrics of retinal restructuring that are needed for improving retinal therapies after injury.
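The idea of thresholding the solution to a Dirichlet boundary value problem can be illustrated on a toy domain. The sketch below solves Laplace's equation on a rectangular strip with the value 0 fixed on the top boundary and 1 on the bottom boundary, using Jacobi iteration, and then thresholds the solution to pick out a band at a given normalized depth. The rectangular domain, the periodic side boundaries, the iteration count and the function name are simplifying assumptions; in the paper the boundaries would come from the segmented retinal layers.

```python
import numpy as np

def normalized_depth(n_rows, n_cols, n_iter=2000):
    """Jacobi solution of Laplace's equation with u=0 on top, u=1 on bottom.

    Side boundaries are treated as periodic purely to keep the toy simple.
    """
    u = np.zeros((n_rows, n_cols))
    u[-1, :] = 1.0                       # Dirichlet values on the two boundaries
    for _ in range(n_iter):
        left = np.roll(u, 1, axis=1)
        right = np.roll(u, -1, axis=1)
        # Average of the four neighbours, evaluated from the previous iterate.
        interior = 0.25 * (u[:-2] + u[2:] + left[1:-1] + right[1:-1])
        u[1:-1] = interior
    return u

depth = normalized_depth(11, 8)
# Thresholding selects the band between 25% and 50% of the normalized depth.
band = (depth >= 0.25) & (depth < 0.5)
print(band.sum(axis=0))
```

Because the solution varies smoothly from 0 to 1 between the two boundaries, equal thresholds select corresponding bands in images where the layer has different thickness or shape.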
We present an image search and retrieval system, Cortina, that indexes over 10 million images using image content, text and annotations. This large collection of image data, gathered from the World Wide Web (WWW), poses significant challenges to automated image analysis, pattern recognition and database indexing. At the systems level, the components of Cortina include building image collections using a Web crawler, collecting category information and keywords, and processing images to compute content descriptors. Functionalities of Cortina include duplicate image detection, category and image content based search, face detection and relevance feedback. A MySQL database is used for storing textual annotations and keywords, whereas the image features are stored in flat file structures. This combination appears to be effective and scalable for large collections of image/video data and is easily parallelizable.
In this paper we propose a novel framework to obtain a very compact image signature (32 bits) which is invariant to rotation, translation, scaling and other minor perturbations such as smoothing, random noise addition and JPEG compression. The framework involves the Fourier-Mellin transform, conventional PCA and non-uniform scalar quantization. The high retrieval efficiency and low space consumption demonstrate the significance of our signature in duplicate image retrieval and large image database indexing.
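The last two stages of such a pipeline, PCA followed by non-uniform scalar quantization, can be sketched as below. The split of the 32 bits into 8 PCA coefficients with 4 bits each, the quantile-based bin edges, and the function names are assumptions for illustration; the abstract does not state how the bits are allocated. The input features here are random stand-ins for Fourier-Mellin features.

```python
import numpy as np

def fit_signature_model(features, n_dims=8, bits=4):
    """Learn a PCA basis and per-dimension non-uniform quantizer edges."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    basis = vt[:n_dims]                          # top principal directions
    coeffs = (features - mean) @ basis.T
    # Quantile edges give bins of equal population, hence non-uniform widths.
    probs = np.linspace(0.0, 1.0, 2 ** bits + 1)[1:-1]
    edges = np.quantile(coeffs, probs, axis=0)   # shape (2**bits - 1, n_dims)
    return mean, basis, edges

def signature(feature, model, bits=4):
    """Pack the quantized PCA coefficients into one integer."""
    mean, basis, edges = model
    coeffs = (feature - mean) @ basis.T
    code = 0
    for dim, value in enumerate(coeffs):
        level = int(np.searchsorted(edges[:, dim], value))  # 0 .. 2**bits - 1
        code = (code << bits) | level
    return code

rng = np.random.default_rng(0)
training = rng.random((200, 20))                 # stand-in feature vectors
model = fit_signature_model(training)
print(f"{signature(training[0], model):08x}")
```

With signatures this small, exact-match lookup reduces to comparing integers, which keeps both the index size and the query cost low for large image databases.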
Ph.D. Candidate at the Department of Electrical and Computer Engineering, UCSB, since 2007.
Serving as a member of the Program Committee of ICCV'11.
Served as a reviewer for CVPR'11.
Link to some of the CVPR'10 photos.
Link to some of the ICCV'09 photos.
Link to some of the Lab get-together photos.