This PDF file contains the front matter associated with SPIE Proceedings Volume 8135, including the Title Page, Copyright information, Table of Contents, and Conference Committee listing.
High-resolution X-ray micro-tomography is used for imaging solid materials at the micrometer scale in 3D. Our goal is to implement nondestructive techniques to quantify properties in the interior of solid objects, including information on their 3D geometries, which supports modeling of fluid dynamics within the pore space of the
host object. The micro-tomography data acquisition process generates large data sets that are often difficult to
handle with adequate performance when using current standard computing and image processing algorithms.
We propose an efficient set of algorithms to filter, segment and extract features from stacks of image slices of
porous media. The first step tunes the scale parameters of the filtering algorithm and then reduces artifacts using a fast anisotropic filter applied to the image stack, which smooths homogeneous regions while preserving borders.
Next, the volume is partitioned using statistical region merging, exploiting the intensity similarities of each
segment. Finally, we calculate the porosity of the material based on the solid-void ratio. Our contribution is to
design a pipeline tailored to deal with large data files, including a scheme for the user to input image patches for tuning parameters to the datasets. We illustrate our methodology using more than 2,000 micro-tomography image slices from four different porous materials, acquired using high-resolution X-ray imaging. We also compare our results with standard yet fast algorithms often used for image segmentation, including median filtering and thresholding.
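The final porosity step above reduces to counting void voxels after segmentation. A minimal sketch in Python, using a median filter as a stand-in for the paper's anisotropic filter and region merging; the threshold and the synthetic volume are illustrative:

```python
import numpy as np
from scipy import ndimage

def porosity(volume, threshold):
    """Estimate porosity of a micro-tomography volume via the solid-void ratio.

    volume: 3D array of gray-scale intensities (stack of image slices).
    threshold: intensity separating void (pore) from solid voxels.
    """
    # Smooth homogeneous regions before segmenting (a median filter stands in
    # here for the anisotropic filter and statistical region merging).
    smoothed = ndimage.median_filter(volume, size=3)
    void = smoothed < threshold          # pore space
    return void.sum() / void.size        # void-to-total voxel ratio

# Synthetic stack: a solid block containing one spherical pore.
vol = np.full((32, 32, 32), 200, dtype=np.uint8)
zz, yy, xx = np.mgrid[:32, :32, :32]
vol[(zz - 16)**2 + (yy - 16)**2 + (xx - 16)**2 < 8**2] = 20
print(round(porosity(vol, threshold=100), 3))
```

The ratio printed is close to the analytic sphere-to-cube volume fraction of about 0.065.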
Spatial transformations whose kernels employ sinusoidal functions for the decorrelation of signals remain fundamental components of image and video coding systems. Practical implementations are designed in fixed precision, for which the most challenging task is to approximate these constants with values that are both efficient in terms of
complexity and accurate with respect to their mathematical definitions. Scaled architectures, for example, as used in the
implementations of the order-8 Discrete Cosine Transform and its corresponding inverse both specified in ISO/IEC
23002-2 (MPEG C Pt. 2), can be utilized to mitigate the complexity of these approximations. That is, the
implementation of the transform can be designed such that it is completed in two stages: 1) the main transform matrix in
which the sinusoidal constants are roughly approximated, and 2) a separate scaling stage to further refine the
approximations. This paper describes a methodology termed the Common Factor Method, for finding fixed-point
approximations of such irrational values suitable for use in scaled architectures. The order-16 Discrete Cosine
Transform provides a framework in which to demonstrate the methodology, but the methodology itself can be employed
to design fixed-point implementations of other linear transformations.
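The scaled-architecture idea can be illustrated with a toy search: choose a shared factor xi (absorbed into the scaling stage) so that each sinusoidal constant is approximated by a small integer multiple of xi in the main transform. This is only a simplified stand-in for the paper's Common Factor Method; the search range and the chosen constants are illustrative:

```python
import math

def fixed_point(c, bits):
    """Plain fixed-point approximation: round(c * 2^bits) / 2^bits."""
    return round(c * 2**bits) / 2**bits

def common_factor(constants, max_mult=64):
    """Toy search for a shared factor xi such that each constant is close to
    an integer multiple m * xi.  xi is applied once in the scaling stage, so
    the main transform only needs the small integer multipliers m."""
    best = None
    for denom in range(1, 256):
        xi = 1.0 / denom
        mults = [round(c / xi) for c in constants]
        if any(m == 0 or m > max_mult for m in mults):
            continue
        err = max(abs(m * xi - c) for m, c in zip(mults, constants))
        if best is None or err < best[0]:
            best = (err, xi, mults)
    return best

# A few order-16 DCT constants, cos(k*pi/32), as an illustration.
consts = [math.cos(k * math.pi / 32) for k in (1, 3, 5, 7)]
err, xi, mults = common_factor(consts)
print(mults, round(err, 5))
```

With multipliers capped at 64, the worst-case approximation error stays below 0.01 for these constants.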
We propose fast algorithms for computing Discrete Sine and Discrete Cosine Transforms (DCT and DST) of types
VI and VII. Particular attention is paid to the derivation of fast algorithms for computing the DST-VII of lengths 4 and 8, which are currently under consideration for inclusion in the ISO/IEC/ITU-T High Efficiency Video Coding (HEVC) standard.
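For reference, the DST-VII of length N can be written with the basis T[k][n] = sqrt(4/(2N+1)) * sin(pi*(2k+1)*(n+1)/(2N+1)). A naive O(N^2) sketch under that normalization convention (not the fast factorization proposed in the paper):

```python
import math

def dst7(x):
    """Naive O(N^2) DST-VII, one common normalization convention."""
    N = len(x)
    s = math.sqrt(4.0 / (2 * N + 1))
    return [s * sum(x[n] * math.sin(math.pi * (2 * k + 1) * (n + 1) / (2 * N + 1))
                    for n in range(N))
            for k in range(N)]

def idst7(X):
    """Inverse via the transpose: the normalized DST-VII matrix is orthogonal."""
    N = len(X)
    s = math.sqrt(4.0 / (2 * N + 1))
    return [s * sum(X[k] * math.sin(math.pi * (2 * k + 1) * (n + 1) / (2 * N + 1))
                    for k in range(N))
            for n in range(N)]

# Scaled by 128 and rounded, the first basis row reproduces the integers of
# the 4-point DST matrix adopted in HEVC.
N = 4
row0 = [round(128 * math.sqrt(4 / 9) * math.sin(math.pi * 1 * (n + 1) / 9))
        for n in range(N)]
print(row0)  # → [29, 55, 74, 84]

x = [1.0, 2.0, -1.0, 0.5]
assert all(abs(a - b) < 1e-9 for a, b in zip(x, idst7(dst7(x))))
```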
In light of its capacity for remote physiological assessment over a wide range of anatomical locations, imaging photoplethysmography has become an attractive research area in the biomedical and clinical communities. Among recent studies, two separate research directions have emerged: scientific-camera-based imaging PPG (iPPG) and webcam-based imaging PPG (wPPG). Little is known about the difference between these two techniques. To address this
issue, a dual-channel imaging PPG system (iPPG and wPPG) using ambient light as the illumination source has been
introduced in this study. The performance of the two imaging PPG techniques was evaluated through the measurement of
cardiac pulse acquired from the face of 10 male subjects before and after 10 min of cycling exercise. A time-frequency
representation method was used to visualize the time-dependent behaviour of the heart rate. In comparison to the gold
standard contact PPG, both imaging PPG techniques exhibit comparable functional characteristics in the context of
cardiac pulse assessment. Moreover, the synchronized ambient light intensity recordings in the present study can provide
additional information for appraising the performance of the imaging PPG systems. This feasibility study thereby leads
to a new route for non-contact monitoring of vital signs, with clear applications in triage and homecare.
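The cardiac pulse extraction underlying such systems can be sketched as follows: average a colour channel over the face region in each frame, then take the dominant spectral peak in the physiological band. The synthetic trace and the band limits below are illustrative:

```python
import numpy as np

def heart_rate_from_ppg(signal, fs):
    """Estimate heart rate (bpm) from a PPG trace via its dominant spectral
    peak in the physiological band 0.7-3 Hz (42-180 bpm)."""
    sig = signal - np.mean(signal)            # remove the DC component
    spectrum = np.abs(np.fft.rfft(sig))
    freqs = np.fft.rfftfreq(len(sig), d=1.0 / fs)
    band = (freqs >= 0.7) & (freqs <= 3.0)
    f_peak = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * f_peak

# Synthetic 30 s mean green-channel trace at 20 fps: 1.2 Hz pulse plus noise.
fs = 20.0
t = np.arange(0, 30, 1 / fs)
rng = np.random.default_rng(0)
trace = 0.5 * np.sin(2 * np.pi * 1.2 * t) + 0.1 * rng.standard_normal(t.size)
print(round(heart_rate_from_ppg(trace, fs)))  # → 72
```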
A framework for region/zone classification in color and gray-scale scanned documents is proposed in this paper.
The algorithm includes modules for extracting text, photo, and strong edge/line regions. Firstly, a text detection
module which is based on wavelet analysis and Run Length Encoding (RLE) technique is employed. Local and
global energy maps in high frequency bands of the wavelet domain are generated and used as initial text maps.
Further analysis using RLE yields a final text map. The second module is developed to detect image/photo and
pictorial regions in the input document. A block-based classifier using basis vector projections is employed to
identify photo candidate regions. Then, a final photo map is obtained by applying probabilistic model based
on Markov random field (MRF) based maximum a posteriori (MAP) optimization with iterated conditional
mode (ICM). The final module detects lines and strong edges using Hough transform and edge-linkages analysis,
respectively. The text, photo, and strong edge/line maps are combined to generate a page layout classification of
the scanned target document. Experimental results and objective evaluation show that the proposed technique performs very effectively on a variety of simple and complex scanned document types obtained from the MediaTeam Oulu document database. The proposed page layout classifier can be used in systems for efficient
document storage, content based document retrieval, optical character recognition, mobile phone imagery, and
augmented reality.
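The RLE step can be sketched as follows: rows of a binary high-frequency energy map are run-length encoded, and rows with many short runs are kept as text candidates. The thresholds and data below are illustrative, not the paper's:

```python
import numpy as np

def run_lengths(binary_row):
    """Lengths of consecutive runs of 1s in a binary row (RLE)."""
    runs, n = [], 0
    for v in binary_row:
        if v:
            n += 1
        elif n:
            runs.append(n)
            n = 0
    if n:
        runs.append(n)
    return runs

def looks_like_text(binary_row, min_run=2, max_run=6, min_count=3):
    """Text lines tend to produce many short, similar runs of high-frequency
    energy; large uninterrupted runs suggest pictorial content."""
    runs = run_lengths(binary_row)
    short = [r for r in runs if min_run <= r <= max_run]
    return len(short) >= min_count

text_row = np.array([0,1,1,0,1,1,1,0,0,1,1,0,1,1,1,1,0,1,1,0])
photo_row = np.array([1] * 18 + [0, 0])
print(looks_like_text(text_row), looks_like_text(photo_row))  # → True False
```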
This paper focuses on the 2D-3D camera pose estimation using one LiDAR view and one calibrated camera view.
The pose estimation employs an intelligent search over the extrinsic camera parameters and uses an error metric
based on line-segment matching. The goal of this search process is to estimate the pose parameters without any a priori knowledge and in less processing time. We demonstrate the validity of the proposed approach through experiments on two sets of perspective views using lines as features.
A main defect of structured light scanning is that edge regions are lost in the point cloud of the scanned object. This research combines an image processing method with a structured light system in order to improve the quality of the point cloud. The technical approach is presented, with the following results: after overlaying the edge part of the 3D model onto the original point cloud from the structured light system, the missing regions can be restored and the resolution of the original point cloud can be improved.
Correlation filters for target detection are usually designed under the assumption that the appearance of a target
is explicitly known. Because the shape and intensity values of a target are used, correlation filters are highly sensitive to changes in the target appearance in the input scene, such as those due to rotation or scaling.
Composite filter design was introduced to address this problem by accounting for different possibilities for the
appearance of the target within the input scene. However, explicit knowledge for each possible appearance is
still required. In this work, we propose composite filter design when an object to be recognized is given in noisy
training images and its exact shape and intensity values are not explicitly known. Optimal filters with respect
to the peak-to-output energy criterion are derived and used to synthesize a single composite filter that can be
used for distortion invariant target detection. Parameters required for filter design are estimated with suggested
techniques. Computer simulation results obtained with the proposed filters are presented and compared with
those of common composite filters.
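As background, the basic correlation-filter detection step looks like the FFT-based sketch below. The plain matched filter used here omits the paper's peak-to-output energy optimization and noise-parameter estimation:

```python
import numpy as np

def correlate(scene, template):
    """FFT-based cross-correlation of a template with a scene (matched filter).

    The filter is the complex conjugate of the template spectrum; optimal
    filters would additionally account for noise statistics."""
    H = np.conj(np.fft.fft2(template, s=scene.shape))
    return np.real(np.fft.ifft2(np.fft.fft2(scene) * H))

# Place a small target in a noisy scene and locate it by the correlation peak.
rng = np.random.default_rng(1)
target = np.ones((5, 5))
scene = 0.2 * rng.standard_normal((64, 64))
scene[20:25, 33:38] += 1.0                   # target at (20, 33)
corr = correlate(scene, target)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(corr), corr.shape))
print(peak)  # → (20, 33)
```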
A two-step procedure for the reliable recognition and multiclassification of objects in cloudy environments is proposed. The input scene is preprocessed with the help of an iterative algorithm to remove the effects of the cloudy environment, followed by complex correlation filtering for the multiclassification of target objects. The iterative algorithm is based on a local heuristic search inside a moving window using a nonlinear signal model for the input scene. The preprocessed scene is correlated with a multiclass correlation filter based on complex synthetic discriminant functions. Computer simulation results obtained with the proposed approach on cloudy images are presented and discussed in terms of different performance metrics.
An overview of three-dimensional (3D) object recognition using neural networks is presented. Three-dimensional sensing and imaging of 3D objects is performed using an integral imaging optical setup. A neural network technique is applied to recognize the 3D objects. Experimental results are presented together with computational results for performance assessment.
Laser radar (LADAR) systems produce both a range image and an intensity image. When the transmitted
LADAR pulse strikes a sloped surface, the returned pulse is expanded temporally. This makes it possible to
estimate the gradient of a surface, pixel by pixel. This paper seeks to find the gradient of the surface of an
object from a realistic LADAR return pulse that includes probabilistic noise models. Additionally, optimal
and computationally simple interpolation filters are each derived to recover Nyquist-sampled data from data
spatially sampled below the Nyquist rate. The filters are then applied to the information embedded in the gradient to allow the spatial sampling density to fall below the Nyquist criterion while still enabling an effective 3D reconstruction of the image.
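The classic Whittaker-Shannon interpolation that such filters generalize can be sketched as follows (this sketch assumes sampling at or above the Nyquist rate, unlike the sub-Nyquist case treated in the paper):

```python
import numpy as np

def sinc_interpolate(samples, T, t):
    """Whittaker-Shannon reconstruction from samples at spacing T, evaluated
    at times t.  Exact for bandlimited signals and an infinite sample train;
    a finite window leaves a small truncation error."""
    n = np.arange(len(samples))
    return np.array([np.sum(samples * np.sinc((ti - n * T) / T)) for ti in t])

# Reconstruct a 1 Hz sine sampled at 8 Hz on a 64x finer grid.
T = 1 / 8.0
samples = np.sin(2 * np.pi * np.arange(128) * T)
t = np.arange(4.0, 12.0, 1 / 64.0)       # interior, away from window edges
recon = sinc_interpolate(samples, T, t)
err = np.max(np.abs(recon - np.sin(2 * np.pi * t)))
print(float(err) < 0.1)                  # truncation error stays small here
```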
Compression standards such as H.264/AVC encode video sequences to maximize fidelity at a given bitrate. However,
semantic-oriented and content-aware compression remains a challenge. In this paper, we propose a semantic video
compression method using seam carving. Seam carving changes the dimension of an image/video with a non-uniform
resampling of each row and column while keeping the rectangular shape of the image. Our main contribution is a new approach to identifying areas where seams are concentrated. On the one hand, it allows supplemental seam data to be transmitted at low cost. On the other hand, seams can be synthesized at the decoder in order to recover the original frame size and to
preserve the scene geometry. Experiments show that our seam carving method combined with standard H.264/AVC
coding results in significant bitrate savings compared with the original H.264/AVC. Reported gains reach 39% at very
high bitrates and 22% at very low bitrates. Furthermore, the reconstructed video has the same quality in semantically
significant regions.
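For background, the core seam-carving operation (finding and removing a minimum-energy vertical seam by dynamic programming) can be sketched as:

```python
import numpy as np

def vertical_seam(energy):
    """Minimum-energy vertical seam via dynamic programming: returns one
    column index per row, as in classic seam carving."""
    h, w = energy.shape
    cost = energy.astype(float).copy()
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 2, w)
            cost[y, x] += cost[y - 1, lo:hi].min()
    seam = [int(np.argmin(cost[-1]))]
    for y in range(h - 2, -1, -1):          # backtrack from the bottom row
        x = seam[-1]
        lo = max(x - 1, 0)
        seam.append(lo + int(np.argmin(cost[y, lo:min(x + 2, w)])))
    return seam[::-1]

def remove_seam(img, seam):
    """Drop one pixel per row, shrinking the image width by 1."""
    return np.array([np.delete(row, x) for row, x in zip(img, seam)])

# A low-energy flat column at x=2 attracts the seam.
energy = np.full((4, 5), 9.0)
energy[:, 2] = 1.0
print(vertical_seam(energy))  # → [2, 2, 2, 2]
```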
Video retargeting from a full-resolution video to a lower-resolution display will inevitably cause information loss.
Content-aware video retargeting techniques have been studied to avoid critical visual information loss while resizing a
video. In this paper, we propose a mosaic-guided video retargeting scheme to ensure good spatio-temporal coherence of
the downscaled video. Besides, a rate-distortion optimization framework is proposed to maximize the information
retained in the downscaled video.
Motion-compensated prediction induces a chain of coding dependencies between pixels in video. In principle,
an optimal selection of encoding parameters (motion vectors, quantization parameters, coding modes) should
take into account the whole temporal horizon of a GOP. However, in practical coding schemes, these choices are
made on a frame-by-frame basis, thus with a possible loss of performance. In this paper we describe a tree-based
model for pixelwise coding dependencies: each pixel in a frame is the child of a pixel in a previous reference
frame. We show that some tree structures are more favorable than others from a rate-distortion perspective, e.g.,
because they entail a large descendance of pixels which are well predicted from a common ancestor. In those
cases, a higher quality has to be assigned to pixels at the top of such trees. We promote the creation of these
structures by adding a special discount term to the conventional Lagrangian cost adopted at the encoder. The
proposed model can be implemented through a double-pass encoding procedure. Specifically, we devise heuristic
cost functions to drive the selection of quantization parameters and of motion vectors, which can be readily
implemented into a state-of-the-art H.264/AVC encoder. Our experiments demonstrate that coding efficiency is
improved for video sequences with low motion, while there are no apparent gains for more complex motion. We
argue that this is due to both the presence of complex encoder features not captured by the model, and to the
complexity of the source to be encoded.
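The discount term can be illustrated with a toy cost comparison; the linear discount form and the numbers below are illustrative, not the paper's exact heuristic:

```python
def lagrangian_cost(dist, rate, lam, descendants=0, discount=0.0):
    """Conventional Lagrangian cost D + lambda*R, minus a discount growing
    with the number of pixels predicted (directly or indirectly) from this
    block.  The discount favours spending bits on ancestors of large trees."""
    return dist + lam * rate - discount * descendants

# Two candidate modes for one block: mode B costs more bits now, but its
# pixels go on to predict a large descendance in later frames.
lam = 10.0
mode_a = lagrangian_cost(dist=100.0, rate=8.0, lam=lam)            # 180
mode_b = lagrangian_cost(dist=60.0, rate=14.0, lam=lam,
                         descendants=500, discount=0.1)            # ~150
print(mode_a > mode_b)  # the discount flips the decision toward mode B
```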
This paper considers the reliability of usual assessment methods when evaluating virtual synthesized views in the multiview
video context. Virtual views are generated from Depth Image Based Rendering (DIBR) algorithms. Because DIBR
algorithms involve geometric transformations, new types of artifacts come up. The question regards the ability of
commonly used methods to deal with such artifacts. This paper investigates how well usual metrics correlate with human judgment. The experiments consist of assessing seven different view synthesis algorithms by subjective and objective methods. Three different 3D video sequences are used in the tests. The resulting virtual synthesized sequences are assessed through objective metrics and subjective protocols. Results show that usual objective metrics can fail to assess synthesized views in accordance with human judgment.
In this paper, we propose two multiview image compression methods. The basic concept of both schemes is
the layer-based representation, in which the captured three-dimensional (3D) scene is partitioned into layers
each related to a constant depth in the scene. The first algorithm is a centralized scheme where each layer is
de-correlated using a separable multi-dimensional wavelet transform applied across the viewpoint and spatial
dimensions. The transform is modified to efficiently deal with occlusions and disparity variations for different
depths. Although the method achieves a high compression rate, the joint encoding approach requires the transmission of all data to the users. By contrast, in an interactive setting, users request only a subset of the captured images, in an order that is unknown a priori. We address this scenario in the second algorithm using Distributed Source Coding (DSC) principles, which reduce the inter-view redundancy and facilitate random
access at the image level. We demonstrate that the proposed centralized and interactive methods outperform
H.264/MVC and JPEG 2000, respectively.
Recent investigations have shown that one of the most beneficial elements for higher compression performance in high-resolution video is the incorporation of larger block structures. In this work, we will address the question of how to
incorporate perceptual aspects into new video coding schemes based on large block structures. This is rooted in the fact
that especially high frequency regions such as textures yield high coding costs when using classical prediction modes as
well as encoder control based on the mean squared error. To overcome this problem, we will investigate the
incorporation of novel intra predictors based on image completion methods. Furthermore, the integration of a perceptual-based encoder control using the well-known structural similarity index will be analyzed. A major aspect of this article is
the evaluation of the coding results in a quantitative (i.e. statistical analysis of changes in mode decisions) as well as
qualitative (i.e. coding efficiency) manner.
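For reference, a single-window version of the structural similarity index used for such encoder control can be sketched as follows (real encoders average a local sliding-window variant over the image):

```python
import numpy as np

def ssim(x, y, L=255.0):
    """Global (single-window) structural similarity index between two images
    with dynamic range L, following the standard luminance/contrast/structure
    formulation."""
    C1, C2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx**2 + my**2 + C1) * (vx + vy + C2))

rng = np.random.default_rng(2)
img = rng.uniform(0, 255, (16, 16))
print(round(ssim(img, img), 3))    # identical images → 1.0
print(ssim(img, img + 20) < 1.0)   # a pure intensity shift is penalized
```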
Significant research has been performed on the use of directional transforms for the compression of still imagery, in particular on the application of block-based and segmentation-driven directional adaptive discrete wavelet transforms. However, all of the proposed methodologies suffer from the extra side information that needs to be encoded. This encoding overhead and added complexity is unfortunately not negligible. This paper describes various considerations
and trade-offs that were made during the search towards a practical solution for using directional adaptive transforms in
still image coding. We propose two codec instantiations respectively based upon quadtree-coding (QT-L) and
JPEG 2000's EBCOT engine and discuss various experimental results.
Today, several alternatives for the compression of digital pictures and video sequences exist to choose from. Besides internationally recognized standard solutions, open-access options like the VP8 image and video compression format have recently appeared and are gaining popularity. In this paper, we present the methodology and the results of the rate-distortion performance analysis of VP8. The analysis is based on the results of subjective quality assessment experiments, which were carried out to compare VP8 to a set of state-of-the-art image and video compression standards.
This paper presents a parametric video compression framework which exploits both texture warping and dynamic
texture synthesis. A perspective motion model is employed to warp static textures and a dynamic texture model
is used to synthesise time-varying textures. An artefact-based video quality metric (AVM) is proposed which
prevents spatial and temporal artefacts and assesses the reconstructed video quality. This is validated using
both the VQEG database and subjective assessment, and shows competitive performance on both non-synthetic
and synthetic video content. Moreover, a local Rate-Quality Optimisation (RQO) strategy is developed based
on AVM in order to make a decision between waveform coding and texture warping/synthesis. The proposed
method has been integrated into an H.264 video coding framework with results offering significant bitrate savings
for similar visual quality (based on both AVM and subjective scores).
In this paper, we present an improved approach to predictive video decoding based on global and local motion
reliability. The framework consists of three processing stages. Global motion (GM) estimation and motion reliability
analysis are the key components in the first stage, where we model global motion and refine the MV field. In the second
stage, we predict local and global motion for the target frame, and determine corresponding weights based on the Linear
Minimum Mean Square Error (LMMSE) criterion. Finally in the third stage, a temporal interpolator is applied to
compose two future frames, which are linearly combined to form the final predicted frame. Our results indicate that the proposed method achieves better visual quality compared to other state-of-the-art predictive decoding approaches, particularly in sequences involving a moving camera and moving objects.
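The LMMSE weighting of the two motion predictors can be sketched in scalar form: for unbiased predictors with independent errors, the optimal blend weights are proportional to the inverse error variances. The variances and prediction values below are illustrative:

```python
def lmmse_weights(var_local, var_global):
    """Scalar LMMSE blend of two unbiased predictors with independent errors:
    weights proportional to the inverse error variances, normalized to 1."""
    inv_l, inv_g = 1.0 / var_local, 1.0 / var_global
    s = inv_l + inv_g
    return inv_l / s, inv_g / s

# Blend local and global motion predictions of a pixel value: the more
# reliable (lower-variance) global prediction receives the larger weight.
w_loc, w_glob = lmmse_weights(var_local=4.0, var_global=1.0)
pred = w_loc * 118.0 + w_glob * 124.0
print(w_loc, w_glob, round(pred, 1))  # → 0.2 0.8 122.8
```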
In this work, we propose an effective And-Or-tree-based packetization algorithm for Luby Transform (LT) codes to provide stable video streaming services by minimizing the deterioration of streaming quality caused by packets lost over error-prone wireless networks. To accomplish our goal, the proposed packetization algorithm considers the relationships among encoded symbols of LT codes based on an And-Or tree analysis tool, and then puts these encoded symbols into packets so as to minimize the packet loss effect during transmission and improve the decoding success rate of LT codes by reducing the correlations among packets. We conduct a mathematical analysis to prove the performance of our packetization algorithm compared with a conventional packetization algorithm. Finally, the proposed system is fully implemented in Java and C/C++, and widely tested to show that the proposed packetization algorithm works reasonably well. The experimental results demonstrate that the proposed packetization algorithm supports more stable video streaming services with higher peak signal-to-noise ratio (PSNR) than the conventional packetization algorithm under various packet loss patterns, including random and burst loss.
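For background, LT encoding and a simple packetization can be sketched as follows; the uniform degree distribution and round-robin grouping below are illustrative stand-ins for the soliton distribution and the And-Or-tree-driven grouping of the paper:

```python
import random

def lt_encode(source, n_encoded, seed=0):
    """Generate LT encoded symbols: each is the XOR of a random subset of
    source symbols (degree drawn uniformly here for simplicity)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_encoded):
        degree = rng.randint(1, len(source))
        idx = rng.sample(range(len(source)), degree)
        val = 0
        for i in idx:
            val ^= source[i]
        out.append((idx, val))
    return out

def packetize_round_robin(symbols, n_packets):
    """Spread consecutive (often related) symbols across packets so that a
    single lost packet removes fewer mutually dependent symbols."""
    packets = [[] for _ in range(n_packets)]
    for i, s in enumerate(symbols):
        packets[i % n_packets].append(s)
    return packets

src = [3, 7, 1, 9, 4, 6, 2, 8]
symbols = lt_encode(src, n_encoded=12)
packets = packetize_round_robin(symbols, n_packets=4)
print([len(p) for p in packets])  # → [3, 3, 3, 3]
```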
We consider ad hoc sensor network topologies that aim for distributed delivery of correlated delay-sensitive data. For efficient data delivery, a network coding technique is deployed in conjunction with an approximate decoding algorithm. The approximate decoding algorithm enables receivers to recover the original source data even when the number of received data packets is not sufficient for decoding. It therefore leads to significantly improved decoding performance and enhanced robustness for delay-sensitive data. In this paper,
we further improve the approximate decoding algorithm by explicitly considering the characteristics of the
correlation. Specifically, we study the case where the source data are correlated by a simple linear correlation,
which is quantified by a similarity factor. We investigate several properties of the proposed algorithm and
analyze the impact of the similarity factor on the decoding performance. Our experimental results confirm
the properties of the proposed approximate decoding algorithm with linear correlation.
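The role of the similarity factor can be illustrated in the simplest scalar case: one received network-coded combination plus the linear correlation model supply two equations for two unknowns. The numbers below are illustrative:

```python
def approximate_decode(coded_sum, similarity):
    """Recover two correlated sources from a single network-coded combination
    x1 + x2, using the linear correlation model x2 = similarity * x1.

    With too few packets the system is underdetermined; the correlation model
    supplies the missing equation: x1 + similarity * x1 = coded_sum."""
    x1 = coded_sum / (1.0 + similarity)
    return x1, similarity * x1

x1, x2 = approximate_decode(coded_sum=30.0, similarity=2.0)
print(x1, x2)  # → 10.0 20.0
```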
This paper reviews the design concepts of the MPEG-DASH standard and how it can be employed for the delivery of live multimedia content over the Internet. MPEG-DASH is MPEG's newest standard for the streaming of multimedia content and is designed to leverage the extensive HTTP infrastructure that has evolved with the growth of the World
Wide Web. The standard defines specifications for manifest and segment formats used between the content servers and
the client devices. This paper focuses on the live streaming of video content, its challenges and how MPEG-DASH can
be used for this application.
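As an illustration of the manifest format, a minimal live presentation might be described as follows; the element and attribute names follow the MPEG-DASH (ISO/IEC 23009-1) schema, but the URLs, timing values, and bitrates are invented:

```xml
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011"
     type="dynamic"
     profiles="urn:mpeg:dash:profile:isoff-live:2011"
     availabilityStartTime="2011-09-01T00:00:00Z"
     minimumUpdatePeriod="PT10S"
     minBufferTime="PT2S">
  <Period id="1" start="PT0S">
    <AdaptationSet mimeType="video/mp4" segmentAlignment="true">
      <!-- $Number$ is substituted per segment; a 2 s segment at 90 kHz timescale -->
      <SegmentTemplate timescale="90000" duration="180000"
                       initialization="video/init.mp4"
                       media="video/seg_$Number$.m4s" startNumber="1"/>
      <Representation id="720p" codecs="avc1.4d401f"
                      width="1280" height="720" bandwidth="3000000"/>
      <Representation id="360p" codecs="avc1.4d401e"
                      width="640" height="360" bandwidth="800000"/>
    </AdaptationSet>
  </Period>
</MPD>
```

The `type="dynamic"` and `minimumUpdatePeriod` attributes are what distinguish a live manifest from an on-demand one: the client periodically refetches the MPD to learn about newly available segments.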
This paper presents the argument for the development of an Internet video compression standard. The history and
science of video compression is reviewed with a focus on identifying how network technology has influenced video
compression technology. It is argued that the use of the Internet to deliver video content, an application it was not
designed for, and the fact that the Internet is here to stay, calls for a critical look at existing video compression standards.
An analysis of the performance of these standards in delivering content over the Internet is provided with an explanation
of why these standards have shortcomings in this application domain. Because of this, it is argued, video compression
technology for the Internet will need to be different from what is used in other application domains. The paper further
presents a discussion on what the technical characteristics of video compression technology would need to be for it to
deliver high quality video over the Internet in an interoperable manner, thereby concluding that a new video compression
standard is needed.
In this paper we present a solution to improve the performance of adaptive HTTP streaming services. The proposed
approach uses a content aware method to determine whether switching to a higher bitrate can improve video quality. The
proposed solution can be implemented as a new parameter in the segment description to enable content switching only in
cases with a meaningful increase in quality. Results of our experiments show clear advantages of using the additional
parameter in a DASH implementation. The proposed approach enables significant bandwidth savings with minimal
decrease in quality. It guarantees an optimal adaptation path in various scenarios, which can benefit both network
providers and end users.
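A minimal sketch of such a content-aware switch-up rule (the quality scores, parameter names, and threshold below are hypothetical, not taken from the paper):

```python
# Assumed: each representation's segment description carries a quality score,
# and the client switches up only when the quality gain is meaningful.
def should_switch_up(current, candidate, available_bps, min_gain=1.0):
    """current/candidate: dicts with 'bitrate' (bps) and 'quality' (a MOS-like score)."""
    if candidate["bitrate"] > available_bps:
        return False                # not sustainable on this link
    gain = candidate["quality"] - current["quality"]
    return gain >= min_gain         # switch only for a meaningful quality increase

cur = {"bitrate": 1_000_000, "quality": 3.8}
hi  = {"bitrate": 3_000_000, "quality": 3.9}   # barely better: not worth 3x bandwidth
print(should_switch_up(cur, hi, available_bps=5_000_000))  # False
```

Without the content-aware hint, a throughput-only heuristic would take the 3 Mb/s stream here and waste bandwidth for an imperceptible gain.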
Switching the backlight of handheld devices to low power mode saves energy but affects the color appearance of an
image. In this paper, we consider the chroma degradation problem and propose an enhancement algorithm that
incorporates the CIECAM02 appearance model to quantitatively characterize the problem. In the proposed algorithm, we
enhance the color appearance of the image in low power mode by weighted linear superposition of the chroma of the
image and that of the estimated dim-backlight image. Subjective tests are carried out to determine the perceptually
optimal weighting and prove the effectiveness of our framework.
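The weighted linear superposition step can be sketched as follows; the chroma values and the weight are illustrative placeholders (the actual weight is the one found perceptually optimal in the subjective tests):

```python
# Minimal sketch of the weighted linear superposition of chroma, assuming
# chroma C has already been computed in a CIECAM02-like appearance space.
def enhance_chroma(c_original, c_dim, w=0.7):
    # c_original: chroma of the full-backlight image
    # c_dim: chroma of the estimated dim-backlight image
    return w * c_original + (1.0 - w) * c_dim

print(enhance_chroma(50.0, 30.0))  # 44.0
```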
In this paper a new method for the autostereoscopic display, named the Dual Layer Parallax Barrier (DLPB) method, is
introduced to overcome the limitation of the fixed viewing zone. Compared with the conventional parallax barrier
methods, the proposed DLPB method uses moving parallax barriers to shift the stereoscopic view according to
the viewer's movement. In addition, it provides seamless stereoscopic views without abrupt changes in 3D depth perception
at any eye position. We implement a prototype of the DLPB system which consists of a switchable dual-layered Twisted
Nematic Liquid Crystal Display (TN-LCD) and a head-tracker. The head tracker employs a video camera for capturing
images, and is used to calculate the angle between the eye gazing direction and the projected direction onto the display
plane. According to the head-tracker's control signal, the dual-layered TN-LCD adaptively alternates the direction of
the viewing zone via a solid-state analog switch. The experimental results demonstrate that the proposed
autostereoscopic display maintains seamless 3D views even when a viewer's head is moving. Moreover, its extended use
towards mobile devices such as portable multimedia player (PMP), smartphone, and cellular phone is discussed as well.
We design a distributed multi-channel P2P Video-on-Demand (VoD) system using "plug-and-play" helpers.
Helpers are heterogeneous "micro-servers" with limited storage, bandwidth, and number of users they can serve
simultaneously. Our proposed system has the following salient features: (1) it jointly optimizes over helper-user
connection topology, video storage distribution and transmission bandwidth allocation; (2) it minimizes server
load, and is adaptable to varying supply and demand patterns across multiple video channels irrespective of video
popularity; and (3) it is fully distributed and requires little or no maintenance overhead. The combinatorial nature
of the problem and the system demand for distributed algorithms makes the problem uniquely challenging. By
utilizing Lagrangian decomposition and Markov chain approximation based arguments, we address this challenge
by designing two distributed algorithms running in tandem: a primal-dual storage and bandwidth allocation
algorithm and a "soft-worst-neighbor-choking" topology-building algorithm. Our scheme provably converges to
a near-optimal solution, and is easy to implement in practice. Packet-level simulation results show that the
proposed scheme achieves minimum server load under highly heterogeneous combinations of supply and demand
patterns, and is robust to system dynamics of user/helper churn, user/helper asynchrony, and random delays in
the network.
Scalability features embedded within video sequences allow streaming over heterogeneous networks to a
variety of end devices. Compressive sensing techniques that lower the complexity and increase the
robustness of video scalability are reviewed. Human visual system models are often used to establish
perceptual metrics for evaluating video quality. The combination of perceptual and compressive sensing
approaches, as outlined in recent investigations, is described. The performance and complexity of different
scalability techniques are evaluated. The application of perceptual models to evaluating the quality of
compressive sensing scalability is considered in the near perceptually lossless case, and their application
to the appropriate coding schemes is reviewed.
Super-resolution (SR) is the process of obtaining a higher resolution image from a set of lower resolution (LR)
blurred and noisy images. One may, then, envision a scenario where a set of LR images is acquired with a
sensor on a moving platform. In such a case, an SR image can be reconstructed in an area of sufficient overlap
between the LR images which generally have a relative shift with respect to each other by subpixel amounts.
The visual quality of the SR image is affected by many factors, such as the optics blur, the inherent
signal-to-noise ratio of the system, quantization artifacts, and the number of scenels (scene elements), i.e., the number of
overlapped images used for SR reconstruction within the SR grid, and their relative arrangement. In most cases
of microscanning, the subpixel shifts between the LR images are pre-determined: hence the number of the scenels
within the SR grid and their relative positions with respect to each other are known and, as a result, can be used
in obtaining the reconstructed SR image with high quality. However, the LR images may have relative shifts
that are unknown. This random pattern of subpixel shifts can lead to unpleasant visual quality, especially at
the edges of the reconstructed SR image. Also, depending on the available number of the LR images and their
relative positions, it may be possible to produce SR only along a single dimension (diagonal, horizontal, or vertical)
and use interpolation in the orthogonal dimension, because there is not sufficient information to produce a full 2D
image. We investigate the impact of the number of overlapped regions and their relative arrangement on the
quality of the SR images, and propose a technique that optimally allocates the available LR scenels to the SR
grid in order to minimize the expected unpleasant visual artifacts.
In the field of identifying regions-of-interest (ROI) in digital images, several image-sets are referenced in the literature;
the open-source ones typically present a single main object (usually located at or near the image center as a pop-out). In
this paper, we present a comprehensive image-set (with its ground-truth) which will be made publicly available. The
database consists of images that demonstrate multiple-regions-of-interest (MROI) or multiple-levels-of-interest (MLOI).
The former terminology signifies that the scene has a group of subjects/objects (not necessarily spatially-connected
regions) that share the same level of perceptual priority to the human observer while the latter indicates that the scene is
complex enough to have primary, secondary, and background objects. The methodology for developing the proposed
image-set is described. A psychophysical experiment to identify MROI and MLOI was conducted, the results of which
are also presented. The image-set has been developed to be used in training and evaluation of ROI detection algorithms.
Applications include image compression, thumbnailing, summarization, and mobile phone imagery.
A fused image derived from multispectral images can increase the reliability of interpretation because it combines the
complementary information apparent in the multispectral bands. Moreover, a color image can be easily interpreted by human
users (for visual analysis) and thus improves observer performance and reaction times. We propose a fast color fusion
method, termed channel-based color fusion, which is efficient for real-time applications. Note that the term "color
fusion" means combining multispectral images into a color image with the purpose of resembling natural scenes.
In contrast, false-coloring techniques usually make no attempt to resemble natural scenery. The framework of
channel-based color fusion is as follows: (1) prepare for color fusion through preprocessing, image registration, and fusion; (2)
form a color fusion image by properly assigning multispectral images to red, green, and blue channels; (3) fuse
multispectral images (gray fusion) using a wavelet-based fusion algorithm; and (4) replace the value component of color
fusion in HSV color space with the gray-fusion image, and finally transform back to RGB space. In night vision
imaging, there may be two or several bands of images available, for example, visible (RGB), image intensified (II), near
infrared (NIR), medium wave infrared (MWIR), long wave infrared (LWIR). The proposed channel-wise color fusions
were tested with two-band (e.g., NIR + LWIR, II + LWIR, RGB + LWIR) or three-band (e.g., RGB + NIR + LWIR)
multispectral images. Experimental results show that the colors in the images fused by the proposed method are vivid
and comparable with those of segmentation-based colorization. The processing speed of the new method is much faster
than that of any segmentation-based method.
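Steps (2) and (4) of the framework can be sketched as below; the band-to-channel assignment and the pixel values are illustrative assumptions, not the paper's actual mapping:

```python
import colorsys
import numpy as np

# Sketch of steps (2) and (4): assign bands to R, G, B, then replace the
# HSV value component with a separately computed gray-fusion image.
def channel_color_fusion(nir, lwir, vis, gray_fusion):
    h, w = gray_fusion.shape
    out = np.zeros((h, w, 3))
    for i in range(h):
        for j in range(w):
            # step (2): channel assignment (assumed mapping)
            r, g, b = lwir[i, j], nir[i, j], vis[i, j]
            hh, ss, vv = colorsys.rgb_to_hsv(r, g, b)
            # step (4): keep hue/saturation, take value from the gray fusion
            out[i, j] = colorsys.hsv_to_rgb(hh, ss, gray_fusion[i, j])
    return out

fused = channel_color_fusion(np.full((2, 2), 0.5), np.full((2, 2), 0.8),
                             np.full((2, 2), 0.2), np.full((2, 2), 0.6))
print(fused.shape)  # (2, 2, 3)
```

Replacing only the value component preserves the hue assignment from step (2) while letting the gray-fusion image carry the spatial detail.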
While traditional image quality metrics like MSE are mathematically well understood and tractable, they are known
to correlate weakly to image distortion as observed by human observers. To address this situation, many full reference
quality indices have been suggested over the years that correlate better to human perception, one of them being the
well-known Structural Similarity Index by Wang and Bovik. However, while these expressions show higher correlations,
they are often not very tractable mathematically, and in particular are rarely metrics in the strict mathematical
sense. Specifically, the triangle inequality is often not satisfied, which could either be seen as an effect of the human visual system being unable to compare images that are visually too different, or as a defect of the index in capturing the global situation correctly. In this article, the latter position is taken, and it is shown how the SSIM can be understood as a local approximation of a global metric, namely the geodesic distance on a manifold. While the metric cannot be computed explicitly in most cases, it is nevertheless shown that in special cases its expression is identical to Weber's
Law of luminance sensitivity of the human eye.
Equal-Expectation Magnitude Quantization (EEM) aims at minimizing the distortion of a quantizer with defined reconstruction points by shifting the deadzone parameter such that the expectation value of the signal equals the reconstructed value. While intuitively clear, this argument is not sufficient to prove rate-distortion optimality.
In this work, it is shown that the EEM quantizer is rate-distortion optimal up to third order in an expansion in powers of the quantization bucket size in the high-bitrate approximation, and the approximating series for the optimal quantizer is computed.
This result is compared to an even simpler quantization strategy based on the Lloyd-Max quantizer which selectively sets coefficients to zero. It is shown that both strategies lead to the same asymptotic expansion for the threshold parameter, but zeroing coefficients provides optimality in one additional order in the quantization bucket size.
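For context, a generic deadzone scalar quantizer with a shifted threshold and an offset reconstruction point can be sketched as follows; the particular shift and offset values here are common textbook choices, not the optimal series derived in the paper:

```python
import math

# Illustrative deadzone scalar quantizer:
#   index  = sign(x) * floor(|x|/Q + shift)
#   recon  = sign(i) * (|i| + delta) * Q
# The EEM idea is to pick shift/delta so the reconstruction matches the
# conditional expectation of the signal within each quantization bucket.
def quantize(x, Q, shift=0.5 - 1/3):     # a typical deadzone shift
    return int(math.copysign(math.floor(abs(x) / Q + shift), x)) if x else 0

def dequantize(i, Q, delta=1/3):         # reconstruction offset inside the bucket
    return math.copysign((abs(i) + delta) * Q, i) if i else 0.0

i = quantize(2.7, Q=1.0)
print(i, dequantize(i, Q=1.0))
```

Shrinking `shift` below 0.5 widens the deadzone around zero, which is why such quantizers zero out small coefficients more aggressively than a uniform midtread quantizer.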
The analysis and measurement of the wavefront aberration function are very important tools in the field of visual optics;
they are used to understand the performance of the human eye in terms of its optical aberrations. In recent years, we have
compared, through two different methods, the wavefront aberration function of a reference refractive surface of 5 mm in
diameter and we have demonstrated its equivalence [1]. Now, we want to extend these results to a set of hard contact
lenses. These hard contact lenses have been subjected to different laser ablation techniques which are typically used in
refractive surgery. Our goal is to characterize the resultant ablation profile. We show our results obtained for both a
non-ablated hard contact lens and the corresponding ablated samples.
This article focuses on image processing for the radial imaging capsule endoscope (RICE). First, the RICE was used
to capture images; in the experiment, intestines obtained from a pig served as the imaging target. The captured images
were blurred, however, because RICE suffers from aberration problems at the image center, and low illumination
uniformity further degrades image quality. Image processing can be used to mitigate these problems. Therefore,
images captured at different times are connected using the Pearson correlation coefficient algorithm, and a
color-temperature mapping approach is used to reduce the discontinuity in the connection region.
This paper reports on recent work by the MPEG committee towards defining a standard for visual search applications. This
standardization initiative is referred to as Compact Descriptors for Visual Search (CDVS). The call for proposals for this
standard was issued by MPEG in March 2011, with responses due in October 2011. When completed, it is envisioned
that this standard will ensure high performance and interoperability of visual search applications, will simplify their
design, and will reduce the amount of visual search-related data that needs to be stored or transmitted over networks.
State-of-the-art image retrieval pipelines are based on "bag-of-words" matching. We note that the original order in which
features are extracted from the image is discarded in the "bag-of-words" matching pipeline. As a result, a set of features
extracted from a query image can be transmitted in any order. A set of m unique features has m! orderings, and if the order
of transmission can be discarded, one can reduce the query size by an additional log2(m!) bits. In this work, we compare
two schemes for discarding ordering: one based on Digital Search Trees, and another based on location histograms. We
apply the two schemes to a set of low bitrate Compressed Histogram of Gradient (CHoG) features, and compare their
performance. Both schemes achieve approximately log2(m!) reduction in query size for a set of m features.
3D face modeling has been one of the greatest challenges for researchers in computer graphics for many years. Various
methods have been used to model the shape and texture of faces under varying illumination and pose conditions from a
single given image. In this paper, we propose a novel method for the 3D face synthesis and reconstruction by using a
simple and efficient global optimizer. A 3D-2D matching algorithm which employs the integration of the 3D morphable
model (3DMM) and the differential evolution (DE) algorithm is addressed. In 3DMM, the estimation process of fitting
shape and texture information into 2D images is considered as the problem of searching for the global minimum in a
high dimensional feature space, in which optimization is apt to have local convergence. Unlike the traditional scheme
used in 3DMM, DE appears to be robust against stagnation in local minima and against sensitivity to initial values in face
reconstruction. Benefitting from DE's successful performance, 3D face models can be created based on a single 2D
image with respect to various illuminating and pose contexts. Preliminary results demonstrate that we are able to
automatically create a virtual 3D face from a single 2D image with high performance. The validation process shows that
there is only an insignificant difference between the input image and the 2D face image projected by the 3D model.
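The DE search loop at the heart of the fitting scheme can be sketched as below; the toy quadratic objective stands in for the actual 3DMM image-fitting cost, and the population size, generations, and DE parameters are illustrative:

```python
import numpy as np

# Minimal DE/rand/1/bin sketch of the global search used for model fitting.
rng = np.random.default_rng(0)

def objective(p):                       # stand-in for the 3DMM fitting error
    return np.sum((p - 0.3) ** 2)

def differential_evolution(obj, dim=4, pop=20, gens=200, F=0.7, CR=0.9):
    X = rng.uniform(-1, 1, (pop, dim))
    fit = np.array([obj(x) for x in X])
    for _ in range(gens):
        for i in range(pop):
            a, b, c = X[rng.choice(pop, 3, replace=False)]
            mutant = a + F * (b - c)            # mutation
            cross = rng.random(dim) < CR        # binomial crossover
            trial = np.where(cross, mutant, X[i])
            f = obj(trial)
            if f < fit[i]:                      # greedy selection
                X[i], fit[i] = trial, f
    return X[np.argmin(fit)]

best = differential_evolution(objective)
print(best)  # close to [0.3, 0.3, 0.3, 0.3]
```

Because each candidate is perturbed by the difference of other population members, the search does not rely on gradients of the cost, which is what makes DE resistant to the local convergence that plagues gradient-based 3DMM fitting.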
A land-based early forest fire detection scheme which exploits the infrared (IR) temporal signature of fire
plume is described. Unlike common land-based and/or satellite-based techniques which rely on
measurement and discrimination of fire plume directly from its infrared and/or visible reflectance
imagery, this scheme is based on exploitation of fire plume temporal signature, i.e., temperature
fluctuations over the observation period. The method is simple and relatively inexpensive to implement.
The false alarm rate is expected to be lower than that of existing methods. Land-based infrared (IR)
cameras are installed in a step-stare-mode configuration in potential fire-prone areas. The sequence of IR
video frames from each camera is digitally processed to determine if there is a fire within the camera's field
of view (FOV). The process involves applying a principal component transformation (PCT) to each non-overlapping
sequence of video frames from the camera to produce a corresponding sequence of
temporally-uncorrelated principal component (PC) images. Since pixels that form a fire plume exhibit
statistically similar temporal variation (i.e., have a unique temporal signature), PCT conveniently renders
the footprint/trace of the fire plume in low-order PC images. The PC image which best reveals the trace of
the fire plume is then selected and spatially filtered via simple threshold and median filter operations to
remove the background clutter, such as traces of moving tree branches due to wind.
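The PCT step can be sketched on a synthetic frame stack; the frame count, scene size, and the flicker model for the "fire" pixel below are invented for illustration:

```python
import numpy as np

# Sketch of the PCT step: stack N frames, treat each pixel's temporal
# profile as a variable, and project onto principal components. A synthetic
# "fire" pixel flickers strongly; the background is static plus noise.
rng = np.random.default_rng(1)
n_frames, h, w = 32, 8, 8
frames = rng.normal(20.0, 0.1, (n_frames, h, w))      # background clutter
frames[:, 4, 4] += 5.0 * np.sin(np.arange(n_frames))  # flickering fire pixel

X = frames.reshape(n_frames, -1)      # frames x pixels
X = X - X.mean(axis=0)                # remove each pixel's temporal mean
U, S, Vt = np.linalg.svd(X, full_matrices=False)
pc_images = Vt.reshape(-1, h, w)      # each row of Vt is one PC "image"

# The flickering pixel dominates the first PC image
trace = np.abs(pc_images[0])
print(np.unravel_index(trace.argmax(), trace.shape))  # (4, 4)
```

Because the fire pixel's temporal variance far exceeds the background noise, its footprint concentrates in a low-order PC image, which is then thresholded and median-filtered as described above.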
A license plate recognition (LPR) system helps alert relevant personnel to any vehicle passing through the surveillance area.
In order to test algorithms for license plate recognition, it is necessary to have input frames in which the ground truth is
determined. The purpose of ground-truth data here is to provide an absolute reference for performance evaluation or
training. However, annotating ground-truth data for real-life inputs is a tedious task because of the time-consuming
manual work involved. In this paper, we propose a method of semi-automatic ground-truth generation for license plate
recognition in video sequences. The method starts with region-of-interest detection to rapidly extract character lines,
followed by a license plate recognition system to verify the license plate regions and recognize the numbers. On top
of the LPR system, we incorporate a tracking-validation mechanism to detect the time interval of passing vehicles in
the input sequences. The tracking mechanism is initialized by a single license plate region in one frame. Moreover, in order
to tolerate variation in license plate appearance across the input sequences, the validator is updated by
capturing positive and negative samples during tracking. Experimental results show that the proposed method can
achieve promising results.
The phase perturbations due to propagation effects can destroy the high resolution imagery of Synthetic Aperture
Imaging Ladar (SAIL). Some autofocus algorithms for Synthetic Aperture Radar (SAR) were developed and
implemented. Phase Gradient Algorithm (PGA) is a well-known one for its robustness and wide application, and Phase
Curvature Algorithm (PCA) as a similar algorithm expands its applied field to strip map mode. In this paper the
autofocus algorithms utilized in optical frequency domain are proposed, including optical PGA and PCA respectively
implemented in spotlight and strip map mode. Firstly, the mathematical flows of optical PGA and PCA in SAIL are
derived. The simulation model of the airborne SAIL is established, and the compensation simulations of the synthetic
aperture laser images corrupted by the random errors, linear phase errors and quadratic phase errors are executed. The
compensation effect and the iteration count of the simulation are discussed. The simulation results show that both
optical autofocus algorithms are effective, while the optical PGA outperforms the optical PCA, which is consistent
with theory.
Recently, nonlinear correlation filters have been proposed for distortion-invariant pattern recognition. The design of the
filters is based on rank-order and logical operations and nonlinear correlation. These kinds of filters are robust to
non-Gaussian noise and non-homogeneous illumination. A drawback of nonlinear filters is their high computational cost;
however, the computation of nonlinear correlation can be parallelized. In this paper a hardware implementation of
nonlinear filtering is presented. The hardware coprocessor is based on a Field Programmable Gate Array (FPGA) device.
Simulation results are provided and discussed.
Various techniques have been proposed for image recovery from degraded observed images using a microscanning
imaging system. The methods deal with additive and multiplicative interference and sensor noise. Basically, they use several
observed images captured with small spatial shifts. In this paper, we analyze the tolerance of restoration methods to shift
errors during camera microscanning. Computer simulation results obtained with the restoration methods using degraded
images from an imperfect microscanning system are compared with those of an ideal microscanning system in terms of
restoration criteria.
Digital image processing methods represent a viable and well acknowledged alternative to strain gauges and
interferometric techniques for determining full-field displacements and strains in materials under stress. This
paper presents an image adaptive technique for dense motion and strain estimation using high-resolution speckle
images that show the analyzed material in its original and deformed states. The algorithm starts by dividing the
speckle image showing the original state into irregular cells taking into consideration both spatial and gradient
image information present. Subsequently the Newton-Raphson digital image correlation technique is applied
to calculate the corresponding motion for each cell. Adaptive spatial regularization in the form of the Geman-
McClure robust spatial estimator is employed to increase the spatial consistency of the motion components
of a cell with respect to the components of neighbouring cells. To obtain the final strain information, local
least-squares fitting using a linear displacement model is performed on the horizontal and vertical displacement
fields. To evaluate the presented image partitioning and strain estimation techniques two numerical and two real
experiments are employed. The numerical experiments simulate the deformation of a specimen with constant
strain across the surface as well as small rigid-body rotations present while real experiments consist specimens
that undergo uniaxial stress. The results indicate very good accuracy of the recovered strains as well as better
rotation insensitivity compared to classical techniques.
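The Geman-McClure regularization step can be sketched as follows (a minimal illustration assuming a scalar per-cell displacement grid; the function names and the single-pass update rule are ours, not the paper's):

```python
import numpy as np

def geman_mcclure(r, sigma=1.0):
    """Geman-McClure robust penalty: rho(r) = r^2 / (sigma^2 + r^2), bounded by 1."""
    r2 = np.asarray(r, dtype=float) ** 2
    return r2 / (sigma ** 2 + r2)

def regularize_motion(u, weight=0.5, sigma=1.0):
    """One smoothing pass over a grid of per-cell displacement components u (H x W).

    Each cell is pulled toward the mean of its 4-neighbours; the pull is
    down-weighted by the Geman-McClure penalty of the residual, so cells
    that disagree strongly with their neighbours (genuine motion
    discontinuities) are barely moved.
    """
    u = np.asarray(u, dtype=float)
    padded = np.pad(u, 1, mode='edge')
    nbr_mean = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
    r = u - nbr_mean
    influence = 1.0 - geman_mcclure(r, sigma)  # ~1 for small residuals, -> 0 for outliers
    return u - weight * influence * r
```

Iterating such a pass increases the spatial consistency of a cell's motion with its neighbours without blurring true discontinuities.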
In this paper, we propose a novel method for correcting a 2D calibration target. First, we capture
multiple images of the inaccurate calibration target from multiple views and locate the coordinates of
the circular landmarks in these images. Second, homonymous landmarks in different images are
matched by a scheme based on a special topological relation. Third, the 3D coordinates of the
landmarks are accurately reconstructed, up to a scale factor, using a bundle adjustment strategy.
Finally, the scale is computed from an accurately known distance between two landmarks, and the
true landmark coordinates are obtained by multiplying by this scale. The experimental results validate
that our method is efficient, high-precision, low-cost and easy to implement, and can be widely
applied in vision measurement and system calibration.
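The final rescaling step reduces to a few lines; the sketch below (with assumed names and inputs) rescales bundle-adjusted landmarks so that one known inter-landmark distance becomes metric:

```python
import numpy as np

def apply_metric_scale(points_3d, idx_a, idx_b, known_distance):
    """Rescale bundle-adjusted landmarks (known only up to scale) to metric units.

    points_3d      : (N, 3) reconstructed landmark coordinates
    idx_a, idx_b   : indices of two landmarks with a known true separation
    known_distance : accurately measured distance between those landmarks
    Returns the rescaled coordinates and the scale factor itself.
    """
    pts = np.asarray(points_3d, dtype=float)
    reconstructed = np.linalg.norm(pts[idx_a] - pts[idx_b])
    scale = known_distance / reconstructed
    return pts * scale, scale
```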
Applying spectral-domain optical coherence tomography (SD-OCT) to image the whole eye segment is desirable
for practical clinical application, but the imaging depth of SD-OCT is limited by the spectral resolution of the
spectrometer, and no result on whole-eye-segment SD-OCT imaging has been reported so far. In our study, a new
dual-channel, dual-focus OCT system is adopted to image the whole eye segment. The cornea and the crystalline
lens are imaged simultaneously using full-range complex spectral-domain OCT in one channel, while the retina is
detected by the other. The new system was successfully tested by imaging a volunteer's eye in vivo. The
preliminary results presented in this paper demonstrate the feasibility of this approach.
In this paper, we present an efficient and robust lane marker detection algorithm using the log-polar transform and
random sample consensus (RANSAC). To extract the optimal lane marker points, we first set regions of interest
(ROIs) with variable block size and perform the preprocessing steps within the ROIs. Then, to fit the lane model,
these points are transformed into log-polar space and the RANSAC curve fitting algorithm is used to detect the
exact lane markers of the road. Experimental results on various real roads are presented to evaluate the
effectiveness of the proposed algorithm.
The purpose of this work is to classify and quantify synapses and their properties in cultures of mouse
hippocampus, from images acquired with a fluorescence microscope. Quantified features include the number of
synapses, their intensities and their size characteristics. The images obtained with the microscope contain hundreds to
several thousands of synapses with various elliptic-like shapes and intensities. They also include other
features, such as glia cells and other biological objects beyond the focal plane, which reduce the visibility of the
synapses and hinder the segmentation process. The proposed method comprises several steps: background
subtraction, identification of suspected synapse centers as local maxima of small neighborhoods, evaluation of the
tendency of objects to be synapses according to intensity properties in their larger neighborhoods, classification of
detected synapses into bulks or single synapses and, finally, delineation of the borders of each synapse.
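The first two steps, background subtraction and local-maximum center detection, can be sketched as below (an illustrative simplification assuming SciPy and a local-mean background model; the paper's actual estimator may differ):

```python
import numpy as np
from scipy.ndimage import maximum_filter, uniform_filter

def detect_candidate_centers(img, background_size=15, peak_size=3, min_intensity=0.1):
    """Find suspected synapse centres as local maxima after background subtraction."""
    img = np.asarray(img, dtype=float)
    # crude background estimate: large-window local mean
    background = uniform_filter(img, size=background_size)
    residual = img - background
    # a pixel is a candidate if it equals the max of its small neighbourhood
    peaks = residual == maximum_filter(residual, size=peak_size)
    peaks &= residual > min_intensity
    return np.argwhere(peaks)
```

Subsequent classification and border delineation then operate on the neighbourhoods of the returned candidate centres.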
We propose an efficient approach for improving the localization of edges detected in remotely sensed imagery.
This work is based on the observation that the partial derivatives of individual image components used for vector
gradient computation often yield thick edges; consequently, suppressing them so that only contributions at their
local scalar gradient maxima remain, before they enter the vector field gradient calculation, can yield significantly
better-localized edges in the final edge map. Our approach was tested on several remotely sensed multispectral and
hyperspectral datasets with favorable results.
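The core idea, keeping only locally maximal per-band gradient responses before the vector-field computation, can be illustrated in one dimension (a sketch; the actual method operates on 2-D multiband gradients):

```python
import numpy as np

def suppress_to_local_maxima(grad):
    """Keep only values that are local maxima of |grad| along the axis.

    Thick multi-pixel gradient responses are thinned so that each band
    contributes only its locally maximal response to the subsequent
    vector-field gradient computation.
    """
    g = np.abs(np.asarray(grad, dtype=float))
    padded = np.pad(g, 1, mode='constant')
    keep = (g >= padded[:-2]) & (g >= padded[2:]) & (g > 0)
    return np.where(keep, grad, 0.0)
```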
The properties of wide-field astronomical systems, along with the specific visual content of astronomical images, make
the evaluation of acquired image data complicated. The main goal of this paper is to present advanced processing of
images obtained from wide-field astronomical systems and to show how the accuracy of astronomical measurements
on these systems can be enhanced. The paper also deals with the modelling of space-variant high-order optical
aberrations, which increase towards the margins of the field of view, distort the point spread function of the optical
system and negatively affect image quality.
This paper proposes a scheme for finding the correspondence between uniformly spaced locations on images of a
human face captured from different viewpoints at the same instant. The correspondence is intended for 3D
reconstruction in the registration procedure for neurosurgery, where exposure to projectors must be
seriously restricted. The approach utilizes structured light to enhance patterns on the images and is initialized with the
scale-invariant feature transform (SIFT). Successive locations are found according to spatial order using a parallel
version of the particle swarm optimization algorithm. Furthermore, false locations are singled out for correction by
searching for outliers from fitted curves. Case studies show that the scheme is able to correctly generate 456 evenly
spaced 3D coordinate points in 23 seconds from a single shot of a projected human face using a PC with a 2.66 GHz
Intel Q9400 CPU and 4 GB RAM.
We present a novel unsupervised method for facial recognition using hyperspectral imaging and decision fusion. In
previous work we separately investigated the use of spectra matching and image-based matching. In spectra
matching, face spectra are classified based on spectral similarities; in image-based matching, we investigated
various approaches based on orthogonal subspaces (such as PCA and OSP). In the current work we provide an
automated unsupervised method that starts by detecting the face in the image and then performs both
spectral and image-based matching. The results are fused into a single classification decision. The algorithm is tested on an
experimental hyperspectral image database of 17 subjects, each with five different facial expressions and viewing angles.
Our results show that decision fusion improves recognition accuracy compared to the individual
approaches as well as to recognition based on regular imaging.
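A minimal score-level fusion of the two matchers might look like the following (an illustrative sketch; the normalization and weighting scheme are assumptions, not the paper's exact rule):

```python
import numpy as np

def fuse_scores(spectral_scores, image_scores, w_spectral=0.5):
    """Fuse two per-subject similarity score vectors into one decision.

    Scores are min-max normalised so the two matchers are comparable,
    then combined as a weighted sum; the fused identity is the argmax.
    """
    def normalise(s):
        s = np.asarray(s, dtype=float)
        rng = s.max() - s.min()
        return (s - s.min()) / rng if rng > 0 else np.zeros_like(s)

    fused = (w_spectral * normalise(spectral_scores)
             + (1.0 - w_spectral) * normalise(image_scores))
    return int(np.argmax(fused)), fused
```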
Segmentation of the retinal nerve fiber layer (RNFL) from swept source polarization-sensitive optical coherence
tomography (SS-PSOCT) images is required to determine RNFL thickness and calculate birefringence. Traditional
RNFL segmentation methods based on image processing and boundary detection algorithms utilize only optical
reflectivity contrast information, which is strongly affected by speckle noise. We present a novel approach to segmenting
the RNFL that uses both the optical reflectivity and the phase retardation
information. The RNFL anterior boundary is detected based on optical reflectivity change due to refractive index
difference between the vitreous and inner limiting membrane. The posterior boundary of the RNFL is a transition zone
composed of birefringent axons extending from retinal ganglion cells and may be detected by a change in birefringence.
A posterior boundary detection method is presented that segments the RNFL by minimizing the uncertainty of RNFL
birefringence determined by a Levenberg-Marquardt nonlinear fitting algorithm. Clinical results from a healthy
volunteer show that the proposed segmentation method estimates RNFL birefringence and phase retardation with lower
uncertainty and higher continuity than traditional intensity-based approaches.
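The birefringence fit with its uncertainty can be sketched with SciPy's Levenberg-Marquardt-based `curve_fit` (a simplified single-pass linear retardation model is assumed here; the paper's model and units may differ):

```python
import numpy as np
from scipy.optimize import curve_fit

def fit_birefringence(depth_um, retardation_deg, wavelength_um=1.06):
    """Fit cumulative phase retardation vs. depth; return birefringence and its uncertainty.

    Assumed single-pass model for this sketch:
        retardation(z) = 360 * dn * z / wavelength   [degrees]
    curve_fit uses Levenberg-Marquardt for this unbounded problem; the
    diagonal of the covariance matrix gives the variance of the fitted dn.
    """
    model = lambda z, dn: 360.0 * dn * z / wavelength_um
    popt, pcov = curve_fit(model, np.asarray(depth_um, dtype=float),
                           np.asarray(retardation_deg, dtype=float), p0=[1e-3])
    dn = popt[0]
    dn_sigma = float(np.sqrt(pcov[0, 0]))
    return dn, dn_sigma
```

Minimizing `dn_sigma` over candidate posterior boundaries is one way to realize the uncertainty-based segmentation criterion described above.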
In this paper we present current progress in the development of new observational instruments for the double-station
video experiment MAIA (Meteor Automatic Imager and Analyzer). The main goal of the MAIA project is to monitor
the activity of meteor showers and sporadic meteors. This paper presents a detailed analysis of imaging parameters
based on the acquisition of test video sequences under different light conditions. Among the most important results
are the analysis of the opto-electronic conversion function and the noise characteristics. Based on these results,
requirements for image preprocessing algorithms are proposed.
In this paper we present current progress in the project DEIMOS (Database of Images: Open Source). The DEIMOS
database is an open-source database of images and videos for testing, verification and comparison of various image
and/or video processing techniques. This paper additionally presents measured camera data available with
high-dynamic-range image content, and describes the stereoscopic content available in the database. The database of
stereoscopic images with various acquisition and image processing parameters is intended for testing and
optimization of metrics for objective image quality assessment. An example experiment on perceived image quality
assessment depending on particular test conditions in stereoscopic image acquisition is presented. The database will
be gradually annotated with mean opinion scores of perceived image quality from human observers for each test
condition.
A novel approach to automatically detecting vehicles in road tunnels is presented in this paper. Non-uniform and
poor illumination conditions prevail in road tunnels, making robust vehicle detection difficult. To cope with the
illumination issues, we propose a local higher-order statistic filter that makes the vehicle detection invariant to
illumination changes, while a morphological background subtraction is used to generate a convex-hull segmentation
of the vehicles. An evaluation comparing our approach with a benchmark object detector shows that our approach
performs better in terms of false detection rate and overlap-area detection.
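A local higher-order statistic such as the fourth standardized moment is invariant to scaling and offsetting of the intensities, which is the property that confers illumination invariance. A sketch (the window size and the specific statistic are assumptions, not the paper's exact filter):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_kurtosis(img, size=5, eps=1e-8):
    """Local fourth standardised moment over a sliding window.

    Being invariant to affine intensity changes (a*I + b), it responds to
    local structure rather than absolute brightness, so its output is
    stable under the non-uniform illumination found in tunnels.
    """
    win = sliding_window_view(np.asarray(img, dtype=float), (size, size))
    mu = win.mean(axis=(-1, -2), keepdims=True)
    var = win.var(axis=(-1, -2), keepdims=True)
    m4 = ((win - mu) ** 4).mean(axis=(-1, -2))
    return m4 / (var[..., 0, 0] ** 2 + eps)
```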
Fluorescence is the process by which an organism or dye, excited by UV light (200-405 nm), emits light at a specific
frequency, usually in the visible or near-infrared range (405-900 nm). During UV irradiation, a photosensitive agent is
induced to start a photochemical reaction. The fluorescence image can be used for fluorescence diagnosis, after which
photodynamic therapy can be applied to dental diseases and skin cancer; this has become a useful tool for providing
scientific evidence in much biomedical research. However, most methods of acquiring fluorescent biological traces
remain at a primitive stage, relying on observation by the naked eye and the researcher's subjective judgment. This
article presents a portable camera that captures the fluorescence image and compensates for the limitations of
observer competence and subjective judgment. The portable camera provides a 375 nm UV-LED excitation light
source, allowing the user to record fluorescence images and making the recorded images persuasive scientific
evidence. In addition, to raise the signal-to-noise ratio, the signal processing module not only amplifies the
fluorescence signal by up to 70%, but also significantly decreases the noise from environmental light, as demonstrated
in banknote and nude-mouse tests.
Compressed sensing (CS) theory has recently gained considerable attention in signal processing. The sparsity of
transform coefficients was widely employed in early CS recovery techniques. Beyond sparsity, however, there are
other priors on transform coefficients, such as the tree structure and statistical dependencies, that can be exploited in
CS reconstruction. In this paper, we propose to introduce the Gaussian Scale Mixture (GSM) model into the
tree-structure-based Orthogonal Matching Pursuit (TSOMP) reconstruction algorithm. The GSM model efficiently
captures the statistical dependencies between wavelet coefficients, and these dependencies improve the accuracy of
the tree-structure subspace search in the TSOMP algorithm. When both the inter-scale dependencies of the
coefficients (via the GSM model) and their intra-scale dependencies (via the tree structure) are combined in the
Orthogonal Matching Pursuit reconstruction, the noise and instability of TSOMP reconstruction are greatly reduced.
The proposed method is compared with several state-of-the-art methods. Experimental results show that it improves
reconstruction accuracy for a given number of measurements and recovers more image detail.
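The baseline Orthogonal Matching Pursuit that TSOMP and the proposed method extend can be sketched as follows (plain OMP only; the tree-structure subspace search and the GSM prior are not shown):

```python
import numpy as np

def omp(A, y, n_nonzero):
    """Plain Orthogonal Matching Pursuit.

    Greedily selects the column of A most correlated with the current
    residual, then re-solves least squares on the selected support, so
    the residual stays orthogonal to all chosen atoms.
    """
    m, n = A.shape
    residual = np.asarray(y, dtype=float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        idx = int(np.argmax(np.abs(A.T @ residual)))
        if idx not in support:
            support.append(idx)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(n)
    x[support] = coef
    return x
```

TSOMP replaces the per-atom selection with a search over tree-structured subspaces; the GSM model then reweights that search with inter-scale coefficient statistics.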
A method to reliably extract object profiles even in the presence of height discontinuities (which lead to 2nπ phase
jumps) is proposed. Along with the images needed to extract the wrapped phase, the proposed method uses an
additional image, formed by illuminating the object of interest with a gray-coded pattern, for phase unwrapping.
Theory and experiments suggest that the proposed approach not only retains the advantages of the original method,
but also contributes significantly to enhancing its performance.
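Once the gray-coded image is decoded into a per-pixel fringe order, the unwrapping itself is purely pixel-wise; a sketch (the decoder and interface are illustrative assumptions):

```python
import numpy as np

def gray_to_binary(g):
    """Decode a Gray-coded integer to its natural binary value."""
    g = int(g)
    b = g
    while g := g >> 1:  # XOR in successive right-shifts
        b ^= g
    return b

def unwrap_with_fringe_order(wrapped, fringe_order):
    """Absolute phase from wrapped phase plus a decoded per-pixel fringe order k.

    absolute = wrapped + 2*pi*k. Because no neighbouring pixels are
    consulted, height discontinuities cannot propagate unwrapping errors.
    """
    return np.asarray(wrapped, dtype=float) + 2.0 * np.pi * np.asarray(fringe_order, dtype=int)
```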
A side-illumination-based approach capable of assigning fringe orders in moiré topography is proposed. Along with
the three basic components used to generate moiré fringes (camera/observer, grating and light source), the current
proposal uses an additional light source that rules the object, parallel to the grating plane, with colored bands. This
article deals with the theory, simulations and experimental verification of the proposed method. The analysis is useful
not only for assigning fringe orders to the moiré phase map, but also for reliably extracting the object surface profile
in the presence of surface discontinuities (which lead to 2Nπ phase jumps).
We describe a tool, the Advanced Exposure Time Calculator (AETC), to simulate images of astrophysical objects
obtained with any combination of telescope, instrument and pass band, using a suitable set of parameters that define
the configuration of the equipment used for the observations. Through the proper definition of the PSF, the tool
provides count rates and their distribution over the focal plane for virtually any telescope equipped with an imager,
provided its configuration is specified. Effects of a non-uniform PSF over the field of view can also be modeled.
Moreover, observed fields including stars, galaxies and more complex objects can be simulated in detail by providing
templates of the targets. The tool is available at: http://aetc.oapd.inaf.it/
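A count-rate computation of the kind such a calculator performs can be sketched as follows (a simplified AB-magnitude photon-rate estimate; the actual AETC parameterization is certainly richer, and the effective frequency and bandwidth here are assumed inputs):

```python
import math

H_PLANCK = 6.626e-34  # J s

def count_rate(mag_ab, area_m2, throughput, nu_eff_hz, bandwidth_hz):
    """Approximate detected photo-electron rate (e-/s) for an AB-magnitude point source.

    Converts the AB flux density (3631 Jy at mag 0) to a photon rate
    through a band of width bandwidth_hz at effective frequency nu_eff_hz,
    scaled by collecting area and total system throughput.
    """
    f_nu = 3631e-26 * 10 ** (-0.4 * mag_ab)   # W m^-2 Hz^-1
    photon_energy = H_PLANCK * nu_eff_hz      # J per photon
    return f_nu * area_m2 * bandwidth_hz * throughput / photon_energy
```

Distributing this rate over the focal plane according to the PSF then yields the simulated image.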
AIDA (Astronomical Image Decomposition and Analysis) is a software package originally developed to analyze images
of galaxies with a bright nucleus and to perform the decomposition into the nuclear and the galaxy components. The
package is able to perform photometrical and morphological study of faint galaxies as well as standard photometry of
stellar fields. With the use of graphical interfaces, AIDA interactively assists the user in selecting sources and preparing
them for the analysis. Since the decomposition into the galactic and nuclear components, in particular in the case where
the nucleus is dominant, requires a careful characterization of the PSF, AIDA has been designed to manage complex 2-D
models (both analytical and empirical, or combinations of them), even variable across the field of view, making it suitable
for Adaptive Optics observations. PSF models can be provided by the user or modeled by AIDA itself using reference
stars in the images. Relevant parameters of the target sources are then extracted by fitting source models convolved with
the PSF.
In addition to the standard (interactive) mode, AIDA can also perform automatic processing of a large number of
images, extracting the PSF model from each image and evaluating source parameters for targets (stars, galaxies, AGN or
QSO) by model fitting. With this automatic mode, AIDA can process, in a fully automatic way, large datasets of targets
distributed in several images.
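The PSF-model-fitting step can be illustrated with a circular 2-D Gaussian model (a sketch only; AIDA supports far more complex, field-variable analytical and empirical models):

```python
import numpy as np
from scipy.optimize import curve_fit

def fit_gaussian_psf(stamp):
    """Fit a circular 2-D Gaussian PSF model to a star stamp.

    Returns the fitted amplitude, centre and width. In a real pipeline
    the model fitted to reference stars would then be convolved with the
    source model when extracting target parameters.
    """
    h, w = stamp.shape
    yy, xx = np.mgrid[0:h, 0:w]

    def model(coords, amp, x0, y0, sigma):
        x, y = coords
        return amp * np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2.0 * sigma ** 2))

    p0 = [float(stamp.max()), w / 2.0, h / 2.0, 2.0]
    popt, _ = curve_fit(model, (xx.ravel(), yy.ravel()), stamp.ravel(), p0=p0)
    return dict(zip(("amp", "x0", "y0", "sigma"), popt))
```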