Gør som tusindvis af andre bogelskere
Tilmeld dig nyhedsbrevet og få gode tilbud og inspiration til din næste læsning.
Ved tilmelding accepterer du vores persondatapolitik.Du kan altid afmelde dig igen.
This book focuses on a class of single-channel noise reduction methods that are performed in the frequency domain via the short-time Fourier transform (STFT). The simplicity and relative effectiveness of this class of approaches make them the dominant choice in practical systems. Even though many popular algorithms have been proposed through more than four decades of continuous research, there are a number of critical areas where our understanding and capabilities still remain quite rudimentary, especially with respect to the relationship between noise reduction and speech distortion. All existing frequency-domain algorithms, no matter how they are developed, have one feature in common: the solution is eventually expressed as a gain function applied to the STFT of the noisy signal only in the current frame. As a result, the narrowband signal-to-noise ratio (SNR) cannot be improved, and any gains achieved in noise reduction on the fullband basis come with a price to pay, which is speechdistortion. In this book, we present a new perspective on the problem by exploiting the difference between speech and typical noise in circularity and interframe self-correlation, which were ignored in the past. By gathering the STFT of the microphone signal of the current frame, its complex conjugate, and the STFTs in the previous frames, we construct several new, multiple-observation signal models similar to a microphone array system: there are multiple noisy speech observations, and their speech components are correlated but not completely coherent while their noise components are presumably uncorrelated. Therefore, the multichannel Wiener filter and the minimum variance distortionless response (MVDR) filter that were usually associated with microphone arrays will be developed for single-channel noise reduction in this book. This might instigate a paradigm shift geared toward speech distortionless noise reduction techniques. Table of Contents: Introduction / Problem Formulation / Performance Measures / Linear and Widely Linear Models / Optimal Filters with Model 1 / Optimal Filters with Model 2 / Optimal Filters with Model 3 / Optimal Filters with Model 4 / Experimental Study
This book introduces the theory, algorithms, and implementation techniques for efficient decoding in speech recognition mainly focusing on the Weighted Finite-State Transducer (WFST) approach. The decoding process for speech recognition is viewed as a search problem whose goal is to find a sequence of words that best matches an input speech signal. Since this process becomes computationally more expensive as the system vocabulary size increases, research has long been devoted to reducing the computational cost. Recently, the WFST approach has become an important state-of-the-art speech recognition technology, because it offers improved decoding speed with fewer recognition errors compared with conventional methods. However, it is not easy to understand all the algorithms used in this framework, and they are still in a black box for many people. In this book, we review the WFST approach and aim to provide comprehensive interpretations of WFST operations and decoding algorithms to help anyone who wants to understand, develop, and study WFST-based speech recognizers. We also mention recent advances in this framework and its applications to spoken language processing. Table of Contents: Introduction / Brief Overview of Speech Recognition / Introduction to Weighted Finite-State Transducers / Speech Recognition by Weighted Finite-State Transducers / Dynamic Decoders with On-the-fly WFST Operations / Summary and Perspective
Micro-videos, a new form of user-generated contents, have been spreading widely across various social platforms, such as Vine, Kuaishou, and Tik Tok. Different from traditional long videos, micro-videos are usually recorded by smart mobile devices at any place within a few seconds. Due to its brevity and low bandwidth cost, micro-videos are gaining increasing user enthusiasm. The blossoming of micro-videos opens the door to the possibility of many promising applications, ranging from network content caching to online advertising. Thus, it is highly desirable to develop an effective scheme for the high-order micro-video understanding.Micro-video understanding is, however, non-trivial due to the following challenges:(1) how to represent micro-videos that only convey one or few high-level themes or concepts; (2) how to utilize the hierarchical structure of the venue categories to guide the micro-video analysis; (3) how to alleviate the influence of low-quality caused by complex surrounding environments and the camera shake; (4) how to model the multimodal sequential data, {i.e.}, textual, acoustic, visual, and social modalities, to enhance the micro-video understanding; and (5) how to construct large-scale benchmark datasets for the analysis? These challenges have been largely unexplored to date. In this book, we focus on addressing the challenges presented above by proposing some state-of-the-art multimodal learning theories. To demonstrate the effectiveness of these models, we apply them to three practical tasks of micro-video understanding: popularity prediction, venue category estimation, and micro-video routing. Particularly, we first build three large-scale real-world micro-video datasets for these practical tasks. We then present a multimodal transductive learning framework for micro-video popularity prediction. Furthermore, we introduce several multimodal cooperative learning approaches and a multimodal transfer learning scheme for micro-video venue category estimation. Meanwhile, we develop a multimodal sequential learning approach for micro-video recommendation. Finally, we conclude the book and figure out the future research directions in multimodal learning toward micro-video understanding.
Current vision systems are designed to perform in normal weather condition. However, no one can escape from severe weather conditions. Bad weather reduces scene contrast and visibility, which results in degradation in the performance of various computer vision algorithms such as object tracking, segmentation and recognition. Thus, current vision systems must include some mechanisms that enable them to perform up to the mark in bad weather conditions such as rain and fog. Rain causes the spatial and temporal intensity variations in images or video frames. These intensity changes are due to the random distribution and high velocities of the raindrops. Fog causes low contrast and whiteness in the image and leads to a shift in the color. This book has studied rain and fog from the perspective of vision. The book has two main goals: 1) removal of rain from videos captured by a moving and static camera, 2) removal of the fog from images and videos captured by a moving single uncalibrated camera system. The book begins with a literature survey. Pros and cons of the selected prior art algorithms are described, and a general framework for the development of an efficient rain removal algorithm is explored. Temporal and spatiotemporal properties of rain pixels are analyzed and using these properties, two rain removal algorithms for the videos captured by a static camera are developed. For the removal of rain, temporal and spatiotemporal algorithms require fewer numbers of consecutive frames which reduces buffer size and delay. These algorithms do not assume the shape, size and velocity of raindrops which make it robust to different rain conditions (i.e., heavy rain, light rain and moderate rain). In a practical situation, there is no ground truth available for rain video. Thus, no reference quality metric is very useful in measuring the efficacy of the rain removal algorithms. Temporal variance and spatiotemporal variance are presented in this book as no reference quality metrics. An efficient rain removal algorithm using meteorological properties of rain is developed. The relation among the orientation of the raindrops, wind velocity and terminal velocity is established. This relation is used in the estimation of shape-based features of the raindrop. Meteorological property-based features helped to discriminate the rain and non-rain pixels. Most of the prior art algorithms are designed for the videos captured by a static camera. The use of global motion compensation with all rain removal algorithms designed for videos captured by static camera results in better accuracy for videos captured by moving camera. Qualitative and quantitative results confirm that probabilistic temporal, spatiotemporal and meteorological algorithms outperformed other prior art algorithms in terms of the perceptual quality, buffer size, execution delay and system cost. The work presented in this book can find wide application in entertainment industries, transportation, tracking and consumer electronics. Table of Contents: Acknowledgments / Introduction / Analysis of Rain / Dataset and Performance Metrics / Important Rain Detection Algorithms / Probabilistic Approach for Detection and Removal of Rain / Impact of Camera Motion on Detection of Rain / Meteorological Approach for Detection and Removal of Rain from Videos / Conclusion and Scope of Future Work / Bibliography / Authors' Biographies
Every year lives and properties are lost in road accidents. About one-fourth of these accidents are due to low vision in foggy weather. At present, there is no algorithm that is specifically designed for the removal of fog from videos. Application of a single-image fog removal algorithm over each video frame is a time-consuming and costly affair. It is demonstrated that with the intelligent use of temporal redundancy, fog removal algorithms designed for a single image can be extended to the real-time video application. Results confirm that the presented framework used for the extension of the fog removal algorithms for images to videos can reduce the complexity to a great extent with no loss of perceptual quality. This paves the way for the real-life application of the video fog removal algorithm. In order to remove fog, an efficient fog removal algorithm using anisotropic diffusion is developed. The presented fog removal algorithm uses new dark channel assumption and anisotropic diffusion for the initialization and refinement of the airlight map, respectively. Use of anisotropic diffusion helps to estimate the better airlight map estimation. The said fog removal algorithm requires a single image captured by uncalibrated camera system. The anisotropic diffusion-based fog removal algorithm can be applied in both RGB and HSI color space. This book shows that the use of HSI color space reduces the complexity further. The said fog removal algorithm requires pre- and post-processing steps for the better restoration of the foggy image. These pre- and post-processing steps have either data-driven or constant parameters that avoid the user intervention. Presented fog removal algorithm is independent of the intensity of the fog, thus even in the case of the heavy fog presented algorithm performs well. Qualitative and quantitative results confirm that the presented fog removal algorithm outperformed previous algorithms in terms of perceptual quality, color fidelity and execution time. The work presented in this book can find wide application in entertainment industries, transportation, tracking and consumer electronics.
Image understanding has been playing an increasingly crucial role in several inverse problems and computer vision. Sparse models form an important component in image understanding, since they emulate the activity of neural receptors in the primary visual cortex of the human brain. Sparse methods have been utilized in several learning problems because of their ability to provide parsimonious, interpretable, and efficient models. Exploiting the sparsity of natural signals has led to advances in several application areas including image compression, denoising, inpainting, compressed sensing, blind source separation, super-resolution, and classification. The primary goal of this book is to present the theory and algorithmic considerations in using sparse models for image understanding and computer vision applications. To this end, algorithms for obtaining sparse representations and their performance guarantees are discussed in the initial chapters. Furthermore, approaches for designing overcomplete, data-adapted dictionaries to model natural images are described. The development of theory behind dictionary learning involves exploring its connection to unsupervised clustering and analyzing its generalization characteristics using principles from statistical learning theory. An exciting application area that has benefited extensively from the theory of sparse representations is compressed sensing of image and video data. Theory and algorithms pertinent to measurement design, recovery, and model-based compressed sensing are presented. The paradigm of sparse models, when suitably integrated with powerful machine learning frameworks, can lead to advances in computer vision applications such as object recognition, clustering, segmentation, and activity recognition. Frameworks that enhance the performance of sparse models in such applications by imposing constraints based on the prior discriminatory information and the underlying geometrical structure, and kernelizing the sparse coding and dictionary learning methods are presented. In addition to presenting theoretical fundamentals in sparse learning, this book provides a platform for interested readers to explore the vastly growing application domains of sparse representations.
This book explains the stages necessary to create a wavelet compression system for images and describes state-of-the-art systems used in image compression standards and current research. It starts with a high level discussion of the properties of the wavelet transform, especially the decomposition into multi-resolution subbands. It continues with an exposition of the null-zone, uniform quantization used in most subband coding systems and the optimal allocation of bitrate to the different subbands. Then the image compression systems of the FBI Fingerprint Compression Standard and the JPEG2000 Standard are described in detail. Following that, the set partitioning coders SPECK and SPIHT, and EZW are explained in detail and compared via a fictitious wavelet transform in actions and number of bits coded in a single pass in the top bit plane. The presentation teaches that, besides producing efficient compression, these coding systems, except for the FBI Standard, are capable of writing bit streams that have attributes of rate scalability, resolution scalability, and random access decoding. Many diagrams and tables accompany the text to aid understanding. The book is generous in pointing out references and resources to help the reader who wishes to expand his knowledge, know the origins of the methods, or find resources for running the various algorithms or building his own coding system. Table of Contents: Introduction / Characteristics of the Wavelet Transform / Generic Wavelet-based Coding Systems / The FBI Fingerprint Image Compression Standard / Set Partition Embedded Block (SPECK) Coding / Tree-based Wavelet Transform Coding Systems / Rate Control for Embedded Block Coders / Conclusion
This book presents an overview of the guidelines and strategies for transitioning an image or video processing algorithm from a research environment into a real-time constrained environment. Such guidelines and strategies are scattered in the literature of various disciplines including image processing, computer engineering, and software engineering, and thus have not previously appeared in one place. By bringing these strategies into one place, the book is intended to serve the greater community of researchers, practicing engineers, industrial professionals, who are interested in taking an image or video processing algorithm from a research environment to an actual real-time implementation on a resource constrained hardware platform. These strategies consist of algorithm simplifications, hardware architectures, and software methods. Throughout the book, carefully selected representative examples from the literature are presented to illustrate the discussed concepts. After reading the book, the readers are exposed to a wide variety of techniques and tools, which they can then employ to design a real-time image or video processing system.
This lecture describes the author's approach to the representation of color spaces and their use for color image processing. The lecture starts with a precise formulation of the space of physical stimuli (light). The model includes both continuous spectra and monochromatic spectra in the form of Dirac deltas. The spectral densities are considered to be functions of a continuous wavelength variable. This leads into the formulation of color space as a three-dimensional vector space, with all the associated structure. The approach is to start with the axioms of color matching for normal human viewers, often called Grassmann's laws, and developing the resulting vector space formulation. However, once the essential defining element of this vector space is identified, it can be extended to other color spaces, perhaps for different creatures and devices, and dimensions other than three. The CIE spaces are presented as main examples of color spaces. Many properties of the color space are examined. Once the vector space formulation is established, various useful decompositions of the space can be established. The first such decomposition is based on luminance, a measure of the relative brightness of a color. This leads to a direct-sum decomposition of color space where a two-dimensional subspace identifies the chromatic attribute, and a third coordinate provides the luminance. A different decomposition involving a projective space of chromaticity classes is then presented. Finally, it is shown how the three types of color deficiencies present in some groups of humans leads to a direct-sum decomposition of three one-dimensional subspaces that are associated with the three types of cone photoreceptors in the human retina. Next, a few specific linear and nonlinear color representations are presented. The color spaces of two digital cameras are also described. Then the issue of transformations between different color spaces is addressed. Finally, these ideas are applied to signal and system theory for color images. This is done using a vector signal approach where a general linear system is represented by a three-by-three system matrix. The formulation is applied to both continuous and discrete space images, and specific problems in color filter array sampling and displays are presented for illustration. The book is mainly targeted to researchers and graduate students in fields of signal processing related to any aspect of color imaging.
This book introduces basic machine learning concepts and applications for a broad audience that includes students, faculty, and industry practitioners. We begin by describing how machine learning provides capabilities to computers and embedded systems to learn from data. A typical machine learning algorithm involves training, and generally the performance of a machine learning model improves with more training data. Deep learning is a sub-area of machine learning that involves extensive use of layers of artificial neural networks typically trained on massive amounts of data. Machine and deep learning methods are often used in contemporary data science tasks to address the growing data sets and detect, cluster, and classify data patterns. Although machine learning commercial interest has grown relatively recently, the roots of machine learning go back to decades ago. We note that nearly all organizations, including industry, government, defense, and health, are using machine learning to address a variety of needs and applications. The machine learning paradigms presented can be broadly divided into the following three categories: supervised learning, unsupervised learning, and semi-supervised learning. Supervised learning algorithms focus on learning a mapping function, and they are trained with supervision on labeled data. Supervised learning is further sub-divided into classification and regression algorithms. Unsupervised learning typically does not have access to ground truth, and often the goal is to learn or uncover the hidden pattern in the data. Through semi-supervised learning, one can effectively utilize a large volume of unlabeled data and a limited amount of labeled data to improve machine learning model performances. Deep learning and neural networks are also covered in this book. Deep neural networks have attracted a lot of interest during the last ten years due to the availability of graphics processing units (GPU) computational power, big data, and new software platforms. They have strong capabilities in terms of learning complex mapping functions for different types of data. We organize the book as follows. The book starts by introducing concepts in supervised, unsupervised, and semi-supervised learning. Several algorithms and their inner workings are presented within these three categories. We then continue with a brief introduction to artificial neural network algorithms and their properties. In addition, we cover an array of applications and provide extensive bibliography. The book ends with a summary of the key machine learning concepts.
Orthogonal Frequency Division Multiplexing (OFDM) systems are widely used in the standards for digital audio/video broadcasting, WiFi and WiMax. Being a frequency-domain approach to communications, OFDM has important advantages in dealing with the frequency-selective nature of high data rate wireless communication channels. As the needs for operating with higher data rates become more pressing, OFDM systems have emerged as an effective physical-layer solution. This short monograph is intended as a tutorial which highlights the deleterious aspects of the wireless channel and presents why OFDM is a good choice as a modulation that can transmit at high data rates. The system-level approach we shall pursue will also point out the disadvantages of OFDM systems especially in the context of peak to average ratio, and carrier frequency synchronization. Finally, simulation of OFDM systems will be given due prominence. Simple MATLAB programs are provided for bit error rate simulation using a discrete-time OFDM representation. Software is also provided to simulate the effects of inter-block-interference, inter-carrier-interference and signal clipping on the error rate performance. Different components of the OFDM system are described, and detailed implementation notes are provided for the programs. The program can be downloaded here. Table of Contents: Introduction / Modeling Wireless Channels / Baseband OFDM System / Carrier Frequency Offset / Peak to Average Power Ratio / Simulation of the Performance of OFDM Systems / Conclusions
It is well known that speckle is a multiplicative noise that degrades image and video quality and the visual expert's evaluation in ultrasound imaging and video. This necessitates the need for robust despeckling image and video techniques for both routine clinical practice and tele-consultation. The goal for this book (book 1 of 2 books) is to introduce the problem of speckle occurring in ultrasound image and video as well as the theoretical background (equations), the algorithmic steps, and the MATLABTM code for the following group of despeckle filters: linear filtering, nonlinear filtering, anisotropic diffusion filtering, and wavelet filtering. This book proposes a comparative evaluation framework of these despeckle filters based on texture analysis, image quality evaluation metrics, and visual evaluation by medical experts. Despeckle noise reduction through the application of these filters will improve the visual observation quality or it may be used as a pre-processing step for further automated analysis, such as image and video segmentation, and texture characterization in ultrasound cardiovascular imaging, as well as in bandwidth reduction in ultrasound video transmission for telemedicine applications. The aforementioned topics will be covered in detail in the companion book to this one. Furthermore, in order to facilitate further applications we have developed in MATLABTM two different toolboxes that integrate image (IDF) and video (VDF) despeckle filtering, texture analysis, and image and video quality evaluation metrics. The code for these toolsets is open source and these are available to download complementary to the two books. Table of Contents: Preface / Acknowledgments / List of Symbols / List of Abbreviations / Introduction to Speckle Noise in Ultrasound Imaging and Video / Basics of Evaluation Methodology / Linear Despeckle Filtering / Nonlinear Despeckle Filtering / Diffusion Despeckle Filtering / Wavelet Despeckle Filtering / Evaluation of Despeckle Filtering / Summary and Future Directions / References / Authors' Biographies
This book describes several modules of the Code Excited Linear Prediction (CELP) algorithm. The authors use the Federal Standard-1016 CELP MATLAB® software to describe in detail several functions and parameter computations associated with analysis-by-synthesis linear prediction. The book begins with a description of the basics of linear prediction followed by an overview of the FS-1016 CELP algorithm. Subsequent chapters describe the various modules of the CELP algorithm in detail. In each chapter, an overall functional description of CELP modules is provided along with detailed illustrations of their MATLAB® implementation. Several code examples and plots are provided to highlight some of the key CELP concepts. Link to MATLAB® code found within the book Table of Contents: Introduction to Linear Predictive Coding / Autocorrelation Analysis and Linear Prediction / Line Spectral Frequency Computation / Spectral Distortion / The Codebook Search / The FS-1016 Decoder
Gaussian quadrature is a powerful technique for numerical integration that falls under the broad category of spectral methods. The purpose of this work is to provide an introduction to the theory and practice of Gaussian quadrature. We study the approximation theory of trigonometric and orthogonal polynomials and related functions and examine the analytical framework of Gaussian quadrature. We discuss Gaussian quadrature for bandlimited functions, a topic inspired by some recent developments in the analysis of prolate spheroidal wave functions. Algorithms for the computation of the quadrature nodes and weights are described. Several applications of Gaussian quadrature are given, ranging from the evaluation of special functions to pseudospectral methods for solving differential equations. Software realization of select algorithms is provided. Table of Contents: Introduction / Approximating with Polynomials and Related Functions / Gaussian Quadrature / Applications / Links to Mathematical Software
Bandwidth extension of speech is used in the International Telecommunication Union G.729.1 standard in which the narrowband bitstream is combined with quantized high-band parameters. Although this system produces high-quality wideband speech, the additional bits used to represent the high band can be further reduced. In addition to the algorithm used in the G.729.1 standard, bandwidth extension methods based on spectrum prediction have also been proposed. Although these algorithms do not require additional bits, they perform poorly when the correlation between the low and the high band is weak. In this book, two wideband speech coding algorithms that rely on bandwidth extension are developed. The algorithms operate as wrappers around existing narrowband compression schemes. More specifically, in these algorithms, the low band is encoded using an existing toll-quality narrowband system, whereas the high band is generated using the proposed extension techniques. The first method relies only on transmitted high-band information to generate the wideband speech. The second algorithm uses a constrained minimum mean square error estimator that combines transmitted high-band envelope information with a predictive scheme driven by narrowband features. Both algorithms make use of novel perceptual models based on loudness that determine optimum quantization strategies for wideband recovery and synthesis. Objective and subjective evaluations reveal that the proposed system performs at a lower average bit rate while improving speech quality when compared to other similar algorithms.
Software development is hard, but creating good software is even harder, especially if your main job is something other than developing software. Engineer Your Software! opens the world of software engineering, weaving engineering techniques and measurement into software development activities. Focusing on architecture and design, Engineer Your Software! claims that no matter how you write software, design and engineering matter and can be applied at any point in the process. Engineer Your Software! provides advice, patterns, design criteria, measures, and techniques that will help you get it right the first time. Engineer Your Software! also provides solutions to many vexing issues that developers run into time and time again. Developed over 40 years of creating large software applications, these lessons are sprinkled with real-world examples from actual software projects. Along the way, the author describes common design principles and design patterns that can make life a lot easier for anyone tasked with writing anything from a simple script to the largest enterprise-scale systems.
Motion estimation is a long-standing cornerstone of image and video processing. Most notably, motion estimation serves as the foundation for many of today's ubiquitous video coding standards including H.264. Motion estimators also play key roles in countless other applications that serve the consumer, industrial, biomedical, and military sectors. Of the many available motion estimation techniques, optical flow is widely regarded as most flexible. The flexibility offered by optical flow is particularly useful for complex registration and interpolation problems, but comes at a considerable computational expense. As the volume and dimensionality of data that motion estimators are applied to continue to grow, that expense becomes more and more costly. Control grid motion estimators based on optical flow can accomplish motion estimation with flexibility similar to pure optical flow, but at a fraction of the computational expense. Control grid methods also offer the added benefit of representing motion far more compactly than pure optical flow. This booklet explores control grid motion estimation and provides implementations of the approach that apply to data of multiple dimensionalities. Important current applications of control grid methods including registration and interpolation are also developed. Table of Contents: Introduction / Control Grid Interpolation (CGI) / Application of CGI to Registration Problems / Application of CGI to Interpolation Problems / Discussion and Conclusions
The availability of inexpensive, custom, highly integrated circuits is enabling some very powerful systems that bring together sensors, smart phones, wearables, cloud computing, and other technologies. To design these types of complex systems we are advocating a top-down simulation methodology to identify problems early. This approach enables software development to start prior to expensive chip and hardware development. We call the overall approach virtual design. This book explains why simulation has become important for chip design and provides an introduction to some of the simulation methods used. The audio lifelogging research project demonstrates the virtual design process in practice. The goals of this book are to:explain how silicon design has become more closely involved with system design;show how virtual design enables top down design;explain the utility of simulation at different abstraction levels;show how open source simulation software was used in audio lifelogging.The target audience for this book are faculty, engineers, and students who are interested in developing digital devices for Internet of Things (IoT) types of products.
The sensor cloud is a new model of computing paradigm for Wireless Sensor Networks (WSNs), which facilitates resource sharing and provides a platform to integrate different sensor networks where multiple users can build their own sensing applications at the same time. It enables a multi-user on-demand sensory system, where computing, sensing, and wireless network resources are shared among applications. Therefore, it has inherent challenges for providing security and privacy across the sensor cloud infrastructure. With the integration of WSNs with different ownerships, and users running a variety of applications including their own code, there is a need for a risk assessment mechanism to estimate the likelihood and impact of attacks on the life of the network. The data being generated by the wireless sensors in a sensor cloud need to be protected against adversaries, which may be outsiders as well as insiders. Similarly, the code disseminated to the sensors within the sensor cloud needs to be protected against inside and outside adversaries. Moreover, since the wireless sensors cannot support complex and energy-intensive measures, the lightweight schemes for integrity, security, and privacy of the data have to be redesigned.The book starts with the motivation and architecture discussion of a sensor cloud. Due to the integration of multiple WSNs running user-owned applications and code, the possibility of attacks is more likely. Thus, next, we discuss a risk assessment mechanism to estimate the likelihood and impact of attacks on these WSNs in a sensor cloud using a framework that allows the security administrator to better understand the threats present and take necessary actions. Then, we discuss integrity and privacy preserving data aggregation in a sensor cloud as it becomes harder to protect data in this environment. Integrity of data can be compromised as it becomes easier for an attacker to inject false data in a sensor cloud, and due to hop by hop nature, privacy of data could be leaked as well. Next, the book discusses a fine-grained access control scheme which works on the secure aggregated data in a sensor cloud. This scheme uses Attribute Based Encryption (ABE) to achieve the objective. Furthermore, to securely and efficiently disseminate application code in sensor cloud, we present a secure code dissemination algorithm which first reduces the amount of code to be transmitted from the base station to the sensor nodes. It then uses Symmetric Proxy Re-encryption along with Bloom filters and Hash-based Message Authentication Code (HMACs) to protect the code against eavesdropping and false code injection attacks.
Augmented reality (AR) systems are often used to superimpose virtual objects or information on a scene to improve situational awareness. Delays in the display system or inaccurate registration of objects destroy the sense of immersion a user experiences when using AR systems. AC electromagnetic trackers are ideal for these applications when combined with head orientation prediction to compensate for display system delays. Unfortunately, these trackers do not perform well in environments that contain conductive or ferrous materials due to magnetic field distortion without expensive calibration techniques. In our work we focus on both the prediction and distortion compensation aspects of this application, developing a "e;small footprint"e; predictive filter for display lag compensation and a simplified calibration system for AC magnetic trackers. In the first phase of our study we presented a novel method of tracking angular head velocity from quaternion orientation using an Extended Kalman Filter in both single model (DQEKF) and multiple model (MMDQ) implementations. In the second phase of our work we have developed a new method of mapping the magnetic field generated by the tracker without high precision measurement equipment. This method uses simple fixtures with multiple sensors in a rigid geometry to collect magnetic field data in the tracking volume. We have developed a new algorithm to process the collected data and generate a map of the magnetic field distortion that can be used to compensate distorted measurement data. Table of Contents: List of Tables / Preface / Acknowledgments / Delta Quaternion Extended Kalman Filter / Multiple Model Delta Quaternion Filter / Interpolation Volume Calibration / Conclusion / References / Authors' Biographies
The adaptive configuration of nodes in a sensor network has the potential to improve sequential estimation performance by intelligently allocating limited sensor network resources. In addition, the use of heterogeneous sensing nodes provides a diversity of information that also enhances estimation performance. This work reviews cognitive systems and presents a cognitive fusion framework for sequential state estimation using adaptive configuration of heterogeneous sensing nodes and heterogeneous data fusion. This work also provides an application of cognitive fusion to the sequential estimation problem of target tracking using foveal and radar sensors.
From the early pulse code modulation-based coders to some of the recent multi-rate wideband speech coding standards, the area of speech coding made several significant strides with an objective to attain high quality of speech at the lowest possible bit rate. This book presents some of the recent advances in linear prediction (LP)-based speech analysis that employ perceptual models for narrow- and wide-band speech coding. The LP analysis-synthesis framework has been successful for speech coding because it fits well the source-system paradigm for speech synthesis. Limitations associated with the conventional LP have been studied extensively, and several extensions to LP-based analysis-synthesis have been proposed, e.g., the discrete all-pole modeling, the perceptual LP, the warped LP, the LP with modified filter structures, the IIR-based pure LP, all-pole modeling using the weighted-sum of LSP polynomials, the LP for low frequency emphasis, and the cascade-form LP. These extensions can be classified as algorithms that either attempt to improve the LP spectral envelope fitting performance or embed perceptual models in the LP. The first half of the book reviews some of the recent developments in predictive modeling of speech with the help of Matlab(TM) Simulation examples. Advantages of integrating perceptual models in low bit rate speech coding depend on the accuracy of these models to mimic the human performance and, more importantly, on the achievable "e;coding gains"e; and "e;computational overhead"e; associated with these physiological models. Methods that exploit the masking properties of the human ear in speech coding standards, even today, are largely based on concepts introduced by Schroeder and Atal in 1979. For example, a simple approach employed in speech coding standards is to use a perceptual weighting filter to shape the quantization noise according to the masking properties of the human ear. The second half of the book reviews some of the recent developments in perceptual modeling of speech (e.g., masking threshold, psychoacoustic models, auditory excitation pattern, and loudness) with the help of Matlab simulations. Supplementary material including Matlab programs and simulation examples presented in this book can also be accessed here. Table of Contents: Introduction / Predictive Modeling of Speech / Perceptual Modeling of Speech
In ultrasound imaging and video visual perception is hindered by speckle multiplicative noise that degrades the quality. Noise reduction is therefore essential for improving the visual observation quality or as a pre-processing step for further automated analysis, such as image/video segmentation, texture analysis and encoding in ultrasound imaging and video. The goal of the first book (book 1 of 2 books) was to introduce the problem of speckle in ultrasound image and video as well as the theoretical background, algorithmic steps, and the MatlabTM for the following group of despeckle filters: linear despeckle filtering, non-linear despeckle filtering, diffusion despeckle filtering, and wavelet despeckle filtering. The goal of this book (book 2 of 2 books) is to demonstrate the use of a comparative evaluation framework based on these despeckle filters (introduced on book 1) on cardiovascular ultrasound image and video processing and analysis. More specifically, the despeckle filtering evaluation framework is based on texture analysis, image quality evaluation metrics, and visual evaluation by experts. This framework is applied in cardiovascular ultrasound image/video processing on the tasks of segmentation and structural measurements, texture analysis for differentiating between two classes (i.e. normal vs disease) and for efficient encoding for mobile applications. It is shown that despeckle noise reduction improved segmentation and measurement (of tissue structure investigated), increased the texture feature distance between normal and abnormal tissue, improved image/video quality evaluation and perception and produced significantly lower bitrates in video encoding. Furthermore, in order to facilitate further applications we have developed in MATLABTM two different toolboxes that integrate image (IDF) and video (VDF) despeckle filtering, texture analysis, and image and video quality evaluation metrics. The code for these toolsets is open source and these are available to download complementary to the two monographs.
The MPEG-1 Layer III (MP3) algorithm is one of the most successful audio formats for consumer audio storage and for transfer and playback of music on digital audio players. The MP3 compression standard along with the AAC (Advanced Audio Coding) algorithm are associated with the most successful music players of the last decade. This book describes the fundamentals and the MATLAB implementation details of the MP3 algorithm. Several of the tedious processes in MP3 are supported by demonstrations using MATLAB software. The book presents the theoretical concepts and algorithms used in the MP3 standard. The implementation details and simulations with MATLAB complement the theoretical principles. The extensive list of references enables the reader to perform a more detailed study on specific aspects of the algorithm and gain exposure to advancements in perceptual coding. Table of Contents: Introduction / Analysis Subband Filter Bank / Psychoacoustic Model II / MDCT / Bit Allocation, Quantization and Coding / Decoder
Recent innovations in modern radar for designing transmitted waveforms, coupled with new algorithms for adaptively selecting the waveform parameters at each time step, have resulted in improvements in tracking performance. Of particular interest are waveforms that can be mathematically designed to have reduced ambiguity function sidelobes, as their use can lead to an increase in the target state estimation accuracy. Moreover, adaptively positioning the sidelobes can reveal weak target returns by reducing interference from stronger targets. The manuscript provides an overview of recent advances in the design of multicarrier phase-coded waveforms based on Bjorck constant-amplitude zero-autocorrelation (CAZAC) sequences for use in an adaptive waveform selection scheme for mutliple target tracking. The adaptive waveform design is formulated using sequential Monte Carlo techniques that need to be matched to the high resolution measurements. The work will be of interest to both practitioners and researchers in radar as well as to researchers in other applications where high resolution measurements can have significant benefits. Table of Contents: Introduction / Radar Waveform Design / Target Tracking with a Particle Filter / Single Target tracking with LFM and CAZAC Sequences / Multiple Target Tracking / Conclusions
Blurring is almost an omnipresent effect on natural images. The main causes of blurring in images include: (a) the existence of objects at different depths within the scene which is known as defocus blur; (b) blurring due to motion either of objects in the scene or the imaging device; and (c) blurring due to atmospheric turbulence.Automatic estimation of spatially varying sharpness/blurriness has several applications including depth estimation, image quality assessment, information retrieval, image restoration, among others.There are some cases in which blur is intentionally introduced or enhanced; for example, in artistic photography and cinematography in which blur is intentionally introduced to emphasize a certain image region. Bokeh is a technique that introduces defocus blur with aesthetic purposes. Additionally, in trending applications like augmented and virtual reality usually, blur is introduced in order to provide/enhance depth perception.Digital images and videos are produced every day in astonishing amounts and the demand for higher quality is constantly rising which creates a need for advanced image quality assessment. Additionally, image quality assessment is important for the performance of image processing algorithms. It has been determined that image noise and artifacts can affect the performance of algorithms such as face detection and recognition, image saliency detection, and video target tracking. Therefore, image quality assessment (IQA) has been a topic of intense research in the fields of image processing and computer vision. Since humans are the end consumers of multimedia signals, subjective quality metrics provide the most reliable results; however, their cost in addition to time requirements makes them unfeasible for practical applications. Thus, objective quality metrics are usually preferred.
Although the field of sparse representations is relatively new, research activities in academic and industrial research labs are already producing encouraging results. The sparse signal or parameter model motivated several researchers and practitioners to explore high complexity/wide bandwidth applications such as Digital TV, MRI processing, and certain defense applications. The potential signal processing advancements in this area may influence radar technologies. This book presents the basic mathematical concepts along with a number of useful MATLAB(R) examples to emphasize the practical implementations both inside and outside the radar field. Table of Contents: Radar Systems: A Signal Processing Perspective / Introduction to Sparse Representations / Dimensionality Reduction / Radar Signal Processing Fundamentals / Sparse Representations in Radar
This short book is for students, professors and professionals interested in signal processing of seismic data using MATLAB(TM). The step-by-step demo of the full reflection seismic data processing workflow using a complete real seismic data set places itself as a very useful feature of the book. This is especially true when students are performing their projects, and when professors and researchers are testing their new developed algorithms in MATLAB(TM) for processing seismic data. The book provides the basic seismic and signal processing theory required for each chapter and shows how to process the data from raw field records to a final image of the subsurface all using MATLAB(TM). The MATLAB(TM) codes and seismic data can be downloaded here. Table of Contents: Seismic Data Processing: A Quick Overview / Examination of A Real Seismic Data Set / Quality Control of Real Seismic Data / Seismic Noise Attenuation / Seismic Deconvolution / Carrying the Processing Forward / Static Corrections / Seismic Migration / Concluding Remarks
Linear algebra is one of the most basic foundations of a wide range of scientific domains, and most textbooks of linear algebra are written by mathematicians. However, this book is specifically intended to students and researchers of pattern information processing, analyzing signals such as images and exploring computer vision and computer graphics applications. The author himself is a researcher of this domain. Such pattern information processing deals with a large amount of data, which are represented by high-dimensional vectors and matrices. There, the role of linear algebra is not merely numerical computation of large-scale vectors and matrices. In fact, data processing is usually accompanied with "e;geometric interpretation."e; For example, we can think of one data set being "e;orthogonal"e; to another and define a "e;distance"e; between them or invoke geometric relationships such as "e;projecting"e; some data onto some space. Such geometric concepts not only help us mentally visualize abstract high-dimensional spaces in intuitive terms but also lead us to find what kind of processing is appropriate for what kind of goals. First, we take up the concept of "e;projection"e; of linear spaces and describe "e;spectral decomposition,"e; "e;singular value decomposition,"e; and "e;pseudoinverse"e; in terms of projection. As their applications, we discuss least-squares solutions of simultaneous linear equations and covariance matrices of probability distributions of vector random variables that are not necessarily positive definite. We also discuss fitting subspaces to point data and factorizing matrices in high dimensions in relation to motion image analysis. Finally, we introduce a computer vision application of reconstructing the 3D location of a point from three camera views to illustrate the role of linear algebra in dealing with data with noise. This book is expected to help students and researchers of pattern information processing deepen the geometric understanding of linear algebra.
Tilmeld dig nyhedsbrevet og få gode tilbud og inspiration til din næste læsning.
Ved tilmelding accepterer du vores persondatapolitik.