Building and testing machine learning models requires access to large and diverse data. But where can you find usable datasets without running into privacy issues? This practical book introduces techniques for generating synthetic data (fake data generated from real data) so you can perform secondary analysis to do research, understand customer behaviors, develop new products, or generate new revenue.

Data scientists will learn how synthetic data generation provides a way to make such data broadly available for secondary purposes while addressing many privacy concerns. Analysts will learn the principles and steps for generating synthetic data from real datasets. And business leaders will see how synthetic data can help accelerate time to a product or solution.

This book describes:

- Steps for generating synthetic data using multivariate normal distributions
- Methods for distribution fitting covering different goodness-of-fit metrics
- How to replicate the simple structure of original data
- An approach for modeling data structure to consider complex relationships
- Multiple approaches and metrics you can use to assess data utility
- How analysis performed on real data can be replicated with synthetic data
- Privacy implications of synthetic data and methods to assess identity disclosure
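The first item above, generating synthetic data from a fitted multivariate normal distribution, can be sketched in a few lines. This is a minimal illustration, not the book's own method: the toy "real" dataset, the parameter names, and the sample sizes are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical "real" dataset: 200 records with two correlated
# numeric columns, standing in for a sensitive source dataset.
real = rng.multivariate_normal(mean=[50.0, 120.0],
                               cov=[[25.0, 12.0], [12.0, 36.0]],
                               size=200)

# Step 1: fit a multivariate normal to the real data by estimating
# its mean vector and covariance matrix.
mu = real.mean(axis=0)
cov = np.cov(real, rowvar=False)

# Step 2: draw synthetic records from the fitted distribution.
# These records are new draws, not copies of any real record.
synthetic = rng.multivariate_normal(mu, cov, size=200)

# Step 3: a basic utility check - the synthetic means should track
# the real ones closely.
print(real.mean(axis=0))
print(synthetic.mean(axis=0))
```

In practice the distribution family is chosen by the goodness-of-fit methods the book covers; the normal here is just the simplest case.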
Updated as of August 2014, this practical book demonstrates proven methods for anonymizing health data to help your organization share meaningful datasets without exposing patient identity. Leading experts Khaled El Emam and Luk Arbuckle walk you through a risk-based methodology, using case studies from their efforts to de-identify hundreds of datasets.

Clinical data is valuable for research and other types of analytics, but making it anonymous without compromising data quality is tricky. This book demonstrates techniques for handling different data types, based on the authors' experiences with a maternal-child registry, inpatient discharge abstracts, health insurance claims, electronic medical record databases, and the World Trade Center disaster registry, among others.

- Understand different methods for working with cross-sectional and longitudinal datasets
- Assess the risk of adversaries who attempt to re-identify patients in anonymized datasets
- Reduce the size and complexity of massive datasets without losing key information or jeopardizing privacy
- Use methods to anonymize unstructured free-form text data
- Minimize the risks inherent in geospatial data, without omitting critical location-based health information
- Look at ways to anonymize coding information in health data
- Learn the challenges of anonymously linking related datasets
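One common building block of the risk-based de-identification the book describes is generalizing quasi-identifiers and then measuring the smallest equivalence class that results. The sketch below is a simplified illustration under assumed data: the records, the 10-year age bands, and the 3-digit ZIP codes are hypothetical, not drawn from the book's case studies.

```python
from collections import Counter

# Hypothetical patient quasi-identifiers: (age, 3-digit ZIP).
records = [(34, "123"), (36, "123"), (35, "123"),
           (52, "456"), (57, "456"), (58, "456")]

def generalize(rec):
    """Generalize age to a 10-year band; keep the 3-digit ZIP as-is."""
    age, zip3 = rec
    lo = (age // 10) * 10
    return (f"{lo}-{lo + 9}", zip3)

generalized = [generalize(r) for r in records]

# Group records by their generalized quasi-identifiers.  The size of
# the smallest group bounds an adversary's re-identification odds:
# every record is indistinguishable from at least k - 1 others.
counts = Counter(generalized)
k = min(counts.values())
print(k)  # -> 3
```

Real de-identification also weighs data quality, adversary models, and dataset-specific risks, which is where the risk-based methodology goes well beyond this toy measurement.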
Elements of Software Process Assessment and Improvement reviews current assessment practices, experiences, and new research trends in software process improvement. Revised chapters, expanded from articles in The Software Process Newsletter of the IEEE Computer Society Technical Council on Software Engineering, describe the improvement cycle in detail: from diagnosing an organization and establishing a business case, through changing elements within a process, to final evaluation. The book's thorough examination of contemporary models shows how to evaluate an organization's processes and capabilities, makes the business argument for assessment and improvement, and illustrates expected improvements and methods for assessing reliability. Additional material provides application guidelines covering critical success factors, tools and techniques, and important developments that enhance the reader's understanding of organizational processes in practice.