This LNCS-IFIP volume constitutes the refereed proceedings of the 7th IFIP TC 5, TC 12, WG 8.4, WG 8.9, WG 12.9 International Cross-Domain Conference, CD-MAKE 2023, held in Benevento, Italy, during August 28 - September 1, 2023. The 18 full papers presented were carefully reviewed and selected from 30 submissions. The conference focuses on an integrative machine learning approach, considering the importance of data science and visualization for the algorithmic pipeline, with a strong emphasis on privacy, data protection, safety and security.
This book constitutes the refereed proceedings of the 29th International Conference on Collaboration Technologies and Social Computing, CollabTech 2023, held in Osaka, Japan, during August 29 - September 1, 2023, in hybrid mode. The 8 full papers presented in this book together with 12 short papers were carefully reviewed and selected from 31 submissions. The papers focus on innovative technical, human and organizational approaches to expand collaboration support including computer science, management science, design science, cognitive and social science.
This book is a major update to the very successful first and second editions (2005 and 2010) of Data Mining and Knowledge Discovery Handbook. Since the last edition, this field has continued to evolve and to gain popularity. Existing methods are constantly being improved and new methods, applications and aspects are introduced. The new title of this handbook and its content reflect these changes thoroughly. Some existing chapters have been brought up to date. In addition to major revision of the existing chapters, the new edition includes totally new topics, such as deep learning, explainable AI, human factors and social issues, and advanced methods for big data. The significant enhancement to the content reflects the growth in importance of data science. The third edition is also a timely opportunity to incorporate many other changes based on feedback from peers and students. This comprehensive handbook also presents a coherent and unified repository of the major concepts, theories, methods, trends, challenges and applications of data science. It covers all the crucial machine learning methods used in data science. Today's accessibility and abundance of data make data science a matter of considerable importance and necessity. Given the field's recent growth, it's not surprising that researchers and practitioners now have a wide range of methods and tools at their disposal. While statistics is fundamental for data science, methods originating from artificial intelligence, particularly machine learning, are also playing a significant role. This handbook aims to serve as the main reference for researchers in the fields of information technology, e-commerce, information retrieval, data science, machine learning, data mining, databases and statistics, as well as advanced-level students studying computer science or electrical engineering. Practitioners working within these related fields and data scientists will also want to purchase this handbook as a reference.
Traditional data architecture patterns are severely limited. To use these patterns, you have to ETL data into each tool--a cost-prohibitive process for making warehouse features available to all of your data. The lack of flexibility with these patterns requires you to lock into a set of priority tools and formats, which creates data silos and data drift. This practical book shows you a better way. Apache Iceberg provides the capabilities, performance, scalability, and savings that fulfill the promise of an open data lakehouse. By following the lessons in this book, you'll be able to achieve interactive, batch, machine learning, and streaming analytics with this high-performance open source format. Authors Tomer Shiran, Jason Hughes, and Alex Merced from Dremio show you how to get started with Iceberg. With this book, you'll learn: the architecture of Apache Iceberg tables; what happens under the hood when you perform operations on Iceberg tables; how to further optimize Iceberg tables for maximum performance; and how to use Iceberg with popular data engines such as Apache Spark, Apache Flink, and Dremio. Discover why Apache Iceberg is a foundational technology for implementing an open data lakehouse.
This book constitutes the refereed post proceedings of the 16th Research Conference on Metadata and Semantic Research, MTSR 2022, held in London, UK, during November 7-11, 2022. The 21 full papers and 4 short papers included in this book were carefully reviewed and selected from 79 submissions. They were organized in topical sections as follows: metadata, linked data, semantics and ontologies - general session, and track on Knowledge IT Artifacts (KITA), track on digital humanities and digital curation, and track on cultural collections and applications, track on digital libraries, information retrieval, big, linked, social & open data, and metadata, linked data, semantics and ontologies - general session, track on agriculture, food & environment, and metadata, linked data, semantics and ontologies - general, track on open repositories, research information systems & data infrastructures, and metadata, linked data, semantics and ontologies - general, metadata, linked data, semantics and ontologies - general session, and track on European and national projects.
This book systematically presents key concepts of multi-modal hashing technology, recent advances in large-scale efficient multimedia search and recommendation, and recent achievements in multimedia indexing technology. With the explosive growth of multimedia content, multimedia retrieval is currently facing unprecedented challenges in both storage cost and retrieval speed. The multi-modal hashing technique can project high-dimensional data into compact binary hash codes. With it, the most time-consuming step of the multimedia retrieval process, semantic similarity computation, can be significantly accelerated with fast Hamming distance computation, while the storage cost can be greatly reduced by the binary embedding. The authors introduce the categorization of existing multi-modal hashing methods according to various metrics and datasets. The authors also collect recent multi-modal hashing techniques and describe the motivation, objective formulations, and optimization steps for context-aware hashing methods based on the tag-semantics transfer.
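To make the idea of binary hash codes and Hamming distance concrete, here is a minimal sketch. It is not one of the book's methods: it uses random hyperplane hashing (a simple, unlearned stand-in) rather than a learned multi-modal hashing model, and the names `make_planes`, `hash_code`, and `hamming` are hypothetical helpers introduced only for illustration.

```python
import random


def make_planes(dim, bits, seed=0):
    """Draw `bits` random hyperplanes in `dim` dimensions (illustrative, not learned)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def hash_code(vec, planes):
    """Project a real-valued vector to a compact binary code stored as an int.

    Each bit records on which side of one hyperplane the vector lies.
    """
    code = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Hamming distance between two codes: XOR, then count the set bits."""
    return bin(a ^ b).count("1")


planes = make_planes(dim=8, bits=16)
item = [0.3, -1.2, 0.8, 0.0, 2.1, -0.5, 0.7, 1.4]
scaled_item = [2 * x for x in item]  # same direction, different magnitude
code_item = hash_code(item, planes)
code_scaled = hash_code(scaled_item, planes)
```

Comparing two items then costs one XOR and a bit count instead of a floating-point similarity over the full feature vector, and each item is stored in 16 bits here rather than eight floats.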
This two-volume set LNAI 13995 and LNAI 13996 constitutes the refereed proceedings of the 15th Asian Conference on Intelligent Information and Database Systems, ACIIDS 2023, held in Phuket, Thailand, during July 24-26, 2023. The 65 full papers presented in these proceedings were carefully reviewed and selected from 224 submissions. The papers of the two-volume set are organized in the following topical sections: Case-Based Reasoning and Machine Comprehension; Computer Vision; Data Mining and Machine Learning; Knowledge Integration and Analysis; Speech and Text Processing; and Resource Management and Optimization.
Design and implement a data lakehouse using technology-driven simplifications and generalizations. The approach you will learn enables consolidating even incoherent data from multiple source systems across complex enterprise environments. The precise business question does not need to be known in advance and can even change over time. The approach lends itself well to federated, cooperating data mesh nodes. The individual components, called mini-marts, are like the "data part" of a data quantum and are interoperable. We describe data model blueprints to generalize dimensions with synonyms and facts at different granularities. Includes code examples using complex hierarchies as they exist in heterogeneous real-world go-to-market organizations.
5G and related digital revolutions will require tens of thousands of edge data centers. This book tells you how they work and how to get them built. We are in the middle of the edge computing revolution. Responding to demand for lower latency, telcos and others are moving servers and storage closer to end users, away from the "core" to "the edge." This requires the deployment of many thousands of tiny edge data centers. The edge is a big, growing business. Driven by 5G, connected vehicles, and industrial automation, the "edge economy" is projected to reach $4.1 trillion by 2030, with investment in edge data centers set to exceed $140 billion by 2028. What exactly is an edge data center? This book explains what they are and how they work. It's early in the edge computing life cycle, so there's time to get prepared for what's coming. If you work in an industry that's transforming through mobility, or any field that will leverage the edge for competitive advantage, this book will help you understand how the edge data center advances your strategic agenda.
With this textbook, Vaisman and Zimanyi deliver excellent coverage of data warehousing and business intelligence technologies ranging from the most basic principles to recent findings and applications. To this end, their work is structured into three parts. Part I describes "Fundamental Concepts" including conceptual and logical data warehouse design, as well as querying using MDX, DAX and SQL/OLAP. This part also covers data analytics using Power BI and Analysis Services. Part II details "Implementation and Deployment," including physical design, ETL and data warehouse design methodologies. Part III covers "Advanced Topics" and it is almost completely new in this second edition. This part includes chapters with an in-depth coverage of temporal, spatial, and mobility data warehousing. Graph data warehouses are also covered in detail using Neo4j. The last chapter extensively studies big data management and the usage of Hadoop, Spark, distributed, in-memory, columnar, NoSQL and NewSQL database systems, and data lakes in the context of analytical data processing. As a key characteristic of the book, most of the topics are presented and illustrated using application tools. Specifically, a case study based on the well-known Northwind database illustrates how the concepts presented in the book can be implemented using Microsoft Analysis Services and Power BI. All chapters have been revised and updated to the latest versions of the software tools used. KPIs and Dashboards are now also developed using DAX and Power BI, and the chapter on ETL has been expanded with the implementation of ETL processes in PostgreSQL. Review questions and exercises complement each chapter to support comprehensive student learning. Supplemental material to assist instructors using this book as a course text is available online and includes electronic versions of the figures, solutions to all exercises, and a set of slides accompanying each chapter.
Overall, students, practitioners and researchers alike will find this book the most comprehensive reference work on data warehouses, with key topics described in a clear and educational style. "I can only invite you to dive into the contents of the book, feeling certain that once you have completed its reading (or maybe, targeted parts of it), you will join me in expressing our gratitude to Alejandro and Esteban, for providing such a comprehensive textbook for the field of data warehousing in the first place, and for keeping it up to date with the recent developments, in this current second edition." From the foreword by Panos Vassiliadis, University of Ioannina, Greece.
This book includes high-quality papers presented at the Second International Symposium on Computer Vision and Machine Intelligence in Medical Image Analysis (ISCMM 2021), organized by Computer Applications Department, SMIT in collaboration with Department of Pathology, SMIMS, Sikkim, India, and funded by Indian Council of Medical Research, during 11 - 12 November 2021. It discusses common research problems and challenges in medical image analysis, such as deep learning methods. It also discusses how these theories can be applied to a broad range of application areas, including lung and chest x-ray, breast CAD, microscopy and pathology. The studies included mainly focus on the detection of events from biomedical signals.
This book explores the rich history of the keyword from its earliest manifestations (long before it appeared anywhere in Google Trends or library cataloging textbooks) in order to illustrate its implicit and explicit mediation of human cognition and communication processes. The author covers the concept of the keyword from its deictic origins in primate and proto-speech communities, through its development within oral traditions, to its initial appearances in numerous graphical forms and its workings over time within a variety of indexing traditions and technologies. The book follows the history all the way to its role in search engine optimization and social media strategies and its potential as an element in the slowly emerging semantic web, as well as in multiple voice search applications. The author synthesizes different perspectives on the significance of this often-invisible intermediary, both in and out of the library and information science context, helping readers to understand how it has come to be so embedded in our daily life. This book: provides a thorough history of the keyword, from primate and proto-speech communities to current times; explains how the concept of the keyword relates to human cognition and communication processes; and highlights the applications of the keyword, both in and out of the library and information science context.
The data lakehouse is the next generation of the data warehouse and data lake, designed to meet today's complex and ever-changing modern information systems. This book shows you how to construct your data lakehouse as the foundation for your artificial intelligence (AI), machine learning (ML), and data mesh initiatives. Know the pitfalls and techniques for maximizing business value of your data lakehouse. In addition, be able to explain the core characteristics and critical success factors of a data lakehouse. By reviewing entry errors, key incompatibility, and ensuring good documentation, we can improve the data quality and believability of your lakehouse. Evaluate criteria for data quality, including accuracy, completeness, reliability, relevance, and timeliness. Understand the different types of storage for the lakehouse, including the under-utilized yet extremely valuable bulk storage. There are three data types in the data lakehouse (structured, textual, and analog/IoT), and for each, learn how to build a robust foundation for artificial intelligence (AI), machine learning (ML), and data mesh. Leverage data models for structured data, ontologies and taxonomies for textual data, and distillation algorithms for analog/IoT data. Learn how to abstract these data types to accommodate future requirements and simplify data lineage. Apply Extract, Transform, and Load (ETL) to create a structure that returns the answers to business problems. The end result is a data lakehouse that meets our needs. Speaking of human needs, learn Maslow's Hierarchy of Data Lakehouse Needs. Next explore data integration geared for AI, ML, and data mesh. Then deep dive with us into all of the varieties of analytics within the lakehouse, including structured, textual, and analog analytics. Witness how descriptive data, data catalog, and metadata can increase the value of the lakehouse.
We conclude with a detailed evolution of data architecture, from magnetic tape to the data lakehouse as a bedrock foundation for AI, ML, and data mesh.
Let the sun shine through! The cloud in data warehousing skies is finally clearing as Dr. Barry Devlin builds the architectural and systems foundations for data lakehouse, data fabric, and data mesh, as well as the base cloud data warehouse. The past five years have seen an explosion of innovation and new technical forms as cloud data warehousing has gone mainstream. But confusion has grown too. After all, the business needs are largely unchanged. So, why are there so many options and approaches? How do they differ? Which one may be the best choice? And why? In this first volume of a two-part series, Dr. Barry Devlin, a founder of the entire data warehousing industry, offers initial answers to these questions. Drawing lessons from the long history of data warehousing, he defines an all-embracing architecture and draws specific architectural design patterns for each of these modern approaches. And he discusses the various choices and paths from current systems to the different cloud solutions. Volume II expands further on the architectural considerations and offers deeper dives into cloud data warehouse, data fabric, data lakehouse, and data mesh. It also offers an independent view of their strengths and weaknesses.
This book constitutes the proceedings of the 17th International Conference on Research Challenges in Information Sciences, RCIS 2023, which took place in Corfu, Greece, during May 23-26, 2023. It focused on the special theme "Information Science and the Connected World". The scope of RCIS is summarized by the thematic areas of information systems and their engineering; user-oriented approaches; data and information management; business process management; domain-specific information systems engineering; data science; information infrastructures, and reflective research and practice. The 28 full papers presented in this volume were carefully reviewed and selected from a total of 87 submissions. The book also includes 15 Forum papers and 6 Doctoral Consortium papers. The contributions were organized in topical sections named: Requirements; conceptual modeling and ontologies; machine learning and analytics; conceptual modeling and semantic networks; business process design and computing in the continuum; requirements and evaluation; monitoring and recommending; business process analysis and improvement; user interface and experience; forum papers; doctoral consortium papers. Two-page abstracts of the tutorials can be found in the back matter of the volume.
This book constitutes the refereed proceedings of the 20th International Conference on The Semantic Web, ESWC 2023, held in Hersonissos, Crete, Greece, during May 28 - June 1, 2023. The 41 full papers included in this book were carefully reviewed and selected from 167 submissions. They are organized in topical sections as follows: research, resource and in-use.
This volume represents the 20th International Conference on Information Technology - New Generations (ITNG), 2023. ITNG is an annual event focusing on state-of-the-art technologies pertaining to digital information and communications. The applications of advanced information technology to such domains as astronomy, biology, education, geosciences, security, and health care are among the topics of relevance to ITNG. Visionary ideas, theoretical and experimental results, as well as prototypes, designs, and tools that help information flow readily to the user are of special interest. Machine Learning, Robotics, High Performance Computing, and Innovative Methods of Computing are examples of related topics. The conference features keynote speakers, a best student award, poster award, service award, a technical open panel, and workshops/exhibits from industry, government and academia. This publication is unique as it captures modern trends in IT with a balance of theoretical and experimental work. Most other works focus on either theoretical or experimental work, but not both. Accordingly, we do not know of any competitive literature.
This second edition textbook covers a coherently organized framework for text analytics, which integrates material drawn from the intersecting topics of information retrieval, machine learning, and natural language processing. Particular importance is placed on deep learning methods. The chapters of this book span three broad categories: 1. Basic algorithms: Chapters 1 through 7 discuss the classical algorithms for text analytics such as preprocessing, similarity computation, topic modeling, matrix factorization, clustering, classification, regression, and ensemble analysis. 2. Domain-sensitive learning and information retrieval: Chapters 8 and 9 discuss learning models in heterogeneous settings such as a combination of text with multimedia or Web links. The problem of information retrieval and Web search is also discussed in the context of its relationship with ranking and machine learning methods. 3. Natural language processing: Chapters 10 through 16 discuss various sequence-centric and natural language applications, such as feature engineering, neural language models, deep learning, transformers, pre-trained language models, text summarization, information extraction, knowledge graphs, question answering, opinion mining, text segmentation, and event detection. Compared to the first edition, this second edition textbook (which targets mostly advanced level students majoring in computer science and math) has substantially more material on deep learning and natural language processing. Significant focus is placed on topics like transformers, pre-trained language models, knowledge graphs, and question answering.
This textbook offers a compact introduction to the fundamentals of graph theory and the methods of network analysis. Numerous practical examples and exercises with suggested solutions help readers better understand and apply the theoretical concepts. Different technologies and programming languages are used in order to cover a broad spectrum of applications. In addition, dedicated chapters examine the methodology with a view to planning and carrying out one's own network analysis projects, as well as ethical and data protection aspects. The book thus provides not only a theoretical overview but also practical tips and guidance for investigating one's own network-analytic questions. It is suitable not only as a reference work for students and lecturers from a wide range of disciplines whose curricula touch on the topic, but also as an addition to the repertoire of data science practitioners interested in studying networks. Whether as a theoretical introduction or as a practical guide, this book contributes to the study and analysis of networks and offers a foundation for further studies and projects.
This volume is the first (I) of four under the main themes of Digitizing Agriculture and Information and Communication Technologies (ICT). The four volumes cover rapidly developing processes including Sensors (I), Data (II), Decision (III), and Actions (IV). Volumes are related to "digital transformation" within agricultural production and provision systems, and in the context of Smart Farming Technology and Knowledge-based Agriculture. Content spans broadly from data mining and visualization to big data analytics and decision making, along with the sustainability aspects stemming from the digital transformation of farming. The four volumes comprise the outcome of the 12th EFITA Congress, also incorporating chapters that originated from select presentations of the Congress. The focus in this volume is on different aspects of sensor implementation in agricultural production (e.g., types of sensors, parameters monitoring, network types, connectivity, accuracy, reliability, durability, and needs to be covered), and it provides a variety of information and knowledge on the subject of sensor design, development, and deployment for monitoring agricultural production parameters. The book consists of four (4) Sections. The first section presents an overview of the state of the art in sensing technologies applied in agricultural production, while the rest of the sections are dedicated to remote sensing, proximal sensing, and wireless sensor network applications. Topics include: emerging sensing technologies; soil reflectance spectroscopy; LoRa technologies applications in agriculture; wireless sensor networks deployment and applications; combined remote and proximal sensing solutions; crop phenology monitoring; sensors for geophysical properties; and combined sensing technologies with geoinformation systems.
This third edition handbook describes in detail the classical methods as well as extensions and novel approaches that were more recently introduced within this field. It consists of five parts: general recommendation techniques, special recommendation techniques, value and impact of recommender systems, human computer interaction, and applications. The first part presents the most popular and fundamental techniques currently used for building recommender systems, such as collaborative filtering, semantic-based methods, recommender systems based on implicit feedback, neural networks and context-aware methods. The second part of this handbook introduces more advanced recommendation techniques, such as session-based recommender systems, adversarial machine learning for recommender systems, group recommendation techniques, reciprocal recommender systems, natural language techniques for recommender systems and cross-domain approaches to recommender systems. The third part covers a wide perspective on the evaluation of recommender systems, with papers on methods for evaluating recommender systems, their value and impact, the multi-stakeholder perspective of recommender systems, and the analysis of fairness, novelty and diversity in recommender systems. The fourth part contains a few chapters on the human computer dimension of recommender systems, with research on the role of explanation, the user personality and how to effectively support individual and group decisions with recommender systems. The last part focuses on applications in several important areas, such as food, music, fashion and multimedia recommendation. This informative third edition handbook provides a comprehensive, yet concise and convenient reference source to recommender systems for researchers and advanced-level students focused on computer science and data science.
Professionals working in data analytics that are using recommendation and personalization techniques will also find this handbook a useful tool.
This book examines the recent trend of extending data dependencies to adapt to rich data types in order to address variety and veracity issues in big data. Readers will be guided through the full range of rich data types where data dependencies have been successfully applied, including categorical data with equality relationships, heterogeneous data with similarity relationships, numerical data with order relationships, sequential data with timestamps, and graph data with complicated structures. The text will also discuss interesting constraints on ordering or similarity relationships contained in novel classes of data dependencies in addition to those in equality relationships, e.g., considered in functional dependencies (FDs). In addition to exploring the concepts of these data dependency notations, the book investigates the extension relationships between data dependencies, such as conditional functional dependencies (CFDs) that extend conventional functional dependencies (FDs). This forms in the book a family tree of extensions, mostly rooted in FDs, that help illuminate the expressive power of various data dependencies. Moreover, the book points to work on the discovery of dependencies from data, since data dependencies are often unlikely to be manually specified in a traditional way, given the huge volume and high variety in big data. It further outlines the applications of the extended data dependencies, in particular in data quality practice. Altogether, this book provides a comprehensive guide for readers to select proper data dependencies for their applications that have sufficient expressive power and reasonable discovery cost. Finally, the book concludes with several directions of future studies on emerging data.
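As a small illustration of the equality-based functional dependencies (FDs) the blurb treats as the root of the family tree, the sketch below checks whether an FD holds on a table. It is not taken from the book; the function name `holds_fd` and the sample rows are hypothetical, and the check covers only plain FDs, not the similarity, order, or conditional extensions.

```python
def holds_fd(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds on `rows`.

    The FD is violated when two rows agree on every attribute in `lhs`
    but differ on some attribute in `rhs`.
    """
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        value = tuple(row[attr] for attr in rhs)
        # Remember the first right-hand side seen for this key; any later mismatch is a violation.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data, used only to exercise the check.
addresses = [
    {"zip": "10001", "city": "New York", "street": "5th Ave"},
    {"zip": "10001", "city": "New York", "street": "Broadway"},
    {"zip": "94105", "city": "San Francisco", "street": "Howard St"},
]

zip_determines_city = holds_fd(addresses, ["zip"], ["city"])
zip_determines_street = holds_fd(addresses, ["zip"], ["street"])
```

Here `zip -> city` holds, while `zip -> street` does not, because two rows share a zip code but differ in street. Discovering dependencies from data, as the blurb mentions, amounts to searching over candidate attribute sets with a check of this kind.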
This book constitutes the refereed proceedings of the 12th European Conference on Artificial Intelligence in Music, Sound, Art and Design, EvoMUSART 2023, held as part of Evo* 2023, in April 2023, co-located with the Evo* 2023 events, EvoCOP, EvoApplications, and EuroGP. The 20 full papers and 7 short papers presented in this book were carefully reviewed and selected from 55 submissions. They cover a wide range of topics and application areas of artificial intelligence, including generative approaches to music and visual art, deep learning, and architecture.
This volume constitutes selected papers presented at the First International Conference on Artificial Intelligence: Theories and Applications, ICAITA 2022, held in Mascara, Algeria, in November 2022. The 23 papers were thoroughly reviewed and selected from the 66 qualified submissions. They are organized in topical sections on artificial vision; and artificial intelligence in big data and natural language processing.
This open access book provides an introduction and an overview of learning to quantify (a.k.a. "quantification"), i.e. the task of training estimators of class proportions in unlabeled data by means of supervised learning. In data science, learning to quantify is a task of its own related to classification yet different from it, since estimating class proportions by simply classifying all data and counting the labels assigned by the classifier is known to often return inaccurate ("biased") class proportion estimates. The book introduces learning to quantify by looking at the supervised learning methods that can be used to perform it, at the evaluation measures and evaluation protocols that should be used for evaluating the quality of the returned predictions, at the numerous fields of human activity in which the use of quantification techniques may provide improved results with respect to the naive use of classification techniques, and at advanced topics in quantification research. The book is suitable for researchers, data scientists, or PhD students who want to come up to speed with the state of the art in learning to quantify, but also for researchers wishing to apply data science technologies to fields of human activity (e.g., the social sciences, political science, epidemiology, market research) which focus on aggregate ("macro") data rather than on individual ("micro") data.
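The bias of the naive "classify and count" approach can be seen in a small worked example. The sketch below is illustrative rather than drawn from the book: it contrasts classify and count with the standard adjusted classify and count correction, which assumes the classifier's true positive rate (`tpr`) and false positive rate (`fpr`) are known, for example from held-out labeled data. The function names and the numbers are assumptions chosen for the example.

```python
def classify_and_count(predictions):
    """Naive estimate: the fraction of items the classifier labels positive."""
    return sum(predictions) / len(predictions)


def adjusted_classify_and_count(predictions, tpr, fpr):
    """Correct the naive estimate using the classifier's known error rates.

    If p is the true positive proportion, the expected naive estimate is
    p * tpr + (1 - p) * fpr; solving for p gives (cc - fpr) / (tpr - fpr).
    """
    cc = classify_and_count(predictions)
    if tpr == fpr:
        return cc  # the classifier carries no signal, so no correction is possible
    estimate = (cc - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, estimate))  # clip to a valid proportion


# Assumed scenario: 1000 unlabeled items of which 200 (20%) are truly positive.
# A classifier with tpr=0.8 and fpr=0.1 flags 160 true positives and 80 false
# positives, so 240 items are predicted positive.
predictions = [1] * 240 + [0] * 760

cc_estimate = classify_and_count(predictions)
acc_estimate = adjusted_classify_and_count(predictions, tpr=0.8, fpr=0.1)
```

In this scenario the naive estimate is 0.24, while the adjusted estimate recovers the true proportion of 0.2 (up to floating-point rounding). In practice `tpr` and `fpr` are themselves estimates and may shift between training and deployment data.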