It is a great pleasure to write a preface to this book. In my view, the content is unique in that it blends traditional teaching approaches with the use of mathematics and a mainstream Hardware Description Language (HDL) as formalisms to describe key concepts. The book keeps the "machine" separate from the "application" by strictly following a bottom-up approach: it starts with transistors and logic gates and only introduces assembly language programs once their execution by a processor is clearly defined. Using an HDL, Verilog in this case, rather than static circuit diagrams is a big deviation from traditional books on computer architecture. Static circuit diagrams cannot be explored in a hands-on way like the corresponding Verilog model can. In order to understand why I consider this shift so important, one must consider how computer architecture, a subject that has been studied for more than 50 years, has evolved. In the pioneering days computers were constructed by hand. An entire computer could (just about) be described by drawing a circuit diagram. Initially, such diagrams consisted mostly of analogue components before later moving toward digital logic gates. The advent of digital electronics led to more complex cells, such as half-adders, flip-flops, and decoders, being recognised as useful building blocks.
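To illustrate the kind of building block the preface refers to, here is a minimal sketch (not taken from the book) of a half-adder written in Verilog; unlike a static diagram, such a model can be simulated and explored directly:

```verilog
// Illustrative sketch only (not from the book): a half-adder as a Verilog module.
module half_adder (
    input  wire a,
    input  wire b,
    output wire sum,    // sum bit of a + b
    output wire carry   // carry-out of a + b
);
    assign sum   = a ^ b;   // XOR produces the sum bit
    assign carry = a & b;   // AND produces the carry bit
endmodule
```

Two such modules, plus an OR gate combining their carries, compose a full adder, which is the sense in which these cells serve as building blocks.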
The last decade has seen two parallel developments, one in computer science, the other in mathematics, both dealing with the same kind of combinatorial structures: networks with strong symmetry properties or, in graph-theoretical language, vertex-transitive graphs, in particular their prototypical examples, Cayley graphs. In the design of large interconnection networks it was realised that many of the most frequently used models for such networks are Cayley graphs of various well-known groups. This has spawned a considerable amount of activity in the study of the combinatorial properties of such graphs. A number of symposia and congresses (such as the bi-annual IWIN, starting in 1991) bear witness to the interest of the computer science community in this subject. On the mathematical side, and independently of any interest in applications, progress in group theory has made it possible to make a realistic attempt at a complete description of vertex-transitive graphs. The classification of the finite simple groups has played an important role in this respect.
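For readers meeting the terminology for the first time, the standard definition (a textbook fact, not specific to this volume) is: for a group G and a generating set S with S = S^{-1} and 1 \notin S, the Cayley graph \mathrm{Cay}(G,S) has the group elements as vertices and an edge joining g and gs for every generator s. The n-dimensional hypercube, one of the most common interconnection-network topologies, is exactly such a graph:

\[ \mathrm{Cay}(G,S):\quad V = G, \qquad E = \{\,\{g, gs\} : g \in G,\ s \in S\,\}, \qquad Q_n \cong \mathrm{Cay}\big((\mathbb{Z}/2\mathbb{Z})^n, \{e_1,\dots,e_n\}\big), \]

and every Cayley graph is vertex-transitive, since left multiplication by any group element is a graph automorphism.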
The SCAN conference, the International Symposium on Scientific Computing, Computer Arithmetic and Validated Numerics, takes place biennially under the joint auspices of GAMM (Gesellschaft für Angewandte Mathematik und Mechanik) and IMACS (International Association for Mathematics and Computers in Simulation). SCAN-98 attracted more than 100 participants from 21 countries all over the world. During the four days from September 22 to 25, nine highlighted plenary lectures and over 70 contributed talks were given. These figures indicate broad participation, due partly to the attraction of the host country, Hungary, and partly to the effective support the conference received. The conference was substantially supported by the Hungarian Research Fund OTKA, GAMM, the National Technology Development Board OMFB, and by the József Attila University. Due to this funding, it was possible to subsidize the participation of over 20 scientists, mainly from Eastern European countries. Importantly, the support obtained also made the participation of six young researchers possible, likely for the first time. The number of Eastern European participants was relatively high. These results are especially valuable since, in contrast to the usual two-year interval, the present meeting was organized just one year after the previous SCAN-xx conference.
With the advent of portable and autonomous computing systems, power consumption has emerged as a focal point in many research projects, commercial systems, and DoD platforms. One current research initiative, which drew much attention to this area, is the Power Aware Computing and Communications (PAC/C) program sponsored by DARPA. Many of the chapters in this book include results from work that has been supported by the PAC/C program. The performance of computer systems has been improving tremendously, while the size and weight of such systems have been constantly shrinking. The capacity of batteries relative to their size and weight has also been improving, but at a rate much slower than the rate of improvement in computer performance and the rate of shrinking in computer sizes. The relation between the power consumption of a computer system and its performance and size is a complex one, which is very much dependent on the specific system and the technology used to build that system. We do not need a complex argument, however, to be convinced that energy and power, which is the rate of energy consumption, are becoming critical components in computer systems in general, and in portable and autonomous systems in particular. Most of the early research on power consumption in computer systems addressed the issue of minimizing power in a given platform, which usually translates into minimizing energy consumption and thus extending battery life.
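To make the distinction drawn above explicit (standard definitions, not specific to this book): power is the rate of energy consumption, energy is its integral over time, and battery life is bounded by the ratio of battery capacity to average power draw:

\[ P(t) = \frac{dE}{dt}, \qquad E = \int_0^{T} P(t)\,dt, \qquad T_{\text{battery}} \approx \frac{E_{\text{bat}}}{\bar{P}}, \]

so minimizing energy for a fixed task, or average power for a fixed runtime, directly extends battery life.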
This book constitutes the refereed proceedings of the 14th International Conference on Reversible Computation, RC 2022, which was held in Urbino, Italy, during July 5-6, 2022. The 10 full papers and 6 short papers included in this book were carefully reviewed and selected from 20 submissions. They were organized in topical sections named: Reversible and Quantum Circuits; Applications of Quantum Computing; Foundations and Applications.
This book serves as a single-source reference to the latest advances in Approximate Computing (AxC), a promising technique for increasing the performance or reducing the cost and power consumption of a computing system. The authors discuss the different AxC design and validation techniques and their integration. They also describe real AxC applications, spanning from mobile to high-performance computing, as well as safety-critical applications.
Overview: Simple SysML for Beginners: Using Sparx Enterprise Architect is for beginners. The book assumes that you have just purchased a copy of Enterprise Architect and are anxious to get started, but otherwise don't know too much about SysML and don't have much experience using Enterprise Architect or any other similar tool. There are several good books on the market about SysML. However, these books show only finished diagrams. They don't cover the steps needed to construct the models and the diagrams. These steps can be remarkably complicated; the sequence of steps needed to construct the underlying model for a diagram is often less than obvious when using a real SysML tool. The purpose of this book is to help you get through the initial learning curve and get you on your way to becoming proficient at SysML modeling. ENTERPRISE ARCHITECT is a trademark of Sparx Systems Pty Ltd.

Benefits: quick start with the tool; simple SysML modeling examples; 50+ example models; 400+ annotated screen captures; solid foundation for further study.

Limitations: Requirements Engineering - This book is not an exhaustive text on requirements engineering. However, the "Further Reading" appendix does list a number of excellent books for deeper understanding of this topic. Tool Version - The first edition of this book was authored primarily on Enterprise Architect version 14. There are minor differences in the procedures for later versions of the tool. Tool Manual - This is a beginner's introduction and is not a comprehensive reference for every feature of the tool.
Learn the big skills of C programming by creating bite-size projects! Work your way through these 21 fun and interesting tiny challenges to master essential C techniques you'll use in full-size applications. Tiny C Projects builds and hones your C programming skills with interesting and exciting challenges. You'll expand your C programming portfolio by creating useful utility programs, fun games, password generators, directory utilities, and more. Each program you create starts out simple and then deepens as you explore approaches and alternatives you can use to achieve your goals. Once you're done, you'll find it easy to scale up the skills you've learned from tiny projects into real applications.
This book covers technologies, applications, tools, languages, procedures, advantages, and disadvantages of reconfigurable supercomputing using Field Programmable Gate Arrays (FPGAs). The target audience is the community of users of High Performance Computers (HPC) who may benefit from porting their applications into a reconfigurable environment. As such, this book is intended to guide the HPC user through the many algorithmic considerations, hardware alternatives, usability issues, programming languages, and design tools that need to be understood before embarking on the creation of reconfigurable parallel codes. We hope to show that FPGA acceleration, based on the exploitation of data parallelism, pipelining, and concurrency, remains promising in view of the diminishing improvements in traditional processor and system design. Table of Contents: FPGA Technology / Reconfigurable Supercomputing / Algorithmic Considerations / FPGA Programming Languages / Case Study: Sorting / Alternative Technologies and Concluding Remarks
Originally developed to support video games, graphics processor units (GPUs) are now increasingly used for general-purpose (non-graphics) applications ranging from machine learning to mining of cryptographic currencies. GPUs can achieve improved performance and efficiency versus central processing units (CPUs) by dedicating a larger fraction of hardware resources to computation. In addition, their general-purpose programmability makes contemporary GPUs appealing to software developers in comparison to domain-specific accelerators. This book provides an introduction for those interested in studying the architecture of GPUs that support general-purpose computing. It collects together information currently found only among a wide range of disparate sources. The authors led development of the GPGPU-Sim simulator widely used in academic research on GPU architectures. The first chapter of this book describes the basic hardware structure of GPUs and provides a brief overview of their history. Chapter 2 provides a summary of GPU programming models relevant to the rest of the book. Chapter 3 explores the architecture of GPU compute cores. Chapter 4 explores the architecture of the GPU memory system. After describing the architecture of existing systems, Chapters 3 and 4 provide an overview of related research. Chapter 5 summarizes cross-cutting research impacting both the compute core and memory system. This book should provide a valuable resource for those wishing to understand the architecture of graphics processor units (GPUs) used for acceleration of general-purpose applications and to those who want to obtain an introduction to the rapidly growing body of research exploring how to improve the architecture of these GPUs.
The emerging three-dimensional (3D) chip architectures, with their intrinsic capability of reducing the wire length, promise attractive solutions to reduce the delay of interconnects in future microprocessors. 3D memory stacking enables much higher memory bandwidth for future chip-multiprocessor design, mitigating the "memory wall" problem. In addition, heterogeneous integration enabled by 3D technology can also result in innovative designs for future microprocessors. This book first provides a brief introduction to this emerging technology, and then presents a variety of approaches to designing future 3D microprocessor systems, by leveraging the benefits of low latency, high bandwidth, and heterogeneous integration capability which are offered by 3D technology.
This book provides computer engineers, academic researchers, new graduate students, and seasoned practitioners an end-to-end overview of virtual memory. We begin with a recap of foundational concepts and discuss not only state-of-the-art virtual memory hardware and software support available today, but also emerging research trends in this space. The span of topics covers processor microarchitecture, memory systems, operating system design, and memory allocation. We show how efficient virtual memory implementations hinge on careful hardware and software cooperation, and we discuss new research directions aimed at addressing emerging problems in this space. Virtual memory is a classic computer science abstraction and one of the pillars of the computing revolution. It has long enabled hardware flexibility, software portability, and overall better security, to name just a few of its powerful benefits. Nearly all user-level programs today take for granted that they have been freed from the burden of physical memory management by the hardware, the operating system, device drivers, and system libraries. However, despite its ubiquity in systems ranging from warehouse-scale datacenters to embedded Internet of Things (IoT) devices, the overheads of virtual memory are becoming a critical performance bottleneck today. Virtual memory architectures designed for individual CPUs or even individual cores are in many cases struggling to scale up and scale out to today's systems, which now increasingly include exotic hardware accelerators (such as GPUs, FPGAs, or DSPs) and emerging memory technologies (such as non-volatile memory), and which run increasingly intensive workloads (such as virtualized and/or "big data" applications). As such, many of the fundamental abstractions and implementation approaches for virtual memory are being augmented, extended, or entirely rebuilt in order to ensure that virtual memory remains viable and performant in the years to come.
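As a reminder of the basic mechanism whose overheads are at issue (standard textbook notation, not tied to this particular book), a virtual address splits into a virtual page number and a page offset, and the page table maps the former to a physical frame:

\[ \mathrm{VA} = \mathrm{VPN} \cdot 2^{p} + \mathrm{offset}, \qquad \mathrm{PA} = \mathrm{PT}[\mathrm{VPN}] \cdot 2^{p} + \mathrm{offset}, \]

where 2^{p} is the page size; TLBs cache these translations, and TLB misses and page-table walks are among the principal sources of the overhead the book discusses.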
This synthesis lecture presents the current state-of-the-art in applying low-latency, lossless hardware compression algorithms to cache, memory, and the memory/cache link. There are many non-trivial challenges that must be addressed to make data compression work well in this context. First, since compressed data must be decompressed before it can be accessed, decompression latency ends up on the critical memory access path. This imposes a significant constraint on the choice of compression algorithms. Second, while conventional memory systems store fixed-size entities like data types, cache blocks, and memory pages, these entities will suddenly vary in size in a memory system that employs compression. Dealing with variable size entities in a memory system using compression has a significant impact on the way caches are organized and how to manage the resources in main memory. We systematically discuss solutions in the open literature to these problems. Chapter 2 provides the foundations of data compression by first introducing the fundamental concept of value locality. We then introduce a taxonomy of compression algorithms and show how previously proposed algorithms fit within that logical framework. Chapter 3 discusses the different ways that cache memory systems can employ compression, focusing on the trade-offs between latency, capacity, and complexity of alternative ways to compact compressed cache blocks. Chapter 4 discusses issues in applying data compression to main memory and Chapter 5 covers techniques for compressing data on the cache-to-memory links. This book should help a skilled memory system designer understand the fundamental challenges in applying compression to the memory hierarchy and introduce him/her to the state-of-the-art techniques in addressing them.
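A simplified way to see why the latency constraint is so tight (illustrative notation, not drawn from the lecture): a read of a compressed block pays the memory or cache access time plus the full decompression time before the data can be used,

\[ t_{\text{load}} \approx t_{\text{access}} + t_{\text{decompress}}, \]

so decompression sits squarely on the critical load-to-use path, and only algorithms whose decompressors finish in a handful of cycles are practical at the cache and memory levels.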
Since the end of Dennard scaling in the early 2000s, improving the energy efficiency of computation has been the main concern of the research community and industry. The large energy efficiency gap between general-purpose processors and application-specific integrated circuits (ASICs) motivates the exploration of customizable architectures, where one can adapt the architecture to the workload. In this Synthesis lecture, we present an overview and introduction of the recent developments on energy-efficient customizable architectures, including customizable cores and accelerators, on-chip memory customization, and interconnect optimization. In addition to a discussion of the general techniques and classification of different approaches used in each area, we also highlight and illustrate some of the most successful design examples in each category and discuss their impact on performance and energy efficiency. We hope that this work captures the state-of-the-art research and development on customizable architectures and serves as a useful reference basis for further research, design, and implementation for large-scale deployment in future computing systems.
Machine learning, and specifically deep learning, has been hugely disruptive in many fields of computer science. The success of deep learning techniques in solving notoriously difficult classification and regression problems has resulted in their rapid adoption in solving real-world problems. The emergence of deep learning is widely attributed to a virtuous cycle whereby fundamental advancements in training deeper models were enabled by the availability of massive datasets and high-performance computer hardware. This text serves as a primer for computer architects in a new and rapidly evolving field. We review how machine learning has evolved since its inception in the 1960s and track the key developments leading up to the powerful deep learning techniques that emerged in the last decade. Next we review representative workloads, including the most commonly used datasets and seminal networks across a variety of domains. In addition to discussing the workloads themselves, we also detail the most popular deep learning tools and show how aspiring practitioners can use the tools with the workloads to characterize and optimize DNNs. The remainder of the book is dedicated to the design and optimization of hardware and architectures for machine learning. As high-performance hardware was so instrumental in the success of machine learning becoming a practical solution, this part of the book recounts a variety of optimizations proposed recently to further improve future designs. Finally, we present a review of recent research published in the area as well as a taxonomy to help readers understand how various contributions fall in context.
Hardware acceleration in the form of customized datapath and control circuitry tuned to specific applications has gained popularity for its promise to utilize transistors more efficiently. Historically, the computer architecture community has focused on general-purpose processors, and extensive research infrastructure has been developed to support research efforts in this domain. Envisioning future computing systems with a diverse set of general-purpose cores and accelerators, computer architects must add accelerator-related research infrastructures to their toolboxes to explore future heterogeneous systems. This book serves as a primer for the field, as an overview of the vast literature on accelerator architectures and their design flows, and as a resource guidebook for researchers working in related areas.
This book focuses on the core question of the necessary architectural support provided by hardware to efficiently run virtual machines, and of the corresponding design of the hypervisors that run them. Virtualization is still possible when the instruction set architecture lacks such support, but the hypervisor remains more complex and must rely on additional techniques. Despite the focus on architectural support in current architectures, some historical perspective is necessary to appropriately frame the problem. The first half of the book provides the historical perspective of the theoretical framework developed four decades ago by Popek and Goldberg. It also describes earlier systems that enabled virtualization despite the lack of architectural support in hardware. As is often the case, theory defines a necessary (but not sufficient) set of features, and modern architectures are the result of the combination of the theoretical framework with insights derived from practical systems. The second half of the book describes state-of-the-art support for virtualization in both x86-64 and ARM processors. This book includes an in-depth description of the CPU, memory, and I/O virtualization of these two processor architectures, as well as case studies on the Linux/KVM, VMware, and Xen hypervisors. It concludes with a performance comparison of virtualization on current-generation x86- and ARM-based systems across multiple hypervisors.
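For reference, the classical result alluded to above is the Popek and Goldberg theorem (1974); stated informally, a virtual machine monitor can be constructed for a conventional third-generation architecture if every sensitive instruction is also privileged:

\[ \{\text{sensitive instructions}\} \subseteq \{\text{privileged instructions}\} \;\Longrightarrow\; \text{an efficient VMM can be built}, \]

which is precisely the property that trap-and-emulate hypervisors rely on, and that hardware virtualization extensions on x86-64 and ARM were later introduced to provide.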
This book aims to achieve the following goals: (1) to provide a high-level survey of key analytics models and algorithms without going into mathematical details; (2) to analyze the usage patterns of these models; and (3) to discuss opportunities for accelerating analytics workloads using software, hardware, and system approaches. The book first describes 14 key analytics models (exemplars) that span data mining, machine learning, and data management domains. For each analytics exemplar, we summarize its computational and runtime patterns and apply the information to evaluate parallelization and acceleration alternatives for that exemplar. Using case studies from important application domains such as deep learning, text analytics, and business intelligence (BI), we demonstrate how various software and hardware acceleration strategies are implemented in practice. This book is intended for both experienced professionals and students who are interested in understanding core algorithms behind analytics workloads. It is designed to serve as a guide for addressing various open problems in accelerating analytics workloads, e.g., new architectural features for supporting analytics workloads, impact on programming models and runtime systems, and designing analytics systems.
An era of big data demands datacenters, which house the computing infrastructure that translates raw data into valuable information. This book defines datacenters broadly, as large distributed systems that perform parallel computation for diverse users. These systems exist in multiple forms, private and public, and are built at multiple scales. Datacenter design and management is multifaceted, requiring the simultaneous pursuit of multiple objectives. Performance, efficiency, and fairness are first-order design and management objectives, which can each be viewed from several perspectives. This book surveys datacenter research from a computer architect's perspective, addressing challenges in applications, design, management, server simulation, and system simulation. This perspective complements the rich bodies of work in datacenters as a warehouse-scale system, which study the implications for infrastructure that encloses computing equipment, and in datacenters as distributed systems, which abstract away details of the processor and memory subsystems. This book is written for first- or second-year graduate students in computer architecture and may be helpful for those in computer systems. The goal of this book is to prepare computer architects for datacenter-oriented research by describing prevalent perspectives and the state-of-the-art.