C H A P T E R S
1 Computer Abstractions and Technology 2
1.1 Introduction 3
1.2 Eight Great Ideas in Computer Architecture 11
1.3 Below Your Program 13
1.4 Under the Covers 16
1.5 Technologies for Building Processors and Memory 24
1.6 Performance 28
1.7 The Power Wall 40
1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors 43
1.9 Real Stuff: Benchmarking the Intel Core i7 46
1.10 Fallacies and Pitfalls 49
1.11 Concluding Remarks 52
1.12 Historical Perspective and Further Reading 54
1.13 Exercises 54
2 Instructions: Language of the Computer 60
2.1 Introduction 62
2.2 Operations of the Computer Hardware 63
2.3 Operands of the Computer Hardware 67
2.4 Signed and Unsigned Numbers 74
2.5 Representing Instructions in the Computer 81
2.6 Logical Operations 89
2.7 Instructions for Making Decisions 92
2.8 Supporting Procedures in Computer Hardware 98
2.9 Communicating with People 108
2.10 RISC-V Addressing for Wide Immediates and Addresses 113
2.11 Parallelism and Instructions: Synchronization 121
2.12 Translating and Starting a Program 124
2.13 A C Sort Example to Put it All Together 133
2.14 Arrays versus Pointers 141
2.15 Advanced Material: Compiling C and Interpreting Java 144
2.16 Real Stuff: MIPS Instructions 145
2.17 Real Stuff: x86 Instructions 146
2.18 Real Stuff: The Rest of the RISC-V Instruction Set 155
2.19 Fallacies and Pitfalls 157
2.20 Concluding Remarks 159
2.21 Historical Perspective and Further Reading 162
2.22 Exercises 162
3 Arithmetic for Computers 172
3.1 Introduction 174
3.2 Addition and Subtraction 174
3.3 Multiplication 177
3.4 Division 183
3.5 Floating Point 191
3.6 Parallelism and Computer Arithmetic: Subword Parallelism 216
3.7 Real Stuff: Streaming SIMD Extensions and Advanced Vector Extensions
in x86 217
3.8 Going Faster: Subword Parallelism and Matrix Multiply 218
3.9 Fallacies and Pitfalls 222
3.10 Concluding Remarks 225
3.11 Historical Perspective and Further Reading 227
3.12 Exercises 227
4 The Processor 234
4.1 Introduction 236
4.2 Logic Design Conventions 240
4.3 Building a Datapath 243
4.4 A Simple Implementation Scheme 251
4.5 An Overview of Pipelining 262
4.6 Pipelined Datapath and Control 276
4.7 Data Hazards: Forwarding versus Stalling 294
4.8 Control Hazards 307
4.9 Exceptions 315
4.10 Parallelism via Instructions 321
4.11 Real Stuff: The ARM Cortex-A53 and Intel Core i7 Pipelines 334
4.12 Going Faster: Instruction-Level Parallelism and Matrix Multiply 342
4.13 Advanced Topic: An Introduction to Digital Design Using a Hardware
Design Language to Describe and Model a Pipeline and More Pipelining
Illustrations 345
4.14 Fallacies and Pitfalls 345
4.15 Concluding Remarks 346
4.16 Historical Perspective and Further Reading 347
4.17 Exercises 347
5 Large and Fast: Exploiting Memory Hierarchy 364
5.1 Introduction 366
5.2 Memory Technologies 370
5.3 The Basics of Caches 375
5.4 Measuring and Improving Cache Performance 390
5.5 Dependable Memory Hierarchy 410
5.6 Virtual Machines 416
5.7 Virtual Memory 419
5.8 A Common Framework for Memory Hierarchy 443
5.9 Using a Finite-State Machine to Control a Simple Cache 449
5.10 Parallelism and Memory Hierarchy: Cache Coherence 454
5.11 Parallelism and Memory Hierarchy: Redundant Arrays of Inexpensive
Disks 458
5.12 Advanced Material: Implementing Cache Controllers 459
5.13 Real Stuff: The ARM Cortex-A53 and Intel Core i7 Memory
Hierarchies 459
5.14 Real Stuff: The Rest of the RISC-V System and Special Instructions 464
5.15 Going Faster: Cache Blocking and Matrix Multiply 465
5.16 Fallacies and Pitfalls 468
5.17 Concluding Remarks 472
5.18 Historical Perspective and Further Reading 473
5.19 Exercises 473
6 Parallel Processors from Client to Cloud 490
6.1 Introduction 492
6.2 The Difficulty of Creating Parallel Processing Programs 494
6.3 SISD, MIMD, SIMD, SPMD, and Vector 499
6.4 Hardware Multithreading 506
6.5 Multicore and Other Shared Memory Multiprocessors 509
6.6 Introduction to Graphics Processing Units 514
6.7 Clusters, Warehouse Scale Co
......