Computer Architecture Topics
Master the fundamentals and dive deep into advanced concepts. Our comprehensive topic library covers everything from basic CPU design to cutting-edge parallel architectures.
CPU
Processor design, microarchitecture, and execution pipelines
Bit Manipulation Mastery
Essential bit manipulation techniques for computer architecture interviews.
Cache Coherence Protocols
Understanding how multiprocessor systems maintain data consistency across multiple cache levels and cores.
Essential foundations of computer architecture from instruction sets to pipelined execution, covering ISA design, the classic 5-stage pipeline, and hazard management.
GPU
Graphics processing, SIMD architectures, and compute shaders
Advanced GPU Architecture & Performance
Modern GPU microarchitecture, SIMT execution model, and performance optimization for senior-level design.
GPU Architecture Fundamentals
Deep dive into graphics processing unit design, SIMD execution, and parallel computing architectures.
GPU Sparsity Patterns & Performance
GPU sparsity patterns, from unstructured sparsity in CSR format to structured 2:4 sparsity for modern tensor cores.
GPU Matrix Multiplication Optimization
GEMM optimization on GPUs: tiling, memory hierarchy, coalescing, and tensor cores for peak performance.
Memory
Cache hierarchies, memory systems, and storage technologies
Memory Hierarchy 101
Understand why memory hierarchy exists, from registers to DRAM. Learn about cache levels, locality principles, and bandwidth vs latency trade-offs.
NoC
Network-on-chip design, routing, and interconnect architectures
Network-on-Chip Virtual Channels
NoC virtual channels for deadlock-free routing in mesh topologies and many-core systems.
Understanding the phases of transformer architecture and their NoC traffic patterns, from embedding to self-attention to feedforward layers.
Parallel
Parallel processing, threading, and synchronization mechanisms
Parallel Computing & SIMD Basics
Introduction to parallelism fundamentals: data vs task parallelism, SIMD vector operations, throughput-oriented architectures, and essential parallelism metrics.
Performance
Optimization techniques, benchmarking, and performance analysis
Algorithm Patterns for Technical Interviews
Essential algorithm patterns: sliding window, prefix sums, monotonic structures, binary search.
Matrix Multiplication Optimization Techniques
Progressive optimization of matrix multiplication: from naive O(N³) to cache-blocked SIMD implementations.
Performance Metrics & Analysis Foundations
Master essential performance metrics including IPC, throughput, latency, and bandwidth. Learn to identify bottlenecks using Amdahl's Law and the Roofline Model.
Software
Programming languages, frameworks, containers, and system software design
C++ Coroutines (C++20)
Comprehensive guide to C++20 coroutines: syntax, implementation patterns, and SystemC integration.
C++ Fundamentals and Performance
Essential C++ techniques, modern features, and performance optimization patterns for systems programming.
Building STL-Like Containers from First Principles
Deep dive into implementing STL containers: linked lists, hash tables, red-black trees, and their performance characteristics.
SystemC Fundamentals
Introduction to SystemC for hardware modeling and system-level design.
TLM-2.0 Transaction-Level Modeling
Comprehensive guide to SystemC TLM-2.0 for high-level system modeling and communication.
Machine Learning
Core ML architectures, algorithms, and computational approaches for modern AI systems
KV Cache Optimization: A Comprehensive Guide
Key-Value cache optimization techniques for transformer inference: compression, retention, and memory efficiency.
Model Size Reduction Techniques
Comprehensive overview of quantization, pruning, and compression techniques for deploying large neural networks efficiently.
Stable Diffusion vs ViT (Vision Transformer)
Technical comparison of Vision Transformer and Stable Diffusion architectures and their convergence.
Datacenter Architecture
Large-scale system design, cluster management, and distributed architectures
TPU Pod Optical Interconnect System
Technical deep dive into the Google TPU Pod's optical circuit switching architecture, 3D torus topology, collective communication optimization, and datacenter-scale AI infrastructure.
TPU Pod Optical Interconnects vs NVIDIA NVSwitch Comparison
Comprehensive comparison of Google TPU Pod optical interconnects with NVIDIA NVSwitch, InfiniBand, Ethernet, and emerging datacenter interconnect technologies for AI infrastructure.
Start with beginner topics and progress to advanced concepts.