Research Hub

Computer Architecture Research

Stay current with breakthrough research and emerging trends. Explore cutting-edge papers from top-tier conferences and understand their practical implications.

3 Papers | 1 Recent | 3 Venues | 3 Categories

2024

1 paper | Recent
InferenceOpt | Recent | 4 insights
ICML
2024

KV-Runahead: Scalable Causal LLM Inference by Parallel Key-Value Cache Generation

Minsik Cho, Mohammad Rastegari, Devang Naik

Novel parallelization scheme that accelerates the LLM prompt phase by dual-purposing the KV-cache for parallel generation, achieving 1.4× and 1.6× speedups for Llama 7B and Falcon 7B, respectively, with asynchronous communication and context-level load-balancing.

Impact: Directly reduces time-to-first-token (TTFT) in production LLM serving systems, enabling better user experience for long-context applications like RAG, summarization, and in-context learning.
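
To see why context-level load-balancing matters for a causal model, the sketch below compares an even split of the prompt across workers with an uneven split under a deliberately simple cost model. This is an illustration only, not the partitioning procedure from the paper: it assumes each worker's cost is just the number of query-key pairs it scores under the causal mask, and it ignores communication and the order in which workers hand off the KV-cache. The function names are hypothetical.

```python
import math


def causal_work(start: int, end: int) -> int:
    """Query-key pairs scored for tokens [start, end) under a causal mask.

    Token t (0-indexed) attends to t + 1 keys, so the total is a
    difference of two triangular numbers.
    """
    return (end * (end + 1) - start * (start + 1)) // 2


def even_split(n_tokens: int, n_workers: int) -> list[int]:
    """Chunk boundaries that give every worker the same number of tokens."""
    return [round(n_tokens * i / n_workers) for i in range(n_workers + 1)]


def balanced_split(n_tokens: int, n_workers: int) -> list[int]:
    """Chunk boundaries that roughly equalize causal_work per worker.

    Work up to position b grows like b^2 / 2, so placing boundary i at
    n * sqrt(i / p) gives each worker about 1 / p of the total.
    """
    return [
        round(n_tokens * math.sqrt(i / n_workers)) for i in range(n_workers + 1)
    ]


def work_per_worker(bounds: list[int]) -> list[int]:
    """Causal attention work for each consecutive pair of boundaries."""
    return [causal_work(a, b) for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    n, p = 4096, 4  # hypothetical prompt length and worker count
    for name, split in (("even", even_split), ("balanced", balanced_split)):
        work = work_per_worker(split(n, p))
        print(name, work, f"max/min = {max(work) / min(work):.2f}")
```

Under this model the even split leaves the last worker with several times the work of the first, while the uneven split gives early workers longer chunks and keeps the per-worker work nearly equal.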

2023

2 papers
GPU | 3 insights
MICRO
2023

Dynamic Warp Scheduling for Improved GPU Utilization

Alex Thompson, Dr. Priya Patel, James Liu +1 more

Machine learning-based warp scheduler that adapts to workload characteristics, achieving 15-25% performance improvements across diverse GPU workloads.

Impact: Directly applicable to next-generation GPU architectures, with major vendors expressing interest in the approach for future products.

CPU | 3 insights
ISCA
2023

Scalable Cache Coherence for Manycore Processors

Sarah Chen, Michael Rodriguez, Dr. Lisa Wang

Novel directory-based coherence protocol that reduces directory memory overhead by 60% while maintaining performance in 256-core systems.

Impact: Enables cost-effective scaling to 256+ cores without prohibitive directory memory overhead, directly applicable to datacenter processors.
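
For a sense of scale, the following back-of-the-envelope sketch computes the storage overhead of a textbook full-map (bit-vector) directory and then applies a 60% reduction to it. The baseline is an assumption: the summary does not say which directory organization the 60% figure is measured against, and the 64-byte line size and 2 state bits per entry are illustrative values, not numbers from the paper.

```python
def full_map_directory_overhead(
    n_cores: int, line_bytes: int = 64, state_bits: int = 2
) -> float:
    """Directory bits per cache line as a fraction of the line's data bits.

    Assumes a full-map directory: one sharer bit per core plus a few
    state bits for every cache line (illustrative parameters).
    """
    entry_bits = n_cores + state_bits
    return entry_bits / (line_bytes * 8)


def apply_reduction(overhead: float, reduction: float) -> float:
    """Overhead remaining after cutting it by the given fraction."""
    return overhead * (1.0 - reduction)


if __name__ == "__main__":
    baseline = full_map_directory_overhead(256)
    reduced = apply_reduction(baseline, 0.60)
    print(f"full-map baseline at 256 cores: {baseline:.1%} of data storage")
    print(f"after a 60% reduction:          {reduced:.1%} of data storage")
```

With these assumed parameters, a full-map directory at 256 cores costs about half as much storage as the cached data itself, which is why directory overhead becomes a limiting factor at this core count.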

Stay Ahead of the Curve

Computer architecture is rapidly evolving. Our research summaries help you understand the latest breakthroughs and their practical implications for system design.

Top-Tier Venues | Practical Insights | Industry Impact