These are the lecture slides for Program Optimization for Multi Core Architectures, which include Triangular Lower Limits, Multiple Loop Limits, Dependence System Solvers, Single Equation, Simple Test, Extreme Value Test, etc. Key points are: Multi Core Computing, Contents, Scheduling, Scheduling Criteria, Scheduling Algorithms, Preemptive, Adding Priority

Typology: Slides

2012/2013

Uploaded on 03/28/2013

ekanath 🇮🇳

file:///D|/...haudhary,%20Dr.%20Sanjeev%20K%20Aggrwal%20&%20Dr.%20Rajat%20Moona/Multi-core_Architecture/lecture2/2_1.htm[6/14/2012 11:20:48 AM]
Module 1: Multi-core: The Ultimate Dose of Moore's Law
Lecture 2: Introduction to Multi-core Architecture
The Lecture Contains:
Scaling Issues
Multi-core
Thread-level Parallelism
Communication in Multi-core
Tiled CMP (Hypothetical Floor-plan)
Shared Cache CMP
Niagara Floor-plan
Implications on Software
Research Directions
References
Scaling Issues

Hardware for extracting ILP has reached the point of diminishing returns
  Need a large number of in-flight instructions
  Supporting such a large population inside the chip requires power-hungry, delay-sensitive logic and storage
  Verification complexity is getting out of control
How to exploit so many transistors?
  Must be a de-centralized design which avoids long wires

Multi-core

Put a few reasonably complex processors or many simple processors on the chip
  Each processor has its own primary cache and pipeline
  Often a processor is called a core
  Often called a chip-multiprocessor (CMP)
Did we use the transistors properly?
  Depends on whether you can keep the cores busy
  Introduces the concept of thread-level parallelism (TLP)
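The point about keeping the cores busy can be illustrated with a minimal sketch that splits one job into independent chunks, one per worker thread. The function names (chunk_bounds, partial_sum, parallel_sum) are hypothetical and not from the lecture. Python threads are used only as a stand-in for hardware threads: in standard CPython the interpreter lock prevents pure-Python code from actually running on several cores at once, so this shows the decomposition, not a speedup.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n, workers):
    """Split range(n) into `workers` nearly equal [start, stop) slices."""
    base, extra = divmod(n, workers)
    bounds = []
    start = 0
    for i in range(workers):
        # The first `extra` chunks take one additional element each.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def partial_sum(data, start, stop):
    """Work done by one thread: it touches only its own slice."""
    return sum(data[start:stop])


def parallel_sum(data, workers=None):
    """Sum `data` using one independent task per worker thread."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(partial_sum, data, start, stop)
            for start, stop in chunk_bounds(len(data), workers)
        ]
        # Combine the per-thread results once every task has finished.
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    print(parallel_sum(list(range(1000)), workers=4))
```

The chunks are independent, so no thread waits on another until the final combine step. That independence is what lets a CMP keep all of its cores occupied.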

Communication in Multi-core

Ideal for shared address space
  Fast on-chip hardwired communication through cache (no OS intervention)
Two types of architectures
  Tiled CMP: each core has its private cache hierarchy (no cache sharing); Intel Pentium D, Dual Core Opteron, Intel Montecito, Sun UltraSPARC IV, IBM Cell (more specialized)
  Shared cache CMP: outermost level of cache hierarchy is shared among cores; Intel Woodcrest (server-grade Core duo), Intel Conroe (Core2 duo for desktop), Sun Niagara, IBM Power4, IBM Power
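As a rough software-level illustration of communication through a shared address space, the sketch below has one thread write a value into shared memory and another read it, with no explicit message passing. The names (run_exchange, shared, ready) are hypothetical. On a real CMP the transfer between cores is carried out by the cache hardware and is not visible at this level; the example shows only the programming model.

```python
import threading


def run_exchange(value):
    """Pass `value` from a producer thread to a consumer via shared memory."""
    shared = {"data": None}    # location visible to both threads
    ready = threading.Event()  # flag telling the consumer the data is written
    received = []

    def producer():
        shared["data"] = value  # plain store into the shared location
        ready.set()             # publish: the data is now valid

    def consumer():
        ready.wait()                     # block until the producer has published
        received.append(shared["data"])  # plain load from the shared location

    consumer_thread = threading.Thread(target=consumer)
    producer_thread = threading.Thread(target=producer)
    consumer_thread.start()
    producer_thread.start()
    producer_thread.join()
    consumer_thread.join()
    return received[0]


if __name__ == "__main__":
    print(run_exchange(42))
```

The only coordination is the ready flag. The data itself moves through ordinary loads and stores, which is why the slides describe this style as needing no OS intervention.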

Tiled CMP (Hypothetical Floor-plan)

Shared Cache CMP

Niagara Floor-plan

References

A good reading is Parallel Computer Architecture by Culler, Singh with Gupta
  Caveat: does not talk about multi-core, but introduces the general area of shared memory multiprocessors
Papers
  Check out the most recent issue of Intel Technology Journal
    http://www.intel.com/technology/itj/
    http://www.intel.com/technology/itj/archive.htm
Conferences: ASPLOS, ISCA, HPCA, MICRO, PACT
Journals: IEEE Micro, IEEE TPDS, ACM TACO