《并行程序设计导论》 (An Introduction to Parallel Programming), Chapter 1 slides


Roadmap
- Why we need ever-increasing performance.
- Why we're building parallel systems.
- Why we need to write parallel programs.
- How do we write parallel programs?
- What we'll be doing.
- Concurrent, parallel, distributed!

Changing times
- From 1986 to 2002, microprocessors were speeding up like a rocket, increasing in performance an average of 50% per year.
- Since then, the increase has dropped to about 20% per year.

An intelligent solution
- Instead of designing and building faster microprocessors, put multiple processors on a single integrated circuit.

Now it's up to the programmers
- Adding more processors doesn't help much if programmers aren't aware of them or don't know how to use them.
- Serial programs don't benefit from this approach (in most cases).

Why we need ever-increasing performance
- Computational power is increasing, but so are our computational problems and needs.
- Problems we never dreamed of have been solved because of past increases, such as decoding the human genome.
- More complex problems are still waiting to be solved.

Climate modeling

Protein folding

Drug discovery

Energy research

Data analysis

Why we're building parallel systems
- Up to now, performance increases have been attributable to increasing density of transistors.
- But there are inherent problems.

A little physics lesson
- Smaller transistors = faster processors.
- Faster processors = increased power consumption.
- Increased power consumption = increased heat.
- Increased heat = unreliable processors.

Solution
- Move away from single-core systems to multicore processors.
- "core" = central processing unit (CPU).
- Introducing parallelism!

Why we need to write parallel programs
- Running multiple instances of a serial program often isn't very useful.
- Think of running multiple instances of your favorite game. What you really want is for it to run faster.

Approaches to the serial problem
- Rewrite serial programs so that they're parallel.
- Write translation programs that automatically convert serial programs into parallel programs.
  - This is very difficult to do, and success has been limited.

More problems
- Some coding constructs can be recognized by an automatic program generator and converted into a parallel construct.
- However, the result is likely to be a very inefficient program.
- Sometimes the best parallel solution is to step back and devise an entirely new algorithm.

Example
- Compute n values and add them together.
- Serial solution: a single loop that accumulates each value into sum.

Example (cont.)
- We have p cores, p much smaller than n.
- Each core performs a partial sum of approximately n/p values.

Example (cont.)
- Each core uses its own private variables and executes this block of code independently of the other cores.
- After each core completes execution of the code, its private variable my_sum contains the sum of the values computed by its calls to Compute_next_value.
- Ex.: 8 cores, n = 24; the calls to Compute_next_value return
  1,4,3, 9,2,8, 5,1,1, 6,2,7, 2,5,0, 4,1,8, 6,5,1, 2,3,9

Example (cont.)
- Once all the cores are done computing their private my_sum, they form a global sum by sending their results to a designated "master" core, which adds them up.
- Global sum: 8 + 19 + 7 + 15 + 7 + 13 + 12 + 14 = 95

But wait! There's a much better way to compute the global sum.

Better parallel algorithm
- Don't make the master core do all the work; share it among the other cores.
- Pair the cores so that core 0 adds its result to core 1's result, core 2 adds its result to core 3's result, etc., working with odd- and even-numbered pairs of cores.

Better parallel algorithm (cont.)
- Repeat the process, now with only the evenly ranked cores: core 0 adds the result from core 2, core 4 adds the result from core 6, etc.
- Now the cores divisible by 4 repeat the process, and so forth, until core 0 has the final result.

Multiple cores forming a global sum

Analysis
- In the first example, the master core performs 7 receives and 7 additions.

- In the second example, the master core performs 3 receives and 3 additions.
- The improvement is more than a factor of 2!

Analysis (cont.)
- The difference is more dramatic with a larger number of cores. With 1000 cores:
  - The first example requires the master to perform 999 receives and 999 additions.
  - The second example requires only 10 receives and 10 additions.
- That's an improvement of almost a factor of 100!

How do we write parallel programs?
- Task parallelism: partition the various tasks carried out in solving the problem among the cores.
- Data parallelism: partition the data used in solving the problem among the cores; each core carries out similar operations on its part of the data.

Professor P
- 15 questions, 300 exams.

Professor P's grading assistants
- TA#1, TA#2, TA#3

Division of work: data parallelism
- TA#1: 100 exams; TA#2: 100 exams; TA#3: 100 exams.

Division of work: task parallelism
- TA#1: questions 1-5; TA#2: questions 6-10; TA#3: questions 11-15.

Division of work in the global sum example
- Data parallelism: each core sums its own share of the values.
- Task parallelism: the tasks are receiving and addition.

Coordination
- Cores usually need to coordinate their work.

- Communication: one or more cores send their current partial sums to another core.
- Load balancing: share the work evenly among the cores so that no one core is heavily loaded.
- Synchronization: because each core works at its own pace, make sure the cores do not get too far ahead of the rest.

What we'll be doing
- Learning to write programs that are explicitly parallel.
- Using the C language.
- Using three different extensions to C: the Message-Passing Interface (MPI), POSIX Threads (Pthreads), and OpenMP.

Types of parallel systems
- Shared-memory: the cores can share access to the computer's memory; coordinate the cores by having them examine and update shared memory locations.
- Distributed-memory: each core has its own private memory; the cores must communicate explicitly by sending messages across a network.

Terminology
- Concurrent computing: a program is one in which multiple tasks can be in progress at any instant.
- Parallel computing: a program is one in which multiple tasks cooperate closely to solve a problem.
- Distributed computing: a program may need to cooperate with other programs to solve a problem.

Concluding remarks (1)
- The laws of physics have brought us to the doorstep of multicore technology.
- Serial programs typically don't benefit from multiple cores.
- Automatic generation of parallel programs from serial code isn't the most efficient approach to getting high performance from multicore computers.
