Task parallelism

Task parallelism (also known as function parallelism and control parallelism) is a form of parallelization of computer code across multiple processors in parallel computing environments. Task parallelism focuses on distributing tasks (concurrently performed by processes or threads) across different processors. In contrast to data parallelism, which involves running the same task on different components of data, task parallelism is distinguished by running many different tasks at the same time on the same data. A common type of task parallelism is pipelining, which consists of moving a single set of data through a series of separate tasks where each task can execute independently of the others.
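To make the idea concrete, here is a minimal sketch of a pipeline in Go (chosen because goroutines appear under Language support below); the three stages and the squaring operation are illustrative assumptions, not part of the original article:

// Pipelining sketch: a single set of data moves through separate stages.
// Each stage runs as its own goroutine, so all stages execute at the same
// time, each working on a different item of the same data stream.
package main

import "fmt"

func main() {
	nums := make(chan int)
	squares := make(chan int)

	// Stage 1: produce the data set.
	go func() {
		for i := 1; i <= 5; i++ {
			nums <- i
		}
		close(nums)
	}()

	// Stage 2: transform each item as it arrives, independently of stage 1.
	go func() {
		for n := range nums {
			squares <- n * n
		}
		close(squares)
	}()

	// Stage 3: consume the results.
	for s := range squares {
		fmt.Println(s)
	}
}

Each stage is a distinct task; once the pipeline fills, every stage stays busy on a different element at the same time.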
Description

In a multiprocessor system, task parallelism is achieved when each processor executes a different thread (or process) on the same or different data. The threads may execute the same or different code. In the general case, different execution threads communicate with one another as they work, but this is not a requirement. Communication usually takes place by passing data from one thread to the next as part of a workflow.

As a simple example, if a system is running code on a 2-processor system (CPUs "a" and "b") in a parallel environment and we wish to do tasks "A" and "B", it is possible to tell CPU "a" to do task "A" and CPU "b" to do task "B" simultaneously, thereby reducing the run time of the execution. The tasks can be assigned using conditional statements, as described below.

Task parallelism emphasizes the distributed (parallelized) nature of the processing (i.e. threads), as opposed to the data (data parallelism). Most real programs fall somewhere on a continuum between task parallelism and data parallelism.
Thread-level parallelism (TLP) is the parallelism inherent in an application that runs multiple threads at once. This type of parallelism is found largely in applications written for commercial servers such as databases. By running many threads at once, these applications are able to tolerate the high amounts of I/O and memory-system latency their workloads can incur: while one thread is delayed waiting for a memory or disk access, other threads can do useful work.
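As a rough sketch of this latency tolerance in Go (assumed details: the sleep merely simulates a blocking disk or network access, and the summing loop stands in for useful work):

// Latency-hiding sketch: one worker blocks on "I/O" while another keeps
// the processor busy. Total wall time is close to the longer of the two
// tasks, not their sum.
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	var wg sync.WaitGroup
	start := time.Now()
	wg.Add(2)

	// "I/O-bound" worker: blocked waiting, holding no CPU.
	go func() {
		defer wg.Done()
		time.Sleep(100 * time.Millisecond) // simulated disk/network wait
	}()

	// "CPU-bound" worker: does useful work during the wait.
	go func() {
		defer wg.Done()
		sum := 0
		for i := 0; i < 50_000_000; i++ {
			sum += i
		}
		fmt.Println("sum:", sum)
	}()

	wg.Wait()
	// Elapsed time is close to the longer task, not the sum of both.
	fmt.Println("elapsed:", time.Since(start))
}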
The exploitation of thread-level parallelism has also begun to make inroads into the desktop market with the advent of multi-core microprocessors. This has occurred because, for various reasons, it has become increasingly impractical to increase either the clock speed or the instructions per clock of a single core. If this trend continues, new applications will have to be designed to utilize multiple threads in order to benefit from the increase in potential computing power. This contrasts with previous microprocessor innovations, in which existing code was automatically sped up by running it on a newer, faster computer.
Example

The pseudocode below illustrates task parallelism:

program:
...
if CPU = "a" then
   do task "A"
else if CPU = "b" then
   do task "B"
end if
...
end program

The goal of the program is to do some net total task ("A+B"). If we write the code as above and launch it on a 2-processor system, then the runtime environment will execute it as follows:

- In an SPMD (single program, multiple data) system, both CPUs will execute the code.
- In a parallel environment, both will have access to the same data.
- The "if" clause differentiates between the CPUs. CPU "a" will read true on the "if" and CPU "b" will read true on the "else if", so each CPU has its own task.
- Now, both CPUs execute separate code blocks simultaneously, each performing a different task.

Code executed by CPU "a":

program:
...
do task "A"
...
end program

Code executed by CPU "b":

program:
...
do task "B"
...
end program

This concept can now be generalized to any number of processors.
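The same pattern can be written in a real language; the following is a hedged sketch in Go (one of the languages listed under Language support below), where worker IDs stand in for the CPU names "a" and "b" and the print statements are placeholders for the task bodies:

// Task-parallel version of the pseudocode: every worker runs the same
// program, and a conditional on its identity selects its task, so task
// "A" and task "B" execute simultaneously.
package main

import (
	"fmt"
	"sync"
)

func worker(id string, wg *sync.WaitGroup) {
	defer wg.Done()
	if id == "a" {
		fmt.Println("doing task A") // placeholder for task "A"
	} else if id == "b" {
		fmt.Println("doing task B") // placeholder for task "B"
	}
}

func main() {
	var wg sync.WaitGroup
	// The slice of IDs generalizes the idea to any number of workers.
	for _, id := range []string{"a", "b"} {
		wg.Add(1)
		go worker(id, &wg)
	}
	wg.Wait()
}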
Language support

Task parallelism can be supported in general-purpose languages by either built-in facilities or libraries. Notable examples include:

- Ada: tasks (built-in)
- C++ (Intel): Threading Building Blocks
- C++ (Intel): Cilk Plus
- C++ (open source/Apache 2.0): RaftLib
- C, C++, Objective-C, Swift (Apple): Grand Central Dispatch
- D: tasks and fibers
- Delphi: System.Threading.TParallel
- Go: goroutines
- Java: Java concurrency
- .NET: Task Parallel Library

Examples of fine-grained task-parallel languages can be found in the realm of Hardware Description Languages like Verilog and VHDL.
See also

- Algorithmic skeleton
- Data parallelism
- Fork–join model
- Parallel programming model

References

- Reinders, James (10 September 2007). "Understanding task and data parallelism". ZDNet. Retrieved 8 May 2017.
- Quinn, Michael J. (2007). Parallel Programming in C with MPI and OpenMP (Tata McGraw-Hill ed.). New Delhi: Tata McGraw-Hill. ISBN 978-0070582019.
- Hicks, Michael. "Concurrency Basics" (PDF). University of Maryland: Department of Computer Science. Retrieved 8 May 2017.