课程作业问题。求助中文答案
1.What are the main features/components in parallel computer architectures? Briefly describe the Shenwei Taihu Light in terms of these features/components.
2.Can we use the same approach (for instance apply striped partitioning of the matrix) to parallelize the Jacobi iteration and the Gauss-Seidel iteration? Explain your answers.
3.Explain why the use of standard MPI_SEND and MPI_RECV can cause a deadlock in a parallel execution. Give an example of such a deadlock. Will the use of shared-memory to exchange data sometimes also leads to a deadlock? Explain your answer.
4.What do we understand under the class of so-called “hierarchical methods or algorithms”? What do these methods have in common?
Compare the parallelization of the Multi-Grid method and the Barnes-Hut algorithm, what makes the parallelization of the Barnes-Hut algorithm more complicated?
Can you think of a problem in your (future) application which might benefit from a hierarchical approach? (in the discussion about the last question, you don’t have to give a solution already, just some ideas or thoughts are sufficient).
5.Consider the following MPI program.
(a) Is this a correct program? What should be modified to make it more efficient?
(b) If you compile this program, it will most likely execute without problem, however, if the send and receive of 1 value is replaced by 10^6 numbers (the program is not calculating the sum anymore), what will then happen with the communication?
/* C program: Sum processor numbers by passing them around in a ring. /
#include "mpi.h"
#include
/ Set up communication tags (these can be anything) /
#define to_right 201
#define to_left 102
void main (int argc, char *argv[]) {
int ierror, value, new_value, procnum, numprocs;
int right, left, other, sum, i;
MPI_Status recv_status;
/ Initialize MPI /
MPI_Init(&argc, &argv);
/ Find out this processor number /
MPI_Comm_rank(MPI_COMM_WORLD, &procnum);
/ Find out the number of processors /
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
/ Compute rank (number) of processors to the left and right /
right = procnum + 1;
if (right == numprocs) right = 0;
left = procnum - 1;
if (left == -1) left = numprocs-1;
sum = 0;
value = procnum;
for( i = 0; i < numprocs; i++) {
/ Send to the right /
MPI_Send(&value, 1, MPI_INT, right, to_right,
MPI_COMM_WORLD);
/ Receive from the left /
MPI_Recv(&new_value, 1, MPI_INT, left, to_right,
MPI_COMM_WORLD, &recv_status);
/ Sum the new value /
sum = sum + new_value;
/ Update the value to be passed /
value = new_value;
/ Print out the partial sums at each step /
printf ("PE %d:\t Partial sum = %d\n", procnum, sum);
}
/ Print out the final result /
if (procnum == 0) {
printf ("Sum of all processor numbers = %d\n", sum);
}
/ Shut down MPI */
MPI_Finalize();
return;
}
高性能计算/并行计算基础问题
- 写回答
- 好问题 0 提建议
- 追加酬金
- 关注问题
- 邀请回答
-
1条回答 默认 最新
- threenewbee 2016-06-28 15:53关注
问题太多了,回答第一题吧。如果你采纳了本答案,可以回答更多。
What are the main features/components in parallel computer architectures? Briefly describe the Shenwei Taihu Light in terms of these features/components.
并行计算架构主要分为纯cpu,cpu和gpu混合两种。神威太湖的相关情况可以看公开的报道,据说用的是国产的众核cpu。众核cpu介于cpu和gpu之间。解决 无用评论 打赏 举报
悬赏问题
- ¥20 删除和修改功能无法调用
- ¥15 kafka topic 所有分副本数修改
- ¥15 小程序中fit格式等运动数据文件怎样实现可视化?(包含心率信息))
- ¥15 如何利用mmdetection3d中的get_flops.py文件计算fcos3d方法的flops?
- ¥40 串口调试助手打开串口后,keil5的代码就停止了
- ¥15 电脑最近经常蓝屏,求大家看看哪的问题
- ¥60 高价有偿求java辅导。工程量较大,价格你定,联系确定辅导后将采纳你的答案。希望能给出完整详细代码,并能解释回答我关于代码的疑问疑问,代码要求如下,联系我会发文档
- ¥50 C++五子棋AI程序编写
- ¥30 求安卓设备利用一个typeC接口,同时实现向pc一边投屏一边上传数据的解决方案。
- ¥15 SQL Server analysis services 服务安装失败