20180413 Principles of Parallel Programming HW1

Principles of Parallel Programming, HW1

Feng Haoran 1600013009

1 Intro

2 Implementation

1. Helper functions

/*
 * Generate a random CUDA matrix with "curand.h", and copy it back to the host as a normal matrix.
 * cm represents "cuda matrix", m represents "matrix".
 */
void generator(float *cm, float *m)
{
    /* ... */
}

/*
 * Check if the result is correct.
 * dst represents the transposed matrix, src represents the original.
 */
bool check(float *dst, float *src)
{
    /* ... */
    return true;
}
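The body of check is not shown above. As a minimal host-side sketch of what such a check could look like, assuming the matrices are stored row-major and are n x n: the names check_transpose and transpose_reference and the explicit n parameter are illustrative and do not come from the original, which fixes n with a macro.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference transpose on the host (not part of the original code).
// src is an n x n row-major matrix; the result holds its transpose.
std::vector<float> transpose_reference(const std::vector<float> &src, int n)
{
    assert(n > 0 && src.size() == static_cast<std::size_t>(n) * n);
    std::vector<float> dst(src.size());
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            dst[j * n + i] = src[i * n + j];  // element (i, j) moves to (j, i)
    return dst;
}

// Hypothetical version of check(): true when dst is the transpose of src.
// The parameter n is an assumption; the original uses "#define n 1024".
bool check_transpose(const float *dst, const float *src, int n)
{
    assert(n > 0);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            // A transpose only copies values, so exact comparison is enough.
            if (dst[j * n + i] != src[i * n + j])
                return false;
        }
    }
    return true;
}
```

Calling check_transpose(dst, src, 1024) on the host copies after the kernel has run would play the same role as check(dst, src) in the listing.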

#include <...>
#include <...>
#include <...>
#include <...>
#include <...>
#include <...>
#include <...>
#include <...>
using namespace std;

#pragma comment(lib, "curand.lib")
#define n 1024
#define tile 32

/*
 * Generate a random CUDA matrix with "curand.h", and copy it back to the host as a normal matrix.
 * cm represents "cuda matrix", m represents "matrix".
 */
void generator(float *cm, float *m)
{
    /* ... */
}

/*
 * Check if the result is correct.
 * dst represents the transposed matrix, src represents the original.
 */
bool check(float *dst, float *src)
{
    /* ... */
    return true;
}

int main()
{
    /* ... */
}
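The original generator fills the matrix on the GPU with cuRAND and copies it back to the host; that code is not shown. For trying out the checking logic without a GPU, a host-only stand-in could look like the sketch below. It swaps cuRAND for std::mt19937, and the name generate_host_matrix and its n and seed parameters are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical host-only stand-in for generator(): returns an n x n row-major
// matrix of uniform random floats between 0 and 1.
// The original draws the values with cuRAND on the device; this uses <random>.
std::vector<float> generate_host_matrix(int n, unsigned seed)
{
    assert(n > 0);
    std::mt19937 rng(seed);  // fixed seed gives a reproducible matrix
    std::uniform_real_distribution<float> dist(0.0f, 1.0f);
    std::vector<float> m(static_cast<std::size_t>(n) * n);
    for (float &x : m)
        x = dist(rng);
    return m;
}
```

A fixed seed keeps test runs reproducible, which makes a failing transpose easier to track down.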

2. Main transpose function

2.1 ***** method

2.2 Optimization step 1

2.3 Optimization step 2

/*
 * Transpose matrix src, and store the result in matrix dst.
 * dst represents the transposed matrix, src represents the original.
 * Optimized step 2: a unit is a 32 * 32 matrix, each thread moves 4 * 1 elements.
 */
__global__ void matrix_trans_3(float *dst, float *src)
{
    /* ... */
    __syncthreads();
    i = blockIdx.y * tile + threadIdx.x;
    j = blockIdx.x * tile + threadIdx.y * 4;
    ind = j * n + i;
    tile_i = threadIdx.x;
    tile_j = threadIdx.y * 4;
    for (int i = 0; i < 4; i++)
    {
        /* ... */
    }
}

int main()
{
    /* ... */
}
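The index arithmetic after __syncthreads() can be checked on the host by looping over the block and thread indices by hand. The sketch below is such an emulation, not the kernel itself. It rests on several assumptions: the first half of the kernel (not shown in the listing) loads a 32 * 32 tile with swapped block indices, each block has 32 * 8 threads, and the loop bodies copy one element per iteration. The names Idx2, emulate_block and emulate_matrix_trans_3 are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int TILE = 32;                       // mirrors "#define tile 32"
constexpr int PER_THREAD = 4;                  // each thread moves 4 * 1 elements
constexpr int BLOCK_ROWS = TILE / PER_THREAD;  // assumed blockDim.y = 8

// Host stand-in for CUDA's blockIdx (hypothetical helper type).
struct Idx2 { int x; int y; };

// Emulates one thread block of the step 2 kernel on an n x n row-major matrix.
void emulate_block(std::vector<float> &dst, const std::vector<float> &src,
                   int n, Idx2 block)
{
    float mat[TILE][TILE];  // plays the role of the shared-memory tile

    // Phase 1 (assumed, not shown in the listing): load the tile from src.
    for (int ty = 0; ty < BLOCK_ROWS; ++ty) {
        for (int tx = 0; tx < TILE; ++tx) {
            int i = block.x * TILE + tx;
            int j = block.y * TILE + ty * PER_THREAD;
            int ind = j * n + i;
            for (int k = 0; k < PER_THREAD; ++k)
                mat[ty * PER_THREAD + k][tx] = src[ind + k * n];
        }
    }

    // Finishing the loops above before starting the next ones stands in
    // for __syncthreads().

    // Phase 2: index arithmetic as in the listing, with swapped block indices.
    for (int ty = 0; ty < BLOCK_ROWS; ++ty) {
        for (int tx = 0; tx < TILE; ++tx) {
            int i = block.y * TILE + tx;
            int j = block.x * TILE + ty * PER_THREAD;
            int ind = j * n + i;
            int tile_i = tx;
            int tile_j = ty * PER_THREAD;
            for (int k = 0; k < PER_THREAD; ++k)
                dst[ind + k * n] = mat[tile_i][tile_j + k];  // assumed loop body
        }
    }
}

// Runs the emulation over the whole grid of (n / TILE) x (n / TILE) blocks.
std::vector<float> emulate_matrix_trans_3(const std::vector<float> &src, int n)
{
    assert(n > 0 && n % TILE == 0);
    std::vector<float> dst(static_cast<std::size_t>(n) * n, 0.0f);
    for (int by = 0; by < n / TILE; ++by)
        for (int bx = 0; bx < n / TILE; ++bx)
            emulate_block(dst, src, n, Idx2{bx, by});
    return dst;
}
```

Under these assumptions, the element read from row r, column c of src ends up at row c, column r of dst, and both the reads from src and the writes to dst walk along rows, which is the usual motivation for staging the data in a tile.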

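3 Execution and performance

4 Special notes

The ** is located at:

matrix_trans_1, 2, 3 under manycore@master:/home/manycore/users/feng.haoran; hw1_1, 2, 3 are the compiled executables.

(1 is the ***** implementation, 2 is the version after optimization step 1, 3 is the version after optimization step 2.)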
