Search Results

Search found 1 results on 1 pages for 'user3695701'.

Page 1/1 | 1 

  • matrix multiplication with MPI [on hold]

    - by user3695701
    I'm working on an assignment on matrix multiplication with MPI. A*B=C. the requirement is that B should be vertically partitioned. Here's what I intend to do: broadcast matrix A to all processes and scatter B into several slices with each slice containing n/p columns. The following code only works when the number of process(p) is 1. when p1(say 2), I got [cluster2:21080] *** Process received signal *** [cluster2:21080] Signal: Segmentation fault (11) [cluster2:21080] Signal code: Address not mapped (1) [cluster2:21080] Failing at address: (nil) [cluster2:21080] [ 0] /lib/libpthread.so.0(+0xf8f0) [0x7f49f38108f0] [cluster2:21080] [ 1] /lib/libc.so.6(memcpy+0xe1) [0x7f49f35024c1] [cluster2:21080] [ 2] /usr/lib/libmpi.so.0(ompi_convertor_unpack+0x121)[0x7f49f47c88e1] [cluster2:21080] [ 3] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x8a26) [0x7f49f0dcea26] [cluster2:21080] [ 4] /usr/lib/openmpi/lib/openmpi/mca_btl_tcp.so(+0x662c) [0x7f49efce462c] [cluster2:21080] [ 5] /usr/lib/libopen-pal.so.0(+0x1ede8) [0x7f49f42e0de8] [cluster2:21080] [ 6] /usr/lib/libopen-pal.so.0(opal_progress+0x99) [0x7f49f42d5369] [cluster2:21080] [ 7] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x5585) [0x7f49f0dcb585] [cluster2:21080] [ 8] /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so(+0xcc01) [0x7f49eeeb1c01] [cluster2:21080] [ 9] /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so(+0x266c) [0x7f49eeea766c] [cluster2:21080] [10] /usr/lib/openmpi/lib/openmpi/mca_coll_sync.so(+0x1388) [0x7f49ef0c0388] [cluster2:21080] [11] /usr/lib/libmpi.so.0(MPI_Bcast+0x10e) [0x7f49f47d025e] [cluster2:21080] [12] ./out(main+0x259) [0x401571] [cluster2:21080] [13] /lib/libc.so.6(__libc_start_main+0xfd) [0x7f49f3498c8d] [cluster2:21080] [14] ./out() [0x400f29] [cluster2:21080] *** End of error message *** Can someone help me? Thanks. //matrices A and B //double* A =(double *)malloc(n*n*sizeof(double)); //double* B =(double *)malloc(n*n*sizeof(double)); //code initializing A,B... //n is the size of the matrix //p is the number of processes //myrank is the rank of calling process MPI_Init (&argc, &argv); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); MPI_Comm_size(MPI_COMM_WORLD, &p); //broadcast A to all processes MPI_Bcast (A, n*n, MPI_DOUBLE, 0, MPI_COMM_WORLD); MPI_Datatype tmp_type, col_type; // extract a slice from B MPI_Type_vector(n, num_of_col_per_slice, n, MPI_DOUBLE, &tmp_type); // position of the first (0) and each next (stride * sizeof(double) ) slice MPI_Type_create_resized(tmp_type, 0, n * sizeof(double), &col_type); MPI_Type_commit(&col_type); //scatter a slice of B to each process MPI_Scatter(B, 1, col_type, B+myrank*n/p, n * n/p, MPI_DOUBLE, 0, MPI_COMM_WORLD); //use blas function to calculate A*sliceOfB and store the resulting slice to C cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, n, n/p, n, 1.0, A, n, B+myrank*n/p, n, 0.0, C+myrank*n/p, n); //gather all those resulting slices into C MPI_Gather (C+myrank*n/p, n*n/p, MPI_DOUBLE, C, n*n/p, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    Read the article

1