> M1
  1 3 4 5 7 8
1 0 0 1 0 0 0
3 0 0 0 0 0 0
4 1 0 0 0 0 0
5 0 0 0 0 0 0
7 0 0 0 0 1 0
8 0 0 0 0 0 0
> M2
  1 3 4 5 8
1 0 0 1 0 0
3 0 0 0 0 0
4 1 0 0 0 0
5 0 0 0 0 0
8 0 0 0 0 0
> M1 %ms% M2
  1 3 4 5 7 8
1 0 0 2 0 0 0
3 0 0 0 0 0 0
4 2 0 0 0 0 0
5 0 0 0 0 0 0
7 0 0 0 0 1 0
8 0 0 0 0 0 0

This is my code:

M1 <- matrix(c(0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0), byrow = TRUE, ncol = 6)
colnames(M1) <- c(1,3,4,5,7,8)
M2 <- matrix(c(0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0), byrow = TRUE, ncol = 5)
colnames(M2) <- c(1,3,4,5,8)
# to data.table objects
DT1 <- data.table(M1, keep.rownames = TRUE, key = "rn")
DT2 <- data.table(M2, keep.rownames = TRUE, key = "rn")
# join and sum of common columns
if (nrow(DT1) > nrow(DT2)) {
    A <- DT2[DT1, roll = TRUE]
    A[, list(X1 = X1 + X1.1, X3 = X3 + X3.1, X4 = X4 + X4.1, X5 = X5 + X5.1, X7, X8 = X8 + X8.1), by = rn]
}

That outputs:

   rn X1 X3 X4 X5 X7 X8
1:  1  0  0  2  0  0  0
2:  3  0  0  0  0  0  0
3:  4  2  0  0  0  0  0
4:  5  0  0  0  0  0  0
5:  7  0  0  0  0  1  0
6:  8  0  0  0  0  0  0

Then I can convert back this data.table to a matrix and fix row and column names.

The questions are:

how to generalize this procedure?

I need a way to automatically create list(X1 = X1 + X1.1, X3 = X3 + X3.1, X4 = X4 + X4.1, X5 = X5 + X5.1, X7, X8 = X8 + X8.1) because i want to apply this function to matrices which dimensions (and row/columns names) are not known in advance.

In summary I need a merge procedure that behaves as described.
there are other strategies/implementations that achieve the same goal that are, at the same time, faster and generalized? (hoping that some data.table monster help me)
to what kind of join (inner, outer, etc. etc.) is assimilable this procedure?

Thanks in advance.

p.s.: I'm using data.table version 1.8.2

EDIT - SOLUTIONS

@Aaron solution. No external libraries, only base R. It works also on list of matrices.

add_matrices_1 <- function(...) {
  a <- list(...)
  cols <- sort(unique(unlist(lapply(a, colnames))))
  rows <- sort(unique(unlist(lapply(a, rownames))))
  out <- array(0, dim = c(length(rows), length(cols)), dimnames = list(rows,cols))
  for (m in a) out[rownames(m), colnames(m)] <- out[rownames(m), colnames(m)] + m
  out
}

@MadScone solution. Used reshape2 package. It works only on two matrices per call.

add_matrices_2 <- function(m1, m2) {
  m <- acast(rbind(melt(M1), melt(M2)), Var1~Var2, fun.aggregate = sum)
  mn <- unique(colnames(m1), colnames(m2))
  rownames(m) <- mn
  colnames(m) <- mn
  m
}

BENCHMARK (100 runs with microbenchmark package)

Unit: microseconds
   expr                min         lq    median         uq       max
1 add_matrices_1   196.009   257.5865   282.027   291.2735   549.397
2 add_matrices_2 13737.851 14697.9790 14864.778 16285.7650 25567.448

No need to comment the benchmark: @Aaron solution wins.

I'll continue to investigate a similar solution for data.table objects.

I'll add other solutions eventually reported or discovered.

Developer IT

Join and sum not compatible matrices through data.table - Developer IT

Join and sum not compatible matrices through data.table

r

join

matrix

merge

data.table

Related posts about r

Related posts about join

Improving Partitioned Table Join Performance

MS SQL 2008, join or no join?

SQL Server 2008, join or no join?

Simple join gives unnecessary rows,How to get join properly with only matches in both tables

Mulitple full joins in Postgres is slow

Categories cloud