Tag Archives: Statistics

R Function Binding Vectors and Matrices of Variable Length, bug fixed

Now this is something very geeky, but useful. I had to bind two matrices or vectors together to become a bigger matrix. However, they need not have the same number of rows or even the same row names.

The standard cbind() functions require the vectors or matrices to be compatible. The matching is “stupid”, in the sense that it ignores any order or assumes that the elements which are to be joined into a matrix have the same row names. This, of course, need not be the case. A classical merge command would fail here, as we dont really know what to merge by and what to merge on.

Ok… I am not being clear here. Suppose you want to merge two vectors

A 2
B 4
C 3

and

G 2
B 1
C 3
E 1

now the resulting matrix should be

A  2  NA
B  4  1
C  3  3
E NA  1
G NA  2

Now the following Rfunction allows you to do this. It is important however, that you assign rownames to the objects to be merged (the A,B,C,E,G in the example), as it does matching on these.

cbindM <-
function(A, v, repl=NA) {

  dif <- setdiff(union(rownames(A),rownames(v)),intersect(rownames(A),rownames(v)))
#if names is the same, then a simple cbind will do
    if(length(dif)==0) {

      A<- cbind(A,v[match(rownames(A),rownames(v))])

        rownames(A) <- rownames(v)

    }    else if(length(dif)>0) {#sets are not equal, so either matrix is longer / shorter

#this tells us which elements in dif are part of A (and of v) respectively      
      for(i in dif)     {

        if(is.element(i,rownames(A))) {
#element is in A but not in v, so add it to v and then a        

          temp<-matrix(data =repl, nrow = 1, ncol = ncol(v), byrow = FALSE, dimnames =list(i))
            v <- rbind(v,temp)

        } else {
# element is in v but not in A, so add it to A

          temp<-matrix(data = repl, nrow = 1, ncol = ncol(A), byrow = FALSE, dimnames =list(i))
            A<-rbind(A,temp)
        }
      }


      A<-cbind(A,v[match(rownames(A),rownames(v))])

    }

  A
}

Note: 09.11.2011: I fixed a bug and added a bit more functionality. You can now tell it, with what you want the missing data to be replaced. Its standard to replace it with NA but you could change it to anything you want.