Dec 16 2009

Quick Review of Matrix Algebra in R

Lately, I’ve been running a series of fMRI experiments on visual perception. In the interests of understanding the underlying properties of the images I’m using as stimuli, I’ve been trying to learn more about the matrix transformations commonly used for image compression and image manipulation. Thankfully, R provides simple-to-use implementations for all of the matrix operations I wanted to play around with, so it’s been quite easy to get started. For the next few posts, I thought that I’d review the standard matrix techniques for image compression and editing, giving full examples of their implementation in R and demonstrations of their real-world value.

Before I start, I should make sure that you’re familiar with basic matrix operations in R. I imagine almost every R user knows a little bit about matrix algebra and probably knows the basics of using R to perform matrix algebra, but here’s a quick review to make sure I don’t leave anyone in the dark:

Building Matrices

You can build a matrix in R using the matrix function:

1
2
3
4
5
6
m <- matrix(c(1, 0, 0, 1), nrow = 2, ncol = 2, byrow = TRUE)
 
m
#      [,1] [,2]
# [1,]    1    0
# [2,]    0    1

You can test whether an item you’ve been given is a matrix using is.matrix and you can convert appropriate objects to matrices using as.matrix:

1
2
3
4
5
6
7
8
9
m <- matrix(c(1, 0, 0, 1), nrow = 2, ncol = 2, byrow = TRUE)
 
is.matrix(m)
# TRUE
 
as.matrix(data.frame(x = c(1, 0), y = c(0, 1)))
#      x y
# [1,] 1 0
# [2,] 0 1

Diagonal Matrices

The matrix I’ve been building in the examples above is a diagonal matrix, so you could also construct it using diag:

1
2
3
4
5
6
m <- diag(nrow = 2, ncol = 2)
 
m
#      [,1] [,2]
# [1,]    1    0
# [2,]    0    1

In general, you can generate the n by n identity matrix as:

1
2
3
n <- 10
 
diag(nrow = n, ncol = n)

I’ll be exploiting a more interesting use of diag in my next post, so let’s see how you can build matrices other than the identity with diag by specifying a vector of entries along the diagonal:

1
2
3
4
5
6
m <- diag(c(2, 1), nrow = 2, ncol = 2)
 
m
#      [,1] [,2]
# [1,]    2    0
# [2,]    0    1

This form of diag turns out to be extremely useful, as you’ll see once I cover the SVD’s syntax in R.

Matrix Algebra: Addition, Scalar Multiplication, Matrix Multiplication

The three core operations that can be performed on matrices are addition, scalar multiplication and matrix multiplication. These are easy to work with in R:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
m <- matrix(c(0, 2, 1, 0), nrow = 2, ncol = 2, byrow = TRUE)
 
m + m
#      [,1] [,2]
# [1,]    0    4
# [2,]    2    0
 
2 * m
#      [,1] [,2]
# [1,]    0    4
# [2,]    2    0
 
m %*% m
#      [,1] [,2]
# [1,]    2    0
# [2,]    0    2

Be careful with the * operator: it does not perform matrix multiplication, but rather an entry-wise multiplication:

1
2
3
4
5
m * m
 
#      [,1] [,2]
# [1,]    0    4
# [2,]    1    0

Matrix Transposes and Inverses

The next few matrix operations are a little more complex: transposition and inversion. Transposition is the easier of the two. To get the transpose of a matrix, you simply call the t function:

1
2
3
4
t(m)
#      [,1] [,2]
# [1,]    0    1
# [2,]    2    0

In contrast, inversion is a little more complex, partly because the function you’d want to use has a non-obvious name: solve.

1
2
3
4
5
solve(m)
 
#      [,1] [,2]
# [1,]  0.0    1
# [2,]  0.5    0

The reason that solve is called solve is that it’s a general purpose function you can use to solve matrix equations without wasting time computing the full inverse, which is often inefficient. If you want to know more about the computational efficiency issues, you should look into the ideas behind the even faster variant, qr.solve.

Now, you probably know this already, but the definition of a matrix’s inverse is that the product of the matrix and its inverse is the identity matrix, if the inverse exists. I always find this a good way to make sure that I’m correctly computing the inverse of a matrix:

1
2
3
4
5
solve(m) %*% m == diag(nrow = nrow(m), ncol = ncol(m))
 
#      [,1] [,2]
# [1,] TRUE TRUE
# [2,] TRUE TRUE

Further Matrix Operations

The car package defines an inv function, which is simply a new name for solve:

1
2
3
4
5
6
library('car')
inv
 
# function (x) 
# solve(x)
# <environment: namespace:car>

I think this is pretty clever, though I’m loathe to import a package just for this one snippet.

More interestingly, the MASS package defines a ginv function, which gives the matrix pseudoinverse, a generalization of matrix inversion that works for all matrices:

1
2
3
4
5
6
7
8
9
10
11
12
library('MASS')
 
m <- matrix(c(1, 1, 2, 2), nrow = 2, ncol = 2)
 
solve(m)
# Error in solve.default(m) : 
#  Lapack routine dgesv: system is exactly singular
 
ginv(m)
#      [,1] [,2]
# [1,]  0.1  0.1
# [2,]  0.2  0.2

In practice, you can usually get around using the pseudoinverse, but it’s nice to know that it’s at hand all the time.

Eigenvalues and Eigenvectors

Eigenvectors are surely the bane of every starting student of linear algebra, though their considerable power to simplify problems makes them the darling of every applied mathematician. Thankfully, R makes it easy to get these for every matrix:

1
2
3
4
5
6
7
8
9
10
m <- diag(nrow = 2, ncol = 2)
 
eigen(m)
# $values
# [1] 1 1
#
# $vectors
#      [,1] [,2]
# [1,]    0   -1
# [2,]    1    0

Matrix Metadata

Last, but not least, you can get metadata about the shape of a matrix using dim, nrow and ncol:

1
2
3
4
5
6
7
8
9
10
m <- diag(nrow = 2, ncol = 2)
 
dim(m)
# [1] 2 2
 
nrow(m)
# [1] 2
 
ncol(m)
# [1] 2

Hopefully those operations were already familiar to any readers out there, as I doubt that they’ll be clear from this short explanation without prior knowledge. I just felt compelled to review them before using any of them in my next set of posts. Tomorrow, I’ll start going through more interesting matrix algorithms in R, beginning with the SVD.


Dec 15 2009

Some Pitfalls of Object Oriented Programming in R

Yesterday, I wrote a post about implementing setter methods for objects in R. On my way to understanding how to implement these methods correctly, I made a bunch of mistakes. To keep people from wasting their time with these same mistakes, I’m reviewing them here.

Mistake One: Naming Conventions

Every generic method will search for a class method using a single convention: if the generic method name is GENERIC_METHOD and the class name is CLASS_NAME, then the class method is always GENERIC_METHOD.CLASS_NAME. This is true no matter how strange the generic method looks: the generic method id<- for class user has a class method called id<-.user. The strange <- in the method name doesn't change the fact that .user comes at the end of the class method name.

Mistake Two: Inspecting Existing Functions with Strange Names

If you want to get a sense how the function class works, you can simply type class at the R interpreter. Unfortunately, the parsing rules for R insure that this will never work for methods that end in "<-". For that reason, it's not at all clear how to get the definitions of these strangely named functions to come up in the interpreter. When defining these sort of methods, it's sufficient to simply enclose the method name, but that will not work for the purposes of introspection:

1
2
3
'class<-'
 
# [1] "class<-"

As you can see, 'class<-' evaluates to a string, which is neither surprising nor undesirable. Unfortunately, this leaves it unclear how to get at the definition of class<-. The secret for viewing the definitions of these strangely named functions is to enclose the function name in backticks:

1
2
3
`class<-`
 
# function (x, value)  .Primitive("class<-")

Mistake Three: Wasting Time on `<<-`

Before I learned that the trick to writing setter methods is to return an edited copy of the changed object, I was convinced that I needed to evaluate an assignment in the caller's scope. While pursuing this idea, I discovered the <<- operator. It's quite an interesting operator, but it's not the solution to the setter method definition problem.

You see, <<- is an assignment operator that will go searching through enclosing scopes (more precisely, R's environments) until it finds a variable that can be assigned to. Here's an example of how you could use <<- to do something surprising in R:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
i <- 1
 
"++" <- function()
{
  i <<- i + 1
}
 
"++"()
 
print(i)
 
# [1] 2
 
"++"()
 
print(i)
 
# [1] 3

This is pretty cool. In fact, it was cool enough that I wasted a good thirty minutes playing with it before realizing that it wasn't helping anything.

Mistake Four: UseMethod is Magic

Figuring out how to write code that calls UseMethod for functions with many arguments is not trivial: UseMethod's behavior is a little magical, and the R help docs are as spartan as ever with regard to calling UseMethod. If you naively try to write a function like this,

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
i <- 2
 
class(i) <- 'myint'
 
raise.power <- function(x, power)
{
  UseMethod('raise.power', x, power)
}
 
raise.power.myint <- function(x, power)
{
  return(as.numeric(x ** power))
}
 
raise.power(i, 3)

you'll succeed at defining a generic method that generates warnings every single time it's run. This is because you are never allowed to pass more than two arguments to UseMethod. You see, UseMethod automatically discovers the arguments passed to its caller when it's called and otherwise completely ignores the arguments passed to it after the second argument. For that reason, what you need to write is

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
i <- 2
 
class(i) <- 'myint'
 
raise.power <- function(x, power)
{
  UseMethod('raise.power', x)
}
 
raise.power.myint <- function(x, power)
{
  return(as.numeric(x ** power))
}
 
raise.power(i, 3)

with the latter arguments dropped from the call to UseMethod. If you're like me, you'll probably feel that you're going to lose value along the way the first few times that you write this. I suggest you try this approach out, convince yourself that it works, and then just take advantage of the UseMethod voodoo to solve your problems.


Oct 13 2009

Coase and Literary Talent among Scientists

There is no reason to suppose that most human beings are engaged in maximizing anything unless it be unhappiness, and even this with incomplete success.1

Why can’t more scientists write like this? I have the sad impression that most scientists view wit and spirit of this sort as signs of intellectual weakness rather than as signs of personal depth.

  1. R. H. Coase : The Firm, the Market, and the Law : 1, The Aim of the Book

Sep 6 2009

Hating on Norman Mailer

This assault on Norman Mailer impresses me more the more times I read it. I’m not familiar enough with Mailer to judge its accuracy, but the tone is perfect for my taste:

His vulgarity was a more significant factor in his allure than whatever he possessed of high aspiration. The way his most serious ambition was joined to his crassest need made him singularly appealing to a literary public that fed on nonsensical political ideas and fantasies of artistic superstardom, with its fabulous perquisites of cultural ubiquity, wealth, and hot sex.1

  1. Algis Valiunas : Commentary Magazine : The Naked Novelist and the Dead Reputation

Aug 28 2009

Sick Day Achievements

Spending my day at home while being sick has at least one perk: I can catch up with everything on the Internet that I’ve put off reading. After a day of RSS feed reading, I’ve compiled a list of things that I think are awesome and another list of things that I think are awful.

Things I think are awesome:

  1. Robots will take over the world one day: after all, here’s a Sudoku-solving Lego Mindstorms robot.
  2. Even better, there’s also a Rubik’s Cube-solving Lego Mindstorms robot.
  3. My friend Jim Keller’s latest project was featured on the Drupal main page today.
  4. Those cheating chimpanzee women are not to be trusted.
  5. How often does your Congressperson speak up?
  6. Here’s the 56th Carnival of Mathematics
  7. Cognitive psychology has something to say about the use of gendered words in language.
  8. A clever scheme from Bryan Caplan for making money in an irrational economy.
  9. Because repelling just gets boring after a while.

Things I think are awful:

  1. The Rule of Law is flouted once again by an American state.
  2. Swedish regulation at its stupidest.
  3. Englewood has a little town drama.
  4. Most cyclists are killed by terrible drivers.
  5. Still think that our government officials can be trusted?
  6. Eugene Volokh notes a recent and dangerous precedent for homeschoolers in a case that he wishes never had to be decided by the legal system.
  7. The public really, really has prurient taste.

May 18 2009

Stagger Lee

To know a man, you need only find out how he envisions the story of Stagger Lee.


Apr 9 2009

The Painted Veil

I watched the Painted Veil again tonight. I’m always amazed by the quality of the story and the dialogue: while I understand that it’s an adaptation of a classic novel, I am nevertheless continually astonished that such a good movie should ever have been released. There is just such a wealth of wonderful points about the subtler parts of love, affection and betrayal:

Do you know what I find strange? That your husband should never look at you. He looks at the walls, the floor, his shoes…1

  1. The Painted Veil

Mar 10 2009

The ISS Surpasses Venus

I think there’s something beautiful about the International Space Station becoming the brightest object in the night sky after the Moon.


Mar 4 2009

Anatman

The New York Times has a remarkable story up about a New York school teacher who went missing for almost a month after suffering from a case of dissociative fugue. I particularly enjoy how the article portrays the fragility of a self and the potential for massive disconnection from our memories.

Also fascinating is the woman’s assertion that “she can’t let New York win.” I am always surprised that non-native New Yorkers choose to live in a city that they view as an adversary. New York has always been home to me; sometimes it is too garish or agitating for me to stand, but it has never been an opponent to compete with in an effort to prove my ability to endure. If anything, the hardest thing for me is imagining living perpetually far away from New York.


Feb 5 2009

Late Night Flippancy

If history is so corrupted by patriarchical mores that it should be called herstory, should histograms be called herstograms?