## Using Sparse Matrices in R

I’ve recently been working with a couple of large, extremely sparse data sets in R. This has pushed me to spend some time trying to master the CRAN packages that support sparse matrices. This post describes three of them: the Matrix, slam and glmnet packages. The first two packages provide data storage classes for […]

## The Psychology of Music and the ‘tuneR’ Package

This semester I’m TA’ing a course on the Psychology of Music taught by Phil Johnson-Laird. It’s been a great course to teach because (i) so much of the material is new to me and (ii) the study of the psychology of music brings together so many of the intellectual tools I enjoy, including […]

## Twitter Math Puzzle and Solution

Yesterday I posted a very simple math puzzle to Twitter that I found in Jonathan Baron’s book, Thinking and Deciding. The puzzle is the following: Show that every number of the form ABC,ABC is divisible by 13. The puzzle comes up in Baron’s book as an example of an “insight problem” in which one goes […]
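
One way to see why the claim holds, sketched here under the assumption that A, B and C are decimal digits, so that ABC is a three-digit block (this is not necessarily the argument given in Baron’s book):

$$
\begin{aligned}
\overline{ABC{,}ABC} &= 1000 \cdot \overline{ABC} + \overline{ABC} \\
&= 1001 \cdot \overline{ABC} \\
&= 7 \cdot 11 \cdot 13 \cdot \overline{ABC}
\end{aligned}
$$

For example, 123,123 = 1001 × 123 = 13 × 9471.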

## Visualizing Periodic Data

Yesterday the Princeton machine learning reading group went through a paper by Tukey on “Some graphic and semigraphic displays”. One issue we talked about at length was Tukey’s idiosyncratic approach to visualizing periodic data in a circular format to emphasize the connections between the “start” and the “end” of the data set. Allison Chaney pointed […]

## ProjectTemplate News

The news below was recently reported on the ProjectTemplate mailing list. For completeness, I’m also reporting it here. The first piece of ProjectTemplate news is that I won’t be the exclusive maintainer for ProjectTemplate anymore. Allen Goodman, who works at BankSimple, is now my co-maintainer and he has full commit privileges. In the next few […]

## Speeding Up MLE Code in R

Recently, I’ve been fitting some models from the behavioral economics literature to choice data. Most of these models amount to non-linear variants of logistic regression in which I want to infer the parameters of a utility function. Because several of these models aren’t widely used, I’ve had to write my own maximum likelihood code to […]

## The Post-Lehman Era

The existence of recessions no more invalidates economic theory than the existence of AIDS invalidates molecular biology.

## Norvig and the Nature of Modern Science

In this, Chomsky is in complete agreement with O’Reilly. (I recognize that the previous sentence would have an extremely low probability in a probabilistic model trained on a newspaper or TV corpus.) Anyone who considers themself an intellectual should be required to read this new essay by Peter Norvig. It’s the best summary I’ve ever […]

## Problems with ggplot2 0.8.9 and R 2.13.0 on Mac OS X via plyr 1.5

This morning I tried to completely update my R installation. I first dumped a list of all the packages I have on my system using the installed.packages() function. Then I installed R 2.13.0 using the OS X disk image. And finally I reinstalled all of my packages from scratch. Unfortunately, I ran into some serious […]

## A Request for Foursquare Data

[UPDATE 3/28/2011: Fixed an enormous bug in the R code.] I’m trying to collect data sets that showcase how the classical statistical distributions appear in modern contexts. I’ve already got some data that shows how the gamma distribution appears in video game scores, and now I’m hoping to find an example where the exponential distribution […]