Review of R Graphs Cookbook
The kind people at Packt Publishing recently asked me to review one of their newest R books: the R Graphs Cookbook. In general, I think pretty highly of the book: it provides a nice overview of the basic tools for visualizing data in R. If you’re just getting started with creating graphs in R, this [...]
Modern Science and the Bayesian-Frequentist Controversy
The Bayesian-Frequentist debate reflects two different attitudes to the process of doing science, both quite legitimate. Bayesian statistics is well-suited to individual researchers, or a research group, trying to use all the information at its disposal to make the quickest possible progress. In pursuing progress, Bayesians tend to be aggressive and optimistic with their modeling [...]
Inconsistencies in Bayesian Models of Decision-Making
But modeling devices that make sense for an unbiased decisionmaker may not make sense for a biased one. For example, why would individuals have priors and posteriors if they are destined to apply Bayes’ law incorrectly? [1] A question I often ask myself. [1] Wolfgang Pesendorfer: Behavioral Economics Comes of Age: A Review Essay on Advances [...]
Academic Jargon: Field-Specific Insults
Every academic field seems to develop a set of generic insults based on its intellectual toolkit. Here are two examples I hear often: Probabilists and Statisticians: “I think that’s an interesting case, but it’s in a set with measure zero.” Economists: “X group’s behavior is clearly rent-seeking.” Do any readers have good examples from other [...]
A Draft of ProjectTemplate v0.2-1
I’ve just uploaded a new binary of ProjectTemplate to GitHub. This is a draft version of the next release, v0.2-1, which includes some fairly substantial changes and is backwards incompatible in several ways with previous versions of ProjectTemplate. Foremost among the changes is that most of the logic for load.project() is now built into the [...]
The NYC Marathon
New York’s annual marathon took place yesterday. Watching a bit of it on television with my friends, I was struck by how much earlier the starting time was for women than for men. Specifically, professional women started running yesterday at 9:10 AM, while professional men started running at 9:40 AM. (This information comes from the runner’s handbook.) I [...]
The Answer Depends on the Question
To quote from the preface to the first edition in Jeffreys (1961): ‘It is sometimes considered a paradox that the answer depends not only on the observations but on the question; it should be a platitude.’ [1] [1] Generalized Linear Models, P. McCullagh and J. A. Nelder, Chapter 2
Promising R Packages
As a quick note, here are two R packages that were mentioned to me recently and that look promising: reldist and mixtools.
EM and Regression Mixture Modeling
[UPDATE: As Will points out in the comments, this isn't really the EM algorithm. There isn't a proper E step, because there's no distribution being estimated: there's only a maximization step that alternates between maximizing the class labels and the slopes. You can think of this algorithm as a degenerate version of EM in the [...]
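The alternating scheme described in the update can be sketched roughly as follows. This is a minimal illustration in Python, not the code from the post itself, which the excerpt does not show. It assumes each class is a line through the origin, so only slopes are fitted, and the names `fit_slope` and `alternating_regression` are hypothetical.

```python
def fit_slope(points):
    """Least-squares slope of a line through the origin for (x, y) pairs."""
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    return sxy / sxx if sxx > 0 else 0.0


def alternating_regression(xs, ys, init_slopes, max_iter=100):
    """Alternate between hard class labels and per-class slopes.

    Assumption for this sketch: every class is a line through the origin.
    No class probabilities are estimated, which is why this is not a
    proper EM algorithm: both steps are maximizations.
    """
    slopes = list(init_slopes)
    labels = None
    for _ in range(max_iter):
        # Label step: give each point to the line with the smallest squared residual.
        new_labels = [
            min(range(len(slopes)), key=lambda k: (y - slopes[k] * x) ** 2)
            for x, y in zip(xs, ys)
        ]
        if new_labels == labels:
            break  # labels are stable, so the slopes would not change either
        labels = new_labels
        # Slope step: refit each slope using only the points assigned to it.
        for k in range(len(slopes)):
            members = [(x, y) for x, y, lab in zip(xs, ys, labels) if lab == k]
            if members:  # keep the old slope if a class ends up empty
                slopes[k] = fit_slope(members)
    return slopes, labels


# Toy data made up for this illustration: two noiseless lines, y = x and y = 3x.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 2, 3, 12, 15, 18]
slopes, labels = alternating_regression(xs, ys, init_slopes=[0.5, 4.0])
print(slopes, labels)
```

Because each point is assigned outright to a single class, the result depends on the starting slopes, and a poor start can leave a class empty or stuck in a local optimum.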
R Recommendation Contest Launches on Kaggle
The R Recommendation Engine contest is now live on Kaggle. Please head over there and start submitting your predictions for the test data set. Once you do, you can check the leaderboard to see how your algorithm compares with other people’s work. We know that there’s still plenty of progress that can be made, because [...]