But Muddle We Must

We are constantly exploited by the tools meant to foil our exploitation. For a progressive to acknowledge as much is tantamount to abandoning progressivism. So it’s no surprise that progressives would rather worry over trivialities such as campaign finance reform than dwell on the paradoxes of political power. But it really isn’t the Citizens United […]

Academic Jargon: Field-Specific Insults

Every academic field seems to develop a set of generic insults based on its intellectual toolkit. Here are two examples I hear often: Probabilists and Statisticians: “I think that’s an interesting case, but it’s in a set with measure zero.” Economists: “X group’s behavior is clearly rent-seeking.” Do any readers have good examples from other […]

A Draft of ProjectTemplate v0.2-1

I’ve just uploaded a new binary of ProjectTemplate to GitHub. This is a draft version of the next release, v0.2-1, which includes some fairly substantial changes and is backwards incompatible in several ways with previous versions of ProjectTemplate. Foremost of the changes is that most of the logic for load.project() is now built into the […]

The NYC Marathon

New York’s annual marathon took place yesterday. Watching a bit of it on television with my friends, I was struck by the much earlier starting time for women than men. Specifically, professional women started running yesterday at 9:10 AM, while professional men started running at 9:40 AM. (This information comes from the runner’s handbook.) I […]

The Answer Depends on the Question

To quote from the preface to the first edition in Jeffreys (1961): ‘It is sometimes considered a paradox that the answer depends not only on the observations but on the question; it should be a platitude.’ (Source: Generalized Linear Models, P. McCullagh and J. A. Nelder, Chapter 2.)

Promising R Packages

As a quick note, here are two R packages that were mentioned to me recently and that look promising: reldist and mixtools.

EM and Regression Mixture Modeling

[UPDATE: As Will points out in the comments, this isn’t really the EM algorithm. There isn’t a proper E step, because there’s no distribution being estimated: there’s only a maximization step that alternates between maximizing the class labels and the slopes. You can think of this algorithm as a degenerate version of EM in the […]
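The alternation described in the update can be sketched roughly as follows. This is only an illustration, not the code from the original post: it assumes two classes, lines through the origin, and squared-error fitting, and the function name `fit_line_mixture` and its parameters are hypothetical.

```python
def fit_line_mixture(xs, ys, init_slopes=(0.0, 1.0), n_iter=20):
    """Alternate between hard class labels and per-class slopes.

    Assumptions (not from the original post): lines pass through the
    origin, and each point belongs to exactly one class.
    """
    slopes = list(init_slopes)
    labels = [0] * len(xs)
    for _ in range(n_iter):
        # Maximize over labels: give each point to the line that fits it best.
        labels = [
            min(range(len(slopes)), key=lambda k: (y - slopes[k] * x) ** 2)
            for x, y in zip(xs, ys)
        ]
        # Maximize over slopes: least-squares refit within each class.
        for k in range(len(slopes)):
            sxx = sum(x * x for x, lab in zip(xs, labels) if lab == k)
            sxy = sum(x * y for x, y, lab in zip(xs, ys, labels) if lab == k)
            if sxx > 0:  # leave the slope unchanged if the class is empty
                slopes[k] = sxy / sxx
    return slopes, labels
```

Because the labels are hard assignments rather than class probabilities, no distribution over labels is ever estimated, which is the sense in which there is no proper E step.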

Apologies for Polluting Twitter

I’d like to publicly apologize to anyone that follows me on Twitter and saw the argument I started with two people yesterday morning. While I still believe that the people on the other side of the argument had behaved inappropriately enough that someone needed to confront them, my actual reaction was completely counter-productive and represented […]

R Recommendation Contest Launches on Kaggle

The R Recommendation Engine contest is now live on Kaggle. Please head over there and start submitting your predictions for the test data set. Once you do, you can check the leaderboard to see how your algorithm compares with other people’s work. We know that there’s still plenty of progress that can be made, because […]

Build a Recommendation System for R Packages

On Dataists, a new collaborative blog for data hackers that I’m contributing to, we’ve just announced a data contest that’s custom made for R users. To win the contest, you need to build a recommendation system for R packages. To find out more, check out the official announcement on Dataists. Then go to GitHub to […]