EM and Regression Mixture Modeling

[UPDATE: As Will points out in the comments, this isn’t really the EM algorithm. There isn’t a proper E step, because there’s no distribution being estimated: there’s only a maximization step that alternates between maximizing the class labels and the slopes. You can think of this algorithm as a degenerate version of EM in the […]

R Recommendation Contest Launches on Kaggle

The R Recommendation Engine contest is now live on Kaggle. Please head over there and start submitting your predictions for the test data set. Once you do, you can check the leaderboard to see how your algorithm compares with other people’s work. We know that there’s still plenty of progress that can be made, because […]

Build a Recommendation System for R Packages

On Dataists, a new collaborative blog for data hackers that I’m contributing to, we’ve just announced a data contest that’s custom made for R users. To win the contest, you need to build a recommendation system for R packages. To find out more, check out the official announcement on Dataists. Then go to GitHub to […]

ProjectTemplate Version 0.1-3 Released

I’ve just released the newest version of ProjectTemplate. The primary change is a completely redesigned mechanism for automatically loading data. ProjectTemplate can now read compressed CSV files, access CSV data files over HTTP, read Stata, SPSS and RData binary files and even load MySQL database tables automatically. For my own projects, this is a big […]

Two New R Packages: log4r and SortableHTMLTables

I’ve just released two new packages for R: log4r and SortableHTMLTables. log4r is a minimal logging utility for R that’s inspired by the log4j family of logging tools. It has substantially fewer features than other logging tools for R, but it’s hopefully easier to use. SortableHTMLTables uses brew and the jQuery Tablesorter plugin to provide […]

Higher Order Functions in R

Because R is, in part, a functional programming language, the ‘base’ package contains several higher order functions. By higher order functions, I mean functions that take another function as an argument and then do something with that function. If you want to know more about the usefulness of writing higher order functions in general, […]

Doing Maximum Likelihood Estimation by Hand in R

Lately I’ve been writing maximum likelihood estimation code by hand for some economic models that I’m working with. It’s actually a fairly simple task, so I thought that I would write up the basic approach in case there are readers who haven’t built a generic estimation system before. First, let’s start with a toy example […]

iBad: The FSF Kool-Aid and Other Dystopian Hallucinations

The people who worry that the iPad will bring about a dystopian future for home computing keep forgetting something: for the rest of humanity, their ideal world of perfectly hackable machines is already a dystopian nightmare. It’s a world in which nothing works without spending hours setting it up, in which basic features are missing […]

Cleaning Up an iTunes Library with MacRuby

For a little more than a year now, I’ve been meaning to write a script to rename all of the files in my iTunes library so that they’re in proper English title case. In large part, this project was inspired by reading John Gruber’s post about a Perl script that he’d written to convert text […]

Suggestions for TextMate’s Search and Replace

Like so many other programmers, I adore TextMate. For that reason, here are two simple features that I’d enjoy seeing in the next version: A case-preserving search and replace tool. If I search for a string like my_class and want to replace it with my_new_class, I’d like my_class to transform into my_new_class at the same […]