Archives by date

What’s Wrong with Statistics in Julia?

Introduction Several months ago, I promised to write an updated version of my old post, “The State of Statistics in Julia”, that would describe how Julia’s support for statistical computing has evolved since December 2012. I’ve kept putting off writing that post for several reasons, but the most important reason is that all of my […]

The Lesser Known Normal Forms of Database Design

-1st Normal Form: The database contains at least one table that is an exact copy of another table, except with additional columns.
-2nd Normal Form: The database contains at least one table that is a corrupt, out-of-date copy of another table, except with additional columns. It is impossible to determine if these additional columns can […]

Values vs. Bindings: The Map is Not the Territory

Many newcomers to Julia are confused by the seemingly dissimilar behaviors of the following two functions: julia> a = [1, 2, 3] […]

That Way Madness Lies: Arithmetic on data.frames

tl;dr Please do not use arithmetic on data.frame objects when programming in R. It’s a hack that only works if you know everything about your datasets. If anything happens to change the order of the rows in your data set, previously safe data.frame arithmetic operations will produce incorrect answers. If you learn to always explicitly […]

My Experience at JuliaCon

Introduction I just got home from JuliaCon, the first conference dedicated entirely to Julia. It was a great pleasure to spend two full days listening to talks about a language that I started advocating for just a little more than two years ago. What follows is a very brief review of the talks that excited […]

Falsifiability versus Rationalization

Here are two hypothetical conversations about psychological research. I’ll leave it to others to decide whether these conversations could ever take place. Theories are just directional assertions about effects Person A: And, just as I predicted, I found in my early studies that the correlation between X and Y is 0.4. Person B: What do […]

A Note on the Johnson-Lindenstrauss Lemma

Introduction A recent thread on Theoretical CS StackExchange comparing the Johnson-Lindenstrauss Lemma with the Singular Value Decomposition piqued my interest enough that I decided to spend some time last night reading the standard JL papers. Until this week, I only had a vague understanding of what the JL Lemma implied. I previously mistook the JL […]

Data corruption in R 3.0.2 when using read.csv

Introduction It may be old news to some, but I just recently discovered that the automatic type inference system that R uses when parsing CSV files assumes that data sets will never contain 64-bit integer values. Specifically, if an integer value read from a CSV file is too large to fit in a 32-bit integer […]