<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>

<channel>
	<title>John Myles White: Die Sudelbücher</title>
	<atom:link href="http://www.johnmyleswhite.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.johnmyleswhite.com</link>
	<description>"He who refuses to do arithmetic is doomed to talk nonsense."</description>
	<pubDate>Tue, 06 Jan 2009 00:23:08 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>April May Be the Cruelest, But January Is the Strangest Month</title>
		<link>http://www.johnmyleswhite.com/notebook/2009/01/06/april-may-be-the-cruelest-but-january-is-the-strangest-month/</link>
		<comments>http://www.johnmyleswhite.com/notebook/2009/01/06/april-may-be-the-cruelest-but-january-is-the-strangest-month/#comments</comments>
		<pubDate>Tue, 06 Jan 2009 00:23:06 +0000</pubDate>
		<dc:creator>John Myles White</dc:creator>
		
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.johnmyleswhite.com/?p=3252</guid>
		<description><![CDATA[I always find January a strange month, because the weather tends to get colder over the course of the month, even though the days get progressively longer. Given that I had already scrounged up data on the temperature in New York City a while back, I thought I should plot a graph showing the strange [...]]]></description>
			<content:encoded><![CDATA[<p>I always find January a strange month, because the weather tends to get colder over the course of the month, even though the days get progressively longer. Given that I had already scrounged up data on the temperature in New York City a while back, I thought I should plot a graph showing the strange disconnect between day length and temperature that characterizes January. I was able to find <a href="http://dc.metblogs.com/2006/12/22/the-longest-night-is-over/">an Excel spreadsheet</a> that calculated the number of hours of daylight New York City received each day of the year; I used my <a href="http://www.engr.udayton.edu/weather/">previous weather data</a> for the average temperatures for each day in 2003, 2005, 2006 and 2007. For the three data points for which I had no temperatures (1/13/2003, 3/1/2003 and 8/28/2007), I used linear interpolation to estimate each missing day&#8217;s average temperature. I skipped 2004 in my analysis because it was a leap year.</p>
<p>The graphs below make clear &#8212; some more so than others &#8212; that January is a distinctive month, because the mean temperature noticeably lags behind the mean length of the day. You can also see a similar pattern for the temperatures in July, which are warmer than June&#8217;s temperatures even though the days are already getting shorter. Both of these imply that it is not the mere presence of sunlight that determines the temperature each day, but rather the accumulated warmth due to the sunlight of the previous month.</p>
<p>To be clear about the construction of the graphs and their interpretation, the rank of a data point is its relative position in the order of the entire data set. The warmest day in a year is ranked 365 and the coldest day is ranked 1; similarly, the longest day in a year is ranked 365 and the shortest day is ranked 1. Day 1 is January 1st; day 365 is December 31st.</p>
<div style="text-align:center;"><img src="http://www.johnmyleswhite.com/notebook/wp-content/uploads/2009/01/2003.png" alt="2003.png" border="0" width="480" height="480" /></div>
<div style="text-align:center;"><img src="http://www.johnmyleswhite.com/notebook/wp-content/uploads/2009/01/2005.png" alt="2005.png" border="0" width="480" height="480" /></div>
<div style="text-align:center;"><img src="http://www.johnmyleswhite.com/notebook/wp-content/uploads/2009/01/2006.png" alt="2006.png" border="0" width="480" height="480" /></div>
<div style="text-align:center;"><img src="http://www.johnmyleswhite.com/notebook/wp-content/uploads/2009/01/2007.png" alt="2007.png" border="0" width="480" height="480" /></div>
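<p>The interpolation and ranking steps described above can be sketched in R. The numbers below are made up for illustration (they are not the New York data), the variable names are hypothetical, and <code>approx()</code> and <code>rank()</code> are simply one reasonable way to do this, not necessarily how the graphs above were produced.</p>

```r
# Hypothetical toy data: seven days of mean temperatures with one gap,
# plus made-up hours of daylight for the same days.
temps = c(31, 28, NA, 25, 27, 30, 33)
day.length = c(9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9)

days = seq_along(temps)
known = !is.na(temps)

# Linear interpolation: estimate the missing day from its known neighbors.
filled = approx(days[known], temps[known], xout = days)$y

# Rank 1 is the coldest (or shortest) day; the largest rank is the warmest
# (or longest) day. Ties, if any, receive averaged ranks by default.
temp.rank = rank(filled)
length.rank = rank(day.length)

print(data.frame(day = days, temp = filled, temp.rank, length.rank))
```

<p>Plotting <code>temp.rank</code> and <code>length.rank</code> against <code>days</code> on the same axes gives the kind of comparison shown in the graphs.</p>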
]]></content:encoded>
			<wfw:commentRss>http://www.johnmyleswhite.com/notebook/2009/01/06/april-may-be-the-cruelest-but-january-is-the-strangest-month/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Linear Regression Sampling Techniques Revisited</title>
		<link>http://www.johnmyleswhite.com/notebook/2009/01/04/linear-regression-sampling-techniques-revisited/</link>
		<comments>http://www.johnmyleswhite.com/notebook/2009/01/04/linear-regression-sampling-techniques-revisited/#comments</comments>
		<pubDate>Sun, 04 Jan 2009 23:58:45 +0000</pubDate>
		<dc:creator>John Myles White</dc:creator>
		
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.johnmyleswhite.com/?p=3246</guid>
		<description><![CDATA[While thinking about linear regression today, I believe that I&#8217;ve realized why using clustered sampling and two point average slope calculation works better than least squares regression with scattered sampling. Specifically, it is the least squares formula that itself causes the problem, because squaring the errors gives undue weight to certain errors, skewing the results. [...]]]></description>
			<content:encoded><![CDATA[<p>While thinking about linear regression today, I believe that I&#8217;ve realized why using clustered sampling and two point average slope calculation works better than least squares regression with scattered sampling. Specifically, it is the least squares formula that itself causes the problem, because squaring the errors gives undue weight to certain errors, skewing the results. This skew quickly goes away as the sample size grows, but in small samples it can make a large difference.</p>
<p>It will probably take me another few weeks before I can find time to prove this mathematically.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.johnmyleswhite.com/notebook/2009/01/04/linear-regression-sampling-techniques-revisited/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Democracy in Action</title>
		<link>http://www.johnmyleswhite.com/notebook/2008/12/31/democracy-in-action/</link>
		<comments>http://www.johnmyleswhite.com/notebook/2008/12/31/democracy-in-action/#comments</comments>
		<pubDate>Wed, 31 Dec 2008 17:10:47 +0000</pubDate>
		<dc:creator>John Myles White</dc:creator>
		
		<category><![CDATA[Politics]]></category>

		<guid isPermaLink="false">http://www.johnmyleswhite.com/?p=3244</guid>
		<description><![CDATA[The Pew Research Center reported yesterday that the voters of 29 states have already approved bans on same-sex marriage.
For me, this observation highlights the absurdity of the naïve apotheosis of populism and democratic institutions that constitutes a core element of the contemporary Western zeitgeist. We tend to take for granted that democracy is something intrinsically [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://pewresearch.org">The Pew Research Center</a> reported yesterday that <a href="http://pewresearch.org/databank/dailynumber/?NumberID=669">the voters of 29 states have already approved bans on same-sex marriage.</a></p>
<p>For me, this observation highlights the absurdity of the naïve apotheosis of populism and democratic institutions that constitutes a core element of the contemporary Western zeitgeist. We tend to take for granted that democracy is something intrinsically good, an assumption that gives strength to the growing scorn we see in our society for &#8220;elitism&#8221; or any other movement that threatens to usurp the will of the people. We Americans seem to invariably forget that, while democratic institutions may sculpt our society in accord with the will of the people, this in no way implies that the people&#8217;s vision of a perfect society is something we should wish to see given form. Democracy does indeed give power to the people, but it does not and can never give the people the moral integrity to put that power to proper use.</p>
<p>Indeed, if the age-old adage that &#8220;power corrupts&#8221; is true, then democracy might even contribute to the moral and intellectual degradation of the populations of democratic nations. Or, as seems more likely, the age-old adage is simply wrong: corruption is a part of the human inheritance, and power, like alcohol, simply brings that latent vice to the forefront.</p>
<p>Before I close, I should note that this is not a peculiarly American problem, though I know many people who would like to claim so. After all, the Swiss are about <a href="http://www.swissinfo.org/eng/front/Voters_to_decide_on_controversial_minaret_ban.html?siteSect=108&#038;sid=9310389&#038;cKey=1215599759000">to vote on a law that would permanently ban the construction of minarets.</a></p>
<p>Really, when I think of all the crimes that democratic nations commit against their own moral codes, it&#8217;s enough to make me wonder if William Henry Vanderbilt was onto something when he said, &#8220;the public be damned.&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.johnmyleswhite.com/notebook/2008/12/31/democracy-in-action/feed/</wfw:commentRss>
		</item>
		<item>
		<title>And the Teslas Just Keep on Coming</title>
		<link>http://www.johnmyleswhite.com/notebook/2008/12/29/and-the-teslas-just-keep-on-coming/</link>
		<comments>http://www.johnmyleswhite.com/notebook/2008/12/29/and-the-teslas-just-keep-on-coming/#comments</comments>
		<pubDate>Mon, 29 Dec 2008 15:47:01 +0000</pubDate>
		<dc:creator>John Myles White</dc:creator>
		
		<category><![CDATA[Psychology]]></category>

		<guid isPermaLink="false">http://www.johnmyleswhite.com/?p=3242</guid>
		<description><![CDATA[I think this YouTube video does a far better job of showcasing the dangers of MRI machines than &#8220;The Magnetic Zone&#8221; video that Siemens distributes. I particularly enjoy the &#8220;take off&#8221; sound that the air cylinder makes three seconds into the clip.
]]></description>
			<content:encoded><![CDATA[<p>I think <a href="http://www.youtube.com/watch?v=RcNXinLX9GA">this YouTube video</a> does a far better job of showcasing the dangers of MRI machines than &#8220;The Magnetic Zone&#8221; video that Siemens distributes. I particularly enjoy the &#8220;take off&#8221; sound that the air cylinder makes three seconds into the clip.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.johnmyleswhite.com/notebook/2008/12/29/and-the-teslas-just-keep-on-coming/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Data Collection Strategies Revisited</title>
		<link>http://www.johnmyleswhite.com/notebook/2008/12/27/data-collection-strategies-revisited/</link>
		<comments>http://www.johnmyleswhite.com/notebook/2008/12/27/data-collection-strategies-revisited/#comments</comments>
		<pubDate>Sat, 27 Dec 2008 01:42:12 +0000</pubDate>
		<dc:creator>John Myles White</dc:creator>
		
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.johnmyleswhite.com/?p=3228</guid>
		<description><![CDATA[To follow up on my post earlier today on two approaches to linear model fitting, I decided to do some Monte Carlo simulations to test the relative strength of my two proposals for data collection strategies. To test the merits of sampling clustered data points versus sampling scattered data points, I generated 100,000 data sets [...]]]></description>
			<content:encoded><![CDATA[<p>To follow up on my post earlier today on two approaches to linear model fitting, I decided to do some Monte Carlo simulations to test the relative strength of my two proposals for data collection strategies. To test the merits of sampling clustered data points versus sampling scattered data points, I generated 100,000 data sets of four sizes (N = 10, N = 100, N = 1,000, N = 10,000) using each of these two approaches. For clustered data sets, I calculated the slope of the regression line using the slope formula everyone learns from remedial algebra; for the scattered data sets, I calculated the regression coefficients using standard linear model algorithms. I then compared these slopes with the true value and calculated the absolute error for each approach. Using these individual errors, I calculated the mean absolute error for each approach. The results are plotted in the graph below and the code I used to run the simulations is at the end of this post.</p>
<div style="text-align:center;"><img src="http://www.johnmyleswhite.com/notebook/wp-content/uploads/2008/12/monte-carlo-results.png" alt="Monte Carlo Results.png" border="0" width="480" height="480" /></div>
<p>As you can see from this graph, the two approaches seem to have indistinguishable performance for large data sets, but the clustered data set approach seems to perform slightly better for data sets of size N = 10. I was quite surprised by this, as I assumed my slightly ad hoc approach would perform worse than the standard approach. I therefore would appreciate any or all of the following: theoretical analyses of the two approaches&#8217; merits for small data sets, the discovery of errors in my code, or an insight about the R function <code>rnorm()</code> that implies these results regardless of the intrinsic quality of the two approaches. One other conceivable source of error in these results &#8212; which I had hoped would wash out with the large number of iterations in my simulations &#8212; is that the data sets used to perform the analysis were simply incomparable, which is problematic because I compared classical regression on scattered point data sets to slope estimation on clustered point data sets. As a follow-up, I should probably consider the performance of classical regression on clustered point data sets relative to scattered point data sets, though this approach would not itself answer the question of which data collection strategy is best.</p>

<div class="wp_codebox"><table width="100%" ><tr id="p32282"><td class="code" id="p3228code2"><pre class="c" style="font-family:monospace;"><span style="color: #339933;"># Compare performance of scattered point regression with slope estimation.</span>
&nbsp;
<span style="color: #339933;"># See whether one algorithm does better with certain size data sets.</span>
&nbsp;
<span style="color: #339933;"># Compare difference in performance on 100,000 samples of four sizes:</span>
<span style="color: #339933;"># 10, 100, 1000, 10000.</span>
sample.<span style="color: #202020;">sizes</span> <span style="color: #339933;">=</span> c<span style="color: #009900;">&#40;</span><span style="color: #0000dd;">10</span><span style="color: #339933;">,</span> <span style="color: #0000dd;">100</span><span style="color: #339933;">,</span> <span style="color: #0000dd;">1000</span><span style="color: #339933;">,</span> <span style="color: #0000dd;">10000</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
classical <span style="color: #339933;">=</span> c<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
dual.<span style="color: #202020;">point</span> <span style="color: #339933;">=</span> c<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span>i in <span style="color: #0000dd;">1</span><span style="color: #339933;">:</span>length<span style="color: #009900;">&#40;</span>sample.<span style="color: #202020;">sizes</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
  slopes <span style="color: #339933;">=</span> c<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  alt.<span style="color: #202020;">slopes</span> <span style="color: #339933;">=</span> c<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  errors <span style="color: #339933;">=</span> c<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  alt.<span style="color: #202020;">errors</span> <span style="color: #339933;">=</span> c<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #339933;"># Classical regression.</span>
  <span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span>iteration in <span style="color: #0000dd;">1</span><span style="color: #339933;">:</span><span style="color: #0000dd;">100000</span><span style="color: #009900;">&#41;</span>
  <span style="color: #009900;">&#123;</span>
    x <span style="color: #339933;">=</span> <span style="color: #0000dd;">1</span><span style="color: #339933;">:</span>sample.<span style="color: #202020;">sizes</span><span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
    y <span style="color: #339933;">=</span> <span style="color: #0000dd;">2</span> <span style="color: #339933;">*</span> x <span style="color: #339933;">+</span> rnorm<span style="color: #009900;">&#40;</span>sample.<span style="color: #202020;">sizes</span><span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    slopes<span style="color: #009900;">&#91;</span>iteration<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> coef<span style="color: #009900;">&#40;</span>lm<span style="color: #009900;">&#40;</span>y ~ x<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#91;</span><span style="color: #0000dd;">2</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
    errors<span style="color: #009900;">&#91;</span>iteration<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> abs<span style="color: #009900;">&#40;</span>slopes<span style="color: #009900;">&#91;</span>iteration<span style="color: #009900;">&#93;</span> <span style="color: #339933;">-</span> <span style="color: #0000dd;">2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  <span style="color: #339933;"># Dual point slope estimation.</span>
  <span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span>iteration in <span style="color: #0000dd;">1</span><span style="color: #339933;">:</span><span style="color: #0000dd;">100000</span><span style="color: #009900;">&#41;</span>
  <span style="color: #009900;">&#123;</span>
    x <span style="color: #339933;">=</span> c<span style="color: #009900;">&#40;</span>rep<span style="color: #009900;">&#40;</span><span style="color: #0000dd;">1</span><span style="color: #339933;">,</span> sample.<span style="color: #202020;">sizes</span><span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span> <span style="color: #339933;">/</span> <span style="color: #0000dd;">2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> rep<span style="color: #009900;">&#40;</span>sample.<span style="color: #202020;">sizes</span><span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span> sample.<span style="color: #202020;">sizes</span><span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span> <span style="color: #339933;">/</span> <span style="color: #0000dd;">2</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    y <span style="color: #339933;">=</span> <span style="color: #0000dd;">2</span> <span style="color: #339933;">*</span> x <span style="color: #339933;">+</span> rnorm<span style="color: #009900;">&#40;</span>sample.<span style="color: #202020;">sizes</span><span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    alt.<span style="color: #202020;">slopes</span><span style="color: #009900;">&#91;</span>iteration<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>mean<span style="color: #009900;">&#40;</span>y<span style="color: #009900;">&#91;</span><span style="color: #009900;">&#40;</span>sample.<span style="color: #202020;">sizes</span><span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span> <span style="color: #339933;">/</span> <span style="color: #0000dd;">2</span> <span style="color: #339933;">+</span> <span style="color: #0000dd;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">:</span>sample.<span style="color: #202020;">sizes</span><span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">-</span> mean<span style="color: #009900;">&#40;</span>y<span style="color: #009900;">&#91;</span><span style="color: #0000dd;">1</span><span style="color: #339933;">:</span><span style="color: #009900;">&#40;</span>sample.<span style="color: #202020;">sizes</span><span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span> <span style="color: #339933;">/</span> <span style="color: #0000dd;">2</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">/</span> <span style="color: #009900;">&#40;</span>sample.<span style="color: #202020;">sizes</span><span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span> <span style="color: #339933;">-</span> <span style="color: #0000dd;">1</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    alt.<span style="color: #202020;">errors</span><span style="color: #009900;">&#91;</span>iteration<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> abs<span style="color: #009900;">&#40;</span>alt.<span style="color: #202020;">slopes</span><span style="color: #009900;">&#91;</span>iteration<span style="color: #009900;">&#93;</span> <span style="color: #339933;">-</span> <span style="color: #0000dd;">2</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
&nbsp;
  classical<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> mean<span style="color: #009900;">&#40;</span>errors<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  dual.<span style="color: #202020;">point</span><span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> mean<span style="color: #009900;">&#40;</span>alt.<span style="color: #202020;">errors</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

]]></content:encoded>
			<wfw:commentRss>http://www.johnmyleswhite.com/notebook/2008/12/27/data-collection-strategies-revisited/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Linear Regression and Decisions about Sampling</title>
		<link>http://www.johnmyleswhite.com/notebook/2008/12/26/linear-regression-and-decisions-about-sampling/</link>
		<comments>http://www.johnmyleswhite.com/notebook/2008/12/26/linear-regression-and-decisions-about-sampling/#comments</comments>
		<pubDate>Fri, 26 Dec 2008 18:03:08 +0000</pubDate>
		<dc:creator>John Myles White</dc:creator>
		
		<category><![CDATA[Observations]]></category>

		<guid isPermaLink="false">http://www.johnmyleswhite.com/?p=3225</guid>
		<description><![CDATA[Lately I&#8217;ve been thinking about the optimal strategy for data collection when you plan to run a linear regression. Clearly, you want a sample of widely distributed points if you&#8217;re unsure that a strict linearity assumption is appropriate. If you already know from theoretical reasons that linearity is appropriate, then you know that you only [...]]]></description>
			<content:encoded><![CDATA[<p>Lately I&#8217;ve been thinking about the optimal strategy for data collection when you plan to run a linear regression. Clearly, you want a sample of widely distributed points if you&#8217;re unsure that a strict linearity assumption is appropriate. If you already know for theoretical reasons that linearity is appropriate, then you know that you only need two correct (x, y) data points to uniquely define the regression line. To get this, one conventionally samples many (x, y) pairs and then computes the regression line&#8217;s slope and intercept. Why not sample only two x data points over and over again instead? If you are trying to find the formula for the line E[y | x], it seems reasonable to assume that high quality estimates of the points (a, E[y | x = a]) and (b, E[y | x = b]) would be a good way to do this.</p>
<p>Is this a reasonable approach to two variable linear regressions? Is this approach less efficient statistically than sampling at many points? Or is the reason to avoid this strategy in practice that one is uncertain of the validity of the linearity assumption in all but exceptional cases?</p>
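<p>One way to put numbers on the efficiency question is the textbook result for simple linear regression: with independent errors of equal standard deviation sigma, the standard error of the least squares slope is sigma divided by the square root of the sum of squared deviations of x from its mean. The sketch below applies that formula to two illustrative designs. The function name <code>slope.se</code>, the choice n = 10 and the x values (1 through n versus half the points at 1 and half at n) are assumptions made for the example, and the comparison only holds if the linearity assumption is correct.</p>

```r
# Standard error of the least squares slope for a fixed design x,
# assuming independent errors with common standard deviation sigma.
slope.se = function(x, sigma = 1)
{
  sigma / sqrt(sum((x - mean(x))^2))
}

n = 10
scattered = 1:n                               # one point at each of 1, 2, ..., n
clustered = c(rep(1, n / 2), rep(n, n / 2))   # half at x = 1, half at x = n

se.scattered = slope.se(scattered)
se.clustered = slope.se(clustered)

# Ratio of slope variances: scattered design relative to clustered design.
variance.ratio = (se.scattered / se.clustered)^2

print(c(se.scattered = se.scattered,
        se.clustered = se.clustered,
        variance.ratio = variance.ratio))
```

<p>For these two designs with n = 10, the ratio works out to 27/11, or about 2.45, in favor of the clustered design. The formula says nothing about how either design behaves when the true relationship is not linear.</p>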
]]></content:encoded>
			<wfw:commentRss>http://www.johnmyleswhite.com/notebook/2008/12/26/linear-regression-and-decisions-about-sampling/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Making the Most of My Mac</title>
		<link>http://www.johnmyleswhite.com/notebook/2008/12/26/making-the-most-of-my-mac/</link>
		<comments>http://www.johnmyleswhite.com/notebook/2008/12/26/making-the-most-of-my-mac/#comments</comments>
		<pubDate>Fri, 26 Dec 2008 15:29:14 +0000</pubDate>
		<dc:creator>John Myles White</dc:creator>
		
		<category><![CDATA[Mac OS X]]></category>

		<guid isPermaLink="false">http://www.johnmyleswhite.com/?p=3223</guid>
		<description><![CDATA[For literally years I&#8217;ve been meaning to write a post about my favorite programs and utilities for the Mac, but I&#8217;ve always managed to put it off. Given that I recently sent my girlfriend my old Powerbook, I thought that I should finally write down a list of the programs and tools that I&#8217;ve found [...]]]></description>
			<content:encoded><![CDATA[<p>For literally years I&#8217;ve been meaning to write a post about my favorite programs and utilities for the Mac, but I&#8217;ve always managed to put it off. Given that I recently sent my girlfriend my old Powerbook, I thought that I should finally write down a list of the programs and tools that I&#8217;ve found worth having as a Mac user. This list is definitely idiosyncratic, with a heavy bias towards programming and scientific tools, but I think that there are still a lot of very good programs on this list that do not always get as much publicity as they deserve. All that said, here&#8217;s my list.</p>
<p>1. <b>Adium</b>: The best chat client for the Mac that I&#8217;m aware of. I use it as a client for GMail chat, AIM and MSN. I&#8217;d use iChat if it worked with all of those services as well as Adium does, but it just doesn&#8217;t as far as I can tell. There are things that iChat does that Adium can&#8217;t do (e.g. video chat), but I don&#8217;t have any use for those features. I could also use the AIM and MSN programs provided by AOL and Microsoft, but I much prefer a single integrated program over several separate programs. (<a href="http://www.adiumx.com/">site</a>)</p>
<p>2. <b>Caffeine</b>: A simple little program that keeps your Mac from going to sleep, turning off the monitor or activating the screen saver. Very useful when you&#8217;re giving presentations. (<a href="http://lightheadsw.com/caffeine/">site</a>)</p>
<p>3. <b>Carbon Emacs</b>: The only build of Emacs that I find reliably renders the keys on my Western Spanish keyboard. It is also the only one that seems to respect traditional Emacs key bindings, which is very important to me. (<a href="http://www.apple.com/downloads/macosx/unix_open_source/carbonemacspackage.html">site</a>)</p>
<p>4. <b>Cubase</b>: My favorite music composition software for the Mac. Cubase is well-deservedly famous as a MIDI sequencer and I&#8217;ve found that it&#8217;s equally good as a multitrack audio recording system. I use it along with a Toneport UX2 to record guitar and Superior Drummer 2.0 for drum tracking, and I&#8217;ve gotten great results so far. (<a href="http://www.steinberg.net/en/products/musicproduction/cubase4_product.html">site</a>)</p>
<p>5. <b>Cyberduck</b>: The FTP/SFTP client I use. I&#8217;m sure there are better tools than Cyberduck (such as Transmit 3), but Cyberduck is free and does the job more than well enough for my needs. (<a href="http://cyberduck.ch/">site</a>)</p>
<p>6. <b>Delicious Library</b>: A program to help you keep a record of all of the books, DVD&#8217;s and CD&#8217;s you own. I find it especially useful for keeping track of the books I lend to people. (<a href="http://www.delicious-monster.com/">site</a>)</p>
<p>7. <b>Flickr Uploader</b>: If I&#8217;m going to upload a lot of photos to Flickr, I really don&#8217;t want to have to use a Web interface. Flickr Uploader lets me do all of the editing on my machine and then send the labelled and tagged photos as a single group to Flickr. Most importantly, the progress I&#8217;ve made in tagging photos isn&#8217;t lost when my Internet connection flakes out. (<a href="http://www.flickr.com/tools/uploadr/">site</a>)</p>
<p>8. <b>Gimp</b>: I&#8217;m too cheap to buy Photoshop, but the recent builds of Gimp for the Mac work well enough for my purposes. (<a href="http://gimp.lisanet.de/Website/Overview.html">site</a>)</p>
<p>9. <b>Graphviz</b>: If I have to draw any sort of graph, I always use Graphviz. It&#8217;s a great interface to compilers for the DOT language developed at Bell Labs to describe graphs. If you can program at all and ever need to write up flowcharts or diagrams of any sort, I think Graphviz is the way to go. You should also know that the Pixelglow build for Macs is much better than the default. (<a href="http://www.pixelglow.com/graphviz/">site</a>)</p>
<p>10. <b>Growl</b>: Growl provides one of those clever little hacks to the basic Mac user interface that Windows users always find impressive: it creates a service for displaying notifications on your screen that quickly fade away after you&#8217;ve seen them. But the truth is that Growl&#8217;s usefulness is only obvious after you&#8217;ve used it for a while. (<a href="http://growl.info/">site</a>)</p>
<p>11. <b>Handbrake</b>: The best video transcoder I know of for the Mac. Whenever I need to change one video format into another, Handbrake&#8217;s been able to do it for me. (<a href="http://handbrake.fr/">site</a>)</p>
<p>12. <b>Hazel</b>: Another great service for Macs: install Hazel and you have a simple daemon that will regularly move files according to a set of rules you define yourself. I use it to sort every file on my desktop into folders specific to filetypes &#8212; moving MP3&#8217;s to one folder and PDF&#8217;s to another. It&#8217;s been a major part of my efforts to be more organized with my files. (<a href="http://www.noodlesoft.com/hazel.php">site</a>)</p>
<p>13. <b>KeePassX</b>: A password manager that I find very helpful for navigating the mass of passwords I need to remember without leaving the passwords as plain text anywhere on my system. (<a href="http://www.keepassx.org/">site</a>)</p>
<p>14. <b>MacFreePOPS</b>: A simple little program that will let you access your Hotmail account from Mail as if it were a POP server. Extremely useful. (<a href="http://www.e-link.it/macfreepops/">site</a>)</p>
<p>15. <b>MacFUSE</b>: Probably the most impressive of all of the hacks created by the Mac user community. MacFUSE allows you to install new file system drivers that run entirely in user space. The result is that you&#8217;ll get easy access to NTFS (i.e. Windows) hard drives and a slew of other formats. I think everyone should put MacFUSE on their machine the day they buy it. (<a href="http://code.google.com/p/macfuse/">site</a>)</p>
<p>16. <b>Mac The Ripper</b>: If I need to make a copy of a DVD I&#8217;ve made, Mac The Ripper makes it much easier for me to do so. Unfortunately only the older version is still freely distributed, but it works for most DVD&#8217;s. (<a href="http://www.mactheripper.org/">site</a>)</p>
<p>17. <b>MarsEdit</b>: Just as I don&#8217;t like using a web interface to upload photos to Flickr, I don&#8217;t much like using one when writing blog posts. So I do all of my writing in MarsEdit, which then handles uploading my finished posts to my server. (<a href="http://www.red-sweater.com/marsedit/">site</a>)</p>
<p>18. <b>Mathematica</b>: I use Mathematica fairly frequently when I want to get a quick sense of how functions behave or when I need to evaluate an integral I&#8217;ve forgotten how to solve by hand. (<a href="http://www.wolfram.com/products/mathematica/index.html">site</a>)</p>
<p>19. <b>Matlab</b>: If I need to do a lot of basic number crunching involving matrices, I always use Matlab. Additionally, I tend to use it along with PsychToolBox and DotsX for coding experiments in neuroscience and psychology. (<a href="http://www.mathworks.com/products/matlab/">site</a>)</p>
<p>20. <b>MySQL</b>: I always use MySQL as the database system for every dynamic web site I build. It works perfectly on Mac OS X these days, so I tend to demo things on my own machine before moving them off to a stand-alone server. (<a href="http://dev.mysql.com/downloads/">site</a>)</p>
<p>21. <b>NetNewsWire</b>: My favorite RSS reader for the Mac. I can&#8217;t speak highly enough of NetNewsWire&#8217;s interface or the fact that the iPhone application is just as great as the desktop version. (<a href="http://www.newsgator.com/INDIVIDUALS/NETNEWSWIRE/">site</a>)</p>
<p>22. <b>OpenOffice</b>: Again, I&#8217;m too cheap to buy a copy of Office, so I use OpenOffice. It&#8217;s managed to serve me pretty well so far. It&#8217;s still a little lacking on the Mac, but it&#8217;s getting much better with time. (<a href="http://www.openoffice.org/">site</a>)</p>
<p>23. <b>Papers</b>: My means for storing and organizing all of my PDF files. Think of it as iTunes for PDF&#8217;s. If you read journal articles, Papers will improve your life more than you could possibly expect. (<a href="http://mekentosj.com/papers/">site</a>)</p>
<p>24. <b>Perian</b>: Perian will outfit your Quicktime player with almost all of the codecs you could want. Without it, I find Quicktime almost useless. (<a href="http://www.perian.org/">site</a>)</p>
<p>25. <b>Perl</b>: The classic programming language needs no introduction, but I think it&#8217;s worth noting that you&#8217;re always better off building your own version of Perl and storing it in <code>/usr/local/</code>, where you won&#8217;t be able to destroy the version that OS X ships with by default. I&#8217;ve also found it nearly impossible to get many modules to build without some customization. (<a href="http://www.cpan.org/">site</a>)</p>
<p>26. <b>Python</b>: Again, I don&#8217;t think Python needs an introduction, but building your own copy seems like a very good idea to me. (<a href="http://www.python.org/">site</a>)</p>
<p>27. <b>Quicksilver</b>: A great tool for getting easy access to programs. I don&#8217;t use nearly as many of Quicksilver&#8217;s features as a lot of people do, but I find it really helpful to be able to avoid using Finder when I don&#8217;t need to. (<a href="http://blacktree.com/">site</a>)</p>
<p>28. <b>R</b>: My language of choice for statistical computing and data analysis. Great tools for producing graphs and an amazing set of facilities for any statistical computation you could ever want to perform. If you want to do statistics like a grown-up statistician would, R is the way to go. (<a href="http://www.r-project.org/">site</a>)</p>
<p>29. <b>ReadIris</b>: My favorite OCR software for the Mac. I use this every time I want to copy a long section of text I&#8217;ve scanned. I invariably have to make corrections by hand, but that&#8217;s much faster than typing everything myself from scratch. Given how well ReadIris performs for me, I have high hopes that one day in my lifetime we&#8217;ll see a properly Bayesian piece of OCR software that gets everything right. (<a href="http://www.irislink.com/c2-532-189/OCR-Software---Product-list.aspx">site</a>)</p>
<p>30. <b>Ruby</b>: Another programming language that needs no introduction, but which I&#8217;d recommend building from source and storing in <code>/usr/local</code>. (<a href="http://www.ruby-lang.org/en/">site</a>)</p>
<p>31. <b>ScreenFlow</b>: The best screencasting software for the Mac I could find. Given the number of features, the quality of the interface and its relatively low cost, I doubt one could find something better for a few years to come. (<a href="http://www.telestream.net/screen-flow/overview.htm">site</a>)</p>
<p>32. <b>Scrivener</b>: An amazing application that makes writing extended works (for me those are mostly translations) much, much easier. I don&#8217;t use it as often these days, but Scrivener is a brilliant tool if you do a lot of writing that can be broken into sections and outlined carefully. (<a href="http://www.literatureandlatte.com/scrivener.html">site</a>)</p>
<p>33. <b>ScummVM</b>: When I&#8217;m not working, I like to play some old LucasArts games. ScummVM makes that possible. (<a href="http://www.scummvm.org/">site</a>)</p>
<p>34. <b>Senuti</b>: If I need to transfer a file off of an iPod (which iTunes makes impossible), Senuti is there for me. It was free for a long time, so I&#8217;m somewhat surprised to find that you&#8217;re supposed to pay for it now. (<a href="http://www.fadingred.com/senuti/">site</a>)</p>
<p>35. <b>Sequel Pro</b>: My favorite database client system. The heir to the great CocoaMySQL application. A perfect complement to MySQL on the Mac. (<a href="http://www.sequelpro.com/">site</a>)</p>
<p>36. <b>Skype</b>: Who doesn&#8217;t use Skype as their VoIP program? (<a href="http://www.skype.com/">site</a>)</p>
<p>37. <b>TexLive</b>: When I want documents to look clean, I always use LaTeX. TexLive is the current standard distribution of LaTeX for UNIX systems and it has some great tools specifically made for the Mac. (<a href="http://www.tug.org/texlive/">site</a>)</p>
<p>38. <b>TextMate</b>: They claim it, and I agree: TextMate is Emacs for the 21st century. If you are young enough that you find GUI&#8217;s helpful and don&#8217;t think touching the mouse is a crime against nature, TextMate is the best text editor you will ever find. Every Rails person worth his salt is a TextMate user and there is an endless supply of bundles to customize TextMate for the language of your choice. (I&#8217;ve recently used it a lot with Matlab, R and Erlang.) (<a href="http://macromates.com/">site</a>)</p>
<p>39. <b>The Unarchiver</b>: If you&#8217;ve ever received a compressed file you couldn&#8217;t open, get The Unarchiver and your problems will be solved. Everything else is a waste of time and/or money. (<a href="http://wakaba.c3.cx/s/apps/unarchiver.html">site</a>)</p>
<p>40. <b>Twitterrific</b>: My favorite Twitter client for the Mac. (<a href="http://iconfactory.com/software/twitterrific">site</a>)</p>
<p>41. <b>Unison</b>: Probably my single favorite tool for the Mac. Unison lets me keep all of the files that matter to me in perfect sync between my laptop and my desktop. In practice, that amounts to a brilliant back-up system, and it makes my life much easier when I do some work on my laptop and then some more work on my desktop. In the end, Unison seems to be the program destined to replace rsync one day. (<a href="http://www.cis.upenn.edu/~bcpierce/unison/">site</a>)</p>
<p>42. <b>VLC</b>: The system I always use to watch anything that doesn&#8217;t open in Quicktime with Perian installed. (<a href="http://www.videolan.org/vlc/">site</a>)</p>
<p>43. <b>VMWare Fusion</b>: Sometimes I need to run Windows or Linux. VMWare Fusion makes it incredibly easy to do so and runs both of those operating systems with remarkable efficiency. (<a href="http://www.vmware.com/products/fusion/">site</a>)</p>
<p>44. <b>Zenmap</b>: If I need to figure out the structure of the network I&#8217;m on, nmap is the tool for doing so. Zenmap provides nmap for the Mac and also a (sometimes) helpful GUI. (<a href="http://nmap.org/zenmap/">site</a>)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.johnmyleswhite.com/notebook/2008/12/26/making-the-most-of-my-mac/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Again with the Null Hypothesis Significance Testing</title>
		<link>http://www.johnmyleswhite.com/notebook/2008/12/26/again-with-the-null-hypothesis-significance-testing/</link>
		<comments>http://www.johnmyleswhite.com/notebook/2008/12/26/again-with-the-null-hypothesis-significance-testing/#comments</comments>
		<pubDate>Fri, 26 Dec 2008 14:15:38 +0000</pubDate>
		<dc:creator>John Myles White</dc:creator>
		
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.johnmyleswhite.com/?p=3221</guid>
		<description><![CDATA[As I was finishing reading &#8220;The Cult of Statistical Significance&#8221; yesterday, the following passage struck me as particularly important:

Rothman computed a p-value function &#8212; a continuous function of p-values mapped against a range of effect sizes. The range of effect sizes was here again measured by the relative risk ratio and includes both beneficial and [...]]]></description>
			<content:encoded><![CDATA[<p>As I was finishing reading &#8220;The Cult of Statistical Significance&#8221; yesterday, the following passage struck me as particularly important:</p>
<blockquote><p>
Rothman computed a <i>p</i>-value function &#8212; a continuous function of <i>p</i>-values mapped against a range of effect sizes. The range of effect sizes was here again measured by the relative risk ratio and includes both beneficial and nonbeneficial effects. He shows that another hypothesis, a fantastically beneficial risk ratio, RR = 4.1, shares the same <i>p</i>-value, .14, as the null, RR = 1.0 (2002, 125). This is common in medicine and all the sciences. To think that <i>p</i>-values have a 1-to-1 correspondence with a unique risk ratio is to ignore the symmetry of the <i>p</i>-function.<sup>1</sup>
</p></blockquote>
<p>I wish the symmetry of the distributions used for testing significance, especially the <i>t</i> distribution, were emphasized to students during their introduction to statistics. We generally arrange the comparison so that the <i>t</i>-value is positive and then ask whether we can reject the null hypothesis of zero difference between the means of two sets of observations. But it is always possible to test another null hypothesis, in which the difference between the means of the two groups lies as far above the difference we observed as zero lies below it. Because the test distribution is symmetric, we will fail to reject this second hypothesis every time that we fail to reject the primary null hypothesis of zero difference. Yet we never test this hypothesis, despite there being no good mathematical reason for the omission. The only justification is an implicit Bayesian prior in defense of the null hypothesis rather than its doppelgänger hypothesis, in which the difference is much larger than we have seen in practice. Is this implicit underweighting of the alternative null hypothesis really sound? That is an empirical question that is, unfortunately, not likely to be answered soon, but it suggests that conventional statistical practice may consistently underestimate the effects being examined using significance testing.</p>
<p>Of course, this problem is itself tied to the erroneous conflation of a failure to reject the null hypothesis with its acceptance &#8212; with the result that statistically insignificant effects are treated as non-existent, rather than inconclusively determined by the data at hand. Statistically insignificant differences tend not to be tested empirically a second time, so that it is hard to know how often they are really larger than our first experiments suggested.</p>
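<p>The symmetry argument can be checked numerically. What follows is only a minimal sketch: it uses a normal approximation from the Python standard library in place of the <i>t</i> distribution, and the observed difference and standard error are hypothetical numbers chosen for illustration, not values from Rothman or from the book.</p>

```python
from statistics import NormalDist


def two_sided_p(observed_diff, null_diff, std_error):
    """Two-sided p-value for H0: true difference == null_diff.

    Uses a normal approximation rather than the t distribution.
    """
    z = (observed_diff - null_diff) / std_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


observed = 1.5  # hypothetical observed difference between the group means
se = 1.0        # hypothetical standard error of that difference

# The usual null hypothesis: no difference at all.
p_zero = two_sided_p(observed, 0.0, se)

# The mirror-image null: a true difference twice as large as the one observed.
p_mirror = two_sided_p(observed, 2 * observed, se)

print(round(p_zero, 4), round(p_mirror, 4))
```

<p>Both calls return the same <i>p</i>-value (about 0.13 with these numbers), because the two null hypotheses sit at equal distances on either side of the observed difference.</p>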
<ol class="footnotes"><li id="footnote_0_3221" class="footnote">Stephen T. Ziliak and Deirdre N. McCloskey : The Cult of Statistical Significance : On Drugs, Disability and Death</li></ol>]]></content:encoded>
			<wfw:commentRss>http://www.johnmyleswhite.com/notebook/2008/12/26/again-with-the-null-hypothesis-significance-testing/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Proving the Obvious and Understanding the Not-So Obvious</title>
		<link>http://www.johnmyleswhite.com/notebook/2008/12/13/proving-the-obvious-and-understanding-the-not-so-obvious/</link>
		<comments>http://www.johnmyleswhite.com/notebook/2008/12/13/proving-the-obvious-and-understanding-the-not-so-obvious/#comments</comments>
		<pubDate>Sat, 13 Dec 2008 18:52:27 +0000</pubDate>
		<dc:creator>John Myles White</dc:creator>
		
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.johnmyleswhite.com/?p=3219</guid>
		<description><![CDATA[Continuing on with my exploration of the National Survey of Drug Use and Health, I thought that I should calculate some simple conditional frequency statistics. The graph below strikes me as a very good example of how conditional probabilities play out in the real world. From it, you can see how the right piece of [...]]]></description>
			<content:encoded><![CDATA[<p>Continuing on with my exploration of the National Survey of Drug Use and Health, I thought that I should calculate some simple conditional frequency statistics. The graph below strikes me as a very good example of how conditional probabilities play out in the real world. From it, you can see how the right piece of information can radically improve your ability to make guesses about the answer to another question.</p>
<div style="text-align:center;"><img src="http://www.johnmyleswhite.com/notebook/wp-content/uploads/2008/12/cigarettes-and-cocaine.png" alt="Cigarettes and Cocaine.png" border="0" width="480" height="480" /></div>
<p>To quantify the pattern that you can see in the chart, only 4% of those who&#8217;ve tried cocaine have not also tried cigarettes at some point in their lives. In contrast, 49% of those who&#8217;ve tried cigarettes have never tried cocaine. In general, people are unlikely to try cocaine, but those who do are almost certain to have tried cigarettes as well. In other words, cocaine use tells you a lot about cigarette use, but cigarette use tells you effectively nothing about cocaine use. If you meet someone who&#8217;s tried cocaine, and you assume that they&#8217;ve also tried cigarettes, these statistics suggest that your assumption will be wrong less than 5% of the time.</p>
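<p>The asymmetry between the two conditional probabilities is easy to reproduce from a table of counts. The counts in this sketch are hypothetical: I chose them only so that the ratios match the 4% and 49% figures quoted above, and they are not the actual counts from the survey.</p>

```python
# Hypothetical counts, picked so the ratios match the 4% and 49% figures;
# they are NOT the survey's real counts.
both = 1224             # tried both cigarettes and cocaine
cigarettes_only = 1176  # tried cigarettes but never cocaine
cocaine_only = 51       # tried cocaine but never cigarettes

# P(never tried cigarettes | tried cocaine)
p_no_cig_given_coke = cocaine_only / (cocaine_only + both)

# P(never tried cocaine | tried cigarettes)
p_no_coke_given_cig = cigarettes_only / (cigarettes_only + both)

print(f"P(never cigarettes | tried cocaine) = {p_no_cig_given_coke:.2f}")
print(f"P(never cocaine | tried cigarettes) = {p_no_coke_given_cig:.2f}")
```

<p>The same group of people who tried both substances appears in each ratio; only the denominator changes, which is why conditioning in one direction is so much more informative than conditioning in the other.</p>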
]]></content:encoded>
			<wfw:commentRss>http://www.johnmyleswhite.com/notebook/2008/12/13/proving-the-obvious-and-understanding-the-not-so-obvious/feed/</wfw:commentRss>
		</item>
		<item>
		<title>National Survey of Drug Use and Health</title>
		<link>http://www.johnmyleswhite.com/notebook/2008/12/11/national-survey-of-drug-use-and-health/</link>
		<comments>http://www.johnmyleswhite.com/notebook/2008/12/11/national-survey-of-drug-use-and-health/#comments</comments>
		<pubDate>Thu, 11 Dec 2008 23:50:12 +0000</pubDate>
		<dc:creator>John Myles White</dc:creator>
		
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.johnmyleswhite.com/?p=3216</guid>
		<description><![CDATA[Lately, I&#8217;ve been exploring the data set that was recently released by the National Survey of Drug Use and Health. There&#8217;s enough raw data in it to spend months trying to make sense of it all. That said, for the moment I thought that I would simply post the following chart I generated using a [...]]]></description>
			<content:encoded><![CDATA[<p>Lately, I&#8217;ve been exploring the data set that was recently released by the National Survey of Drug Use and Health. There&#8217;s enough raw data in it to spend months trying to make sense of it all. That said, for the moment I thought that I would simply post the following chart I generated using a very quick calculation of the relative frequencies of substance abuse broken down by substance.</p>
<div style="text-align:center;"><img src="http://www.johnmyleswhite.com/notebook/wp-content/uploads/2008/12/substance-abuse.png" alt="Substance Abuse.png" border="0" width="480" height="480" /></div>
<p>The variables used in this analysis were ABUSEALC, ABUSECOC, ABUSEHAL, ABUSEHER, ABUSEINH, ABUSEMRJ, ABUSEANL, ABUSESED, ABUSESTM, ABUSETRN. The meanings of these variables are somewhat obscure, but my hope is that the definition of abuse is similar enough across substances to allow for a relative frequency analysis. For each substance, I counted the subjects classified as abusing it and then divided that count by the total number of subjects in the data set to find a frequency of abuse for that substance.</p>
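<p>The calculation described above can be sketched in a few lines. The records below are made up for illustration and cover only three of the ten variables. The sketch also assumes that each variable is coded as 1 for abuse and 0 otherwise, which may not match the coding in the real data set.</p>

```python
# Made-up records for illustration only; these are not rows from the survey.
# Assumption: each ABUSE* variable is coded 1 (abuse) or 0 (no abuse).
subjects = [
    {"ABUSEALC": 1, "ABUSEMRJ": 1, "ABUSECOC": 0},
    {"ABUSEALC": 1, "ABUSEMRJ": 0, "ABUSECOC": 0},
    {"ABUSEALC": 0, "ABUSEMRJ": 0, "ABUSECOC": 0},
    {"ABUSEALC": 0, "ABUSEMRJ": 0, "ABUSECOC": 0},
]
variables = ["ABUSEALC", "ABUSEMRJ", "ABUSECOC"]

total_subjects = len(subjects)

# For each substance: count of subjects flagged as abusing it,
# divided by the total number of subjects.
frequencies = {
    name: sum(row[name] for row in subjects) / total_subjects
    for name in variables
}

for name, freq in frequencies.items():
    print(f"{name}: {freq:.2f}")
```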
]]></content:encoded>
			<wfw:commentRss>http://www.johnmyleswhite.com/notebook/2008/12/11/national-survey-of-drug-use-and-health/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Breast Cancer and Early First Pregnancy?</title>
		<link>http://www.johnmyleswhite.com/notebook/2008/12/09/breast-cancer-and-early-first-pregnancy/</link>
		<comments>http://www.johnmyleswhite.com/notebook/2008/12/09/breast-cancer-and-early-first-pregnancy/#comments</comments>
		<pubDate>Tue, 09 Dec 2008 00:24:25 +0000</pubDate>
		<dc:creator>John Myles White</dc:creator>
		
		<category><![CDATA[Citations]]></category>

		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.johnmyleswhite.com/?p=3213</guid>
		<description><![CDATA[Reading David Freedman&#8217;s book &#8220;Statistical Models: Theory and Practice&#8221; today, I was very struck by this passage:

Example 1. In cross-national comparisons, there is a striking correlation between the number of telephone lines per capita in a country and the death rate from breast cancer in that country. This is not because talking on the telephone [...]]]></description>
			<content:encoded><![CDATA[<p>Reading David Freedman&#8217;s book &#8220;Statistical Models: Theory and Practice&#8221; today, I was very struck by this passage:</p>
<blockquote><p>
Example 1. In cross-national comparisons, there is a striking correlation between the number of telephone lines per capita in a country and the death rate from breast cancer in that country. This is not because talking on the telephone causes cancer. Richer countries have more phones and higher cancer rates. The probable explanation for the excess cancer risk is that women in richer countries have fewer children. Pregnancy &#8212; especially early first pregnancy &#8212; is protective.<sup>1</sup>
</p></blockquote>
<p>Is Freedman correct about the protective benefits of pregnancy? This would be remarkable if true. And, in the absence of evidence to the contrary, I am likely to believe Freedman&#8217;s claims.</p>
<ol class="footnotes"><li id="footnote_0_3213" class="footnote">David Freedman : Statistical Models: Theory and Practice : Chapter I</li></ol>]]></content:encoded>
			<wfw:commentRss>http://www.johnmyleswhite.com/notebook/2008/12/09/breast-cancer-and-early-first-pregnancy/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Alexander and His Hectors</title>
		<link>http://www.johnmyleswhite.com/notebook/2008/12/07/alexander-and-his-hectors/</link>
		<comments>http://www.johnmyleswhite.com/notebook/2008/12/07/alexander-and-his-hectors/#comments</comments>
		<pubDate>Sun, 07 Dec 2008 18:38:25 +0000</pubDate>
		<dc:creator>John Myles White</dc:creator>
		
		<category><![CDATA[Citations]]></category>

		<guid isPermaLink="false">http://www.johnmyleswhite.com/?p=3211</guid>
		<description><![CDATA[I suspect that anyone with an interest in evolutionary psychology or game theory will enjoy this passage from Taleb&#8217;s &#8220;The Black Swan:&#8221;

I discovered that it is much more effective to act like a nice guy and be “reasonable” if you prove willing to go beyond just verbiage. You can afford to be [...]]]></description>
			<content:encoded><![CDATA[<p>I suspect that anyone with an interest in evolutionary psychology or game theory will enjoy this passage from Taleb&#8217;s &#8220;The Black Swan&#8221;:</p>
<blockquote><p>
I discovered that it is much more effective to act like a nice guy and be “reasonable” if you prove willing to go beyond just verbiage. You can afford to be compassionate, lax, and courteous if, once in a while, when it is least expected of you, but completely justified, you sue someone, or savage an enemy, just to show that you can walk the walk.<sup>1</sup>
</p></blockquote>
<ol class="footnotes"><li id="footnote_0_3211" class="footnote">Nassim Nicholas Taleb : The Black Swan: The Impact of the Highly Improbable : Preface</li></ol>]]></content:encoded>
			<wfw:commentRss>http://www.johnmyleswhite.com/notebook/2008/12/07/alexander-and-his-hectors/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Masquerading as Rigorous Science</title>
		<link>http://www.johnmyleswhite.com/notebook/2008/12/04/masquerading-as-rigorous-science/</link>
		<comments>http://www.johnmyleswhite.com/notebook/2008/12/04/masquerading-as-rigorous-science/#comments</comments>
		<pubDate>Thu, 04 Dec 2008 02:35:12 +0000</pubDate>
		<dc:creator>John Myles White</dc:creator>
		
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.johnmyleswhite.com/?p=3209</guid>
		<description><![CDATA[
In our days, serious arguments have been made from data. Beautiful, delicate theorems have been proved; although the connection with data analysis often remains to be established. And an enormous amount of fiction has been produced, masquerading as rigorous science.1

I would like to believe that, if only more statisticians wrote like David Freedman, we might [...]]]></description>
			<content:encoded><![CDATA[<blockquote><p>
In our days, serious arguments have been made from data. Beautiful, delicate theorems have been proved; although the connection with data analysis often remains to be established. And an enormous amount of fiction has been produced, masquerading as rigorous science.<sup>1</sup>
</p></blockquote>
<p>I would like to believe that, if only more statisticians wrote like David Freedman, we might succeed in ridding ourselves of so much of the fashionable nonsense that masquerades as science today.</p>
<p>Hat tip to Jiaying Zhao for bringing this truly amazing article to my attention.</p>
<ol class="footnotes"><li id="footnote_0_3209" class="footnote">David Freedman : Foundations of Science : <a href="http://www.stat.berkeley.edu/~census/fos.pdf">Some Issues in the Foundation of Statistics</a></li></ol>]]></content:encoded>
			<wfw:commentRss>http://www.johnmyleswhite.com/notebook/2008/12/04/masquerading-as-rigorous-science/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Posner at His Best</title>
		<link>http://www.johnmyleswhite.com/notebook/2008/12/01/posner-at-his-best/</link>
		<comments>http://www.johnmyleswhite.com/notebook/2008/12/01/posner-at-his-best/#comments</comments>
		<pubDate>Mon, 01 Dec 2008 02:51:48 +0000</pubDate>
		<dc:creator>John Myles White</dc:creator>
		
		<category><![CDATA[Citations]]></category>

		<guid isPermaLink="false">http://www.johnmyleswhite.com/?p=3207</guid>
		<description><![CDATA[This passage from today&#8217;s post on the Becker-Posner blog exemplifies the reasons why I have so much respect for Posner:

As the campaigns progressed, a surprising number of conservatives switched their support to Obama. Thoughtful conservatives, already disturbed by the accumulation of blunders of the current Administration (the Iraq WMD, Katrina, the Justice Department scandals), culminating [...]]]></description>
			<content:encoded><![CDATA[<p>This passage from <a href="http://www.becker-posner-blog.com/archives/2008/11/the_future_of_c.html">today&#8217;s post</a> on <a href="http://www.becker-posner-blog.com">the Becker-Posner blog</a> exemplifies the reasons why I have so much respect for Posner:</p>
<blockquote><p>
As the campaigns progressed, a surprising number of conservatives switched their support to Obama. Thoughtful conservatives, already disturbed by the accumulation of blunders of the current Administration (the Iraq WMD, Katrina, the Justice Department scandals), culminating in its uncertain response to the financial crisis, were appalled at the iconic status that Joe the Plumber attained in the Republican campaign, the wild rumors spread by the conservative bloggers and talk-radio hosts, and the intellectual vacuity of many Republican candidates and advocates. The Republican Party seemed to have descended to anti-intellectualism&#8211;to deriding highly educated people who speak in complete sentences as &#8220;elitists,&#8221; as compared to the down-to-the-earth ignorance of Joe and his ilk&#8211;which sorts badly with the strong intellectual tradition of conservatism. It is a self-defeating strategy of conservatives to argue that &#8220;all&#8221; intellectuals are liberal and therefore conservatives should think with their guts rather than their brains.</p>
<p>For myself, I would be happy to see conservatism exit from the political scene&#8211;provided it takes liberalism with it. I would like to see us enter a post-ideological era in which policies are based on pragmatic considerations rather than on conformity to a set of preconceptions rooted in a rapidly vanishing past. We have accumulated a substantial history of liberal and conservative failures. The liberal failures include underestimating the cost of egalitarianism and of social engineering by judges (the Warren Court, Roe v. Wade, the near abolition of capital punishment), and the benefits of discipline, of punishment, of enforcing principles of personal responsibility, and of military force. The conservative failures include overestimating the efficiency of unregulated markets, the efficacy of military force, and the beneficent effects of religiosity. Liberals are wrong to promote unions (described by one wag, albeit with some exaggeration, as the parasites that kill their hosts) and conservatives to promote abstinence as a substitute for condoms in preventing teenage pregnancy.<sup>1</sup>
</p></blockquote>
<ol class="footnotes"><li id="footnote_0_3207" class="footnote">Richard Posner : The Becker-Posner Blog : The Future of Conservatism &#8212; Posner</li></ol>]]></content:encoded>
			<wfw:commentRss>http://www.johnmyleswhite.com/notebook/2008/12/01/posner-at-his-best/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Suicide Rates and GDP</title>
		<link>http://www.johnmyleswhite.com/notebook/2008/11/30/suicide-rates-and-gdp/</link>
		<comments>http://www.johnmyleswhite.com/notebook/2008/11/30/suicide-rates-and-gdp/#comments</comments>
		<pubDate>Sun, 30 Nov 2008 19:40:33 +0000</pubDate>
		<dc:creator>John Myles White</dc:creator>
		
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.johnmyleswhite.com/?p=3205</guid>
		<description><![CDATA[As part of an ongoing project on the behavioral consequences of tryptophan depletion, I read an article today that claimed to have found a positive correlation between high levels of corn consumption and homicide across many nations. The researchers claimed that corn, being deficient in tryptophan, chronically depletes serotonin levels, thereby increasing incidents of physical [...]]]></description>
			<content:encoded><![CDATA[<p>As part of an ongoing project on the behavioral consequences of tryptophan depletion, I read <a href="http://scholar.google.com/scholar?q=mawson%20jacobs%20tryptophan%20corn&#038;ie=utf-8&#038;oe=utf-8&#038;rls=org.mozilla:en-US:official&#038;client=firefox-a&#038;um=1&#038;sa=N&#038;tab=ws">an article today that claimed to have found a positive correlation between high levels of corn consumption and homicide across many nations</a>. The researchers claimed that corn, being deficient in tryptophan, chronically depletes serotonin levels, thereby increasing incidents of physical violence.</p>
<p>I was fascinated by the claim, albeit rather incredulous. But, rather than pursue the question of tryptophan&#8217;s effects on suicide, I decided to look into a question I&#8217;ve often wondered about: the correlation of GDP and suicide rates.</p>
<p>After some data diving of my own, using <a href="http://www.imf.org/external/pubs/ft/weo/2008/02/weodata/weorept.aspx?sy=2007&#038;ey=2007&#038;scsm=1&#038;ssd=1&#038;sort=country&#038;ds=.&#038;br=1&#038;pr1.x=33&#038;pr1.y=18&#038;c=512%2C446%2C914%2C666%2C612%2C668%2C614%2C672%2C311%2C946%2C213%2C137%2C911%2C962%2C193%2C674%2C122%2C676%2C912%2C548%2C313%2C556%2C419%2C678%2C513%2C181%2C316%2C682%2C913%2C684%2C124%2C273%2C339%2C921%2C638%2C948%2C514%2C943%2C218%2C686%2C963%2C688%2C616%2C518%2C223%2C728%2C516%2C558%2C918%2C138%2C748%2C196%2C618%2C278%2C522%2C692%2C622%2C694%2C156%2C142%2C624%2C449%2C626%2C564%2C628%2C283%2C228%2C853%2C924%2C288%2C233%2C293%2C632%2C566%2C636%2C964%2C634%2C182%2C238%2C453%2C662%2C968%2C960%2C922%2C423%2C714%2C935%2C862%2C128%2C716%2C611%2C456%2C321%2C722%2C243%2C942%2C248%2C718%2C469%2C724%2C253%2C576%2C642%2C936%2C643%2C961%2C939%2C813%2C644%2C199%2C819%2C184%2C172%2C524%2C132%2C361%2C646%2C362%2C648%2C364%2C915%2C732%2C134%2C366%2C652%2C734%2C174%2C144%2C328%2C146%2C258%2C463%2C656%2C528%2C654%2C923%2C336%2C738%2C263%2C578%2C268%2C537%2C532%2C742%2C944%2C866%2C176%2C369%2C534%2C744%2C536%2C186%2C429%2C925%2C178%2C746%2C436%2C926%2C136%2C466%2C343%2C112%2C158%2C111%2C439%2C298%2C916%2C927%2C664%2C846%2C826&#038;s=NGDPD&#038;grp=0&#038;a=">GDP data from the IMF</a> and <a href="http://www.who.int/mental_health/prevention/suicide_rates/en/index.html">suicide data from WHO</a>, I found no meaningful correlation between suicide rates and GDP. Interestingly, a simple scatterplot of the relevant data sets reveals that, for each gender separately, there are several very substantial outliers that make any such correlation impossible to find, as you can see below.</p>
<div style="text-align:center;"><img src="http://www.johnmyleswhite.com/notebook/wp-content/uploads/2008/11/male-suicides-and-gdp.png" alt="Male Suicides and GDP.png" border="0" width="480" height="480" /></div>
<div style="text-align:center;"><img src="http://www.johnmyleswhite.com/notebook/wp-content/uploads/2008/11/female-suicides-and-gdp.png" alt="Female Suicides and GDP.png" border="0" width="480" height="480" /></div>
<p>So the question I&#8217;m left with is, &#8220;what variables explain the very different suicide rates seen across nations in this data set?&#8221;</p>
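<p>To show how much a single outlier can move a correlation coefficient, here is a small sketch that computes Pearson&#8217;s <i>r</i> by hand. The numbers are toy values invented for the example. They are not the IMF or WHO figures, and the function name is my own.</p>

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy values only: five points on a straight line, then one outlier added.
x_clean = [1, 2, 3, 4, 5]
y_clean = [1, 2, 3, 4, 5]

x_with_outlier = x_clean + [6]
y_with_outlier = y_clean + [0]

r_clean = pearson_r(x_clean, y_clean)
r_with_outlier = pearson_r(x_with_outlier, y_with_outlier)

print(f"r without outlier: {r_clean:.3f}")
print(f"r with outlier:    {r_with_outlier:.3f}")
```

<p>In this toy case one extra point is enough to take a perfect linear relationship down to a weak one, which is the kind of sensitivity that makes the scatterplots above worth inspecting before trusting any single summary number.</p>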
]]></content:encoded>
			<wfw:commentRss>http://www.johnmyleswhite.com/notebook/2008/11/30/suicide-rates-and-gdp/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>

