Interpretational Challenges with Ideal Point Models

2019-01-20 4-minute read

As a graduate student, I came to love working with the roll call voting data sets that have been compiled for the United States Congress by political scientists like Keith Poole and Howard Rosenthal. These datasets can be represented in simplified form as matrices in which the rows correspond to legislators and the columns correspond to bills that the legislators vote for or against. One encoding strategy that’s particular convenient is to encode voting in favor of a bill as a 1, voting against a bill as a 0 and abstaining as a missing value.

These incomplete binary matrices are so pleasant to work with because they demonstrate how effective dimensionality reduction techniques can truly be in the wild: a matrix completion algorithm that’s run against most of these matrices will easily reproduce the full matrix with well over 90% accuracy using only a few dimensions.

Within political science, the most popular strategy for analyzing these matrices is to employ an ideal point model, which the IQSS blog describes as follows:

People have long thought about politics in spatial terms: “left” and “right” have been used to describe political preferences since at least the French Revolution, when royalists sat on the right and radicals on the left in the Legislative Assembly. Ideal point models attempt to estimate the position of each legislator on the left-right or other dimensions using the votes that they cast on legislation. Basically, the models assume that a legislator will vote in favor of a motion if it moves policy outcomes closer to their most preferred policy. The resulting estimates from these models provide a descriptive summary of the distribution of preferences within a legislature.

Recently it occurred to me that these ideal point models provide an unusually clean example of how assumption-laden the measurement techniques in social science can be. In particular, I suspect that ideal point models measure political positions if and only if the assumption that “a legislator will vote in favor of a motion if it moves policy outcomes closer to their most preferred policy” holds. If, instead of voting based on preferred policies, legislators vote based on loyalty to their party (e.g. by voting for bills their party puts forward and against bills the opposite party put forwards), then ideal point estimates measure the intensity of party loyalty rather than the policy preferences of legislators.

What drove this message home for me was thinking about the following theoretical dataset in which the disconnect between party loyalty and ideology would be complete. Imagine the following scenario:

Once per day, the legislature brings forward a new bill. On even numbered days, a bill is brought forward by Party 1 (e.g. the Democrats); on odd numbered days, a bill is brought forward by Party 2 (e.g. the Republicans). The bill is immediately voted upon by members of both parties.
Let’s additionally assume that the substantive content of each day’s ostensibly new bill is actually the same every day; the only thing that changes is which party has brought the bill forward. On even numbered days, Party 1 advocates for Policy X; on odd numbered days, Party 2 advocates for Policy X.
Finally let’s assume that the members of the party that brings a bill forward will almost all vote for the bill and that the members of the opposite party will almost all vote against the bill they have not brought forward.

Under these assumptions, I claim that an ideal point model will behave as follows:

The ideal points estimated for the two parties’ legislators will show clear clusters with one party clustered in one region of space and another party clustered in a different region.
The ideal points estimate for the bills will also show clear clustering with the even numbered days’ bills in one region of space and the odd numbered days’ bills in another region.

Collectively, we’ll obtain measurements that could be interpreted as evidence of polarization, but which are probably better described as evidence of party loyalty and discipline.

To be clear, my goal with writing out this example is only to clarify in simple form what I think almost all political scientist would already say: the ideal point models being applied to roll call votes are simply measuring the extent to which legislators vote in ways that can be represented in terms of clusters. There is no semantic content to these clusters in the absence of stronger assumptions about the process that drives voting behavior. If one wants to measure ideology with regard to concrete stances on policy, the data sets employed must contain information about the actual policies implied by bills and the models must be designed to be sensitive to the content of the bills.

Again, none of this is new (except perhaps for the example itself), but I found this thought experiment (and, in particular, the imaginary data generating process) so clear that it was worth documenting.