Is Campaign Finance Data Unusually Dirty Data? At first glance, it sure seems that way.

In an idle moment, just poking around looking at different data files, I decided to load some campaign finance data from New York’s Open Data site. Just go to the site, search on “elections,” pick a file and see what you get. I looked at a couple different files. The analysis below, which is typical, is from the file, “Campaign Finance Expenditures Submitted to the New York State Board of Elections Beginning 1999.” (Note, though I’ve a bunch of questions, I did not call the Board of Elections. I probably will, but I don’t think it necessary before playing with the issues discussed below.) 

Given the issues around campaign finance, should we be at all surprised that the data appear especially dirty? I don’t mean this in a political sense, but in a geek sense.

What a mess:

  • Misspellings and different spellings of the same names
  • Incomplete data
  • Non-existent data
  • Inconsistent date formats
  • Invalid data
And none of that even asks whether the data are accurate.

Here’s an example of an easily avoidable problem: identifying the state in the contributor’s address. It should be pretty easy to get that one right. Right?

Yet, over the fifteen-year period from 1999 through 2014, 9.0 percent of the records (over 195,000 of them) did not even list a state. Those records were associated with reported contributions (perhaps accurate, perhaps valid, but perhaps not) of over $131 million, about 4.7 percent of the reported contributions. And more were clearly invalid. Less than 91 percent of the total records had a valid state identifier. Can you imagine if the Post Office had an error rate like that?
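For anyone who wants to run a similar check, here is a minimal sketch of how missing and invalid state codes might be tallied. It is not the method behind the numbers above. The field names `state` and `amount` are hypothetical, and treating only the 50 states plus DC as valid is an assumption; territories and military codes may deserve a pass too.

```python
from collections import Counter

# USPS two-letter codes for the 50 states plus DC (assumed definition of "valid").
VALID_STATES = frozenset(
    "AL AK AZ AR CA CO CT DE DC FL GA HI ID IL IN IA KS KY LA ME MD MA MI MN MS "
    "MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC SD TN TX UT VT VA WA WV WI WY".split()
)

def classify_state(raw):
    """Return 'missing', 'valid', or 'invalid' for a raw state field."""
    code = (raw or "").strip().upper()
    if not code:
        return "missing"
    return "valid" if code in VALID_STATES else "invalid"

def state_quality(records, state_field="state", amount_field="amount"):
    """Tally record counts, dollar totals, and record shares by status.

    `records` is any iterable of dicts; the field names are hypothetical.
    """
    counts = Counter()
    dollars = Counter()
    for rec in records:
        status = classify_state(rec.get(state_field))
        counts[status] += 1
        dollars[status] += float(rec.get(amount_field) or 0)
    total = sum(counts.values())
    shares = {status: counts[status] / total for status in counts} if total else {}
    return counts, dollars, shares
```

Fed with rows from `csv.DictReader`, the three return values give the share of records and dollars in each bucket, which is all the percentages above require.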

 

Table: Validity of Contributor State IDs, 1999 to 2014 (NYS BOE; Public Signals LLC)
 
Here are some fun examples. The listed state code is on the left and the number of times it was used is on the right. Recognize any of them? Though not shown, my favorite state code in this file was “OZ.” I guess that doesn’t mean Kansas, does it, Dorothy?

Table: Examples of Invalid State Codes in BOE Campaign Expenditure Reports (Public Signals LLC)
 
The date data were at least as messy. My favorite was the submission dated in the year 200. Yes, only three digits. But there were also submissions from 1899, 1900, and 1901.
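A range check catches years like these. The sketch below is illustrative only: the bounds 1999 and 2014 are assumed from the period the file is said to cover, and the function name is hypothetical.

```python
def flag_years(years, first_year=1999, last_year=2014):
    """Split raw year values into in-range counts and a list of suspects.

    A value counts as in range only if it is exactly four ASCII digits and
    falls between first_year and last_year, inclusive (assumed bounds).
    """
    in_range = {}
    suspect = []
    for raw in years:
        text = str(raw).strip()
        is_four_digits = text.isascii() and text.isdigit() and len(text) == 4
        if is_four_digits and first_year <= int(text) <= last_year:
            year = int(text)
            in_range[year] = in_range.get(year, 0) + 1
        else:
            suspect.append(raw)
    return in_range, suspect
```

The suspects come back untouched rather than corrected, since a year of 200 could have been meant as 2000, 2002, or something else entirely.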
 
The years listed are in a field labeled “election year.” Why are the numbers for 2013 so much greater than 2014? Draw your own conclusions.
 
I haven’t explored every data set on New York’s Open Data site, but of the files I have looked at, the elections data files are certainly the lowest quality. Why might that be? Well, what are the usual explanations in other domains?
  • No one involved in preparing, submitting, and reviewing (never mind analyzing) the data has a stake in clean data. Indeed, some might even be advantaged by dirty data, as it clouds and muddies what might otherwise be evident.
  • There’s little or no penalty for inadequate data.
  • Campaigns tend to be short-term affairs, especially losing ones. So even if inclined to get it right, there’s no opportunity for improvement. 

 Well, you can add your own theories. I have others. They’re less geeky and much more cynical.

{ Comments on this entry are closed }

Here’s more detail regarding the last post. It shows percent change in sales tax receipts from NYS Tax and Finance, rank ordered by percent change from SFY 2003 to 2014.

NewImage


Variation, Always Variation

by John W Rodat on December 22, 2014

Was running some quick numbers this morning and generated the attached graphic.

By State Fiscal Year, it shows the percent change in sales tax distributions by the New York State Department of Taxation and Finance to each county outside New York City. The data are indexed, with each year showing the percent change from the first year (as opposed to the prior year). From the receiving end, it shows percent change in county sales and use tax revenue from SFY 2003 to each year after.
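The indexing is simple arithmetic. Here is a minimal sketch, with a hypothetical function name and made-up numbers in the usage note rather than actual county figures.

```python
def index_to_base(series, base_year):
    """Percent change of each year's value relative to the base year.

    `series` maps year -> dollar amount. The base year itself comes out
    as 0.0, and every other year is measured against it, not the prior year.
    """
    base = series[base_year]
    if base == 0:
        raise ValueError("base-year value is zero; percent change is undefined")
    return {
        year: (value - base) / base * 100.0
        for year, value in sorted(series.items())
    }
```

For example, `index_to_base({2003: 100.0, 2004: 110.0, 2014: 150.0}, 2003)` puts 2014 at 50 percent above 2003, whatever happened in the years between.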

The data simply reflect actual funds and are not adjusted for changes in sales tax rates. As can be seen, the largest percent change came in Oswego County. Though not labeled, the second largest was Jefferson County. The lowest was Albany County, which is particularly problematic since Albany is also especially dependent on sales tax revenues. Also unlabeled, the second lowest was Ulster County.

From start to finish, there’s more than a six-fold difference from the lowest to the highest. That’s always worth exploring.

I’ll put up an interactive version later.

NewImage

 


Edward Snowden is the Citizen Four for government misuse of data. Who’s the Snowden for corporate misuse?

The car ride company Uber has a callous corporate culture. It’s not just annoying. It’s frightening. Read Tufekci and King in the New York Times: We Can’t Trust Uber


Ghost of Tom Joad

by John W Rodat on November 12, 2014

Somehow “The Ghost of Tom Joad” feels timely.

Bruce Springsteen and the E Street Band with Tom Morello at Madison Square Garden in NYC, October, 2009.


The latest challenge to the Affordable Care Act has the press and pundits in a dither. The Supreme Court preempted a full panel in the Court of Appeals (District of Columbia Circuit) already scheduled to hear an appeal in December and agreed to hear King v. Burwell. Proponents and opponents alike are amped up at the prospect that, were the Court to find for the plaintiffs, this would be the end of the ACA. The key issue is whether people who get private insurance through the Federal Exchange are entitled to a Federal tax credit. The law’s challengers say, “no, they are not entitled.” Rather, they argue that if you read the law literally, only those who get their coverage through a State exchange are entitled.

I disagree with both the legal challenge (having done some legislative drafting in my time, I can only snort at the plaintiffs’ legal interpretation) and, more importantly, the doomsayers.

I’ll explain the dispute and analyze the potential effects in further detail over the next few days. But here’s the short version.

  1. As of August of this year, 17 States had state-based exchanges, 27 had Federally-facilitated exchanges, and seven had Partnership exchanges.
  2. Should the plaintiffs prevail, those people who have gotten their insurance through the Federal exchange in the 27 states (and, depending on the Court’s view of the Partnership exchanges, possibly 34) would lose their Federal tax credit, and many would no longer be able to afford the insurance offered through the exchange. That is, these people would suffer a tax increase and lose their coverage to boot.
  3. Without further Federal action, state officials in those 27 states could remedy that loss by establishing exchanges, possibly by contracting with the Administration and putting their own label on a portal to the Federal exchange. Should they fail to do so or should they decide not to because of their opposition to the ACA, they would likely feel local pressure. 
  4. While both sides will attempt to take political advantage in Washington, doing nothing would have a very interesting political effect and the greatest pain is likely to occur not in politically Blue States, but Red ones. In those states that have expanded Medicaid, but have opted not to set up their own exchanges, the political pressures on State officials would be even greater. They would have provided a benefit for the poor, but by inaction have declined to help others. 
  5. In any case, citizens of the 17 states that operate their own exchanges would continue to benefit from the ACA, most notably from the tax credit, while citizens of the other states would pay higher taxes and receive no benefit.

Try putting that inter-state comparison in your political calculator.

More to come.


Data Thought for the Day

by John W Rodat on October 16, 2014

Since the Obama election, and particularly the re-election, politicians have gained a deepening appreciation of the use of data for politics – for tactics, targeting and personalization.

It’s just a matter of time – and we can accelerate that time – before they have the same appreciation of using data for governing – including both policy making and operations.


A friend sent me an article from Crain’s on New York’s DSRIP program. (Here’s a related DSRIP article, this one from Modern Healthcare.) I think his query was largely, though not entirely, from a hospital perspective.

And in return, I sent him this one from Health Leaders on the expansion of new health care businesses

And then, he asked me what my takeaway was. 

My answer, especially when combining the two stories, is that the center of gravity in healthcare is shifting from hospitals to health plans. It’s not that the plans have reached national parity in bargaining power, nor, more importantly, parity within many, many markets – where the bargaining actually takes place. They haven’t. But the balance is shifting.

DSRIP requires cooperation – even to the point of governance and managerial nightmares – but that’s not the primary goal. The primary goal is to change the business fundamentals so that there are sustainable reductions in the need for hospitalizations. Why should hospitals cooperate in their own potential demise? Because there’s a narrow path that enables some to survive in a diminished and less inflationary system. The choice before hospitals in New York is between almost certain failure and the reduced risk of failure. These alternatives involve different timelines, but also different risk levels.

Add to that new data-generating tools – the new apps enabled by the iPhone 6 are just an example – and their ability to leverage data where there is none or damn little today, and the balance shifts further and faster. The ability to gather, synthesize, and leverage data is much more up the plans’ alley than the hospitals’.

And DSRIP is about Medicaid and sometimes Medicare. Will hospitals change practices just for one set of patients? No, not successfully. According to Modern Healthcare, hospitals are also more willing to accept new payment models for privately insured patients:

Reform Update: Capitated payments more acceptable to providers, survey finds

A survey of 39 health plans released this week adds to mounting evidence that hospitals and medical groups are getting comfortable with incentive-based payment structures that reward quality and lower costs. This new snapshot includes surprising evidence that a significant percentage are willing to expose themselves to financial losses under a new generation of capitation models, which went out of vogue 20 years ago. 

The survey by Catalyst for Payment Reform, an employer-funded health policy group, found that 15% of what the participating health insurers spend on medical bills is paid under capitation. Experts cautioned the figure may be somewhat skewed because the health plans that responded to the survey included a lopsided number of insurers with capitation contracts, but that would not entirely account for the significant percentage. 

The survey reflects payments for 101 million people—about two-thirds of the nation’s privately insured. The rapid adoption of more robust risk-based payment models could foster more rapid changes to how patients receive medical care.

So the shift being pushed by DSRIP is not the only pressure on hospitals.

The best we might hope for from DSRIP is that the goals and objectives of hospitals and health plans, and maybe even their financial incentives, will be better aligned. That will ease the hospitals’ transition to fundamentally different organizational models, payment arrangements, and incentives. However, if they don’t adjust to this inevitability, the outcome will be even more painful, and most likely fatal.


I had downloaded the data on distribution of surplus military equipment to local law enforcement agencies and started to work with it, but George Gorczynskid did a very nice job in beating me to it. Here is his data visualization of The Militarization of US Police. Take a look at it. Gorczynskid’s work was done using Tableau Software, which is what I would have used. Here’s some background on the program:

This program is operated by Department of Defense’s Defense Logistics Agency’s Law Enforcement Support Office (LESO). The LESO facilitates the 1033 program, which originated from the National Defense Authorization Act of Fiscal Year 1997 (FY 97) (PDF). This law allows transfer of excess Department of Defense property that might otherwise be destroyed to law enforcement agencies across the United States and its territories.

A couple of other points:

Gorczynskid smartly matched the distribution data with crime statistics by state. (Click on his third tab, “Crime Trends.”) So you can see which states are outliers. As I began my work on the same data, I was looking at population and distribution data at the county as well as state level. I may still go back and look at that not only for total population, but to enable analysis taking racial and ethnic minority populations into account.

The biggest item in the news has been the distribution of Mine-Resistant Ambush Protected vehicles (MRAPs). They are big. But they, at least, can arguably be used for defensive purposes. What I haven’t figured out is the rationale for distributing thousands of bayonets and grenade launchers. Really?

The big, dramatic stuff, armored personnel carriers, automatic weapons, and so on, is what’s gotten attention so far and for good reason. But buried in the data is a lot of head-scratching small stuff too.  It’s not just the weapons and armament:

  • 23 soccer balls
  • 30 softballs
  • 9 bucket mops
  • 1 brass Bible stand, $667.
  • …Underwear. It just goes on and on

Over half a billion dollars’ worth in each of the last two years. So far, about a quarter of a billion dollars’ worth this year.

There have been some reports of equipment being distributed to schools. Here’s an example.

A number of local agencies have been suspended from the program, primarily for failure to account for and presumably control what they’ve received. Here’s an example. This may be another one.

In local old news, some years back the Albany County Sheriff’s Department got a boat through the program. They paid so little attention to it that it sank.

Perhaps the most important point of all is best said with the old saw, “if you have a hammer, everything begins to look like a nail.” Acquiring heavy military equipment is not a neutral act.


It will be interesting to see if they can pull this off. This from the New York Times:

“Hospitals and Insurer Join Forces in California”

In a partnership that appears to be the first of its kind, Anthem Blue Cross, a large California health insurance company, is teaming up with seven fiercely competitive hospital groups to create a new health system in the Los Angeles area. The partnership includes such well-known medical centers as UCLA Health and Cedars-Sinai.

Anthem and the hospital groups plan to announce on Wednesday the formation of a joint venture whose aim is to provide the level of coordinated, high-quality and efficient care that is now associated with only a handful of integrated health systems like Kaiser Permanente in California, Intermountain Healthcare in Utah and Geisinger Health System in Pennsylvania. 

But the Anthem venture, for the first time, includes hospitals that are competitors and are not owned by the plan itself. Anthem will continue to offer other health plans, and the hospital groups will continue to have arrangements with other insurers. 

What they’re evidently not doing is fully integrating organizationally. What the micro-economists Ronald Coase and Oliver Williamson have taught is that staying separate is not cost-free. There are transaction costs and organizational frictions exceeding what happens within one firm. Those costs may not be easily identifiable, but that doesn’t mean they’re not real. If nothing else, it takes more time to reach agreements, especially unambiguous agreements, and there are certainly the costs of having to repeatedly negotiate the resolution of new issues.

Getting even more concrete, some modest little issues might include:

  • Is Anthem going to pay on other than a fee-for-service basis, and if so, what? Are they going to use the same method for all the participating hospitals?
  • What information are they going to share? Will hospitals see even summary data from their competitors?
  • What does Anthem represent as a share of each hospital’s business? Will hospitals change their operational practices for all patients based on what they do for Anthem? How long will Anthem tolerate such free-riding by its competitors?

Well, bless ‘em for trying. Tough road though.
