Update on nearly completed international tax article

I’m very close to completing a full draft of a lengthy article on U.S. international taxation in the aftermath of the 2017 act. All I need to write, at this point, is the conclusion and an abstract. Lots of footnote work is also needed, but that’s unlikely to affect substance.

The article’s current working title is “The New Non-Territorial U.S. International Tax System.” Final length may approach 30,000 words, although I feel that it moves fast through the issues that it covers, rather than lingering. It covers a great deal of ground, in part by reason of its joining together (1) a general normative discussion of how to best think about the main set of international income tax policy issues, and (2) a moderately detailed assessment of 3 key international provisions in the 2017 tax act: the BEAT, GILTI, and FDII.

Combining both of these parts in a single piece makes it a rather long haul. But I think this is the right design for the paper, as the two are interrelated. It’s hard to assess the new rules without a normative framework. And I think it’s worth my while to update the framework that I’ve set forth in previous work (such as my international tax book) given the changes since then in the legal environment.

I’ll be presenting the piece in Vienna and Oxford in June, Ann Arbor in October, and Copenhagen plus presumably NTA (in New Orleans) in November. My current publishing plan is to put it in Tax Notes, for rapid turnaround and broad professional readership. Given the piece’s length, it probably would need to appear in successive weeks as part 1 and part 2. Maybe, with luck, I can shoot for September publication. I’d then be able to post the article on SSRN once a few weeks have passed (Tax Notes has rules about this).

In principle I suppose I should incorporate the ideas in the piece into a second edition of my international tax book, but I’m not sure if this will happen, as I might prefer to spend the time and effort working instead on my literature and high-end inequality book(s).

Why resist the irresistible

Sylvester (the black-and-white) and Gary (with stripes) have never been able to resist what I call the “crack sweater,” because they respond to it so strongly. Indeed, they’re the ones whose constant kneading has caused it to look so shabby.

The book I’m reading on Kindle is Margery Sharp’s Something Light. Just discovered her after a mention in the Sunday NYT Book Review. Very good mid-century English comic writer; she’s been compared to Barbara Pym and Elizabeth Taylor (the novelist, not the actress), but also has what’s almost a touch of Wodehousean absurdity.
Eventually the babies (as we still call them at age 6) settle down, but it still makes reading a bit more challenging.

A new mix of experiences for me

In more than twenty years in New York City, I’ve never before looked out at the beautiful white flowering pear trees that festoon the West Village in late April, while listening to howling northern winds and knowing that I’m about to head out in the sub-freezing wind chill wearing my winter coat, earmuffs, and gloves.

Tax policy colloquium, week 12: Emily Satterthwaite on VAT exemptions for small businesses

This past Tuesday at the colloquium, Emily Satterthwaite presented Electing Into a Value-Added Tax: Survey Evidence From Ontario Micro-Entrepreneurs. This interesting empirical work has both a quantitative and a qualitative dimension, derived from surveying small suppliers in various lines of business at Toronto-area farmer’s markets. (Now there is some empirical research that I’d actually like to do – going to farmer’s markets, which I do intensively anyway from spring through fall, at least in years when there actually is a spring.)  The subjects were people who are not required to register as businesses under Canada’s VAT, because their annual gross receipts are less than $30,000 in Canadian dollars ($23,000 US). But they are allowed to register voluntarily if they wish.

The research sheds light on the design question of how high or low small-business VAT exemptions should generally be. In addition, micro-entrepreneurs’ behavior (and expressed attitudes or knowledge) around elective VAT participation may also be illuminating more broadly, both about VATs and about this sector of the economy.

1. My priors on high vs. low mandatory VAT registration thresholds
In practice, VAT small-business mandatory registration thresholds vary quite significantly. I gather that there is no small business exemption in Sweden – if you have $1 of relevant sales, you are supposed to file. In Canada, as noted above, the threshold currently stands at about $23,000 in US dollars, and is trending down annually in real terms, since the nominal amount hasn’t been changed in more than 20 years and isn’t indexed to inflation. In the UK, by contrast, the threshold for mandatory VAT participation exceeds $100,000 in US dollars.

While I haven’t previously thought much about whether VAT mandatory registration thresholds should be low or high, I come equipped with attitudes (which I am of course quite willing to reexamine) suggesting that one would want to aim towards the low end.

Now admittedly, in favor of a relatively high registration threshold are the points that:

(a) The social value of accurately measuring and collecting each dollar of correctly determined tax revenue is generally much less than a dollar. A payment of tax is a transfer, so the dollar is just moving from one pocket to another. Getting it right is obviously worth something – presumably, in efficiency and/or distributional terms – or else we’d just have a lump sum tax of some kind, but the marginal value of correctness is presumably just some fraction of the full dollar. This of course is standard Kaplow et al.

(b) Small businesses are likely to have higher marginal compliance costs per dollar of revenue collected than big ones; also the marginal administrative costs of auditing them may be relatively high. So one might have to climb up the scale a bit before it’s worth it.

But there are also a bunch of reasons or arguments for wanting to aim low. For example:

(a) VAT exemption amounts generally function as a notch or a cliff – unlike, say, income tax exemption amounts. E.g., if you’re one dollar under the VAT registration ceiling, you don’t have to collect any VAT from your customers. But once you hit the ceiling you have to collect it all, from the first dollar onwards. One of the students in the class found this great article about problems that this has been causing in the UK. Setting the threshold high tends to result in a bigger notch, and under the Sweden approach there would be no notch. The notch literature suggests that notches are generally bad as a design matter, unless the notch occurs at a low point in a multimodal distribution. Not clear how or why one would find such a thing in small business size, however.

(b) VAT exemptions can in effect create a tax preference for small business, inefficiently steering consumer demand towards them and inducing them to stay under the threshold. If you want a comprehensive and relatively neutral tax base, significant exemption thresholds will be at least a matter of regret.

(c) Consider again the point that small businesses are associated with higher marginal compliance and administrative costs. I noted above the possible conclusion that this may support exempting them from the tax. But suppose we look at it the other way around. Small businesses generate negative externalities if they’re exempted. If it’s better to have a comprehensive system with not just tax payments but information reporting that extends as broadly as possible, then one may think of the small businesses as imposing disproportionate costs on the system, rather than the system as imposing disproportionate costs on them. Or one may adopt a Coasean joint causation perspective, a la the railway and the hay fields.
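To make the notch in point (a) concrete, here is a minimal sketch comparing a VAT-style notch with an income-tax-style exemption. The $30,000 threshold is the Canadian figure mentioned above; the 13% rate and the function names are assumptions chosen purely for illustration, and the sketch ignores input credits by treating the tax as a simple percentage of sales.

```python
# Illustrative only: the rate and function names are hypothetical,
# and input tax credits are ignored for simplicity.
THRESHOLD = 30_000   # mandatory registration threshold (Canadian dollars)
VAT_RATE = 0.13      # assumed rate, not taken from the paper


def vat_due_notch(sales: float) -> float:
    """Notch: once sales reach the threshold, tax applies from the first dollar."""
    return VAT_RATE * sales if sales >= THRESHOLD else 0.0


def tax_due_exemption(sales: float) -> float:
    """Income-tax-style exemption: only sales above the threshold are taxed."""
    return VAT_RATE * max(sales - THRESHOLD, 0.0)


for sales in (29_999, 30_000, 30_001):
    print(sales, round(vat_due_notch(sales), 2), round(tax_due_exemption(sales), 2))
```

Under these assumptions, the first function jumps from zero to tax on the full amount of sales at the threshold, while the second rises smoothly from zero. The size of that jump scales with the threshold, which is why a higher threshold implies a bigger notch.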

Again, my hunch from all this tended to come down on the side of setting thresholds low rather than high. But on the other hand there’s a paper by Michael Keen and Jack Mintz, modeling the broader social welfare effects (but in light, I suspect, of the authors’ considerable empirical knowledge), that suggests it may often be optimal to set the threshold relatively high. I tend to have a very high regard for those two individuals’ work, so that does move the needle for me a bit.

2. When do or should small suppliers voluntarily register to participate in the VAT?
Again, Canada allows small suppliers voluntarily to register for VAT participation, and the paper’s main contribution is exploring when and (in their own stated terms) why they choose to do this or not.

But for starters, what should one think of voluntary registration? We tend to think of choice as good, especially where the state benefits from more people participating (so there presumably is no downside if they voluntarily opt in), unless one is especially concerned about the cost of having to choose. With respect to tax elections in particular, however, it’s often the case that (i) electivity is good if people are using it mainly to lower their compliance and planning costs, but (ii) it’s likely to be bad if they’re using it to lower their tax liabilities, since the value of the $$ to the government is an externality from their standpoint and it’s unclear why this filter would relate closely to whom we want to bear higher vs. lower taxes.

But anyway, when should we expect people to opt into the Canadian VAT? Financially, staying unregistered tends to have both an upside and a downside. The upside is that one need not charge the VAT on sales directly to consumers. The downside is that VAT-registered businesses that sell to you will still charge the VAT, but you won’t get it refunded. This is especially disadvantageous if you then sell to another VAT-registered business, in which case the ultimate downstream VAT collected ends up being higher than if all were registered, as there is unreimbursed cascading for the liability charged on your mid-stream purchase.
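The cascading point can be illustrated with a small worked example. This is only a sketch under stated assumptions: the 13% rate, the value-added figures, the function name chain_vat, and the assumption of full pass-through (each stage prices at its input cost plus its own value added) are all hypothetical and are not drawn from the paper.

```python
def chain_vat(value_added, registered, rate=0.13):
    """Total VAT remitted to the government along a supply chain.

    value_added[i] is the value added by stage i; registered[i] says whether
    stage i is VAT-registered. Assumes full pass-through: a stage's pre-tax
    price is its input cost (including any VAT it cannot recover) plus its
    own value added. All figures are hypothetical.
    """
    total_remitted = 0.0
    input_price = 0.0  # pre-tax price paid to the previous stage
    input_vat = 0.0    # VAT charged by the previous stage
    for added, is_registered in zip(value_added, registered):
        if is_registered:
            price = input_price + added              # input VAT is credited
            output_vat = rate * price
            total_remitted += output_vat - input_vat
        else:
            price = input_price + input_vat + added  # unrecovered VAT is a cost
            output_vat = 0.0                         # no VAT charged on sales
        input_price, input_vat = price, output_vat
    return total_remitted


# Supplier -> micro-entrepreneur -> consumer
print(chain_vat([100, 50], [True, True]), chain_vat([100, 50], [True, False]))
# Supplier -> micro-entrepreneur -> registered business -> consumer
print(chain_vat([100, 50, 50], [True, True, True]),
      chain_vat([100, 50, 50], [True, False, True]))
```

Under these assumptions, an unregistered micro-entrepreneur selling straight to consumers lowers total VAT relative to full registration, while one sitting mid-chain raises it, because the VAT on the first purchase is never credited and is then taxed again downstream.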

The paper’s empirical findings are roughly consistent with this. It finds no significant effect on the upstream side (i.e., whether a given farmer’s market micro-entrepreneur purchased inputs from VAT-registered businesses), but it does find significant effects on the downstream side (i.e., whether one sells directly to consumers, discouraging registration; or to other VAT-registered businesses, potentially encouraging it).

There is also some indication that informal considerations may matter. E.g., registering or not might involve either signaling or communicating type, although the alternative theories that might apply here are numerous. These might also feed back into influencing the normative analysis.

YouTube videos in which I discuss inequality

NYU Law School has now posted three YouTube videos (each just over a minute long) in which I discuss inequality.

In the first one, available here, I discuss the differences between high-end and low-end inequality.

In the second, available here, I discuss high-end inequality and luck.

In the third, available here, I discuss the extent to which U.S. efforts to address income inequality have succeeded (or not).

Tax policy colloquium, week 11: Jason Furman on growth and inequality

Yesterday at the colloquium, Jason Furman, who is now at the Harvard Kennedy School, presented Should Policymakers Care Whether Inequality is Helpful or Harmful for Growth?  Here are some of my thoughts about this very interesting paper.

In common parlance there are these 2 things, “growth” and “inequality,” that often are discussed without the speaker being very precise about what exactly either of them means.

The old conventional wisdom held: growth is good, inequality is bad, but there is a tradeoff between them. Not only are they empirically correlated, but more of either tends to result in more of the other.

There is an emerging new conventional wisdom in some circles holding that one can indeed have it all, i.e., greater growth plus lesser inequality, again with not just correlation but causal arrows running both ways.  Hence, directly addressing either can be win-win, improving the other as well.
The paper says: Not so fast. Jason suggests that, if he were a betting man, he would put his chips down in favor of “win-win” if the betting odds were 50-50, but not if they were much tilted the other way. This of course is just a description of his personal subjective probability for the causal relationship. But in part by reason of the relatively close odds, he says to those who favor addressing higher inequality: Don’t bet the house on this being true. After all, even if it’s true that reducing inequality could increase growth, that’s not likely to be the reason why you care about inequality. So don’t unduly play down the other concerns by making the inequality debate one that is instead about growth.
A broader point that the paper makes is conceptual: We need better-defined, more normatively meaningful, and more precisely differentiated categories than those that are offered by the general terms “growth” and “inequality.”
I will herein further discuss these issues in 3 parts: first growth, then inequality, then causal theories and takeaways from the topic and the paper.

1.         GROWTH

In the literature that Jason has in mind, “growth” is typically defined as the increase over time in GDP, either absolute or per capita. The higher future GDP is relative to current GDP, the better.
To dramatize the argument that he’s making, let’s start by abstracting from time. What would we do if our policy goal was that current GDP be as high as possible, full stop (i.e., not just, all else equal)?

As a tax person, I naturally think of making tax changes first. So Step 1 might be to replace all taxes on income, consumption, wealth, etcetera, with lump sum taxes, such as uniform head taxes. Hence we would wholly avoid discouraging productive economic activity.

Why stop there, however? We could also, at gunpoint, force people of all ages to work long hours. After all, this would increase GDP, and this by hypothesis is our sole policy aim.
Or if that’s too radical, we could further raise lump sum taxes, such as uniform head taxes, in order to fund income subsidies, under which, the more you earn, the more the government pays you (instead of you paying it).
Something else we might do, if all we cared about was increasing GDP, would be to confiscate people’s wealth (while somehow credibly promising that we would never do it again). The income effect of wiping out people’s savings would be to induce them to work more, so as to start replacing it.
By the way, lest this last idea seem too fanciful, it’s worth noting that, in the dynamic growth model that the Joint Committee on Taxation used with regard to the 2017 tax act, one of the sources of GDP growth within the budget window was the assumption that, with the fiscal gap being reduced outside the window with lump sum takeaways from people who had, say, expected retirement benefits under present law, such individuals would farsightedly respond by working more in response to the expected calamity for them. Not too much was said publicly about this supposed cause of “dynamic” growth effects.
Does this sound like an appealing policy proposal? I figured it wouldn’t. But what makes it so unappealing is that we don’t actually care just about GDP. The measure ignores the value of leisure, distributional considerations (i.e., who gets $$ or leisure), and all other relevant amenities and disamenities.
Given the obviousness of the point that maximizing current GDP is not a well-stated policy goal – except as modified to take account of a whole lot more – why would growth proponents state the long-term social goal as maximizing future GDP? The short answer is that they’re being foolish or myopic insofar as they focus just on GDP, without reference to distributional considerations, how much people have to work if they’d rather not, and a wide array of relevant amenities and disamenities. But I think their doing so has been encouraged by a couple of things:

(1) Since we know less about future distribution than current distribution, people who are seeking a rhetorical edge as they urge the enactment of what they assert are pro-growth policies, have a degree of freedom simply to assume or assert that a rising tide lifts all boats (in tension with the actual facts about rising US GDP over the last 20 years, which has featured about a 0% share at the bottom).

(2) There are multiple narratives associated with comparing the future to the present that can lead to treating GDP growth as something to be welcomed more unconditionally and unreservedly than just good old GDP itself. For example, there are:

–Biological narratives: We like to think of our own lives as improving over time. And parents often want their children to have better lives than they are having themselves.

–Psychological narratives: Habituation to one’s current material circumstances may prompt wanting them to improve. And, by dreaming of a better future, one may sometimes soothe one’s discontent about the present.

–Historical narratives: Humanity’s economic rise from the Stone Age to the dawn of civilization to the Industrial Revolution and beyond has not gone unnoticed. We may also have examples in mind of countries that “succeeded” versus “failed” from common starting points, with the former experiencing far higher GDP growth. Examples might include the U.S. versus Argentina (which were on a par, as to per capita GDP, in the 19th century), West Germany versus East Germany between the end of World War II and reunification in 1990, or South Korea vs. North Korea. But in each of these examples GDP growth might be better seen as a consequence of greater success, rather than itself an independent cause.

But whatever the force of these narratives, they don’t support ignoring that, for the future just like the present, GDP and social welfare are not equivalent. So I commend the paper for suggesting that we should be skeptical about just maximizing future GDP, just as we would not treat maximizing current GDP at all costs as a plausible summum bonum.

2.         INEQUALITY

The paper doesn’t interrogate “inequality” to the same extent that it does “growth.” But it could!

For example, I frequently emphasize the important differences between high-end inequality (e.g., plutocracy) and low-end inequality (e.g., poverty), notwithstanding that they are commonly  blended together in a single term (or in a composite measure, such as the Gini coefficient). They may matter for different reasons, and have different effects.

Thus, suppose one thinks inequality may reduce growth because the super-rich capture the political process and engage in rent extraction. That’s about the high end. Or suppose one thinks that poverty leads to a failure to develop children’s human capital. That’s about the low end.
In my view, the typical welfare economics maxim that the main reason for aversion to inequality is that material consumption has declining marginal utility does a better job of capturing the main issues raised by low-end inequality than those raised by high-end inequality.

Even if high-end and low-end inequality were effectively the same, a given Gini measure that equates them could be under-informative regarding how the composite actually affects people’s wellbeing in one society, as compared to another.  It may matter, for example, whether a given society features high or low social and economic mobility. Or it may matter whether (a) old elites are being challenged by new ones, or (b) it’s just new people not much different than the old.

Then of course there are such questions as “equality of what?”  Typical candidates might include wealth, consumption, personal lifetime income, dynastic lifetime income, status, legal rights, political power, opportunity, etcetera.

One may also subscribe to an ethical theory under which it matters whether, or to what extent, economic success and failure are thought to be deserved. Meritocracy, for example, can be thought of as a theory of distributive desert. A meritocrat might ask: To what extent do people’s success and failure in my society depend on what I define as merit?

While there is no tension between any of this and the paper, it suggests an arena in which the paper’s deconstructive exercise could further be pursued.


3.         CAUSAL THEORIES AND TAKEAWAYS

Sometimes we say: What we need in Area X is a good theory. That is not the issue when we’re considering the relationship between inequality and growth. Rather, there are if anything too many good theories. And, in at least some cases, they may be inconsistent, rather than complementary or offsetting.
Here are just a few:
(1) High-end inequality leads to greater growth, perhaps because the rich save and invest more (Kaldor).
(2) High-end inequality leads to lower growth, because (perhaps via its effect on the fiscal self-interest of the median voter) it prompts the adoption of higher capital income taxes that are anti-growth (Alesina-Rodrik).
(3) High-end inequality leads to lower growth, because the rich use their greater sway to increase rent-seeking (Acemoglu et al). Note that this theory, while having the same causal relationship as Alesina-Rodrik, bases it on a view of the rich as politically strong, rather than politically weak. Hence, one might expect some tension or even incompatibility between the two theories.
(4) Higher growth leads to greater high-end inequality, perhaps because technological transformations proceed via tournaments with concentrated mega-winners.
(5) Higher growth leads to lower high-end inequality, perhaps under a Kuznets model in which inequality eases (from the diffusion of new knowledge and production methods) as the society grows richer.
(6) Low-end inequality leads to higher growth, perhaps via the deployment of a mass low-wage workforce.
(7) Low-end inequality leads to lower growth, perhaps from wasted human potential as children in poor households suffer from under-privilege.
Each of these theories might at least sometimes be true, and several could be true (perhaps even offsetting each other) at the same time. But the plethora of causal pathways undermines thinking that there will be a stable relationship between inequality and growth, even disregarding all the issues raised by too simplistically deploying either of these two terms.
The paper urges thinking in terms of a 2 X 2 grid, which might (under a progressive’s view of the issues) look like this:

                  PRO-GROWTH                        ANTI-GROWTH

PRO-EQUALITY      Education, aid poor children,     Capital income taxation
                  pro-competition (antitrust,       Redistributive taxation?
                  weaker IP), min wage/unions?

ANTI-EQUALITY     Opposite of Box 2?                Opposite of Box 1?

Needless to say, there is considerable controversy regarding the assignments above of particular items to particular boxes. But insofar as something does indeed belong in Box 1, it would be dismaying, albeit unsurprising, to see it being rejected by prominent political actors.
A key argument of the paper is that, in economically advanced countries that have been politically stable and considering a relatively limited policy spectrum, there should be a “lexicographic” preference for Box 2 policies (upper right) and against Box 3 policies (lower left). The rationale is as follows. Suppose we look at advanced and (heretofore) stable countries with relatively pro-market policies, such as the US and the UK, and compare them to countries with very different, more pro-regulatory and redistributive policies, such as France or the Nordic nations. The growth differences between them over time have been so small that surely the distributional differences are more consequential. Hence, in a country like the US we should start by ranking our policy choices based on their distributional effects, and only use growth effects as a tiebreaker. The paper agrees that this approach is generally not well-suited to poorer countries with still-developing (or undeveloped) economies, in which basics such as the rule of law may be in question.
Adopting this lexicographic preference for looking at distributional effects first, and growth (or efficiency) effects only secondarily, would be a rather large change in U.S. policy debate. Consider how it compares to consideration of the 2017 tax act. Or consider the Kaplow-Shavell proposition, much debated in the law schools, that distribution issues should be left purely to the tax and transfer system, with all other legal issues (e.g., concerning corporate governance, torts, contracts, intellectual property, etcetera) being analyzed purely on efficiency grounds.
I’m reminded of Boris Bittker’s gibe, from the 1970s, to the effect that the Yale Law School faculty was a mix of young fogies (who cared only about efficiency) and old Turks (who cared only about equity or fairness). The young fogies prevailed for decades, but might the tide be turning again?

Tax policy colloquium, week 10: Ajay Mehrotra on US history and the VAT

Yesterday at the colloquium, Ajay Mehrotra presented an early stage of an interesting and important long-term research project that’s entitled “The VAT Laggard: A Comparative History of U.S. Resistance to the Value-Added Tax.” The project aims to explore why the U.S. remains the only advanced industrialized nation that doesn’t have a VAT or other such national consumption tax.

One underlying datum for the inquiry is that there have been at least five particular moments in U.S. fiscal history when the enactment of a national consumption tax has been on the agenda politically, and seemingly had some chance of happening, but didn’t. So among the questions posed is whether these were unique events, or instead had common causation, perhaps even sounding in “American exceptionalism.” The moments were as follows:

1) Early 1920s – With post-World War I fiscal retrenchment taking place amid a switch from the Wilson Administration to Republican leadership, major tax changes were being considered. The great Treasury economist T.S. Adams, known to tax folks today mainly by reason of the Graetz-O’Hear article that described his central role in creating the foreign tax credit, also more or less invented the VAT in 1921, and tried unsuccessfully to get it adopted by Congress. Business ambivalence and opposition to such an instrument, which was not as yet well understood or in place anywhere, apparently played a role in this outcome. So did the fact that the income tax had helped finance World War I and that the Republicans were not aiming to go back to pre-World War I finance. What happened instead was mainly just a lowering of income tax rates, which had risen to very high levels in order to help finance World War I.

2) 1940s – The Roosevelt Administration publicly considered the possible adoption of a national retail sales tax, in order to help finance World War II. States’ opposition, reflecting that many of them had recently adopted their own retail sales taxes, was one of the factors behind the decision to rely instead on expanding the income tax.

3) 1970s – The Nixon Administration publicly floated the idea of replacing residential school property taxes with a national VAT to fund public education. The 1976 Blueprints tax reform study also discussed the adoption of a national consumption tax, and Ways & Means chair Ullman notoriously lost his 1980 reelection bid after advocating the national adoption of a VAT. The late 1970s tax revolt and election of Reagan appear to have shut this down. Tax reform in 1986 was really focused on the income tax, although the 1984 Treasury “bluebook” report did discuss consumption taxation as an alternative option.

4) 1990s and early 2000s – By this period, national consumption taxation had become a standard feature of tax reform discussion, such as in the 1995 Nunn-Domenici plan, and the report of President Bush II’s 2005 Presidential Advisory Panel on Tax Reform.

5) 2016 and 2017 – Ted Cruz’s “business flat tax” proposal in his presidential campaign would have been a VAT by another name, and then the DBCFT, as I discuss here, would have replaced corporate income taxation with a VAT plus wage deduction.

A further important point to reflect on here is that, at the global level, countries with VATs tend to have less progressive tax systems than the US, but more progressive fiscal systems. This reflects that VATs often help fund larger-scale social welfare benefits. But the correlation raises underlying causal questions, such as which caused the other insofar as there wasn’t independent causation for each. One might also note that, at the state level in the US, states that rely heavily on sales rather than income taxes seemingly do not tend to have more progressive fiscal systems. But this may partly reflect both (a) using sales taxes in lieu of income taxes, rather than both as distinct from just the latter, and (b) the lesser market power underlying states’ sales taxes than those of many countries, given the ease of moving between states or even just avoiding retail sales taxes via cross-border / online / mail order shopping.

What might be some of the leading theories regarding why this never happened? (The word “this,” of course, embraces a range of very different options – e.g., VAT as add-on and VAT as income tax replacement.) An initial list, pending the fruits of Mehrotra’s research, might include at least the following:

1) General VAT enactment obstacles – There’s no need for American exceptionalism to support the observation that voters around the world generally do not leap up and cheer when a new and potentially capacious tax instrument is proposed. The two main stimuli that have led to VAT adoption in other countries are: (a) replacement, as per the VAT’s introduction in continental Europe in the 1950s as an improvement on prior gross receipts taxes that imposed cascading tax burdens on interbusiness sales, and (b) outside pressure, as in cases where countries were pushed by the EU to adopt VATs as conditions of membership, or by the IMF to adopt them as conditions of receiving aid. Another example of the same phenomenon is New Zealand’s adoption of a VAT in the face of significant budgetary pressure.

In the US, (a) has been attempted, but perhaps the taxes targeted for replacement haven’t been unpopular enough, while (b) hasn’t happened to us, at least yet. (Future fiscal crisis, or at least entitlements funding crisis, anyone?)

2) American exceptionalism – When one is speaking about American exceptionalism, the leading suspects include (a) slavery and the indelible sin of ongoing racism, and (b) the importance of the frontier, among others. Each of these could arguably play a role here. (A) helps explain the lack of a broader social welfare system that would strengthen the need for VAT financing, given our heterogeneity’s effect on voter interest in helping the poor. (B) helps explain anti-government sentiment that might heighten opposition to higher taxes. But Mehrotra’s research may help to illuminate any connections.

3) The Larry Summers joke – Someone please ask Larry Summers: Did he actually make the famous VAT joke? I’m told that a mention of this first appeared in the NYT in the 1980s, but apparently even this reference isn’t a direct quote but rather refers to the report that he said it.

The joke, in any event, goes something like this: The U.S. doesn’t have a VAT because conservatives view it as a money machine while liberals view it as a tax on the poor. But if only liberals came to realize that it is a money machine, and conservatives that it is a tax on the poor, then surely we would get it immediately.

As I’ve noted elsewhere, the joke is “deliberately paradoxical. Why should each side be so fixated on the bad outcome, rather than the good one, as judged from its normative perspective? The underlying empirical claim would therefore appear to be nonsensical, if not for the fact that it also appears to be true.” With regard to current non-adoption of a VAT, including as a hidden component of corporate income tax replacement via a business flat tax or the DBCFT, I’ve suggested the relevance of risk aversion. From the standpoint of both Democrats and Republicans, a fiscal system with a VAT could be BETTER by their lights than the existing one if they get to control the other adjustments to taxes and outlays, but WORSE by their lights if the other side gets to do so. This creates a bit of anxiety and uneasiness about adding this instrument to the fiscal system even if one is currently in control.

4) Path dependence – Mehrotra will also, in the project, be exploring the idea of critical moments at which, perhaps, something happens just because it happens (or more specifically, for reasons idiosyncratic to that era), but then has broader ramifications down the road because it has set the path. The QWERTY keyboard is of course the classic path dependence story. Assuming the literature is right, it was initially adopted to slow typists so they wouldn’t jam early machines, but is now suboptimal for modern keyboards yet locked in. As applied to the VAT, however, one question to keep in mind is whether, or to what extent, recurrence of the national consumption tax issue implies fresh causation each time – with or without common explanations, e.g., from something in the “American exceptionalism” area.

I don’t rule out possible future U.S. adoption of a VAT, although I consider it neither imminent nor especially likely. Or, to put it differently, if there are 100 most-likely U.S. national futures, let’s say in parallel universes any of which might prove to be ours, some of them surely feature a national consumption tax, with or without the name, although I’m not here offering to bet on just how many or how few. (That would be a subjectivist, rather than a frequentist, measure anyway.) Multiple pathways to a national consumption tax might include (1) conservative control and it replaces a lot of income taxation without Graetz-style adjustments to retain progressivity, (2) liberal control and it funds new programs such as free college tuition or Medicare For All, and (3) fiscal crisis where it’s deployed to “save Social Security and Medicare.”

Recently enacted New York State budget law, and some of its federal income tax consequences

New York State has enacted a new budget law, Senate Bill S7509C, that is available online here.

Of particular interest to many of us may be (1) Part LL, starting at page 47, establishing a Charitable Gifts Trust Fund, and (2) Part MM, starting at page 56, which establishes an Employer Compensation Expense Program.

The Charitable Gifts Trust Fund creates two distinct accounts, one called the “health charitable account” and the other called the “elementary and secondary education charitable account.” The moneys contributed to each (or otherwise accruing to it) are held separately from each other and everything else under the state’s purview, under the joint custody of the state comptroller and the commissioner of taxation and finance. These moneys generally are required to be expended only for specified services that relate to the purposes indicated by the accounts’ names.

Starting in 2019, by making a timely contribution to one of these accounts, one can qualify to receive an 85 percent tax credit against New York State income tax liability with respect to the amount contributed.

I haven’t yet had a chance to do any serious analysis of this provision – pertaining either to how it works, or to its effects on federal income tax liability. But suppose one makes a $100 contribution to one of the funds, thereby reducing one’s New York State income tax liability by $85. Assuming a favorable federal income tax analysis, this yields the contributor a federal charitable deduction in the amount of $100. Depending on the relevant marginal rate, this could potentially reduce one’s federal income tax liability by more than the $15 difference between the $100 contributed and the $85 state credit.

For this result to follow, the federal income tax measure of the charitable contribution would have to be $100, not $15. But there are both administrative and case law precedents in support of this result. (And note that, when a charitable contribution is deductible under New York State law, one generally does not have to reduce the federal value of the contribution by the state tax saving.) The use of the funds would also need to have economic substance, compared to simply paying state income taxes. But if you read the new law carefully, you will see the aspects of such substance that a contribution to either of the funds has – in particular, given the degree of pre-commitment of the funds. Indeed the NY State legislature might receive useful information from contributions to the two programs regarding donors’ substantive policy preferences.
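
To make the arithmetic concrete, here is a minimal Python sketch of the contributor's net position. It assumes, as the favorable federal analysis would require, that the full $100 is deductible, that the contributor itemizes, and that nothing else (such as the AMT) intervenes. The function name and the sample federal marginal rates are mine, chosen purely for illustration.

```python
def net_cost_of_contribution(contribution, state_credit_rate, federal_marginal_rate):
    """Out-of-pocket cost after the state credit and an assumed full federal deduction."""
    state_credit = contribution * state_credit_rate
    # Assumption: the federal deduction equals the full contribution, not the $15 net.
    federal_saving = contribution * federal_marginal_rate
    return contribution - state_credit - federal_saving


# Illustrative (hypothetical) federal marginal rates; the 85% credit rate is from the statute.
for rate in (0.12, 0.24, 0.37):
    cost = net_cost_of_contribution(100.0, 0.85, rate)
    print(f"federal marginal rate {rate:.0%}: net cost ${cost:.2f}")
```

On these assumptions the break-even federal marginal rate is 15 percent. A negative net cost means the contributor comes out ahead relative to simply paying the $85 in state tax.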

Under Part MM, the Employer Compensation Expense Program, employers that are required to withhold income taxes from their employees’ wages can elect to pay a special payroll tax that equals a specified percentage of the payroll amounts paid to covered employees. The percentage is 1.5% in 2019, 3% in 2020, and 5% starting in 2021. Covered employees get state tax credits for their shares of the special payroll tax thus paid by the employer.

With the caution that my understanding of the provision remains very preliminary, the effect may be as follows. Suppose that I am a covered employee of an electing employer that is taxed at the 21 percent federal corporate rate, and that in 2019 the employer paid me $1,000 of wages, on which it paid a $15 special payroll tax. My New York State income tax liability declines by $15.

Deducting the $15 payroll tax as a business expense would reduce the employer’s federal income tax liability by $3.15 (at the 21% rate). Meanwhile, my reduced state income tax liability has no adverse federal income tax consequences for me, assuming that the transaction is respected for federal income tax purposes, if the extra $15 would have been nondeductible anyway. Also, the employer’s current year tax flows would not be adversely affected by keeping current on the new payroll tax, insofar as it comparably reduced state income tax withholding on its employees’ behalf, to reflect the expected reduction in their ultimate state income tax liabilities.
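
The numbers in this example can be laid out in a short Python sketch. It is only a restatement of my very preliminary reading: it assumes the payroll tax is fully deductible by the employer, that my state credit equals the full $15, and that wages are left unchanged. The function and key names are hypothetical.

```python
def ecep_example(wages, payroll_tax_rate, federal_corporate_rate):
    """Per-employee effects of the elective payroll tax, holding wages fixed."""
    payroll_tax = wages * payroll_tax_rate
    # Assumption: the payroll tax is deductible as an ordinary business expense.
    employer_federal_saving = payroll_tax * federal_corporate_rate
    return {
        "payroll_tax": round(payroll_tax, 2),
        # Assumption: the employee's state credit equals the tax paid on his or her wages.
        "employee_state_credit": round(payroll_tax, 2),
        "employer_federal_saving": round(employer_federal_saving, 2),
        "employer_net_cost": round(payroll_tax - employer_federal_saving, 2),
    }


result = ecep_example(wages=1000.0, payroll_tax_rate=0.015, federal_corporate_rate=0.21)
print(result)
```

With wages held fixed, the employer is out $11.85 after tax while I am $15 ahead, which is what raises the question of why an employer would elect in.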

Let’s assume that the federal income tax results here are indeed as stated. Why would the employer make the election, given that it’s still worse-off after tax under the stated facts? The main point here is that, as a general matter, employees may be willing to accept less pretax compensation when they are paid in a more tax-favorable, rather than a less tax-favorable manner. For example, suppose that my employer offered me a choice between (a) a higher salary but no employer-provided health insurance, and (b) a lower salary but with federally excludable health insurance benefits. It would be unsurprising if I agreed to (b) in lieu of (a), in part or even wholly by reason of the federal income tax savings. This is par for the course.

More broadly, it’s long-accepted Tax Planning 101 that parties engaged in arm’s length transactions with each other will often have the flexibility to determine which of them will bear particular tax consequences, either favorable or unfavorable. Thus, in my Tax I class, I have long emphasized what I call “collective tax minimization” – the fact that, so long as the transaction parties can duly adjust multiple transaction terms, they may mutually benefit from their structuring their agreements in such a way as to keep their collective tax liability as low as possible.

Thus, consider employee stock options. As a practical matter, they often can be structured to be either (1) currently deductible by the employer and includable by the employee, or (2) currently neither deductible nor includable. (To simplify, let’s ignore here questions of future deductibility and includability, and of the possible effect on future employee capital gains realizations.) All else equal, (1) is better for the employer, and (2) is better for the employee. But if they can adjust the gross (i.e., pretax) value of the option grant to reflect whether they are choosing (1) or (2), then their interests may align.

For example, suppose that the employer faces the corporate rate of 21%, while the employee faces the top individual rate of 37%. Then option (2) is collectively better for the two parties combined than option (1). But, for each $100 of stock options granted, (2) is $21 worse than (1) for the employer (all else equal), albeit $37 better for the employee.

Not to worry, however – both are better off under (2) than they would have been under (1) so long as the option grant is between $21 and $37 smaller (per $100 of options that would otherwise have been granted) under (2) than it would have been had they chosen (1).
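
A minimal Python sketch of this bargaining range follows. It uses the same simplification as the text (all else equal, per $100 of options that would otherwise have been granted), and the function name and the sample grant reductions are mine, for illustration only.

```python
def gains_from_option_2(grant_reduction, employer_rate=0.21, employee_rate=0.37,
                        base_grant=100.0):
    """Each party's gain from choosing (2) over (1), per base_grant of options."""
    # The employer gives up the deduction but hands over a smaller grant.
    employer_gain = grant_reduction - employer_rate * base_grant
    # The employee avoids current inclusion but receives a smaller grant.
    employee_gain = employee_rate * base_grant - grant_reduction
    return round(employer_gain, 2), round(employee_gain, 2)


for reduction in (20.0, 30.0, 40.0):  # hypothetical grant reductions in dollars
    employer_gain, employee_gain = gains_from_option_2(reduction)
    both_better = employer_gain > 0 and employee_gain > 0
    print(f"reduction ${reduction:.0f}: employer {employer_gain:+.2f}, "
          f"employee {employee_gain:+.2f}, both better off: {both_better}")
```

Whatever the reduction, the two gains sum to $16, the collective tax saving from (2). The reduction only determines how that saving is split.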

Obviously further legal analysis is required before one can definitively set forth the federal income tax consequences of employers’ electing to participate in the Employer Compensation Expense Program. And participation in the Program may not be the easiest thing in the world to establish and explain adequately to employees. But this provision, like that pertaining to the Charitable Gifts Trust Fund, has the potential to mitigate the adverse consequences to New York State residents of the 2017 tax act’s largely repealing state and local income tax deductions. And it does so within the 2017 act’s deliberate contours, which were based on the view that employer business expenses, like individuals’ charitable contributions, should generally be treated more favorably than individuals’ payments of state income tax liability. So both provisions can reasonably be viewed as wholly consistent with the intent behind the 2017 tax act.

Neil Young

In my last post I quoted Neil Young, because the IRA / age 59-1/2 issues raised by Damon Jones’ paper brought to mind “Tell Me Why” (one of two great songs that I know with that title) from After the Gold Rush, and in particular “Is it hard to make arrangements with yourself / When you’re old enough to repay, but young enough to sell?”

I’ve always interpreted those words as being about having a first home with mortgage, but who knows – odd stuff for a 25-year old to be writing a rock song about, albeit one who presumably already had some earnings from his musical career.

This in turn has gotten me started re-playing my two favorite Neil Young albums, After the Gold Rush and Everybody Knows This is Nowhere, especially when I get a chance to go to the health club (where I implement my ongoing anti-aging strategy). But Young may have a different such strategy. When I saw him a couple of years ago, doing a solo show at Carnegie Hall of all places, he looked like a crotchety old man (although still with energy and focus) shambling between piano, guitar, etc. from song to song, as he dipped into his rather deep backlist.

Back when I was a lot younger and Neil Young was, too (albeit more than a decade older than me), finding out about new artists could be tricky. You’d hear their names, but the good ones weren’t on the radio, and obviously there was no Internet or file-sharing. Unless your friends or roommates had LPs by particular artists, you couldn’t find out what their music was really like unless you took the plunge and bought LPs on a limited budget.

I knew CSNY fairly well, but considered them a bit mild and over-sweet compared to the prior generation (Beatles, Rolling Stones, Who). I also wasn’t entirely clear yet on Young’s relationship to the other three, except I knew he wasn’t always with them. But then one day when I was out in Berkeley visiting family, I took the plunge, $2.75 for After the Gold Rush (used) in a hippie record shop next to the campus (either Rather Ripped Records or Rasputin Records). I liked its raw weirdness right away.

Being less subtle in my youth, I once set “Only Love Can Break Your Heart” to start playing moments after a friend of mine walked through the door with his new girlfriend, about whom I (rightly) had a bad feeling, shortly after he had broken up with a longtime girlfriend who I thought was a great person. This is harder to do with an LP than, say, a Spotify playlist.

I remember once reading about Stephen Stills in relation to Young. Stills reportedly had a lot of trouble getting over the shock that this weird nerd guitarist whom he had invited into Buffalo Springfield ended up becoming so much bigger a star – an outcome that he had initially rated at zero percent probability. (In fairness to Stills, I gather that Young is exceptionally frustrating to deal with.) In the Buffalo Springfield days, they would have all these fights about the direction that the band should take. Stills kept pointing out that Young (1) had no singing voice, and (2) wanted to play folk music in a rock band.

To which the correct answer, of course, is “Yes, but what’s your point?”

Tax policy colloquium, week 9: Damon Jones on responses to IRA early withdrawal penalties

Yesterday at the colloquium, Damon Jones of the University of Chicago Harris School of Public Policy presented “How Do Distributions from Retirement Accounts Respond to Early Withdrawal Penalties?”, an empirical study using IRS data. (Damon’s coauthors were Gopi Shah Goda and Shanthi Ramnath.) But before getting to the paper, a bit of background:

“Is it hard to make arrangements with yourself / When you’re old enough to repay, but young enough to sell?”

So asked Neil Young at age 25. But by the time you’re 59-1/2, while one hopes you’ve repaid and built up some equity (if you’re a homeowner), are you still young enough to sell? It can be a transitional age, as I know from recent personal experience – rather late to start a new business or career or start living in a new place (except for a few relatively privileged and successful people), rather early to be thinking about retirement, and – at least for significantly older age cohorts than my own, when people tended to marry and start raising families (if that was their path) by their early to mid twenties – a bit late to be putting one’s kids through school.

It’s thus always seemed a bit odd to me that individual retirement account (IRA) early withdrawal penalties – discouraging withdrawals that undo the IRA provisions’ aim of encouraging retirement saving – cease to bite when one reaches the particular age of 59-1/2. That’s awfully early from a retirement standpoint, yet a bit late for some possible uses of the funds – e.g., handling major life cycle expenses or charting a new course in life.

I gather that the use of this age dates from 1974 ERISA legislation, which included an IRA provision although prior to the full-fledged IRA boom. This was a time when “normal” Social Security retirement started at age 65, with “early” starting at age 62. I’m told that someone or other who was guiding the ERISA legislation, possibly at the staff level, apparently figured that age 60 was about right for permitting people to withdraw retirement savings without penalty, perhaps on the ground that voluntary retirement savings had to be more leniently structured than the mandatory kind that Social Security offers, or else people wouldn’t opt in sufficiently.  The reason for then picking 59-1/2, rather than 60, was to make it seem more appealing still to the prospective participants, just as retail stores offer 99-cent pricing.

Anyway, IRAs to this day have early withdrawal penalties. For traditional IRAs, you’re taxed on the withdrawal (in effect, under normal income tax rules), but with the addition of a tax penalty equaling 10% of the amount withdrawn. There is a hardship exception to owing the penalty, but it’s fairly narrow – covering, for example, death or disability, unreimbursed medical expenses, and health insurance premiums while unemployed.
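
As a rough illustration of how the two pieces stack up, here is a minimal Python sketch. It assumes a flat marginal income tax rate on the whole withdrawal and treats the hardship exceptions as a simple yes/no flag; the function name, the 22% rate, and the $5,000 amount are hypothetical.

```python
PENALTY_RATE = 0.10                    # the 10% early withdrawal penalty
PENALTY_FREE_AGE_MONTHS = 59 * 12 + 6  # age 59-1/2, expressed in months


def tax_on_traditional_ira_withdrawal(amount, marginal_rate, age_in_months,
                                      exception_applies=False):
    """Return (income_tax, penalty) on a traditional IRA withdrawal."""
    # Simplifying assumption: the whole withdrawal is taxed at one marginal rate.
    income_tax = amount * marginal_rate
    penalized = age_in_months < PENALTY_FREE_AGE_MONTHS and not exception_applies
    penalty = amount * PENALTY_RATE if penalized else 0.0
    return round(income_tax, 2), round(penalty, 2)


# A $5,000 withdrawal at an assumed 22% marginal rate, just before and at 59-1/2.
print(tax_on_traditional_ira_withdrawal(5000.0, 0.22, age_in_months=59 * 12 + 5))
print(tax_on_traditional_ira_withdrawal(5000.0, 0.22, age_in_months=59 * 12 + 6))
```

The income tax is owed either way. Only the penalty turns on whether the withdrawal falls before or after the 59-1/2 cutoff.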

Evaluating the withdrawal penalty requires a deep dive into theories of lifecycle optimization – how would people generally be expected to optimize the allocation between periods of their lifetime resources? And secondarily, why would the government seek to influence what people do? Here the main rationale is behavioral – e.g., if we think that people are prone to irrationally under-saving for retirement – but there are also aspects of possible market failure (e.g., difficulty in life-annuitizing sufficiently given adverse selection, or on the other side difficulty of borrowing against future earnings) and moral hazard (e.g., expectation of being supported at retirement if one under-saves).

It’s often said (as a convenient, if over-simplified, shorthand) that the goal is to increase consumption-smoothing, which at the limit would mean equal consumption in all periods, if all periods are otherwise the same and one has period-specific declining marginal utility of consumption. But of course there can be rational reasons, unrelated to market failure or moral hazard, for favoring higher or lower spending in different periods. These might range, for example, from one’s taste for consumption when young versus old, or alternatively for a particular pattern such as rising consumption or for periodic “binge” years. One also might have periods with especially high needs (e.g., to launch one’s children or pay uninsured medical costs).

But the standard conclusion, which I certainly accept, is that the main aim calling for policy intervention with regard to saving – leaving aside the question of borrowing against future anticipated resources – is to push people towards greater retirement saving. Only, one should think about the possibility that they will want to take back some of the savings sooner than anticipated. Now, just to make voluntary retirement saving more attractive than it would otherwise be, one might want to allow some of this. If the saving is voluntary, rather than mandatory like Social Security (assuming effective barriers to borrowing against expected future Social Security benefits), then one reason for allowing early withdrawal – even where it might undo the policy to a degree – is to reduce people’s reluctance to participate. But in addition, unanticipated shocks may contribute to early withdrawal’s being apparently optimal in some cases, and not just “leakage” that reflects the reasons for expecting too-low retirement saving.

The Jones et al paper makes an ingenious use of IRS data to examine the effect of the 10% early withdrawal penalty on behavior around traditional IRAs. It examines withdrawals during a 5-year window around the year in which people turned 59-1/2. As it happens, given the period covered, these were people born between 1941 and 1951. The data includes people’s birthdays, how much they withdrew in a given taxable year, and what they paid in penalties. Since the actual withdrawal dates aren’t known (other than whether they led to a penalty), the key distinction is that between people who turned 59-1/2 early versus late in the middle year. If this date was early in the year, then (a) any penalty paid could have been avoided by waiting not all that long, and (b) there was more time in which withdrawals could be made after the penalty had ceased to apply (a distinction that one could imagine not mattering all that much, but in the data appears to have mattered).

A central finding was that people very much did respond to the early withdrawal penalty. In other words, it discouraged prior withdrawals, as it was meant to. This would have been an obvious result if we could assume that people are well-informed and acting rationally, but given questions about that assumption, it was worth establishing empirically. One imagines that the prominence of this date, perhaps including in brokers’ solicitations and the like, would help to fix it in people’s minds, along with the strong evidence from behavioral literature that people hate “penalties.”

But a secondary finding in the paper is that a distressingly high percentage of individuals, many of them low-income, incur the early withdrawal penalty when it seems relatively irrational to have done so – e.g., when one’s turning age 59-1/2 must have been relatively close, rather than being far enough in the future (such as December) that even a 10% penalty might have been preferable to, say, multiple months of high interest rate credit card debt.
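
To give a rough sense of the tradeoff between paying the penalty and waiting, here is a back-of-the-envelope Python sketch. Nothing in it comes from the paper: the 24% credit card APR, monthly compounding, and the absence of any interim payments are all my own assumptions, and the function names are hypothetical.

```python
import math


def card_interest(amount, apr, months):
    """Interest from carrying `amount` on a card for `months`, compounding monthly."""
    return amount * ((1 + apr / 12) ** months - 1)


def months_to_match_penalty(apr, penalty_rate=0.10):
    """Months of card interest that would cost as much as the early withdrawal penalty."""
    return math.log(1 + penalty_rate) / math.log(1 + apr / 12)


assumed_apr = 0.24  # hypothetical credit card rate
print(f"interest on $1,000 for 2 months: ${card_interest(1000.0, assumed_apr, 2):.2f}")
print(f"penalty on a $1,000 withdrawal:  ${1000.0 * 0.10:.2f}")
print(f"break-even wait: {months_to_match_penalty(assumed_apr):.1f} months")
```

On these assumed numbers, bridging with card debt costs less than the penalty whenever the wait until age 59-1/2 is under roughly five months.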

Especially as applied to lower-income individuals, the reason for the penalty is so that it won’t be incurred and people will retain their retirement saving. But it uses a cliff, and if it’s being incurred too frequently that might indicate a need to rethink the discouragement design.

A related issue concerns the main reasons for favoring both greater retirement saving and an ability to access funds pre-retirement (and also potentially to borrow) in response to great needs. Just as we want people to have adequate retirement saving, so we want them to be able to meet major medical needs, smooth consumption when they lose their jobs, handle disability expenses, etcetera. And both choice failures and market failures (along with having low lifetime income) may undermine their doing this adequately.

When one is thinking about these various needs, whether incurred later or sooner, there are two complementary perspectives that one could have in mind. The first is optimization. Is one making the best use of one’s lifetime income, treating that as fixed? In principle, improving someone’s lifetime optimization, such as by adjusting for choice failures or solving market failures that the government is better equipped than private firms to address, can make her better-off at zero cost to everyone else. (If this sounds paternalistic, it is – but if one wholly rejects it then one should question the existence of anything like Social Security. A key mitigating idea is that, much of the time, one will only be forcing people to do things that they wanted to do anyway. E.g. I personally am not being forced by Social Security to over-save for my retirement. It’s in effect just a floor on my retirement saving.)

The second approach I’ll call adequacy, for want of a better word. We may want to make sure that people can meet basic needs of retirement, sickness, disability, etcetera.

The optimization perspective arguably supports having Social Security as a forced retirement saving program that could in principle be actuarially fair as to everyone (although of course in practice it transfers resources between age cohorts, different types of households, etc.). The adequacy perspective might call for, say, paying demogrants to retirees, and then separately deciding on the financing for this benefit as just one more input to one’s overall distribution policy. Likewise, it might call for a general safety net approach to earlier needs arising from unemployment, accident, sickness, etcetera.

While we employ aspects of both approaches in the U.S. fiscal system, a more generous social safety net and approach to adequacy might ease (although not eliminating) our concerns about optimization, by mitigating both the worst failures of optimization and our conviction that there are failures (given the connection between consumer choice and reasonably presumed utility). Thus, for example, returning to IRAs and early withdrawal penalties, the case for allowing hardship withdrawals without penalty would be eased if the needs that most obviously might trigger this approach were better approached by our fiscal system from the adequacy perspective.