Part 2 re. yesterday’s NYU Tax Policy Colloquium discussing Daniel Hemel’s Law and the New Dynamic Public Finance

The immediately preceding blogpost offers part 1 of my reflections concerning the paper by Daniel Hemel that we discussed at the NYU Tax Policy Colloquium yesterday. Herewith part 2, discussing just three of the particular issues that the paper raises. (There are actually lots more, if you wish to take a look.)

Relevance of age differences; age-dependent tax rates – Whether you are looking at the same person at different life stages, or at the average individual at each age level at a particular point in time, immense age-dependent differences become apparent. People at different ages generally differ in their average income and wealth levels (and degrees of dispersion), in their labor supply elasticity, and plausibly in their utility functions. This has recently prompted a large economics literature on age-dependent taxation, including age-varying tax rates. This literature may be identified with NDPF, although much of it self-labels as OIT. Moreover, while it deals with how people change over time, to some extent the changes are predictable in advance, whereas NDPF often is focused on stochastic change.

As the Hemel paper notes, people might be expected to want more insurance via the tax system against “ability risk” in their 50s and 60s than in their 20s. Also, current labor supply elasticity is probably higher in the earlier period, given not just “gap years” and such but also schooling that aims to increase future earnings at the expense of current earnings. Then of course people tend to become more interested in retirement, and perhaps less able to earn current income or to have hopes of doing so in their future, as they enter their 60s and beyond.

There is nonetheless a widespread intuition or feeling that age-dependent tax rates are inappropriate. Their use therefore tends to congregate in particular side-realms. Examples include the earned income tax credit’s being limited to people between the ages of 25 and 64, and the Social Security rule (for many years) under which benefit payouts would be reduced if one had current wages.

This topic is rich for further exploration. One area that comes to mind is income averaging, which existed in the federal income tax from 1964 to 1986 but then was generally eliminated. While the use of income averaging is not inherently age-related, those rules were designed to cover up-and-down swings in how much people earned in a given year (although, to benefit, you needed to have the low-income years first). They were intended and designed (albeit imperfectly) not to cover the case where, say, you were a currently impoverished law or medical student whose income then shot up, not due to volatility but because you had newly entered your high-wage years. But it’s actually not obvious that they should be so limited.

For example, consider two individuals who enjoy the same lifetime earnings in present value. But A has expensive schooling until age 28, whereas B enters the workforce at age 22 and earns less per year but the same in PV due to the 6 extra years. Insofar as we think that lifetime income is the proper gauge, they ought to pay the same lifetime taxes in PV, but A might end up paying more due to graduated rates. Indeed, A might even be worse off in a lifetime sense if she has to backload her consumption due to the difficulty of borrowing against “great expectations.” (We also have to think about the total and marginal welfare implications of A’s six extra years slaving away in law or medical school rather than entering the paid workforce immediately.)
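The graduated-rate point in the A versus B comparison can be checked with a rough numerical sketch. Everything specific here is an assumption made up for illustration: the 3% discount rate, retirement at 65, B's flat 60,000 wage, and the two-bracket schedule in `annual_tax`, which is not actual law. The sketch also ignores A's schooling costs and any borrowing constraints.

```python
RETIREMENT_AGE = 65      # assumption: both stop working at 65
DISCOUNT_RATE = 0.03     # assumption: 3% annual discount rate


def present_value(amounts_by_age, base_age=22):
    """Discount a {age: amount} stream back to base_age."""
    return sum(
        amount / (1 + DISCOUNT_RATE) ** (age - base_age)
        for age, amount in amounts_by_age.items()
    )


def annual_tax(income, threshold=50_000, low_rate=0.10, high_rate=0.30):
    """Hypothetical two-bracket annual schedule (not actual law)."""
    return low_rate * min(income, threshold) + high_rate * max(income - threshold, 0)


# B works from 22 through 64 at a flat (hypothetical) 60,000 a year.
b_earnings = {age: 60_000.0 for age in range(22, RETIREMENT_AGE)}
pv_b = present_value(b_earnings)

# A works from 28 through 64; her flat wage is solved so that the PVs match.
a_years = range(28, RETIREMENT_AGE)
a_wage = pv_b / present_value({age: 1.0 for age in a_years})
a_earnings = {age: a_wage for age in a_years}
pv_a = present_value(a_earnings)

# Annual tax is computed year by year, with no income averaging.
pv_tax_a = present_value({age: annual_tax(w) for age, w in a_earnings.items()})
pv_tax_b = present_value({age: annual_tax(w) for age, w in b_earnings.items()})

print(f"A's wage: {a_wage:,.0f}; PV earnings A: {pv_a:,.0f}, B: {pv_b:,.0f}")
print(f"PV of lifetime tax A: {pv_tax_a:,.0f}, B: {pv_tax_b:,.0f}")
```

Under these assumptions A's lifetime tax comes out higher in PV than B's even though their PV earnings match, because more of A's earnings fall in the upper bracket and she has fewer years in which to use the lower one.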

Now suppose we change the facts so that A simply postponed working, in order to have fun traveling the world, knowing that she could earn more once she started. Here it seems that she is actually better-off than B in a lifetime sense, given that in her voluntary non-working years she got to enjoy herself rather than slave away in school. But it’s not clear that the possible impact of applying graduated rates to the same PV earnings, now concentrated into future years, gets us quite to the right place.

Anyway, age-dependent taxation is bound to be part of the analysis here, however it might come out.

History-dependence in unemployment insurance (UI) and Social Security disability insurance (SSDI) – The paper also discusses certain history-dependent rules that apply in UI, SSDI, and also in such areas as tort compensation for injury. If you lose your job or become disabled, the amount that you are entitled to collect depends on lost wages, which are discerned by looking at past wages. Therefore, if you and I both become entitled to collect UI or SSDI, or to receive tort compensation for lost wages, the one who used to earn more is presumably going to get a larger payout. The paper notes that this might be viewed as peculiar since it provides what one might deem regressive payouts. For example, if A is a high-low (i.e., one with low current earnings but high past earnings), whereas B is a low-low, we might think A better-off overall, as she presumably is in a lifetime earnings sense, yet we give A more $$ than B. The paper discusses NDPF-derived reasons why this might increase the tax system’s incentive compatibility (by offering a positive expected payoff, otherwise reduced by the tax system, to being high-wage in Period 1).

Focusing for convenience just on UI, I have tended to think that it makes sense as a government program despite its regressivity compared to offering all people who lose their jobs the same payoff. Suppose that people are averse to the risk of losing their jobs – and not just to being involuntarily unemployed in the current period – because it is costly to suffer a sudden and unexpected negative shock to their earnings. This might result, not just from psychological habituation to a given wage level, but also from having pre-committed to a spending path that presumed wage stability. (E.g., consider buying a home with a high mortgage or else paying a high monthly rental, and sending your kids to an expensive school.) Aversion to a downward shock would suggest buying insurance against it, but suppose that moral hazard and adverse selection prevent this insurance from being available at reasonable terms. But suppose the government can better address the adverse selection problem than private insurers would be able to. This can create a straightforward case for the government’s offering and indeed mandating the insurance on actuarially fair terms, on efficiency grounds and even without regard to distribution.

Traditional versus Roth IRAs – In a traditional IRA, you deduct the contribution and then are taxed on the distribution. By contrast, under a Roth IRA, there is neither deduction nor inclusion.

It’s well-known that these two methods are present-value equivalent, assuming a fixed rate of return and constant tax rates. For example, say the money held in the IRA will exactly double during the multi-year savings period no matter how great or small it might be, and that the relevant tax rate at all times is 33.3%.

Under a traditional IRA, you contribute $150 as this costs you only $100 after-tax, and then withdraw $300 that the distributions tax reduces to $200.

Under a Roth IRA, you contribute $100, this costs you $100 after-tax, and you withdraw and keep $200.
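The two cases above can be written out as a small sketch. The function names are hypothetical, and the 33.3% rate is taken to be exactly one third so that the arithmetic comes out in whole dollars.

```python
from fractions import Fraction


def traditional_ira(pre_tax_contribution, growth_factor, tax_rate):
    """Return (after-tax cost today, after-tax proceeds at withdrawal)."""
    cost = pre_tax_contribution * (1 - tax_rate)  # the deduction lowers the cost
    proceeds = pre_tax_contribution * growth_factor * (1 - tax_rate)
    return cost, proceeds


def roth_ira(after_tax_contribution, growth_factor):
    """Return (after-tax cost today, after-tax proceeds at withdrawal)."""
    return after_tax_contribution, after_tax_contribution * growth_factor


RATE = Fraction(1, 3)   # the post's 33.3%, treated as exactly one third
GROWTH = 2              # the account exactly doubles over the savings period

trad = traditional_ira(150, GROWTH, RATE)
roth = roth_ira(100, GROWTH)

print(f"traditional: costs {trad[0]}, keeps {trad[1]}")
print(f"roth:        costs {roth[0]}, keeps {roth[1]}")
```

Both accounts cost $100 after tax and leave $200 after tax, which is the present-value equivalence under a fixed return and a constant rate.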

Nonetheless, in real world scenarios the two can play out quite differently. Tax rates may change between the contribution year and the distribution year. Also, the scaled-up traditional IRA would earn a lower overall rate of return if, say, you had a special opportunity to earn more than the normal rate of return but it ran out once you had invested $100 in it. (To show why this is plausible in real world scenarios, suppose that the scaling-up issue, outside the IRA context but resulting from tax rules that operate to similar effect, would require Jeff Bezos to in effect create 1-1/2 Amazons, rather than just 1, in order to maintain his extraordinary rate of return on a nominally larger pre-tax investment.)
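Both departures from equivalence can be illustrated with a short sketch. The numbers beyond the post's own ($100 after-tax cost, doubling, one-third rate) are assumptions chosen for illustration: a 20% rate at distribution, and a special opportunity that quadruples money but only on the first $100 invested.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)   # contribution-year rate, as in the post's example
NORMAL_GROWTH = 2        # the normal investment doubles


def traditional_proceeds(after_tax_cost, rate_in, rate_out, growth):
    """After-tax proceeds when the deduction and inclusion rates differ."""
    pre_tax = after_tax_cost / (1 - rate_in)   # scaled-up contribution
    return pre_tax * growth * (1 - rate_out)


def grow_with_cap(invested, cap, special_growth, normal_growth):
    """Only the first `cap` dollars earn the special (extra-normal) return."""
    special = min(invested, cap)
    return special * special_growth + (invested - special) * normal_growth


# 1. Rate change: deduct at one third, include at a hypothetical 20%.
trad_rate_drop = traditional_proceeds(100, THIRD, Fraction(1, 5), NORMAL_GROWTH)
roth_baseline = 100 * NORMAL_GROWTH

# 2. Scarce opportunity: hypothetical 4x return, capped at $100 invested.
roth_with_rent = grow_with_cap(100, 100, 4, NORMAL_GROWTH)
trad_with_rent = grow_with_cap(150, 100, 4, NORMAL_GROWTH) * (1 - THIRD)

print(f"rate drop: traditional keeps {trad_rate_drop}, Roth keeps {roth_baseline}")
print(f"capped rent: traditional keeps {float(trad_with_rent):.2f}, Roth keeps {roth_with_rent}")
```

In the second case the gap between the two accounts equals one third of the extra-normal return on the $100, since the scaled-up $50 in the traditional account earns only the normal return.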

The Hemel paper notes that, under traditional IRAs, the tax rate when one contributes is often higher than that when one receives distributions, reflecting retirement’s downward influence on one’s marginal tax rate. Certain NDPF-style considerations suggest that this is effectively backwards. This therefore might suggest policymakers’ favoring Roth over traditional IRAs, all else equal.

Hemel has also offered interesting arguments elsewhere that there are distinct grounds for policymakers to prefer Roth to traditional IRAs. Here, for example, he notes that management firms such as BlackRock may prefer the traditional structure, because they get to earn a fee on the scaled-up assets under management. In effect, they are charging the government their standard fee for earning the $$ that it will claim when the funds are distributed, but we might be highly skeptical that paying this fee is worth it economically to the government in terms of ultimately enhanced returns to it.

I read this portion of the article against the background of a prior view that Roth IRAs are often worse from a policy standpoint for two reasons. The first is that, in Bezos-type cases, taxpayers earn scarce extra-normal returns through Roth IRAs that result in their avoiding any tax on the rents (whereas they would have to pay tax on the rents under a traditional IRA structure). The second is that fixed-period (such as 10-year) Congressional budget rules cause Roths unduly to look cheaper for the government than traditional IRAs, because the revenue loss (from excluding distributions) is largely pushed outside the estimating period.

Without purporting to resolve the traditional versus Roth issue based on any one issue alone, I would note that, at least in principle (and subject to policy change risk for rules that have a deferred application), one need not base the traditional IRA deduction and inclusion rules on the taxpayer’s contemporaneous marginal tax rate. Suppose, for example, that one wanted a net positive tax but was concerned that taxpayers’ marginal tax rates would decline between the two periods due to retirement. Then one could mandate, say, a 20% credit for the contribution and a 30% tax on the distribution – or, for that matter, equal credit and tax percentages if one wanted the PV of the net tax to be zero. In short, the question of when one provides partial reimbursement (such as via deductions), and when one imposes positive tax liability (such as via the inclusion of distributions) is not indissolubly tied to the taxpayer’s income tax MTR in the relevant year. Subject again to the question of political risk, this actually might leave one with more scope than otherwise to apply NDPF-style analysis to the question of how the overall set of transactions ought to be taxed.
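To make the credit-and-tax arithmetic concrete, here is a minimal sketch. It assumes, as in the earlier doubling example, that the account earns only the normal return, so that the discount factor equals the growth factor; the function name and the $100 contribution are hypothetical.

```python
from fractions import Fraction


def pv_net_tax(contribution, credit_rate, distribution_tax_rate,
               growth_factor=2, discount_factor=2):
    """PV of tax on the distribution minus the credit for the contribution.

    Assumes the account earns exactly the normal return, so the default
    discount factor matches the default growth factor.
    """
    credit_now = credit_rate * contribution
    tax_later = distribution_tax_rate * contribution * growth_factor
    return tax_later / discount_factor - credit_now


# 20% credit going in, 30% tax coming out: net positive tax in PV.
unequal = pv_net_tax(100, Fraction(1, 5), Fraction(3, 10))

# Equal percentages: the PV of the net tax is zero.
equal = pv_net_tax(100, Fraction(1, 5), Fraction(1, 5))

print(f"20% credit / 30% tax: PV net tax = {unequal}")
print(f"20% credit / 20% tax: PV net tax = {equal}")
```

On a $100 contribution the 20%/30% pairing yields a PV net tax of $10, whatever the taxpayer's own marginal rate happens to be in either year.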