Saturday, August 22, 2015

Giving Donald Trump an ounce of credit

Dan Diamond has a piece on Donald Trump's unemployment whopper--he claimed the "real" unemployment rate is 21 percent, even though the official rate is only 5.3 percent. Here's Trump's full quote:
"Don’t forget in the meantime we have a real unemployment rate that’s probably 21%. It’s not 6. I’s not 5.2 and 5.5. Our real unemployment rate–in fact, I saw a chart the other day, our real unemployment–because you have ninety million people that aren’t working. Ninety-three million to be exact.

"If you start adding it up, our real unemployment rate is 42%."
Diamond runs down the list of what Trump might be thinking of. The headline 5.3 percent figure is the U3 measure of unemployment--the number of people looking for a job as a share of the labor force, meaning everyone who either has a job or wants one. There's also the U6 measure, which additionally counts those who have some work but would like more, but that's still only 10.6 percent. Diamond concludes Trump is probably thinking of the share of the adult population that isn't in the labor force--people who neither have a job nor are looking for one--which works out to 37 percent. "Factor in Trump's tendency toward exaggeration, and that's pretty close to the number he quotes to TIME." Indeed.

But Trump wasn't exaggerating, and I think it's pretty obvious what figure he was citing. We all have a tendency to interpret statements from generally stupid individuals in the stupidest possible way, overlooking more obvious and less dumb interpretations. We did this a lot to President Bush, I will admit. Anyway, here's the percent of the population that isn't employed:
[Chart: the percentage of the civilian population that does not have a job, according to official numbers.]
Just read Trump's quote again: "people that aren't working"--he's clearly talking about this employment-population ratio. The number is down to 41 percent now, but with rounding it was 42 percent as recently as 2014.

Diamond is right that this is not the best measure of unemployment. Trump's measure includes children, the disabled, and retired senior citizens who don't work for reasons that have nothing to do with the health of the economy. And Trump understands that too--that's why his unemployment figure is 21 percent: he guessed that half of those who aren't working are non-employed for reasons unrelated to the economy, and cut the 42 percent in half accordingly.

Economists have given this issue a lot of thought, which is why we've come up with a very specific definition of "unemployment"--those who want jobs but don't have them--to capture a more meaningful economic indicator. And the flaws with that statistic, which we are well aware of, are why we've come up with tons of alternative measures, like U6. I spend quite a bit of time communicating economics to non-economists, which is why I care enough about this to write a post about it. Non-economists usually aren't aware of these definitional issues and often interpret the headline U3 measure to mean the percent of the population that doesn't have a job. I'm frequently asked how many of the 5.3 percent unemployed are children and retired people (none!), and when I explain the definition, everyone always wants to know the "real unemployment rate," by which they mean the percentage of people who are not employed. The US has always been a bit puritanical, and (now that women are part of the workforce) people tend to think of this as a place where almost everyone works full time--they are usually quite shocked at how large the share of non-employed people is, and that is exactly what Trump was playing to.

So, yes, more than 40 percent of the population is not employed. Of those, 37.4 percentage points--over 9 out of 10 of Trump's "unemployed" people--are not in the labor force, meaning that they are not looking for work. They are children, retired senior citizens, the disabled, people in prison, students, people wealthy enough not to work, and yes, some "discouraged workers" who would be looking for work but don't think they can find a job. But of those who either have a job or looked for one in the past month, 94.7 percent were the former.
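
To see how the pieces fit together, here's a quick sketch using rough mid-2015 BLS figures (approximate round numbers, just to reproduce the percentages above):
// Rough mid-2015 BLS figures, in millions of people (approximations, not official precision):
double population = 250.9;      // civilian noninstitutional population, age 16+
double employed = 148.9;
double unemployed = 8.3;        // no job, but looked for one in the past month
double laborForce = employed + unemployed;            // ~157.2 million
double notInLaborForce = population - laborForce;     // ~93.7 million -- Trump's "ninety-three million"

Console.WriteLine(1 - employed / population);     // ~0.41  -- Trump's "real unemployment rate"
Console.WriteLine(notInLaborForce / population);  // ~0.37  -- not in the labor force at all
Console.WriteLine(unemployed / laborForce);       // ~0.053 -- the official U3 unemployment rate
Console.WriteLine(employed / laborForce);         // ~0.947 -- share of the labor force that has a job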

Friday, August 21, 2015

On bitcoin block size

Lots of outlets have covered the recent controversy over bitcoin block size, but none of them have really given me much detail on it. So here's my attempt to make sense of it. My single biggest source is the bitcoin wiki, and based on my remaining open tabs, more specific sources are here, here, here, here, here, here, and here. Also, just for fun, you can monitor new bitcoin transactions in real time here.

I've discussed before how the bitcoin protocol works, but in that discussion I focused on how bitcoin uses cryptography to validate individual transactions. The current debate is about something related but slightly different: how those individually validated transactions propagate into the public ledger when there is no central command to secure that ledger.

When you purchase something with bitcoin, your bitcoin wallet creates a record of that transaction, consisting of to and from addresses along with cryptographic signatures. You upload this record to a payment processor, who is really just some random guy with a computer that is hosting a bitcoin node using the open-source bitcoin software.

Anyone can host a node, provided they have adequate hardware (just an ordinary computer with at least 50GB of hard drive space) and an internet connection (ordinary cable internet service will work). A node consists mostly of a copy of the complete bitcoin block chain--the public ledger containing a record of every single bitcoin transaction that has ever happened--which currently amounts to about 40GB of data. It's called a block chain because the ledger's contents are broken up into a sequential chain of blocks, where each block contains a hash of the previous block along with records of some transactions. A hash is just a number produced by applying a hash function to some text: two identical texts always produce the same hash number, and (for all practical purposes) two identical hash numbers had to have been produced by identical text. This allows validation of the block chain: for each block you just apply the hash function to the previous block and compare the result to the hash contained in the current block; if they match, you do the same for the previous block, and so on all the way down the chain until you arrive at the original, first-ever bitcoin block.
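
To make the hash-chain idea concrete, here's a toy sketch in C#. It is not the real bitcoin format--actual blocks are binary structures whose headers are hashed twice with SHA-256--but it shows the mechanics of walking the chain and checking each block's stored hash of its parent:
using System;
using System.Security.Cryptography;
using System.Text;

class Block
{
    public string PreviousHash;   // hash of the previous block's contents
    public string Transactions;   // stand-in for this block's transaction records
}

static class ChainCheck
{
    // Toy hash: real bitcoin applies SHA-256 twice to a binary block header.
    public static string Hash(string text)
    {
        using (var sha = SHA256.Create())
            return BitConverter.ToString(sha.ComputeHash(Encoding.UTF8.GetBytes(text)));
    }

    public static string Serialize(Block b) { return b.PreviousHash + "|" + b.Transactions; }

    // Walk from the newest block back toward the first one, checking each stored parent hash.
    public static bool ChainIsValid(Block[] chain)
    {
        for (int i = chain.Length - 1; i > 0; i--)
            if (chain[i].PreviousHash != Hash(Serialize(chain[i - 1])))
                return false;   // stored hash doesn't match the actual previous block: the chain was tampered with
        return true;
    }
}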

So when you send your transaction file to be verified, the node host (i.e., a miner) will add it to a block it is currently building. When the node is finished building that block, it pushes the block to the bitcoin network, where others also validate its contents and add it to their copies of the block chain. Others can validate the block's place in the block chain by following the chain of hashes described in the paragraph above. If the block's contents are validated (see previous), and the chain of hashes is valid, you are halfway to completing the transaction.

There is no minimum number of transactions that the miner must include in the block--the block can be considered complete once the miner has solved a time-consuming proof-of-work puzzle, which basically prevents spam from being entered into the block chain. That said, there's an incentive to include as many transactions in a block as possible, since you get fees on each transaction but only solve the difficult problem once. But there's a limit: blocks are capped at 1MB, or roughly 2,400 transactions.
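
Here's a toy version of that proof-of-work puzzle (not bitcoin's actual header format or difficulty rule, just the general idea): keep incrementing a nonce until the block's hash starts with enough zeros. Finding a valid nonce is slow; verifying one takes a single hash.
using System;
using System.Security.Cryptography;
using System.Text;

static class ProofOfWork
{
    // Try nonces until the hash of (blockContents + nonce) starts with `difficulty` zero hex digits.
    // Bitcoin's real rule compares a double SHA-256 of the block header against a network-wide target,
    // and the network tunes that target so a block is found only about every 10 minutes.
    public static long Solve(string blockContents, int difficulty)
    {
        string target = new string('0', difficulty);
        using (var sha = SHA256.Create())
        {
            for (long nonce = 0; ; nonce++)
            {
                byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(blockContents + nonce));
                string hex = BitConverter.ToString(hash).Replace("-", "");
                if (hex.StartsWith(target))
                    return nonce;   // publish the block with this nonce; anyone can verify it with one hash
            }
        }
    }
}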

Once a block is pushed to the network, the other nodes all check that it is valid: they check that the solution to the proof of work is correct, that the hash chain goes all the way back to the original block, and finally that that hash chain is the longest of all candidate hash chains (by convention, nodes agree to honor only transactions along the longest chain--this prevents double-spending). But an issue can arise here. If two miners are building blocks at the same time and push them to the network, chances are that both blocks contain a hashed reference to the same parent block, since neither miner has yet seen the other's block--the block chain is forked. At most one of these can become part of the longest chain, and the other is invalidated--all of the transactions it contained must be added to another block and the whole process tried again, until those transactions become part of the longest chain. Only after enough nodes have recognized that a transaction is part of the longest chain does it finally become valid.
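
Continuing the toy sketch from above, the fork-resolution rule itself is simple: among the candidate chains that validate, keep the longest one (this reuses the Block and ChainCheck types from the earlier sketch):
using System.Linq;

static class ForkResolution
{
    // Nodes keep the longest chain that validates; transactions that were only on the losing
    // fork go back into the pool to be included in a later block.
    public static Block[] PickWinningChain(Block[][] candidateChains)
    {
        return candidateChains
            .Where(ChainCheck.ChainIsValid)
            .OrderByDescending(chain => chain.Length)
            .First();
    }
}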

Note that we have two constraints: first, the network limits how fast blocks can be added, both by rejecting simultaneous additions and by adjusting the difficulty of the proof-of-work puzzle so that a block can be added only about once every 10 minutes. Meanwhile, each block can contain only about 2,400 transactions because of the 1MB limit. Thus there is an effective upper bound on how fast transactions can be added to the block chain of about 4 transactions per second, and perhaps considerably less when other factors are considered. And that's where the block size debate comes into play. According to this guy's calculations, we'll probably hit that ceiling in 2016 or 2017, likely causing transaction fees to spike as the market rations space in the block chain. One way to increase the capacity of the network is simply to increase the block size and, with it, the number of transactions that can be added per block. This would mean that the block chain file would grow more quickly, perhaps raising the hardware and network requirements for hosting a full node. On the other hand, it would also mean more revenue from mining, without necessarily increasing transaction fees.
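
The back-of-the-envelope arithmetic behind that ceiling, taking the ~2,400-transactions-per-block and 10-minutes-per-block figures above as given:
double transactionsPerBlock = 2400;   // ~1MB block size limit / a few hundred bytes per typical transaction
double secondsPerBlock = 10 * 60;     // difficulty adjusts so blocks arrive roughly every 10 minutes
Console.WriteLine(transactionsPerBlock / secondsPerBlock);   // = 4 transactions per second, at best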

But that is not the only way the network can expand capacity. In my view, the likely outcome of failing to increase block size is increased centralization and the emergence of Bitcoin Banking. In fact, you don't need a formal transaction entered on the block chain for control of a bitcoin to change hands. The way a Bitcoin Bank would work is this: you have bitcoins and deposit them at a bank (or, more cheaply, you buy bitcoins through the bank). Then the bank would maintain an off-network block chain as you spend those bitcoins, and at the end of each day it would merge its off-network chain with the official block chain. But if the bank is sufficiently large, it can consolidate considerably. For example, instead of a separate transaction every time one of the bank's customers spends coins at Kroger, it could simply record one transaction for the full amount all its customers spent at Kroger that day. This would represent a massive consolidation in the number of transactions, the number of blocks required, and therefore the amount paid in transaction fees. In a world where transaction fees are high (and most bitcoin transactions are legal/legitimate), I think this kind of centralization is inevitable.
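
Here's a sketch of that netting idea, with made-up types: the bank records each customer purchase on its internal ledger during the day, then settles on the real block chain with one aggregate transaction per merchant.
using System.Collections.Generic;
using System.Linq;

// A customer purchase recorded only on the bank's internal, off-network ledger (illustrative type).
class InternalPayment
{
    public string Customer;
    public string Merchant;
    public decimal Bitcoins;
}

static class BitcoinBank
{
    // End-of-day settlement: thousands of internal payments collapse into one on-chain transaction per merchant.
    public static Dictionary<string, decimal> DailySettlement(IEnumerable<InternalPayment> ledger)
    {
        return ledger
            .GroupBy(p => p.Merchant)
            .ToDictionary(g => g.Key, g => g.Sum(p => p.Bitcoins));
    }
}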

But then, centralization might be inevitable anyway. A centralized off-network block chain administered by a trustworthy bank doesn't require all of the resource-consuming proof-of-work and decentralized network verification, and thus is inherently lower cost regardless of the maximum block size of the main bitcoin block chain. To the extent that people want to use bitcoin at all, there is an arbitrage opportunity for any trustworthy institution to centralize it. Economics tells me that arbitrage moves markets.

Wednesday, July 29, 2015

What Scott Sumner believes

I will confess that I haven't actually been reading much of Scott Sumner. I was vaguely aware of his views through commentary on other blogs, but it wasn't till I read this post that I realized that he thinks the Federal Reserve actually caused the Great Recession:
"I've always thought that it was patently obvious that the Fed caused the Great Recession with a tight money policy that allowed NGDP expectations to collapse in late 2008."
Obviously, Sumner knows that, unlike the ECB, the Fed didn't raise interest rates prior to the Great Recession, but he thinks there was "passive tightening" where the Fed failed to lower interest rates in response to the macroeconomic shock, and in that sense caused the recession that followed.

Sumner continues:
"But other people apparently don't see it as being at all obvious. They look for alternative explanations. And yet when you ask them why, they tend to give these really lame 'concrete steppes' explanations, such as, 'The Fed didn't raise interest rates on the eve of the Great Recession, so how can you claim that tight money caused the recession?' Or they show themselves to be completely ignorant of actual Fed policy, and claim that the fed funds target was at zero when NGDP expectations collapsed in 2008. It wasn't."
To a certain extent, Sumner is just playing with words: the Fed "caused" the recession by not engaging in enough monetary stimulus to prevent it from happening. That doesn't lessen the need to search for theories to explain the shock that necessitated the Fed's reaction, but that's beside the point. When you look at NGDP you do see the Fed lagging a bit:
The Fed funds rate did trail NGDP by a few months—it doesn't start to fall until a bit after NGDP starts to fall, and doesn't trough until a bit after NGDP troughs. But there's still something a bit weird about this claim.

Sumner's argument boils down to this: if only the Fed had lowered the Fed Funds rate a few months earlier, the whole recession would not have happened. But lowering the Fed Funds rate a few months late fails to produce a similar effect.

Apparently, in the market monetarist view, timing is everything. I think that's the key reason mainstream economists are skeptical of Sumner's hypothesis. Monetary-induced recessions typically have sharp upturns once monetary policy is loosened. Sumner writes:
"But as for the rest, the overwhelming majority who think nominal shocks do matter, I’m mystified. Take the AS/AD model that you see in McConnell, Mankiw, Krugman, Cowen and Tabarrok, Hubbard, or any of the other textbooks. Why do we even teach this model if confronted with an almost perfect example of a depression caused by tight money, we simply don’t believe it?"
Problem is, that's not what those models say. Those models predict a symmetrical positive effect on NGDP when monetary policy is loosened, like we saw in the early 1980s with the Volcker experiment. But there was no comparably sharp upswing after the Fed floored interest rates at the end of 2008. Hence, there's definitely something weird here that conventional theory doesn't explain. If the Fed dropped the ball, why did the ball shatter instead of bounce?

Aside: my inner web programmer must criticize the URL scheme on Sumner's blog. One of my big pet peeves is when websites use querystrings incorrectly, as Sumner's blog does, so here's a quick refresher:
Everything before the ? in the URL is the URI path, which identifies a resource--in this case the blog post--within a hierarchical interface, while everything after is called the "querystring." A handy way to think about it, I think, is that the path should be the information required to get to the resource, while querystrings should contain only optional, user-defined information relevant within the requested resource. For example, https://myshop.com/products/42 should give us product number 42 from myshop.com, but if we want to supply a search term to filter products by, we might use https://myshop.com/products?search=market+monetarism, which filters the products resource on the search term "market monetarism" that we provided.
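
For what it's worth, the same distinction shows up when you take a URL apart in .Net (using the made-up shop URLs from above; HttpUtility lives in the System.Web namespace):
using System;
using System.Web;   // for HttpUtility

var uri = new Uri("https://myshop.com/products?search=market+monetarism");
Console.WriteLine(uri.AbsolutePath);                  // "/products" -- the path identifies the resource
var query = HttpUtility.ParseQueryString(uri.Query);  // everything after the "?"
Console.WriteLine(query["search"]);                   // "market monetarism" -- optional, user-supplied filter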

Monday, July 27, 2015

Why do we hate artificial sweeteners?

There's a long history of health controversies surrounding artificial sweeteners, with critics accusing them of causing everything from cancer to diabetes to attention deficit disorder. Aaron Carroll reviews the evidence and shows, pretty clearly, that none of the claims are sound. Bottom line: there's no evidence that artificial sweeteners have adverse health effects, and no serious studies cast doubt on the zero-calorie proposition: consuming zero-calorie artificial sweeteners doesn't cause you to gain weight.

So why are so many people so against artificial sweeteners? I'd like to posit a hypothesis: people keep looking for reasons why artificial sweeteners are bad for you because they dislike the way artificial sweeteners taste. Personally, I think they taste awful: bitter and harsh, and not really all that sweet. Artificial sweeteners taste a bit different to different people, depending largely on genetics. But we all generally do agree that sugar tastes good. Maybe this whole scare is just people looking for a way to justify their unhealthy preference for sugar over artificial sweeteners.

Friday, July 24, 2015

ARGH--Microsoft database edition

Ok, here's a question: how do programmers make non-web applications in .Net anymore?

I ask because Microsoft has killed off SQL Server Compact Edition, which I had previously thought was the standard way to make a database for a non-web-based program in .Net. But there's no SQLServerCE assembly in Visual Studio 2013 or 2015, not even as a reference you can add--if you want it, you have to download and install the .dll file manually. Every non-trivial program necessarily involves storing and manipulating data, and I have a hard time believing that most non-web programs don't have relational, query-able data that calls for a database, so what exactly are these non-web programs using?

Ok, obviously Microsoft left a way to have a local, non-web relational database--in fact, they've now extended SQL Server Express so that you can create local SQL Server databases that work offline for your applications, and this is the standard (and only) way to add a local database in the 2013 and 2015 editions of Visual Studio. And at first I was quite pleased with this, because you can use the same SQL Server client namespaces, such as System.Data.SqlClient, for a local database as you would for a database on the web server, making it useful for development of web applications.

In other words, local databases in .Net programs are now "real" SQL Server databases rather than an alternative made specifically for offline applications. There are advantages to this--for example, Compact Edition databases did not support stored procedures, whereas real databases do. On the other hand, I'm not sure why developers of offline apps would care--stored procedures are really a way of handling data complexity, which is important on a server-side system with data streaming in from disparate sources, but unimportant in an offline app where you have full control over the format of the incoming data. And the difference is not benign: whereas Compact Edition databases were self-contained within the offline apps that used them, these new full databases have a hard dependency on SQL Server Express, an entire separate program which must be installed on every computer before the offline app can run.

And that's what has me puzzled. I've installed a couple of offline programs with local databases onto other people's computers, and in every case, on both new Windows 8 machines and older ones, it has prompted me to install SQL Server Express. That reveals two things: 1) this isn't just something that Microsoft has built into new operating systems, such that the dependency wouldn't matter, and 2) few if any developers are using Microsoft's new system for local databases--if they were, SQL Server Express would already have been installed. For reference, this is now the code to create a database:
// Requires: using System.Data; using System.Data.SqlClient;
string filename = @"C:\Users\User\Documents\testdb.mdf";
if (!System.IO.File.Exists(filename))
{
    string databaseName = System.IO.Path.GetFileNameWithoutExtension(filename);

    // Connect to the local SQL Server Express instance's master database.
    using (var con = new SqlConnection("Data Source=.\\sqlexpress;Initial Catalog=master;Integrated Security=true;User Instance=True;"))
    {
        if (con.State != ConnectionState.Open) con.Open();
        using (var command = con.CreateCommand())
        {
            // Create the database with its data file at the path we want.
            command.CommandText = String.Format("CREATE DATABASE {0} ON PRIMARY (NAME={0}, FILENAME='{1}')", databaseName, filename);
            command.ExecuteNonQuery();

            // Detach it from the server instance so it lives on as a standalone .mdf file.
            command.CommandText = String.Format("EXEC sp_detach_db '{0}', 'true'", databaseName);
            command.ExecuteNonQuery();
        }
        con.Close();
    }
}
That's a lot harder than before. You now have to connect to the computer's SQL Server instance, create a database, and then detach that database to its own file. Then you can connect to the new database.
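
Once the .mdf has been detached, the app can open it directly by pointing the connection string at the file, something like this (same caveats as above; AttachDbFilename tells the local SQL Server Express instance to attach the file on demand):
// Requires: using System.Data.SqlClient;
string connectionString =
    @"Data Source=.\sqlexpress;AttachDbFilename=C:\Users\User\Documents\testdb.mdf;" +
    "Integrated Security=true;User Instance=True;";

using (var con = new SqlConnection(connectionString))
{
    con.Open();
    using (var command = con.CreateCommand())
    {
        // Do something with the newly created database, e.g. create a table.
        command.CommandText = "CREATE TABLE Notes (Id INT IDENTITY PRIMARY KEY, Body NVARCHAR(MAX))";
        command.ExecuteNonQuery();
    }
}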

But the bigger problem is this: SQL Server Express, as best I can tell, does not support transparent encryption. That's a big problem when you think about it. Sure, you can encrypt data before putting it into the database, but then the DBMS can't tell what its value is when you query it, which severely complicates a lot of the functionality you'd like a program to have (the only way to do "in the past year" type searches is with a web of surrogate keys; if the encryption is done correctly, the only way to find a search term is to query and decrypt all the data one by one until it's found). Between the dependency on SQL Server and the lack of usable encryption, there's a strong temptation to continue using the old Compact Edition databases. But that's problematic: if Microsoft is no longer supporting Compact Edition, that means they aren't upgrading encryption algorithms or fixing security bugs.
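
To see why encrypting before insert hurts, here's roughly what a lookup over an encrypted column ends up looking like. This is a hypothetical sketch (the Customers table and the Decrypt delegate are stand-ins for whatever the app actually uses); the point is that the database can't evaluate the predicate, so every row has to come back to the client.
// Requires: using System; using System.Data.SqlClient;
// The DBMS can't filter on encrypted values, so the app scans and decrypts row by row.
static string FindCustomerId(SqlConnection con, Func<string, string> Decrypt, string searchName)
{
    using (var command = con.CreateCommand())
    {
        command.CommandText = "SELECT Id, EncryptedName FROM Customers";   // no useful WHERE clause is possible
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                if (Decrypt(reader.GetString(1)) == searchName)
                    return reader[0].ToString();
            }
        }
    }
    return null;
}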

In conclusion, it is now basically impossible to make a standards-compliant offline application using only supported Microsoft frameworks. I get and agree with the web-first paradigm, but come on, that's ridiculous.