A Longer View On Academic Publishing Platforms And Innovation

2016 Mar 23 opinion research business model

TLDR: Patentese values superficial bets on the future of technology and society, to the detriment of technology and society.

I've been watching the relationship between researchers, publishing platforms, and IP evolve with some interest. To choose some examples:

Elsevier buying Mendeley
Patent Trolls (for instance)
IBM Watson's attempts at culinary creation
Paul Allen launching AI2 and its Semantic Scholar

What technology will the next Amazon/Google/Facebook/Uber deploy? Today's unicorns consistently apply a novel assortment of known technologies to some large market beset by some inefficiency. In the absence of oracles, the next unicorns are guessable. If you can observe the pace of innovation in any particular field -- say the number of publications per year -- you quickly learn what technology sectors are interesting to academics and also receiving funding. If you see this rate of innovation increase, that signals that something new has occurred. And because researchers and funding agencies like to be fashionable, a quick semantic similarity analysis across the literature in that increase will give some sense of what the excitement is about.
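The two steps above (watch the publication rate, then ask what the new papers have in common) can be sketched in a few lines. This is a rough illustration under stated assumptions, not a description of any platform's actual method: the function names, the 50% growth threshold, the crude word-length filter, and the sample counts and titles are all hypothetical, and plain bag-of-words cosine similarity stands in for whatever semantic analysis a real system would use.

```python
from collections import Counter
from math import sqrt


def yearly_growth(counts):
    """Map each year to its fractional growth over the previous year.

    `counts` is {year: number_of_publications}. Only consecutive
    years with a nonzero previous count are compared.
    """
    years = sorted(counts)
    growth = {}
    for prev, cur in zip(years, years[1:]):
        if cur == prev + 1 and counts[prev] > 0:
            growth[cur] = (counts[cur] - counts[prev]) / counts[prev]
    return growth


def flag_bursts(counts, threshold=0.5):
    """Return the years whose growth meets the (assumed) threshold."""
    return [y for y, g in sorted(yearly_growth(counts).items()) if g >= threshold]


def term_vector(text):
    """Bag-of-words vector; words of 3 letters or fewer are dropped
    as a crude stand-in for stop-word removal."""
    return Counter(w for w in text.lower().split() if len(w) > 3)


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def shared_terms(titles, top=3):
    """Terms appearing in the most titles (each counted once per title)."""
    doc_freq = Counter()
    for title in titles:
        doc_freq.update(set(term_vector(title)))
    return [term for term, _ in doc_freq.most_common(top)]


if __name__ == "__main__":
    # Hypothetical publication counts for one field.
    counts = {2010: 100, 2011: 105, 2012: 110, 2013: 190, 2014: 200}
    print("burst years:", flag_bursts(counts))

    # Hypothetical titles drawn from the burst year.
    titles = [
        "deep learning for image recognition",
        "deep learning for speech",
        "soil chemistry of wetlands",
    ]
    print("similarity:", cosine(term_vector(titles[0]), term_vector(titles[1])))
    print("shared terms:", shared_terms(titles, top=2))
```

A real analysis would need far more than titles and raw counts (abstracts, citation links, normalization for a field's overall size), but the shape is the same: a cheap signal for "something changed," followed by a cheap summary of what the change is about.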

Most researchers want to make 'life' better, and most people are happy to pay for a better life. Since researchers are generally only excited by progress towards their discipline's goals, and since those researchers inhabit the same reality as their eventual consumers, what excites researchers will probably, eventually, ideally, be valuable to society. So, if you notice increased innovation in some sector and realize that, by the nature of that sector, growth will impact many, well that's worth paying attention to.

With some intuition in hand, it's time to place bets. The form of the bet would differ by institution: funding agencies can target some desired effect (say, Rep. Smith) and trolls could weight their acquisitions by maturity and potential scope (today's trolls as amateurs). So, such a learned-intuition technology is agnostic to the ends (as ever), but the actors differ greatly in their ability to amass the underlying database, and in the rewards for applying it.

Who is positioned to leverage the learned-intuitions? Funding agencies are doubly disadvantaged: grant reporting is sporadic and does not approach the rigor of (generally privately-held) peer-reviewed journals, while the ends will always be subject to the shifting winds of bureaucratic debate. (Moves toward open access, data, and analyses would remedy both of these, though they may be stymied by lobbying.) With the government's manifest inability to develop and apply new technologies, pessimism is warranted. Certain other entities are advantaged, as this relatively cheap method can extract more value out of already-held repositories. I picture this as an hourglass, where fields of research may be analyzed (both semantically and in the author/referenced/viewer graph) to identify emerging trends. This is the broad, upper funnel. The observed trends may be entered into patents, where the patentese (the stilted language encountered in patents) can hide the absence of a reduced-to-practice, coherently-understood innovation. This is the hourglass' neck, with one ideal being a single patent that draws inspiration from many observations and thereby is able to make broad claims across products (the hourglass' expanding lower chamber). A form of this speculative patenting occurs in many university tech-transfer offices today, which grasp at any IP in a projected-to-be-sexy market, but it is greatly improved by the intuition wrung from a large database.

The problem, the social harm, is that the standard of proof differs between the literature, patents, and the market, and so do the rewards. Researchers, 'the literature,' broadly value interesting, well-posed, and thoroughly-explained experiments. New works are valued by their novelty, by the degree to which, and the manner in which, they solve the associated problem...and not by their sweeping claims. Good work is not immediately and individually rewarded, but appreciated in aggregate through greater grant success and honorariums. The market has a related interest in things that work, that solve the consumer's problem...and little patience for those that do not. It does not reward ideas but their execution, and there is a federal agency to restrict claims to reality (the FTC in its consumer protection role). Between the researcher and market lie patents; on one side patents draw inspiration from disparate developments and on the other they seek to claim parentage of broad swaths of future products. The reward is (US, typically) a 20-year monopoly over all possible renditions of the claimed idea, far beyond that which was realized at the time of the original application. Whereas both the literature and the market incent specificity, the patent system incents vagueness.

Combining the aggregation of the academic literature into large databases in machine-readable forms with big-data analyses can* yield patentable claims. While there are probably big-data ways of evaluating market potential for determining the risk of particular claims, I suspect researcher interest to be a good proxy. (At least in some domains, as great interest and excitement in the newest particle or complexity does not a market indicate.) I imagine the typical result to be patents like Myriad's BRCA-1 test or the CRISPRs; things that are close to the literature dressed up in patent language. The patent application is not going to be reviewed for actual utility (as the FDA does), nor is the patent examiner going to verify that the claims are possible, only that they are plausible given the state of knowledge. As the burdens and perverse incentives (see the paper) of the examiners are widely known, entities might craft patent applications whose background summary and prior art are not representative of the literature but tilted to their benefit. Again, it does not matter (to the applicant) whether the claimed innovation actually functions, but only that it appears plausible. The risk of discovering this (potential) impracticality is reduced by the patent thicket, where the number of granted patents is more important than their quality (courts are similarly burdened in testing the leveraged claims).

Without reforming the incentives, rewards, and norms of the (US) patent system, I fear that patents will become an even larger vehicle for rent-seeking. Who's the master of these databases, and really, the knowledge they contain? For, as I mentioned, the semantic analysis of the literature may be put to other, more socially-useful purposes. It is important to remember that the purpose of patents is "To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries;" the very advance of technology has rendered sub-optimal the current form of the patent system. I am not against patents, but I do ask that they be useful.

*An assertion I think true; whether this is possible today or tomorrow is debatable.