Searching for information: If I were Neeva

Almost from its start, Google Search has been regarded as best-in-class, delivering more relevant results to both general and specific queries than its competitors. Maintaining this relevance and dominance has required large and ongoing investments in the search algorithm and web indexers, investments recouped through the use of user data for targeted advertising. In this, Google is solving two simultaneous search problems: users searching for information they don’t know, and advertisers searching for users who may purchase their products.

Many users are disconcerted by this (nominally anonymized) use of their search history to display ads, motivating various competitors to Google Search. DuckDuckGo, for instance, also sells advertising space in search listings, but they don’t use user information in ad display. Lacking Google’s profitability and scale, their results are less relevant, but sufficient for many users. Startpage is similarly motivated, paying Google to display Google search results while showing generic, not user-specific, Google ads. In this they’re like a pure Google Search, one that isn’t trying to maximize profit or pay for bad ideas.

In this vein, I was interested to see the launch of Neeva as a ‘subscription search’ service, paying for Google Search results without ads or (Google) tracking. While some people will be attracted to this, it seems like a minor differentiator, as the primary user experience is still delivered entirely by Google, one subject to cancellation at Google’s whim.

If I were to start a search company, I’d start by recognizing that many individual searches are not successful (1) because either the user doesn’t know what they’re actually searching for (2) or doesn’t know how to phrase the question in a sufficiently specific way (3). I’d focus on the value of questions: searching for information is valuable because it is an attempt to take what I know and learn something else.

That is to say that while I’m sometimes scatterbrained (*multitasking), I rarely conduct multiple, unrelated searches at the same time. Successive searches, observation (1), are an indication of the failure of the prior searches to produce a sufficient answer, yet Google and every other search engine I’ve seen neglect this temporal aspect, continuing to show results you’ve already rejected. Most of this is due to the too-simple interface of text box + contextualized hyperlinks, which prevents search from attempting to group results by their textual similarity, vintage, or domain. It’s nice if the answer can be in the first three results, but that is often not the case, and search should help us better understand how various results compare to each other. This naturally leads to a longer interaction with the search results (the thing Google pays to serve), as the user is encouraged to see how the results differ and, through a click or two, to delve into the differences before leaving the search engine.

And here’s the critical point: if the search engine can show the user the point at which it cannot determine which of two groups of results is more responsive to the user’s query, that is precisely the limitation of the search engine and the kind of information it needs in order to learn to serve better results. Keeping users on the search page for that initial screening allows the collection of data to make search better. Not to sell ads, but to make search better!
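
To make the grouping idea concrete, here’s a minimal sketch in Python. Everything in it is illustrative: the made-up result snippets, the TF-IDF representation, and the ‘two groups look equally responsive’ check are my assumptions, not anything Google or Neeva actually exposes.

    # Sketch: cluster result snippets by textual similarity, then check whether
    # the engine can actually tell which group better answers the query.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans
    from sklearn.metrics.pairwise import cosine_similarity

    query = "excel lookup value from another sheet"
    snippets = [                                   # imagined result snippets
        "Use VLOOKUP to pull values from another worksheet in Excel.",
        "INDEX and MATCH let you look up values across sheets.",
        "Top 10 Excel tricks every beginner absolutely must know!",
        "Download our free Excel templates for budgets and planners.",
    ]

    vec = TfidfVectorizer(stop_words="english")
    X = vec.fit_transform(snippets)                # textual-similarity features
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    q = vec.transform([query])

    scores = {}
    for group in (0, 1):
        members = [i for i, lab in enumerate(labels) if lab == group]
        scores[group] = cosine_similarity(q, X[members]).mean()
        print(f"group {group} (relevance {scores[group]:.2f}):")
        for i in members:
            print("   ", snippets[i])

    # The interesting case: when the engine can't separate the groups, that's
    # exactly when it should ask the user which they meant, and learn from it.
    if abs(scores[0] - scores[1]) < 0.1:
        print("Unsure which group you want -- tell me and I'll remember.")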

Building off (1), (2) recognizes that many searches have as their target a vague notion, something that may be only half remembered or a conjecture the user is looking to verify or disambiguate. Here, the convention of one query = one set of results is limiting, as multiple queries might be more helpful in circling in on what the user is actually asking. Google attempts this with its ‘other users searched for’ suggestions, but these are almost always entirely different questions. When the user hasn’t made a sufficiently specific query, the thing to do is to ask them for more context, not suggest a different question. In recognizing when the results are insufficiently specific and facilitating this narrowing, the search engine is again able to gather more information to more directly answer the user’s question, again making search better.

In the other direction, (3), some searches are nearly impossible. Searching for anything to do with Microsoft Excel, for instance, invariably returns pages of low-value clutter. Google Search provides no ability to filter out SEO clickbait nor any facility to distinguish between the needs of beginning and advanced users. The problem isn’t in making the query exact but in specifying the kinds of resources to exclude in a way the search engine understands. Now, Google and other engines have long offered boolean keywords and the ability to negate (-) a search term, but because of the interface it can be challenging to verify that the query has been parsed correctly. Just as rephrasing a question before answering it communicates understanding, search engines should attempt to communicate to the user that their pages of results are not just wasting the user’s time.
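
As a small illustration of echoing the parsed query back to the user, here’s a sketch that handles only bare terms, quoted phrases, and '-' negations; that grammar is my simplification, not any engine’s actual syntax.

    # Sketch: parse a query with quoted phrases and -negations, then restate
    # the interpretation so the user can confirm it was understood correctly.
    import shlex

    def parse(query):
        include, exclude = [], []
        for token in shlex.split(query):           # keeps "quoted phrases" intact
            if token.startswith("-") and len(token) > 1:
                exclude.append(token[1:])
            else:
                include.append(token)
        return include, exclude

    include, exclude = parse('excel "pivot table" -video -beginner')
    print("Showing pages that mention:", ", ".join(include))
    print("and that do not mention:  ", ", ".join(exclude))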

So if I were Neeva (or DuckDuckGo, since they’ve written their own search engine), I’d attempt to make search a better experience by providing better results and context than Google does. I’d classify users according to their search history and sophistication and prefer results that other users in their class have found helpful, while at the same time making clear that the results are tailored based on their search history and offering a way to remove the class restriction. Search that uses my and other users’ search history to become better, in a transparent way: that’s what I’d subscribe to.
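
A sketch of that class-restricted ranking; the user class, the helpfulness counts, the 0.1 boost weight, and the toggle for removing the restriction are all invented for illustration.

    # Sketch: prefer results that users in the same class found helpful, with an
    # explicit switch to drop the class restriction. All data here is made up.
    def rank(results, user_class, helpful_clicks, use_class=True):
        """results: list of (url, base_relevance); helpful_clicks: {(class, url): count}."""
        def score(item):
            url, base = item
            boost = helpful_clicks.get((user_class, url), 0) if use_class else 0
            return base + 0.1 * boost
        return sorted(results, key=score, reverse=True)

    results = [("blog.example.com/10-excel-tricks", 0.75),
               ("docs.example.com/advanced-macros", 0.70)]
    clicks = {("spreadsheet-power-user", "docs.example.com/advanced-macros"): 12}

    print(rank(results, "spreadsheet-power-user", clicks))                   # tailored
    print(rank(results, "spreadsheet-power-user", clicks, use_class=False))  # restriction off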

The future is just waiting to be built

Last fall I started getting a strong vibe that 2020 would be a transformative year. It has not disappointed. So many seeds planted over the previous decades are now sprouting. Some, like coronavirus, are quick-growing weeds that must be immediately and uncompromisingly dealt with. Others have grown slowly, and are only just beginning to break the surface; it’s hard to tell what fruit they’ll bear but, weed or wheat, their emergence shows us what we have to work with.

Marc Andreessen summarized one of these seeds by saying: “It’s Time to Build.” This raises some basic questions: what have we been doing, and what should we stop doing so that we can start building? What should we build? And why is now the time?

Continue reading “The future is just waiting to be built”

When hardware outlasts a business model

I just shared some thoughts about the Sonos debacle and, were this an isolated incident, it’s possible Sonos or another manufacturer could survive this decision. But already in the first weeks of 2020, I’m seeing similar stories in other parts of technology. Under Armour is ending support for its scale, wristband, and heart monitor products. Charter/Spectrum is tired of being paid to secure homes, giving affected customers one month to rip out their old hardware and switch to a new service.

It is really incredible that, when Amazon, Apple, Google, and lots of venture-backed companies are developing hardware and software platforms to better know what people are doing in their homes, Charter decides to shut this business down. I’m particularly impressed that they couldn’t find a buyer for any portion of the technology stack. To be clear, the home security product was originally built by Time Warner, which Charter acquired two years ago. Since Time Warner didn’t design this hardware itself, it’s likely standard home security hardware with Time Warner stickers all over it. Given this, the primary costs in selling these customers are migrating them from Charter/Time Warner’s account system to the buyer’s and switching the home security products’ connectivity from Time Warner’s DSL internet to some other service. This shouldn’t be that hard.

Continue reading “When hardware outlasts a business model”

Standardizing the Future

Apple’s resistance to standardized cellphone charging is disappointing and discouraging: Apple doesn’t want to be prevented from engineering its own, ever smaller connectors for ever rounder rectangles. Setting aside the great progress in wireless charging, the better question is: why is Apple lobbying when it could be improving its devices?

One of the oddly unsolved problems in cellphones is their daily habit of overcharging their batteries. Other high-demand battery devices, such as cordless power tools, have solved this by making the chargers more intelligent and thus able to stop charging when the battery is ‘full.’

Continue reading “Standardizing the Future”

Planned Senescence

I’ve followed the Sonos planned obsolescence drama with interest. I’ll admit some sympathy for the company: they have a spreadsheet somewhere that lays out the ongoing maintenance costs and, as they regularly receive telemetry from all of their devices, know very well how many devices are affected by the policy. They undoubtedly feel a bit betrayed by their customers: their policy is significantly more generous than that of other connected-device manufacturers, many of whom don’t deliver a single update.

But at the heart of this problem is a design decision: what is the correct interface for a speaker or other piece of audio hardware? The outraged customers have a point: making sound doesn’t get old and if the hardware shows no signs of degradation, why should it stop working?

Continue reading “Planned Senescence”

myPipes

Given the end-to-end encryption of HTTPS, third-party transport middlemen can’t determine what a user is doing without deep packet inspection. DNS operators have some of this information, as they must translate your request (www.youtube.com = 216.58.192.238) to connect you to your destination. So, is it good to change your DNS servers from your ISP (moral paragons that they are) to Namecheap / Dyn / Google?

What do these services do with their requests? They know your IP and what you sought; do they sell this to ad networks (or are they one themselves, Google)?

Alternatively, the pro-user direction is to tell you who you’re contacting; is Alexa, the CPAP machine, etc. reporting on you? They can tell you how frequently a new session is made and which other domains are contacted at that same time or with the same user string.

More locally, this connection monitoring is an opportunity for router makers and OS providers; they have been able to block access to arbitrary domains or IP ranges for years, but that’s a rather technical endeavor. Better would be to help the user/bill-payer understand who they’re working with. An egocentric Google Analytics.
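
Here’s a sketch of what that could look like, assuming the router runs dnsmasq with query logging enabled; the log path and the ‘query[A] domain from client’ line format are assumptions about that particular setup, not a universal standard.

    # Sketch of an egocentric analytics pass over a router's DNS query log.
    # Assumes dnsmasq with log-queries enabled; path and format are assumptions.
    import re
    from collections import Counter

    LOG = "/var/log/dnsmasq.log"                   # hypothetical location
    QUERY = re.compile(r"query\[\w+\]\s+(\S+)\s+from\s+(\S+)")

    per_device = {}                                # client IP -> domain counts
    with open(LOG) as log:
        for line in log:
            match = QUERY.search(line)
            if match:
                domain, client = match.groups()
                per_device.setdefault(client, Counter())[domain] += 1

    for client, counts in per_device.items():
        print(f"{client} looked up:")
        for domain, n in counts.most_common(5):
            print(f"  {domain:40s} {n:5d} times")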

If I Were: Arm&Hammer

‘Fresh box for baking!’
100tsp/box

..I’d sell baking soda in 1-teaspoon packages. They already recognize the need (‘Fresh box for baking!’) but choose to sell 100 tsp per container for $0.94. Other parts of the box tell me to change boxes every month, so, according to Arm&Hammer, 95% of the product they sell to customers is intended to be wasted (I use about 5 tsp, or 5 batches of cookies, per month). I’ll gladly pay $1 for 5 tsp a month to have a less self-contradictory package!
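
Working the box’s own numbers (my usage of roughly 5 tsp a month is the only guess):

    # Rough math from the box; the 5 tsp/month usage figure is my estimate.
    tsp_per_box, price_per_box, tsp_used_per_month = 100, 0.94, 5
    wasted = tsp_per_box - tsp_used_per_month          # if replaced monthly, as directed
    print(f"{wasted / tsp_per_box:.0%} of each box goes in the trash")               # 95%
    print(f"cost per teaspoon actually used: ${price_per_box / tsp_used_per_month:.2f}")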

If I Were: Smucker’s

Smucker’s Grape Jelly

Seriously. Sell jelly in squeeze tubes. Just be done with it already. OK, get fancy and sell PB&J in a dual tube. Tube packaging efficiency is so much better, and, if you do the nozzle right, you don’t need to refrigerate because no air will enter (see Franzia/bagged wine). Tetra Pak and others have this figured out; why are we still discussing this?

Howard Hughes Medical Institute @ Stanford

After my PhD I left UW-Madison for Prof. Mark Schnitzer’s group at Stanford and HHMI, where I worked with laser physicists and biologists to design automated neuroscience systems that increase experimental throughput. These robotic systems are more than a convenience, as they can eliminate the need for sedatives, permit re-sampling of the same neurons over long durations, enable simultaneous observation of multiple, disparate brain regions, increase experimental controls, and reduce experimenter workloads. I found this work interesting because these systems naturally inhabit unexplored design regimes and require varied and creative systems, mechanical, electrical, and software engineering. Here’s a longer overview of their research.

My primary project was reconstructing, improving, and rearchitecting the fly-picking robot, seen here in its original form.

This robot is the first step of an automated fly experiment system, where we need to gain custody of a freely behaving fruit fly and prepare it for subsequent experiments.  This paper has more details on the overall vision.

I also designed and built a 5D remote center-of-rotation kinematic mechanism to enable observation of challenging areas of the mouse brain.  I’ll include a longer discussion in a later post.

I chose to live in Palo Alto and bike commute 6-8 miles every day. This was great for my health (collarbone excepted), and avoiding a car commute helped make Palo Alto more tolerable. My second apartment was near Page Mill, and most weekends I biked west and up one of the great Portola climbs. Running the Stanford Dish or in the Santa Cruz Mountains was also quite fun, though I missed the forested, verdant trail running I had in Wisconsin. And it was great being 3 hours from Tahoe skiing, though I was never convinced that it was winter when it was 3 hours away. There is much to criticize about the quality of life in the Bay Area: I found the prevalent socioeconomic class distinctions jarring, and I became increasingly doubtful that they would act to fix their broken cities. Alas. I have some thoughts on the way out of this more generally; we’ll save them for a longer post.

If I Were: Macy’s/JCPenney

The first way to view the ongoing struggles of department stores like Macy’s, JC Penney, Younkers, and others is by likening them to their internet adversaries. From this vantage we see their hip, sprawling, prime-retail stores as grossly inefficient warehouses, bleeding margin on:

  • customer acquisition: advertising to get people in the door
  • labor: presentation, refolding, cashiers, and cleaning
  • storefront: cost/sqft to be in the (strip)mall

These costs recur, so that for any item we can imagine (and, if we had the data, could calculate) the daily cost of selling that item, or similarly view how every additional day on the shelf further decreases the potential profit on that item. Now, the whole idea of department stores is that you can use profits in one seasonal department to offset losses in others, so this is simplistic, but it communicates the basic challenge of retail.
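
To make the per-item view concrete, here’s a tiny sketch; the price, cost of goods, and daily overhead are invented for illustration.

    # Sketch: how each additional day on the shelf eats into an item's potential
    # profit. Every number here is invented.
    price, cost_of_goods = 40.00, 18.00
    daily_overhead = 0.35        # acquisition + labor + storefront, per item per day

    for day in (0, 30, 60, 90):
        remaining = price - cost_of_goods - daily_overhead * day
        print(f"day {day:3d}: potential profit ${remaining:6.2f}")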

One way to move more product and offset the cost of physical retail is to also sell online, using your brand loyalty to compete directly with the online-only retailers. In some cases this loyalty can sustain higher prices, but in many cases it seems that department stores must price-match online-only retailers product-for-product. And since the department stores’ cost of inventory is higher than the online-only retailers’, they must accept reduced profits. (Many were able to make up this loss through their close relationships with leading brands, potentially giving them access to wider product variety, better product targeting to regional stores, and likely better terms.)

But people still shop for clothes in person, suggesting that the stores provide some value that they’re not capturing today. So the second view on department store struggles is through their historic value proposition of convenience, selection, and reasonable cost/frequent sales. With retail items costing slightly more than online, we’re left with an immediacy that next-day shipping can’t quite match and a product selection that, while decreased in breadth from online, can be classified and filtered to a much greater extent by personal criteria.

Given this, I wonder how a showcasing model would change these underlying businesses. Instead of selling customers in-store items, the retailer should prefer, say, two-day shipping from the regional distribution center over depletion of the in-store inventory. I think the retailer has a choice between selling an in-store unit with 50% of the original margin remaining versus inventory from a more cost-efficient warehouse where, say, 95% of the original margin remains. If ship-to-home is the default, advertised-sale price, the retailer could still sell in-store items at a slight $5/5% markup, as a soft preference for selling items from the retailer’s most efficient units. Moreover, by shifting the retailer’s distribution strategy from inventory-on-shelves to more efficient warehouses, they might better compete with online retailers by aping their efficiencies: in no world does it make sense to expose five identical units of every single item to customers; this is just an artifact of the era when the store was the warehouse.
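
Using the 50%, 95%, and $5 figures from the paragraph above (the item’s original margin is invented), the soft preference looks like this:

    # Sketch of the showcase choice; the original margin is invented, while the
    # 50% / 95% / $5 figures come from the paragraph above.
    original_margin = 22.00

    sell_shelf_unit = 0.50 * original_margin + 5.00    # in-store, at the slight markup
    ship_from_warehouse = 0.95 * original_margin       # default two-day ship-to-home

    print(f"sell the in-store unit:  ${sell_shelf_unit:.2f}")
    print(f"ship from the warehouse: ${ship_from_warehouse:.2f}")
    # The warehouse still wins, which is the point of making ship-to-home the default.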

So, if I were Macy’s/JCPenney, I’d seal the deal in person and deliver in two days.