Also, this isn't a good subject for a "Falsehoods Programmers Believe" list, because those lists are not about technologies but about non-technical things that programs commonly have to handle. Hence "Falsehoods Programmers Believe About":
Phone Numbers (sort of technical but it's not falsehoods about how phones work, but rather about how phone numbers are structured and what they 'mean'),
Good possible future Falsehoods programmers believe about:
In fact I am currently dealing with a falsehoods programmers believe about versioning of laws and standards at work.
This is less what programmers believe, and more a "Things to keep in mind" sort of list. But I suppose the less accurate title is a bit more click-baity.
That, when searching for a string, I don't want exact matches to appear in the results.
If your search ever DOESN'T return exact matches (barring common misspelling correction), you're doing something seriously wrong.
Even worse, Google sometimes shows results that don't even contain text you've specified an exact match for with double quotes, e.g. "find me"
There is a tab-style link directly below the search box called "Tools" in the search results page. Once clicked it displays a few settings and one of them can be set to "Verbatim".
Choose that and your search terms will actually be what is searched for, as opposed to some arbitrary subset of it. I wish this was better documented.
It's infuriating, because I presumably won't always notice an option that I don't know is hidden there.
If I search for "restaurants", I want search results that ARE restaurants, not search results which have the word "restaurants" in them.
What do you want to have happen?
It used to be that searching literally "restaurants" (i.e. with quotes) would search for an exact match (particularly useful for multi-word searches in those days), but no more. It's taken as a 'hint' or something, I believe, but not an absolute instruction.
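For what it's worth, here is a rough sketch of the old behaviour, where a quoted phrase is an absolute instruction and everything else is only a hint. The function names and sample documents are made up for illustration; this is not how any real search engine implements it.

```python
import re


def parse_query(query: str) -> tuple[list[str], list[str]]:
    """Split a query into quoted phrases (must match verbatim) and loose terms."""
    phrases = re.findall(r'"([^"]+)"', query)
    remainder = re.sub(r'"[^"]*"', " ", query)
    return phrases, remainder.split()


def search(query: str, documents: list[str]) -> list[str]:
    """Return only documents that contain every quoted phrase verbatim.

    Loose terms affect ordering (more matching terms first) but never
    exclude a document. Quoted phrases are a hard filter.
    """
    phrases, terms = parse_query(query)
    hits = [d for d in documents if all(p.lower() in d.lower() for p in phrases)]

    def loose_score(doc: str) -> int:
        lowered = doc.lower()
        return sum(t.lower() in lowered for t in terms)

    return sorted(hits, key=loose_score, reverse=True)


# Hypothetical sample data.
DOCS = [
    "How to find me on the map",
    "Find restaurants near you",
    "They will never find me here, a restaurants guide",
]

# Only the two documents containing "find me" come back; the one that
# also mentions restaurants is ranked first.
results = search('"find me" restaurants', DOCS)
```

The point of the split is that the hard filter and the ranking hint are separate steps, so a quoted phrase can never be silently dropped by the ranking logic.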
This is even more frustrating when you do a date constrained search and google tells you there are no emails from that date, but if you page through manually, it’s there. I feel like gmail is constantly gaslighting me.
> Foobar Canteen is a 2 Michelin star restaurant located in the heart of Soho.
This used to be how search engines knew what was a page about restaurants and what wasn’t.
But in any case, the problem with not returning exact strings is those times when you do need exact strings. Like researching a famous quote, passage of text or software error message.
However, as someone who already learned to translate my desires into keywords, it's freaking annoying.
That we want well known standards like CTRL + F in a browser to be hijacked and replaced by default with a custom search experience that's a lot worse than a browser's search.
Try CTRL + F'ing on Stripe's documentation: https://stripe.com/docs/api/plans
I would argue, and I don't think it's a particularly tricky discussion, that if your site design subverts the normal, expected behaviour and functionality of the browser to such an extreme degree, then you have created a poor user experience.
The only issue is that in Firefox, it is only equivalent for the first search; once you close the bottom bar, subsequent F3/CTRL-G just do "find next occurrence" and do not display the bar anymore. Chrome always displays the search input on the other hand.
Edit: since talking shortcuts, in Firefox ' (apostrophe) is like CTRL-F but searches only hyperlinks (and you can cycle through in case of multiple matches with F3/CTRL-G) which is extremely useful for quickly navigating pages via keyboard only.
0: In Google Sheets, for example, Ctrl-g opens the JS-driven find bar, or, if it's already open, advances the match.
Unfortunately, Alt+F is trivially overridable by web pages (Twitch.tv in this case -- to move to the search bar), so that doesn't really work.
Chromium devs have no idea what the impact of their decisions is... and judging by the issue trackers they don't care.
I'm not sure if you can hook into the native CTRL + F search tool and see what a user typed (my gut says no way there's an API for that), so I guess Stripe just wanted to track as much information as possible on what people are searching for, even if it makes the user experience a lot worse.
The docs are indeed viewable without JS (in a limited way) but the default experience relies on JS to render text.
We don't render all content on the page at once for performance reasons, which is (as a sibling speculated) the driving reason for overriding cmd+f/ctrl+f by default.
I hope to write an engineering blog post soon about how we build the Stripe api docs, with some focus on the performance and UX tradeoffs at play here.
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:67.0) Gecko/20100101 Firefox/67.0
Whoever thought it was a good idea should get shot out of a cannon.
Decisions like this are why I do not support adding more functionality into web browsers. Most web developers have proven to be inept and incompetent, as demonstrated by this dumpster fire of a “feature”.
(Mobile, cannot invoke keyboard on page, JS disabled.)
And behaviour may change.
Just tell us.
One that takes multiple seconds to get a response on a search and it's all contained in a tiny modal dialog box that has no skimmability and when you click one of the results it does a new page load to bring you to the results. Stripe is usually a superb developer experience. Truthfully I have no idea how it ended up in production as a default option.
One of my favorite sets of local search bugs involve interpreting "near me" as "near maine".
There's an anecdote from early days of Google Search where a certain domain was ranking 1st for an unrelated query (i.e., a false positive). The managers refused to move ahead before that got fixed, but the bug/edge case proved a head scratcher for several weeks on end.
Lastly one of the engineers solved the problem - by buying the domain and taking it offline.
Point being, if you can fix the problem outside of the code domain, do just that.
Sadly, I can't seem to find it; I'm mostly getting spam articles related to SEO.
In the early days of Froogle, a shopping search engine made by Google, searching for "sneakers" always yielded a garden gnome wearing sneakers, one unit on sale, as the top result. This was considered bad, as someone searching for "sneakers" probably wanted to buy sneakers, not garden gnomes. The whole team tried to fix it, but they didn't want to just hardcode an exception. It eluded them for a while. Finally, it was not there anymore. They asked around for who had solved it, no one answered. Finally, one colleague arrived late - and placed the gnome on their desk.
Does Google have some kind of cultural allergy to special-casing or writing fallback rules around its recommendation systems? I ask because Chrome's spellcheck still lacks a lot of words that you can find in an abridged dictionary; it seems as though fallback rules like "the first hit needs at least one keyword match" or "never flag words found in Merriam-Webster as unknown" are basically never employed.
So, when one says "Hey Google, show me the front yard." Instead of showing the camera feed-- one gets information about a bar in LA called "The front yard".
Me: Hey, Siri, how long does it take to drive to Yellowstone National Park?
Siri: OK, one option I see is US National Commercial Real Estate Services on W Park Run Dr.
(Yes, that's verbatim — I screenshotted it)
About the only thing I would add to it is i18n concerns.
A few quick ones off of the top of my head:
- Words are separated by whitespace or dashes.
- Customers only ever enter ASCII.
- Customers only ever enter accented characters with/without accents.
- A "Unicode-capable" system will happily take in any valid unicode.
- A "Unicode-capable" system will pass through any valid unicode undisturbed.
- Software systems perform Unicode normalization.
- WinNT API is UTF-16.
- There is 1-to-1 mapping between uppercase and lowercase.
- Unicode collation algorithm is optimal for every single language.
- Unicode collation algorithm is optimal for multi-language document sets.
- Distinguishing/coalescing plural and singular forms of words is easy.
- There are separate plural/singular forms of words.
- Words have stem and optional suffixes, but not prefixes.
- Soundex etc. works for every language.
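A few of the items above are easy to check for yourself. A small Python sketch, using only the standard library; the `fold` helper is a made-up example of a search key, not a recommendation:

```python
import unicodedata

# Case mapping is not 1-to-1: one character can become two, and the
# round trip does not always give back the original string.
assert "ß".upper() == "SS"
assert "ß".upper().lower() == "ss"
assert len("İ".lower()) == 2  # U+0130 lowercases to "i" + U+0307 (combining dot above)

# Same visible text, different code points, until somebody normalizes.
composed = "caf\u00e9"     # é as a single code point
decomposed = "cafe\u0301"  # e followed by a combining acute accent
assert composed != decomposed
assert unicodedata.normalize("NFC", decomposed) == composed


def fold(text: str) -> str:
    """Hypothetical search key: decompose, casefold, drop combining marks."""
    decomposed_text = unicodedata.normalize("NFKD", text).casefold()
    return "".join(ch for ch in decomposed_text if not unicodedata.combining(ch))


# Accent-insensitive matching lets "cafe" find "Café"...
assert fold("Café") == fold("cafe")
# ...but it is lossy: Spanish "año" folds to "ano", which is a different word.
assert fold("año") == "ano"

# Splitting on whitespace finds no word boundaries in Chinese text.
assert len("我喜欢搜索引擎".split()) == 1
```

Whether folding accents is the right call depends on the language and the customers, which is more or less the point of the list.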
Or that there are just two plural/singular forms (1 and many) for translating strings, or that which form to pick is clear.
While English has one form for 1, and one form for 0/many:
- French pluralises 0 the same way as 1,
- Czech has a form for exactly 2-4 items,
- Irish has forms for exactly 3-6 and 7-10 items,
- Polish has a form for all numbers that end in 2-4,
- Russian has a form for all numbers that end in 1,
- Arabic has forms for exactly 0 and 2 items, ending in 03-10, and many more.
A strings table will need at least 10+ variants if you want to translate strings referring to number of items.
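To make that concrete, here is a minimal sketch of plural selection for four of the languages mentioned. The rules are simplified, integer-only versions in the spirit of the Unicode CLDR plural rules; the function names and the tiny message table are my own, and a real project should use an i18n library rather than hand-rolling this.

```python
def plural_category(lang: str, n: int) -> str:
    """Return a plural category name for a non-negative integer n."""
    if lang == "en":
        return "one" if n == 1 else "other"
    if lang == "fr":
        # French treats 0 like 1.
        return "one" if n in (0, 1) else "other"
    if lang in ("ru", "pl"):
        last_digit, last_two = n % 10, n % 100
        # Russian: anything ending in 1, except numbers ending in 11.
        if lang == "ru" and last_digit == 1 and last_two != 11:
            return "one"
        # Polish: only exactly 1.
        if lang == "pl" and n == 1:
            return "one"
        # Both: ends in 2-4, except numbers ending in 12-14.
        if 2 <= last_digit <= 4 and not 12 <= last_two <= 14:
            return "few"
        return "many"
    raise ValueError(f"no plural rule for {lang!r}")


# Hypothetical strings table: one entry per category, per language.
MESSAGES = {
    "en": {"one": "{n} file", "other": "{n} files"},
    "ru": {"one": "{n} файл", "few": "{n} файла", "many": "{n} файлов"},
}


def format_count(lang: str, n: int) -> str:
    return MESSAGES[lang][plural_category(lang, n)].format(n=n)
```

For example, `format_count("ru", 21)` picks the "one" form while `format_count("ru", 11)` picks the "many" form, which is exactly the kind of thing an `if n == 1` check gets wrong.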
...and those I had to classify as "problems I probably had and didn't recognize" or "will surely encounter soon"
So often we underestimate this thing...
"Users won't want to turn search highlighting off."
Maybe it's just me, but this seems distracting.
It's by no means smart, doesn't handle misspellings or anything, but it works reasonably fast and predictably. This is basically how almost every desktop app with a search bar works. This is how word processors and editors work when users search within the document.
All of the "Falsehoods Programmers Believe About ..." genre articles do.
The way to use them is not to view them as an immutable checklist that all programs must conform to or else they are forever and always nothing but total crap, but as a list of things professionals should at least have some clue about, and that you should generally make deliberate decisions about, rather than accidental ones. Are you a pizza place with ten locations in a single state? Then by all means, take US-only addresses, and hard-code the time zone on your web site, and probably just ignore search, and expect first & last names or whatever. Just do it as a deliberate tradeoff, with an understanding of what it may take to undo it later.
Are you working in an international company serving customers all around the world, with the need to provide some search functionality? Well, you probably need to be able to fulfill a lot more of the relevant lists.
"It has x, y, z features that the Google search has..."
"I didn't know Google did that."
There's even an explanatory popup showing how to do anything fancier than a straight text search... but they really just use numbers and names. Advanced feature usage is once in a blue moon.
At least they're happy with the search function, but lesson learned - for a lot of usages, people aren't expecting much more than a simple text match.
I'm not sure you're doing your users a service implementing search with SQL LIKE. I think it's probably better to divert them to Google, use a full text SQL index, use a managed search service like Algolia, or not do search. Otherwise, you're just promising them functionality that is almost always going to fail them.
Why is that?
Users have been pretty heavily conditioned to use search in specific ways that are different from finding text in documents. They have a broad range of needs that a wildcard 'find in files' search doesn't really support. And most frustratingly, users expect a single search bar to support them all. Some needs are known-item: finding an item by name (like contacts on your phone). Other needs are informational: finding a fact or idea by expressing requirements. Sometimes it's about getting a survey of information about a topic, and sometimes it's about comparing and contrasting different products.
The primitives available in SQL LIKE don't really lend themselves to solving any of these problems. There's no concept of relevance ranking, there's only exact, direct, case-insensitive matching, not to mention it's going to do a full table scan on every search...
(You'd have my ear more if we were talking about full text search features in SQL.)
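A quick way to see those limits, using SQLite from Python's standard library. The table, the rows and the `like_search` helper are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products (name) VALUES (?)",
    [
        ("Pizza oven, wood fired",),
        ("Oven for pizza and bread",),
        ("Toy stove for kids who love pizza, ovens sold separately",),
    ],
)


def like_search(term: str) -> list[str]:
    """Substring match only: no tokenizing, no ranking, no typo tolerance."""
    # The term is a bound parameter, but % and _ inside it still act as
    # wildcards; a real implementation would have to escape them.
    # The leading % also means an ordinary index on name generally can't help.
    rows = conn.execute(
        "SELECT name FROM products WHERE name LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return [name for (name,) in rows]


exact = like_search("pizza oven")      # only the first row
reordered = like_search("oven pizza")  # nothing, although two rows are relevant
broad = like_search("pizza")           # all three rows, toy stove included, unranked
```

Changing the word order loses every result, and a one-word query returns the toy stove with the same standing as the actual ovens, since there is nothing to rank by.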
The difference with find on page is that it’s obvious and transparent what is being searched and the expectations of the interface. Trust me when I say that a search bar to a layperson on your site is them thinking “oooh I can google”
I think it more goes back to the real most important lesson of programming - you must know your customer and their needs first. If a google-like experience is what your customers demand, then you better understand all of that stuff and build it. If they just want to search for names and ticket numbers, any more advanced intelligence is a waste of time that could have been used to build other features that the customer actually wants.
Next, make it context-specific. Don't put a search bar at the top of the page suggesting that this bar can search for everything. If you use a simple implementation like a SQL LIKE to implement search, put the search bar right next to the thing that is displaying results from the table. Make it look like it's filtering the table.
Finally, label the search bar using words like "Keywords," which also suggest to users that they should be typing keywords instead of a more complicated natural language phrase.
Mostly I think this whole thread demonstrates the point of the original article, but I appreciate your response.
I have written search engines for a couple of sites that combined serve about a million uniques a year.
It's not great, but it's not terrible, and took less than a week. People search for places and names, so it's quite easy to match them.
We looked at one of the open source engines, but it was a lot of effort for not a lot of gain, and essentially adds another significant moving part to go wrong.
not windows 10
Other than that, the solution space is just as wide open as regular programming. It's just in many ways more frustrating because nobody knows what they really want from search, they just "know it when they see it" and no two users really can agree on what a good result is! :)
Search can be considered an additional feature just like any other - Yes? How do you falsify this?
Search can be added as a well performing feature to your existing product quickly - Yes if you're using a CMS with search already there like Drupal, or you can use that thing where your search uses/directs to Google.
Search engines don't work like your standard RDBMS with SQL and whatnot. You can't just make a SQL query with a LIKE operator and just call it a day if you want modern, featureful searching.
But a search engine is absolutely a database. Lots of things are databases even if they aren't RDBMS and can't be queried with SQL.
Although, as a side note, I have seen some interesting projects that allow you to query things like file systems and operating systems using SQL, or at least syntax largely inspired by SQL.
Adding a feature by using a product that already has that feature is not "adding a feature to a product". It's "doing nothing since there's nothing to do". ;-)
Using Google search for pages might work for simple sites that mostly host text content, but not for things like "find all foos that are between 20 and 30 kg".
God, how I hate that authorization woes find a way to make everything else 5x more complicated.
After conversations with two or three product managers, it became clear that the best course of action was to do nothing at all. I'm definitely not an expert on search or human behavior, and running through all the possible interpretations of how to handle misspelled words and what the customer wants was way more work than I was prepared to do.
I'll even point out that my initial suggestion was, "Let's just copy Google and do, 'Did you mean to type _______?'" Even that was met with, "what if the customers X" "what if the customers Y" etc. etc. Wasn't worth the time (at the time).
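For reference, the naive version of "Did you mean ___?" is only a few lines with Python's `difflib`. The vocabulary and the cutoff value below are assumptions picked for the example, and choosing them is exactly where the "what if the customers X" questions come in:

```python
import difflib
from typing import Optional

# Hypothetical vocabulary; in practice this might come from the search index.
KNOWN_TERMS = ["invoice", "customer", "subscription", "refund", "payment"]


def did_you_mean(query: str, vocabulary: list[str] = KNOWN_TERMS) -> Optional[str]:
    """Suggest the closest known term, or None if the query is already
    known or nothing is similar enough."""
    if query in vocabulary:
        return None
    # cutoff=0.8 is an arbitrary choice: lower it and short queries start
    # matching unrelated words, raise it and real typos get no suggestion.
    matches = difflib.get_close_matches(query, vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else None


suggestion = did_you_mean("invocie")
```

It never rewrites the query on its own; it only produces a suggestion the UI can show, which sidesteps some (not all) of the product questions.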
Also, while many items in the list are insightful, I find what bothers me in this and similar lists is when you could swap anything for "search" (or "time", "addresses" or whatever the other lists happen to mention).
See for example, replacing "search" with an X:
- Choosing the correct X is easy and you will always be happy with your decision
- Once setup, X will work the same way forever
- Once setup, X will work the same way for a while
- Once setup, X will work the same way for the next week
- The default X settings will deliver a good X experience
Or even enough context to interpret:
> Search can be considered an additional feature just like any other
Is that a falsehood? - what does it even mean?
"<non-trivial feature F> can be considered an additional feature just like any other"
And everyone will agree that's probably false. They could have written "search is almost never a trivial feature, and you should take your time to consider complications", but I suppose that wouldn't sound as cute as a "Falsehoods Programmers Believe" list.
[ ] Disable fuzzy parsing hacks (reject my queries if they have bad syntax).
[ ] Don't search for sound-alikes; assume I spelt everything rite.
[ ] Respect the non-alphanumeric characters in my query, which I put there for a reason.
Really, this is a falsehood? Like, I want the same query to give the same results given the same dataset always. When do you not want that?
Of course this is false; please consider:
- customers expect to see in search results whatever new information they added/updated in the system (this is related to "Customers don’t expect near real time updates");
- customers expect "personalized" search results; having built up a history of searches centered around particular subjects (say, programming), you'll expect much different results for "string" than the general population gets;
- customers expect new/more results having logged in, or having gained new permissions/roles;
- customers running "knowledge" or "command" queries ("what is the weather?" "password 16") expect varying results
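A toy sketch of two of these points (permissions and personalization), showing how the same query against the same index can legitimately return different results. All class names, fields and documents are invented for the example:

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    title: str
    topic: str
    required_role: str = "public"


@dataclass
class User:
    roles: set = field(default_factory=lambda: {"public"})
    favourite_topics: set = field(default_factory=set)


INDEX = [
    Doc("String theory for beginners", "physics"),
    Doc("Python string formatting", "programming"),
    Doc("Internal string-handling guidelines", "programming", required_role="staff"),
]


def search(query: str, user: User) -> list[str]:
    """Same query, same index; the results still depend on who is asking."""
    visible = [
        d for d in INDEX
        if query.lower() in d.title.lower() and d.required_role in user.roles
    ]
    # Documents on the user's favourite topics sort first; ties keep index order.
    visible.sort(key=lambda d: d.topic not in user.favourite_topics)
    return [d.title for d in visible]


anonymous = User()
developer = User(roles={"public", "staff"}, favourite_topics={"programming"})

anonymous_results = search("string", anonymous)
developer_results = search("string", developer)
```

The anonymous user gets two results with the physics page first; the developer gets three, with the programming pages on top.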
I might dash off a search for "sneakers" when I am researching footwear. A week later, I might be thinking about movies and enter the same query string, expecting IMDB results.
Not always the desired behavior. This should be toggleable. It becomes very difficult to find results outside of what google thinks you want.
Bug reports, newspaper articles, blog entries, sports scores...
If I google "waitrose closing time" I want the closing time of my local supermarket today.
When I googled that yesterday, I got a different result, and that's what I want.
Perhaps the user is a business user rather than a developer and once profiled correctly results can be adjusted.
A belief, or a system of beliefs, is but a model. It's virtually guaranteed to be wrong. It also may very well serve the important function of being simple enough to handle in-core, while at the same time being close enough to substitute for the real thing.
I would go a step further and say all formal models are proven to be wrong. After all, that's what Gödel and Turing kept going on about.
We can't prove any non-trivial program ever halts or does not halt. In fact, we can't (or don't) prove much about our programs we run anywhere.
All programs are a collection of assumptions. To bring this back to the topic at hand, if all of our search assumptions are useful to some meaningful number of people then it really doesn't matter how many "falsehoods" we trip over. Those falsehoods fall away, becoming mere insignificant edge-cases. Satisfying all people all the time in all cases is a fool's errand.
Articles like this are good at letting you know your blindspots so you can choose your blindspots rather than succumb to them. But don't let it become dogma.
Your point certainly holds true for any physical entity as far as we know - probabilistic quantum effects, Heisenberg's Uncertainty, chaotic systems, and all that.
However, if you were to model a theoretical entity, and given a few more constraints (like strict computability, which precludes Turing-complete systems), you can indeed have correct models. Alas, in practice this is a rather rare example.
If you're Google, differentiating 'or' as in either from 'OR' as in Oregon is a task you need to take on. But if you're writing a National Park lookup tool, you probably just don't want to worry about that case. In that case it's still worth knowing; you might be able to save users some time by at least showing clearly how you reinterpreted their input.
Very much so; engineering is all about choosing the trade-offs, and hopefully improving them in the future. The list also helps with solving some of the unknown-unknowns problem in regard to what the customer expectations may be; even whole new domains of expectations (like immediacy of update, or handling of accented/non-english characters).
As far as I can tell, Google got rid of the special-cased "OR" in the general search - right now it's a word, not a predefined/reserved symbol.
They were able to do so by adding an "implicit OR-like" operator between all the words in the query. Not quite an implicit OR, not quite an implicit AND; something a bit more complex in between.
The words of the query get weighted against matches on their own, but also as adjacent words (higher weight) and whole phrases (yet higher weight). All in all, the problem got solved by an improved matching and sorting algorithm, not by somehow smartly detecting when "OR" is meant as "OR", or OR, or or.
The problem got solved in the match scoring/sorting domain, rather than in the query parsing domain.
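To illustrate the idea (and only the idea; this is a toy, not Google's algorithm, and the weights are arbitrary assumptions), a scorer where every word is just a word and adjacency and whole-phrase matches earn progressively more:

```python
def score(query: str, document: str) -> float:
    """Toy relevance score: single words < adjacent pairs < whole phrase."""
    q = query.lower().split()
    d = document.lower().split()
    if not q or not d:
        return 0.0
    doc_words = set(d)
    doc_pairs = set(zip(d, d[1:]))

    total = 0.0
    # 1 point for each query word found anywhere in the document.
    total += 1.0 * sum(w in doc_words for w in q)
    # 2 points for each adjacent query pair that is also adjacent in the document.
    total += 2.0 * sum(p in doc_pairs for p in zip(q, q[1:]))
    # 4 points if the whole query appears as one contiguous phrase.
    n = len(q)
    if n > 1 and any(d[i:i + n] == q for i in range(len(d) - n + 1)):
        total += 4.0
    return total


def rank(query: str, documents: list[str]) -> list[str]:
    return sorted(documents, key=lambda doc: score(query, doc), reverse=True)


# Hypothetical documents; "or" gets no special treatment in the query.
DOCS = [
    "maine lobster",
    "portland oregon or bangor maine",
    "ferry from portland or maine to boston",
]
ranked = rank("portland or maine", DOCS)
```

The parser never has to decide what "or" means. The document that contains the phrase as typed simply outscores the ones that only share some of the words.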
One that tripped me up a few years ago: non-programmers think that 'strings' are long fibrous things that cats play with. The connection between the word 'string' and the concept of text is not intuitively obvious to people who don't already know the lingo. Seems obvious now, in retrospect.
But how do you deal with the following situations?
-- talking to people, since you have no opinion
-- understanding the people around you, building a mental model of them
-- general confidence
That's a big tragedy in our society, that you're expected to have a definite opinion on everything. Myself, I have very few strong opinions, and those that I have I hold loosely. When someone asks, I usually try to sketch the space within which I believe the answer lies (e.g. "I suspect X, but then there's Y and Z, and also V I'm not sure what to do with"). This has a nice side effect of making strongly-opinionated regulars suddenly unsure about their own opinions.
A UX person thinks of search as a text box "like google". However, a lot of search UIs have a lot going on when you start typing and when you get results back: ways to refine search results, "did you mean" (DYM) corrections, breakdowns/aggregations, suggestions, etc. A lot of these features require careful planning and design and are not necessarily easy to bolt on if you don't plan for them.
I've also had to do basic things like patiently explaining the difference between sorting and ranking and humbly suggesting that, maybe, having a multi column layout with sortable columns isn't necessarily the right thing for presenting search results where the output is a list of stuff in order of relevance.
Engineers are easier to deal with once you sit them down and talk them through how stuff works.
But we've had search engines as a major part of our lives for about two decades now. Most of us use one at least daily. We're familiar with the complexities of search engines and how they differ from simply searching a document for an exact string or even a regular expression. Many programmers like me work with tools like analytics and log aggregators that expose the complexities of search to us in a way that's more intimate than the veneers of Google and Amazon.
Maybe I'm just lucky in that my experiences have dispelled these notions of search being easy or simple. But I hope I'm not alone.
Also, there's a disparity between what search is and what your users expect. Technically, I could make a really simplistic "search engine" that amounts to a SQL LIKE query. It may not be good or what users might expect coming from Google/Amazon/etc, but it would be a search engine. (Oops. Looks like my pedant hat slipped back on when I wasn't looking.)
- customers are always searching for a specific item, rather than an entire category
- customers know that a search engine for one kind of item (e.g. products for sale) won't also search the entire rest of your website
I laughed, but I don't think this is a correct representation of something many programmers genuinely believe. It's worded in such a way that it's clear this is a joke. Not sure if I should read the full list if it's just going to be jokes like this one.
REs and FSMs are equivalent.
In practice, the common "RegEx" implementations add a lot of extras that break the theoretical backing and also exhibit highly non-linear behavior. Cf. this excellent article by Russ Cox: https://swtch.com/~rsc/regexp/regexp1.html
Most Regexp implementations in the wild are more powerful than textbook regexps, so they not only encode all languages accepted by DFAs, but can also encode other languages. E.g. back-references are not a feature of regular languages.
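Both halves of that can be poked at in a few lines of Python. The DFA below is hand-written for the textbook expression `(ab)*` and checked against the `re` module on every short string over {a, b}; the back-reference pattern then matches a language (some a's, a b, then the same number of a's) that no finite automaton can recognize. The names are mine, and this is a sanity check, not a proof:

```python
import re
from itertools import product

# Hand-written DFA for (ab)*: state 0 is the start state and the only
# accepting state; any missing transition means "reject".
DFA = {(0, "a"): 1, (1, "b"): 0}


def dfa_accepts(text: str) -> bool:
    state = 0
    for ch in text:
        state = DFA.get((state, ch))
        if state is None:
            return False
    return state == 0


TEXTBOOK = re.compile(r"(?:ab)*")


def agree_up_to(length: int) -> bool:
    """Check that the DFA and the regex agree on all strings over {a, b}
    of at most the given length."""
    for n in range(length + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            if dfa_accepts(s) != (TEXTBOOK.fullmatch(s) is not None):
                return False
    return True


# A back-reference goes beyond regular languages: the group must repeat
# exactly, so the matcher has to "remember" how many a's it saw.
BACKREF = re.compile(r"(a+)b\1")
balanced = BACKREF.fullmatch("aaabaaa") is not None    # True
unbalanced = BACKREF.fullmatch("aaabaa") is not None   # False
```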
This equivalence is one of the fundamental findings of CS, and exposure to this concept is pretty much mandatory for acquiring a degree in the field. Sadly, this perspective is not often shared in bootcamps or among autodidacts, even though it's moderately documented in https://en.wikipedia.org/wiki/Regular_expression#Deciding_eq...
But the more mindblowing aspect is that you can use nondeterministic Finite Automata for the same purpose.
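A small sketch of what that looks like in practice: instead of backtracking, keep the set of all states the NFA could be in and advance the whole set one character at a time. The NFA here is the classic one for `(a|b)*abb`; the table layout and names are my own choices, and this is the same general idea the Russ Cox article linked above builds on.

```python
import re

# NFA for (a|b)*abb. State 0 loops on both letters and also "guesses"
# that an 'a' might start the final "abb".
NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START, ACCEPTING = 0, {3}


def nfa_accepts(text: str) -> bool:
    """Simulate the NFA by tracking every state it could currently be in."""
    current = {START}
    for ch in text:
        current = {nxt for state in current for nxt in NFA.get((state, ch), set())}
        if not current:
            return False
    return bool(current & ACCEPTING)


REFERENCE = re.compile(r"(?:a|b)*abb")
samples = ["abb", "aabb", "babb", "abba", "ab", ""]
same_answers = all(
    nfa_accepts(s) == (REFERENCE.fullmatch(s) is not None) for s in samples
)
```

Each input character is processed once against at most four states, so the running time is linear in the input, no matter how ambiguous the expression is.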
Edit: I don’t understand the downvotes though.
Just like "falsehoods programmers believe about websites" wouldn't make sense if you were using Wix...
Putting limited effort into creating a mediocre search feature doesn't mean that you believe these falsehoods; it just means that you're too resource constrained to put serious investment into creating and improving a high quality search feature.