The main challenge in UI design is balancing two competing goals:
- Allowing the user to perform desired operations in as few steps as possible
- Not cluttering the screen with all kinds of irrelevant controls

These constraints are often at odds: every time you hide a feature away in a menu, you add steps the user has to perform to achieve what they want. On the other hand, presenting every single feature on the screen all the time just clutters the UI and leaves the user confused.

It's no good to let the user decide what goes on the UI either: you've just burdened the user with the job of creating a UI for your application.
Other constraints:
- Make sure all features are "discoverable" by just casually playing around with the UI.

This rules out the idea of "hiding" things until the user triggers some kind of magic incantation.
Command palettes and shortcuts are great ways of enabling power users to perform operations quickly, but the product must have other ways of exposing features that are predictable and boring. Menus are a great way to do this. A user casually playing around with buttons and menus will eventually discover features, and they will also eventually learn the shortcuts for them.
> The main challenge in UI design is balancing between:
> - Allowing the user to perform desired operations in as few steps as possible
> - Without cluttering the screen with all kinds of irrelevant controls
Interestingly, I just watched a video by David Khourshid (founder of stately.ai; xstate/state chart/state machine fame) on how LLMs can interface with state charts of your application logic to find the shortest path to accomplishing a goal.
It doesn't seem like too far of a stretch, if your application is designed to support this use case, that an LLM can convert intent to shortest path for a user. I'm very excited about the possibilities here personally.
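As an illustration of that idea, here is a minimal sketch (all names and the chart itself are invented, not xstate's API): once application logic is modeled as a state chart, "convert intent to shortest path" is just a breadth-first search from the current state to the state the user wants to reach.

```python
from collections import deque

# Hypothetical state chart for a document editor, modeled as a plain
# dict: state -> {event: next_state}. An LLM (or any planner) that maps
# user intent to a target state can then find the shortest event
# sequence with an ordinary BFS.
CHART = {
    "idle":       {"open_doc": "editing"},
    "editing":    {"select_text": "selected", "save": "idle"},
    "selected":   {"bold": "editing", "comment": "commenting"},
    "commenting": {"submit": "editing"},
}

def shortest_path(chart, start, goal):
    """Return the shortest list of events leading from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, events = queue.popleft()
        if state == goal:
            return events
        for event, nxt in chart.get(state, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, events + [event]))
    return None  # goal unreachable

print(shortest_path(CHART, "idle", "commenting"))
# -> ['open_doc', 'select_text', 'comment']
```

The LLM's only job in this scheme is translating free-form intent ("I want to leave a comment") into a goal state; the deterministic search guarantees the resulting step sequence is actually valid in the application.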
The great thing about menus and similar “universal” components is that the user only needs to learn one (or a very few) ways for how to discover things. One problem with modern UIs is that each app looks different and/or is laid out differently, and that there are a multitude of different-looking navigation elements and popup-menu buttons and/or sliding gestures. And of course that it’s often not clear what is just a label or graphic indication vs. a clickable button.
The lack of clear distinction between different widget types (buttons, labels, input boxes) is one of my biggest pet peeves with "modern" UI design trends.
Desktop app complexity had grown over a decade so that Microsoft Word, which started with a simple single-row toolbar, now was taking over 30% of a typical PC screen with a sea of icons and controls. Meanwhile the growth in PC sales meant there was an influx of non-corporate users who were not being trained to use the apps.
That’s how we got Clippy (essentially similar to the text input box proposed here), “progressive” menus that hide advanced actions and features the user hasn’t been accessing, as well as the explosion of task-oriented panels in Windows XP.
The execution of these ideas left a lot to be desired. (The Office 2007 Ribbon UI was actually a meaningful step forward because it contextualized and prioritized the UI rather than hiding stuff. Microsoft has solid data showing that the Ribbon improved the user experience.)
Maybe language model AI can give this kind of UI a new lease of life.
> The Office 2007 Ribbon UI was actually a meaningful step forward because it contextualized and prioritized the UI rather than hiding stuff. Microsoft has solid data showing that the Ribbon improved the user experience.
It’s great for learning. But the different panels and varied icons make it annoying to scan and find things…
compared to the traditional nested toolbar with consistent sized icons and labels
On the flip side, the variety of shapes and layouts makes it easier to recognize a feature you’ve used before.
Humans need some variety to make sense of their environments. Imagine if every tool in a toolbox was the same shape and size and color, or if every room and hallway in a building had the same plan and decor. It would be a lot harder to find what you’re looking for.
Icons/graphical selectors are difficult for me personally, but I don't object to them as I understand they are helpful to most people. I'd love to see brief textual labels, always displayed in the same order and disabled when they don't apply. Make the icons optional so I don't have to take up space with them.
This puts everything in the same place, every time I want to use it. Under CUA, it was a basic tenet - don't move things around or hide them. Menus used to provide this, now nothing really does - everything wants to subtly (or not so subtly) shift the controls you are provided based on context, which means it always requires some mental processing until/unless you learn state+position.
The exception to this (for me) is the context menu, which you will assume by default only presents contextually relevant options.
I've struggled to enumerate the spectrum from one to the other as I've worked through the impact of GPT on my work over the past year. What the author describes here is a great attempt at getting to the core concepts:
* mixing and matching aspects of GUI and CLI in a single unified interface
The key, to me, is that when designs combine elements of recognition vs. recall, those designs need to link the CLI and GUI in bi-directional ways, such that direct manipulation of GUI objects impacts model queries with near-zero latency in the CLI, while query and manipulation of the models in the CLI shows changes in the GUI "at once". Direct manipulation matters in both directions owing to perception.
Of course, I'm probably describing ancient history in terms of Logo or Borland's early attempts at bi-directional tooling, but I think a tip of the hat and a little retro-computing software is not a bad way to approach the problem.
> the variety of shapes and layouts makes it easier to recognize a feature you’ve used before.
It has the exact opposite effect for me. It makes the whole thing harder to navigate and even more difficult to find the exact thing you're looking for. It increases the cognitive load required to do anything except the most common actions significantly.
If I'm typing in a doc, then Ctrl-? and typing the name/keyword for the feature I want is much easier than navigating through the zoo. The zoo is fine when I don't know what I'm looking for and can browse the pictures for inspiration.
I just do not understand the love for the ribbon. It's a junk drawer of icons, and if you don't already have an idea of what your feature "looks like" and roughly where it lives spatially, it's painful to use.
It just feels like a visual representation of keyboard shortcuts. Spend enough time in an application and you learn how to get to stuff efficiently, but it doesn't help you if you just want the basics.
> Microsoft has solid data showing that Ribbon improved the user experience
Do they also have data filtered to show power users? Just because metrics go up doesn't really mean it got better. I'm pretty sure Google also has metrics "showing" that it got better at finding stuff, which I really can't agree with. My MS Office literacy was much higher pre-ribbon, and it never caught up.
It’s a trade-off. The Ribbon demonstrably enabled average users to discover and access functionality that they never knew about or understood before. The numbers were very significant. (I think I read about it in Steven Sinofsky’s newsletter called Hardcore Software. He was in charge of Office back then.)
Power users by definition tend to invest in learning the software. They’ll be initially annoyed at changes, then adapt. And in the era of continuously delivered web apps, there really isn’t any UI stability left for a lot of the widely used software (sadly).
OS X had OK to the right, Windows had OK to the left. Now Apple has removed OK buttons in favour of writing the action on the button, which makes much more sense. Inspired by BeOS maybe?
So instead of "Do you want to save before you quit?" "OK / Cancel" it is now "Do you want to save changes?" "Cancel / Save"
Different operating systems have different conventions.
Windows centre- or end-aligns all these buttons (depending on the context), with Cancel last (generally mapping to having the primary action first). That has been its convention since at least the start of Win32 era. If you’re targeting Windows, this is probably what you should do.
Another platform end-aligns all these buttons, with the primary action last.
Still another platform stretches buttons, with the primary action last.
Others have cause to split things, start-aligning any cancel/negative button, and end-aligning others.
If you’re targeting a particular platform, you should generally aim to match its conventions unless you have very good reason not to.
With the way Windows has always laid things out and designed its buttons, its ordering is, I think, objectively at least mildly superior, following reading order and with no major stylistic difference between the buttons. Platforms that have switched to doing things other ways have normally (and more recently I think it’s consistently) been laying things out differently and/or designing their buttons differently. On mobile devices especially with their right-hander bias, limited maximum width and screen-based navigation, primary action to the right is objectively superior, for tactile and UI flow reasons.
Not sure. When I wrote it the other way I was thinking windows did it the way I said (I only use windows at work). Turns out they do it how I think is backwards, but looking at it I don't think it's a big deal.
There isn’t one standard set of guidelines. Platforms evolve, new ones come along, and guidelines will always contradict each other in weird ways (like this button positioning).
I have a HP wide format printer where after loading in a roll of paper, it presents a "Paper loaded successfully" dialog, with an "Ok" button to dismiss the message. The fun part is that the dialog also dismisses automatically in a couple of seconds and the UI underneath has the "Unroll paper" button in the exact same position where "Ok" used to be. I must have accidentally unrolled the paper immediately after loading it in dozens of times by now.
The only justification in my mind that makes sense for having it one way and then the next screen is the opposite is in the case of destructive actions (e.g. formatting a drive). It can help prevent users on autopilot from nuking things.
The only example of this that comes to mind is factory resetting a Nintendo Switch, but I'm sure there are much more.
If you're worried about autopilot, the button should move around so that autopilot clicks have no effect, not change meaning. Or require text input to confirm an action.
Around the time they released that I happened to attend a talk from someone on the team that came up with it, presenting their "solid data". It's so long ago that I can't remember the details but I definitely wasn't the only person in the room thinking they were doing some extreme cherry-picking to make that case.
Haven't seen the data, but I assume it didn't get better for all users (that's kind of impossible in most scenarios?), just better on average.
I personally like the ribbon concept and feel it was a good improvement (because I could see the long-term benefits), even if it meant I had to re-learn significantly.
That data would be interesting to break down beyond the average: if most people are not pro users, the mean may increase significantly while the median may not.
> “Make the easy things easy and the hard things possible”
Unfortunately, that's not the principle in action. Modern UI/UX philosophy is more of "make the easy things easy, and make the hard things out of scope of the product".
Unfortunately, keyboard usage became harder, to the detriment of power users. Personally I also find the classic menus much easier to scan visually than the ribbon icons which have varying sizes and are laid out in two dimensions (no clear scanning path).
Then why did they make vimperator-like controls for the ribbon? Press Alt to display the tags, then enter the corresponding tag. This works in Windows Explorer too.
It's a pretty useful feature, clearly aimed at power users. It makes the ribbon both discoverable and productive, in contrast to most other controls, which are typically one or the other.
They also clearly made large steps towards keyboard-only controls. In Windows 8 and contemporary Office versions, they bound most settings (previously only accessible through GUI) to PowerShell objects. In modern Windows/Office builds, you can control practically everything from the console or keyboard, or script it - something that was traditionally missing in the Windows ecosystem.
And of course they vastly expanded the PowerToys. I just don't see them ignoring power users - they made great improvements for the advanced Windows and Office usage.
The ribbon keyboard navigation is significantly slower to use though than the classic menus, because you need to release the Alt key first, and also many sequences are longer. With the classic menus, many commands were like Alt+B+C where you could press those keys almost simultaneously like a chord, in a fraction of a second. The new keyboard shortcuts always feel clumsy and awkward to press in contrast, and they don’t build muscle memory well for that reason.
You don't need to release the Alt key. Not sure I understand any other points either, as "clumsy and awkward" sounds really opinionated. Ribbon tags are just ordinary vi-style modal shortcuts that were proven to work well before Microsoft even existed - you memorize mnemonics/syllables, and utilize your touch typing skill with them. Unlike Win32-style menu navigation shortcuts, they're mostly on the home row, and/or represent a mnemonic.
For the speed, you should really use the traditional non-modal shortcuts (which never went away), not Win32 menu navigation. However this question is discussed to death in vim vs emacs debates - the minuscule speed difference is irrelevant for one-time actions, what matters is the cognitive load and muscle memory.
The valid use case for non-modal shortcuts (as opposed to combinations) is manually repeating the same command multiple times over and over, but in text editors this is the strong anti-pattern, and in Word you can use F4 for that, which is much more convenient and can be done just by holding the key.
You do need to release the Alt key because the old menu access keys still work (invisibly), and are activated otherwise if you don’t release the Alt key. I think part of the subjective clumsiness is because the ribbon has higher latency than the classic menu had. This is a general problem with newer UI elements in Office. They sometimes have such latency that they don’t register key presses if you press them too quickly. This means you consciously have to insert a small delay between key presses, or wait for visual feedback before pressing the next key. This is something that wasn’t an issue 20 years ago.
> You do need to release the Alt key because the old menu access keys still work (invisibly), and are activated otherwise if you don’t release the Alt key.
This is a backwards compatibility feature overridden by Ribbon keys, isn't it? When you hit Alt+I (Insert menu in Word 2003) you get a specific pop-up warning about that (in Word 2019 you do), and can use the invisible old menu relying on the old muscle memory. But when you use an existing Ribbon combination, it takes precedence. For example, hitting Alt+W without releasing Alt calls the View tab on the ribbon instead of the old Window menu, so you don't issue two different commands.
The reason tags feel like having higher latency is because the tag appears either on release or after ~500ms of holding. But there's no input delay, it's purely visual. If you memorized it, you just touch type the combination at any speed.
Classic Win32 and Office menus are prone to actual input delays, though, if implemented incorrectly (e.g. starting slow operations on opening the menu).
> That’s how we got Clippy (essentially similar to the text input box proposed here), “progressive” menus that hide advanced actions and features the user hasn’t been accessing, as well as the explosion of task-oriented panels in Windows XP.
You're implying those were bad decisions, but could you also explain what exactly was bad about them?
I think Clippy legitimately was a bad idea: it was a clumsy attempt at anthropomorphising the computer, and it also came across as cartoonish and game-y: effectively, it used UI patterns from a video game in a context where users wanted to get serious work done. (Interestingly, it also foreshadowed a lot of later trends in tech: Clippy was asynchronous, interrupting the user with notifications even if the user never did anything; and it foreshadowed the ability of programs to actually analyze content and not just be dumb typewriters, with the infamous "it looks like you are writing a letter" message. I guess tech learned its lesson from the backlash, namely that it needs to be more covert and subtle about those abilities...)
However, the idea to show possible actions in the context where they are useful always seemed like the most straightforward thing to me. Maybe I'm biased as a programmer, but it seems natural to structure a UI in an "object-oriented" manner, where the user is presented with certain types of "entities" and can manipulate them using certain actions: e.g., if you have a file, you can copy, delete or move it; you can also launch a program to edit it. You can do the same with a folder, but you can additionally navigate into the folder, etc.
The same pattern applies in a lot of programs designed for creating media: the entities would be text selections in word processors, shapes and layers in image editors, 3D objects in 3D modelers, etc.
That was the basic idea behind the task panels in WinXP I believe. So what was bad about them?
> and it also came across as cartoonish and game-y: effectively, it used UI patterns from a video game in a context where users wanted to get serious work done.
Hey hey hey. To an enthusiastic 12 year old like me, Clippy was pure gold! I could spend hours having fun with that guy. The doggie version was cool too.
UI is no longer user-driven, as in giving the user the best possible experience. It is business-model driven, i.e. utilizing dark patterns ("Yes" | "Ask again later" dialogs), data/telemetry collection for advertising, showing "pro" features pretending to be available just to tell you they require upgrading when you try to use them, etc.
I really like the site owner's "digital garden" metaphor. One of the problems with social media is they've constrained personal websites to a standard format, and this has molded creativity around what a personal website should look like.
For example, if people were somehow forced to make their own sites, they would find that the content they have to share is not quite a blog, not quite a portfolio, not quite a photo gallery, etc., sort of like GeoCities in the '90s. It's actually a hard problem.
In general, I really enjoy personal websites because they show different people's attempts at solving this problem, and the digital "garden concept" is a cool metaphor.
I identify as a strong proponent of function over form, but I have to say, that is one pretty website. The web used to be littered with design-heavy sites, although few worked as well as this does. Either way, I wish there were more of them these days.
VS code (developer interface) has this feature where you activate a text input with a hotkey. It is used to trigger specific actions. You can type text and it will give you the actions that match the text. You also see a list of the actions you triggered that way previously.
This is great because I don't have to remember how to trigger a specific action and I can also discover new actions just by searching.
Ultimately it doesn't change the UI, but it sounds quite similar to what the author describes. There is some potential there with a search bar that is more about intent than text matching.
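A minimal sketch of that kind of palette (names are invented, not VS Code's actual API): substring matching over action titles, with previously triggered actions ranked first by recency.

```python
class Palette:
    """Toy command palette: text search plus most-recently-used ranking."""

    def __init__(self, actions):
        self.actions = list(actions)   # all available action titles
        self.history = []              # most recently used first

    def run(self, title):
        # Move the executed action to the front of the history.
        if title in self.history:
            self.history.remove(title)
        self.history.insert(0, title)

    def query(self, text):
        # Case-insensitive substring match over action titles.
        hits = [a for a in self.actions if text.lower() in a.lower()]

        # Rank: previously used actions first, in recency order;
        # never-used actions keep their original relative order.
        def rank(a):
            return self.history.index(a) if a in self.history else len(self.history)

        return sorted(hits, key=rank)

p = Palette(["Format Document", "Format Selection", "Fold All"])
p.run("Format Selection")
print(p.query("fo"))
# -> ['Format Selection', 'Format Document', 'Fold All']
```

The intent-based search the comment hints at would only change the matching step (e.g. matching against verbose descriptions or embeddings instead of titles); the discoverability benefit comes from the same list-and-rank structure.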
This brings back memories from my old days of VS programming (.NET C# or C++) with IntelliSense when I used to bless every object instance with Ctrl+Space.
In the early 2000s Microsoft did a lot of research into "Inductive User Interfaces". [1] The closest approximation still around would be Wizard interfaces.
Creating those UIs was so rewarding, as you really had to focus on the goals of the user and figure out the branching that could get the right user to the action or activity that they needed to perform.
My first "startup" was a tool for life insurance agents to recommend products to their prospective customers. IUIs were perfect for scenarios like that. You could abstract away a lot of complexity.
I don't think you need AI for this, VSCode and lots of other applications already have a "functionality search bar" built right into the program. (Often euphemized as a command palette).
I myself don't use them much but I can see where they are useful to the "new users" the article speaks of, to discover and unlock features as the user gets more comfortable with the program.
I've long since taken to preferentially using the search bar instead of bothering to learn the location of buttons e.g. in the Office ribbon menus. It's a true blessing.
It might just be me, but as I get older I'm finding it harder and harder to navigate UIs, even though I've been an early adopter of technology throughout my life. I wonder if it is a sign of cognitive decline or just the consequence of having to remember an increasing amount of perpetually changing UIs.
Huh? No way I'd go back to the 90s for UIs. I'm not saying our modern UIs are perfect, and there are a lot of changes made for fashion reasons, but they are generally far better these days on the whole.
Well, either you were there in the 90s and had a very different experience than I did, or you weren't there in the 90s and are romanticizing what you have read about UIs back then. I mean, you can strip out a vast amount of functionality from a modern UI and get back to the simplicity of the 90s, but you'll hardly get anything done.
Funnily enough, the new GNOME is quite great according to the listed requirements - everything is discoverable, Fitts's law is taken into account (the corner has infinite size), but at the same time it is very good for power users: you get super+search, super+mouse wheel to change between desktops, and best of all, really smooth gestures on laptops to change between neighboring virtual desktops.
If you add that the whole thing is extensible, I really feel like all its criticism comes from “it’s different than what I used”, if not more malicious views.
Wow, I disagree with you on almost every point! :) Which is not a bad thing, people naturally have different perspectives after all. I tried to like GNOME for literally years, but as a power user found that it got in my way more often than not:
* Features that I got accustomed to and used on a daily basis got removed in order to clean up some module or another, under the justification that "nobody used it anyway." Well, nobody asked me apparently? Requests to bring these features back were politely declined or ignored.
* GNOME seemed to go all-in on a "hybrid computing" experience because the developers believed that the future was portable devices that you could both carry around with you and then dock to a keyboard and screen at work or home. That never materialized (or hasn't materialized yet) so now a lot of GNOME functionality is stuck _between_ these two worlds and doesn't mesh cleanly with either.
* The GNOME devs' answer to functionality not invented here is to just make an extension for it. Except that installing extensions is (or was?) a burdensome affair, and GNOME breaks the extension API often enough that you might not be able to upgrade your GNOME if the extensions you use haven't been updated by their maintainers.
* As a desktop user, I want to be able to resize windows. But, there are no longer any window borders. Instead, there is an area that you can grab with the mouse to resize the window. But it's invisible, you can't _see_ it unless you hover your mouse over it. And it's on the _inside_ of the window, meaning it actually covers up application controls. Thus, you can place your mouse cursor directly over a scroll bar, click, start dragging, and find yourself resizing the window instead of scrolling the document.
* Don't get me started on the thicc window title bars.
* I prefer focus-follows-mouse and this is just super-duper broken in GNOME, in a number of ways.
I think software developers would do well to get away from the idea that their one project can possibly be suitable for all users. We clearly have users with different goals and levels of experience and it's not a bad thing to more narrowly target those users.
Your first point is literally what I mentioned, “it’s not what I’m used to”. Gnome got completely revamped, and that in itself should not be a criticism of the new thing itself.
Gnome did take into account this hybrid way, but as I mentioned didn’t compromise on anything for that. It’s perfectly usable on both desktop and laptop.
Gnome is an opinionated desktop environment, and the more customization you allow, the bigger the surface area of your software is. I think gnome struck a good balance between being customizable to a degree, but not making their unpaid job all that harder. I use vanilla gnome, for what it's worth.
As a power user you likely have your hand on the keyboard, so a resize is just super+right mouse, which has a huge target area of almost the whole window. Same for moving.
I have tried focus follows many years ago, but can’t say anything about its current state.
Was it really peak UI? Software is so much more complex today and accessibility and UX is actually a thing now. I feel like if you tried to make a modern program with a 90s UI approach, it would have glaring flaws.
I don't have that much trouble with modern UIs, but that's because of accumulated generalized experience with UIs and programming, plus modern UIs are simpler by virtue of most new software being perpetual MVPs / toys. However, I've also noticed I have much less patience for "UI innovation".
When I was younger, I was under less pressure and could blow some time having fun with new software. These days, I have very little free time, and usually have a specific job or task to do for the new software - the UI is effectively standing between me and my goal, so the more I have to learn about it, the more irritated I get.
It's always irritation and never hopelessness, because I have 100% confidence I can figure the UI out given enough time - I just hate having to spend that time.
Note: that's not the same as "simple UIs == better UIs". Complex and powerful UIs with good manuals are the best for me. After all, if I used some software once, there's a very good chance I'll need to use it again, and again, and then some, probably in one or few sessions. Whatever time and frustration simple UIs and toy software save on first use, come back tenfold when I realize the UI is so dumb it doesn't have any way to batch repetitive operations.
Slightly differing UIs are not interesting enough for your brain to activate problem solving and learning. But they are different enough for you to get slightly confused in minor ways.
I think that’s why some people streamline their UIs as much as possible: vim/emacs/terminal everything, same bindings everywhere etc. all optimized for muscle memory and comfort.
Part of Engelbart's great demo was about how they'd designed different kinds of interfaces for adults and children. Adults would have sequences of (chorded keyboard) commands that they could learn to do work rapidly. Children would have UI like what is ubiquitous today: it could be explored and was discoverable.
I never knew what research that was based on, but it sounds like you'd prefer the grown-up experience that most companies just don't make. The closest we see are hotkeys for things like Office or our IDEs, but those can lack consistency from one version to the next and certainly from one piece of software to the next.
Not only do I need to guess which of the 3 horizontal bars is the resizable one, it's literally a single pixel in height and very hard to click.
Flat design is such a regression in terms of UX. It's easy to forget how intuitive the Windows 95 style was, largely thanks to its reliance on universally familiar spatial metaphors.
That's ignoring the fact that none of the icons have labels so you have to learn the Mayan alphabet to figure out what they do (and these icons change sometimes too!)
It also seems like preference dialogs have just sort of given up any pretense of trying to make sense. Instead of navigating a hierarchy, I'm supposed to guess which keywords to search for in order to find the correct option I want to change. It's a real downgrade. The affordances of search boxes are notoriously ambiguous for this exact type of task. I want to change the pixel scaling on the screen. What do I search for? Screen? Display? Resolution?
>> That's ignoring the fact that none of the icons have labels so you have to learn the Mayan alphabet to figure out what they do...
Remember when the file menu was removed and you couldn't figure out how to save your work? Oh, it's that decorative symbol looking thing in the corner above everything else. And when you click it, there isn't a menu at all, but a huge control panel thing appears.
Every user that creates something wants to save it. This is not where you put the save command.
Amen. Obsidian is a good example of what you are talking about. So many different icons I have to remember plus the way to search my notes involves clicking on what appears to be a text label (Files) to reveal the search icon.
Wow, that is a really good example of how bad things have gotten. That software is both confusing and very ugly.
One thing I've thought about is some kind of "Quick tooltip", like a toggle to make the UI always show tooltips instantly on hover. That could be an accessibility feature built into the OS.
Dark themes have a habit of being bad at this, the way that in your screenshot there’s no distinction between panel title and panel contents, and intra- and inter-panel borders are all the same. You’ve just generally got fewer grades to practically work with than in light themes, even if you don’t mess things up through a badly misguided sense of æsthetics (which they have).
A light theme would be much more likely (though certainly not guaranteed, since the flat craze began) to have a light grey background on the panel title, a white background on the panel contents, and no border between the two (or at least a lighter grey than the inter-panel border), so that the only border (or at least the strongest border) would be above the panel (that is, above the panel title), making things much clearer. (The panel toolbar would also tend to get a light grey background, similar to the panel title’s or between that of the panel title and panel contents.)
Nice article. An alternative, completely unrelated to AI/GPT, is the basic use of a command bar/palette. It provides a lot of new ways to approach UX.
Yeah, I definitely think the biggest productivity gains I've had in my entire life were the command bar in VSCode and the integrated start menu search in Windows 7. Keep the UI simple, but searchable, even better if I never have to take my hands off the keyboard to find and execute a command.
On a somewhat related note, does anyone else find scrolling on the page really laggy? Not unusable, but definitely noticeable. I'm on desktop Firefox.
The menubar being standardized across most apps, which enables that search function, customization of the key shortcuts for any menu item in any app, and enables the creation of third party apps that present menu items in a different way[0] is one of the most underrated features of macOS IMO.
It’s the Wild West when it comes to menubars on Windows and Linux, with there being more ways to implement a menubar than can be counted (if the app in question even has one and isn’t using one of those terrible hamburger menus instead). There’s no hope of fixing this under Windows but for Linux, I could see a new XDG spec that standardizes how menubars are exposed which would enable macOS/Unity style menu searching without apps being patched to work with a single DE.
I've always found it weird that a "function search" hasn't become more mainstream.
It's easy to implement without AI. Just make a big list of your functions, describe them verbosely, and give them a link - now let the user search for what they want to do and send them there.
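That recipe can be sketched in a few lines. This is a minimal, AI-free illustration of the idea — a list of features with verbose descriptions and links, ranked by plain keyword overlap. All feature names, descriptions, and links here are made up for illustration, not from any real product:

```python
# Hypothetical feature catalog: each entry has a verbose description
# and a link to the relevant place in the UI or docs.
FEATURES = [
    {"name": "Fill color", "desc": "paint color a shape fill bucket", "link": "/help/fill"},
    {"name": "Crop image", "desc": "crop cut trim resize an image", "link": "/help/crop"},
    {"name": "Export PDF", "desc": "export save document as pdf file", "link": "/help/export-pdf"},
]

def search_features(query):
    """Rank features by how many query words appear in name + description."""
    words = query.lower().split()
    scored = []
    for f in FEATURES:
        haystack = (f["name"] + " " + f["desc"]).lower()
        score = sum(1 for w in words if w in haystack)
        if score:
            scored.append((score, f))
    scored.sort(key=lambda s: -s[0])  # best match first
    return [f for _, f in scored]

print(search_features("color a shape")[0]["link"])  # → /help/fill
```

A real version would use stemming or fuzzy matching, but even this naive substring scoring gets "what do you want to do?" search most of the way there.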
I remember Rhino3d having this 10 years ago, and it was amazing, while all kinds of DAWs, editing programs, desks, etc. were a jungle to navigate or just had "docs" or "help PDFs": such a waste of time to look through.
In general this kind of navigation has been overlooked for decades.
Current UI designs have business constraints that just can't allow for much change. The user is not the customer; the user can't be bothered to invest time to learn, as there are multiple apps competing for their attention; if the user is familiar with some UI design, you can't give them a completely different design regardless of how much better it is; and so on.
In spaces where the business considerations are different (aviation, medical devices, mature open source projects, etc) the UI is pretty good for the target users.
A sidebar to this: abusing "easy" to mean "requiring less energy" to operate. An increase in stress equates to more energy used. To wit: being forced to find differences in UI functionality across mobile and web for the same app. Or clicking on a control that just jumped location because of a slow page render. Or gratuitous (of debatable value) UI changes between Windows versions.
OG get-off-my-lawn stuff: when I first started writing code back in 1989, I worked on a TUI desktop app that was extremely refined to facilitate data entry from a piece of paper. Sort of like a microscope splitting your vision between paper and instrument. Everything about the TUI was custom and optimized via tons of user feedback. By far, that app still represents an example of the least energy required to get something valuable done on a desktop. Yes, a specific use case to a large degree, but the TUI framework we built suggested ease of use for TUIs in general.
Redoing the app in Windows killed the experience to a noticeable degree. The point is, I suggest that generalized UI tool constraints will force the user to expend more energy, in some way, at some point. An obvious point, but there is no perfect UI solution, especially in today's world. Only trade-offs.
I like where they're going, but I think the model is too simple. There's no "a user". Software is a response to populations of users.
An MVP is always focused on a very specific audience, the early adopters for it. It can be minimal because it's for a group of highly motivated people eager to solve a specific problem. They tend to be both experts (in the specific problem space) and explorers (in their willingness to try things out).
By the time a product has spent a few years growing, it is often addressing the needs of a much broader set of users, which presents a much bigger UI challenge. But it's not just a UI challenge. Take film/video editing, for example. The original film editors were all using physical gear to splice actual film together. [1] The initial target for computer-based video editors was those same expert film editors. One of those editors who made the switch pointed out that people knew they didn't know how to use his complicated set of physical tools, but assumed that since it was on a computer, they too could edit film. [2]
That wasn't the case, though. If you want to go from addressing the population of "expert film editor" to "person who wants to become an expert film editor", it's a radically different problem. A very valid solution to that is keeping the interface as is and saying "go take a class".
Then you also have populations like "person who doesn't want to become an expert editor but wants to share a tolerable video of Bobby's first birthday". That's an argument not for a modal interface, but for a tool built for that audience and no other. Maybe it's a cut-down version of the expert one. But just as likely as it's something that does not work with a pro editor's mental model at all, but with that of the existing user.
I'm working on this a lot right now, I have long tinkered on my personal productivity and data interface and I'm trying to make it accessible to others. It basically puts an end to context switching and deletes that cost from 80%+ of my workflows. It's a "no" first interface so it really removes clutter and cognitive overhead.
I think with NLP and AI guides we really need to change our unifying OS interface away from "ad hoc work tools categorized by icons" and move to interfaces categorized by intent or by the user's data structure (one that maps to their problem/intent/world as it grows and develops). So much lexical work has been done, and so many concepts have been propagated via GUIs, that I think composable lexical interfaces should return, with visual and NLP aids.
The ideas are interesting, the illustrations are highly distracting.
> If the software is at a certain level of complexity, new users will only learn parts of it or not use it at all.
Talking about complexity next to the cockpit of a plane that mandates pre-training, an exam leading to the acquisition of a license, and a mandatory minimum number of flight hours per month to keep the right to fly alone isn't the best illustration of "new users will only learn parts of it".
And it goes on: for every bullet point, there's a plane cockpit next to it...
But yes, I totally agree that more thought needs to be given to the learning curve and discoverability of features of other non life critical, non regulated software.
I feel like you could strip down a UI, stick a language model on a help menu and you'd be most of the way there with this already.
I have a minor hesitation here. The user inputs an end goal, and the program is then supposed to connect the user with the tool to complete it. But solutions often have different ways of being reached, with tradeoffs associated with each method.
Also, I wouldn't underestimate the ways in which, for some types of workers, the affordances of tools are part of the creative process. There's a way in which a product will be less thought through when the journey from conception to completion is cut short.
Amusingly, for a long time, emacs has effectively been this, but with a fuzzy search tool. Very common to just bring up the command minibar and start typing words looking for relevant things.
Going back further, many old machines were like this. They all had some form of "apropos" command that we seem to have completely forgotten about as things got bigger. Indeed, the idea of man pages and the like seemed to get skipped with DOS, such that Windows also didn't really have it ingrained heavily. The docs were almost certainly there, but they were often specific in formatting to each program? (Or is my memory just off?)
We're[1] currently working on a project where we're taking a very similar approach. Without sharing too many details: we're designing the product for a startup where the UI changes (GPT-powered) based on the user's intent. Now the challenge, obviously, is to find the sweet spot where a "user-driven UI" becomes a natural, intuitive way of using the product. There aren't many good examples out there that do this, but it's certainly fun to explore.
> we're designing the product for a startup where the UI changes (GPT powered) based on the user's intent.
This might just be me, but one of my pet peeves in modern UI design is when the UI changes automatically based on context. Whatever the UI is, I want for it to be that way until/unless I specifically ask for it to be different.
For example, control panels that are context-oriented are OK, but changing the panel automatically depending on detected context is not.
Really good article! I like how it framed simplicity and complexity of apps, and how users want to stay in a certain zone.
I know the example was trite but future UIs will just allow the user to type or speak what they're trying to do. For example coloring in the circle/triangle will literally require you to say just that.
Whenever I see how people are cropping, masking, and adjusting an image, it already feels so outdated with what AI feels like it will be able to do based on text/voice input.
The "can we help" toast sliding down at a glacial pace when dismissed made me laugh out loud. But I just gave up at the Terms and Conditions scroll speed.
I don't think products necessarily accrete features because the existing users need them to do more "stuff".
Additional features tend to target new/different subsets of users in an attempt to increase the size of the user base (and thus value to the creators).
So how do you choose the "MVP" features to expose to each subset of users?
Great article, but it's somewhat ironic that an article about UI has such a laggy, JS/CSS-heavy UI. Why on earth does this require React? A basic HTML page with minor CSS polishing would do the job.
Kept looking for LLM references in the 'what do you want to do' part and was surprised to not find any. Absolutely perfect place to put tool docs in the context and let users ask the LLM how to do what they want; bonus points for fine tuning the model on the docs and use scenarios.
"There would also be a text input. If the user couldn't do something with the UI, they would enter what they are attempting to do. Using a natural language processor, like ChatGPT. The software would return a list of features that could help. Each has a button to add them to the UI. Going back to our painting app. Imagine you wanted to add color to a shape. With the current UI, you can't. Enter "Color a shape" into the text input, the paint bucket tool will become available to add to the UI."
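The mechanism the article describes could look something like the sketch below: put the tool docs in the prompt, ask the model which hidden features match the user's intent, and offer those as "add to UI" buttons. The model call itself is stubbed out — in practice you would send `prompt` to ChatGPT or a local model — and all tool names and descriptions are invented for illustration:

```python
# Hypothetical catalog of features not currently shown in the UI.
TOOLS = {
    "paint_bucket": "Fills a closed shape with the selected color.",
    "eyedropper": "Picks a color from the canvas.",
    "crop": "Cuts the canvas down to a selected region.",
}

def build_prompt(user_intent):
    """Embed the tool docs in the context, then state the user's goal."""
    docs = "\n".join(f"- {name}: {desc}" for name, desc in TOOLS.items())
    return (
        "Available tools:\n" + docs +
        f"\n\nUser wants to: {user_intent}\n"
        "Reply with the tool names that help, comma-separated."
    )

def parse_reply(reply):
    """Turn the model's comma-separated reply into known tool names only."""
    return [t.strip() for t in reply.split(",") if t.strip() in TOOLS]

prompt = build_prompt("Color a shape")
# Suppose the model replied with this string:
suggested = parse_reply("paint_bucket, eyedropper")
print(suggested)  # → ['paint_bucket', 'eyedropper']
```

Filtering the reply against the known tool list is the important part: it keeps a hallucinated tool name from ever producing a button that does nothing.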