My impression is that frontend tests come in two flavors: extremely hard to write to the point of impracticality, and useless. Most orgs settle for the latter and end up testing easily testable things like the final rendered dom and rely on human qa for all the hard bits like "does the page actually look like it's supposed to, does interacting with the UI elements behave correctly, and do the flows actually work."
All largely stemming from the fact that tests can't meaningfully see and interact with the page like the end user will.
> extremely hard to write to the point of impracticality, and useless
> Most orgs settle for the latter and end up testing easily testable things like the final rendered dom and rely on human qa for all the hard bits like "does the page actually look like it's supposed to
Not disagreeing with you here, but what ends up happening is the frontend works flawlessly in the browser/device being tested but has tons of bugs in the others. The best examples of this are most banking apps, corporate portals, etc. Honestly, you can get away without writing frontend tests. But the backend? That directly affects the security of the software, and there you can't afford to skip the important tests, at least.
>can't meaningfully see and interact with the page like the end user will
Isn't this a great use case for LLM tests? Have a "computer use agent" and then describe the parameters of the test as "load the page, then navigate to bar, expect foo to happen". You don't need the LLM to generate a test using puppeteer or whatever which is coupled to the specific dom, you just describe what should happen.
I got an agent to use the Windows UIA with success for a feedback loop, and it got the code from not working very well to basically done overnight. But without the MCP having good feedback and tagged/ID-ed buttons and so on, the computer use was just garbage.
Depends. Does it represent end users well enough? Does it hit the same edge cases as a million users would (especially considering poor variety of heavily post-trained models)? Does it generalize?