I'm highly skeptical of an approach that involves training users to rely on a black-box ML system. That just makes them ever more dependent on technology they can't possibly understand and puts more power in Google's hands. By being the sole arbiter of what is "tricky," Google gets to blacklist the entire Internet.
It would be better to help users understand the URL. I don't mean expecting users to parse the syntax on sight; I mean finding ways to display or represent it so that the important information is easier to see and fraud is easier to spot.
[https] [www] [example.net] [foo.html]
It could go even further and obscure the contents of the first, second, and fourth boxes until you mouse over or focus them (though every box should still get a light red background for http and a light green one for EV, even when its text is hidden), and the last box should sit well apart from the one before it, to avoid e.g.:
[https] [www.example.net] [example.org] [foo.html]
[https] [www] [example.org] [www.example.net/foo.html]
Clicking on any box (or the regular Ctrl+L) could merge them back into a single bar for easy URI copying, and defocusing would split it again. Power users could flip a setting to simply always display the one bar they've been looking at for the last 25+ years.
Maybe there could even be a conditional fifth box for the query parameters (GET variables), hidden by default until the input area is focused, who knows.
[https] [news] [ycombinator.com] [reply] [id=19032043&goto=item%3Fid%3D19031237%2319032043]
https://example.com.phishing.com/foo.html -> [https] [phishing.com] [example.com] [foo.html]
[https] [com] [phishing] [com] [example] [foo.html]
Or ditch the protocol box entirely and not render http:// at all by default.
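A minimal sketch of the split those boxes imply, in Python. `split_url` is a hypothetical name, and the "last two labels" heuristic for the registrable domain is only a stand-in: real code must consult the Public Suffix List (e.g. `example.co.uk` breaks the heuristic below).

```python
from urllib.parse import urlsplit

def split_url(url):
    """Split a URL into four 'boxes': scheme, subdomain,
    registrable domain, and path. Naively assumes the registrable
    domain is the last two host labels; a real implementation
    needs the Public Suffix List."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    registrable = ".".join(labels[-2:])
    subdomain = ".".join(labels[:-2])
    return parts.scheme, subdomain, registrable, parts.path.lstrip("/")

# The lookalike from above: the registrable-domain box exposes the attacker.
print(split_url("https://example.com.phishing.com/foo.html"))
# -> ('https', 'example.com', 'phishing.com', 'foo.html')
```

Because the registrable domain gets its own box, `example.com` ends up visibly demoted to the subdomain slot, which is exactly the cue the boxed display is meant to provide.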
A blacklist is easy to understand: as long as we trust Google (lots of us don't), everything would be fine. With ML, not even Google has a full picture of what's going on.
I don't even think that YouTube necessarily should get an autoplay prompt on first use, but it's awfully convenient that ML-based approaches like this get used instead of much simpler ones.
Lots of research is going into crafting adversarial inputs for known ML algorithms, as well. If this address-bar ML runs on the client (it'd have to, right?), then it's not hard to run a training loop against it and come up with custom-tailored URLs/sites that get the model to classify your attack as benign.
ML equals diffusion of responsibility.
An ML solution is a completeness-vs-correctness trade-off: ML can make the blacklist virtually infinitely long, whereas a human team would likely burn out (and make more, or at least different, mistakes).
I'm not the OP, but I personally don't remember any of that because I'm not an American (like a large part of the populace on the Internet) and I've never used AOL. And maybe AOL failed in America exactly because it did the things you mention, i.e. "controlling and blacklisting" a large part of the Internet.
In the 90s, while the mass AOL CD mailings were going out, there was fear that an "AOLization" of the internet would happen.
The same incentives behind AOL's curated, walled garden are present today for Google, Facebook, etc.
If Google is trying to make their own private internet on top of the public internet, I'm sure a few antitrust regulators will start asking about their hold on the search and ad markets.
Google did it with YouTube; if they do it with Chrome, I don't know if they can handle the developer frustration that will ensue (I'll put a nice red full-screen "browser incorrect" banner on my website if users visit from Chrome).
> Correction January 29, 10:30pm: This story originally stated that TrickURI uses machine learning to parse URL samples and test warnings for suspicious URLs. It has been updated to reflect that the tool instead assesses whether software displays URLs accurately and consistently.