Hacker News
Microbrowsers Are Everywhere (24ways.org)
271 points by cpeterso on Dec 17, 2019 | 54 comments

I think Google is the original inventor of link previews. It used to have them in the SERP pages if you clicked the magnifying glass. I think it was called "instant previews".

I actually created and launched an API on hacker news about 9 years ago which does this, you can check it out here: https://linkpeek.com

My service makes one request and caches the result. Using Pyramid (Python) and Nginx. Lots of the code is open sourced in various libraries.
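For readers curious what "one request, then cache" can look like, here is a minimal sketch in Python. It is not the linkpeek code: the `PreviewCache` name, the injected `fetch` callable, and the one-hour TTL are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class PreviewCache:
    """Fetch each URL at most once per TTL window and reuse the stored result.

    Hypothetical helper for illustration; `fetch` is whatever function
    actually downloads and renders the preview.
    """

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 3600.0,  # assumed default, tune to taste
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        cached = self._entries.get(url)
        # Serve from the cache while the entry is younger than the TTL.
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        # Otherwise make exactly one upstream request and remember it.
        result = self._fetch(url)
        self._entries[url] = (now, result)
        return result
```

Passing the fetcher and the clock in as arguments keeps the sketch testable without touching the network; a real service would probably also want eviction and per-URL locking.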

The problem with trying to sell this tech is that all the biggest players build their own in-house solution, so the target market is really smaller sites, but they have a hard time paying, and also there is a lot of competition in this space now.

Funny little anecdote: at a hackathon, I kept sending my team members the hosted ngrok link over WhatsApp each time I accidentally closed the tunnel. The WhatsApp microbrowser didn't store cookies and so got stuck in the redirect loop from the index (yeah, yeah, it was a hackathon). This led to my WhatsApp Web client crashing before the message was even sent. It only happened when I sent my badly programmed auto-login URL, and at no other time, so I'm quite confident it was failing aggressively when it couldn't get the information it needed and straight up crashed the web client.

That's interesting, sounds like WhatsApp Web loads the preview client-side?

Yes. To be specific, if you are using the Web based WhatsApp (or the 'Desktop' app), it will not load the preview until your phone has pinged that it's online. There are many times when I wait for the preview to load (because if you send the message without the preview, no-one else gets one either), only to have to wake my phone and voila, the preview then loads on Web WhatsApp.

Aha! I wondered why the appearance of previews was inconsistent, now I know what to do to make them appear, thanks!

This may have been a product of Erlang's 'let it crash' philosophy: https://wiki.c2.com/?LetItCrash

> It’s almost 2020, advertise one favicon size. Remove all the other <link rel="shortcut icon" and <link rel="icon" references.

Which one should we keep?

Unfortunately all of them since there is such a mess of support across browsers. SVG should be king by now but it's not.

Latest chromium does support svg favicons

Well, you don't need the 16x16, 32x32 and .ico ones.

>Finally, many platforms - particularly Facebook Messenger and Hangouts use centralized services to request the preview layout. This is in contrast to WhatsApp* and iMessage where you will see one request per user.

Wait, does this mean you can get the IP of most people you can contact on whatsapp by just sending them a link and waiting for their phone to check that link?

I can think of plenty of ways this could be abused!

You can get the IP of people on Whatsapp by video calling them and watching the SIP target addresses. They don't even have to answer the call.

You can also get their online/offline state by repeatedly requesting their client re-sends you their profile image, even if they have read receipts turned off.

No, it is the sender of the link that performs the link extraction / microbrowsing before sending it over.

This. If the link preview fails to load on the sender's device, it'll never appear for either party. The preview is sent along with the message – the receiving device never generates the preview.

If you're quick enough, you can see it in action:

Paste a link in a WhatsApp message. The preview might take about half a second to load. Hit 'send' before the preview has loaded, and it'll never show for that message.

Paste the same link again, but wait for the preview before hitting send. It'll stay attached to that message.

No. It is still the WhatsApp server making the request to your website - so you only see the server's IP.

Facebook makes one request and then caches the result. Other servers may not store the result.

[Here's how signal does it](https://hub.packtpub.com/signal-introduces-optional-link-pre...).

TL;DR, the app establishes a TLS connection directly between your phone and the site, but routes it through a proxy on Signal's server, thereby hiding your IP without being able to view the URL. I imagine WhatsApp works in a similar fashion.

Doesn't that break end-to-end encryption?

So you remember the way we used to prevent people from linking images from our websites, we'd show a different image.

That means I can do the same with `og:image`, give a nice middle finger when linking my website (hey, a bit of 90s fun here)

Something that downloads information from the web is called an "agent". A browser is a specific type of agent called a "user agent". Calling agents "microbrowsers" creates an odd circle: user agents are browsers which are agents which can be microbrowsers which are user agents, etc. Just thought it was interesting.

They cover this in the article, carefully describing a “microbrowser” as a subset of agents which are closer to full browsers:

“Microbrowsers are a class of User-Agents that also visit website links, parse HTML and generate a user experience. But unlike those traditional browsers, the HTML parsing is limited and the rendering engine is singularly focused.”

This seems like a reasonable distinction, especially since many people are used to thinking of “browser” as something which does a lot of work (parsing, JavaScript execution, etc.) that isn't implemented in this class of agent.
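To make the "limited HTML parsing" idea concrete, here is a rough Python sketch of the parsing half of such an agent. It is an illustration under assumptions, not how any particular messenger does it: the `OpenGraphParser` and `extract_preview` names are made up, and it only reads `og:` meta tags plus `<title>` as a fallback, with no JavaScript execution and no rendering.

```python
from html.parser import HTMLParser
from typing import Dict


class OpenGraphParser(HTMLParser):
    """Collect og:* meta tags and the <title> text; ignore everything else."""

    def __init__(self) -> None:
        super().__init__()
        self.properties: Dict[str, str] = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            # Some pages use name= instead of property=, so accept both.
            key = attr.get("property") or attr.get("name")
            content = attr.get("content")
            if key and key.startswith("og:") and content is not None:
                # Keep the first occurrence of each property.
                self.properties.setdefault(key, content)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_preview(html: str) -> Dict[str, str]:
    """Return the few fields a link preview typically needs."""
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.properties.get("og:title", parser.title.strip()),
        "description": parser.properties.get("og:description", ""),
        "image": parser.properties.get("og:image", ""),
    }
```

Fetching the page and drawing the card are left out on purpose; the point is only that the agent reads a handful of tags and skips the rest of the document.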

It's not so much a circle as just a newly coined term for "agent":

    user agent ~= browser

    agent ~= microbrowser

The terms on the left seem more technical/archaic, those on the right seem more populist/accessible.

On the other hand, since this one blogger seems to be the inventor of the term "microbrowser", it's still going to be ambiguous: easily confused with potential alternative definitions such as "lightweight browser", "headless browser" (e.g. selenium webdriver), "browser with low market share", etc.

I can't tell if common things getting redefined is mostly from clueless journalists or clueless tech workers that think they are inventing everything for the first time in their first year on the job.

I think there is a lot of both, but there's also so much churn and so little historical knowledge propagated through the profession that people just keep reinventing wheels because they don't know they've already been invented. Every revolution of the hamster wheel does seem to get us a few inches ahead, but refighting the same battles and redoing the same work gets very wearisome. And I'm not even that old; I don't know how the people twice my age that are still in this business can avoid being hopelessly jaded.

I keep thinking that we might see a second "information age" the day we can reliably find something in the large amount of data we have been piling up recently (is there such a thing as the "data age"?). Neural networks might be the beginning, but if we stop reinventing wheels, we could become a whole lot more productive.

Are there good documents regarding "capitalism" (I guess; constantly reinventing better wheels) vs other systems (reusing already designed wheels, however square), in domains other than financial: what the tradeoffs are, how much worse your existing wheel can be before you start to feel it, potential for innovation, etc. Free software might side with the latter, but I'd like an analytical exploration.

My favorite is how Node.js now finally has implemented most wheels. It was a rough ride for a while. Seems every language is a very effective silo, or every runtime, rather. The .NET languages stand out.

The latter, I think.

I feel this might be a good way to explain to non technical people all sorts of software is “browsing” the web for small bits of data “micro” without their specific permission. So I think it’s more helpful than harmful.

Does anybody know what to do with an SPA that, without JS, only displays "loading..." and doesn't generate any preview on any platform (WhatsApp, Telegram...)?

This is for a hobby project, so rewriting it to do server side rendering etc. is not an option.

> This is for a hobby project, so rewriting it to do server side rendering etc. is not an option.

Rewrite it to do server side rendering.

Sorry if this seems facetious, but if you want it to be accessible, it needs to be server-side rendered.

Some crawlers, notably Googlebot, have limited JS-rendering abilities, but not full support. Some crawlers may use a fully fledged headless Chrome, but latency would make this uncommon.

In general, JS-dependent SPAs have always been a terrible terrible idea for anything public facing that needs to be crawled (they're fine for web applications, not for web pages).

Sorry I didn't mention: it is a web game. Accessibility without JS is not possible.

In that case, JS-only is fine, but presumably you're talking about serving fallback content (bots/crawlers are unlikely to be playing your game)

Yea, I will try some better fillers than just "loading...", and the metatags stuff.

I also might consider serving some other non-game pages statically, like the leaderboard, user profiles, etc.

Well, SSR. Almost all indexing bots (Google, 'Microbrowsers', etc.) will require it or at minimum heavily recommend it.

I doubt you will need to rewrite it if you use React or VueJS (or any other big framework) as they have SSR support.

SSR is definitely the best bet, but because you say that's not an option, I'm assuming you're just hosting static files without a proper server. In that case, a static site generator might be an option.

For React it seems that Gatsby is very popular: https://www.gatsbyjs.org/

Depending on the complexity of your hobby project it might even be feasible to write a simple script that runs node, outputs HTML files for each of your pages, and uploads them to online storage.

You might look into GitHub Pages too, although I'm not familiar with its capabilities: https://pages.github.com/

Assuming you have some sort of control over the root index.html file, write the appropriate `meta` tags in the header there. This article is old, but has a good, illustrated view of what tags do what. https://medium.com/slack-developer-blog/everything-you-ever-...
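If the page is a static index.html, the tags can be written in by hand or with a small build step. Below is a minimal Python sketch of the latter, under assumptions: the helper names `build_meta_tags` and `inject_into_head` are hypothetical, and the tag set is limited to standard Open Graph properties such as `og:title`, `og:description`, and `og:image`.

```python
from html import escape
from typing import Dict


def build_meta_tags(properties: Dict[str, str]) -> str:
    """Render Open Graph properties as <meta> tags, one per line."""
    lines = []
    for name, content in properties.items():
        lines.append(
            '<meta property="{}" content="{}">'.format(
                escape(name, quote=True), escape(content, quote=True)
            )
        )
    return "\n".join(lines)


def inject_into_head(index_html: str, tags: str) -> str:
    """Insert the tags just before the closing </head> of a static page."""
    position = index_html.lower().find("</head>")
    if position == -1:
        raise ValueError("no </head> found in the document")
    return index_html[:position] + tags + "\n" + index_html[position:]
```

Because the tags sit in the static HTML, a preview agent that never runs JavaScript can still read them, even if the body only says "loading...". Note that platforms generally expect `og:image` to be an absolute URL.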

I'm not a big fan of apps automatically initiating connections to any 3rd party website. If this leads to an increase in sites that remain functional without JavaScript and won't complain about a lack of a referer header I'll be glad, but I suspect that while 'microbrowsers' today aren't usually running JavaScript and try to protect users' privacy, eventually we'll start seeing website previews being used to hack devices and steal user data.

The request is made on the back end so I’m not sure any of your comment holds up.

This is something that I implemented in Swift while working as an iOS engineer at XING.com 2 years ago. It was a really fun widget to work on: it recognized the link and loaded the title/body/description along with the `og:image` for the image preview.

Now in my own product [1] I went a long time without implementing the right tags on my landing/marketing page and blog posts. I wish I had done this from launch, but I only started 2 months ago when I switched to Gatsby for our marketing page. They have a very easy to implement SEO component [2] that allows you to focus more on the content than on tweaking the meta tags for social.

[1] https://standups.io

[2] https://www.gatsbyjs.org/docs/add-seo-component/

I installed unity 3D and it wanted me to sign in to get my free license. Sure. So I picked Google sign in from the list. This opened a little browser inside the Unity launcher that immediately complained that it was insecure and would not allow a Google Sign in attempt.

Sure felt silly.

Lots of solid tips for web developers in this article.

Very useful and solid info, now if we had a tool for generating and testing these link previews locally for our stuff... There are some APIs and various libs/WP plugins for generating link previews, but nothing that covers the functionality of major messengers and websites. Any takers?

I'm wondering if having a bit more of a preview results in fewer "headline is misleading" disputes?

“However, the real gold for marketers is from word-of-mouth discussions. Those conversations with your friends when you recommend a TV show, a brand of clothing, or share a news report. This is the most valuable kind of marketing.”

That’s saying quite a lot.

Not sure I’ve ever heard the term “microbrowser.” It makes for an interesting headline I guess but it’d just as well be fine if it focused solely on opengraph and friends.

Didn't Microsoft Calculator used to have a browser built into the help menu somehow? I remember it being exploitable to get around stringent GPOs....

Not just the calculator but any app that used the Microsoft Help system.

TLDR: Article describes how link previews are rendered by various social media / chat sites and provides useful tips for web developers on how to optimize the preview experience.

Does anyone know of any open source microbrowsers?

I'm just here to say I read that as microbrewers and I was much more excited about that article. The brewing boom here in Colorado is outrageous. I would love to learn about the confluence of technology and brewing, and what if any tech small breweries are leveraging to give them an edge.

Thanks for digging up those links, I'll have to read those articles later!

The top comment on that third article about the brewery explosion in Vancouver could just as easily be about Denver - the story is nearly the same. As I understand it here in Denver, a change in the law made it much easier for breweries to operate tasting rooms, which can be quite profitable, and that spurred massive growth.

Yay! I just used the search box at the bottom, then skimmed through the results:


HN is perhaps the one website where I can find whatever I'm looking for, if it exists.

I read that in the title and the first few comments. I was trying to figure out what microbrewers have to do with user agents.

In both cases it pays to use them responsibly.
