
Software compatibility and our own “User-Agent” problem - nmjohn
https://www.sigbus.info/software-compatibility-and-our-own-user-agent-problem.html
======
userbinator
_and our linker prints out a slightly strange string in the help message._

Strange, or actually helpful? It would've been more devious if the message it
was looking for actually contained more... potentially copyrightable content;
here's one recently-mentioned example:

[https://dacut.blogspot.ca/2008/03/oracle-poetry.html](https://dacut.blogspot.ca/2008/03/oracle-poetry.html)

This also reminds me of
[https://en.wikipedia.org/wiki/Sega_v._Accolade](https://en.wikipedia.org/wiki/Sega_v._Accolade)
and
[https://en.wikipedia.org/wiki/Lexmark_International,_Inc._v....](https://en.wikipedia.org/wiki/Lexmark_International,_Inc._v._Static_Control_Components,_Inc).

Anyone who has experimented with Hackintoshing may also recall the
"SMCDeviceKey", a "magic cookie" that serves a similar purpose of attempting
to use copyright as a blocker to compatibility.

~~~
jaflo
Could you elaborate more on the SMCDeviceKey? What value does it store that
infringes on copyright? A quick Google search didn't yield much.

~~~
DiThi
Something like "our hard work is protected by these words, please don't steal
(c) Apple" without spaces or punctuation...

------
cesarb
It's not only lld; the clang compiler has a similar issue. It pretends to be
an old version of GCC (4.2) via the predefined macros __GNUC__ and
__GNUC_MINOR__, since a lot of software checks for the presence of the
__GNUC__ macro or its value to enable modern features. That occasionally
confuses software which expects features first introduced in gcc 4.1/4.2 to be
present when these macros are defined (see for instance
[https://bugs.llvm.org/show_bug.cgi?id=16683](https://bugs.llvm.org/show_bug.cgi?id=16683)
or
[https://bugs.llvm.org/show_bug.cgi?id=24682](https://bugs.llvm.org/show_bug.cgi?id=24682)).
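The sniffing cesarb describes boils down to reading the compiler's predefined version macros. A minimal sketch of inspecting them from Python (the helper `parse_gnuc` and the `-dM -E` invocation at the bottom are my own illustration; `-dM -E` is the standard way to make GCC-compatible compilers dump their predefined macros):

```python
import re
import shutil
import subprocess

def parse_gnuc(macro_dump):
    """Extract __GNUC__ / __GNUC_MINOR__ values from a `cc -dM -E` dump."""
    found = {}
    for name in ("__GNUC__", "__GNUC_MINOR__"):
        m = re.search(r"#define %s (\d+)" % name, macro_dump)
        if m:
            found[name] = int(m.group(1))
    return found

if __name__ == "__main__":
    # Dump the predefined macros of whatever compiler is on PATH.
    cc = shutil.which("clang") or shutil.which("cc")
    if cc:
        dump = subprocess.run([cc, "-dM", "-E", "-"], input="",
                              capture_output=True, text=True).stdout
        print(parse_gnuc(dump))
```

Run under clang, this reports `__GNUC__` as 4 and `__GNUC_MINOR__` as 2 regardless of which GCC features clang actually implements, which is exactly the confusion the bug reports above describe.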

------
hyperpape
The first thing that came to mind was Poul-Henning Kamp’s “A Generation Lost
in the Bazaar”:
[https://queue.acm.org/detail.cfm?id=2349257](https://queue.acm.org/detail.cfm?id=2349257)

~~~
muxator
Exactly!

When I first read the passage about libtool I thought it was a joke, and the
illusion of order got a cold reality shower:

> the tests elaborately explore the functionality of the complex solution for
> a problem that should not exist in the first place. Even more maddening is
> that 31,085 of those lines are in a single unreadably ugly shell script
> called configure. The idea is that the configure script performs
> approximately 200 automated tests, so that the user is not burdened with
> configuring libtool manually. This is a horribly bad idea, already much
> criticized back in the 1980s when it appeared, as it allows source code to
> pretend to be portable behind the veneer of the configure script, rather
> than actually having the quality of portability to begin with.

~~~
kuschku
And in the modern JS ecosystem, everything got even worse.

With webpack, babel, TS, grunt, gulp, and everything in between, it's
thousands of tools just glued together.

In the Java world, at least, every few years everyone rewrites the tooling
from scratch, so we had running javac by hand replaced by ant, ant by maven,
maven by gradle. But that's it.

In the JS world, you see nodejs launch a python script launch a ruby program
launch a perl wrapper for a shell script launching another nodeJS process
(e.g. during preprocessing of templates).

You end up with configs that are copy-pasted because no one can exactly
explain how all the tools work together, much less configure them.

In the C/C++ world everything got a bit better once we got CMake and Ninja:
CMake generates one file, which Ninja then reads and uses to drive g++.
Bazel/Blaze even removes this intermediate step entirely.

~~~
couchand
Your argument isn't internally consistent. Webpack, Babel, TypeScript, Grunt,
and Gulp are all implemented in JS (or TS for TS).

Sometimes there are npm packages that are just wrappers for other projects,
but those are usually short-lived and end up being rewritten in JS (except for
C/C++ libraries, which neatly compile to native extensions directly).

If anything, the common complaint is much more accurate: the tooling is
rewritten from scratch too often.

~~~
philipodonnell
Eh. Anything that stands between the raw code you write and the code running
on the production server is a "tool" that has to be managed, configured, etc...
The implementation language isn't as important as the fact that the code you
wrote requires tools written by third parties executing a sequence of
intermediate steps to be in an executable form.

~~~
couchand
Good point, we should all just arrange bits of machine code.

~~~
kuschku
No, but ideally you’d have 1 program you execute to transform your code into
the final build, and that’s it. Maybe that program can be modular, but that’s
it.

Instead, you precompile TypeScript and JSX to ES5, with Babel taking that to
ES4, then launch dozens of asset processors to turn your SASS, SCSS, and JSS
into CSS, with dozens of wrapped processes...

As mentioned, the Java world handles everything with 2 processes, the C++
world with 3.

Only the JS world manages to run webpack, gulp, grunt, compass, typescript,
babel, and JSX all in a single project, each transforming the source, leading
to build times second only to old-school C++ projects.

------
spiznnx
A fun read if you haven't already:
[http://webaim.org/blog/user-agent-string-history/](http://webaim.org/blog/user-agent-string-history/)

------
brucephillips
Mimicking magic strings is a variant of the adapter pattern - pretending
you're something you're not for compatibility.

------
amgaera
_Also, since we cannot update the existing configure scripts that are already
generated and distributed as part of other programs, even if we improve
autoconf, it would take many years until the problem would be resolved._

I don't know autoconf and this sentence got me curious: why would it not be
possible to regenerate existing configure scripts using a fixed version of
autoconf? Are those scripts likely to be manually edited after they've been
generated?

~~~
FooBarWidget
Yes that is possible; the configure script is for the most part autogenerated.
But the maintainers of said programs have to actually do so, and users have to
actually download the latest version. It would still take many years for fixes
to propagate throughout the entire ecosystem.

~~~
amgaera
The article talks about the broken linker check being an issue mostly when
_adopting lld as part of the standard build system of an operating system_ ,
such as FreeBSD. For FreeBSD to switch to lld, wouldn't it be enough to fix
the version of autoconf available in FreeBSD and use it to regenerate
configure scripts in FreeBSD's build system (when building and packaging
programs for inclusion in its repositories)?

------
smelendez
Maybe this would be a good job interview question for PMs.

If you were creating a new class of software where this could be an issue,
what would you do?

~~~
ikeboy
Any questions of feature compatibility are either tested directly, or
requested with an explicit API.

~~~
ikeboy
With browsers, this wasn't really possible because you needed to give the full
response after a single request. But running locally you can test all you
want.

The problem is when you can't get everybody to agree on a standard API, so you
resort to hacks that make it work ASAP that then lead to more and more hacks
till we get the insane user agents of today.

------
matt_kantor
> There were two possible solutions. One was to fix the configure script.
> However...

> The solution we ended up choosing was to add the string "compatible with GNU
> linkers" to our linker's help message.

The right way to deal with things like this is to do both. Do the hacky-but-
realistically-shippable thing to get unblocked, and then also contribute
towards the "right" solution, even if it's way upstream from you. Otherwise
you're part of the problem.

------
username223
_He who fights with monsters should be careful lest he thereby become a
monster. And if thou gaze long into an abyss, the abyss will also gaze into
thee._ \-- Neetch, _naturlich_

You young 'uns may not remember it, but there used to be more systems than
Windows, Mac, and Linux, and forced OS updates weren't a thing. Libtool was a
way to try to make programs run on most people's computers, by compiling small
programs to detect individual features. It was written in a nightmare language
called M4, which would generate shell scripts, which would usually generate C
programs, which would attempt to compile and run.

~~~
rkeene2
libtool's goal is actually to smooth over the fact that dynamic linking is
hard at compile time. It's not a good abstraction, and because of this it must
be deeply ingrained into your project.

I typically do not use libtool but rather have an autoconf macro [0] to
determine how to interact with the linker. This has the disadvantage that each
"./configure" invocation can only produce either static or shared archives,
but not both (since the object files that make those up may require different
compile flags). libtool's solution is to compile the object file both ways,
but it does not really go well with the autoconf mechanism.

I also have a different set of macros for managing the ABI [1], and I'm not
sure how that's managed with libtool.

[0]
[http://chiselapp.com/user/rkeene/repository/autoconf/artifac...](http://chiselapp.com/user/rkeene/repository/autoconf/artifact/00e2f95c1836c5e7)
[1]
[http://chiselapp.com/user/rkeene/repository/autoconf/artifac...](http://chiselapp.com/user/rkeene/repository/autoconf/artifact/b9d44e1a68b091e3)

------
zaarn
User-Agent Sniffing. It's the worst solution, everyone involved will hate it
eventually and in the end it amounts to all software just continuously
improving the same-yness of the string.

~~~
LoSboccacc
so, what's the better solution to detect devices that work in weird,
incompatible ways (like safari/ios lying about the vh)

~~~
zaarn
As mentioned, it's the solution that works, but one that everyone ends up regretting.

Lying about the VH is violating standards and should be handled as such. An
electrician does not think "but what if the customer suddenly changes their
house voltage from 210V to 123V" but rather "anybody doing that is insane and
I'm not going to be responsible if they do that".

The fact that we have to do this is a testament to how bad the standards
situation has been in the past and still is.

The proper way would be to have one function that every browser supports and
which you can ask about such specifics. Writing browser.is-doing("vh-lying")
instead of having to work around the idiocy of the browser in other ways is
inarguably better.

------
digi_owl
Didn't MS skip Windows 9 because so many legacy programs out there assume that
"Windows 9" means Windows 95/98?

~~~
maaark
[https://searchcode.com/?q=if%28version%2Cstartswith%28%22win...](https://searchcode.com/?q=if%28version%2Cstartswith%28%22windows+9%22%29)
(previous discussion:
[https://news.ycombinator.com/item?id=8397664](https://news.ycombinator.com/item?id=8397664)
)

------
mjevans
Trying to re-solve the User-Agent issue, I think it would be much better if
browsers claimed which standards they conform to, with an accompanying
version.

EG: www/HTML<=5.1 www/XHTML<=1.1 www/CSS<=3.0 ISO/ECMAScript<=8

The string would be split on the field separator (any non-printing space?).
All exact matches for specifications would be compared and the result ANDed.
This way a range could be created by having a minimum supported version as
well.
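A rough sketch of how the proposed matching could work (the names `parse_claims` and `supports` are hypothetical, and I'm assuming `<=` means "supports this specification up to this version" with plain spaces as the field separator):

```python
def parse_claims(header):
    """Parse tokens like 'www/HTML<=5.1' into {spec: highest_version_tuple}."""
    claims = {}
    for token in header.split():
        spec, sep, version = token.partition("<=")
        if sep:
            claims[spec] = tuple(int(part) for part in version.split("."))
    return claims

def supports(header, requirements):
    """AND together the per-spec comparisons: every spec the page needs must
    be claimed at least at the required version."""
    claims = parse_claims(header)
    return all(spec in claims and claims[spec] >= minimum
               for spec, minimum in requirements.items())

print(supports("www/HTML<=5.1 www/XHTML<=1.1 www/CSS<=3.0 ISO/ECMAScript<=8",
               {"www/HTML": (5, 0), "www/CSS": (3, 0)}))  # True
```

A page needing HTML 5.2 against that same header would get False, which is the ANDed range check the proposal describes.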

~~~
ubernostrum
The issue, as always, is bugs.

I remember back when feature detection was gaining steam as the preferred
alternative to user-agent sniffing, and Safari had a showstopper of a bug that
meant preventDefault was present and callable but didn't actually do what
preventDefault was supposed to do. So you had to fall back to sniffing for
Safari to work around that (by hooking up a different event listener with a
'return false' instead of a preventDefault call).

~~~
jstimpfle
I wish people could just say "this browser is broken, we don't support it. Get
a fixed version."

Alas, people don't do that in business.

~~~
Crespyl
Well... what we get instead is "this browser isn't IE, we don't support it.
Get IE."; only now it's s/IE/Chrome/g.

------
lowq
What a sad ending.

------
anthk
Uh, that's why I use the Opera Mini 4 (or 5) user agent in Dillo: it helps a
lot on YouTube (you can at least see thumbnails, along with the comments):

At ~/.dillo/dillorc:

    http_user_agent="Opera/9.60 (J2ME/MIDP; Opera Mini/4.2.13337/458; U; en) Presto/2.2.0"

~~~
gkya
When I was recently writing a Python script to ping the URLs in places.sqlite
to check for link rot, I had to change urllib's user agent to mimic a modern
browser, because otherwise lots of web sites would redirect me around or
return errors (even 404s).
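For anyone doing the same with the standard library, a minimal sketch of overriding urllib's default agent (the UA string here is just an example browser string, and example.com is a placeholder):

```python
import urllib.request

# An example browser-like UA string; the stock "Python-urllib/3.x" agent is
# what trips the redirects and bogus errors.
BROWSER_UA = ("Mozilla/5.0 (X11; Linux x86_64; rv:109.0) "
              "Gecko/20100101 Firefox/115.0")

def make_request(url):
    """Build a urllib Request whose User-Agent mimics a browser."""
    return urllib.request.Request(url, headers={"User-Agent": BROWSER_UA})

req = make_request("https://example.com/")
# urllib stores header keys capitalized, hence "User-agent" when reading back.
print(req.get_header("User-agent"))
```

Pass the resulting Request to `urllib.request.urlopen(req)` instead of a bare URL string.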

~~~
UncleEntity
Yep, one bored day I was playing with the oneom.tk API from python and it took
a long while to figure out why I could get a page in the browser but the
python API fetch wasn't working. Until I set User-Agent to "Mozilla/5.0" that
is.

Why anyone thinks it's a good idea _not_ to send API data to a python User-
Agent is kind of beyond me but who knows...can't really complain since they
provide data for free and I really was just playing around without actually
needing to do anything productive.

Actually found a bug in urllib3 (that I probably should get around to
reporting) -- the page sends HTML unless you set Accept to 'application/json',
which urllib3 doesn't let you do.

~~~
haikuginger
urllib3 lead maintainer here. I wasn't able to reproduce your bug, at least
with the main development branch, when I passed `Accept` as part of the
`headers` dict. If you could file an issue with a reproduction case, I'd
definitely appreciate it.

~~~
UncleEntity
Eh, maybe...

    http = urllib3.PoolManager()
    
    r = http.request('GET', "https://oneom.tk/data/config",
                     headers={'Accept': 'application/json',
                              'User-Agent': "Mozilla/5.0"})
    r.headers['Content-Type']  # 'application/json'
    
    r = http.request('GET', "https://oneom.tk/data/config",
                     headers={'Accept': 'application/json'})
    r.headers['Content-Type']  # 'text/html; charset=UTF-8'

~~~
haikuginger
It looks like this API in particular expects all requests to have a User-Agent
header, and we don't set one by default. Setting any user agent appears to
work; the issue isn't with the `Accept` header.

You can reproduce with curl with the following command:

    curl -v 'https://oneom.tk/data/config' -H 'Accept: application/json' -H 'User-Agent:'

That'll nullify the default curl user agent and should produce the same
results you were seeing with urllib3.

~~~
UncleEntity
That makes sense, I kind of figured they were doing something screwy. I'm
guessing a blacklist on certain User-Agent settings.

When I was messing with it I could get it to work with curl and (eventually)
urllib but urllib3 was no bueno until I tried to reproduce the problem and
just copied the header with the User-Agent field from the urllib code.

