Don't touch my clipboard (alexanderell.is)
718 points by otras on Feb 18, 2020 | 312 comments

It's not just a browser thing. Apple Books does this with their e-books, which is infuriating if you're working with a coding book and just want to copy-paste stuff into your editor/terminal. You get something like:

    “ghci> putStrLn (pretty 10 value)”

    Excerpt From: Bryan O’Sullivan, John Goerzen, and Donald Bruce Stewart. “Real World Haskell.” Apple Books. 
When you only copied:

    ghci> putStrLn (pretty 10 value)
Note that the quotes around your actual selection aren't even the ASCII quote character; you get horrid unicode quotes that are easy to miss if you're just trying to run a bit of code in your REPL. This isn't even a DRM ebook, so it's not like Apple is being compelled by contract to insert a citation. It's awful, user-hostile behavior that removes one of the main advantages of digital-vs-hardcopy coding books (copy/paste), and AFAIK there's no config that lets you disable it.
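For anyone who wants to undo this after the fact, here is a minimal sketch of a clean-up step. It assumes the clipboard text always has the shape shown in the example above (curly-quoted selection, then an "Excerpt From:" trailer); the function name `strip_books_wrapper` is made up for illustration and this is not anything Apple provides.

```python
import re

# Assumed shape, taken from the example above: a curly-quoted selection,
# optional whitespace, then an "Excerpt From:" attribution trailer.
_EXCERPT = re.compile(r"\A“(?P<body>.*?)”\s*Excerpt From:.*\Z", re.DOTALL)


def strip_books_wrapper(clip: str) -> str:
    """Return the original selection if the text looks wrapped, else return it unchanged."""
    match = _EXCERPT.match(clip.strip())
    return match.group("body") if match else clip


if __name__ == "__main__":
    wrapped = (
        "“ghci> putStrLn (pretty 10 value)”\n\n"
        "Excerpt From: Bryan O’Sullivan, John Goerzen, and Donald Bruce Stewart. "
        "“Real World Haskell.” Apple Books. "
    )
    print(strip_books_wrapper(wrapped))
```

Text that does not match the assumed shape is passed through untouched, so the function is safe to apply to every paste.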

Wonder if an author will rename themself sudo rm -rf / with the proper escape codes.

You, my friend, are an evil, sadistic megalomaniac. I love it. Remind me not to make you mad.

Bobby wants his tables back!

Legend has it that all data related to little Bobby Tables has been lost.


`rm -rf ~` is disastrous enough on macOS, with the bonus of not having to authenticate with sudo.

How do people know the result of running this command without running a VM? Is it possible to run OS X in a VM these days?

The command's intention is so obvious that I wonder why there is no warning from the OS, the shell, or the terminal for such easily blacklistable commands. Or is there?

I learned that personally one night while fighting messed-up rebase conflicts. For a still-unknown reason I copied and pasted a path for `rm` and ended up issuing `rm -rf ~ /repo-path /unneeeded-dir`. It took me a couple of seconds to realize that the command was taking an unexpectedly long time to execute, and then my eye caught a Finder window with a shrinking list of folders in my home directory :) Time Machine helped me quite a lot but unfortunately with a nasty surprise that it doesn't back up dot files :-/ Lessons learned.

PS: it turns out you can restore dot files from the Time Machine backups but it's not enabled by default:


> Time Machine helped me quite a lot but unfortunately with a nasty surprise that it doesn't back up dot files.

Wow, that's good to know!

When I learned about this command 10 years ago I tried it on a Debian system that I had no use for anymore.

There's a big warning before it lets you execute the command.

That's true only of distributions that alias rm with rm -i.

Now, almost 20 years ago only RedHat did it. And it felt wrong to me :-/

Modern versions of rm require you to pass --no-preserve-root. According to Wikipedia [1] this has been the default (in upstream) since 2006. Of course it took distros some time to actually update to the GNU utils 6.4 (especially long-term support systems like CentOS) but it's been a decade since the change should've been implemented everywhere.

[1]: https://en.wikipedia.org/wiki/Rm_%28Unix%29#Protection_of_th...

Isn't that only a GNU and FreeBSD thing? I think other Unixes will still let you rm -rf /.

On FreeBSD you can do

    sudo dd if=/dev/random of=/dev/mem
and it will do exactly that, write random crap into your memory without any sort of safeguard, causing a spectacular crash and a console that looks like it's having a seizure. Linux won't let you do that unless it's been compiled with a flag to enable full access to /dev/mem and /dev/kmem.

I want to try that dd command. For fun. Just to see what happens on OpenBSD.

No lasting effects are there? Besides a crash, afterwards I shouldn't have any memory issues, right?

I tried this in a FreeBSD VM and it booted fine afterwards. OpenBSD wouldn't let me do it at all for some reason.

Obviously no warranty on bare metal/systems that are actually used for anything.

Worth noting that it can be trivially worked around by adding an asterisk at the end: `rm -rf /*` still works.

Fair enough, but on the other hand there's nothing preventing you from just putting --no-preserve-root in the command either. I see the feature as something to prevent accidents, not as a way to secure the rm command.

That version will helpfully leave behind any .files lingering in /
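To make the last few comments concrete, here is a toy model of why the asterisk slips past the check and why dot files survive. It is only a sketch: the guard here compares path strings, which is an assumption for illustration and not how GNU rm actually identifies the root directory, and both function names are hypothetical.

```python
import posixpath


def refused_by_root_guard(args, no_preserve_root=False):
    """Arguments that a preserve-root style check would refuse to delete."""
    if no_preserve_root:
        return []
    return [a for a in args if posixpath.normpath(a) == "/"]


def expand_slash_star(root_entries):
    """Mimic default shell expansion of `/*`: names starting with '.' are not matched."""
    return ["/" + name for name in sorted(root_entries) if not name.startswith(".")]


if __name__ == "__main__":
    entries = ["bin", "etc", ".hidden"]           # made-up contents of /
    print(refused_by_root_guard(["/"]))           # the literal "/" is refused
    expanded = expand_slash_star(entries)
    print(expanded)                               # the shell hands rm these instead
    print(refused_by_root_guard(expanded))        # nothing is refused
```

The guard never sees `/` in the second case because the shell has already replaced `/*` with the individual top-level names, and that expansion is also what skips the dot files.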

Maybe run `sudo chmod -R 000 /` instead. Can't get charged with destroying any data, but it's a huge pain to get a system working again from that. Only done it twice; hope never to do so again.

Curious to know how you recover from that. How do you know what permissions to assign back to files and directories?

Depending on how you define "fix," the common way (if the machine has already been rebooted and you're locked out of the session) is to boot via a live distro, mount the filesystem, and change permissions to get a usable system back. There are various other methods depending on the machine state and requirements, too, so it's definitely recoverable from a working machine standpoint. Changing permissions back to what they used to be is a rabbit hole that varies depending on your distro, machine-specific requirements, and any special permissions setup by you and your admins.

It's a pain-and-a-half. I'm not sure any system ever recovers fully, but you basically run chmod -R's on most of the important directories and then fix the many services that will surely fail once you can get a shell.

More specifically, I used a bunch of these links until I could get a shell, then manually-fixed the rest.



Can confirm. I kept getting permission errors on something I was testing, and in desperation ran `chmod -R 777` thinking I was in my project dir. I was actually at root. After a bunch of attempts to recover, I ended up saving the data I cared about and doing a fresh install. 0/10 would not recommend.

Curious to know what circumstances led up to someone doing that and if they were charged for other violations.

A modern shell should prevent you from directly executing pasted code.

Apple Books has so many of these ridiculous little incursions against good taste. Just to vent, I made a list of all the ones I found: http://macos-design-review.com/books.html

It really does seem like software at Apple is being designed by people who just aren't familiar with the platform.

It also messes with dictionary lookup! It adds this nonsense if you select more than two Chinese characters, and so it makes it really hard to look up four-character Chinese idioms in my dictionary (which are, inexplicably, frequently missing from the built-in “lookup” dictionary).

Apple Books is so close to being a nice reading interface, but there are so many stupid little bugs. Highlighting is another horrid little bug that can easily wipe out a full chapter’s worth of highlights with one tap...

The entire Apple dictionary system is a PoS AFAICT. It only works with perfect spelling and it only works with single words. How about finding closest match and how about letting me select lots of text and go through all the words.

I can only guess no one on the team actually uses the feature.

As a language learner I use Rikaikun for Japanese on my desktop browser. While I'd really like it if I could also run a similar extension in my mobile browser, I'd be okay if Apple's dictionary actually worked with a fuzzy search and multiple terms... but no....

I solved the Apple Books problem with an automator script, using the "Copy to Clipboard" action. Then it can be assigned a shortcut in Keyboard preferences. https://imgur.com/a/sG2isap

Thanks for the method. I’ve written up detailed instructions for it at https://apple.stackexchange.com/a/382603/21473.

As you can see in that answer, I found a better method to assign the keyboard shortcut. Assigning the shortcut within App Shortcuts instead of Services lets you use the normal ⌘C shortcut in Books while not affecting copying in other apps.

Limiting the action to "Books" seems to not affect other apps as well. The main reason I did not use ⌘C is because it doesn't work when the context menu is displayed (and it gets displayed immediately after selecting text).

This was really helpful, thanks for sharing!

On Windows, clipboard items have hidden tags added when they are copied, so the program you are pasting into can make these decisions, instead of the program you are copying from needing to manipulate the copied text. For example, if I copied that code from a Web page and pasted it into the console it would paste just the code, but if I pasted it into Onenote it would check the tags and add a line underneath with the url the code was copied from.
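One mechanism that fits this description, as far as I know, is the Windows "HTML Format" clipboard format: the payload starts with a small plain-text header that can carry a `SourceURL` line alongside the markup. Whether OneNote reads exactly this field is an assumption on my part. A minimal sketch of pulling the header out of such a payload (the sample string and its offsets are invented, not captured from a real clipboard):

```python
def parse_cf_html_header(payload: str) -> dict:
    """Collect the 'Key:Value' header lines that precede the HTML markup."""
    header = {}
    for line in payload.splitlines():
        if line.startswith("<"):        # markup begins, so the header is over
            break
        key, sep, value = line.partition(":")
        if sep:
            header[key] = value
    return header


if __name__ == "__main__":
    # Illustrative payload; the numeric offsets are placeholders, not computed.
    sample = (
        "Version:0.9\r\n"
        "StartHTML:00000000\r\n"
        "EndHTML:00000000\r\n"
        "SourceURL:https://example.com/page\r\n"
        "<html><body><pre>ghci&gt; putStrLn (pretty 10 value)</pre></body></html>"
    )
    print(parse_cf_html_header(sample).get("SourceURL"))
```

The point is the division of labour: the copying program only records where the text came from, and the pasting program decides whether to show it.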


I hate this behavior! Switched to a different eBook reader on my iPad because of it. I like copying interesting snippets to OneNote.

Out of interest what did you choose? I'd love something different that syncs between MacOS and iPad for ePub files.

> aren't even the ASCII quote character; you get horrid unicode quotes

Why should they be ASCII, and what’s wrong with Unicode?

Because the next step when you face this problem is doing a search-replace of the quotes to remove them. ASCII quotes are on your keyboard, so you can actually type the command to remove them.

Unicode quotes probably aren't, so it's extra annoying to remove them, and they're not even the same character for start and end so you have to do it twice.

The time spent for removal is then vastly higher than with ASCII quotes, which in the context of unwanted characters may well qualify as "horrid" imo.

Yes, this is pretty much what I meant. The "horrid" part is that I would like to do a quick

(or similar) in vim to delete the quotes on a multi-line block. Even that is annoying. But with the unicode quotes I have to do a bit more conscious searching, which is a distraction.

Of course, the fact that the text is modified in any way is user-hostile. I don't mind re-typing a code example (since I'm trying to learn it), but I like to copy-paste sometimes (e.g. data literals).

I paid for a DRM-free ebook that's meant to be copy-pasted. We have a universal paradigm for how plaintext copy works. Apple charges a premium for good design. Ebook readers are generally supposed to get out of your way and let you focus on content. There is no reason for Apple Books to have this behavior.

PowerShell acknowledged smart quotes and dashes in its language design and allows proper quotes interchangeably with ASCII straight quotes. I always found that an interesting design choice, although I'm not sure how useful it actually is to allow people to copy/paste code from dubious web pages. On the other hand, they'll do that anyway, so why bother making it harder?

This is going down a rabbit hole. “This” is an English quote, whereas „this“ is a German quote; note that the open quote character in the English quote is equal to the close quote character in the German quote.

Then there is a less-often used style of quoting, similar to the French style, but: «this» is the French quote, whereas »this« is the German version. Yes, the open and close quotes are swapped.

I applaud the idea, but just want to point out that there are dragons lurking in the shadows.

All of “”" parse the same as ", so bracketing doesn't really enter into it. This is just to prevent smart quotes from destroying code, not to have more levels of nesting.
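The "parse them all the same" idea can be sketched as a character mapping applied before the code is interpreted. This is an illustration of the concept only, not PowerShell's tokenizer, and the exact set of characters treated as a double quote here is an assumption.

```python
# Characters assumed to count as a double quote: “ ” „
SMART_DOUBLE = "\u201c\u201d\u201e"
QUOTE_TABLE = str.maketrans({c: '"' for c in SMART_DOUBLE})


def straighten_quotes(code: str) -> str:
    """Map typographic double quotes to the ASCII quote, leaving everything else alone."""
    return code.translate(QUOTE_TABLE)


if __name__ == "__main__":
    print(straighten_quotes("Write-Host “hi”"))
```

Because every variant collapses to the same delimiter, this gives robustness against smart-quote damage but no extra nesting levels.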

Additionally note that «this» is mostly used in French online and in Switzerland, while « this » is more common in professionally typeset works in France. Those are U+202F NARROW NO-BREAK SPACEs between the guillemets.

Don't forget that the French quotes should have (non-breaking) spaces inside « like this » - only when used with the French language though, other languages usually don't use such spaces.

I've never liked this sort of thing in code, because as things stand they're a bit of a faff to type, and some tools don't support them very well. But it's a shame, because there's loads of types of quote mark and bracket in Unicode, not just the ``...''-type pair commonly used in English. So you could use one type for string delimiters and then use the other types, unescaped, in the string itself.

(The Unicode quotes also mostly come in matching pairs, so, for good or for ill, they'd in principle be nestable.)

isn't that an input problem not an output problem? Is it giving you the quotes directly from the ebook or is it converting them on copy-paste? I'd assume there was a bug on the input side, that some tool wrongly converted `printf("Hello World")` to `printf(“Hello World”)` so what's in the ebook is wrong already. If so the issue is further up the chain.

In this context, the selection within the ebook contains no quotes, whether ASCII or Unicode. The problem is that Apple’s Books app adds quotes and attribution text around the selected text. Books knows that the text it is putting in the clipboard isn’t what you selected, and it doesn’t care.

It's not changing quotes in the code. It is adding quotes. You have to delete them, not search-and-replace them, and it doesn't matter what kind of quotes they are, you have to do the exact same thing.

Search and replace is a pretty reasonable way to delete a quote; lots of editor UIs make that much more convenient than finding the end of some blob of text you just pasted into some pre-existing context.

Depending on the regex engine, it may not be that much higher.

    sed -i -E 's/<open quote>|<close quote>//g'

I think they are talking about styled quotes, the kinds that point inward, which a REPL won't interpret as an ASCII quote and will throw some type of syntax error.

Ok but those are standard, valid characters like any others. Why are they ‘horrid’?

Some people hate unicode quotes because lots of software will automatically convert ASCII quotes to them when you copy/paste your code to Slack or email to share with others.

Because they result in a syntax error.

Even though the OP seems offended by the presence of the Unicode quote marks for some reason, they have nothing to do with the issue of Apple Books surrounding the copied text with those quote marks and adding an attribution line. If they'd surrounded the text with ASCII quote marks and added an attribution line, it would have still been a syntax error, wouldn't it.

Personally, I don't see any reason to hate typographically correct quote marks when used correctly, which (obviously) doesn't include code samples.

But they wouldn’t do what you wanted anyway if they were ASCII. What’s the problem specifically with them being Unicode?

The problem specifically is that they are not the ASCII quote character. There is only one ASCII quote character, and that's the one used by programming languages. Any other quote or quote-like character is outside the ASCII range, and must therefore be Unicode (or another non-ASCII code page).

I know they're not compatible with programming languages that you may want to use. But that's the languages' problem. They're perfectly valid, standardised, characters.

No, iBooks inserting them around commands is the problem.

Inserting ASCII quotes instead would result in exactly the same problem.

Not exactly the same. Since anything inside ASCII quotes is typically considered a valid string in most programming languages, if you delete everything after the final quotation mark you have a syntactically valid program. It won't do what you want, but it's not a syntax error.

So unicode quotes and ASCII quotes are not exactly the same in this scenario.
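The distinction is easy to check in a language that happens to be at hand. The sketch below uses Python as a stand-in (an assumption; the original example was a GHCi session) and asks its parser whether each wrapped version is syntactically valid.

```python
import ast


def parses_as_python(source: str) -> bool:
    """True if the text is syntactically valid Python, False on a syntax error."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    ascii_wrapped = '"ghci> putStrLn (pretty 10 value)"'
    curly_wrapped = "\u201cghci> putStrLn (pretty 10 value)\u201d"
    print(parses_as_python(ascii_wrapped))   # a string literal that does nothing
    print(parses_as_python(curly_wrapped))   # rejected by the parser
```

Neither version runs the command you wanted, which is the other side's point; the difference is only in how the failure shows up.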

They both break what you were trying to do, and the fact that they do it in slightly different ways really isn't worth making a long pedantic argument about.

Programming is literally the art of pedantically telling a computer what to do, arguments about it are naturally going to tend towards pedantry. So that's my first point.

The second is that there's a significant difference between "won't run due to syntax errors" and spitting a pasted line back at you since it thinks it's just a string. In a less robust environment the first option might actually crash your environment, or leave you with subtle errors. Like in a shell, it might treat things like escape sequences and things will look off unless you reset them.

They're horrid because you can't easily type them on a keyboard.

Because for some reason terminals are stuck in the '70s and don't accept those characters as quotes. Anything but ASCII trips them up.

Seems such an obvious interface to innovate on, but it seems to run into terminal wizards' sense of purity.

The problem, such as it is, is with languages. Terminals (mine at least) handle most of Unicode just fine; admittedly I've seen it choke on emoji, but punctuation, nah.

The vast majority of programming languages are defined in terms of ASCII and only ASCII. I don't care for this, personally.

I've given some thought to how to do quoting right in a programming language, and implemented «guillemets» as an experiment. But it's challenging, you need to decide what to do with all of “”‟„"″ and there aren't obvious pairings, like „this is a sentence” and “this is a sentence” and »this is a sentence» and «this is a sentence» and »this is a sentence«, it ends up feeling like rather a lot of effort for what you get in return.

Oh, one of those characters I typed isn't a quotation mark, did you catch which one? Hacker news won't even let me type two of them!
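For anyone who doesn't want to squint: the Unicode character names settle it. The sketch assumes the six characters in the comment above are U+201C, U+201D, U+201F, U+201E, U+0022 and U+2033, which is my reading of the text rather than something verified byte for byte.

```python
import unicodedata

# Assumed code points for the six characters listed in the comment.
CANDIDATES = "\u201c\u201d\u201f\u201e\"\u2033"


def not_quotation_marks(chars: str) -> list:
    """Characters whose Unicode name does not contain 'QUOTATION MARK'."""
    return [c for c in chars if "QUOTATION MARK" not in unicodedata.name(c)]


if __name__ == "__main__":
    for c in CANDIDATES:
        print(f"U+{ord(c):04X}", unicodedata.name(c))
    print(not_quotation_marks(CANDIDATES))
```

Under that assumption the odd one out is U+2033, DOUBLE PRIME.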

Having multiple almost identical ways of achieving things is a bug magnet in programming languages. Differing programmers will presumably use different styles (why else even support differences?), and if you mix code like that, bugs ensue. This is a hassle not just because autoformatters are liable to make churny changes (which distract from real changes, which makes history harder to understand: bug magnet), but also because people will make mistakes e.g. when find-replacing (another minor bug magnet!). Then there's the fact that some of those quotes aren't symmetric - so you need to think of something to have happen when they're unpaired or used incorrectly, and it wouldn't surprise me if no matter what you did, you surprise somebody (bug magnet!).

Sure: these are all quibbles, and a language wouldn't die from all these minor cuts. But they're definitely downsides, not upsides. So: where is that upside? Why would you ever support something like this? "It looks a little nicer" sounds like a pretty weak argument compared to "it's inconsistent, hard to machine process, and may cause a few bugs"...

Many programming languages, certainly not all, offer single and double-quoted strings already. Which is, granted, a perennial source of annoyance for those of us trying to have a consistent formatting style.

I want «guillemet strings» because they use a matched pair, so you can «quote a string «within a string» without escaping» and I think that's a nice property.
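The matched-pair property is what makes this cheap to implement: with distinct open and close characters, a reader only needs a depth counter. A minimal sketch, with a hypothetical function name and no claim to match any existing language's lexer:

```python
def read_guillemet_string(text: str, start: int = 0) -> tuple:
    """Read one «...» literal beginning at text[start]; return (contents, end_index)."""
    if text[start:start + 1] != "«":
        raise ValueError("expected opening guillemet")
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "«":
            depth += 1                  # a nested literal opens
        elif text[i] == "»":
            depth -= 1                  # the innermost open literal closes
            if depth == 0:
                return text[start + 1:i], i + 1
    raise ValueError("unterminated guillemet string")


if __name__ == "__main__":
    source = "«quote a string «within a string» without escaping» rest"
    contents, end = read_guillemet_string(source)
    print(contents)
    print(repr(source[end:]))
```

With a symmetric delimiter like `"` the same loop cannot tell an inner opening quote from the closing one, which is why escaping becomes necessary there.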

I'm not really interested in supporting all the forms in which they're used in European languages, though, that would be a real hassle. Just the one that looks like the other matched pairs we use in programming.

“smart quotes” have ‟at least two styles” and really „three styles”, and the first two are really easy to confuse with "normal double-quotes". I don't want my users to have to deal with "why doesn't this compile”?, and if I allowed it to compile, now you have to escape all the quote characters inside any string, which is messy.

Raku, as a sibling points out, has bitten down on this bullet, and I respect that. I keep meaning to give it a spin, I'm fond of Parsing Expression Grammars and have good feelings about Perl from the early days.

Those work in Raku, with the exception of ‟at least two styles”.

    .say for q[‟”].uninames
If you use one type of quote, you can use the others inside of it with no problem.

    “double "quotes"”
(Raku doesn't use a tokenizer.)

Also note that you can use a variety of characters for quoting if you use `q`, `qq` or `Q` etc.

    q<<<<1 < 2>>>>  eq  '1 < 2'  eq  q^1 < 2^  eq  q%1 < 2%  eq  q「1 < 2」
Note that many Unicode characters which have a LEFT and RIGHT variant will also be paired when you do that. q⦑like this⦒

In Raku string literals are actually a [parameterized domain specific language](https://docs.raku.org/language/quoting). (The domain of creating strings.)

    my $buffer = Blob.new(115,116,114,105,110,103); # 'string'.encode()

    Q            # start with raw quoting
    :scalar      # enable embedding the value of scalars
    :backslash   # enable backslashing characters
    [This is a $buffer.decode()\n]
There are shortcuts for common uses

    'single'  eq  q [single]  eq  Q :single [single]  eq  Q :q [single]
    ‘single’  eq  ‚single’

    "double"  eq  qq [double]  eq  Q :double [double]  eq  Q :qq [double]
    “double”  eq  „double”

    「raw」   eq   Q [raw]

    < words >   eqv   qw [ words ]   eqv   q :words [ words ]

    << quote words >>   eqv   qqww [ quote words ]   eqv   Q :double :quotewords [ quote words ]
    « quote words »
This is not exhaustive.

:double is actually short for :scalar :array :hash :function :closure :backslash

:words splits on whitespace :quotewords splits on whitespace but respects quoted sections

    q      :words [ a "b c" ]   eqv   ( 'a', '"b', 'c"' )
    q :quotewords [ a "b c" ]   eqv   ( 'a', 'b c' )
Note that :double is the same as any adverb in the language, so you can negate a feature with :!double.

    qq [with newline\n]
    qq :!backslash [without newline\n]

    qq :!array [user@example.com()]
(It's actually rather difficult to accidentally use @ or % variables.)

PostScript has something like the «guillemet strings», but it uses parentheses instead. The parentheses can be nested without escaping.

> I've given some thought to how to do quoting right in a programming language

Already fully designed and implemented in Raku: https://docs.raku.org/language/unicode_entry#Smart_quotes

Test online: https://tio.run/##K0gtyjH7//9Rw7ySjMxiBSBKVChOzStJzUtOfdQw9/...

A language or interface is defined by the set of symbols mutually agreed upon. If you allow Unicode that number simply explodes, thus bloating and complicating the language/interface and its implementation as well. In effect it is no longer the same language but becomes a different, more complex one.

The tradeoff is not worth it for programming languages the same way as learning all the scripts of the world is not worth it for one human being, just to accomplish tasks that don't need all these symbols in the first place.

It is a solved problem, however. We chose not to use it (in terminals).

The terminal can usually handle it, provided it's UTF-8. Most of the editors can handle it. It's very specifically the programming language specifications which don't include them. Even the ones that allow Unicode identifiers.

Also, they're difficult to type - my keyboard has a " key but not a “ or ” character. I had to copy and paste those from the unicode website since I don't have a suitable input method set up.

The idea that "terminal wizards" reject this is rather undermined by the fact that "terminal wizards" made open and close double quotation marks accessible as simply [Group2]+[B04] and [Group2]+[B05] on an ISO 9995-3 conformant keyboard.

Or [Shift]+[Option] [B04] and [Shift]+[Option] [B05] if there's no explicit [Group2] shift.

Open and close single quotation marks use the same keys.

"terminal wizards" have done quite the opposite of rejecting this.

That seems like an obviously bad solution. That is like writing [a(b{c]d}e) except that the quote characters look even more similar.

It's easy (solved) to map Unicode to control characters such as quotes (and when you think about it, quotes are often there to deal with ambiguities stemming from the limited number of allowed characters in ASCII). So you could have a terminal accepting such input, and a few helper functions that normalize it into ASCII and so on.

After all, users of non-ASCII languages (which is nearly everyone) already know how to deal with it without ambiguity. It's only ambiguous if you don't use encodings, and that should never happen anyway.

> and a few helper functions that normalize it into ASCII and so on

If you are going to get rid of it anyways, why bother with it in the first place? Also, you then have the ambiguity/problem whether some symbols are mapped to the same ones or not.

> already know how to deal with it without ambiguity

I wish. Just dealing with ü vs ue vs \"u and ascii, win something, utf8 and utf16 is still annoying in practice. Or latex choking on accents, non-breaking whitespace or similar fun stuff.

The argument for doing it is that it opens things up to users around the world with their own scripts (and solves the issue of having 'the wrong' quote). Normalization solves the latter. It's really solved, but you would never know if you live exclusively on the command line. Which I think is unfortunate.

I disagree, it simply tells everyone they are wrong and the "normalized" one is the right one.

Oh, you used foo and föö as variable names, too bad those are the same now by normalization.
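Part of the disagreement is that "normalization" can mean two different things. A short sketch using Python's standard `unicodedata` module: standard Unicode normalization (NFC here) only unifies different encodings of the same character, while folding to ASCII is lossy and produces exactly the collision described above. The helper name `ascii_fold` is made up for illustration.

```python
import unicodedata


def ascii_fold(name: str) -> str:
    """Lossy folding: decompose, then drop everything outside ASCII."""
    return unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    composed = "f\u00f6\u00f6"        # föö with precomposed ö
    decomposed = "fo\u0308o\u0308"    # föö as o + combining diaeresis

    # NFC makes the two spellings of föö identical, but keeps föö distinct from foo.
    print(unicodedata.normalize("NFC", decomposed) == composed)
    print(unicodedata.normalize("NFC", composed) == "foo")

    # ASCII folding is what collapses föö into foo.
    print(ascii_fold(composed))

    # Curly quotes are untouched even by NFKC, so they need an explicit mapping.
    print(unicodedata.normalize("NFKC", "\u201c") == "\u201c")
```

So a "helper that normalizes into ASCII" has to make its own choices about quotes and accents; the Unicode normalization forms do not make them for you.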

> It's really solved, but you would never know if you live exclusively on the command line. Which I think is unfortunate.

Tone down the condescension and insults, please.

Often in code environments, it's hard to tell the difference between " and “.

Don’t despair, there are workarounds for this. See the many answers to https://apple.stackexchange.com/q/137047/21473 ‘Don't want iBooks to always paste the “Excerpt From” of what I have copied’.

I recommend the solution in https://apple.stackexchange.com/a/382603/21473 (which is an improved variant of stassats’s solution in https://news.ycombinator.com/item?id=22355322). The method is to use Automator to set up a Quick Action that copies the text while bypassing the clipboard-modifying behavior of Books. Then you configure a keyboard shortcut of ⌘C for the Quick Action, only in the Books app, so you don’t have to change the way you copy.

If your shell has a function for copy and paste integration (like fish), just override it with a sed script to remove that crap. Same for vim’s clipboard integration.

Is this not a setting you can turn off? I would say it is a reasonable function if it can be toggled on/off (I would prefer default off, YMMV).

I remember that kind of thing as far back as Encarta(?)

Back then, I was writing papers and it was kind of handy.

OneNote does this on the paste end, which is really nice for collecting references and quotes, as you get the source automatically.

“ghci> putStrLn (pretty 10 value)”

    Excerpt From: Bryan O’Sullivan,

My instinct is that this is for legal attribution purposes, i.e. a legal moat.

It sounds ridiculous, but in my experience, there are 1000 things that lawyers will identify as possible sources for legal trolling and this looks like that. Apple has a zillion dollars in the bank and they are sued daily. All it takes is a judge who may or may not be knowledgeable, or a sufficiently grey area ... and you have a major problem.

So it could be risk mitigation: they are including the attribution in the copy so as to not be perceived as partisan to some kind of IP dissolution paradigm.

On the other hand, this could just be one of those wayward culty Apple kinds of things they think is 'good UI' when it's not.

Unlikely - see my comment elsewhere in this post.

So yes, I see in your comment that given the context of 'Apple Books' it's probably a contractual issue, I agree there.

Though I can see that a reasonable legal opinion might not support my more cynical view, when the risk dynamics are high, a different kind of logic creeps in.

I worked for a software platform that refused to provide usable snippets of code anywhere in the documentation for fear of liability.

We also 'perpetually sued' an organization that was infringing on our brand, even though they were really helpful to us (a user-managed fan-site which used our name in theirs) and we otherwise had a positive relationship with them. Our 'perpetual legal action' was merely cover to give the appearance that our brand was being defended, without which action, we could feasibly lose rights to it. So, literally suing people, while dragging it out and 'nudge-nudge-winking them' to not worry about it, literally inviting the people we were suing to events, dinners etc.

I don't think most people understand the risk dynamic in many large organizations with respect to these issues, the calculation seems bizarre even to most regular product types, it really takes a legal view to understand this. And also the personal fears and biases of the executives.

> Our 'perpetual legal action' was merely cover to give the appearance that our brand was being defended, without which action, we could feasibly lose rights to it.

Was that really easier than just licensing your trademark with a strong contract that preserved your rights while letting the website use your name for one specific purpose? Make the licensing costs $1 per decade or something if it is a question of money.

Misusing the legal system in this way seems like it could backfire if the third party did something you genuinely wanted to stop and they could prove your ongoing action was a sham.

I'm not a lawyer, and I was not involved, other than I knew there was a many-years-long legal action regarding branding against another company with whom we had otherwise a really good relationship.

My point is not about branding or lawyers, it's about risk.

Said company gave up a huge amount of money to patent trolls, and their lawyers were empowered to mitigate risk, with the backing of the CEO, their rationale being: "We make a huge amount of money over here, why on earth would we allow that to be risked by speculative activity over there?" which is not entirely irrational, it just depends on implementation.

Everything is so gray, it's so hard to tell. Consider that we have no idea how open-source software licensing will work out because it hasn't been really pushed through the court system, and how limiting that ambiguity is for the entire industry.

The wrongest thing about this, from my perspective, is that my browser fires off a JS 'copy' event when I press Ctrl-C. There are times when I've found it helpful that a browser can copy text to my clipboard when I click a button, but I can't think of a single time when I wanted a site to react to my attempt to copy text off of it.

Is there any way to configure my user agent (Firefox) not to do this? A hack is ok.

Even that won't totally save you. While you can't do exactly this, you can sure get the fun experience of the user getting text they didn't expect in the clipboard when copying without any JS at all.

Text is highlighted on the page based on the code order, not as it appears on-screen (at least in Firefox and Chrome). If you throw elements off-screen with some CSS, you can create a big disparity between what the user thinks they are selecting and what they get when they copy/paste.

e.g: https://jsfiddle.net/Lbc5gsjm/

Not quite as neat as you can't put it dynamically around exactly what you select, but my favourite example of where you could get pretty malicious with it is a “plain text” link that injects a suffix to the hostname when you copy it.

This works when you manually select the text, but it fails when you want to select by double-/triple-clicking. In this case the selection ends before the word “me”.

It’s awkward how easily one can break a basic functionality like select & copy.

There are plenty of other tricks that can be used and won't break the "triple click". I just threw together [1] as an example (and in that one, even if you don't triple click but accidentally select past the end of the line, you are also getting unexpected text).

Combine a few of those techniques with a fancy-looking text box that you are supposed to "click to copy" to get a command, and it becomes pretty easy to even write css-only "exploits" to put stuff in the clipboard!

[1] https://jsfiddle.net/gxosfn83/1/

Yes, that is another problem, although it involves CSS. I suggested before to have an ARIA view, which would cause it to use ARIA properties rather than CSS to decide the rendering. I have not thought of your example though, but it is a feature I would like to have. (ARIA view would do other things too.)

I think the wrongest thing is that this is a clear attack vector... make a site with helpful Linux shortcuts, then replace every copy with "curl malicious script and run it, plus a newline to make it run immediately"

> plus a newline to make it run immediately

I already mentioned this in another comment, but everyone should enable bracketed paste mode in their shell to defend against this.


I always just type "#" first before pasting... That way it is just a comment I can inspect before running

Here, paste this into your terminal:

    echo "hello"
    echo "lol" ; sudo sl -rf /
(In case it's not obvious, the # trick will not help you.)

True, the comment trick does not work for multi-line pastes
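To make the failure mode concrete, here is a rough sketch that models a shell without bracketed paste as submitting each newline-terminated line as it arrives. `simulatePaste` is a hypothetical helper, not a real shell API, and the model ignores quoting and line continuations.

```javascript
// Toy model (an assumption, not a real shell): without bracketed paste,
// every newline in pasted text submits the line typed so far.
function simulatePaste(typedPrefix, pasted) {
  const buffer = typedPrefix + pasted;
  const lines = buffer.split('\n');
  // Text after the final newline stays on the prompt, unexecuted.
  const pending = lines.pop();
  // Lines starting with '#' are comments; everything else would run.
  const executed = lines.filter(
    (line) => line.trim() !== '' && !line.trimStart().startsWith('#')
  );
  return { executed, pending };
}

const payload = 'echo "hello"\necho "lol" ; sudo sl -rf /\n';
const result = simulatePaste('#', payload);
console.log(result.executed);
```

Only the first line is neutralised by the leading "#"; the second line is submitted as a command in this model.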

So... don’t paste it in my terminal?

heh, yes. Admittedly, immediately after posting, I did change it from 'rm', which would not be a good idea to run, to 'sl', which ... well, try it (or `man sl`), it's fun.

If you're using Readline (most Bash or similar shells do), then C-x C-e will invoke the editor, and you can paste the output into that.

Pray for no escape, bang, or control sequences.

(Alternatively: ":r! cat <paste> <ctrl>-D" will read into a vim session.)

Examine the output, trim the unwanted / dangerous bits, and run (or save to a file/script).

Since I run what are ... generously ... considered "bash one-liners" all the time, that key sequence is locked into my muscle memory. It's a convenient way to invoke an editor immediately and run code from it.

> :r! cat <paste> <ctrl>-D

Alternatively, call a utility that can read the clipboard directly. For Xorg:

  :r! xsel -b
For Wayland:

  :r! wl-paste

That depends on whether the clipboard is directly accessible. When invoking an editor remotely via SSH, for example, it's not. Though there's often the ability to paste into the controlling terminal locally.

I do this frequently on Android via Termux, ssh'ing to remote hosts.

Thanks for the safety net tip. I might be more methodical than most, but in copying commands from external sources (ie, ~anything other than my own code or notes), I paste into a local buffer to examine and confirm contents. I do this both as a means to generate documentation and for hygiene per se -- and it occasionally helps to sanity check my intention and prevent a footgun.

Bracketed paste is basically a local buffer (that won’t be executed unless you explicitly do so).
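As a rough illustration of why that works: with bracketed paste enabled, the terminal wraps pasted text in the escape sequences ESC[200~ and ESC[201~, so the shell can tell a pasted newline from a typed Enter. The function names below are hypothetical and the shell side is heavily simplified.

```javascript
// Escape sequences used by bracketed paste in xterm-compatible terminals.
const PASTE_START = '\x1b[200~';
const PASTE_END = '\x1b[201~';

// What the terminal sends to the shell when the user pastes (simplified).
function wrapPaste(text) {
  // Drop any end marker hidden in the pasted text so it cannot close the
  // bracket early (a precaution assumed here; real terminals vary).
  return PASTE_START + text.split(PASTE_END).join('') + PASTE_END;
}

// Simplified shell-side view: bracketed text lands in the edit buffer,
// and only a newline outside the brackets submits the line.
function receive(input) {
  let buffer = '';
  const submitted = [];
  let inPaste = false;
  let i = 0;
  while (i < input.length) {
    if (input.startsWith(PASTE_START, i)) {
      inPaste = true;
      i += PASTE_START.length;
      continue;
    }
    if (input.startsWith(PASTE_END, i)) {
      inPaste = false;
      i += PASTE_END.length;
      continue;
    }
    const ch = input[i];
    if (ch === '\n' && !inPaste) {
      submitted.push(buffer);
      buffer = '';
    } else {
      buffer += ch;
    }
    i += 1;
  }
  return { submitted, buffer };
}

const state = receive(wrapPaste('echo hi\nrm -rf ~\n'));
console.log(state.submitted.length, JSON.stringify(state.buffer));
```

In this model nothing is submitted until an Enter arrives outside the brackets, which is the "local buffer" behaviour described above.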

Thanks for this. I had never heard of it before, but will definitely be playing around with it.

As I also noted in a sibling comment, you don't even need JS to do that: https://jsfiddle.net/eaL153uz/

I might start to be grateful for the screen reading software - these tricks fortunately don't work when you use NVDA or Orca, because they have their own virtual buffer and they intercept the copy command soon enough that the browser has no idea about it.

I always paste into a text document to examine what's there. Then copy from that and paste.

Under X11, Firefox sets the primary selection correctly and middle-click paste is unaffected. This site also renders fine with JS blocked. That is a useful default to prevent this kind of crap from interfering with you.

What is infuriating is CSS that prevents text selection in the first place. That insult to usability should have never been adopted by browsers.

> That insult to usability should have never been adopted by browsers

As a web developer, it's kind of funny seeing people exclaim "this feature should not exist" after seeing a few abuses of it, even though they probably benefit from it every day without realizing it. Overriding copying is needed for wysiwyg editors and selection blocking is needed for many kinds of drag&drop interfaces and also editors.

Instead of complaining that useful features exist because they can be misused, maybe complain about the misuse itself - maybe even to the people actually misusing them. Shoot them an email and they might actually do something about it. Most of these misuses hurt accessibility (screen readers, etc.), so if nothing else, they'll see it as a way to reach more customers.

It can be useful. For example I don't want to select the line numbers next to the code.

> I can't think of a single time when I want a site to react to my attempt to copy text off of it.

I think it is needed for some complex web app to handle copying non-text content. Such as images in wysiwyg editor, Google Sheets/Slides...

So trade those apps not working for immunity from JavaScript clipboard hijacking?

I'd be 110% fine with that trade and nothing of value to me would be lost.

Is it possible in Firefox? Anyone know?

Millions of people use applications with these kinds of features. A few more examples: the Scratch educational programming tool, website builders such as Webflow, diagram editors, image editors, etc. The list goes on and on.

The browser is no longer just a document viewer... That ship has sailed, and overall it is a good thing.

We can mitigate the risk of clipboard hijacking without burning down the house. By the way, I would guess that this is a minor risk in the grand scheme of things, as it seems that the worst risks are for technical individuals copying programmatic commands (e.g. software engineers). For others, it is a real yet minor annoyance.

Perhaps there could be an opt-in setting for allowing a site to modify the default content that is copied from a selection?

> The browser is no longer just a document viewer... That ship has sailed, and overall it is a good thing.

No, it really isn't. Web apps and documents using the same underlying technologies doesn't mean they have to be accessed through a single frontend that provides the worst of both worlds.

> Worst of both worlds

I don't think that is remotely the case.

A decent workaround would be to have 2 clipboards. The regular untouched one and the special one. Then when you paste, apps which only take plain text will grab the regular one and apps which accept formatted copying will grab the special clipboard but also provide a "paste as plain text" so the user gets what they want every time.

You don't have two clipboards? It's been the default on *nix OSes for ages. Ctrl+C/Ctrl+V works, but there is also the select/middle-mouse clipboard. Great for crap websites that hijack one.

It sounds like you're conflating two orthogonal concepts: multiple named clipboard locations, and multiple data types on one clipboard. Both already exist, and are how clipboards on major platforms have worked for decades.

The JavaScript interface isn't aware of these distinctions, though I'm not sure I want it to be. Web apps like this that abuse one plain text clipboard will abuse multiple richly typed clipboards, too.

Those apps are surely already doing that - I don't think there's a way of copying arbitrary structured data to the web's abstraction of the clipboard. It doesn't address the problem the op raised - that browsers can sniff the copy event.

There are already (at least) two clipboards in many platforms.

X11/Xorg has the PRIMARY and CLIPBOARD selections (plus a rarely used SECONDARY).

MacOS has ... whatever it's got.

A key problem with this is that the feature is covert, latent, poorly discoverable, and causes unexpected behaviours. Even presumably advanced users (myself, you, other HN readers) are poorly aware of this. Imagine trying to explain to your nontechnical Aunt Tilly or Uncle Kamlesh about these "different clipboards which treat what you've copied differently and access through this funky interface"?

If you really don't mind it will break some websites, then you really can disable it in Firefox.


Mind you that this breaks a LOT of things. Even stupid stuff like any textbox on Facebook will be broken.

That sounds like a net positive, allowing you to tell apart proper websites from privacy nightmares :-P

Maybe let users grant clipboard permissions, as they currently do for location, microphone, webcam or notifications?

There's the "dom.event.clipboardevents.enabled" preference, which you can set to false in about:config.

I don't know whether that's planned to stick around or whether it was added when the clipboard event support was still experimental and will be removed at some point. But I suspect the former, for precisely the reasons in this thread.

Note that disabling these events will likely break "smart" copy/paste in things like Google Sheets, which is one reason it's not disabled by default....

OK, it looks like I have already disabled that (although I forgot about it until I looked now).

about:config -> dom.event.clipboardevents.enabled = false

You can't do it through a user agent, though

I'm pretty sure this breaks copying from google docs.

I can highlight and ctrl+c without issue in their Writer clone. Never tried any others.

I've only ever seen that setting fix sites that try to prevent you from copying.

I’ve used clipboard events for legitimate uses - for example I use them in this pseudo Turing sandpit / visual programming system to move, copy and export content: https://steam.dance/ (you can hold shift for a selection box and once selected you can copy, etc).

But I wouldn’t be surprised if stuff like this is the exception rather than the rule. I feel like most news sites and blogs are improved with JavaScript disabled.

You're getting far enough into JS knowledge that I don't have.

Do you actually need clipboard events, or just dom.event.contextmenu.enabled for that?

Put another way, on a website like draw.io, you need to be able to hook into copy to allow lassoing and copying a set of shapes (and have them remain editable when pasted).

The entire ask (stopping the copyright injection, and allowing clipboard integration) can be achieved by allowing pages to augment the clipboard data, which can already be done, but not replace it. Pages would still need to be able to hook paste.

Being able to hook paste is a much more reasonable ask. If you have js enabled, they're going to be able to access the content after paste anyway, so it's reasonable to allow notification that a paste has occurred.

Be aware that some websites will completely break when pasting when this is disabled (e.g. Twitter, Facebook, probably more).

Pasting is completely broken on Facebook for me anyway, if I paste at the start of a line it deletes the whole comment box. Originally I thought it was related to this config setting but toggling it made no difference.

So many lost rants ... ;o)

You can turn off JavaScript. This has the added benefit of disabling most tracking and advertising.

Protip: in Safari you can assign a keyboard shortcut to toggle javascript (Preferences -> Keyboard -> Shortcuts -> App Shortcuts -> [+] -> Application: "Safari" Menu Title: "Disable JavaScript" Shortcut: "Cmd+Shift+J"). >90% of websites immediately become massively more pleasant and you can quickly toggle it on when actually needed.

For mobile devices, easiest way is to use two different browsers (one with JS disabled).

I presume this toggle applies to all windows and tabs. As an example, if some page uses a meta refresh tag to periodically refresh it and also has JS (which wasn’t loaded before), the JS would get loaded at a future point when it’s enabled through this option (though it may be for some other tab in focus that the user enabled it for)?

A per tab setting that remembers the JS enabled state would be very useful, with the default being JS disabled.

Yeah, it's global, and unfortunately I believe that Apple has gimped extensions so something like NoScript is no longer possible. If you need granularity or whitelisting of domains, I'm afraid you need another browser. On the glass half full side, being a global switch means it's less mental overhead – you have a single additional modal bit instead of O(tabs).

As well as disabling most of the internet. It makes more sense to turn off the bad apis individually and block known tracking domains.

We've been through the blacklisting argument numerous times: spam, viruses, adware, malware, spyware, browser adblocking, etc.

It ... ultimately doesn't scale. The black-hat namespaces are too large, and treating as TOFU eventually proves untenable.

Advanced features -- anything beyond rendering basic HTML, and I'd be quite prepared to argue for a very limited subset of that -- should be expressly prohibited unless enabled.

The trick is to make enabling reasonably painless and consequence-free (e.g., enable into a sandbox). There are still the problems of users gratuitously enabling anything and everything (especially when prompted through website notifications, pop-ups, phishing and vishing social engineering attacks, etc.), as well as the problem of largely invisible second and higher-order effects.

But the key is to cut down on the effectiveness of such methods, to impose costs on websites for employing them, and to make black-hat attacks through these too expensive by reducing the herd susceptibility to the attacks (the unblocking would have to occur on a one-at-a-time case-by-case basis, e.g., it's expensive and has limited scalability).

I don't think we'll actually see this for another 5-10 years (my usual sane-suggestion take-up lead time, it seems), but It Would Certainly Be Nice To See.

I’ve recently been experimenting with having JS off by default on Brave. Many sites that I don’t regularly use or have an account/relationship with are much more pleasant without JS. There may be some visual breakage in some cases, but I’m there for the main content, which is mostly fine.

I had experimented with NoScript long ago, but found it a bit more cumbersome at that point because (then, before the uBlock days) I couldn’t really judge which scripts were necessary and which weren’t. I’m going to try it again.

One big plus with disabling JS is that all those ad blocker blockers and other annoying popups just don’t even appear, and that adds to a better experience.

>disabling most of the internet

I have JS blocked on some domains because of popups/other annoyances which make getting to the content a hassle. In those cases it's the exact opposite.

Currently, it's a relatively small and spammy portion of it.

> but I can't think of a single time when I want a site to react to my attempt to copy text off of it.

It's useful for things like rich-text editors (e.g. google docs), where the text is never on screen as text that you are able to copy in the first place.

What about the X clipboard, e.g. just selecting text and then being able to paste with middleclick - did you check that as well? I suppose it should circumvent that?

I suspect a complex WYSIWYG editor like Google Docs benefits from this feature since the selection behaviour is heavily customised (and it has to be in order to support most of its features). The browser can't comprehend most of the things you might be selecting and copying from such a document, so it makes sense to replace the clipboard with a more accurate representation.

Use reader mode, no JS is active there, and you can copy the code quite nicely. I always do this when I see this behaviour.

> I can't think of a single time when I want a site to react to my attempt to copy text off of it

Perhaps in a rich text editor I want to copy as plain text; or perhaps I want to make a Markdown editor that you can select+copy the Markdown but then paste HTML. Just brainstorming.
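A minimal sketch of that Markdown idea, assuming the standard `copy` event and its `clipboardData.setData` method. `renderMarkdown` is a deliberately tiny stand-in (bold and italics only) rather than a real Markdown library, and `onCopy` is a hypothetical name.

```javascript
// Tiny stand-in renderer: escapes HTML, then handles **bold** and *italic*.
function renderMarkdown(md) {
  const escaped = md
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
  return escaped
    .replace(/\*\*(.+?)\*\*/g, '<strong>$1</strong>')
    .replace(/\*(.+?)\*/g, '<em>$1</em>');
}

// Copy handler: plain-text consumers get the Markdown source unchanged,
// rich-text consumers get the rendered HTML.
function onCopy(event, selectedMarkdown) {
  event.clipboardData.setData('text/plain', selectedMarkdown);
  event.clipboardData.setData('text/html', renderMarkdown(selectedMarkdown));
  // Without preventDefault the browser ignores setData and copies the selection.
  event.preventDefault();
}

// Wire it up only when running in a browser.
if (typeof document !== 'undefined') {
  document.addEventListener('copy', (e) => {
    onCopy(e, String(document.getSelection()));
  });
}
```

Note that the plain-text flavour here is exactly what the user selected; only the extra HTML flavour is generated by the page.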

If you use Greasemonkey or Tampermonkey, you can add the following to run before any other scripts:

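        // Register in the capture phase on window so this runs ahead of the page's copy handlers.
        window.addEventListener('copy', function(e){ e.stopImmediatePropagation() }, true)

> I can't think of a single time when I want a site to react to my attempt to copy text off of it.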

Any time you're not interacting with HTML text is a time it could be useful.

Specifically, things like copying a cell of data in Google Sheets for example.

Copying from HTML tables is already sane.

Google sheets is not just a HTML table though. And when you click on a cell, you select the cell itself, rather than having to highlight the cell contents. In order for copy paste to work as intended, this makes it necessary for Google Sheets to intercept Ctrl-C and Ctrl-V.

In that case there should be no actual text selected. The problem only occurs when there is text visibly selected and you press your copy shortcut. In those cases the only correct behavior is to copy the selected text.

about:config dom.event.clipboardevents.enabled = false and you're good.

The last sentence of the article is the absolute best:

> Ironically, a little reverse searching reveals that this code was copied verbatim without attribution from this StackOverflow post (see “Manipulating the selection” from the top answer). Maybe copy/paste isn’t so bad?

This happened to me recently and it was pretty embarrassing - I copied a two paragraph snippet from a psychology paper and the website put an ad for CBD oil in my clipboard above the paragraphs.

The segment I copied was just long enough that it overflowed my chat window so I had no idea the insertion was even there and I sent it as-is.

The recipient called me about two minutes later and said "hey, I think you've been hacked - you just sent me a CBD oil advertisement."

So yeah, I'd love to see user agents address this. If I push a copy button (like github's clone repo button) then fine, I'm at the mercy of their javascript. But if I copy via ctrl-C or a right click menu, it should not let the page interfere.

That's kind of hilarious actually.

This highlights the tension between the document-web and the app-web. What if the page is an image editor, word processor, spreadsheet? These app-web pages need custom logic for copy and paste. Unfortunately, bad actors (like what you found) ensure browsers cannot implement this stuff properly, because every feature is now a way to shove a new ad in.

>"the tension between the document-web and the app-web"

This is a huge factor in debates about things like the merits of CSS-in-JS, or the tradeoffs in "JAMstack" architecture. Pick any polarizing facet of web development and odds are you'll find this tension at the heart of the opposing perspectives.

And it's not black/white either. It's a spectrum. There are plain HTML documents on one hand, and highly dynamic applications like Figma or Google Sheets on the other hand, but in between are interactive documents and anything you can think of.

So these features are here to stay.

That just demonstrates that ads are not web-specific, but actor-specific, so that adblock technology should work at the OS level^, like firewall and antivirus. This would shift the problem to bad adblock vendors, but at least we could have few okay-today options to choose from.

^ not gonna happen for most platforms, as it’s walled gardens all the way down, but hopefully some day it will become obvious that adcrap tech stack is a wrong foundation to make computers and their economy.

It is possible to allow websites to implement custom logic within them without letting them mess with copy & paste going out of these sites.

> I'd love to see user agents address this

Use Firefox, go to reader mode, copy/pasta whatever you want without worries

This hijacking “feature” that’s been a part of news and other proprietary content sites for more than a decade now:


Anecdotally speaking, I’ve seen it a lot less these days; mostly, I see it when copy-pasting from my Kindle app. I’ve rarely ever seen it for a site this general-purpose (and trivial) though

It’s been around for so long a clipboard-jacking-as-a-service company, Tynt (https://www.crunchbase.com/organization/tynt), launched, lived and exited.

If telling people not to do this was going to be enough to solve the problem, we wouldn’t still be talking about it. The browser vendors really need to shut it off once and for all.

For the lazy but curious, here's a random Wayback Machine link to their former homepage: https://web.archive.org/web/20140207123808/https://tynt.com/

And they have 100-250 employees?!

I'm sure most of it was just keeping up with the latest JS standard

Most of them are probably in sales.

> You can set up a minimum length, so not every word copied will be modified by the following, preventing a negative user experience when using your site.

You know what would prevent a negative user experience? Not adding things to my clipboard that I don't want. (Thankfully, that site didn't seem to.)

I can read & write a language that I don't have a keyboard for (I use a standard US 101 key layout), so I work around this by cut & pasting single characters around to fix up the missing accent characters. Usually even a partially fixed word is enough for the spell checker to kick in and correct the rest, making this surprisingly fast.

The thing that annoyed me recently is that if I cut & paste a single character in Google Mail, it'll "helpfully" surround it with spaces when pasted. Other applications like Firefox or MS Word don't have this behaviour.

I just tested it, and the GMail behaviour is wonderfully inconsistent:

Copying from GMail to plain text (e.g.: a text editor) will result in a single character.

Copying from plain text into GMail will result in no superfluous spaces.

Copying from GMail to GMail or to any "rich text" editor will add the extra spaces.

So basically I'm copying plain-text, but GMail is abusing rich-text formatting to add "magic" behaviour someone at Google assumed is beneficial, but is just confusing and arbitrary.

PS: Similarly, GMail's spell checker now "fights" with the Firefox spellchecker and the result is that neither wins and obvious typos aren't resolved unless I turn checking off and back on a couple of times.

> I work around this by cut & pasting single characters around to fix up the missing accent characters.

Let me save you some time in the future. You did not identify the OS and language/script you have trouble with, so my answer has to be kinda generic.

First have a look whether your OS keyboard settings or a third party provides an en_US-101 compatible logical layout that adds more accented characters (typically accessed with right Alt¹) or extra modifier keys² (in addition to acute, grave, tilde), then install it and memorise the characters you need. Oftentimes, this is easy because it's mnemonic, e.g.

    AltGr+l → ł
    AltGr+d → ð
    AltGr+o → ø
    dead_tilde N → Ñ
    dead_double_acute o → ő
    dead_circumflex dead_hook O → Ổ
If that's not sufficient or too cumbersome, look into compose key³. That system is user extensible, more powerful because it provides not just accented characters, and to the most part has better mnemonics; e.g.:

    compose U G → Ğ
    compose " y → ÿ
    compose a e → æ
    compose t , → ț
    compose v z → ž
    compose I . → İ
    compose u o → ů
    compose c | → ¢
    compose . . → …
¹ http://enwp.org/AltGr_key ² http://enwp.org/Dead_key ³ http://enwp.org/Compose_key

It seems strange that compose u o → ů is u-with-circle-on-top, but compose U G → Ğ is G-with-u-on-top, since it means it's sometimes modifier-first and sometimes base-first.

(FWIW I use the compose key! Just not those particular characters, my set is incidentally consistent :-))

Order does not matter with compose, that also makes it more user friendly than dead keys.

The modifiers' proper names are "ring above" and "breve".

Whoa, thanks! I've used the compose key for a long time (I'm an amateur Polish speaker who also occasionally writes French) and I never realized it was order-insensitive!

It's order-sensitive under X11 but many of the default combinations are duplicated with the reverse order.

My bet is that there simply is no standard for anyone to adhere to at Google. It's a rather granular and specific artifact of UI, in a company not necessarily known for its UI prowess or consistency, at least along that vector.

I'm wondering if some of these issues could be resolved with some solid industry standards.

You’d think “when a user selects text and runs the copy command, that text should be written to the user’s clipboard” would be a solid enough industry standard.

I've gotten into the habit of "pasting as plain text", i.e. cmd/ctrl+shift+v for any rich text editor.

Furthermore don't touch my ability to paste into web forms. Some banks do this, and I have no idea why (some incredibly misguided idea of security?). I disabled the ability for websites to disable pasting using Firefox's about:config, but 99.9% of users won't know they can do this.

> I disabled the ability for websites to disable pasting using Firefox's about:config, but 99.9% of users won't know they can do this.

Have you encountered any unintended consequences from disabling this?

Setting dom.event.clipboardevents.enabled;false mostly results in errors of functionality with WordPress, Google Docs, and Facebook (but I suspect FB has other motivations/reasons for this). You might encounter errors with other WYSIWYG editors, but for the most part, it's a nice optional feature to have control over.

None whatsoever. I've had it disabled for about two years.

Do you know the key for this? Can't find it with a quick search.

You want dom.event.clipboardevents.enabled set to false

Does this also work the other way? Prevent copyjacking?

Yes, at least for the specific code discussed in this example.

Excellent, many thanks.

Just want to share that there's also good reasons for employing this technique.

Eg if you select text in Slack that has emojis or formatting, it tries to put plain-text into the clipboard that would produce said emojis and formatting when pasted back into Slack. This is great, and every chat app that does formatted stuff should do this.

I'd hate it if browser vendors would attempt to block this and as a result, break that kind of functionality too.
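Roughly what that kind of copy handler has to do, sketched under assumptions: the segment model and the `toPlainText` name below are made up for illustration and are not Slack's actual internal format.

```javascript
// Hypothetical message model: an array of typed segments. The copy handler
// would serialise the selected segments back into the text a user would
// have typed to produce them.
function toPlainText(segments) {
  return segments
    .map((seg) => {
      switch (seg.type) {
        case 'emoji':
          return ':' + seg.name + ':';
        case 'bold':
          return '*' + seg.text + '*';
        case 'code':
          return '`' + seg.text + '`';
        default:
          return seg.text;
      }
    })
    .join('');
}

const message = [
  { type: 'text', text: 'Deploy is ' },
  { type: 'bold', text: 'done' },
  { type: 'text', text: ' ' },
  { type: 'emoji', name: 'tada' },
];
console.log(toPlainText(message)); // Deploy is *done* :tada:
```

Pasting that string back into a chat input that understands the same shortcodes would reproduce the formatting, which is the round trip described above.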

Except that your example is only “great” because Slack is reimplementing text input- another thing that maybe should just be handled by the system.

Is it great that Slack also has its own autocorrect and doesn’t know about the system autocorrect?

I'm not sure how autocorrect is related.

How would you let text input with emojis and formatting and code blocks be "handled by the system"? I agree that on most platforms, emojis are a solved problem (although the UX could be better), but the rest isn't.

Sure, like all chat apps, what Slack does is a workaround around system limitations. But the limitations are there, and they need the "copy" event to make their hack work end to end.

> Except that your example is only “great” because Slack is reimplementing text input- another thing that maybe should just be handled by the system.

But this won't allow you to have consistent emoji across platforms, nor does it allow server-specific emoji. Besides, I found typing :emojiname: is much easier than finding an emoji in the emoji keyboard.

That's just needless hackery from Slack, though.

I should be able to decide if I want to allow apps to mess with my clipboard, and the default should be "no".

I agree and share the frustration.

A stock Apple application "Books" does something similar as well. When copying text -- using keyboard shortcut or the context menu rendered as soon as you lift your mouse (another tragedy) -- your clipboard will look something like the following (2 lines):

“<what you actually wanted to copy>”

Excerpt From: <Author>. “<Title>.” Apple Books.

That gets written to your clipboard; quotes and everything.

In case you haven’t seen the discussion in the sibling comment tree https://news.ycombinator.com/item?id=22353121, there are solutions to this behavior of Books. Particularly, see my answer https://apple.stackexchange.com/a/382603/21473 to the question ‘Don't want iBooks to always paste the “Excerpt From” of what I have copied’.

as does the kindle app. it's come in handy once for me (in a comment to a hn post, ironically), but usually it's a nuisance.

I loved it when I used Windows and OneNote (it would add metadata when I pasted to OneNote but not otherwise.)

I think the windows clipboard can hold that as metadata. OneNote had a setting that I could enable so that when I pasted from Edge into OneNote the url was appended, but the url wouldn't show up if I pasted into another rich text editor like Word (or OneNote without the setting enabled).

Using org mode now, and there's probably a way to do it with emacs lisp, but... I'll never get around to it.

The Windows clipboard has a model where the clipboard contains an abstract 'data object' which exposes a list of supported formats, then you request formats individually. So a given piece of data could expose (not literally this, but the equivalent) text/plain, text/html, image/png, $my_custom_format and then when an application requests a given format it's generated on demand.

This mostly gets used for scenarios where you're pasting into notepad vs into word - the latter will get rich text if it requests it, while the former is just going to request plain text and get it. When copying text or images or HTML out of typical apps, the clipboard actually ends up having 6 or more different formats in it that all represent the same source content.

The 'generate on demand' means that it's theoretically possible to sync clipboard operations over the network transparently for stuff like Synergy, which is pretty cool - no need to copy that big bitmap over unless software on the other machine actually asks for it. I had some custom clipboard sync software I wrote that did this automatically (with a progress indicator for big data like desktop screenshots) in a couple thousand lines of C# and it was pretty satisfying to use.
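A toy model of that "advertise formats now, render on request" idea, for anyone who has not seen it. The class and method names are hypothetical and this is not the Windows API, just the shape of the design.

```javascript
// Toy clipboard data object: formats are listed up front, but each one is
// rendered only when a consumer actually asks for it.
class LazyClipboardData {
  constructor() {
    this.renderers = new Map();
    this.renderCount = 0;
  }

  // Register a format together with a function that produces it on demand.
  offer(format, render) {
    this.renderers.set(format, render);
  }

  // Cheap: lists what is available without generating anything.
  formats() {
    return [...this.renderers.keys()];
  }

  // The expensive work happens here, and only for the requested format.
  get(format) {
    const render = this.renderers.get(format);
    if (!render) return undefined;
    this.renderCount += 1;
    return render();
  }
}

const data = new LazyClipboardData();
data.offer('text/plain', () => 'Hello');
data.offer('text/html', () => '<b>Hello</b>');

// A plain-text editor requests only text/plain; the HTML is never built.
const pasted = data.get('text/plain');
console.log(data.formats(), pasted, data.renderCount);
```

The network-sync case falls out of the same shape: the remote side only needs the format list until something there calls the equivalent of `get`.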

This is how X11's modern clipboard APIs (CLIPBOARD, PRIMARY and so on) are defined as well, I assume that Apple's approach is similar.

If I had to guess I'd assume either X11 invented it, or a research lab prototype had it and X11 copied that then was copied by every modern OS

Seriously... I want to know the business/legal logic behind adding copyright messages to the clipboard when copying. This has happened for a long time with news sites, Apple Books, etc.

No user ever has ever wanted that.

So what lawyers, where, ever demanded it, and why? Short snippets fall under fair use anyways... and even if they didn't such a message doesn't prevent anything (you just delete it after pasting)... and if you were oblivious to how copyrights worked before, this isn't going to teach you.

So it's UX annoyance but why? Not only does it not provide an obvious legal benefit to any party, I don't even see how it's legally covering anyone's ass? Like, I know how under trademark law companies have to warn people against using their trademark generically or else they can lose it -- so as dumb as it is to get an e-mail from Adobe asking you not to use "Photoshop" as a verb in your press release, I get it. But copyright... doesn't work like that.

So how/why did this become a thing? I just don't understand the legal rationale here.

Attorney here! (Not legal advice; consult a licensed attorney in your jurisdiction.)

I am skeptical that this is a copyright issue that raised an attorney's attention. It's far more likely IMO that this was a contractual obligation imposed by the publisher.

I have no insider knowledge as to whether this is actually true, but it's quite probable that in exchange for allowing Apple Books to republish their content, the publisher required Apple to append this attribution when copying content into the clipboard. In theory, preserving such an attribution might cause more copies to be sold.

Also, contrary to your assertion, not every short snippet qualifies for a Fair Use defense; there's a four-factor test that courts apply, and the length is just one of those factors. But again, I think this has less to do with copyright and more to do with a business arrangement.

> I have no insider knowledge

I don't either, with the added bonus of a complete lack of legal training, but this clipboard thing has been a part of commercial e-reader apps for so long that, if your (very plausible-sounding) theory is right, it's been boilerplate in such contracts for many years.

It wasn't about copyright or fair use, at least not directly. The intended purpose of this sort of code was to:

1. Make sure many people who shared content from your site would link back to the source

2. Track shares where the full link to the page wasn't copied

3. Trick scrapers/careless plagiarists into linking back to the original source, using their own carelessness to get an extra backlink or two (and tell Google your site was the original).

All of these obviously only worked on people that didn't check what they were copying, but hey, those were some of the reasons behind it.

It's just advertising, surely, like Apple's insistence on "sent from my iPhone" type footers; annoyance is the point.

This isn't just the web, there's a fundamental leakiness to the Mac's clipboard. I was horrified to realise that apps were being alerted to what was on the clipboard when I copied a public key from some website and MacGPG (or whichever GPG app it was, they change) popped its head up and told me (something like) "Hi, you've copied a public key, would you like to save it in your keyring?"

Does this mean that any currently running app (or maybe not even currently running, perhaps there's an event system) can see my clipboard?

My feeling about this is unprintable.

I started planning a clipboard app that doesn't allow this but I'm busy and I'd prefer this wasn't something I need to fix in the first place.

I believe that the clipboard on Windows and in the X server works the same way. Programs can also read most files, including files that contain private keys.

I really care about security and lament that most people don't, but maybe they've the right idea and I'm just wasting my time. There are simply too many holes to plug :/

How do you want the clipboard to work then?

To my mind, the whole point is to provide a way to move information within and between applications.

Here's a stab at defining a function:

The clipboard should act only at user direction to copy content from one application or context to another.

The clipboard should not, nor should applications be able to, alter the copied content from the visibly-selected content.

Applications should not, other than when clearly and unambiguously directed by the user, be able to access or read clipboard contents.

I'd suggest additionally that it should be possible to examine and edit within the clipboard context itself what was copied.

This creates a few obvious issues, one of which is that commandline and programmatic tools for interacting with the clipboard ... won't function as transparently as they do now. A fact which would affect me directly as I make heavy use of these (xclip in Linux, pbcopy / pbpaste in MacOS, termux-clipboard-set and termux-clipboard-get in Termux/Android). I think I'd be reasonably comfortable with a confirmation dialog appearing in such cases, or having those applications specifically exempted (convenient, though some risk).

The problem of programmatic interfaces to the clipboard is another matter, and those are ... probably a complex issue.

Note that when working entirely within the shell, the issue largely disappears as the inter-process communications method is largely pipelines, files, or the shell environment (variables) themselves. With some exceptions, such as gpm(8) (a cut-and-paste utility and mouse server for virtual consoles, in Linux).

Though there's also behaviour of the X11/Xorg or Wayland clipboards.

> The clipboard should not, nor should applications be able to, alter the copied content from the visibly-selected content.

This might feel intuitively right, but it severely limits the usefulness of the clipboard.

It then becomes a basic plain text clipboard.

Try opening a rich text editor (e.g. https://quilljs.com/playground/) and selecting two words of which one is bold. Hit Ctrl+C. What is now on the clipboard? What happens when you paste? You have your formatting preserved. The editor has intercepted the copy command and stored data only it knows how to create and parse. The same would happen in a diagram editor (a selected shape would perhaps be stored as some json representation), or an image editor.

The problem is that the clipboard as an established concept already means "area where programs write their custom formatted data and where any app can read the same data upon paste".

Restricting it might be a good idea in some cases, but it's the user expectation so it's what apps (including browsers) need to do as the default. This unfortunately means that abusing it as in the article will be possible. There is no way for the browser to know whether the changed content was formatting tags (good) or trashing the selection by adding a copyright (bad).
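
To make the rich text case concrete, here is a small Python model of what the editor effectively does on copy and what a paste target does afterwards. It is only an illustration under my own assumptions: `copy_selection` and `paste` are hypothetical helpers, and the format names are borrowed from MIME types for readability, not taken from any particular clipboard API.

```python
import html
from typing import Dict, List, Tuple


def copy_selection(words: List[Tuple[str, bool]]) -> Dict[str, str]:
    """Build the clipboard payload for a selection of (text, is_bold) words."""
    plain = " ".join(text for text, _ in words)
    rich = " ".join(
        f"<b>{html.escape(text)}</b>" if bold else html.escape(text)
        for text, bold in words
    )
    # The same selection is stored in several representations at once.
    return {"text/plain": plain, "text/html": rich}


def paste(clipboard: Dict[str, str], accepted: List[str]) -> Tuple[str, str]:
    """Return the first format the target understands, in its order of preference."""
    for fmt in accepted:
        if fmt in clipboard:
            return fmt, clipboard[fmt]
    raise LookupError("no compatible format on the clipboard")
```

A rich editor would call `paste(clip, ["text/html", "text/plain"])` and keep the bold, while a terminal would ask for `["text/plain"]` only. Note that the markup never appeared on screen, which is exactly the case the "visibly-selected content" rule has trouble with.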

If you want to pass rich text or spreadsheet cell formulae, display those before selecting. Which puts the onus on the application to provide that functionality.

That preserves the functionality, respects the "copy visibly-selected content" directive, and makes clear just what is being saved to the clipboard, making sneak attacks more difficult.

Argument that the clipboard behaves in a way that is demonstrably prone to malicious attack is simply argument from tradition. Yes, that's how things have been done. We're discovering that how things have been done leads to strongly negative consequences.

Not sure I follow, how should the application display the rich text (markup/formatting instructions) before the selection?

The displayed/selected content might be

Foo Bar

but the content I want on the clipboard could be

Foo <i>Bar</i>

but I never want to see the markup, only the formatted text. I don’t want to make a two step function where I need to reveal a textual description of the content and select that. The markup might be a base64 encoded piece of binary gibberish in the case of a visual diagram for example.

That would be unsupported functionality.

You cannot both have transparent copy capability and copy hidden content without revealing it.

Copying visual content would be subject to different requirements and limitations. But for text: what you see is what you get. If you're copying glyphs alone, those are what are copied. If you want formatting, you'll need to have the source application reveal that.

It could be an option that is revealed after a warning. "This page wants to use the javascript 'copy' event to place transformed data on the clipboard. If you accept it would place [data shown] on the clipboard, otherwise it would place [raw text] on the clipboard". What do you want to do? copy text? copy data? [ ]remember my choice for this site.

Here, data shown would probably be some info about the data rather than the data itself. It could be a 2mb base64 bitmap...

However, perhaps a better alternative would be to offer both "copy text" and "copy" where the former just copies selected glyphs as plaintext while copy fires the js event allowing the transform.

The important thing to remember is that webpages in 2020 must work like users expect desktop applications to work, and interact with desktop applications (e.g. copy rich content from webpage to desktop must be a default enabled feature or the user will consider the browser broken). For this reason, I don't think it's a viable solution to disable the js event by default (i.e. to hook Ctrl+C to the "copy text" function).
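
As a rough sketch of when such a warning could fire, here is a Python heuristic that compares the text the user selected with what the page tries to put on the clipboard. Everything here is an assumption for illustration: `review_copy` is a made-up function, the tag stripping is a naive regex rather than a real HTML parser, and it can only judge text formats, so anything opaque always falls back to asking.

```python
import html
import re
from typing import Dict

TEXT_FORMATS = ("text/plain", "text/html")


def visible_text(fmt: str, data: str) -> str:
    """Reduce a text payload to the characters a user would see."""
    if fmt == "text/html":
        # Naive tag stripping; good enough for a sketch, not for real HTML.
        data = html.unescape(re.sub(r"<[^>]*>", "", data))
    return " ".join(data.split())  # collapse whitespace differences


def review_copy(selected: str, offered: Dict[str, str]) -> str:
    """Return "allow" if every offered format matches the selection, else "ask"."""
    expected = " ".join(selected.split())
    for fmt, data in offered.items():
        if fmt not in TEXT_FORMATS:
            return "ask"  # opaque data (images, app-specific blobs) cannot be checked
        if visible_text(fmt, data) != expected:
            return "ask"  # something was added, removed or replaced
    return "allow"
```

Formatting-only changes such as `Foo <i>Bar</i>` pass silently, while an appended "Read more at ..." line trips the prompt. It would not catch everything (invisible characters, for one), so treat it as a starting point.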

How would you institute that in the clipboard logic and independently of the app?

Because we're no longer operating in a world in which apps or processes are trusted or trustable. Maybe the ones you write, maybe if you're really lucky the ones that come with your fully-vetted Linux distro.

But not npm installs, not proprietary binaries, not website logic, and most especially not the crap that's distributed on mobile app stores.

Your OS, you've got to trust. Which means that the logic's in the clipboard.

And how can the clipboard know that the application is attempting to change contents such that the clipboard won't receive what it is that you see?

That's a key reason I see this as something that 1) has to be in the clipboard logic and 2) has to exclude applications entirely from the copy process. The clipboard should be acting on, say, the graphics render layer directly, outside the application's scope.

The clipboard can’t tell whether the content is “right” nor can the copying be done by the OS/clipboard itself.

In general the application is the only thing that knows what is selected (e.g objects in a cad program) and the OS has no idea of how to serialize these into the clipboard.

Even a text editor has to tell the OS what is selected and the OS can only trust the app to tell the truth.

The only way to “verify it” is to show the copied content to the user (e.g. in a notification after the copy). Obviously for anything but plaintext this verification doesn’t help (I can’t tell serialized cad objects from something else - it’s just gibberish). I can however verify that it’s not a script that will erase all my files when pasted into a shell (btw executing on paste is a horrible behavior by a shell. Pasted data must be treated as untrusted and verified before acted on. Immediate shell execution breaks that).

This is getting beyond my paygrade, though it in part depends on the OS display system.

There are some concepts -- and I very barely grasp this -- such as Display Postscript, not in present use AFAIU, which might offer such capabilities within the windowing system.

That is, with DPS the display itself would have awareness of both the underlying text and the formatting directives.

Whether that's even remotely similar to existing graphical systems, I've no idea.

See: https://en.wikipedia.org/wiki/Display_PostScript

In the most general case (and more commonly than it ever was) each app is just a rectangular area where it renders using GPU hardware. This used to be how games rendered but now even shells and text editors are getting there. If one has to manage this case too then basically the only places one can copy from are in the compositor (images, i.e. screenshots) or from apps. Even when text drawing is passed through the OS (or platform libs, e.g. GDI on Windows) it’s difficult to imagine APIs where the OS would know what is “selected”, i.e. what should be passed to the clipboard.

Interestingly, I can copy text from Pages to Word and the formatting is (somehow) preserved, even though this version of Word predates Pages by a few years.

Perhaps the clipboard contents are somehow tagged with their source so the destination can do something intelligent with its contents? This would potentially provide another way to lock down the clipboard: fishy photo filter can get access to an image on the clipboard (or maybe only images from certain sources), but not text.
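
A toy version of that lock-down idea, in Python. This is purely hypothetical: I have no idea whether any OS clipboard works this way, and `TypedClipboard`, the app names and the `"image"` / `"text"` kinds are all invented for the example.

```python
from typing import Dict, Iterable, Optional


class TypedClipboard:
    """Toy clipboard where each app is granted access only to certain content kinds."""

    def __init__(self, grants: Dict[str, Iterable[str]]) -> None:
        # grants maps an app name to the kinds of content it may read.
        self._grants = {app: set(kinds) for app, kinds in grants.items()}
        self._items: Dict[str, bytes] = {}

    def copy(self, kind: str, data: bytes) -> None:
        # A new copy replaces whatever was there before.
        self._items = {kind: data}

    def read(self, app: str, kind: str) -> Optional[bytes]:
        if kind not in self._grants.get(app, set()):
            raise PermissionError(f"{app} may not read {kind} from the clipboard")
        return self._items.get(kind)  # None if nothing of that kind was copied
```

So a `photo_filter` app granted only `"image"` gets a `PermissionError` when it asks for text, and simply gets `None` for images while a password is what's currently on the clipboard.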

That’s a good start, but it’s tricky.

How are you going to distinguish between user and programmatically-initiated actions? Adding something like a “Secure Copy Key”, akin to control-alt-delete in Windows might work, but it’ll need kernel-level support.

Determining whether the clipboard contents faithfully represent the “visibly-selected” content is also very hard, possibly AI-hard. Suppose I copy an image while zoomed? What size should the clipboard contents be: the original size or its apparent size?

One mitigation, if you’re worried about this now, might be to run certain programs as another user. I don’t think the clipboard is shared then.

> How are you going to distinguish between user and programmatically-initiated actions?

That is a deep and fundamental problem within any mediate technology. It's not trivial.

The simple answer is "look for indications from standard user inputs". But those inputs (keyboard, mouse, touchscreen) are themselves sufficiently complex that they can be mimicked, intercepted, or spoofed. There is the problem of distinguishing legitimate from counterfeit confirmation dialogues (already a persistent attack vector for desktop and mobile device users). There is the problem of rogue devices communicating over USB or Bluetooth connections (an argument for a principle of minimum necessary capability for interface ports -- serial and PS/2 connectors have their justifications), though that usually entails other devices being silently added to a system. Though a USB device spoofing an additional keyboard or mouse is also a demonstrated attack.

All of which starts drifting focus away from the key point: what is an unambiguous expression of user intent, and how would you go about ensuring that this is determinable?

The determination of contents question gets to a somewhat different matter: what kind of data are being copied?

I'll admit that I was thinking of the case of text, though there is also image, and conceivably audio and video data.

For text, zoom is irrelevant as that's a display artefact and (at least as I envision it) the goal is to copy the text as displayed, independent of typographical formatting, rather than "glyphs of some size and presentation".

Multi-user clipboard access is dependent on the graphical environment. For X11/Xorg, applications regardless of effective userID, have access to the clipboard.

(X11 also had some early attempts at securing the clipboard, with varying degrees of success. Issues such as grabbing keyboard input, potentially silently, were also an early concern.)

Would it be so hard to say “when I press command-v, send a paste event to the focused window containing my clipboard contents”? Sure some programs would want to customize keybindings, so unrestricted access could be allowed on a per-app basis, but why should any app I install have access to all of my copy events ever?

Similarly for private files: MacOS already alerts you when something tries to read from “~/Desktop” for the first time, why not allow users to extend that to “~/.ssh” and “~/.gpg” too?

Yup. The fundamental problem is the evil apps that steal your keys and add ads to the text you copy. You can't take away the means for them to do it without neutering the entire system itself (an experiment in doing just that is happening on the web and in mobile space right now).

Yes, Mac apps have access to the clipboard; that's why they can paste. Is Windows or Linux different? Honest question.

Please do not give Apple ideas for an "App would like to access your clipboard" dialog.

What's wrong with the system being push instead of pull?

On this key combination (default cmd+V), copy the data stored in this clipboard to the currently active app's own paste buffer, which will then handle inserting the data


All apps can read and write to the clipboard at all times

World readable and writable files that often carry sensitive information sounds like a stupid idea to me.

Because the clipboard is not only accessed with the keyboard. Programs need to be able to access the clipboard so that when the user presses the "paste" button the application can grab that data and insert it.

The clipboard is a program. Why can the applications that want the data not make an API or file available that the clipboard uses to pass the data into, and then the app does what it will with the data? e.g.

    echo "this is on the clipboard" | cat -
cat doesn't grab anything, it receives via a standard API, both its own and the pipe. Now imagine echo is a clipboard program that exposes no API that cat can use to access data at cat's behest.

Note: As this was informative I hereby declare this a non-useless use of cat.
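
The same push idea as a slightly longer sketch in Python, in case the shell one-liner is too terse. The names (`PushClipboard`, `App`, `on_paste`) are made up, and a leading underscore is obviously not a security boundary; in a real system the clipboard would be a separate process and the absence of a read call would be enforced by the OS.

```python
from typing import List, Optional


class App:
    """An application that can only receive paste data, never fetch it."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.received: List[str] = []

    def on_paste(self, data: str) -> None:
        # The clipboard calls this; the app decides how to insert the data.
        self.received.append(data)


class PushClipboard:
    """Clipboard with no read API: data is delivered only on a user paste gesture."""

    def __init__(self) -> None:
        self._data: Optional[str] = None
        self._focused: Optional[App] = None

    def user_copy(self, data: str) -> None:
        self._data = data

    def focus(self, app: App) -> None:
        self._focused = app

    def user_paste(self) -> None:
        # Push the contents to whichever app has focus, and to nobody else.
        if self._focused is not None and self._data is not None:
            self._focused.on_paste(self._data)
```

A background app that never has focus during a paste never sees anything, which is the whole point. An in-app "paste" button would still need some trusted way to trigger `user_paste`, and that is the hard part.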

Most importantly, https://emdash.fan/ is the best emdash site on the internet.

My go-to site for copying special symbols is Wikipedia, which is, of course, well-behaved:

Dashes https://en.wikipedia.org/wiki/Dash

Diacritics https://en.wikipedia.org/wiki/Diacritic

Precomposed Latin characters https://en.wikipedia.org/wiki/List_of_precomposed_Latin_char...

Currency symbols https://en.wikipedia.org/wiki/Currency_symbol

Japanese typographic symbols https://en.wikipedia.org/wiki/List_of_Japanese_typographic_s...


MS Word will autocorrect two hyphens into either an en dash or an em dash based on what it believes you were trying to type.

So this: "em--dash" becomes this: "em—dash" while this: "en -- dash" becomes this: "en – dash"
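
Expressed as code, the rule as described in this comment would look something like the Python below. To be clear, this only encodes the two examples given here (no spaces gives an em dash, spaces on both sides give an en dash); it is not Word's actual autocorrect logic, and `autocorrect_dashes` is a made-up name.

```python
import re

EM_DASH = "\u2014"
EN_DASH = "\u2013"


def autocorrect_dashes(text: str) -> str:
    """Replace double hyphens following the two patterns described above."""
    # word--word (no spaces) becomes an em dash
    text = re.sub(r"(?<=\w)--(?=\w)", EM_DASH, text)
    # word -- word (single spaces on both sides) becomes a spaced en dash
    text = re.sub(r"(?<=\w) -- (?=\w)", f" {EN_DASH} ", text)
    return text
```

Single hyphens are left alone, so "well-known" and "a - b" pass through unchanged.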

On macOS

  ⌥- = en dash
 ⇧⌥- = em dash

   ⇧ = shift
   ⌥ = option
   - = hyphen

Now those are the symbols that I find myself going to a WWW page to copy and paste. Fortunately, it's my WWW page.

* http://jdebp.uk./FGA/iso-9995-7-symbols.html

En dashes are for ranges and sometimes for compound modifiers, like "billiard-ball–size hail" or "New York City–based attorney." Not sure who advised MS Word to turn two dashes into an en dash, ever, but it's not correct. Whether an em dash has spaces around it is a matter of style.

Word will only change it if you've used two hyphens, it won't autocorrect the examples you have.

MS has never worried about correctness :)

Use the GNOME Character Map¹ program, or your local equivalent². Not everything needs to be a web site.

1. https://wiki.gnome.org/Apps/Gucharmap

2. https://wiki.gnome.org/Apps/Gucharmap#See_Also

On a related note, I find it interesting how modern US-ANSI layout keyboards (even the International version [1]) still have such a limited set of characters.

We have ¼, ½, and ¾, but not en (–) and em (—) dashes; and I get that (') and (") are leftovers from typewriters and ASCII, but wouldn't it be nice to have proper 6-9 (‘…’) and 66-99 (“…”) quotation marks?

Then again, you often see Europeans online misusing acute (´) and grave (`) accents as apostrophes (writing don´t or don`t, instead of don't or don’t). So perhaps the availability of more similar looking keys would just lead to even more misuse?

[1]: https://upload.wikimedia.org/wikipedia/commons/2/22/KB_US-In...

I chuckled when I saw the copyright. What exactly are they copyrighting? Lorem ipsum? The bloody emdash itself?

Even I exercised more caution when I was 12 and learning HTML, arbitrarily applying copyright to things I didn't own.

My favorite demo on this subject:


Actually my terminal emulator (xfce4-terminal) added a preview + confirmation dialog a few months ago for copy-pastes including newlines (i.e. which will execute a command instantly). I thought that it was to avoid accidental mistakes, but it makes even more sense that it is to avoid this.
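
The check itself is tiny. Here is a Python sketch of the kind of logic I imagine is behind it; this is a guess at the behaviour, not xfce4-terminal's actual code, and `needs_confirmation` and `paste_into_terminal` are names I made up.

```python
from typing import Callable


def needs_confirmation(pasted: str) -> bool:
    """A paste containing a line break could run a command without Enter being pressed."""
    return "\n" in pasted or "\r" in pasted


def paste_into_terminal(pasted: str, confirm: Callable[[str], bool]) -> str:
    """Return the text to send to the shell, or "" if the user rejects the preview."""
    if needs_confirmation(pasted) and not confirm(pasted):
        return ""
    return pasted
```

`confirm` stands in for the preview dialog: it gets the full pasted text so the user can see any lines that were not visible in what they selected.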

> copy-pastes including newlines (i.e. which will execute a command instantly)

You should enable bracketed paste mode in your shell. IMO it’s much better UX than a confirmation dialog.
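
For context on how it works: once the shell turns the mode on (by sending `ESC[?2004h`), the terminal wraps anything pasted in `ESC[200~` ... `ESC[201~`, so the shell can tell pasted bytes from typed ones and hold the text on the command line instead of running it at the first newline. A minimal Python sketch of the receiving side follows; `split_input` is a made-up helper, not code from any real shell.

```python
from typing import List, Tuple

PASTE_START = "\x1b[200~"  # sent by the terminal before pasted text
PASTE_END = "\x1b[201~"    # sent by the terminal after pasted text


def split_input(stream: str) -> List[Tuple[str, str]]:
    """Split terminal input into ("typed", text) and ("pasted", text) segments."""
    segments: List[Tuple[str, str]] = []
    pos = 0
    while pos < len(stream):
        start = stream.find(PASTE_START, pos)
        if start == -1:
            segments.append(("typed", stream[pos:]))
            break
        if start > pos:
            segments.append(("typed", stream[pos:start]))
        body_start = start + len(PASTE_START)
        end = stream.find(PASTE_END, body_start)
        if end == -1:
            # Unterminated paste: treat the rest as pasted rather than typed.
            segments.append(("pasted", stream[body_start:]))
            break
        segments.append(("pasted", stream[body_start:end]))
        pos = end + len(PASTE_END)
    return segments
```

A shell would then insert "pasted" segments into the edit buffer verbatim, newlines included, and only execute when a "typed" Enter arrives.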


Curiously, I never heard of this despite being a Linux user for quite a long time. I'll give it a try.
