Miners prioritize by fee-per-byte, except that the on-chain fee only accounts for part of the fee paid for some transactions.
Over the years, many services have popped up where you can pay an out-of-band fee to a miner to include your transactions first.
Of course, this is detrimental to users who don't use these services, because it biases the algorithms that estimate the best fee-per-byte to pay.
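To make the bias concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the `Tx` class, the `out_of_band_fee` field, the greedy `build_block` selection, and the naive "lowest feerate that got in" estimator are assumptions, not how any real miner or wallet works.

```python
from dataclasses import dataclass


@dataclass
class Tx:
    txid: str
    size_bytes: int
    onchain_fee: int          # satoshis, visible to every observer
    out_of_band_fee: int = 0  # satoshis paid privately to the miner (hypothetical)

    @property
    def visible_feerate(self) -> float:
        # What wallets and fee estimators can see on-chain.
        return self.onchain_fee / self.size_bytes

    @property
    def effective_feerate(self) -> float:
        # What the miner actually earns per byte.
        return (self.onchain_fee + self.out_of_band_fee) / self.size_bytes


def build_block(mempool, max_bytes):
    """Greedy selection by the feerate the miner actually earns."""
    block, used = [], 0
    for tx in sorted(mempool, key=lambda t: t.effective_feerate, reverse=True):
        if used + tx.size_bytes <= max_bytes:
            block.append(tx)
            used += tx.size_bytes
    return block


def naive_min_feerate(block):
    """Lowest visible feerate in the block, as an outside observer sees it."""
    return min(tx.visible_feerate for tx in block)


mempool = [
    Tx("a", 250, 25_000),        # 100 sat/B
    Tx("b", 250, 12_500),        # 50 sat/B
    Tx("c", 250, 250, 20_000),   # 1 sat/B visible, 81 sat/B effective
    Tx("d", 250, 5_000),         # 20 sat/B
]

block = build_block(mempool, max_bytes=750)
print([tx.txid for tx in block])   # ['a', 'c', 'b']
print(naive_min_feerate(block))    # 1.0
```

An observer sees a 1 sat/B transaction confirmed and may conclude 1 sat/B is enough, while the 20 sat/B transaction `d` is still waiting.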
1. If you think tx A appeared before B, that's not necessarily the order in which a particular miner saw them. The whole point of mining is to agree on order.
2. That said, a difference of a minute or two is pretty universally visible, so it's reasonable to ask to prioritize the first-seen transaction. However, from the user's perspective, a transaction is going to be confirmed within tens of minutes, or even an hour or two, so a shorter difference in time between your transaction and mine is no more important than ordinary variance in block times.
3. When the mempool is full, the entirely "capitalistic" relay logic exhibits some "socialistic" properties: a newly arriving transaction has to cover for the lowest-paying transaction that is going to be kicked out. This is a necessary anti-DoS measure: without it, someone could cheaply flood the network with transactions that keep kicking each other out, most of which never get mined, while paying a disproportionately low cost for the few that do. In other words, "if you kick out a tx, you have to pay for it, because it was previously relayed". This means that later-arriving, higher-paying transactions can't just take a seat by paying a marginally higher fee. That margin keeps escalating as the mempool gets overloaded, which better protects transactions that were already relayed.
4. Finally, mining is very competitive. If you don't prioritize by feerate, another miner will, and their extra profit would let them capture a higher percentage of the hashrate, while growing difficulty makes it even more expensive for everyone else. Any sort of agreement that undermines everyone's profits in the name of fairness requires (1) establishing a consensus in the first place, and (2) keeping it relatively static so random outsiders don't undercut it. But then such an organization can use its leverage against outsiders to do many more things: censor transactions and change rules. That's why Bitcoin works the way it does: anyone can join and mine without permission, funded by leftover profits, so no single miner (or group of miners) has any long-term leverage against everyone else.
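Point 3 can be sketched as a toy mempool in Python. This is only an illustration of the idea, not Bitcoin Core's actual code: the `Mempool` class, the `incremental_feerate` margin of 1 sat/B, and the byte limit are all assumed for the example.

```python
import heapq


class Mempool:
    """Toy size-limited mempool with an escalating minimum feerate."""

    def __init__(self, max_bytes, incremental_feerate=1.0):
        self.max_bytes = max_bytes
        self.incremental_feerate = incremental_feerate  # sat/B margin (assumed value)
        self.min_feerate = 0.0   # floor that new transactions must meet
        self.used = 0
        self.heap = []           # min-heap of (feerate, txid, size)

    def add(self, txid, size, fee):
        feerate = fee / size
        if feerate < self.min_feerate:
            return False         # not even relayed
        heapq.heappush(self.heap, (feerate, txid, size))
        self.used += size
        # Evict the lowest-paying transactions until we fit again.
        while self.used > self.max_bytes:
            ev_rate, _ev_id, ev_size = heapq.heappop(self.heap)
            self.used -= ev_size
            # Whoever comes next must beat the evicted feerate plus a margin.
            self.min_feerate = max(self.min_feerate,
                                   ev_rate + self.incremental_feerate)
        # The new transaction may itself have been the one evicted.
        return any(entry[1] == txid for entry in self.heap)


pool = Mempool(max_bytes=500)
print(pool.add("a", 250, 2_500))   # True  (10 sat/B)
print(pool.add("b", 250, 5_000))   # True  (20 sat/B)
print(pool.add("c", 250, 2_750))   # True  (11 sat/B, evicts "a")
print(pool.min_feerate)            # 11.0
print(pool.add("d", 250, 2_625))   # False (10.5 sat/B is below the floor)
print(pool.add("e", 250, 3_000))   # True  (12 sat/B, evicts "c")
print(pool.min_feerate)            # 12.0
```

Each eviction raises the floor, so the price of displacing an already-relayed transaction keeps going up while the pool stays overloaded.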
As with any distributed system, "happens before" is only well-defined for committed transactions.
"The Replace-By-Fee flag in Bitcoin is effectively a built-in paid accelerator that covers all miners. I think the reason inefficiencies like this exist is because a lot of wallets either don't implement it or don't have it on by default."
"Although the fee-per-byte dequeuing policy is widely considered the “norm” for prioritizing transactions—we show that miners somehow delay a significant fraction of transactions. Such deviations undermine the utility of blockchains for ensuring a “fair” ordering that might be required for some applications."
You've misinterpreted the article. The authors are saying the opposite. The paper is somewhat hard to read, maybe because the 4 co-authors are from Germany, so their English-as-a-second-language isn't the easiest to parse.
For this specific paper, the authors are using "fair" to label fee-based-user-pays-priority, which is the conventional expectation of how Bitcoin behaves. However, they analyzed a bunch of Bitcoin transactions and noticed that priority based on "user pays" doesn't always happen. They're not exactly sure why but put forth some guesses. They considered that some miners might be "altruistic" (i.e. "socialist") and weigh other attributes besides fee payment, but they discount that. The other guess, which they feel is more realistic, is "transaction accelerators" that don't broadcast the same blocks to every miner. Excerpt from section 7.1:
>We hypothesize that mining pool operators might be sending a different set of transactions to each miner (or perhaps changing this set every a fixed time interval or event) to reduce the network overhead, avoiding miners to request a new task often. It is possible due to the stratum protocol.
>Stratum is a pooled mining protocol that focuses on reducing network communication between the mining pool and its miners by allowing miners to change some bytes on the coinbase transaction and consequently changing the Merkle root. Another reason is that some clients might be using services like transaction accelerators to speed up the commit time of a particular transaction. They pay the mining pool to use this service off-chain (i.e., with another cryptocurrency or via credit-cards) to, hopefully, increase the probability of their transactions get included in the next block. One example of this service is the BTC.com transaction accelerator.
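For anyone wondering why changing a few bytes of the coinbase changes the Merkle root, here is a small Python sketch. It uses Bitcoin's double SHA-256 and the duplicate-the-last-node rule for odd levels, but the "transactions" are toy byte strings rather than real serialized transactions, and `root_for` and the extranonce values are made up for the example.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin txids and Merkle nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves):
    """Merkle root over a non-empty list of 32-byte hashes."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])   # odd count: duplicate the last node
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


# Toy stand-ins for the non-coinbase transactions the pool handed out.
other_txids = [dsha256(b"toy-tx-1"), dsha256(b"toy-tx-2")]


def root_for(extranonce: bytes) -> bytes:
    # Only the coinbase bytes differ between miners (or between attempts).
    coinbase_txid = dsha256(b"toy-coinbase|" + extranonce)
    return merkle_root([coinbase_txid] + other_txids)


root_a = root_for(b"\x00\x00\x00\x01")
root_b = root_for(b"\x00\x00\x00\x02")
print(root_a.hex())
print(root_b.hex())
print(root_a != root_b)   # True: same transactions, different root
```

The rest of the transaction set is untouched, yet every miner ends up hashing a different block header.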