The intended usage is that the client tells the server, "I want to load data from a file /path/to/data.txt on my local filesystem" in a SQL command. As part of the protocol for executing the query, the server sends a message to the client requesting the contents of /path/to/data.txt. Unfortunately, clients don't validate the file request and will send any file (e.g. /path/to/secrets.txt) even if there was no legitimate data request in their command.
This has been an issue with MySQL client drivers for years. I found and fixed the same issue in MariaDB Connector/J (a JDBC driver wire-compatible with MySQL databases) in 2015. It rejects LOCAL DATA requests from the server unless the client app preregistered an InputStream (the Java interface for a generic stream of bytes) as the data for the command being executed.
This is one of the many many reasons I love open source database drivers. I was able to find and fix this issue only because I could see the source code. Similar "features" in proprietary databases could go unnoticed for years and even when discovered may not have feature flags to disable them.
Similarly, MySQL Connector/J also used to attempt to deserialize binary data that looked like a serialized Java object (CVE-2017-3523). Doing this with untrusted data can often be used to obtain arbitrary code execution. Connecting to an untrusted server does not appear to be a use-case that received enough attention.
Loading a CSV is a common use case. PostgreSQL's psql has a \copy command used for the same purpose (though that's a client-side command, not server-side, as far as I know).
psql’s \copy command utilizes the server’s COPY functionality, which absolutely can read and write files on the server or run commands there [1].
COPY with a file name instructs the PostgreSQL server to directly read from or write to a file. The file must be accessible by the PostgreSQL user (the user ID the server runs as) and the name must be specified from the viewpoint of the server.
Clients using COPY, including psql's \copy, often pass STDIN or STDOUT as the "file", which allows data to be transmitted over the wire rather than from the server's filesystem.
Bad protocol design occurs in both OSS and proprietary software.
However, with proprietary software the protocol is unknown unless it has been published. With OSS, you at least have the source code of the implementation.
As you should know, proprietary software relies on the owners to fix the problem. With OSS, "anyone" can provide a fix - and even if the owner does not wish to include the fix in the official build (which would look very bad on them, in this instance), "anyone" can apply it to their own copy.
Meaning, it's vastly easier for a third party to discover and fix issues in OSS than in proprietary software.
"However, with proprietary software the protocol is unknown unless it has been published" - this is not true. Proprietary software does not necessarily mean opaque protocols. It's a chicken-and-egg question.
1) IIS is a proprietary server, but it speaks the open HTTP protocol. Proprietary software may implement a well-known protocol. This probably covers most cases.
2) SQL Server is proprietary software, but it speaks a documented protocol, TDS. The specification is published.
3) Oracle Database is proprietary software and speaks the undocumented TNS protocol.
A protocol is just a specification. If the design was meant to be secure, that's far more important.
There are proprietary HTTP clients, but no HTTP server can request a file from the client, so the HTTP protocol is better than the MySQL protocol in this respect. If someone writes a custom open source MySQL client, it will probably be affected; that is bad design. If someone writes a custom open source HTTP client, it will not be affected; that is good design.
Open source does not outweigh bad design. I see no sense in "open source vs. bad design". Bad design is bad design, no matter what the license is. There is nothing good in keeping bad software alive just because it's open source. The fact that you can play with the code and fix a security bug is, at most, a nice bonus. The fact that the protocol was misdesigned is what matters most.
You've got it backwards - the DB server can pull an arbitrary file from the client, so it is considered the fault of the client: it should not allow that. The mysql client is the one receiving the request, and it should apply standard security practice by not blindly trusting an incoming request, instead validating that the path matches an earlier load request the client sent to the server. (Although a better approach IMO would be to modify the wire protocol so the server "request" does not use the file name, but instead an ID from the earlier client request.)
You misunderstood what's happening here. A rogue server can request the client to read any file on the client's file system, and the client will comply without validation that the client actually requested this.
That’s not what this is about. The intended use is: client tells server to load a file, server sends a request for the file, client sends the file. Except that the client will send the server whatever file it requests. In fact the client doesn’t even need to tell the server to request a file. The server can just request whatever file it wants whenever it likes, and the client will send it.
Uh. Ok. "I was able to find and fix this issue only because I could see the source code." This is how all security issues happen. If I were a terrible person, I would create scripts that prey upon people who didn't patch.
Actually, I think there are cases where people have patched closed source software. Binary patches are possible, just harder to write. (Not trying to say you're wrong: it's definitely easier if one has the source. Only that some humans are both determined and skilled, and some incredible stuff comes from that combination.)
IIRC, there was a flaw in Flash patched in this manner; it was using memcpy(), which requires the source and destination regions to not overlap, but, they did. In this case, it's fairly simple: one just needs to call memmove(), which conveniently takes the same args in the same order.
(IIRC, there was a lot of consternation getting Adobe to fix that properly, given how obvious the bug was.)
Am I understanding this correctly: LOAD DATA LOCAL is used to bulk load datasets (like CSV files) into a MySQL database from the client. Why is the file transfer initiated by the server, and not by the client? Is it because they wanted to cover cases where the server processes / simplifies the query, and after processing, it becomes evident that the file is not needed at all? If so, why isn't there an additional check on the client that the file requested by the server matches a file that was previously used in a LOAD DATA LOCAL query? Is it because there is no query processing / parsing on the client side at all (in case of string SQL queries)?
The client does not parse the SQL prior to sending it to the server. The name of the file may actually be a computation result that depends on data in tables IIRC.
However, you raise an interesting point about parsing. It is reasonable to want the server to do the parsing. You could, for example, tweak the grammar to add a new feature and do it entirely in a server update, without needing to deploy a new client binary to every user. But that makes it a bit harder to fix bugs like this.
There are surely some commands parsed and implemented by the client, though. Like connect, quit, and source for example. So another way to have designed this would be to make a local command to designate a file that the server will be allowed to access. You could define the path name on the client side and pass a symbolic reference to the server. Then the client could parse that command without taking on the burden of parsing SQL, and the client could still own opening files rather than taking orders from the server.
Sure, but the command could've been: "Load this file: <stream of bytes>", rather than "Load this file: <path to file>" which necessarily requires the server to parse the command.
Yeah, I definitely think the common workflow rarely involves connecting to untrusted servers. But... if a server gets hacked, connecting in to see why your site is down could leak files before you know what happened. Even better if the hack isn't visibly causing problems. Though that seems a really indirect and unproductive attack vector.
But it sounds great for a honeypot. Put up an easy-to-hack WordPress server, and when the attacker connects to mysql, start downloading all the PII files you can think of from the client.
What is it with this concept of "(un)trusted servers" that people throw around here all the time, which seems to build on the assumption that the world consists only of entities that are to be distrusted and entities that should have full access to and control over all your information and resources?
No, just because I don't expect my client to kill me, does not mean that therefore there is no reason to be concerned about a rogue employee of theirs being able to gain access to any other of my clients' servers that I have access to by patching an exploit into their MySQL server.
Nor is it reasonable to assume that everyone is on top of their IT security and their infrastructure is only under their own control, and that includes your own organization once it grows beyond one or two people.
If that is the kind of mental model you are working with, your IT security probably is shit. Even the slightest external vulnerability anywhere in your systems or the systems of people whose systems you access/(co-)manage is going to grant an attacker total control over your organization if that is how you manage security.
Is there really such a thing as a trusted server? Any server could have been hacked; it's not something that's under your control, so it's better to be a bit more paranoid about this.
"Server attacks client" is pretty common with web browsers. Probably more common than the other way around. Much less common with Database clients of course.
Yes, so if your server has been hacked, you now also have been hacked. It's not the most common scenario, but it would be good to protect against it in the interest of defense-in-depth.
Pledge could only prevent this by effectively disabling the feature entirely. If you were to implement the necessary logic to determine which "LOCAL DATA" statements were legal and which weren't, it would be trivial from that point to block the illegal ones with or without pledge.
Pledge is meant to be used for stopping your software from doing unanticipated things, not anticipated things. In this case they anticipated this behavior and purposely kept it even though it's potentially dangerous.
EDIT: Sorry, perhaps you meant that the application which is embedding the MySQL client ought to use pledge. In that case, that does seem like a good defensive usage of pledge which would prevent this kind of attack, notwithstanding the need to fix the issue in the client itself.
And if you use the mysql client to upload files to the MySQL server (like CSV files), then you could invoke the mysql client with a command-line argument saying "I intend to send files from this directory to the server", and unveil (pledge's companion mechanism for restricting filesystem visibility) would let the process read files from that directory but not others.
If you pass the `execpromises` parameter and make sure the flags DO NOT include the ability to further execve() (the `exec` promise), then yes; but OpenBSD pledge promises are not implicitly inherited the way Linux seccomp filters are.