Hacker News
Remote code execution vulnerability in SQLite (tencent.com)
444 points by LinuxBender 4 months ago | 147 comments



It is very likely that this bug only affects systems which accept and run arbitrary SQLite3 queries. This includes Chromium, because Chromium ships with WebSQL. The Google Home is probably vulnerable because it can be coerced to load a webpage. I doubt that this bug affects systems that merely use SQLite as a database without providing external query access.

My best guess for the bug is that arbitrary SQLite queries, prior to 3.26.0, were permitted to write to the shadow tables used by various plugins to implement features. fts3/4, prior to 3.25.3, appear to contain an integer overflow bug which can be triggered by manually modifying the fts index data. A careful application of this integer overflow appears to make it possible to truncate a writable buffer, leading to a nice heap overflow condition that can be exploited by further crafted SQL queries.
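
To illustrate the class of bug being described (this is not the actual fts3 code), here is a minimal sketch in C of a size computation that refuses to wrap around. The function name and its parameters are hypothetical. The point is only that a length read from attacker-writable index data has to be validated before it is added to another length and used to size or bound a buffer.

```c
#include <limits.h>

/* Hypothetical helper: returns n_prefix + n_suffix, or -1 if either
 * length is negative or the sum would overflow a signed int. */
int checked_total_size(int n_prefix, int n_suffix) {
    if (n_prefix < 0 || n_suffix < 0) {
        return -1;                      /* corrupt length field */
    }
    if (n_prefix > INT_MAX - n_suffix) {
        return -1;                      /* sum would wrap around */
    }
    return n_prefix + n_suffix;
}
```

Without the second check, a huge length can make the sum wrap to a small or negative number, so a later "does it fit in the buffer" comparison passes when it should not.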

The primary integer overflow bug was fixed in https://sqlite.org/src/info/940f2adc8541a838 "Add extra defenses against strategically corrupt databases to fts3/4.", committed as part of the 3.25.3 update (which is what Chromium updated to). Later, in 3.26.0, they further secure it by making shadow tables optionally read-only.

The worrying thing here is that SQLite3, in its default configuration, is still not convincingly secure. Being able to write arbitrary data to the shadow tables has the potential to break all sorts of assumed invariants, and it's pretty clear that the SQLite3 developers did not necessarily anticipate all the ways in which this could break. The "SQLITE_DBCONFIG_DEFENSIVE" option which was added does not appear to be on by default, and it breaks backwards compatibility (setting it causes SQL imports from .dump to fail because .dump assumes shadow tables are writable during import).

There may be more bugs lurking in this area - this would be an excellent opportunity to fuzz all the plugins in SQLite to see if any of them barf when their shadow tables are corrupted.


Excellent summary, nneonneo. I think everything you said here is correct.

The vulnerability only exists in applications that allow a potential attacker to run arbitrary SQL. If an application allows that, it is usually called an "SQL Injection" vulnerability and is the fault of the application, not the database engine. The one notable exception to this rule is WebSQL in Chrome.

I put up https://www.sqlite.org/security.html recently to serve as guidance for people who want to live on the edge and give unrestricted SQL access (or unrestricted database file access) to potentially hostile attackers. That page is a work in progress. More could be said. For example, it is probably also a good idea to use various obscure APIs to limit the length of SQL statements or the amount of memory that can be used, to avoid DOS attacks. I'll keep improving the document as I have time.

Our intent is that SQLite should be secure against these kinds of attacks. We have spent years fuzzing it to try to find these problems. But the thing is, we never configured a fuzzer in such a way that it might start modifying the shadow tables of FTS3, and so we missed this one. Moral: never underestimate the ingenuity of a motivated gray-hat.

The Chrome people have recently started fuzzing SQLite database files on Google's infrastructure. We had previously only fuzzed database files on our own workstations. It's amazing the number of new problems you can find when you run a fuzzer at scale. :-) A few more problems have been fixed. We are not aware of any exploits. And in particular, if you follow the advice of the article above and "PRAGMA quick_check" untrusted database files or set "PRAGMA cell_size_check=ON" then none of the recently found and fixed issues are reachable.


If giving unrestricted database file access is “living on the edge,” maybe https://www.sqlite.org/appfileformat.html should be updated to reflect that?


I agree. If the file is one you might get elsewhere rather than only being your own files, then it is untrusted.

To me, "trusted" is: queries entered by the local user (or, for setuid programs, by the local system administrator instead of the local user), or that are built in to the program. Others are untrusted.

And yet, I had already considered this kind of vulnerability before even knowing about it.


What about Fossil? Is there a way for a potential attacker to run arbitrary SQL? I can't think of one but I'm only a light user.


I think it depends on the permissions. If you do not allow users with those permissions to log in remotely, then I would suppose it would not be a problem. (This is what I do: users with permission to enter SQL and TH1 codes cannot log in remotely.)


A nit: it's not necessarily just applications that accept arbitrary queries, but also applications that use sqlite as a file format.


Yes - good point. Programs accepting SQLite databases as input (as opposed to just queries) are also vulnerable. The exploit is probably somewhat harder if you don’t have interactivity, since it would depend on exactly how the corrupt database gets used.


Are you sure Google Home loads web pages? I can't think of a feature that requires that.


SQLite is the most thoroughly tested codebase I'm aware of [1]. It has seven times more test code than non-test code. 100% branch coverage. If even SQLite can have a RCE vulnerability, I'm convinced that it is not feasible for anybody to write safe C code.

[1] https://www.sqlite.org/testing.html


100% branch, line coverage means nothing. It's about logical coverage. What are you testing for? You are not testing lines of code, but logic.


Right. The actual standard is called "modified condition/decision coverage" or MC/DC. In languages like C, MC/DC and branch coverage, though not exactly the same, are very close.

Achieving 100% MC/DC does not prove that you always get the right answer. All it means is that your tests are so extensive that you managed to get every machine-code branch to go in both directions at least once. It is a high standard and is difficult to achieve. It does not mean that the software is perfect.
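
As a rough illustration of the difference at source level: MC/DC asks, for each condition in a decision, for a pair of tests in which only that condition changes and the outcome changes with it. The sketch below checks a test suite against that rule for one hypothetical decision, `a && (b || c)`. The names and the bit encoding are assumptions made for this example and have nothing to do with SQLite's own test harness. Because `&&` and `||` short-circuit in C, a compiler typically emits one machine-code branch per condition, which is presumably why the two measures end up so close in practice.

```c
#include <stdbool.h>

/* Hypothetical decision under test. */
static bool decision(bool a, bool b, bool c) {
    return a && (b || c);
}

/* A test vector packs the three conditions into bits: bit0=a, bit1=b, bit2=c. */
static bool eval(unsigned v) {
    return decision(v & 1u, (v >> 1) & 1u, (v >> 2) & 1u);
}

/* Two tests hit both outcomes of the decision, but show nothing per condition. */
const unsigned DECISION_ONLY_SUITE[] = {3u, 0u};
/* Four tests (N+1 for N=3 conditions) that do satisfy MC/DC. */
const unsigned MCDC_SUITE[] = {3u, 2u, 1u, 5u};

/* True if every condition has a pair of tests that differ only in that
 * condition and produce different outcomes. */
bool satisfies_mcdc(const unsigned *tests, int n) {
    for (int cond = 0; cond < 3; cond++) {
        bool shown = false;
        for (int i = 0; i < n && !shown; i++) {
            for (int j = 0; j < n && !shown; j++) {
                if ((tests[i] ^ tests[j]) == (1u << cond) &&
                    eval(tests[i]) != eval(tests[j])) {
                    shown = true;
                }
            }
        }
        if (!shown) return false;
    }
    return true;
}
```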

But it does help. A lot. When I was young, I used to think I could write flawless code. Then I wrote SQLite, and it got picked up and used by lots of applications. It will amaze you how many problems will crop up when your code runs in millions of applications on billions of devices.

I was getting a steady stream of bug reports against SQLite. Then I took 10 months (2008-09-25 through 2009-07-25) to write the 100% MC/DC tests for SQLite. And after that, the number of bug reports slowed to a trickle. There still are bugs. But the number of bugs is greatly reduced. (Note that 100% MC/DC was first obtained on 2009-07-25, but the work did not end there. I spend most of my development time adding and enhancing test cases to keep up with changes in the deliverable SQLite code.)

100% MC/DC is just an arbitrary threshold - a high threshold and one that is easy to measure and difficult to cheat - but it is just a threshold at which we say "enough". You could just as easily choose a different threshold, such as 100% line coverage. The higher the threshold, the fewer bugs will slip through. But there will always be bugs.

My experience is that the weird tests you end up having to write just to cause some obscure branch to go one way or another end up finding problems in totally unrelated parts of the system. One of the chief benefits of 100% MC/DC is not so much that every branch is tested, but rather that you have to write so many tests, and such strange, weird, convoluted, and stressful tests, that you randomly stumble across (and fix) lots of problems you would have never thought about otherwise.

Another big advantage of 100% MC/DC is that once they are in place, you can change anything, anywhere in the code, and if the tests all still pass, you have high confidence that you didn't break anything. This enables us to evolve the SQLite code much faster than we could otherwise, using relatively few eyeballs.

Yet another advantage of 100% MC/DC is that you are really testing compiled machine code, not source code. So you worry less about compiler bugs. "Undefined behavior" is a big bugbear with C. We worry less than others about UB because we have tested the output of the compiler and we know that the compiler did what we wanted, even if the official C-language spec didn't require it to. We still avoid UB, and SQLite does not currently contain any UB as far as we know. But it is nice to know that even if we missed some UB in the code someplace, it probably doesn't matter.


Nicely written, and thank you for providing such a great piece of engineering!

A thought: would it help to have a modified C compiler that would crash the app whenever UB was encountered? It might help find some bugs where a non-default C compiler was used (which I assume happens, given the large number of platforms SQLite supports). Or am I missing something?


There is ASan, the address sanitizer. You can enable it by passing -fsanitize=address to gcc or clang. It will make your program crash as soon as there is an out-of-bounds read or a similar memory error (the companion UBSan, -fsanitize=undefined, covers other kinds of undefined behaviour). If debug symbols are enabled, it will also tell you which line of code was responsible. It can save you countless hours of debugging.


I believe there are some things you have to do that aren't C compatible, e.g. store fat pointers of base+length+offset instead of raw pointers, to catch OOB accesses.


It would not help because modern compilers treat UB as an optimization opportunity, including a license to do whatever they want (even elimination of code).


That 100% branch coverage does not include indirect calls via function pointers or jumps to signal handlers caused by division by zero or invalid memory access, right?


True, but SQLite still is one damn well-crafted codebase, that has been explored by thousands of pairs of eyes over time. And that's the point.


Isn't that the definition of branch testing, to test all possible branches within code and also testing the logic in all of those branches?


Consider a line like:

    value = 1 / input;
You can get "100% coverage" if you test that with `input = 1`, but unless you check with `input = 0` you're missing a quite important logical check.
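
One way to make that missing check visible, sketched in C under the assumption that a fallback value is acceptable to the caller (the function name and the fallback parameter are hypothetical): once the zero case is an explicit branch, a coverage tool will flag any suite that never passes `input = 0`, whereas the bare division gives it nothing to report.

```c
/* Hypothetical guarded variant of `value = 1 / input;`.
 * Returns `fallback` instead of dividing when input is zero. */
int reciprocal_or(int input, int fallback) {
    if (input == 0) {
        return fallback;    /* explicit branch: now visible to coverage tools */
    }
    return 1 / input;       /* integer division, truncates toward zero */
}
```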


Can't you just have a NonZeroInt type?


Only if your language supports dependent types.


It's perfectly possible in languages with ordinary ADTs.

  data Nat = Z | S Nat
  data NonZeroNat = OnePlus Nat
  data NonZeroInt = Negative NonZeroNat | Positive NonZeroNat


C isn't one of those languages, though.


The analogue would be the Go-style represent-a-sum-badly-as-a-product,

  struct nonzero_t {
    int is_negative;
    unsigned int one_less_than_the_absolute_value;
  };
which, under interpretation, ranges from -(2^32) to -1 and +1 to +(2^32).
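
A minimal sketch of that interpretation in C, reusing the struct from the comment above. The conversion helpers and their names are hypothetical, and the stated range assumes a 32-bit `unsigned int`. Zero simply has no representation, so the constructor rejects it.

```c
#include <limits.h>
#include <stdbool.h>

struct nonzero_t {
    int is_negative;
    unsigned int one_less_than_the_absolute_value;
};

/* Fills *out and returns true, or returns false if v is 0 or its
 * magnitude does not fit in the struct. */
bool nonzero_from_ll(long long v, struct nonzero_t *out) {
    unsigned long long mag_minus_1;
    if (v == 0) return false;
    /* -(v + 1) cannot overflow, even for LLONG_MIN. */
    mag_minus_1 = v < 0 ? (unsigned long long)(-(v + 1))
                        : (unsigned long long)(v - 1);
    if (mag_minus_1 > UINT_MAX) return false;
    out->is_negative = v < 0;
    out->one_less_than_the_absolute_value = (unsigned int)mag_minus_1;
    return true;
}

/* Interprets the struct as an integer; the result is never 0. */
long long nonzero_to_ll(struct nonzero_t nz) {
    long long mag = (long long)nz.one_less_than_the_absolute_value + 1;
    return nz.is_negative ? -mag : mag;
}

/* Convenience for experiments: returns v if representable, else 0. */
long long roundtrip(long long v) {
    struct nonzero_t nz;
    return nonzero_from_ll(v, &nz) ? nonzero_to_ll(nz) : 0;
}
```

Dividing by `nonzero_to_ll(d)` can then never divide by zero, although in C the quotient `LLONG_MIN / -1` is still undefined, which this type does not rule out.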


In Ada, you can define integer types that only accept a given range of values.


Not really - internally inputs 0 and 1 use different branches.


That's not a branch; otherwise you would have an infinite (or impossibly large) number of branches for just that one line of code. A branch is when you execute one set of code upon a given condition, and another if that condition is not met.


I didn't say every number is a different branch. But on many processors, divide by zero triggers an interrupt. That's semantically the same as a branch.


It depends on the language. In C it is not a branch because division by zero is undefined and not a path you consider. In Java you can argue that there are two branches. One branch that throws an exception and one that does not.


No coverage reporting library will attempt to tell you that kind of coverage. You are essentially in violent agreement with the op but turning it into an argument by using different words for the same concept.


Testing all possible branches (each considered individually) won't get you very far in terms of testing all possible logic flows.

Consider a function which checks 5 simple if-statements in a row, always in the same order. Getting each branch means you tested 10 things.

But there are 32 ways for 5 if-statements to jointly evaluate. If there is a logical dependency between the state checked by one if-statement and the state checked by another one, your perfect coverage may not pick up on that.

If the if-statements might be checked in an arbitrary order... there are 120 ways to order 5 things. But you'll still get perfect branch coverage by checking 10 of them.
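
The counting argument can be modelled in a few lines of C. This is a toy model, not a real coverage tool: bit i of `flags` stands in for the condition of the i-th if-statement, and all names are made up for the example. Two tests are enough to see every branch outcome, yet they exercise only 2 of the 32 joint combinations.

```c
#include <stdbool.h>
#include <string.h>

#define N_IFS 5

static bool outcome_seen[N_IFS][2];   /* [i][0]: if i skipped, [i][1]: if i taken */
static bool path_seen[1u << N_IFS];   /* one slot per joint combination */

void reset_coverage(void) {
    memset(outcome_seen, 0, sizeof outcome_seen);
    memset(path_seen, 0, sizeof path_seen);
}

/* Toy stand-in for a function with five if-statements in a row. */
void five_checks(unsigned flags) {
    flags &= (1u << N_IFS) - 1u;
    for (int i = 0; i < N_IFS; i++) {
        int taken = (int)((flags >> i) & 1u);
        outcome_seen[i][taken] = true;
    }
    path_seen[flags] = true;
}

/* Number of (if-statement, outcome) pairs observed so far; at most 10. */
int branch_outcomes_covered(void) {
    int n = 0;
    for (int i = 0; i < N_IFS; i++) {
        n += outcome_seen[i][0] + outcome_seen[i][1];
    }
    return n;
}

/* Number of distinct joint combinations observed so far; at most 32. */
int paths_covered(void) {
    int n = 0;
    for (unsigned p = 0; p < (1u << N_IFS); p++) {
        n += path_seen[p];
    }
    return n;
}
```

Calling `five_checks(0)` and then `five_checks(31)` after a reset covers all 10 branch outcomes while visiting only 2 paths.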


  int returns_less_than_twelve(int input1, int input2) {
    int sum = 0;
    if (input1 >= 5) {
      sum += 5;
    }
    if (input2 >= 5) {
      sum += 8;
    }
    return sum;
  }
The following test cases will pass and achieve 100% branch coverage.

  int x = returns_less_than_twelve(5,0);
  test_assert(x < 12);
  int y = returns_less_than_twelve(0,5);
  test_assert(y < 12);
However, this does not cover the entire input space, so

  int z = returns_less_than_twelve(5,5);
  test_assert(z < 12);
will fail (the sum is 13), and branch coverage won't tell you that there's a hole in your test coverage.

Generally however, writing full branch coverage will find a lot of issues, and also cause you to really think through how your code works; but still, it doesn't guarantee correctness. If you want that you need to start bringing tools that either exhaust your input space (a function which takes 5 booleans can be exhaustively tested for correctness in trivial amounts of time), or you start modeling your chosen language well enough that you can use a mathematical prover to demonstrate that your program or function is safe on all inputs.
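
A sketch of the "exhaust your input space" option for the example function, in C. The function is repeated here with its first condition reading `input1`, and `count_violations` is a hypothetical helper. Since the function only compares each input against 5, sweeping a small range around that boundary is assumed to be representative; the helper is not meant for ranges anywhere near `INT_MAX`.

```c
/* The function under test, as in the example above. */
int returns_less_than_twelve(int input1, int input2) {
    int sum = 0;
    if (input1 >= 5) {
        sum += 5;
    }
    if (input2 >= 5) {
        sum += 8;
    }
    return sum;
}

/* Tries every pair in [lo, hi] x [lo, hi] and counts the pairs for
 * which the promised property (result < 12) does not hold. */
int count_violations(int lo, int hi) {
    int violations = 0;
    for (int a = lo; a <= hi; a++) {
        for (int b = lo; b <= hi; b++) {
            if (!(returns_less_than_twelve(a, b) < 12)) {
                violations++;
            }
        }
    }
    return violations;
}
```

Sweeping 0..4 finds nothing, while 0..5 finds exactly the one pair, (5,5), that the two branch-covering tests missed.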

This of course requires you to come up with a definition of 'correct' or 'safe'. For the above program, it's clear how to define correctness. For things like "Don't let an unauthorized individual access this data or data that is derived from it in a way contrary to the desires of the owner of said data" it gets 'tricky' ;).


You test all possible executions of your program when you test with all possible data inputs, which is infinite.


You can do this with machine verified programs. It's like proving a maths theorem; you don't check it for all values but you create a robust argument that it must be true for all values.


That is why I like using random data generators for tests. You can input some static data and then the rest is random. Every once in a while a bug pops out when you see a test fail that was previously passing.


SQLite is heavily subjected to fuzzing already ... maybe this vulnerability was discovered that way.


And the probability to find the corner case ('input = 0') for the simplest expression ('value = 1 / input') this way is not quite astounding.


None of the above methods are used for testing. You use boundary testing, branch testing, equivalence partitioning etc. Random data is not a good method of testing.


Except for the fact that it is exactly the method that has been used to discover a large number of critical bugs in the most popular OSS projects (including SQLite): http://lcamtuf.coredump.cx/afl/


Fuzzing isn't really practical if all you do is just generate a totally random bit stream for input. There are many much more clever and robust strategies to hit as many edge cases as possible. Check AFL[1] for some details on generating smart random input files. You can also combine that with pretty advanced dynamic execution analysis to fuzz against unknown processor instruction sets, like in sandsifter[2].

[1]: http://lcamtuf.coredump.cx/afl/

[2]: https://github.com/xoreaxeaxeax/sandsifter
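
As a very small illustration of "smarter than a random bit stream", here is a C sketch of an input generator that mixes pseudo-random integers with a fixed list of boundary values. This is not how AFL is implemented (AFL is coverage-guided and far more elaborate); the generator, its names, and the every-fourth-draw schedule are all assumptions made for the example. It does show why a corner case like `input = 0` need not be left to chance.

```c
#include <limits.h>

/* Values that tend to trigger edge cases in integer code. */
static const int BOUNDARIES[] = {0, 1, -1, INT_MAX, INT_MIN};
#define N_BOUNDARIES 5u

struct input_gen {
    unsigned state;   /* linear congruential generator state */
    unsigned calls;   /* number of values drawn so far */
};

/* Every fourth draw is the next boundary value in turn; the other
 * draws are non-negative pseudo-random filler. */
int next_input(struct input_gen *g) {
    g->calls++;
    if (g->calls % 4u == 0u) {
        return BOUNDARIES[(g->calls / 4u - 1u) % N_BOUNDARIES];
    }
    g->state = g->state * 1103515245u + 12345u;
    return (int)(g->state >> 1);   /* shift keeps the value within int range */
}

/* Convenience for experiments: the n-th value (1-based) from a fresh generator. */
int nth_input(unsigned seed, unsigned n) {
    struct input_gen g = {seed, 0u};
    int v = 0;
    for (unsigned i = 0; i < n; i++) {
        v = next_input(&g);
    }
    return v;
}
```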


On the contrary, for years, the most prolific fuzzers basically did just generate random bitstreams, and that technique will still find vulnerabilities in all the memory-unsafe software that hasn't been fished out by those same dumb fuzzers.


Sorry, I strongly disagree. Random data with a few static arguments is an incredibly great way to test. Adding in some chaos finds bugs. "why did that test fail after 100 times...ohhhh"

I try to use only random data when possible; with a proper setup there are fewer and smaller tests to write. End result: more bugs found.

Random data is a great method of testing.


This is exactly why I don't find code coverage tools useful. Now, a tool that can show what the tests assert against? THAT would be useful.


What you’re describing sounds very similar to “mutation” testing. https://en.m.wikipedia.org/wiki/Mutation_testing


That would be interesting. In what form would you express the result?


SQLite cannot, in principle, suffer from an RCE.


Not sure why this is being downvoted, but you're correct. Now, a networked application that exposes some level of access to sqlite? That's another story. The question I think we all are asking is just how much "leg" does sqlite have to show to be vulnerable?


It's a pretty silly definition; it's like saying PDF or JPEG parsers can't be vulnerable to RCE, when they are in fact major vectors for RCE attacks.


I think the reverse definition is just as silly... Calling a JPEG parser vulnerability an RCE just because some online service is using it in a way that can be exploited remotely. By that definition, any bug is an RCE, since I can just set up a web server to run that program.

I think a better way of looking at it is that it's an ACE Vulnerability in the e.g. JPEG parser that causes an RCE in the Online Service.

Or, in this case, an ACE vulnerability in SQLite that causes an RCE in Chromium.


Sure, though what I'd say is silly is the epistemological conceit of trying to pin down vulnerabilities as "remote" or "local". A lot of vulnerability research terms are silly (sillier than RCE). Either way: it's a "term of art", and it means what it means, and this is a clear and obvious instance of an RCE.


I assume people making this distinction are thinking about "network services that the public can compromise by interacting with them over the Internet" vs. "software that someone can compromise by getting it to accept a malicious input". But I agree that "RCE" is commonly used for both; otherwise we would have to maintain that browsers don't suffer RCE vulnerabilities because a malicious document is no longer "remote" once the browser has downloaded it.


Sure, but I very frequently parse PDFs and JPEGs from untrusted sources, but almost never open untrusted .sqlite files.

(This is still a serious security vulnerability)


It's an RCE in Chrome.


Ok I see your point


Risk is transitive.


> but almost never open untrusted .sqlite files

You may not notice that you do when apps use sqlite as their file format:

https://www.sqlite.org/appfileformat.html


I don't know. I'd say PDF or JPEG parsers (and SQLite) can have arbitrary code execution vulnerabilities, which can in turn be responsible for remote code execution vulnerabilities when used in network-connected software.

e.g. SQLite has an ACE. Chrome has a RCE (which is SQLite's fault).


If what you're observing is that industry lingo is suboptimal, you'll get no argument from me. Consider for instance "XSS" and "CSRF", which are just manifestly silly names. But the names mean what they mean; try as I might, I can't get people to accept "Javascript injection".


The actual industry term is just "code execution", or maybe "arbitrary code execution" if you want to get more specific than is typically worthwhile, not "RCE".

Usage example: "I got code execution!"


I’m observing there are reasonable terms for both the vulnerability in SQLite (https://en.m.wikipedia.org/wiki/Arbitrary_code_execution) and the vulnerability in Chrome due to the vulnerability in SQLite (https://en.m.wikipedia.org/wiki/Remote_code_execution) and wondering why we can’t just use those?


I don't know what to tell you. Try this: Google [browser rce], and then [browser ace] (or [browser ace vulnerability] or whatever). It'll be immediately apparent what the term of art for drive-by code execution vulnerabilities in browsers is.

I sort of intellectually in the back of my head know that "arbitrary code execution" is a term that has been coined and used in the past, but I don't offhand know of anyone that uses it (among other things, it's kind of redundant). "Local only" code execution vulnerabilities aren't "LCE", but rather (usually) "privilege escalation".


In both my comments I explicitly said that vulnerabilities in browsers can and should be called RCEs. I was only arguing about what to call vulnerabilities in the underlying libraries (like SQLite) which aren't inherently exposed to "remote" data/manipulation.

Say for some reason someone used an exploitable version of SQLite in a program that had the setuid bit set. You wouldn't say SQLite had a privilege escalation vulnerability, would you?


They're only vulnerable to RCE if image data can be supplied remotely. What's the analog here? Accessing the JavaScript API? Specifying a query string? Maliciously encoded data? Some of these are scarier than others.


This isn't, like, a real debate. Go here:

https://pwnies.com/

Start with the 2018 nominations but feel free to check the archives. Drive-by browser vulnerabilities are RCEs.


> Drive-by browser vulnerabilities are RCEs.

I would never argue they aren't, but by this logic ("it's like saying PDF or JPEG parsers can't be vulnerable to RCE") virtually every code execution vuln in a library can be called RCE. I haven't noticed this to be the case with e.g. libtiff vulnerabilities (of which many make it into my inbox regularly), although image libraries are one of the cases where CE = RCE is still fairly reasonable.

Let's assume this SQLite bug is only exploitable if you can input arbitrary SQL. Almost no applications use it this way (except Chrome). I think it's clearly unreasonable to call it a RCE in SQLite then.


Uh, no, for example, it could also potentially impact any application that uses sqlite as (part of) a file format.


In that sense, soon USB kernel stacks have remote code execution vulnerabilities because browsers added dumb APIs.

Should we fix bugs? Yes. Should we scream at people that expose raw APIs they don't understand far beyond their design constraints? Yes yes yes.


I think you're assuming the target is a browser, but my question was how this might affect servers. Does the attack use malicious SQL statements, API calls, or encoded data?


Technically, this attack is actually two separate attacks in a chain. The first node in the chain is delivering malicious SQL. The second node is executing code remotely via SQLite. The proof is that SQLite or the application linking it could have mitigated this attack independently by either filtering the query string or better protecting the memory which is being written to.

In practice, however, the community gets more bang for their buck if they label the SQLite code execution vulnerability as an RCE since the vast majority of use is in a networked setting. You have to remember the audience for these terms. They aren’t scientists in the traditional sense, where taxonomy is highly aligned with the ontology; instead, the labeling serves the operators with metaphors that depart from reality insofar as they increase security engineers' ability to do their job effectively.


SQL databases are usually behind an application layer. Still, they can suffer from RCE. SQLite's model is no protection.


It's no more "by principle" secure than any other SQL server bound to localhost only, so I'm not sure what you meant by saying it does not suffer from an RCE.


It is, actually. When something binds to localhost, there's still potential for privilege escalation vulnerability, since any process can connect to the port - so if there's an exploit, a low-privileged process could hijack a higher-privileged one. Localhost sockets are still a security boundary.

Since SQLite in and of itself is just a library, it doesn't have that problem. You have to expose it to untrusted inputs manually somehow (e.g. by setting up a socket).


Especially considering you could generate faster C code than written by hand. With Proofs, Without Compromises http://adam.chlipala.net/papers/FiatCryptoSP19/FiatCryptoSP1...


Have you seen the language Esterel?


My impression is it's a rather low-level verification and modelling language compared to platforms like Kami (https://deepspec.org/entry/Project/Kami)


Or let's wait and see what the bug is and why it wasn't detected in the testing coverage.


> "If even SQLite can have a RCE vulnerability, I'm convinced that it is not feasible for anybody to write safe C code."

This has much less to do with C, than it has to do with the fact that sqlite is a huge codebase.

Software can be vulnerable regardless of the programming languages used.


You had me up until that last little bit at the end. Nowhere did you make any reference to C until the end. You're kind of putting the cart before the horse, aren't you?


To be a bit glib, unit tests don’t catch security vulnerabilities. Maybe I’d agree this can happen to any project, but my example might be something more like OpenBSD


Why not?

In this specific case, a unit test that checked this integer overflow would seem to prevent the vulnerability.

To be clear: This is not to admonish SQLite. They have taken testing further than any other project I've heard of, except maybe the NASA software that might cost lives if it fails.


Incidentally a lot of NASA tools use SQLite as well from what I have heard.


My guess, for what it's worth, is that chromium is vulnerable because it exposes sqlite to web applications, which can then execute queries in such a way as to achieve code execution.

I highly doubt this would affect, say, a blog running with a sqlite database. From the alarmist nature of this post, though, it's unclear.

I see chromium, in their patch, switched to using new flags when opening the DB. There are also some sqlite changes that seem to prevent meddling with virtual table shadow tables (eg inverted index for the fts3 extension).

The question I think everyone is asking is how much sqlite needs to be exposed by an application in order to be vulnerable?

Just my thoughts. Eager to learn more.


Not so much a guess as a documented fact:

https://chromereleases.googleblog.com/2018/12/stable-channel...

The Chromium exposure is through Web SQL.


That doesn't really answer the question about other applications. What SQLite function do I need to not call with network input?



An RCE in a non-networked component is interesting (in other words, obvious hyperbole). Either this is your usual corruption bug/vuln triggerable in some/many programs using SQLite, or an actual bug in SQLite itself, e.g. query preparation (a fix/workaround being committed to SQLite doesn't necessarily imply one or the other). Whether the RCE hyperbole is justified remains to be seen.

Edit: Apparently the exploit vector is due to WebSQL.

And my guess as to the vulnerability area is "strategically corrupt databases", because there have been numerous commits related to this in the relevant SQLite releases and some seem like they were added relatively late in the process (e.g. after changing the VERSION file but before releasing).


They mention the example of browsing to a web page. Not quite a fully fledged remote execution bug but close enough. I think what they're really saying is they're aware of remote execution cases. An example of that might be a web service that's backed by a SQLite query.


Drive-by code execution from a page in a web browser is almost the textbook example of "RCE" in its modern sense.


Yeah, of the browser or plugin. SQLite is neither, although it does have a local exploit.


JavaScript code has access to SQLite. So any webpage where the attacker controls/can inject JavaScript is vulnerable.


Firefox has never supported WebSQL...


Ancient and long dead Opera 12.xx let you set all HTML5 Offline Storage quotas/features globally and per domain.

Afaik Chrome and its derivatives lack any form of user control over local storage. No Quota mechanisms, no domain black/white listing, no feature toggle. localStorage, webSql, IndexedDB, Filesystem API, all forced on with no limits under user control.

In a better world this would be an easy fix for Chrome users unable to upgrade their browser: flip one config setting to disable webSql and you are done. Alas, Google won't let you do that. Can't wait for the first worm using this vuln.


I casually thumbed through a few of the commits they posted and came across this

https://chromium.googlesource.com/chromium/src/+/c368e30ae55...

   for(i=0; i<nChar; i++){
     if( n>=nByte ) return 0;      /* Input contains fewer than nChar chars */
     if( (unsigned char)p[n++]>=0xc0 ){
   -      while( (p[n] & 0xc0)==0x80 ) n++;
   +      while( (p[n] & 0xc0)==0x80 ){
   +        n++;
   +        if( n>=nByte ) break;
   +      }
     }
   }
   return n;
Looks like there may have been an issue in parsing malformed multibyte unicode characters properly.
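
For readers who want to experiment with the loop outside of SQLite, here is a self-contained C sketch modelled on the patched snippet above. It is a paraphrase and not the actual SQLite or Chromium source: the function name is made up, and this variant checks the bound before every read of a continuation byte, which is slightly stricter than the quoted patch.

```c
/* Returns the number of bytes occupied by the first nChar UTF-8
 * characters of p, or 0 if the nByte-byte input holds fewer than
 * nChar characters. Never reads p[n] for n >= nByte. */
int utf8_prefix_bytes(const char *p, int nByte, int nChar) {
    int n = 0;
    for (int i = 0; i < nChar; i++) {
        if (n >= nByte) return 0;   /* input contains fewer than nChar chars */
        if ((unsigned char)p[n++] >= 0xc0) {
            /* Lead byte of a multibyte sequence: skip continuation
             * bytes (10xxxxxx), stopping at the end of the buffer. */
            while (n < nByte && ((unsigned char)p[n] & 0xc0) == 0x80) {
                n++;
            }
        }
    }
    return n;
}
```

With the 4-byte input "a", 0xC3 0xA9, "b", the first two characters occupy 3 bytes, and a lone trailing lead byte no longer lets the scan run past the end of the buffer.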


   const secondStatements = [
   "SELECT quote(root) from ft_segdir;",
   "UPDATE ft_segdir SET root = X'0005616261636B03010200FFFFFFFF070266740302020003046E646F6E03030200';",
   "SELECT * FROM ft WHERE ft MATCH 'abandon';"
   ];
Just saw the proof of concept page. Looks like they are building quite the unusual string in hex... Starting with a null terminator? Mmmhmmm


Unfortunately this announcement is light on details; does anybody know what the actual vulnerability was?


Reading through https://www.sqlite.org/releaselog/3_26_0.html which they linked to, I'm presuming that it is tied to items 3 and 4. Which is highly suggestive that the problem is that ordinary SQL is able to write to internal virtual tables in a way that corrupts the database. And presumably from there, once you can introduce corruption you can get it to exploit a payload that you provide.

The fact that Chromium also saw fit to patch this suggests further that there was likely some way that it could be tricked into issuing queries that did this, allowing some compromise of the browser. If this could have been triggered by a web page, then that explains why they are light on details.

It should be noted that a lot of applications embed SQLite internally. If one as well studied as Chromium could be tricked in this way, I'm sure that others can as well. And since the upgrade has to happen to an embedded component, we're probably going to hear about this one for a while.

Please note, this is all educated guesswork from knowing the software ecosystem and reading release notes. I have absolutely no knowledge of the vulnerability.


> The fact that Chromium also saw fit to patch this suggests further that there was likely some way that it could be tricked into issuing queries that did this, allowing some compromise of the browser. If this could have been triggered by a web page, then that explains why they are light on details.

Chromium still supports WebSQL, though, which gives you essentially free rein on a SQLite database. This is quite different from the way most applications expose SQLite to untrusted data (i.e. only through parameter binding).


From what some others have posted, I suspect the underlying bug here is that a database corrupted in the right way can cause arbitrary code execution. Since SQLite suggests use cases that require loading untrusted databases, this is a bug in its own right.

Then, the rest is "to be safe" measures, because it was possible for carefully crafted SQL to intentionally corrupt the database in controllable ways, including triggering the former bug. This isn't really the bug fix, but rather a measure to reduce the attack surface against similar undiscovered bugs.

This is speculation, though.


Correct. The primary error is that corrupt "shadow tables" used by the FTS3 full-text search extension could cause RCE. The fix for that specific problem is here: https://www.sqlite.org/src/info/d44318f59044162e
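For readers unfamiliar with shadow tables, the following sketch lists the ordinary tables that a full-text virtual table creates behind the scenes. It assumes the SQLite library linked into Python was built with the FTS4 module, which is common but not guaranteed, so it returns an empty list when the module is missing. The table name `docs` and the function name are hypothetical.

```python
import sqlite3


def list_shadow_tables(module="fts4"):
    """Create a virtual table named 'docs' and return the extra tables it spawns.

    Returns an empty list if the requested module is not compiled in.
    """
    conn = sqlite3.connect(":memory:")
    try:
        try:
            conn.execute(f"CREATE VIRTUAL TABLE docs USING {module}(body)")
        except sqlite3.OperationalError:
            return []  # e.g. "no such module: fts4"
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name <> 'docs' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        conn.close()


print(list_shadow_tables())
```

The exact names depend on the module and SQLite version. The point is that they show up as regular tables, which is why ordinary SQL could write to them before the defensive option existed.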

The new SQLITE_DBCONFIG_DEFENSIVE feature is more of a defense-in-depth, designed to head off future vulnerabilities by making shadow-tables read-only to ordinary SQL, along with some other restrictions. If you have an application that allows potential attackers to run arbitrary SQL, then the use of SQLITE_DBCONFIG_DEFENSIVE is recommended. It is not required. We still consider it a serious bug if somebody is able to find an exploit even with SQLITE_DBCONFIG_DEFENSIVE turned off. But that setting reduces the attack surface, making future bugs less likely.
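As a rough sketch of how an application might opt in: in C the flag is set per connection with `sqlite3_db_config(db, SQLITE_DBCONFIG_DEFENSIVE, 1, 0)`. From Python, the equivalent is `Connection.setconfig()`, which as far as I know only exists in Python 3.12 and later, and only when the linked SQLite is new enough to define the constant. The helper below (a hypothetical name) therefore checks for both and reports `False` rather than failing on older setups.

```python
import sqlite3


def enable_defensive(conn):
    """Try to turn on defensive mode; return True only if it is confirmed on."""
    op = getattr(sqlite3, "SQLITE_DBCONFIG_DEFENSIVE", None)
    if op is None or not hasattr(conn, "setconfig"):
        # Older Python or older SQLite: the option is not reachable from here.
        return False
    conn.setconfig(op, True)
    return conn.getconfig(op)


conn = sqlite3.connect(":memory:")
print("defensive mode enabled:", enable_defensive(conn))
```

Because the flag is per connection and off by default, it has to be set on every connection that will run untrusted SQL.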


They are giving more SQLite users time to patch and waiting for the CVE to be assigned. Looking at the Chromium patch, this has something to do with ALTER TABLE. Looking at the SQLite release notes, they are clearly hiding the real nature of the issue there. For instance, see this: https://www.sqlite.org/lang_altertable.html

"Compatibility Note: The behavior of ALTER TABLE when renaming a table was enhanced in versions 3.25.0 (2018-09-15) and 3.26.0 (2018-12-01) in order to carry the rename operation forward into triggers and views that reference the renamed table. This is considered an improvement. Applications that depend on the older (and arguably buggy)..."

A problem that (well tailored) enables an RCE is just "arguably buggy" in their view?


I think it relates to "ALTER TABLE" too, specifically: https://www.sqlite.org/src/info/6e1330545e7b74fe

The comments in checkin a61ed147 for "renameColumnFunc()" give me the willies.


They say to update to this SQLite version: https://www.sqlite.org/releaselog/3_26_0.html

Skimming that log doesn't give me any ideas.. maybe the diff would help elucidate?


Maybe related to this, as a guess? https://www.sqlite.org/c3ref/c_dbconfig_defensive.html#sqlit...

SQLITE_DBCONFIG_DEFENSIVE The SQLITE_DBCONFIG_DEFENSIVE option activates or deactivates the "defensive" flag for a database connection. When the defensive flag is enabled, language features that allow ordinary SQL to deliberately corrupt the database file are disabled. The disabled features include but are not limited to the following: The PRAGMA writable_schema=ON statement. Writes to the sqlite_dbpage virtual table. Direct writes to shadow tables.


The Chrome update only bumped to 3.25.3, so it's likely a commit between 3.24.0 and 3.25.3:

https://chromium.googlesource.com/chromium/src/+/c368e30ae55...


Maybe related to https://www.sqlite.org/src/info/8576ccb479fc4b76 ?

Edit: Yes, this is probably it.


No way, this commit is Windows only.


That commit seems to imply that it's a bug when running on Windows, but the Tencent Blade folks have said they've exploited this on Google Home devices. My guess is that this commit is one of many that helped resolve the vulnerability.


"Google Home" is either:

the app installed on your phone to control say a Chromecast,

or a smart speaker.


Yeah, I was hoping to find a CVE number but couldn't


Has there been a CVE number assigned yet?


I don't see any: https://www.cvedetails.com/vulnerability-list/vendor_id-9237...

Edit: it's "To be allocated": "[$TBD][900910] High To be allocated: Multiple issues in SQLite via WebSQL. Reported by Wenxiang Qian of Tencent Blade Team on 2018-11-01" (from https://chromereleases.googleblog.com/2018/12/stable-channel...)



Welcome to the security apparatus hype machine.


Remote implies it can be accessed remotely, is that true, or did they mean “remote if an attacker can remotely send data to SQLite”?


Sounds like they mean “remote” because chromium uses SQLite and JavaScript loaded into your machine comes from a remote source. So because a website can run JS that can exploit chromium they’re calling it an RCE.


Yeah it seems like RCE in the context of Chromium, but not SQLite? I know it’s pedantic but if this is RCE in SQLite because it’s exposed to the network via other software, every vulnerability is “remote” because you may expose it via other software.


This is surprising considering that SQLite is very heavily tested. It shows that ridiculous amounts of testing with 100% coverage of every code path and "millions and millions" of test cases still doesn't guarantee that the program always works as intended.

I think that this is an important lesson about testing. We should have fewer tests but we should try to get the most value possible out of each one and for developers that means actively seeking out unusual edge cases that are likely to break things.

Source: https://www.sqlite.org/testing.html


Two points:

(1) The coverage testing used by SQLite is very good at finding problems that occur when the system is used as it was intended. Fuzz testing is better for finding vulnerabilities that can be exploited by a hacker. The 100% MC/DC testing in SQLite is very useful in ensuring that the code does what is intended for sane inputs. And 100% MC/DC helps prevent us from breaking things as we evolve and enhance the code. But the MC/DC testing is less useful at fending off attackers.

(2) The Magellan vulnerability exploits a bug in an SQLite extension, FTS3, which while very well tested, is not tested to 100% MC/DC. (See the second sentence at https://www.sqlite.org/testing.html#test_coverage)

Hence my takeaways from this episode include that I need to extend 100% MC/DC testing to all commonly used extensions in SQLite, including FTS3, FTS5, and RTREE, and I need to improve fuzz testing throughout SQLite but especially in extensions.

Advocates of "safe" languages correctly observe that this particular problem would not have happened if SQLite were written in (say) Rust. Rewriting SQLite in Rust is not (yet) a viable solution. (See https://www.sqlite.org/whyc.html) But I can start moving SQLite in that direction, and perhaps make use of techniques taken from safe languages to improve its resistance to attack.


Hopefully soon, “moving in that direction” can be done by slowly porting to Checked C, while always retaining an executable artifact. https://github.com/Microsoft/checkedc


Zhuowei Zhang (@zhuowei) published a proof of concept that crashes Chrome 70: https://worthdoingbadly.com/sqlitebug/


Python ships with a sqlite3 module in the standard library.

Does this mean Python needs to ship a security patch? What should Python users be doing about this?


But does the sqlite3 module actually contain SQLite with it, or just a library to interface with it? The fix does not change any interface library code.


On Windows, a complete sqlite3 DLL (~1MB) is included with the Python distribution.

On Linux/macos, the Python extension (usually) links dynamically to a shared sqlite3 system library.
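Either way, a Python user can check which SQLite library their interpreter actually loaded. The sketch below compares it against the two versions mentioned elsewhere in this thread (3.25.3 for the FTS fix Chromium picked up, 3.26.0 for the defensive option). The helper name is hypothetical, and the result is only a hint: a distribution may backport fixes without changing the version number.

```python
import sqlite3


def is_at_least(version_info, minimum):
    """Compare two version tuples such as (3, 26, 0)."""
    return tuple(version_info) >= tuple(minimum)


# Version of the SQLite library loaded at runtime, not of the Python module.
linked = sqlite3.sqlite_version_info
print("linked SQLite:", sqlite3.sqlite_version)
print("at least 3.25.3:", is_at_least(linked, (3, 25, 3)))
print("at least 3.26.0:", is_at_least(linked, (3, 26, 0)))
```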


Does Python use its own sqlite3 or the system sqlite3? In the latter case, patches would be the package maintainers' responsibility.

Of course, on Windows, it's going to have to use its own sqlite3.


Windows 10 has shipped a copy of sqlite3 since version 1511; I don't know which version it's shipping in the latest updates, though.


> Does this mean Python needs to ship a security patch?

Python binaries (e.g. the Windows installers) may need to be updated. For Linux distros Python would depend on a system package.


It's only an issue if you allow untrusted code to run arbitrary SQL statements, which should never be done.


Do you have a source for that? It seems far more concrete than anything else I've seen.


"If you use a device or software that uses SQLite or Chromium, it will be affected."

If I write a hello world C program that does some sort of IO with SQLite, it will be vulnerable to remote code execution? (if this turns out to be true, that will be quite impressive!)

Guessing something was lost in translation there. Sounds more like someone found a way to get code execution if you can inject certain data into SQLite, then found various applications that expose this functionality remotely?


Probably they found the vulnerability through Chromium, then extended that to "everything that uses SQLite". Hard to tell anything without more details though.

But if that is the case this is huge. SQLite is used in many places nowadays: Websites, browsers (Chromium and Firefox, I know of), various software including some Android apps. That also probably means the attack vector is some procedure where input is sanitized (assuming SQLite provides that, I never programmed against the C API).


WebKit; worse off, WebKit and Chromium expose SQLite almost directly through WebSQL! Drive-by malware!


If I understand correctly, it requires JS to be enabled (which it is usually).

(Edit: wrong term, it's not "HTML5 Local Storage", it's "HTML5 Database" thanks):

Chromium (EDIT: idk yet)

Webkit2: https://webkitgtk.org/reference/webkit2gtk/stable/WebKitSett...


Drive-by malware used to require Flash or Java... which used to always be enabled.

Edit: don't disable local storage! you'll break lots of things that way, and I don't think that includes WebSQL.


Yeah, that's obvious nonsense. Merely using SQLite in your app does not open a port. How, exactly, are you going to be "vulnerable to remote code execution" when you don't use any network connections?

And the phrase "uses SQLite or Chromium" is pretty close to gibberish. Those two things... are not really related.


Doesn't Chromium (like lots of other software) embed SQLite? So using Chromium would be a (potentially easily overlooked) way of using SQLite?


Yes it does, and it makes sense to mention it because Chrome and Chromium are very popular and widespread, but many may not know it uses SQLite.


It's a fairly special case where any website can execute arbitrary SQL due to WebDB. It should be mentioned that this is deprecated: https://hacks.mozilla.org/2010/06/beyond-html5-database-apis...


Yeah and that deprecated feature will probably still be around failing to totally die a decade from now.


That would be "Chromium and SQLite", which is a combination that makes sense. "Or" doesn't make sense.


Huh, so Tencent Blade is like Google Project Zero? Ideally all companies would start attacking each other, and improve everyone's security.


The questions are: Where is the vulnerability? By executing user-specified SQL statements (with or without setting an authorizer callback; I have once reported a bug causing SQLite to segfault in some cases when the authorizer callback denies something)? By downloading a corrupt database? In some extension (if so, in what extension)? In the VFS? What circumstances are needed to exploit this?
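For context on the authorizer callback mentioned above, here is a minimal sketch of one in Python's `sqlite3` module. The callback is consulted while a statement is being compiled and can allow or deny each action. The table name, the allow-list, and the `try_sql` helper are hypothetical, and this is not a claim that an authorizer would have blocked this particular vulnerability.

```python
import sqlite3

# Hypothetical policy: permit plain reads, deny everything else.
ALLOWED_ACTIONS = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ}


def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Called by SQLite for each action a statement would perform."""
    if action in ALLOWED_ACTIONS:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")
conn.execute("INSERT INTO notes (body) VALUES ('hello')")
conn.commit()
conn.set_authorizer(read_only_authorizer)  # restrictions apply from here on


def try_sql(conn, sql):
    """Run a statement and report either its rows or the denial message."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.DatabaseError as exc:
        return f"denied: {exc}"


print(try_sql(conn, "SELECT body FROM notes"))
print(try_sql(conn, "DROP TABLE notes"))
```

A real allow-list would likely need more entries (for example, function calls are a separate action), so treat the set above as a starting point only.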


Pretty sparse on details. I presume local SQLite files for backend systems are unaffected, can anyone confirm?


Any news about Firefox? Are they also impacted?


Apparently not since they resisted implementing WebSQL.


Pretty light on the details here.


When will people start realizing how fragile every project from Google actually is? In the end it's built by humans. Remember last week's Kubernetes issue, and consider the fact that Google pays thousands of dollars to white-hat hackers who have helped them in the past.



