In a similar vein, VueJS has a comprehensive comparison of itself with other frameworks.
I really appreciate open source maintainers who have the humility to compare their work fairly with others in the domain while still taking pride in their own achievements.
For example https://github.com/sapcc/swift-http-import#do-not-use-if -- This particular instance was born out of watching users bending over backwards to use the software even though standard software does the job just fine or better in their concrete case.
So while I have no idea about the status of vue.js regarding that topic, I believe that even being in a normal state there is a huge positive in comparison to riot.js, though the whole issue might be a little subjective too.
SQLite, on the other hand, doesn't have such a process, so every client has to open the database file itself, process the SQL, and close/write the file again.
So the biggest difference is that if you want to use SQLite, you will have to limit your database operations to one (or fewer ;-) ) processes, to avoid a situation where one process has to wait until another has finished writing the database file to the filesystem just to read it again a moment later. Otherwise the filesystem and the continual reading and writing of the database file will make the whole thing very slow.
But as long as you have only one process using the SQLite file and keep it open for a while you will see a comparable performance to other database systems. Actually, SQLite can be very fast if you do use it in a sane way.
So the limiting factor is not the degree of concurrency, but the amount of write-activity. Workloads that mostly read and only occasionally write to the database work well with SQLite.
I say this not because SQLite is not good. It's _great_ for what it intends to do, which is, as the linked page says, to offer an excellent alternative to fopen(). It doesn't really offer an alternative to a full DBMS, though. There may be some gray area between a "heavy fopen()" workload and a "light DBMS" workload where there might be a legitimate case to be made for SQLite over a DBMS, but standard OLTP or OLAP workloads with any degree of concurrency are almost certainly outside of SQLite's intended use case.
There was a recent conversation on a similar topic with input from Richard Hipp, SQLite's author, here: https://news.ycombinator.com/item?id=15303662
In a usual quassel (IRC bouncer that can, optionally, store logs in SQLite) setup, you'll have between 1 and 10 writers, constantly writing hundreds of IRC messages a second to SQLite, while you'll also have readers that need to respond with low latency to queries loading hundreds of thousands of messages, including those that were just written.
The result is that SQLite gets so slow that the writers queue up so much work that the IRC connection times out.
SQLite is absolutely useless for multiple writers.
I don't really quite agree with this, but I would say that with SQLite it's more important to be aware of how concurrency is being used with regard to the database, compared to something like MySQL or PostgreSQL. As far as IRC logs go, I think the important thing (that would almost certainly hamstring any SQL database that's actually being ACID) is to not use autocommit on inserts, and to not use a transaction for every single insert (which is possibly what's happening). I wrote a plugin for ZNC to do exactly what you're doing above, and by batching my writes (either waiting 15 seconds or until n messages are queued, where n is maybe 1000 or more), I've gotten superb performance out of SQLite3 running with a WAL journal, with the trade-off being that a little bit of data might be lost if your bouncer happens to fall down in an unhandled way. I was also able to get really good read performance without hurting raw write performance by leveraging partial indices, which something like MySQL doesn't give you.
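For what it's worth, the batching approach described above can be sketched with Python's built-in sqlite3 module. The table, columns, partial-index predicate, and flush threshold are all made up for illustration; this is not the actual ZNC plugin:

```python
import sqlite3

conn = sqlite3.connect("logs.db")
conn.execute("PRAGMA journal_mode=WAL")  # readers don't block the single writer
conn.execute("""CREATE TABLE IF NOT EXISTS messages (
    ts INTEGER, channel TEXT, nick TEXT, body TEXT)""")
# Partial index: only rows matching the WHERE clause are indexed,
# so reads on the hot channel stay fast without bloating every insert
conn.execute("""CREATE INDEX IF NOT EXISTS idx_ops
    ON messages(ts) WHERE channel = '#ops'""")

queue = []  # messages buffered in memory until the batch is full

def log(ts, channel, nick, body, flush_at=1000):
    queue.append((ts, channel, nick, body))
    if len(queue) >= flush_at:
        flush()

def flush():
    if not queue:
        return
    with conn:  # one transaction (one fsync) for the whole batch
        conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", queue)
    queue.clear()
```

The trade-off is exactly as stated above: anything still sitting in `queue` when the process dies is lost.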
With SQLite, it’s basically useless.
In the end, it's a matter of testing. Also, with faster drive options (Optane and whatever is next), it could very well be a decent option. It all depends.
Frankly, for logging, I'd lean towards elasticsearch (ELK) these days, which is about the most write heavy situation you're likely to see. Depending on the scaling Mongo, RethinkDB, ElasticSearch and Cassandra can all be better solutions than RDBMS. I'd still reach for SQLite or PostgreSQL first depending on the situation though.
Making a backup before trying something risky (and then restoring the backup when something screws up) is similarly trivial.
Maybe to some of you it's not worth that much, but it's a legitimate timesaver in my opinion.
But you wouldn't want a single-file format on enterprise-level systems where enterprise-level performance and stability are required. At the very least, you'd want your data and transaction log files to be different files on different disks, especially when IO throughput is important.
Also, since DBs tend to have important data on them, being able to copy a file (db) and move it anywhere you want isn't much of a selling point on serious systems.
Having said that, sqlite is great for application/client side storage.
It's deeply disappointing to not be able to use SQL on both client and server. Yes, there's Lovefield, which employs its own DSL that has nothing to do with SQL, so you can scrap everything on the server and redo it on the client.
Or just give up and go with NoSQL on client and server, that seems to be what the browser vendors are implicitly pushing developers toward.
SQLite wasn't kicked to the curb; the WebSQL API, whose “standard” was essentially “act like version x of SQLite,” was kicked to the curb.
The concept of WebSQL was good, but the standard itself wasn't great, and the absence of any intent to implement from other vendors or intent to improve the standard from those who had embraced it made it a dead-end standard.
Nobody said what was so wrong about that.
I just don’t think there’s any rule that says that we need multiple implementations of a thing to make it a standard. Also, we already have a standard for SQL and they could’ve just used the parts of SQLite that were compliant with that.
> I just don’t think there’s any rule that says that we need multiple implementations of thing to make it a standard
There are rules that say that we need multiple implementations of a thing to make it a standard:
* The IETF requires it to be accepted as "Internet standard", ie the highest form: https://en.wikipedia.org/wiki/Internet_Standard#Standardizat...
* The W3C strongly desires that proposed standards have multiple, independent implementations to advance to the state of "Proposed Recommendation": https://www.w3.org/2005/10/Process-20051014/tr.html#cfr
> Also, we already have a standard for SQL and they could’ve just used the parts of SQLite that were compliant with that.
That would probably have been the best solution, I agree. And then people would have complained about bloat again, that browsers are now required to expose a SQL engine, that they are too big, etc...
I know right, there are in fact several (which may have been part of the problem). Most SQL database vendors tend to stray slightly (or not so slightly) from the SQL standard that they target; SQLite is no exception.
Just unfortunate that the browser vendors couldn't come together and build a client-side SQL based API rather than punting and implementing a very basic key/value store on top of which someone will eventually reimplement SQLite. It's like 5 or 10 years from now we'll have what we had 10+ years ago, such progress!
It's hard to say which reason was more compelling for Mozilla, but my gut feeling is that they were betting hard on NoSQL and came up with convenient technical justifications afterwards.
Are you thinking of WebSQL? I don't remember a SQLite interface client-side in any browser. I might be wrong, but I thought IndexedDB was originally implemented on top of SQLite in both Chrome and Firefox. I think Chrome moved to a different backing store, but that has no bearing on the API.
Well, yes and no. The whole problem with WebSQL was that it effectively meant using SQLite and never anything else that wasn't completely bug compatible.
So the whole problem with WebSQL was that it was, de facto, a web standard built around giving an SQLite interface client-side.
Yes, exactly, but Web SQL uses SQLite as the backing store. IndexedDB is its own thing, a standalone key/value store, nothing to do with SQLite.
Were you writing the same queries on the client and the server? Generally client and server are different moving parts of the same machine, so they don't really share a lot of code anyway. (Perhaps in some cases computations could optionally run on client or server depending on the circumstances, but security concerns limit the opportunities to do that.)
Yes, it's a pain to have to know two different technologies, but it doesn't seem like this often leads to literally needing to scrap code and rewrite it.
> Yes, it's a pain to have to know two different technologies, but it doesn't seem like this often leads to literally needing to scrap code and rewrite it.
Currently working on an Angular 5 + TypeScript frontend with Scala on the backend (a shared-nothing architecture), and it really blows compared to a single typed, shared-everything stack.
All the stats based queries run on the server? Throw 'em out and manually recalculate on the client, probably literally 100X the amount of code; less efficient and error prone to boot. Same for form validation, model layer (hey, why does this new property exist client-side and not on the server?), etc. -- it's a house of cards.
As much as I love SQLite, I find this little issue always adding an extra bit of work to deal with. Not a blocker, but useful to know. I've found that in some (rare) cases, H2 became a better solution (horrendous start time, slow data load, but ability to use real date and time types reduced errors and improved analyses).
Finally, knowing that SQLite is not right for a situation is great, but it wouldn't hurt for them to mention a potential alternative, if the devs or community know of one.
H2, being java, is in a whole different universe from SQLite...not a valid comparison in my view.
It's dual-licensed under the Mozilla Public License and the Eclipse Public License.
Another perspective: embeddable Python 10MB, Electron app 50MB, Free Pascal exe 1.5MB.
So 7MB for a RDBMS seems reasonable, but not comparable to SQLite as you pointed out.
If you're determined enough you can turn any datastore into a mess, though I suppose SQLite gives you fewer tools.
Genuinely curious, what kind of page would require 200 queries? I've written plenty of moderately-complex applications and don't think I've come close to 10% of that on a single (non-crazy dashboard) page.
I should imagine something running on a generic ORM (ie without the optimised queries) would use considerably more (Rails, IIRC, will often do two queries to load a single object.)
(And just to be clear, for my use cases, I have been very happy with SQLite's performance.)
(If you’re thinking, “no dummy, just use caching” then I regret to inform you that cache invalidation and eviction without an info architecture are as hard if not harder. And caching bugs make you look like a clown in front of your bosses. Because the whack-a-mole process never looks professional)
Caching has sucked all the fun out of too many projects I’ve been associated with.
Do cache headers and cache busting first. You can always put a transparent cache in front of your REST endpoints in prod. After debugging the browser caching. Don’t write your own caching implementation. Please.
I've done similar with node.js streams pretty easily.
This can then be displayed to the user via a web server which has read-only access to the measurement db.
Haven't had much time to finish it, but it's been fun to play with.
I appreciate that there's just a hint of snark there. YAGNI, after all, is one of the most often-forgotten important principles in this industry.
Mainly this is because there is at most one global transaction per SQLite database (in other words, you can't nest transactions), but also because the optimum window to gather up inserts for a bulk transaction might not match up well with the logical function call structure of your program. I recently learned about the SQLite SAVEPOINT feature; this might help implement nested transactions, but cannot help with the second problem.
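For reference, the SAVEPOINT feature does give a workable nested-transaction idiom. A minimal sketch using Python's sqlite3 (table and savepoint names invented):

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode so we can
# issue BEGIN / SAVEPOINT / COMMIT statements ourselves
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")

# "Nested" transaction via a named savepoint
conn.execute("SAVEPOINT transfer")
conn.execute("INSERT INTO accounts VALUES ('bob', -50)")
# Roll back just the inner work; the outer transaction survives
conn.execute("ROLLBACK TO transfer")
conn.execute("RELEASE transfer")

conn.execute("COMMIT")
# alice's row is committed; bob's row was undone by the rollback
```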
For example, what if you made a "logging.db" file for a GUI application, and you're grouping writes to a log table in batches of 50 for each INSERT transaction. What happens if your GUI app gets to 48 inserts but then stops receiving user interaction events? You'd probably still want those 48 inserts to show up (eventually) in the logging.db rather than just being "lost" if the app subsequently crashes. So what do you do? Register a timer event to create a watchdog to flush the transaction if enough time goes by? OTOH what if a subroutine of your app is actually at that moment performing a lengthy "real" (logical) transaction and wouldn't want it flushed? Or maybe the application should use different database files for logging versus data where consistency is vital (so that a database can be opened either in "accumulate INSERTs" mode or "real logical transactions" mode)?
My point is not the detail of how to answer those questions; it is just to highlight that these are questions which have to be thought about at a whole-application architecture level.
Working around these sorts of problems with integers is just not worth the trouble.
I can understand the convenience if you are setting up a simple online store, but more complicated cases (like the one in your example) absolutely need a proper handling of decimal values.
I entirely agree that numeric is a good solution for this case.
But to try to answer your question nevertheless: I'd say there is potential for storage-density and math-performance issues. You'd need a number of numeric columns per row, and a lot of rows, before storage density matters. If you do a lot of aggregates over numerics you definitely can notice their CPU overhead over, say, a plain float. But most of the time that doesn't matter.
(I don't disagree with you, just answering your question)
Resource consumption isn't an issue at all. 100MB+ is less than loading an average news website in a browser and it costs almost nothing. The reason why I would have wanted to use SQLite sometimes is that it makes it easy to distribute and run the entire app on premises if needed. People are rightly concerned about losing access to web based software.
Serious question, what's the proper way to handle this case? I'm having a hard time believing it's IEEE-754 floats.
Of course, there's an issue of how to handle mul/div (i.e., rounding/truncation), but that's a separate question -- at least you'd be making decisions yourself rather than relying on how IEEE-754 floats work.
Some DBs like PG, do, as a matter of fact, support decimals natively which makes things quite a bit simpler.
> E.g., you won't find floating point fields in binary protocols for any of the major stock exchanges.
That's true of NASDAQ's OUCH, but some (many?) other protocols use decimal floating point.
NYSE's UTP Direct (see section 4.15) is not fixed point, but actually their own decimal floating point system with a 32-bit significand and a single ASCII character exponent ("0" to "6", indicating a negated base-10 exponent, not allowing "5").
FIX SBE uses an 8 byte significand and a 1 byte (power-of-10) exponent, if I remember correctly.
It's a bit unfortunate that (as far as I know) there aren't any major messaging formats supporting IEEE 754-2008 decimal64 values.
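To make the "integer significand plus negated power-of-ten exponent" style described above concrete, here is an illustrative decoder in Python; it is a sketch of the general scheme, not an exact implementation of any exchange's wire format:

```python
from decimal import Decimal

def decode_price(significand: int, exponent_char: str) -> Decimal:
    """Decode a price from an integer significand and a single ASCII
    digit giving a negated base-10 exponent, e.g. '2' means 10**-2.
    Illustrative only, not a real wire-format parser."""
    exp = int(exponent_char)
    return Decimal(significand).scaleb(-exp)
```

Using Decimal rather than a binary float keeps the decoded value exact, which is the whole point of these encodings.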
"up to 131072 digits before the decimal point; up to 16383 digits after the decimal point"
Non-jokingly: I think precision-wise that's really plenty for pretty much any halfway realistic use case (no, don't do public key crypto using SQL datatypes). There are some issues with the numeric/decimal type, but lack of potential precision isn't one of them.
From the docs for [`BigDecimal`](http://ruby-doc.org/stdlib-1.9.3/libdoc/bigdecimal/rdoc/BigD...):
> Ruby provides built-in support for arbitrary precision integer arithmetic. For example:
> 42**13 -> 1265437718438866624512
> BigDecimal provides similar support for very large or very accurate floating point numbers.
My takeaway from the above is that, if arbitrary precision is needed, you should still just use integers. Tho, in practice, it may not be significant.
> Ruby provides built-in support for arbitrary precision integer arithmetic.
[I'm also assuming that it's really arbitrary precision and not just 'really big but limited' precision.]
Tracking transactional values for large stock and currency trades would be an example. That said, you could use integers to represent arbitrary precision.
Store your value as a single integer, then write a small layer that converts that integer into a decimal (that's all a language or DB does anyway). Alternatively, use two entries in the database for each side of the decimal if you want even more precision, or store it as a string if you want arbitrary precision.
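A minimal sketch of that conversion layer in Python, assuming a fixed scale of four decimal places (the scale and function names are arbitrary choices for this example):

```python
from decimal import Decimal

SCALE = 4  # store amounts as integer ten-thousandths (an assumption)

def to_db(amount: str) -> int:
    # "19.99" -> 199900; the DB column is a plain INTEGER
    return int(Decimal(amount).scaleb(SCALE))

def from_db(units: int) -> Decimal:
    # 199900 -> Decimal("19.9900")
    return Decimal(units).scaleb(-SCALE)
```

Addition and subtraction stay exact on the integer side; multiplication and division are where you have to decide on a rounding policy, as noted elsewhere in this thread.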
Well, that's my point. Postgres does it for me. SQLite doesn't. I'm lazy.
Also, SQLite's largest integer type is 64 bits. That's awfully tight for some use cases.
Also, seems like the obvious thing to do to store money in an integer would be to store it in pennies. Or if the numbers get too big, do what corporate financial reports do and use thousands.
If you implement multiplication and division with integer arithmetic, you need the additional decimals to keep the error small. See https://en.wikipedia.org/wiki/Salami_slicing
I recall something about Oracle sending people down there to help with the bank or perhaps government databases deal with those numbers. I can't find confirmation of that, but https://en.wikipedia.org/wiki/Hyperinflation_in_Zimbabwe says "Computers could not handle the amount of zeros such that other forms of money had to be used to act as normal money (bearer’s cheques)."
But if you sacrifice 7 places you can no longer represent the market cap of Toyota Motor at the Tokyo stock exchange (24 trillion Yen) or the US GDP (~19 trillion USD).
So if you don't have a data type that can represent all of these quantities as well as the results of any calculations up to the required precision then your code will become a lot more complex and error prone.
Also - easy to dump the sqlite db into a SQL text file then import into something like postgres :)
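Python's sqlite3 module can produce that SQL text dump programmatically via Connection.iterdump(). Note the output is SQLite dialect, so it usually needs small tweaks before something like psql will accept it. A sketch with an invented example table:

```python
import sqlite3

src = sqlite3.connect("app.db")
src.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)")
src.execute("INSERT INTO users (name) VALUES ('alice')")
src.commit()

# Equivalent of the sqlite3 CLI's ".dump": a stream of SQL statements
with open("dump.sql", "w") as f:
    for stmt in src.iterdump():
        f.write(stmt + "\n")
# dump.sql can then be fed to the target database, typically after
# fixing up SQLite-specific types and quoting
```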
An SQLite database is limited in size to 140 terabytes (2^47 bytes, 128 tebibytes).
I think it's a hard question to answer, "what's the upper limit on SQLite?"
Better question would probably be, "We have this app, what database should I use?"
I use SQLite for a lot of smaller apps. Applications that have a focus on single-business, low-volume use.
They might do a good amount of computation and data-fetching, but they don't make heavy use of the database.
If you need features Postgres has, use that. If you need features MS SQL has, use that... if you don't need anything special, think about using SQLite :)
However, the limit depends heavily on your workload. If your working set (the data that all the simultaneous queries require) fits in memory then you'll be limited by memory bandwidth or the language you're calling from: GC stalls and query preparation can really hammer the throughput; if you're doing any actual work with the results of the query then I'd hope that time spent there would dominate the data access.
If your data doesn't fit in memory then you'll be limited by the speed of your disk system. If you have a fixed working set then you can increase the performance by adding RAM until the working set fits again.
If you have a streaming workload then you'll never be able to fit it in RAM (by definition) so you'll be limited by the speed of the IO systems involved; either network or disk.
If things are still too busy then stuff will start to queue up. This will cause each individual process to run slower as it'll be contending with the others. This, in turn will cause more incoming work to queue up. At some point the machine will run out of some other resource (memory for incoming connections, network sockets, space in queues, etc). How the system responds to that depends on how it's configured but, traditionally, your system is no longer providing service.
"SQLite on Heroku" , essentially: don't do it, switch to postgres.
I suppose it would work with AWS Beanstalk and EBS.
What I mean by that is that SQLite really is a replacement for open() and as such is not multi-thread compatible. If you're doing a lot of work that risks corruption, you can also turn up the level of binary logging until recovery becomes easy enough.
I'm using SQLite effectively on a few thousand multi-threaded windows machines and I've encountered corruption exactly once- which was when a hardware raid had degraded non-gracefully.
I also believe that the common way of gaining performance while preserving safety is to use multi-thread mode with connection pooling. The idea being that your application can use different database connections to achieve multiple writes without relying on the overhead of the additional locking required within SQLite itself for serialized mode.
Would be interesting if anybody has performance metrics that back up what I said, with data converging on an optimum number of database connections.
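In that spirit, a rough sketch of the connection-per-thread idea in Python, with WAL enabled so readers don't block the writer (database name, thread count, and workload are invented; this is not a benchmark):

```python
import sqlite3
import threading

DB = "metrics.db"

def init():
    conn = sqlite3.connect(DB)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("CREATE TABLE IF NOT EXISTS samples (v INTEGER)")
    conn.commit()
    conn.close()

def worker(start):
    # Each thread opens its own connection instead of sharing one;
    # timeout makes the busy handler wait for the write lock
    conn = sqlite3.connect(DB, timeout=30)
    for v in range(start, start + 100):
        with conn:  # one short write transaction per insert
            conn.execute("INSERT INTO samples VALUES (?)", (v,))
    conn.close()

init()
threads = [threading.Thread(target=worker, args=(i * 100,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Even with WAL there is still only one writer at a time; the pooling just avoids serializing reads and writes through a single shared connection.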