Agreed. The reason SQL rules is that nothing has come along that is as powerful (or more so) AND easier.
I used to hate writing SQL (I still do, somewhat), but I took the time some years ago to really learn the language, and once you do that you appreciate a few things:
1. Yes it requires a high mental load to craft something sophisticated
2. I can’t think of a ‘better’ alternative to SQL that isn’t a compromise.
3. You can do a crap load of stuff with joins, procedures and CTEs once you understand it.
> 3. You can do a crap load of stuff with joins, procedures and CTEs once you understand it.
Yes, or you can skip that part and do the tricky part in an actual programming language ;) For instance in Haskell or JavaScript. There is then even a chance that the code is both fast and actually works. :D
I mean, in the end this is just a cheap marketing trick to buy people into the platform. People can write it into their CVs. The whole Influx ecosystem looks great, although the UI (Grafana) is slow as hell once there are enough graphs in it, and if you need fail-over or scaling, you need to buy an expensive license.
Also, the existing query language is already obscure and counter-intuitive.
Many of the optimizations you mention (for example the first two) may in fact just be removing baggage that shouldn't have been there in the first place.
> Yes, or you can skip that part and do the tricky part in an actual programming language
That works great up to a point, but it won't hold up in environments that need to scale. I've worked on a number of systems where they implemented Excel export or bulk import components and wondered why their application locked up after a few thousand rows.
Usually the developers turn around and blame their SQL engine for being deficient and then try Mongo or similar - not necessarily because it's better, but because it allows them to continue working the way they want to (on application code).
Not to sound mean, but the developer types who shy away from writing proper SQL queries also seem to be oblivious to asynchronous programming techniques and to using the right data structures in their code - indicating that they are generally less interested in finding the most efficient methods.
As crazy as this sounds, yes. Not the entire dataset, of course - a selection of it, and then you tie things together on the fly.
I've seen a handful of systems (one of them I wrote myself :D) that did complex queries. In part and in certain situations this can be very efficient, no doubt. But what I observed is:
- dev-wise these systems become one-man-shows
- implementing feature over feature becomes exponentially slower with time
- initial performance is there, but it degrades over time
As someone who actually glares at Perl quite frequently, can I gently request you take that slight back. Or at least qualify it to be "perl regex". Unless you've used Perl 6 (a different language) and not found it readable, you'd do well to say Perl 5 too. Or perhaps you've never used either language and just use it as a de facto meme to make a point?
The bigger argument is not readability but where the abstraction lies. From the examples I can't see much sugar beyond what an ORM often provides.
I write Perl5 every workday and find Perl6 less readable due to double symbols and increased symbol madness. Like the new ternary operator, which is probably much more visible but looks just crazy. Or private/public variables. In some respects it's worse than Perl5, in others it's exponentially more powerful. It's like giving you the equivalent of a swiss army knife for each and every tool inside your toolbox.
That said, I am more tempted to try Ruby, Crystal and Dlang for a new project, rather than Perl6. The latter looks like it was designed by a mad scientist on LSD.
> It's like giving you the equivalent of a swiss army knife for each and every tool inside your toolbox.
Please note that the link you've given is about being able to restrict calling a subroutine by making sure that the first 2 parameters are of the same type (regardless of which type). Generally one knows which types to expect. And even so, typechecking is an optional thing in Perl 6 (hence the term "gradual typechecking").
> The latter looks like it was designed by a mad scientist on LSD.
I think that's uncalled for. But that's your opinion. The same mad scientist who gave you Perl 5, by that reasoning, by the way.
> can I gently request you take that slight back. Or at least qualify it to be "perl regex".
Regex has the same complexity in any language; that can't be a valid reason to denigrate Perl. My biggest issue with Perl is that functions don't even have proper parameters, and you need somewhat cryptic syntax to work with arrays, scalars and dictionaries.
Define "proper parameters". Perl subroutines are delightfully flexible in that regard compared to most other languages, and they also happen to be closer to your machine's reality (x86 doesn't care about what's in your stack when you JMP). With the Perl approach, you have the tools to implement multiple dispatch, variable numbers of arguments, etc. rather easily and elegantly. Yeah, it looks weird from the outside, but once you're used to it you start to want it in other languages.
That's one example of what makes Perl a great teaching tool for general programming: it doesn't hide the mechanics of function/method calls behind the language, but instead shows it off to you and encourages you to explore and experiment.
People keep saying that Perl 6 is too late, but they don't say that about other new languages.
I mean nobody says the same about Julia, which has been in development for the better part of a decade.
The only way it could be too late is if it doesn't do anything better than a single other language. By that I mean if there is a language that does everything that Perl 6 does, and does it just as well.
Even if all it does is bring a collection of useful features that haven't already been collected together into a single language, then it isn't too late.
It could also introduce a new take on an existing feature that goes on to influence the design of future languages. That, too, would mean it isn't too late. (I think that in the future we may be able to say that about Perl 6 grammars.)
I think the real reason people keep saying that about Perl 6 is that they want it to be true. We can only really make that determination years from now when we have perspective.
OK, I didn't know that; I haven't used Perl in a long time. But I don't think any of my Perl-programming colleagues have heard of it or like using it. I still think their code is harder to read because of this ingrained bad practice.
SQL definitely has weaknesses, but I wish people wouldn't use straw-man examples to crap all over it. The Flux example from his blog post would look something like this:
select
    lag(value, 0) over recent * 1 +
    lag(value, 1) over recent * 0.5 +
    lag(value, 2) over recent * 0.25 +
    ... as exp_moving_avg
from telegraph
where time > datetime_sub(current_datetime(), interval 1 hour)
  and measurement = 'foo'
window recent as (order by time rows 10 preceding)
Here, the main difference is that Flux has a built-in exponential moving average function, whereas in SQL we have to actually write out the formula.
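(For reference, the exponential moving average such a built-in presumably computes is just the standard recurrence s[t] = alpha*x[t] + (1-alpha)*s[t-1]; the SQL above hand-unrolls a decaying-weight approximation of it. A minimal TypeScript sketch - the function name and alpha are mine, purely for illustration:)
    // Exponential moving average via the recurrence
    //   s[0] = x[0],  s[t] = alpha * x[t] + (1 - alpha) * s[t-1]
    function ema(values: number[], alpha: number): number[] {
      const out: number[] = [];
      let s = 0;
      values.forEach((x, i) => {
        s = i === 0 ? x : alpha * x + (1 - alpha) * s;
        out.push(s);
      });
      return out;
    }
    // ema([1, 2, 3, 4], 0.5) -> [1, 1.5, 2.25, 3.125]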
> Here, the main difference is that Flux has a built-in exponential moving average function, whereas in SQL we have to actually write out the formula.
The possibility to provide arbitrary data processing functions is one of the core features of contemporary query languages. In SQL it has always been a huge problem (and your example is a good demonstration). In the example they provide, they also rely on a built-in (exponential smoothing) function, so it is not clear whether I can really perform arbitrary ad-hoc computations within the query itself (without built-in or externally defined functions).
> The possibility to provide arbitrary data processing functions is one of the core features of contemporary query languages.
Yes, and this is actually a big selling point of PostgreSQL-based solutions. It's important that people don't confuse modern SQL solutions with the ones they may have encountered decades earlier, or the comparison with NoSQL will indeed be unfair.
SQL has very solid grounding in research - a lot of it - in relational algebra. If you try to make a query language that is a DSL for anything without a really different data model underneath, you will accomplish nothing great.
Just being built on the relational model is a tiny part of what makes SQL what it is (which is largely catastrophic, as languages go.) It says little about how queries are executed, indexes, syntax, data types, yada yada...
And the relational calculus itself isn't super lovely as a base... Turing completeness is a nice thing to have, after all.
Sure, those things aren't specified in SQL, but again -- that's a feature of SQL, not the relational model. You could just as easily have an imperative relational language as a declarative one, so those theoretical foundations can be used to justify very little of the query language.
I think the specific departure from the "traditional" relational model that gets them there is "WITH RECURSIVE". I don't know whether there are other ways to get there.
Personally I think it's the right side of the fork to land on.
Which is actually the case for InfluxDB: there are no relations in it. Remember that what InfluxQL takes from SQL is the syntax, not the data model. However, I must admit I don't see a particularly big need to change the query language to something completely different.
Author here, I just noticed that this got picked up so I'm late to the party. I suppose I'll take the bait and aim to clarify one thing that I think is funny people are getting hung up on. My line: "I don’t want to live in a world where the best language humans could think of for working with data was invented in the 70’s"
Read in context, the meaning of that sentence isn't that things invented decades (or centuries or millennia) ago are all bad. I even state that SQL is a great and powerful tool. If you took from the post that I think SQL is shit and needs to be replaced, you weren't paying attention.
The point of that line (and really the point of us creating Flux) is that we think there can be a more elegant and understandable language (read: API) for working with time series data. But that we won't get there by trying to improve SQL. You don't build an automobile by creating better wheels for your horse and buggy.
Also, SQL isn't a language like English. SQL is an API and APIs change all the time. Yes, code is communication, but its form evolves much more quickly than spoken and written language between humans.
> If you took from the post that I think SQL is shit and needs to be replaced, you weren't paying attention.
Why, why, why, why, why???? I didn't even have to look it up and I could already tell from your attitude in your blog post and this comment that your company is based in SF and backed by a bunch of VC money. And why is it that the systems software startups always have the worst attitudes to boot? Usually, when a bunch of people take something from an article that wasn't intended by the author it's because the author did a poor job not because a whole bunch of people don't get it or aren't paying attention. The arrogance is unbelievable.
Holy shit. The language name alone is a really, really stupid idea.
Protip: If you're inventing a new esoteric programming language (and until you have other people implementing non-trivial projects in/using your language, it is an esoteric programming language), google the fucking name first.
If googling your intended <thing's> name results in more than 1000 hits, CHANGE YOUR <thing's> FUCKING NAME.
If you don't, trying to find any resources about the <thing> on the internet will be a huge pain in the ass. Name your project something unique.
Googling "flux" results in "About 197,000,000 results". If you just make it a little more specific as "fluxql", you get ~142 results.
People looking for language resources will actually find the shit they're looking for, and the name actually tells you something about what it does, which is nice.
fluxlang is what people should be searching for and we'll continue to point that out in future blog posts, documentation, #fluxlang on Twitter and SO and everywhere else. People got there with Go, so I assume people will get there with Flux if the language is successful.
I'm the dir. of eng. for the team building flux. Before that, I built a transactional SQL system (during peak NoSQL hype). I like SQL.
After a year+ of watching people use InfluxQL and thinking about the types of user experiences that timeseries specific platforms can offer - I'm eager to see flux enter the world.
People like exploratory and notebook like environments. Building a language that integrates with those workflows and even supports a REPL for writing queries is a nice fit to this space.
InfluxDB chooses a non-relational data model. Timeseries queries almost all filter on terms, partition, window, group - and then apply a sequence of functions to those groups. Most queries end up using SQL analytic functions that many users aren't experienced with... while mapping an only vaguely-relational (and very non-normalized) data model to boot.
The timeseries space is visual - visual tooling really matters. SQL isn't an easy fit there, either. It is hard to write SQL incrementally or to interpret just part of a SQL query to show intermediate results. Additionally, users expect a large set of non-standard SQL functions to be builtin.
There are competing systems betting explicitly on SQL and others choosing a functional approach - a strong competition of ideas and practices that should be a win for end-users.
(And to correct the above comment, flux expresses select, project, and join operators.)
When making a new project, I honestly believe the best way to do it is to document the prior art - in this case SQL, Datalog, etc. - why the choices they made were not what you wanted, the alternatives, and why $x was chosen. That way, if people disagree with your design, you can refer to the research you did.
I'd be interested in seeing a language that can take full advantage of the architecture in Out of the Tarpit.
It's always touted as a must read paper but I haven't seen many inroads towards something truly Functional-Relational.
I don't think a new query language is the solution to the problem. SQL isn't that bad. Sure, for sanity, every language has a reimplementation of SQL syntax using whatever abstractions are available. But even if you don't go SQL, you've got Datalog.
A language being 40 to 50 years old doesn't make it a problem. I'm speaking English - that's been through 2000 years of iteration. I don't think that makes it a good candidate for a ground-up rethink when so much thought by great thinkers has already gone into it.
You can check out Bistro [1], which is an alternative to SQL-like languages and to set-oriented approaches in general. It focuses on column operations (formally, functions) as opposed to having only set operations. That is precisely why it works well for time series and why it has been used for stream processing.
After 50 years it remains non-intuitive and confusing and extracts a tax far greater than the value it adds. Worse, the problem it claims to solve - "I have all my data in one database, now let me transact over it" - isn't viable and was never actually viable at scale.
Unfortunately this proposal doesn't understand the real problems with SQL (lack of types, lack of distribution, and no separation between the data write model and the data read model). It actually doesn't seem to introduce anything really new that can't already be done with CEP engines.
> isn't viable and was never actually viable at scale.
And yet the entire global economy somehow works. Maybe viable and scale don't mean what you think they mean... I mean, I run relational databases of tens of TB and know people doing hundreds. And people with mere GB tell me they're doing amazing things with scalability... lol
Do you have any recommendations on where to read about the problems of SQL? When you put it this way, clearly my assertion that "SQL isn't bad" needs a review.
I ask for recommendations because I don't know what I don't know. I've read a few criticisms of SQL where some conclude "just use mongodb". A good recommendation from an expert helps avoid faulty understanding.
I've been writing Datalog for the last week or so. It took me a bit to adapt to the syntax (also, [1] helped), but now I find myself enjoying the combination of terseness and expressiveness.
It's probably a matter of familiarity, but looking at Flux, it seems both noisier than Datalog and less readable than SQL.
Given that readability is a goal for Flux, I guess it's a matter of subjectivity: readable for whom? What background do you have in order for Flux to look readable?
On some level, readability is a subjective thing. So are expressiveness and the general feel of a language. It's about aesthetics, and reasonable people can disagree about language and API design choices. We're making our choices and we hope that a good number of people come to agree with them. However, part of the engine design is to decouple the language from the actual execution. The engine takes a DAG represented as a JSON object. We'll have parsers that create that DAG from Flux or from other languages like PromQL or anything that people might think of.
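To illustrate the decoupling idea with a simplified, hypothetical shape (not the exact spec format; the field names here are invented for illustration), a query like from(db:"foo") |> window(every:20s) |> sum() could be represented roughly as:
    // Hypothetical DAG as a TypeScript object literal, purely illustrative.
    const querySpec = {
      nodes: [
        { id: "from0",   kind: "from",   spec: { db: "foo" } },
        { id: "window0", kind: "window", spec: { every: "20s" } },
        { id: "sum0",    kind: "sum",    spec: {} },
      ],
      edges: [
        { parent: "from0",   child: "window0" },
        { parent: "window0", child: "sum0" },
      ],
    };
A parser for Flux (or PromQL, or anything else) just has to emit a DAG like this; the engine doesn't care what surface syntax produced it.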
"I don’t want to live in a world where the best language humans could think of for working with data was invented in the 70’s. I refuse to let that be my reality."
---- replay the same sentence with s/SQL/English/ ----
I don’t want to live in a world where the best language humans could think of for "communicating" was invented around 550–1066 CE. I refuse to let that be my reality.
------
P.S.: Any technology is built over time in sedimentary layers... every layer has played a key role in where we are today... I'd not discount any of them.
All the words in the English language (not to mention place names, words in other languages/transliterations, possible acronyms, etc.), and they chose one that already has a Facebook-promoted architectural pattern using it.
Not sure if arrogance or ignorance; perhaps just apathy?
Chaining calls in JS means that a function returns the original object; piping actually passes the result along as an argument, so it's semantically different.
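A minimal TypeScript sketch of that difference (the Counter class and pipe helper are hypothetical, purely for illustration):
    // Chaining: each method mutates and returns the *same* object (`this`).
    class Counter {
      constructor(public value = 0) {}
      add(n: number): this { this.value += n; return this; }
      double(): this { this.value *= 2; return this; }
    }
    new Counter(1).add(2).double();            // same Counter instance, value = 6

    // Piping: each step receives the previous step's *result* as its argument.
    const pipe = (x: any, ...fns: Array<(v: any) => any>) =>
      fns.reduce((acc, fn) => fn(acc), x);
    pipe(1, (n: number) => n + 2, (n: number) => n * 2);  // 6, no shared object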
Honestly, for the use case, the same thing that's wrong with a (mandatory) visible function application operator in Haskell.
> Chaining calls in JS means that a function returns the original object; piping actually passes the result along as an argument, so it's semantically different.
The syntax Flux uses for creating the result is the same as JS would use for mutating an inbound object, so it actually would be consistent if pushing the result used the same syntax as passing the (mutated) original object would in JS.
Though I'd prefer whitespace for piping, just like Haskell does for application. If you're going to specialize a language for a domain, don't be timid about it.
I've gone down a similar line of thinking when writing a lot of SQL over timeseries or otherwise ordered datasets recently. Going in a more functional and composable direction, while keeping it limited so that hopefully the query execution engine can still make good optimizations, seems like the right idea.
Making it a separate and open language outside of Influx is also a great approach - I'd love to see other databases try adopting this. I'll definitely be keeping an eye on this project.
There seems to be a lot of misunderstanding of SQL in the SQL criticism, and the design of the new language seems to have a lot of excess noise.
1. If arity-1 functions are as dominant in use as the examples suggest, mandatory named args are excessive noise.
2. From the examples, the |> operator also looks like needless noise; code would be cleaner and more readable if this operator were whitespace with no other character (like function application in Haskell, but newlines should also be acceptable), with different punctuation for the cases where that isn't intended.
Based on all the reflexively negative comments, I'll assume that most of the participants in this thread write SQL for a living. Notwithstanding certain ridiculous assertions on the part of the author attempting to correlate the value of a technology with the year in which it was created, this seems like a pretty compelling idea. We use InfluxDB at my company to manage time series data, and its specialization for that use case has been a big benefit. I don't see any reason to be dismissive by default of a language designed for interacting with data having these specific characteristics in a manner explicitly suited to it.
I don't write sql for a living, and the little I do in my free time is pretty miserable.
I still think this is a really dumb idea.
As hard as it can be to do stuff with SQL, at least you can use Google to get help. An obscure, single-database-specific language with a completely un-googleable name is going to be a complete clusterfuck to do anything with.
The name alone will make trying to get help with the language a disaster. The fact that it will have such limited market penetration (it only works on one specific time series database) doesn't make the unsearchable name any better.
------
If the author would come out and just admit they want to spend time intellectually masturbating over query language design (I think about inventing a "better" language in my free time too!), I'd have a lot more respect for the project.
Isn't that the problem, though? For even a mildly interesting problem you have to google and google for a correct and usually non-obvious solution. And when you find it, it usually works in only one dialect and not the others (e.g. MS SQL vs MySQL). It's elitism at its finest; people probably get good money writing obscure queries, too - no wonder they are so defensive. I think your comment shows short-sightedness.
So you're arguing that the solution to the annoyance of the platform-specific nature of SQL is to create a platform-specific language?
Or do you think that this language won't heavily depend on the internal implementation of InfluxDB? If you believe that, I want to know what you're smoking. It's gotta be some good stuff.
I like the approach. I find it annoying that in SQL I need more than select privileges to write my own function or view (aliases and WITH are all you can use to structure a big query). Also, macros that work on all tables that have certain columns are at best difficult to write in SQL, so there is room for improvement.
OTOH in Flux you write something that looks a lot like the output of a planner, so if things change in your DB you might have to modify your scripts instead of just adding an index.
XQuery's power and limitations both come from its data model being XPath. SQL and the relational model have proven very capable of being optimized as well as extended. The schema that relational systems impose has at times been thought too restrictive, but it makes it possible to reason about the query and the data, and that makes them powerful; it also helps the user understand the shape of their data. With systems like Google's BigQuery showing that you can have schemaless, non-1NF, powerful and scalable systems still queried with SQL, there needs to be a powerful and innovative system to justify moving on from SQL. Having lambda-like syntax for WHERE-clause filters just looks like LINQ to SQL.
In ML? What other ML and data science software? I'm curious because I couldn't find any. Of course a generic name is used in many ways, but we're talking about it in context.
Probably! A generic fictional software product name is something like "Acme Flux," which is analogous to John Smith for people. Given its common use as a placeholder, there must be a few; here is one from 1984: https://www.sciencedirect.com/science/article/pii/0743731584....
It is difficult to search for, though, since it is such a widely used name - probably one of the most widely used ones.
What you link to isn't ML software. If it's so easy to find, why not link to one? http://fluxml.ai/ comes to mind without having to search, of course, but I am still not seeing any others.
Why We’re Building Flux, a New Data Scripting and Query Language? Who knows why?
Before pedantically saying "I don’t want to live in a world where the best language humans could think of for working with data was invented in the 70’s", show us a breakthrough that makes us think your article deserves to be read all the way through.
It seems the argument behind many "X reinvented" posts is the age of X, not its flaws.
A more cynical observer might also guess people are shooting for a place in history. If you successfully launch the better mouse trap, fame and riches await. See also: intense churn in JS land.
Ironically, the fundamentals of why computers even work were established 70 years before the 70s and haven't really changed a bit. I think his quote is definitely tongue-in-cheek.
Uhm, let me think: if the world (or even a small niche, really) switched to a tool I authored and became fundamentally dependent on it, I'd basically have secured myself a significant and perpetual stream of income. Not bad, eh! /s
I skimmed through the crap and was thoroughly misled by the example. So I think he would still have ranted. They chose a very specific example that made their product look good.
Besides, the article starts by stating that they rebranded their product from InfluxQL to Flux precisely because power users found major features lacking compared with SQL.
From TFA: ". This is kind of like the worst part of Lisp (nested function calls), but even less readable. Not only was the Flux example more terse, it was more readable and understandable."
It's kinda funny, because his forward pipe operator (suspiciously similar to Elixir's) is the same as a threading macro, which you have in Lisp (or can trivially write if you don't).
"I don’t want to live in a world where the best language humans could think of for working with data was invented in the 70’s"
I don't want to live in a world where the best language humans could think of for communicating with other humans was invented in the ${CENTURY_WHEN_ENGLISH_WAS_INVENTED}s.
I really loved the tenets given, i.e.:
Useable
Readable
Composable
Testable
Contributable
Shareable
But beyond that, the article really failed to show how the new language is going to do the above in a way no other language has.
Also, I agree with the rest on the point about SQL: it doesn't matter that SQL was invented in the 70s; so were many of the programming languages and paradigms we use today.
It does indeed look a lot like Graphite, and since you explicitly mention in your talk that your objective is to reimplement all the functions present in Graphite, why not instead present your work as a port of the Graphite language, with some extensions to work on other data sources and sinks (and dots replaced by the fat pipe)?
This is interesting to me, as I'm currently working on something close: a lightweight stream processor to let system engineers manipulate large streams of data while in flight to a database. And I've been wondering (and still am) about the trade-offs between simple and expressive. Very early on, I decided not to be TS-specific at all (since we were prevented from using an off-the-shelf product precisely because our data does not look enough like a TS -- neither a single time nor a single value field). Eventually, after a few detours, we ended up favoring a SQL-like language because it is field-agnostic.
Regarding the language itself, the main differences I can see are that you query over a time range while we process infinite streams, with the consequence that we must explicitly tell each operation when it has to output values (windowing); the other is that you have an implicit key and one TS per "group" with the same key, which makes piping many operations easier (but JOINing harder), while we have to be more specific about how to group.
So for instance, where you have:
    from(db:"foo") |> window(every:20s) |> sum()
we would have the more SQL-alike:
    select sum value from foo group by time // 20
("//" being integer division).
Or, if you needed the additional start and stop columns added by window():
    select sum value, (time // 20)*20 AS start, start+20 AS stop group by start
But then, because fluxlang processes a range of time while we stream "forever", we would also have to say when to output a tuple, for instance after 20s have passed:
    select sum value, (time // 20)*20 AS start, start+20 AS stop group by start commit after in.time > group.stop
which gets verbose quickly.
But to us this constraint imposed by streaming (as opposed to querying a DB for the data to process) is essential, since our main use case is alerting from a single box, so querying the last 10 minutes of data every minute for thousands of defined alerts would just not work.
Another interesting difference is the type system. One thing I both like and hate in SQL is NULL. It's convenient for missing data, but it's also the SQL equivalent of the null pointer. So we have a type system that keeps a close eye on it: we support the special case of an algebraic data type where a "type?" is a NULLable "type", and NULLs must be dealt with before they reach a function that does not accept NULLs. For instance, there is no way to compile a filter whose condition can be NULL; one would have to COALESCE it first. What are your thoughts on missing data? Do you manage to avoid the issue entirely, including after a JOIN operation?
The other difference I noticed is how nice your query editor is. For now our query editor is $EDITOR, but my plan is to build a data source plugin for Grafana. What do you think of this approach?
The ultimate fantasy of every programmer is to a) invent a new language and b) force other people to use it. It’s OK, we all get it, it’s fine. But let’s be honest about our motivations...
A previous employer had RQL, "relational query language". It was between 10,000 and a million times slower than SQL depending on what you were doing (under the covers it was just generating really bad SQL). But the engineer who invented it was sufficiently well connected to get it declared the corporate standard, so...
Years before I ever read Thinking, Fast and Slow, I had caught on to the fact that humans are predisposed to making exceedingly poor decisions. This is such a fun example.
I almost want to start a side-project based on collecting these...
If the OP is talking about the same RQL as I am, it was an in-house solution that basically tried to be a GraphQL, Kafka and Spark solution all in one, built at the same time as the app meant to consume it. A horrible experience.
Also, this? This is the illustrative example you chose for your amazing new query language?
So you're saying you're combining the expressiveness of SQL with the readability of, what, perl?