
It doesn't matter but I fully disagree with this. A transpiler emits code the user is supposed to understand, a compiler does not. At least that's the general way I've seen the term used, and it seems quite consistent.


There is a phenomenon I have observed many times where you can get a bunch of people in a room and make some statement, in this case, "Compilers are different than transpilers", and everyone around the table will nod sagely. Yup. We all agree with this statement.

But if you dig in, it will turn out that every single one of them has a different interpretation, quite often fatally so to whatever the task at hand is.

I mention this because my impression has been that the distinction between "transpiler" and "compiler" is that the latter compiles into some machine code and the former does not. I think if we could get people to sit down and very clearly define the difference we'd discover it is not as universal a definition as we think.

My personal favorite is when I say a particular term is not well defined on the internet, and I get multiple commenters to jump up and tell me off about how wrong I am and how well-defined the term is and how universal the understanding is, while each of them gives a completely different definition. As I write this it hasn't happened in this thread yet, but stay tuned.

Anyhow, the simple solution is, there isn't a useful distinction between them. There's no sharp line anyhow. Plenty of "transpilers" produce things like Python that looks like

    def f000000001_bch(a0023, a0024, bf___102893):
        __a1 = f000000248_BCh1(a0024, const_00012)
        if __c_112__0:
            f0000000923(__a1)
        else:
            f0000000082(__a1)
and it's really quite silly to look at what can be a very large process and make a distinction only in how the very last phase is run, and on a relatively superficial bit of that last phase too.


2 hours later, I think it's safe to say there are multiple definitions in play that are, if not outright contradictory, certainly not identical.

It seems the term is not terribly useful even on its own terms... it is not as well defined as everyone thinks.

Ultimately, "compiler" isn't a bright shining line either... I can take anything and shade it down to the point where you might not be sure ("is that a 'compiler' or an 'interpreter'?"), but the "transpiler" term is trying to draw a line where there isn't even a seam in the landscape.


> the "transpiler" term is trying to draw a line where there isn't even a seam in the landscape.

I don't think you have proven that it is a seamless landscape. In fact, I think that people's definitions have been remarkably consistent in spite of their fuzziness. The heart of what I have read is that most people understand a transpiler to be an intermediate text to text translation whose output is input to another tool. The common colloquial definition of a compiler is a text to machine code (for some definition of machine code) translation whose output is an executable program on a host platform. You can make an argument that every compiler is a transpiler or every transpiler is a compiler, but I think it requires a level of willful obtuseness or excessive pedantry to deny that there is something behind the concept of a transpiler. This discussion wouldn't even be happening if transpiler were a completely meaningless term.


> , but I think it requires a level of willful obtuseness or excessive pedantry to deny that there is something behind the concept of a transpiler.

Transpiler means something. Fuzzily. And it defines and denotes nothing of practical utility.

> This discussion wouldn't even be happening if transpiler were a completely meaningless term

This discussion wouldn't even be happening if (people like) JS programmers didn't insist on using terminology that implies some archaic view of technology, like "compilers emit machine code". The distinction between high- and low-level target languages isn't interesting anymore, even if it might have been novel to normie programmers in the '90s or something.


I've observed this sort of behavior frequently; is there a name for this phenomenon yet?

Something like "Assuming all concepts are universal to one's own peculiar definition"

Maybe "semantic egocentrism" could fit the bill?


And in the other corner you have Chomsky with universal grammar... and in another you have Platonic Forms...

I love the "draw me a tree" idea of a Platonic form: we all have an idealized model of what that is that is uniquely our own. With that in mind, isn't everything subject to some sort of semantics?


> A transpiler emits code the user is supposed to understand, a compiler does not.

No, a transpiler emits code that another system is meant to understand (often another compiler or interpreter). Whether a human can understand it or not is immaterial to the objective of transpiling.


But then compilation is the same thing as transpilation as noted.


Yes.


Does that imply that a compiler emits code that nothing can understand? Or are you saying that 'transpile' is just another word for 'compile'?


> Does that imply that a compiler emits code that nothing can understand?

Bizarre take. No, compilers in the classical sense target byte code and machine code which is meant to be interpreted by a byte code interpreter or a hardware machine.

> Or are you saying that 'transpile' is no more than another word for 'compile'?

Yes. Compilers translate from one language to another. Transpilers translate from one language to another. Both have the objective of preserving the behavior of the program across translation. Neither has the objective of making something intended for humans as a general rule.

That transpiled code (if we draw a distinction) targets languages meant for humans to read/write means that many transpiled programs can be read by people, but it's not the objective.


> Bizarre take.

Bizarre in what way? If compilers are somehow different, then they mustn't target systems, as that's what your previous comment says transpilers do. Which leaves even your own classical definition to be contradictory, if they are somehow different. What does that leave?

> Yes.

But it seems you do not consider them different, which was the divergent path in the previous comment. But evaluating both the "if" and the "else" statement is rather illogical. The evaluation of both branches is what is truly bizarre here.


I see what you mean, but how is it academically useful to identify transpilers as something of their own? It's still compiling (lowering) from one notation to another.


All compiler outputs are understandable. I suppose you mean with the intent of it being a one-time translation? As in, like when the Go project converted the original C codebase into Go source with the intent of having developers work in the Go output afterwards and the C code to be never touched again?

What is meaningful about such a distinction?


Yea, not sure i disagree with anything being said here. Though to me, transpiler just typically means it goes from one language i could write, to another i could write. I don't necessarily expect to enjoy reading or perhaps even understanding the JavaScript output from any lang that builds to JS, for example.


> transpiler just typically means it goes from one language i could write, to another i could write.

What possible compiler target couldn't you write? Compilers are not exactly magic.


Fair, by "could write" i meant one intended for humans to write. Ie i would not say LLVM bytecode is intended for humans to write by hand. Can they? Sure.

The difference (to the parent comment) in my eyes is that the target language is the thing intended for humans, not the target output itself. As another commenter points out, transpiled code is often not intended for humans, even if the language is.


Machine code is intended to be written by humans. That was the only way to program computers at one point in time. Given a machine code target, would you say it is product of transpilation or compilation?


I would stand by my original statement, as i don't consider that "intended" or common by modern day standards. Humans hand wrote binary for a while too hah.

If it's not clear, these are just my opinions. Not an attempt at an objective fact or anything.


> Humans hand wrote binary for a while too hah.

Like, as in flipping toggle switches? Isn't that just an input device? They were still producing machine code from that switching.


> A transpiler emits code the user is supposed to understand, a compiler does not.

How come Godbolt is so popular? Inspecting compiler output?

Is GCC now officially a transpiler?


Actually, I can understand assembly.


Is Babel a transpiler?


I noticed this in myself last time I was as a TA. I'd go back and re-grade the first 15 assignments or so to make sure the rules were being applied consistently.


I would also be interested in reading some of these examples!


Yes... Rust!

I don't know if you can toggle it from the front-end, but the Polonius borrow checker includes a "location insensitive" mode which is faster but accepts fewer programs.


I think the GP had something else in mind. You're mentioning a mode where the rules are more stringent, allowing for a faster check and consequently faster compile times. The GP is picturing a mode where the checks are skipped altogether and only the necessary transformations occur, with the assumption that everything is already correct. I'm wary of having something like that outside of nightly, unless cargo publish had more checks than it does today.


The difference is, a junior employee knows that killing prod is bad. An LLM doesn't know anything.


Don't be so sure that all, or even most, junior employees know any such thing. I've seen junior employees fired for doing silly things in prod before [1].

[1] Of course whatever more senior bozo granted the junior the rights to blow up the thing(s) they did should have been fired instead. That's not the way things work in the corporate world.


I like getting juniors into situations where they can blow up a db since it's the perfect introduction to backups.


And we only do it once (I didn't kill the db, but I did kick off a process thinking I was in a test environment).


Were you the guy sending test push notifications from Firebase to all users of Xperia phones last year? :D


Them knowing that it is bad isn't much of a consolation for the dead production.

The magic that happens in someone's mind that leads to their actions matters very little for everyone else. Their actions and the consequences of their actions are what everyone else actually cares about.


I disagree that this is an important problem. Even in the unlikely case that some NP-hard problem is in P, it may be completely infeasible to compute on modern hardware. I'd wager that would almost certainly be the case if any solution exists at all.

I am not aware of any practical use of P=NP outside of pure complexity theory, and the P!=NP case is already the ubiquitous assumption anyway. P=NP would be a notable result, but probably not important.


When people call something the most important problem in a theoretical discipline, it's usually not due to practical applications.

That said, there are definitely potential practical implications for this. Even if it means we can know NP problems do not have efficient solutions - that is something with practical implications.


Even a P=NP result doesn't tell us that NP problems have efficient solutions. That depends entirely on

- if the solution is constructive,

- if the asymptotic solution has good constants (lower bound on input size, highest degree term), and

- having no other mitigating factors (requiring absurd amounts of space, for example)

The idea that P = efficient is a misconception. For simple algorithms, asymptotic analysis is a fine enough coarse-grained view of program performance. However, for extreme cases (as may well be the case if P=NP), asymptotic analysis tells you almost nothing at all.


> Even a P=NP result doesn't tell us that NP problems have efficient solutions.

Yes it does. That is literally exactly what it means. The class P is the class of problems which are considered theoretically "tractable"/"efficiently solvable"/"feasibly solvable" (Cobham-Edmonds thesis). Hence, if NP=P, then that same definition extends to all problems in NP.


There are very obviously algorithms in P which are not "efficient". For example, an algorithm in O(n^10^10^10) is not efficient in any reasonable sense. It is in fact much much much less efficient than O(e^n) for any n low enough that the whole thing would finish in under a year or so.

In practical terms, the class of efficient algorithms is probably O(n^3) at best, and even then assuming no huge constant factors.
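To put rough numbers on the comparison above (a back-of-the-envelope sketch in Python, using n^100 as a far tamer stand-in for the huge exponent, and comparing logarithms since the raw values overflow):

    import math

    # log2 of the step counts, since the raw numbers won't fit in a float
    for n in (10, 100, 1000, 2000):
        poly = 100 * math.log2(n)   # log2 of n^100
        expo = n                    # log2 of 2^n
        print(f"n={n}: n^100 ~ 2^{poly:.0f}, 2^n = 2^{expo}")

The polynomial only pulls ahead around n ~ 1000, and at that point both counts are on the order of 2^1000 steps, far beyond anything any machine will ever execute.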


I am not quite the mathematician to find out but I would love to know the relationship between the constant factor and the exponent in matrix multiplication research papers.


P vs NP is not an "in practical terms" question. It is a theoretical question with theoretical definitions of theoretical terms, including "efficient", which directly corresponds to the class P by definition.


Ok. When I say efficient, I mean "produces efficient code on near-term hardware". I understand that complexity theorists have a different definition of "efficient"-- they also have a different definition of "important" too.


The question being asked was "what would proving P=NP mean for us in practical terms". The fact that mathematicians call all polynomial-time algorithms efficient is irrelevant to this question.


> The question being asked

Where was that question asked?


It wasn't exactly a question, but the thread started by discussing practical implications:

> That said, there are definitely potential practical implications for this. Even if it means we can know NP problems do not have efficient solutions [emphasis mine]

So, this was about efficiency in the practical sense, not some largely useless definition of efficiency by which galactic algorithms are "efficient".


That's not the Cobham-Edmonds thesis. The assertion is that problems are feasible _only if_ they are in P, not _if and only if_ they are in P.


Sorry, but no. Edmonds clearly maps "efficiently-solvable problems" onto the notion of P in a complete way, not only as a subset of P.


I chased some links from Wikipedia and you're right that Edmonds uses "efficiently solvable" to mean P. However, he does not take "efficiently solvable" to mean "feasible" for basically the same reasons as I've said in this thread. From section 2 of Edmonds' "Paths, Trees, and Flowers" (1965):

> An explanation is due on the use of the words "efficient algorithm." First, what I present is a conceptual description of an algorithm and not a particular formalized algorithm or "code." For practical purposes computational details are vital. However, my purpose is only to show as attractively as I can that there is an efficient algorithm. According to the dictionary, "efficient" means "adequate in operation or performance." This is roughly the meaning I want—in the sense that it is conceivable for maximum matching to have no efficient algorithm. Perhaps a better word is "good."

> It is by no means obvious whether or not there exists an algorithm whose difficulty increases only algebraically with the size of the graph. The mathematical significance of this paper rests largely on the assumption that the two preceding sentences have mathematical meaning.

> ...

> When the measure of problem-size is reasonable and when the sizes assume values arbitrarily large, an asymptotic estimate of FA(N) (let us call it the order of difficulty of algorithm A) is theoretically important. It cannot be rigged by making the algorithm artificially difficult for smaller sizes. It is one criterion showing how good the algorithm is—not merely in comparison with other given algorithms for the same class of problems, but also on the whole how good in comparison with itself. There are, of course, other equally valuable criteria. And in practice this one is rough, one reason being that the size of a problem which would ever be considered is bounded.

You should read the rest of section 2; it's short and very clear. Calling P the class of "efficiently solvable" problems is a completely reasonable framing for this paper, considering that this was written at a time when we had fewer tools to formally compare algorithms. Edmonds correctly does not claim that all P algorithms are feasible, and my opinion that P=NP is not important is based on the 60 years of feasible non-P computations we've had since.


I think even a negative solution could have practical implications. For starters, you can now know for sure the problem can't be solved efficiently, which means you can stop looking and focus on other things. Possibly you can use it in situations where you need a hard problem, like in crypto (although there is more to it than that, since just because a problem is NP-hard doesn't mean that a specific instance is hard).

I am not a complexity theorist, but I've always doubted the n^1000 solution to P=NP being likely. Think about it - that essentially means you have to loop through the data 1000 times, but no more than that, no matter how much data. It just doesn't seem natural that looping 999 wouldn't be enough, but 1000 would be precisely enough. It's a sort of middle ground that seems like it would be extremely odd to me. Much more so than P=NP with a low exponent or P!=NP.


Apologies for nitpicking, but n^m doesn't mean m loops (i.e. n^1000 doesn't mean 1000 passes): that would be mn (1000n in your example). I think your intuition argument kind of breaks down there.
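To make the difference concrete, a tiny sketch with small stand-in numbers (10 for the data size, 3 for the exponent):

    n, m = 10, 3        # data size and exponent

    passes = m * n      # m sequential passes over the data: 30 steps
    nested = n ** m     # m nested loops over the data: 1000 steps
    print(passes, nested)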


I meant to say 1000 nested loops. You're right to call me out on it though, as that is a huge difference in meaning.

I still think it would be bizarre to have to nest a loop n layers deep, where n-1 is not enough, but n layers is sufficient for really large n. Like, what extra info would you get on the 1000th nested loop that you didn't on the first 999?

Of course there is nothing formal about this, it just feels like it would be wrong to me (and hence personally I would consider it the most interesting result). Of course gut feelings don't really count for much and I have nothing more than that.

I suppose my intuition is that increasing the exponent gives diminishing returns to how much more power you really get, so it doesn't make sense for problems to be in the n^1000 range. My gut feeling is they should either be easier or harder. I certainly can't think of very many non-exponential algorithms in the > n^50 range.


Not 1000 iterations of 1 loop, but 1000 loops nested inside each other.


> - having no other mitigating factors (requiring absurd amounts of space, for example)

I think you can't require absurd amounts of space, because you only have polynomially many steps with which to access that space, so the space used is polynomially bounded anyway. (Memory you can't access can't help you.)


Agreed. We have problems that we have polynomial algorithms for but we don't use them because their exponential counterparts are orders of magnitude faster on average (the simplex method for linear programming is the classic example).


This sounds absolutely wrong to me. (Though note - I'm not a complexity theorist.)

For one thing, the proof of P!=NP (which is what most people assume) will almost certainly provide us with a lot of new math and insight. So yes, there won't necessarily be something new practically just from the result P!=NP, because this is what everyone assumes, but it's likely to yield lots of new math.

As for the case where P=NP - that's almost certainly a gamechanger, especially if this is proved by way of a counterexample, e.g. an NP-complete problem that is actually computable in P. While this may not be immediately computable on current hardware, at that point we'll almost certainly focus a lot of time and effort into making it more feasible.

What CS problem do you think is more important?


I am a theorist (though not in complexity) so I agree with you that new math is never bad. I also agree that a very lucky P=NP result _may_ be implementable _at some point_ in the future, though to be honest I wouldn't put my money on that happening.

As for more important problems-- I think it's more important to improve the specializations of NP problems, heuristically, for problem instances which arise in practice. A concrete example in my line of work could be improving the performance of SMT solvers on encodings of real programs. There's a lot of exciting work happening in this area and it's opening doors in program verification previously thought to be unrealistic. IIUC the formal methods team at AWS is putting a lot of work into memoized, distributed SMT solving, and is making meaningful gains over the current state of the art.

I don't really care if we can solve the most general NP-hard problems in O(bad*N^bad) only for N>bad. A) Even getting a result that weak seems too challenging to prove and is probably not true, and B) trying to solve the most general problem is complete overkill for any problems that come up outside a complexity textbook.


OK, fair enough. My intuition is that we will "get lucky" in developing new math for proving P!=NP, but that's just an intuition and I'm not a theorist, so you probably have a better sense for this than I do.


While this may not be a practically important problem, I can't readily come up with a more important problem in computer science off the top of my head.


How's this for interesting: many people in my field (formal methods) seem to be pretty excited about our job prospects. Before we used to just say that people don't actually know what their code does, but now it looks like it might really be true.


> How's this for interesting: many people in my field (formal methods) seem to be pretty excited about our job prospects.

Wow, I hadn't thought of it from the generative AI standpoint, only from the standpoint of "cybersecurity is increasingly important, so we need to use proof assistants and verified toolchains more".

What kind of new career prospects in formal methods do you predict will come as a result of generative AI? For example, certifying AI-generated code for use in automobiles?


Full verification is one, but that is still challenging at scale. Formal methods has many weaker methods as well which are easier to apply in general (automated proof, abstract interpretation, static or dynamic model checking, hell some would even argue the judicious use of dynamic checks is a kind of formal methods).

I think that a shift from "this is my program, it's a sequence of steps that does what it does" to "I asked an LLM to generate me a program that does this vague thing I want" makes you naturally ask questions like "wait, is it actually doing the thing I want" and "what do I even want it to do". Formal methods, broadly, has a lot of good answers to those questions.


It’s true. We don’t. But does it matter? Is there demand to change this?


A problem with randomly searching for theorems is that it blows up exponentially, and many of the theorems you would find are probably not useful in their own right. Another problem is that checking them for truth is undecidable. Those are the issues I think of when you mention "trade CPU cycles for proofs", though if you're just trying to prove random theorems the results will be limited.


"many of the theorems you would find are probably not useful in their own right"

To give an example, there's an infinite array of "If X is greater than 20, X is greater than 10" theorems. Obviously useless to prove each of them, but enumeration will encounter all of these, as well as a huge number of classes of similar theorems, in the process of getting to anything interesting.

You can't just "skip" them either, because they get arbitrarily complicated. I chose a really simple one to drive home the point, and because it's the sort of theorem you'll hit early, but tautologies can be arbitrarily complicated. Trying to create something that identifies them hits halting problems of its own. Rice's Theorem, which can be colloquially summarized as "any interesting property is not computable", will stop you, among other things.

("If X is a prime > 521, X does not have 21 as a factor." "If X is irrational, its fraction expansion does not have a denominator between 15 and 54." "If the absolute value of X is not equal to X, X < 932." "If the fractional component of X is greater than .5, the first digit of the fraction expansion is not 2 or 4." "The sum of two positive numbers is greater than the smallest of the two numbers number minus 273,883,192,823." "Most" theorems are really, really useless.)


What if you narrowed down one side, like "Find a formula for solving arbitrary polynomial equations"? (I know this has been proven to not exist using human brainpower). but the idea would be: "I'm looking for a func that produces zeros to an input equation, now generate billions of tokens and evaluate them and keep the ones that minimize the search criterion". Couldn't that eventually find a useful result (for actually-solvable search spaces)?

I can see how it would fall apart quickly for something like the Riemann Hypothesis, because you're searching not for an equation (well, if you found a counterexample, sure) but for a nebulous classification problem: all solutions to this equation (this series of tokens from this language that return zero) must lie on (I can't recall if it was literally on, or "around") the line Re(s) = 1/2. Once you try to reason with infinite sets (domain/codomain) the logic of a CPU can't help you nearly as much.


Simply generating all possible tokens is already an exponential problem. There really isn't any way around that. Analysis of them for any interesting property becomes super-exponential pretty easily.

If you are going to do something more clever than "generate all tokens", then you're no longer talking about the same thing, and the analysis depends on the nature of your "more clever". Although starting with an exponential (or super exponential) process and then trying to "cut it down" to something reasonable is almost always a fool's errand anyhow. Starting yourself out in the position where you are not only behind the eight ball but have essentially already been murdered by it, and then cleverly working yourself back into a position where you've somehow won, is rarely, if ever, a good approach to solving a problem.
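Just to put a number on the blow-up (a toy count, assuming an arbitrary 50-symbol vocabulary; real grammars are larger):

    # number of candidate token strings of a given length
    for length in (5, 10, 20, 40):
        print(length, f"{50 ** length:.1e}")

By length 20 you are already past 10^33 candidates, before analyzing a single one of them for anything interesting.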


> "If X is greater than 10, X is greater than 20" theorems

I think you meant in the opposite order, i.e.: "If X is greater than 20, then X is greater than 10" :)

But otherwise your point is valid.


Whoops, yes, thanks, fixed. One of those things I edited one too many times and didn't catch in the end.


> > "If X is greater than 10, X is greater than 20" theorems

I think I have seen theorems of that sort. (Can't think of a specific example right now.)


A theorem like that isn't provable (unless you're using some kind of nonstandard arithmetic).

You could only prove it if you add something else, e.g. a precondition saying that X = 30.


Right, if it can be asserted based on some independent assumptions that a positive X must be at least 30, then from those assumptions and X > 10 it would follow that X > 20.


It also just follows from X being >= 30.


Randomly searching for theorems is like using monkeys with typewriters to search for a lost work of Shakespeare. Sure, you _might_ find it, but you'd never know amongst the noise. Well, it wouldn't be like monkeys with typewriters, more like monkeys with word processors and grammar check because the results will type check... but there is no way to screen out what is interesting.


I agree with monkeys+typewriters, but look at how compact the quadratic formula is. It's not a work of Shakespeare; it's maybe 12 to 20 "tokens", using a loose definition of a token. To me, it's totally within the realm of a 1-10 billion search space, isn't it?

I'm just thinking aloud, but suppose a random math-syntax shuffler eventually produced the quadratic formula: how would you even check it? You would have to input many permutations of quadratic equation inputs and check them against the randomly-produced one. You'd quickly hit imaginary numbers, which could (falsely) signal that the QF is invalid.

Compare that to monkey-typing fizzbuzz. That seemingly could be done in a few billion iterations, no? (assume you could exclude monkeys from typing syntax errors, and only produce valid C++/java whatever).

IDK, I'm just mesmerized by the fact that the QF is so compact, yet there isn't an easily-discoverable way to derive it by brute force. (and I'm deliberately excluding AI because of how tiring the questions of the form "why can't AI do _____" are. I don't care that much about AI).
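For what it's worth, a toy sketch of the numeric check described above (Python, with made-up coefficient ranges and tolerances; nothing like a real verifier): plug the candidate's output back into ax^2 + bx + c and see if it comes out near zero, using complex arithmetic so imaginary roots don't falsely flag the formula as wrong:

    import cmath
    import random

    def candidate(a, b, c):
        # pretend this expression came out of the random token search
        d = cmath.sqrt(b*b - 4*a*c)
        return (-b + d) / (2*a), (-b - d) / (2*a)

    ok = True
    for _ in range(10_000):
        a, b, c = (random.uniform(-10, 10) for _ in range(3))
        if abs(a) < 0.1:
            continue  # skip near-degenerate (barely quadratic) cases
        for x in candidate(a, b, c):
            ok &= abs(a*x*x + b*x + c) < 1e-6
    print("passes numeric check:", ok)

The cmath trick sidesteps the imaginary-number false negatives, though it says nothing about noticing that the formula is interesting in the first place.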


> To me, it's totally within the realm of a 1-10 billion search space, isn't it?

A proof of the theorem is going to be at least a few hundred bits, which is totally out of reach of brute force search.

A 10 billion search space only covers 34 bits (less than 5 ASCII symbols) in which you'll struggle to even state the theorem.


> It's not a work of shakespeare. it's maybe 12 to 20 "tokens" , using a loose definition of a token. To me, it's totally within the realm of a 1-10 billion search space, isn't it?

Even 12 tokens with only 10 different token values gives you 10¹² different candidate ‘proofs’. That already is a thousand times 10 billion.

In reality, there easily are 100 possible token values (lower and upper case alphabet, punctuation, Greek lower and upper case, dozens of symbols, a bit of Hebrew, etc), taking that number to 10²⁴.

Also, I think even just expressing the two solutions to ax²+bx+c=0 without any hint of a proof in twelve tokens is an insurmountable challenge.


The Pythagorean theorem is even more compact, and even more impactful. But the low-hanging fruit has already been picked for centuries, and how would you detect a meaningful formula if you were given a 10GB text file of various ones?


You're mixing up functional programming with an execution model for functional languages. These are not the same.


Research around functional languages involves their execution models. Even ignoring execution models, immutability is a staple of functional programming, which is not good for performance.


It's not absurd because nobody is forcing that person to rent. Housing is a public resource, and contributing to that public resource shouldn't come without asterisks.

Cities need the non-owning class to function, and cities function better when the non-owning class has some degree of stability-- letting landlords juice their tenants for as much as possible is counterproductive to this.


But if it becomes unprofitable (for some value of profit) to rent, they will stop renting, unless you force them at gunpoint to continue renting even when it loses money.


Offer to sell it to the tenant then. That should be more common.


The tenant almost by definition can't afford to (if they could afford to own where they rent, they'd likely own).

(Sadly, when you see "rent to own" on houses, they're almost always a scam preying on poor people. The trick is to get them to pay MORE than fair market rent for an "option" to buy that you know they'll never be able to execute on, or if they can, said execution is in your favor. If you care for more details, https://johntreed.com/products/sisgle is worth the price, as it details exactly WHEN an investor can make money buying these "lease options" and how they're usually unprofitable and almost scammy.)

It does naturally occur in some cases - long term tenant who is comfortable renting is offered the house "off market" when the landlord decides he's done and wants to retire from the hassle of property management and sometimes is even willing to take back a mortgage (a great way to convert a rental income stream to a financial interest scheme and remove the property maintenance factor).


If they don't want to sell, and don't want to rent, what are they going to do with it then?


They'll hold it (this is currently happening) or they'll eventually sell at market rates.

