Hacker News | zvrba's comments

There are many new C ripoffs, but nobody seems to acknowledge Cyclone (https://cyclone.thelanguage.org/) which is a lot older. It was even backed by a corporation (AT&T), but never succeeded. I wonder why.

Maybe it has something to do with this:

> Cyclone is no longer supported; the core research project has finished and the developers have moved on to other things. (Several of Cyclone's ideas have made their way into Rust.) Cyclone's code can be made to work with some effort, but it will not build out of the box on modern (64 bit) platforms.

Most insightful part for me was this one:

> Many variations [he's talking about Rust] that seemed like they ought to run the same speed or faster turned out slower, often much slower. By contrast, in C++ it was surprisingly difficult to discover a way to express the same operations differently and get a different run time.

This is an insidious pitfall that may earn Rust the "slow" label, akin to what happened to Common Lisp.

> This is an insidious pitfall that may earn Rust the "slow" label, akin to what happened to Common Lisp.

This doesn't match my experience at all with Rust, for what it's worth. Not having copy constructors makes up for any difference in that regard: in C++ it's way too easy to accidentally deep copy a vector and the entire object graph underneath it--all it takes is "auto a = b" as opposed to "auto& a = b"--with very bad performance consequences. Rust, on the other hand, doesn't let you deep copy unless you explicitly ask it to.

When I went back to C++ after learning Rust and used a vector again, I realized this and it hit me like a brick. I used to program in C++ before, so it wasn't like I didn't know this, but it was something rather unremarkable to me. Then I go back to C++ and throw vectors around willy-nilly and stuff works. Huh, I thought, this feels strange, I wonder how C++ accomplishes this without a garbage collector or ownership. Then it hit me -- it deep copies All The Things. Again, I already knew this, but I hadn't ever thought about it much. But the change in perspective made it stick out like a sore thumb.

I got so worried about accidentally deep-copying stuff that I ended up with an ugly C++ hack:

    C(const C &) = delete;
    C &operator=(const C &) = delete;
    C(C &&) noexcept = default;
    C &operator=(C &&) noexcept = default;

    struct MakeCopy {
        const C &src;
        MakeCopy(const C &src) : src(src) { }
    };

    MakeCopy make_copy() const { return *this; }
    C(MakeCopy src) : field1(src.src.field1) /* ...and so on for each field... */ { }

Then, when you really want to copy stuff, you have to say "a = b.make_copy();".

This isn't really clean, because it's too easy to add a field to C and then forget to add it to the "manual copy constructor", but so far I've found it good enough.

In Rust, if you want a deep copy you (IIRC) implement the Clone trait, which lets you clone everything explicitly. Many collections in the standard library already implement it, so you get it for free :).

Edit: I should point out that deep copies can only be explicit in Rust -- there's no implicit deep copy AFAIK.

There is no implicit deep copy, Clone is usually how a deep copy is implemented, but not all Clones are deep copies. Rc, for example, bumps a reference count on Clone.

The Qt C++ library does some neat tricks to avoid copying. For example, its vector class (like nearly all of its container/value classes) overloads the "=" operator to perform shallow copies. So even though the default behavior in C++ is to deep copy, it's possible to override it.

Qt even takes it further by providing automatic copy-on-write. Methods that read data operate on the shared copy. Methods that change data cause the object to deep copy and detach first. This allows writing programs using values rather than references, while retaining the performance of references.

> This is an insidious pitfall that may earn Rust the "slow" label, akin to what happened to Common Lisp.

Could it just be that the language is still young and all the edge cases in the optimizer haven't been implemented yet?

Not just that, but there are a lot of optimizations that would be nice but that we haven't implemented yet. (Though some of these will end up helping compile times more than runtime performance.)

I thought Rust left the optimization to LLVM. Is that not the case?

LLVM is great, and it's the reason why Rust is able to match C++ in performance on this workload, but it's not a magic-optimizer-of-everything: it's tuned to C and C++. We have added a couple of Rust-specific optimizations to it, but not a whole lot.

I suspect what the author was seeing was random performance differentials between iterators and loops, which often boils down to little missed optimizations in LLVM. If you find them, please file them in the Rust bug tracker--we've fixed many of them and will continue to do so in the future!

Note that when MIR lands, progress on which is well underway, we'll have the ability to do optimizations at the Rust IR level in addition to the LLVM level. This should make it easier to do these kinds of things.

Figured this might be the case. LLVM is great but C++ compilers have had a long time to mature.

It's impressive that the filter and map methods produced such good performance. The lazy evaluation scheme must be well set up; shout-out to the Rust team!

Are the lazy iterators combined and effectively stripped out at compile time?


As I understand it, there are a number of potential optimizations based on Rust's implicit understanding of the lifetime of a memory region. LLVM doesn't know about those because they can't be represented in the IR. Last I heard, any kind of optimization work based on the extra information rustc has is waiting for the compiler refactor that's currently underway.

Note that there aren't any optimizations that C++ can do that Rust can't in this area.

Can Rust's aliasing rules lead to extra avenues of optimization compared to C++ (along the lines of C vs. Fortran)?

Quite possibly yes.

I dunno, it's very easy for me to express things differently in C++ and get a different runtime. It's easy enough for the many C++ programmers I've known who wrote needlessly slow code. It must be equally easy for the author. The reason he doesn't is presumably vast experience with C++, so that he doesn't try obviously dumb things (things that needlessly malloc a ton of stuff, etc.), whereas his Rust experience is necessarily smaller. I think you need quite a bit of experience before declaring that, well, experience just doesn't help because the performance model of this thing is weird!

I would never call software good to go until I have profiled it to weed out all the places where a slow technique meets an important path.

In my experience profiling and optimization is usually sort of like a victory lap - low hanging fruit for substantial differences.

This is a problem in Haskell as well, from everything I've read.

Can you share some links?

One particular example that I recall was that if you calculate something like

    foldl (+) 0 [1..1000000]
(i.e. calculate the sum of the first 1000000 natural numbers)

Then the obvious way to do it is to simply keep a running total, but Haskell's lazy foldl doesn't do that; it builds up the whole chain of thunks:

    (...((0 + 1) + 2) + ...) + 1000000

and tries to evaluate that, which tends to cause stack overflows and is just generally slow. To fix it you have to use the strict variant instead:

    foldl' (+) 0 [1..1000000]
This problem is described in more detail here: https://wiki.haskell.org/Foldr_Foldl_Foldl'

That adding a single tick mark makes such a big difference in performance is one of many things that make performance-minded development in Haskell a pain. With these high-level languages, you really have to understand the implementation of the language to get anything fast written.

Another discussion today: https://news.ycombinator.com/item?id=11060257

for example: https://news.ycombinator.com/item?id=11060566

Yeah, I wish he had gone through this in more detail, too.

I laughed at this. I STILL write "++foo;" rather than "foo++;" by default, because C++ used to create copies in odd places.

Confidential computing in the cloud, for example.


When I'm bored I pick a random section and read it. I was really fascinated by the section on sorting networks for example.

The real gems are the exercises and the answers to them; there's at least as much useful/applicable material there as in the main text. I tried to solve them, but found myself unable to make progress on anything with difficulty > 20, even when I understood the concepts. I have no idea how to attack them. (Even the non-mathematical ones.)

IMHO, TAOCP needs "Volume 0" which would teach you problem solving in this particular domain. (No, "Concrete mathematics" is not it. That book is, IMHO, awful: many problems depend on having tricks up your sleeve.)

Any tips on how to approach exercises which seemingly don't have a lot to do with the preceding text?


He plans to remove that section in the next edition of the book :( I'd argue the section is very relevant since today's RAM works fastest when accessed sequentially.


> If you start with C++, you're committed to a multi year project, that's what it takes to advance from novice to junior level.

This is nonsense. Perhaps it's true to some degree if you try to "learn" C++ from tutorials of dubious quality rather than from a decent book.


You sound like someone who has never really explored C++ lookup rules.


Funnily enough, I've worked on MLOC-sized C++ projects and almost never had a problem with lookup rules. On the rare occasions where it was a problem, it was a compile-time problem and easily resolved.

In any case, these rules do not fall under what I'd call "junior-level" C++ programmer as the OP wrote.


In my previous job we were evaluating HDF5 for implementing a data-store a couple of years ago. We had some strict requirements about data corruption (e.g., if the program crashes amid a write operation in another thread, the old data must be left intact and readable), as well as multithreaded access. HDF5 supports parallelism from distinct programs, but its multithreaded story was (is?) very weak.

I ended up designing a transactional file format (a kind of log-structured file system with data versioning and garbage collection, all stored in a single file) from scratch to match our requirements. Works flawlessly with terabytes of data.


Might one take a peek at that? In other words, was it open-sourced, or do plans to that effect exist?


No and no. Strictly proprietary technology which gives a real competitive advantage. Fun thing is, if you choose your data structures wisely, it's not even that hard to write; it ended up being under 2k lines of C++ code.

GC was offline though; it was performed at the time the container was "upgraded" from RO to RW access. I don't think it'd be difficult to make it online, but there was no need for that.


Right. I spend less than half of my income on necessities (food, rent, etc. -- these I split with my partner), and recently I've found myself telling people that time is my most limited resource. So in the past couple of years I've been very conscious about allocating my free time.

When I quit my last job I wanted to trade the notice period for unpaid vacation, but it was a no-go. :(


Nice to see that C++ is picking up speed after C++11. In my previous job I had to work with "legacy" C++, but now I'm working with "modern" C++. The difference is huge. C++11 features allow me to do more with less headache and I'm more confident in code correctness.


> I'm not sure that's the best example. In that case, the knowledge that the 'array' global never has its address taken allows you to perform the optimization.

Separate translation units. For this to work, you'd need to postpone all optimizations to the link stage. Even then you're not safe, because you may be producing a dynamic library and the program loading it can take the address of an object by dlsym().


