> The precise Rust aliasing rules are somewhat in flux, but the main points are not contentious
I've sometimes found myself in situations where the only way I've been able to deal with this is to check the compiler's output and trawl forums for hints by Rust's developers about what they think/hope the semantics are/will be.
Historically speaking, this situation isn't uncommon: working out exactly what a language's semantics should be is hard, particularly when the language has many novel aspects. Most major languages go through this sort of sequence, some sooner than others, and some end up addressing it more thoroughly than others. Eventually I expect Rust to develop something similar to the modern C spec, but we're not there yet.
Because Morello is an experimental platform, only a small number of boards were manufactured. They are/were allocated mostly to people involved in the early stages of CHERI R&D and, AFAIK, none were made available to the general public. [That said, I don't know whether there are still some unallocated machines!] One can fully emulate Morello with qemu. While the emulator is, unsurprisingly, rather slow, I generally use qemu for quick Morello experiments, even though I have access to physical Morello boards.
You're quite right, I over-simplified -- mea culpa! That should have said "often unify these phases". FWIW, I've written recursive descent parsers with and without separate lexers, though my sense is that the majority opinion is that "recursive descent" implies "no separate lexer".
For what it's worth, in my little corner of the world, all of the recursive descent parsers I've seen and worked with have separate lexers. I can't recall seeing a single recursive descent parser in industry that didn't separate lexing.
However, I do often see a little fudging of the two together for funny corners of the language. Often that just means handling ">>" as a right-shift token in some contexts and as two closing angle brackets for nested generics in others.
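A minimal sketch of the kind of fudge I mean, assuming a hand-written lexer that the parser can put into a "generics" mode (the names and structure here are mine, not from any particular parser):

```rust
#[derive(Debug, PartialEq)]
enum Token {
    Shr, // ">>" as a right-shift operator
    Gt,  // ">" as a closing angle bracket
}

// Return the next token, consuming it from `src`. When the parser knows
// it's inside a generic argument list, it sets `in_generics`, and ">>"
// is split into two ">" tokens instead of a single right-shift.
fn next_token(src: &mut &str, in_generics: bool) -> Option<Token> {
    let s = *src;
    if let Some(rest) = s.strip_prefix(">>") {
        if in_generics {
            *src = &s[1..]; // consume one '>'; leave the other for the next call
            return Some(Token::Gt);
        }
        *src = rest;
        return Some(Token::Shr);
    }
    if let Some(rest) = s.strip_prefix('>') {
        *src = rest;
        return Some(Token::Gt);
    }
    None
}

fn main() {
    let mut expr = ">>"; // e.g. from "a >> b"
    assert_eq!(next_token(&mut expr, false), Some(Token::Shr));

    let mut generics = ">>"; // e.g. the tail of "Vec<Vec<u8>>"
    assert_eq!(next_token(&mut generics, true), Some(Token::Gt));
    assert_eq!(next_token(&mut generics, true), Some(Token::Gt));
}
```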
That's not my impression of the majority opinion, fwiw. (I wrote my first recursive-descent parser in the 80s and I learned from pretty standard sources like one of Wirth's textbooks.)
Hello Filip -- I hope life is treating you well! I'm happy to clarify a couple of things that might be useful.
First, the VM authors I've discussed this with over the years seem roughly split down the middle on microbenchmarks. Some very much agree with your perspective that small benchmarks are misleading. Some, though, were very surprised at the quantity and nature of what we found. Indeed, I discovered that a small number had not only noticed similar problems in the past but had spent huge amounts of time trying to fix them. There are many people whom I, and I suspect you, admire in both camps: this seems like something on which reasonable people can differ. Perhaps future research will provide more clarity in this regard.
Second, for BBKMT we used the first benchmarks we tried, so there was absolutely no cherry picking going on. Indeed, we arguably biased the whole experiment in favour of VMs (our paper details why and how we did so). Since TCPT uses 600 (well, 586...) benchmarks it seems unlikely to me that they cherry picked either. "Cherry picking" is, to my mind, a serious accusation, since it would suggest we did not do our research in good faith. I hope I can put your mind at rest on that matter.
- Academics don’t publish results that aren’t sexy. How many people like you ran the same experiment with a different set of benchmarks but didn’t publish the results because they confirmed the obvious and so were too boring? How many times did you or your coauthors have false starts in your research that weren’t published? You’re cherry picking just by participating in the perverse reward system.
- The complexity of the data analysis sure makes it look like you’re doing something smart, but in reality, it’s just an opportunity to cherry pick.
- These results are not consistent with what I’ve seen, and I’ve spent countless hours benchmarking VMs I wrote and VMs I compete with. I’ll believe my own eyes before I believe published research. This leads me to believe there is something fishy going on.
Anyway, my serious accusation stands and it’s a fact that for large real-ish workloads, VMs do “warm up” - they start slow and then run faster, as designed.
I not only welcome reasonable scepticism, but I do my best to facilitate it. I have accrued sufficient evidence over time of my own fallibility, and idiocy, that I now try to give people the opportunity to spot mistakes so that I might correct them. As a happy bonus, this also gives people a way of verifying whether the work was done in the right spirit or not.
To that end we work in the open, so all the evidence you need to back up your assertions, or assuage your doubts, has been available since the first day we started:
* Here's the experiment, with its 1025 commits going back to 2015 https://github.com/softdevteam/warmup_experiment/ -- note that the benchmarks were slurped in before we'd even got many of the VMs compiling.
* You can also see from the first commit that we simply slurped in the CLBG benchmarks wholesale from a previous paper that was done some time before I had any inkling that there might be warmup problems https://github.com/ltratt/vms_experiment/
* Here's the repo for the paper itself, where you can see us getting to grips with what we were seeing over several years https://github.com/softdevteam/warmup_paper
* The snapshots of the paper we released are at https://arxiv.org/abs/1602.00602v1 -- the first version ("V1") clearly shows problems but we had no statistical analysis (note that the first version has a different author list than the final version, and the author added later was a stats expert).
* The raw data for the releases of the experiment are at https://archive.org/download/softdev_warmup_experiment_artef... so you can run your own statistical analysis on them.
To be clear, our paper is (or, at least, I hope is) careful to scope its assertions. It doesn't say "VMs never warm up" or even "VMs only warm up X% of the time". It says "in this cross-language, cross-VM benchmark suite of small benchmarks we observed warmup X% of the time, and that might suggest there are broader problems, but we can't say for sure". There are various possible hypotheses which could explain what we saw, including "only microbenchmarks, or this set of microbenchmarks, show this problem". Personally, that doesn't feel like the most likely explanation, but I have been wrong about bigger things before!
For those of us with Unix-y mail setups the move to OAuth2 can be a bit tricky, but there are now several different programs to help (spurred, I suspect in no small part, by Microsoft/Exchange's stance). The ones I know about are:
Not only is it tricky and user-hostile, but it also severely decreases security by forcing people to use a fundamentally insecure mechanism to obtain the authentication token.
It makes it necessary to use a browser to obtain the token, and that browser is a huge attack surface. On the web, that doesn't matter, since you need to be using a browser anyway, but for mail it's just additional cruft.
That's just for certain flows, like the common authorization code flow. The client credentials flow does not require a browser, for example.
Not sure about Google, but Microsoft supports client credentials for IMAP/POP3[1], but not for SMTP yet. IIRC it was supposed to be rolled out this January but is still missing. Hopefully they can get that deployed ASAP.
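For anyone wanting to experiment: the client credentials flow really is just a single HTTP POST, no browser involved. Here's a rough sketch in Rust using the reqwest crate; the endpoint and scope follow Microsoft's documented pattern, and all the IDs/secrets are placeholders:

```rust
// Sketch of an OAuth2 client credentials token request (RFC 6749 §4.4):
// one HTTP POST, no browser. Requires the `reqwest` crate with its
// "blocking" feature enabled.

fn fetch_token() -> Result<String, Box<dyn std::error::Error>> {
    let tenant = "YOUR_TENANT_ID"; // placeholder
    let url =
        format!("https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token");
    let params = [
        ("grant_type", "client_credentials"),
        ("client_id", "YOUR_CLIENT_ID"),         // placeholder
        ("client_secret", "YOUR_CLIENT_SECRET"), // placeholder
        ("scope", "https://outlook.office365.com/.default"),
    ];
    // The JSON response contains an "access_token" field; for IMAP you
    // then base64-encode "user=<addr>\x01auth=Bearer <token>\x01\x01"
    // as the XOAUTH2 SASL string.
    let body = reqwest::blocking::Client::new()
        .post(url)
        .form(&params)
        .send()?
        .error_for_status()?
        .text()?;
    Ok(body)
}
```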
More like authorization. Authentication is completely opaque to most people using Gmail (except for the very few using service accounts and signing their own authorization tokens).
Or maybe you can enlighten me as to how you can get the token for XOAUTH2 from just your Gmail email address and password without involving any opaque Google service.
Authentication happens completely outside of OAuth, inside some Google black box. 2FA has nothing to do with OAuth at all: it's just another feature of Google's black box, which decides whether to give you the access/refresh tokens or not.
Well, the question I responded to was "What does OAuth have to do with authentication".
I fully agree with the move away from plain passwords in this case, given that it's no longer "just" the password to a mail account, but to much, much more.
Now, while I think OAuth adds some features that can be useful in certain settings, I'm inclined to agree that requiring OAuth isn't the best move.
However, the alternatives would probably have required a lot of extra work on Microsoft's part, like being able to set up device-specific passwords or similar.
So, given the need to move away from plain account passwords, I can understand why they wouldn't want to do that and instead just used what they already had.
Honestly, I haven't tried Xvfb in years. That said, I did have another motive in keeping things simple: even though xwininfo is very X11 specific, I hope that it's easier for people to work out an alternative for other platforms and adapt the recipes from my post to their situation.
I always try to be respectful towards other software in my writing and I fell short this time. I hope you'll accept my sincere apologies!
> the tool chain target names, which I think was your real complaint
Yes, this was solely what I was referring to (slightly thoughtlessly) as "ugly", and even then only in the sense of "where did those magic names come from?" I certainly wasn't referring to objcopy itself!
Don't worry: I'm not actually insulted. I just think it's funny that something intended for development debugging turned out to be actually useful (as I said, I still use them too, and not for debugging bfd).
I agree that you don't want non-CHERI Rust code to have to know about capabilities in any way. However, if you are using Rust for CHERI, you need some access to capability functions; otherwise, even in pure capability mode, you can't reduce a capability's permissions. For example, if you want to write `malloc` in purecap Rust, you'll probably want to hand out capabilities that point to blocks of memory with bounds that only cover that block: you need some way to say "create a new capability which has smaller bounds than its parent capability."
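To make the `malloc` example concrete, here's a hypothetical sketch of what that might look like. To be clear: Rust has no such API today. The `cheri_bounds_set` function below is modelled on CHERI C's intrinsic of the same name from `<cheriintrin.h>`, and on a purecap target every `*mut u8` would itself be a capability:

```rust
extern "C" {
    // Assumed intrinsic, modelled on CHERI C's cheri_bounds_set():
    // returns a capability derived from `cap` whose bounds cover only
    // `len` bytes.
    fn cheri_bounds_set(cap: *mut u8, len: usize) -> *mut u8;
}

// A toy bump allocator: the capability handed back covers only the block
// allocated, not the whole arena, so out-of-bounds accesses through it
// trap in hardware, even from unsafe code.
struct Arena {
    base: *mut u8, // on a purecap target, a capability for the whole arena
    next: usize,
    size: usize,
}

impl Arena {
    unsafe fn alloc(&mut self, len: usize) -> Option<*mut u8> {
        if self.next + len > self.size {
            return None;
        }
        let block = self.base.add(self.next);
        self.next += len;
        Some(cheri_bounds_set(block, len))
    }
}
```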
As to the utility of hybrid mode, I politely disagree. For example, is it useful for safe Rust code to always pay the size penalty of capabilities when you've proven at compile-time that they can't misuse the underlying pointers? A very different example is that hybrid mode allows you to meaningfully impose sub-process-like compartments (in particular with the `DDC` register, which restricts non-capability code to a subset of the virtual address space; see also the `PCC` register and friends). Personally, I think that this latter technique holds a great deal of potential.
> For example, is it useful for safe Rust code to always pay the size penalty of capabilities when you've proven at compile-time that they can't misuse the underlying pointers?
This seems like it's impossible though? How can you prove at compile time that all software that your safe Rust calls doesn't corrupt pointers? Don't you need capabilities in the Rust to ensure that if such software does something nefarious, the Rust code catches it before doing something untoward? (Not to mention the risk of compiler bugs causing something.)
If you’re passing a pointer to safe Rust code, with the capability bound encoded into something “native” to the language, then you don’t need hardware capabilities at all.
Correct, but once you've done that you can strip the capability information and pass the raw address around to the safe code because the compiler runs validation.
One potential option I haven't seen mentioned is to make references (i.e. `&[mut] T`) not use capabilities, but have raw pointers (`*(mut|const) T`) use them. Since the compiler already guarantees that references are used correctly, at least theoretically this is the best of all worlds.
Now it's possible that CHERI would make this impossible, but it's definitely an angle worth recognising.
It's absolutely possible, because the hardware doesn't care about your compilation model: you can mix normal pointers and capabilities as you wish. A challenge is that it's easy to go from capability -> pointer, but harder to go from pointer -> capability -- where do the extra capability bits come from? CHERI C provides a default ("inherit capability bits from the DDC") but I'm not sure that's what I would choose to do.
The problem is, there's lots of unsafe code that casts `*mut T` to `&mut T` (usually after checking the T is valid and whatnot). If `&mut T` didn't use capabilities, this kind of unsafe code would end up not taking advantage of CHERI's capability checking, which would be unfortunate.
I don't think this is actually a problem, since when casting from `&mut T` to `*mut T`, the returned pointer can only access the data (the T value) directly behind the reference.
The raw pointer would be synthesised with the capability for only the pointee of the original reference.
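In stock Rust terms, this is a provenance question: under Rust's proposed provenance models, a raw pointer cast from a reference only carries the right to access that one pointee, which is exactly the bound such a synthesised capability would enforce at runtime. A small illustration in plain Rust (the commented-out line is UB today and would become a hardware trap under CHERI):

```rust
fn main() {
    let mut xs = [0u8; 16];
    let r: &mut u8 = &mut xs[4];
    // The raw pointer's provenance (and, under CHERI, the synthesised
    // capability's bounds) covers xs[4] only, not the rest of the array.
    let p: *mut u8 = r;
    unsafe {
        *p = 42; // fine: within the pointee
        // *p.add(1) = 0; // UB in stock Rust; a bounds trap under CHERI
    }
    assert_eq!(xs[4], 42);
}
```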
[Warning: over-serious response coming.] There's a big "cattle grid" sign a little way before it, so I don't think that's a realistic worry. As an aside, that road is long, but a dead end: it leads up to a beauty spot / hiking point. The road also largely lacks dividing lines (i.e. it's sort-of single track, though often wide enough for two vehicles), and the cattle grid comes not long after a sharp corner, so I can't imagine many vehicles would have been able to get to flying speeds. Besides, that neck of the woods has plenty of interesting driving roads including the main A39 out of Porlock just a couple of miles away, that I imagine are of more interest to speed demons.
> Is this a separate thing that would be integrated into an IDE or would it make more sense as part of the compiler?
As things stand it's most easily used in a batch setting (e.g. a compiler). I don't think it would take much, if any, work to use in an incremental parsing setting (e.g. for an IDE), but I haven't tried that yet!
I'm not an IDE person myself (I'm a neovim person), but I'd love to see someone integrate such an approach into an IDE! The algorithm is freely available, as is the implementation in Rust (grmtools). If you want to reuse the Rust code and it needs some tweaks, then I'm sure we can find a way to support both batch and IDE-ish use cases well.