
Implementation of Direct Segments on a RISC-V Processor (2018) [pdf] - ingve
https://carrv.github.io/2018/papers/CARRV_2018_paper_4.pdf
======
JoachimS
Very interesting. But I was really surprised that the conclusions didn't
mention any performance results.

The problem description in the abstract states that "Past analysis shows that
big-memory workloads can spend 5%-50% of execution cycles on TLB misses".

I somehow expected that they would have presented some before-and-after
performance results. In Conclusion they state that "Our preliminary results
show that the TLB-Miss overhead has reduced significantly and we plan to do
further analysis on where to perform the Direct Segment lookup in hardware to
get the best performance." So I guess they had a deadline to meet.

------
audunw
Very interesting. I think it would make a lot of sense for RISC-V to try to
innovate on the memory protection and virtual addressing schemes as well.

I found the solution used in the Mill architecture to be very interesting:
[http://millcomputing.com/wiki/Memory](http://millcomputing.com/wiki/Memory)

If you have a 64-bit address space, all processes might as well use the same
virtual address space, since there's still plenty of space for each process,
even with thousands of processes or more. Making the design such that access
protection and translation are parallel paths is also a good idea.

------
gumby
I like that RISC-V has brought us to the point where you can so easily spin up
a custom processor to try something out!

------
dooglius
Other architectures have solved this problem (excessive TLB misses) by
allowing for larger page sizes than 4KiB. Using larger page sizes seems like a
much superior solution because you can have arbitrary numbers of larger pages
while this only allows for one global segment to be used at a time.

~~~
AboutTheWhisles
Prefetching is another technique. If memory is accessed linearly, the CPU can
prefetch ahead of what is currently being accessed, and that will include the
TLB lookups.

