
An overview of direct memory access (2014) - panic
https://geidav.wordpress.com/2014/04/27/an-overview-of-direct-memory-access/
======
leggomylibro
Cool! I hadn't realized that PCI and PCIe were successors to DMA, and it's
nice to see an overview of how application processors shuttle data around
quickly.

If you want to play with DMA concepts on a less complex level, most modern
microcontrollers implement it in some form or another. If you write a program
that needs to move a lot of data quickly, like drawing to a display or using
an SD card, the difference in speed and reduced overhead is dramatic.

I guess it's a basic concept that most people here will find pedestrian and
unremarkable, but I really appreciate how flexible and powerful it can be. You
just have to configure a DMA channel pointing at two memory addresses, and
then the data moves in the background while the CPU goes off to do other
things. You can send data from memory to an external device, or vice-versa, or
within the chip's internal memory. You can usually wire two peripherals
together to implement 'dumb pipes' which require almost no intervention from
the CPU to keep working. Most chips will also let you trigger DMA transfers
from arbitrary signals such as hardware timers, so you can also 'set and
forget' things like sending audio data to a DAC at a specific bitrate.
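
To make that concrete, here's a toy model in Python. It isn't real firmware: `DmaChannel`, `on_trigger`, and the addresses are all made up for illustration, and a real controller is configured through vendor-specific registers. It only tries to show the "point it at two addresses and let a timer drive it" idea, with the source address stepping through a buffer while the destination stays fixed on a pretend DAC register.

```python
from dataclasses import dataclass


@dataclass
class DmaChannel:
    """Toy model of one DMA channel; every name here is hypothetical."""
    memory: bytearray            # stands in for the chip's address space
    src: int                     # source address
    dst: int                     # destination address
    count: int                   # bytes still to move
    src_increment: bool = True   # walk through a buffer in RAM
    dst_increment: bool = False  # keep writing to one peripheral register

    def on_trigger(self) -> bool:
        """Move one byte per trigger (e.g. a timer tick).

        Returns True while more data remains after this trigger.
        """
        if self.count == 0:
            return False
        self.memory[self.dst] = self.memory[self.src]
        if self.src_increment:
            self.src += 1
        if self.dst_increment:
            self.dst += 1
        self.count -= 1
        return self.count > 0


memory = bytearray(64)
memory[0:4] = bytes([10, 20, 30, 40])  # "audio samples" sitting in RAM
DAC_REGISTER = 32                      # made-up peripheral address

channel = DmaChannel(memory, src=0, dst=DAC_REGISTER, count=4)

dac_output = []
for _tick in range(6):                 # six timer ticks; the last two find nothing to do
    before = channel.count
    channel.on_trigger()
    if channel.count != before:        # a byte was moved on this tick
        dac_output.append(memory[DAC_REGISTER])

print(dac_output)
```

The loop here plays the role of the hardware timer; on a real chip the CPU would not be involved in each tick at all.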

I can see how it would be inadequate for the enormous bandwidth required to
host multiple devices like GPUs and Ethernet ports, but DMA is still fun. A
more elegant protocol for a more civilized age :)

~~~
cperciva
DMA originates from a time when the CPU accessed memory via the North Bridge
chipset; in effect, the CPU was just one of many peers in accessing memory,
and data could be transferred between memory and devices without ever touching
the CPU.

These days, RAM hangs off the CPU -- and even "DMA" involves data passing
through the CPU.

~~~
leggomylibro
>These days, RAM hangs off the CPU -- and even "DMA" involves data passing
through the CPU.

Not for the microcontrollers I was talking about, it doesn't. They have
built-in bus arbiters to manage access to the internal RAM across the CPU and
each DMA channel.

Not every "CPU" is an application processor. But thank you for the history
lesson! I guess the term "DMA" has become sort of generic these days.

------
bogomipz
The author states:

>"Virtual memory is continuous as seen from a process’ point-of-view thanks to
page tables and the memory management unit (MMU). However, it’s non-continuous
as seen from the device point-of-view, because there is no MMU between the
PCIe bus and the memory controller (well, some CPUs have an IO-MMU but let’s
keep things simple). Hence, in a single DMA transfer only one page could be
copied at a time. To overcome this limitation OS usually provide a
scatter/gather API. Such an API chains together multiple page-sized memory
transfers by creating a list of addresses of pages to be transferred."

So a DMA controller accesses physical memory pages and not virtual memory
pages? Is there a reason why it doesn't go through the MMU like everything
else? Is this something historical?

If an IO-MMU is present, would there be a reason to use a scatter/gather API?
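
For reference, the scatter/gather idea in the quoted passage can be sketched roughly like this. It is a simplified illustration, not any OS's actual API: `build_sg_list`, the dict-based `page_table`, and the 4 KiB page size are assumptions made for the example. It walks a virtually contiguous buffer page by page, looks up each physical page, and emits a list of (physical address, length) entries, merging neighbours that happen to be physically adjacent.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def build_sg_list(virt_addr, length, page_table):
    """Split a virtually contiguous buffer into (physical_address, length) entries.

    page_table is a hypothetical dict: virtual page number -> physical page number.
    """
    entries = []
    while length > 0:
        vpn, offset = divmod(virt_addr, PAGE_SIZE)
        chunk = min(length, PAGE_SIZE - offset)   # never cross a page boundary
        phys = page_table[vpn] * PAGE_SIZE + offset
        # Merge with the previous entry when the physical ranges are adjacent.
        if entries and entries[-1][0] + entries[-1][1] == phys:
            entries[-1] = (entries[-1][0], entries[-1][1] + chunk)
        else:
            entries.append((phys, chunk))
        virt_addr += chunk
        length -= chunk
    return entries


# Three virtual pages mapped to scattered physical pages 7, 3 and 4.
page_table = {0: 7, 1: 3, 2: 4}
sg = build_sg_list(virt_addr=100, length=10000, page_table=page_table)
print(sg)  # [(28772, 3996), (12288, 6004)]
```

Physical pages 3 and 4 are neighbours, so the last two chunks collapse into one entry; page 7 is elsewhere, so the device still needs a second descriptor for it.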

~~~
dooglius
It would be expensive as the DMA's MMU would now have to maintain a large TLB
to be performant, and there would be a startup overhead as new page
information is loaded.

------
continuations
Is RDMA the same as kernel bypass?

~~~
dooglius
Generally RDMA implies kernel bypass as the point is to write into a process'
address space efficiently, but you could conceivably implement RDMA in a way
that goes through the kernel, e.g. by using a conventional NIC and doing a
copy into the corresponding address space manually. Note that there are many
other things using kernel bypass that are not RDMA.

------
hootbootscoot
This author needs to spend some time looking at ARM, AARCH64, RISC-V etc, as
the entire article has a heavy Intel/AMD/PC bias.

~~~
hootbootscoot
because, yes, "computers" still contain DMA controllers.

we use them all the time.

~~~
hootbootscoot
[https://www.st.com/content/ccc/resource/technical/document/a...](https://www.st.com/content/ccc/resource/technical/document/application_note/47/41/32/e8/6f/42/43/bd/CD00160362.pdf/files/CD00160362.pdf/jcr:content/translations/en.CD00160362.pdf)

[https://www.sifive.com/soc-ip/dma-controller](https://www.sifive.com/soc-ip/dma-controller)

These are far from "legacy" applications, and they have very little to do
with the ISA bus, I can assure you.

The author neglects to explain how DMA works or to discuss stride size, etc.

Additionally, the topic of how the MMU integrates with DMA and the issues
raised are merely glossed over.
[https://patchwork.kernel.org/patch/10671331/](https://patchwork.kernel.org/patch/10671331/)

and yes, "address translation"... as DMA needs physical addresses.

to be sure, stuff like the R-Pi confuses people, as the CPU is a "device" to
the broadcom GPU that drives the boot process/show, and hence "bus addressing"
internal to the CPU...sigh.

~~~
vvanders
Rather than casting scorn why not step up to the plate and do what the author
has done here but for the architectures that you know?

I spend a fair bit of time in the Arm space and didn't find the article
incorrect, just PC focused. I don't think the author claimed to be the
authority on everything DMA.

~~~
hootbootscoot
Having thought about it a bit, despite my admitted lack of expertise, I guess
I left the OP article wondering what DMA actually does, rather than how it
historically worked on a given architecture and dismissing it as passé. Some
explanation of the actual workings of it would go a long way towards
explaining DMA, vs this article on bus mastering.

1) a description of the dma state machine and what it does over and over
again. "so look, you configure xyz and then it copies from A to B over and
over again in x sized chunks" etc... maybe a mention of the general moving
parts you may find in some kind of dma system... channels, multiplexing, etc.
(i got something about an interrupt out of the op article, which is definitely
true but not the main story)

2) the concerns this naturally raises, when discussing memory management,
virtual memory, etc. (thus, the discussion of paging would then make sense
after #1)

3) the particular implementations and their locational/positional
implications. devices, busses, host, addressing schemes, etc. (here is where
the bus mastering could be best contextualized)

I basically think that it wasn't about DMA as such, and by ignoring the "how
DMA works and is used" part, it didn't provide any means for technical
analysis of bus mastering, etc.
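
As a rough illustration of point 1, here is the sort of state machine I have in mind, written as a toy Python model. None of this matches a specific controller: `DmaEngine`, `configure`, `step`, and the `circular` flag are hypothetical names, and the callback only stands in for a transfer-complete interrupt.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    BUSY = "busy"
    DONE = "done"


class DmaEngine:
    """Toy DMA state machine; all names are made up for illustration."""

    def __init__(self, memory, on_complete):
        self.memory = memory
        self.on_complete = on_complete  # stands in for the completion interrupt
        self.state = State.IDLE

    def configure(self, src, dst, length, chunk_size, circular=False):
        self.src, self.dst, self.length = src, dst, length
        self.chunk_size = chunk_size
        self.circular = circular
        self.offset = 0
        self.state = State.BUSY

    def step(self):
        """One bus grant: copy up to chunk_size bytes, then give the bus back."""
        if self.state is not State.BUSY:
            return
        n = min(self.chunk_size, self.length - self.offset)
        s, d = self.src + self.offset, self.dst + self.offset
        self.memory[d:d + n] = self.memory[s:s + n]
        self.offset += n
        if self.offset == self.length:
            self.on_complete()
            if self.circular:
                self.offset = 0         # start over with no CPU involvement
            else:
                self.state = State.DONE


memory = bytearray(32)
memory[0:10] = b"0123456789"
interrupts = []

engine = DmaEngine(memory, on_complete=lambda: interrupts.append("transfer complete"))
engine.configure(src=0, dst=16, length=10, chunk_size=4)

steps = 0
while engine.state is State.BUSY:
    engine.step()
    steps += 1

print(steps, bytes(memory[16:26]), interrupts)
```

With a 10-byte transfer and 4-byte chunks the engine needs three bus grants (4 + 4 + 2 bytes) and raises a single completion event. With `circular=True` it would wrap around and keep going instead of stopping.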

