
Span<T> in C# - runesoerensen
https://github.com/dotnet/corefxlab/blob/master/docs/specs/span.md
======
Nuzzerino
I've done some ultra-fast optimizations of algorithms in C# before. Stack
allocation and unmanaged pointers were essential for this. It seems that
Span<T> is making this approach, in a sense, more accessible to more
developers.

This makes my optimization skills less valuable as more developers will know
how to do this ;)

However, this will greatly enhance awareness of C# as a language that can be
used for reasonably high-performance code. Most people don't seem to
understand how fast C# can be with good optimization techniques. This will
help increase adoption of C# in the developer community.

~~~
dr_zoidberg
OTOH, I'd say this gives you another tool to your optimization toolbox, and
one that you can market as "this thing I was doing before with unmanaged
pointers, can now be done in a safer way" (and call your previous clients to
sell them the new-and-shiny-good-way-to-do-it).

~~~
WorldMaker
Yeah, if Linq optimization work has taught me anything, `Span` will get lumped
into "weird advanced things" by most developers and there will still be a role
for developers that understand how to use it.

(Aside: I still can't believe the number of developers I meet that seem to
think of basic Linq concepts like `ToLookup` as "weird advanced things", which
leads directly to long rants about `ToList` and why I consider it harmful.)

~~~
Nuzzerino
Yes and no. You're right on the money about Span being an advanced construct.
However, before Span, stack allocated memory was essentially C programming in
the CLR, and most of the benefits of the CLR (cross architecture
compatibility, for one) therefore go out the window, as you have
dependencies on raw memory layout and stack space (determined by the CPU
architecture and OS) and things like memcpy, which depend on binaries from the
C runtime. If Span does not compromise these benefits, then the developer no
longer has to think about those things.

There will be a gain in ease of use. Granted, average developers aren't going
to care about this (nor will they need to). But you should expect to see an
increase in performance of many libraries, as most open source projects
generally won't use unsafe code in order to increase performance.

------
runesoerensen
Span<T> was recently added to C# 7.2
(https://blogs.msdn.microsoft.com/dotnet/2017/11/15/welcome-to-c-7-2-and-span/),
and the early proposal was discussed a bit on HN last year:
https://news.ycombinator.com/item?id=12713186

Also this blog post has more examples and benchmarks for people interested:
[http://adamsitnik.com/Span/](http://adamsitnik.com/Span/)

------
twotwotwo
C# was mostly irrelevant to me for a long time while Microsoft's implementation was
closed, but there are some neat things about it. They've done a lot of
interesting stuff in the language since it first came out, including pragmatic
sugar-y stuff like type inference (`var`), async/await, and recently some
moves towards more functional-style pattern matching though they're not
totally there yet
(https://www.kenneth-truyers.net/2016/01/20/new-features-in-c-sharp-7/
discusses proposals, some of which didn't make C# 7). Interfaces and
value types also seemed like important things to have early, and there's some
other handy looking stuff like the SustainedLowLatency GC mode (defer full-
heap compactions as long as it can).

Can be tempting to think of it as a Java clone because of its early history
and the shared general category (OO-focused GC'd imperative statically typed
language whose first major impl was bytecode/VM-based), but there are signs of
more to it than that.

~~~
eighthnate
C# is a more polished and "prettier" Java. It has done well with its
incorporation of functional programming, generics, dynamic programming and
syntactic sugar in the last few releases. Too bad it's pretty much confined to
the Windows ecosystem.

~~~
mythz
Our server library and framework software suite and Apps have been running
cross-platform for several years and with the support of .NET Core 2.0 they
now run fast and flawlessly on Linux which is a very popular deployment target
for our Customers, in fact all our .NET Core Live Demos were developed on
Windows and deployed to and running on Linux:

    
    
      - https://github.com/NetCoreApps/LiveDemos
      - https://github.com/NetCoreWebApps/LiveDemos
    

Each project can also be opened and developed on Linux or Mac with VS Code or
Rider. Xamarin's solutions have been making C# a popular language for
developing native high-performance iOS/Android Apps for several years and the
stigma of C# server apps being confined to Windows should be eradicated with
the advent of .NET Core.

------
kccqzy
I’m not a C# programmer but can someone explain how it deals with memory
validity? For example what if after the Span is created the underlying memory
is freed or reallocated (e.g. moved to a different region in memory because it
needs to grow)? What if I create some data on the stack and then return a
Span<T> that points to it? The data on the stack would be implicitly
deallocated as the function returns. Basically how does C# solve the iterator
invalidation problem in C++?

~~~
int_19h
The linked document kinda sorta explains it, by saying that Span is a by-ref
type. Let me explain what this actually means.

In .NET (below C# level, in the VM itself), there are two fundamental types of
pointers - managed (e.g. int&), and unmanaged (e.g. int* ).

An unmanaged pointer is basically the same as a C pointer. It points to
whatever you tell it to point, and it's your responsibility to ensure that
it's still valid when you dereference it. So you can get dangling pointers to
locals that went out of scope, for example. If you point it at some memory
that is managed by GC, you also need to make sure that GC doesn't
automatically move that memory, because the pointer will _not_ be updated -
the VM provides an opcode for "pinning" managed objects so that they don't
move to facilitate this. Also, because these pointers are "dumb", you can do
pointer arithmetic on them, cast them to/from integer types, etc. And they can
be used in any position a valid type can.

A managed pointer, in contrast, is a pointer that is guaranteed to be memory-
safe. For managed pointers that reference managed objects and their fields,
this means that GC is aware of those pointers, and adjusts them as it moves
the objects around in memory, just like it adjusts regular object references
(so you don't need to pin anything; things "just work"). When you have a
managed pointer to stack-allocated data, the VM basically makes it illegal to
return such a pointer, or stash it away into a variable that can outlive the
scope - this is enforced by the bytecode verifier, simply by prohibiting
fields of managed pointer types in heap-allocated objects (including,
recursively, in any structs). So the only legal operation that you can do with
a managed-pointer-to-local is to pass it into a function call - since the
stack frame of the calling function is guaranteed to be there for the duration
of the call, that is memory-safe.

On C# level, unmanaged pointers are just pointers (int* ), and managed
pointers are used to implement ref types, as in "void Foo(ref int x)". Until
recently, function arguments were really the only place they were permitted,
so they were only used to pass arguments by reference. Recently, they've also
added the ability to declare ref locals and return by reference, subject to
all the verification rules - e.g. if you return a ref, you cannot return a ref
to a local, it must be a ref to a field, or a ref argument that you got from
the caller.

The runtime additionally has some types, that effectively wrap a managed
pointer and add some functionality to it. One existing example is
TypedReference
(https://docs.microsoft.com/en-us/dotnet/api/system.typedreference) - this is
basically a type-erased
managed pointer, plus runtime type of whatever it points to. You can then
"downcast" it to a proper typed managed pointer, and because it knows the
target type, it can verify that the cast is valid. There's also a type called
ArgIterator, which is basically the .NET type-safe equivalent of C's va_list.
These are used for various low-level stuff like C++/CLI vararg functions.

Now, these wrapper types, because they encapsulate a managed pointer, have all
the same verification restrictions that managed pointers themselves have. And
all these types, including managed pointers themselves, are collectively
called by-ref types.

Now, Span<T> is basically just a new by-ref type, that combines a managed
pointer with span length under the hood. As such, it's subject to all the same
restrictions, which together are sufficient to make sure that it's not
possible to use it in such a way that it points to invalid memory.
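
A minimal sketch of what those restrictions look like from the C# side, assuming C# 7.2 or later with `Span<T>` available (for example via the System.Memory package). The names `ByRefDemo`, `SumOnStack`, and `Sum` are made up for illustration. The legal path only passes the span down the call stack; the commented-out fragments are the kinds of use the compiler is expected to reject.

```csharp
using System;

static class ByRefDemo
{
    // Legal: the span is only passed down the call stack, so the
    // stack-allocated buffer outlives every use of it.
    public static int SumOnStack()
    {
        Span<int> local = stackalloc int[4];
        for (int i = 0; i < local.Length; i++)
            local[i] = i + 1;              // 1, 2, 3, 4
        return Sum(local);                 // implicit Span -> ReadOnlySpan
    }

    public static int Sum(ReadOnlySpan<int> values)
    {
        int total = 0;
        foreach (int v in values)
            total += v;
        return total;
    }

    // Not legal (left as comments so the file still compiles):
    //
    // static Span<int> Escape()
    // {
    //     Span<int> local = stackalloc int[4];
    //     return local;        // would outlive the stack frame
    // }
    //
    // class Holder { Span<int> field; }   // by-ref type stored in a heap object

    static void Main()
    {
        Console.WriteLine(SumOnStack());   // prints 10
    }
}
```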

~~~
RotsiserMho
Thanks for taking the time to type this up. Having just written a class
similar to Span in C++ I was curious as to how this would be accomplished in
C#.

------
tomjakubowski
This looks great! In other languages, I frequently miss having a simple,
widely adopted abstraction like Rust's &[T]; C++ iterators are kind of a pain
by comparison, and often feel like an abstraction maybe one level too
low/broad (as always, concepts could help that).

It's nice that they documented the interaction with the GC too.

------
ComputerGuru
The greatest benefit of `Span<T>` in C# is that it allows you to do things
that once required an unsafe context in an efficient, (memory) safe, and type-
safe manner.

~~~
DubiousPusher
Can you give an example?

~~~
colejohnson66
The Github link has an example:

    
    
      Span<byte> stackSpan;
      unsafe {
          byte* stackMemory = stackalloc byte[100];
          stackSpan = new Span<byte>(stackMemory, 100);
      }
      SafeSum(stackSpan);
    

You still need unsafe to allocate, but once allocated, you can access the
memory in a type safe way.

~~~
justinvp
Unsafe is no longer necessary for such cases. This can now be written as:

    
    
        Span<byte> stackSpan = stackalloc byte[100];
    

See
[https://github.com/dotnet/coreclr/pull/14503/files](https://github.com/dotnet/coreclr/pull/14503/files)
as an example.
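
For completeness, a rough self-contained version of that pattern, assuming C# 7.2 or later. The thread only shows the `SafeSum` call site, so the body below is a guess at what such a method might look like; `StackSpanDemo`, `SumStack`, and `SumArraySlice` are hypothetical names. It also shows the same method working over a slice of an ordinary array.

```csharp
using System;

static class StackSpanDemo
{
    // Hypothetical body: the thread only shows the SafeSum call site.
    public static ulong SafeSum(ReadOnlySpan<byte> bytes)
    {
        ulong sum = 0;
        for (int i = 0; i < bytes.Length; i++)
            sum += bytes[i];               // the indexer is bounds-checked
        return sum;
    }

    public static ulong SumStack()
    {
        Span<byte> stackSpan = stackalloc byte[100];   // no unsafe block needed
        stackSpan.Fill(2);
        return SafeSum(stackSpan);
    }

    public static ulong SumArraySlice()
    {
        byte[] array = { 1, 2, 3, 4, 5 };
        // View over elements 1..3 (values 2, 3, 4); nothing is copied.
        return SafeSum(new ReadOnlySpan<byte>(array, 1, 3));
    }

    static void Main()
    {
        Console.WriteLine(SumStack());        // prints 200
        Console.WriteLine(SumArraySlice());   // prints 9
    }
}
```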

~~~
ComputerGuru
Yup. Perhaps the github links can be updated to reflect this awesomeness?

------
alkonaut
I’m most excited for the “value types as references” such as ref returns and
“in”-parameters. I often use arrays of structs for memory locality, and they
often grow larger than the size where they can be passed in registers. In a
64bit app with all double precision floats there isn’t much you can do with 16
bytes or less.

Being able to send 3 structs of e.g 32 bytes each (such as Vector4<double>) to
a function without hidden copying will be great.

------
polskibus
Great to see span taking final shape!

As for the future, I wish c# had const refs working like in c++ instead of
readonly and a plethora of ireadonly collections! It could optimize away many
things on the CLR level too. I guess it's not doable without breaking some
things, but I think it is worth it.

~~~
benaadams
readonly ref/in params in C#7.2?

https://github.com/dotnet/csharplang/blob/master/proposals/csharp-7.2/readonly-ref.md

~~~
polskibus
It seems like that's a step in the right direction, although as I can see
implementation and specification are not finalized so it is hard to say
whether any compromises will have to be made when it is released.

What I really want is to have some basic code contracts (like pure methods)
embedded in the language in as frictionless way as possible. It eliminates
need for many simple unit tests, null checks, etc.

I'm not sure if this feature will be only for structs or also for classes - I
hope both will work.

Although I switched to C# 7 years ago from C++, and I like it, I have to say
that C++ syntax seems more readable to me - const T& seems more readable to me
than ref readonly T.

Another thing that might be missing here is propagating immutability
information down to JIT. AFAIK, codegen could optimize code better if
immutability is guaranteed (at least that's part of the rationale of using
const & in C++). There is PureAttribute in System.Diagnostics.Contracts,
however CodeContracts seem to be mostly dead in terms of compiler and VS using
information from them for static analysis.

I wish this ref readonly feature could basically render all IReadOnly*
interfaces useless, so that collections would have proper "ref readonly"
interfaces. I don't see that in proposal - I wonder if there is a way I could
contribute with that remark? I think it makes a good reason that's missing
from the rationale on that github page.

Lastly, I wish there was more discussion about how C++ or other languages with
stronger checks do it. Pretty sure there's a huge opportunity to learn from
others' mistakes :).

As a side note - this is what makes HN great - I can post a comment and get
response from Ben Adams himself! One of the greatest .NET Core contributors
who made it really fast! Thank you for that link :)

~~~
benaadams
Is finalized, better doc might be "What's new in C# 7.2" and section
"Reference semantics with value types"
https://docs.microsoft.com/en-us/dotnet/csharp/whats-new/csharp-7-2

The `in` modifier on parameters, to specify that an argument is passed by
reference but not modified by the called method.

The `ref readonly` modifier on method returns, to indicate that a method
returns its value by reference but doesn't allow writes to that object.

The `readonly struct` declaration, to indicate that a struct is immutable and
should be passed as an `in` parameter to its member methods.

The `ref struct` declaration, to indicate that a struct type accesses managed
memory directly and must always be stack allocated.

ReadOnlySpan also gives immutability; though will only work over contiguous
memory rather than a more general collection type; for which you'd still need
to use a IReadOnly* interface either as param or generic constraint.
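
A small sketch of three of these modifiers together, assuming C# 7.2 or later. `Vec4` and `RefSemanticsDemo` are hypothetical types invented for the example (a 32-byte struct of four doubles), not anything from the framework.

```csharp
using System;

// Hypothetical 32-byte value type, used only for illustration.
public readonly struct Vec4
{
    public readonly double X, Y, Z, W;

    public Vec4(double x, double y, double z, double w)
    {
        X = x; Y = y; Z = z; W = w;
    }
}

public static class RefSemanticsDemo
{
    private static readonly Vec4 origin = new Vec4(0, 0, 0, 0);

    // `ref readonly` return: callers get a reference rather than a copy,
    // and cannot write through it.
    public static ref readonly Vec4 Origin => ref origin;

    // `in` parameters: passed by reference, and the callee cannot assign to them.
    public static double Dot(in Vec4 a, in Vec4 b)
    {
        // a = b;   // compile error: `a` is a read-only reference
        return a.X * b.X + a.Y * b.Y + a.Z * b.Z + a.W * b.W;
    }

    static void Main()
    {
        var v = new Vec4(1, 2, 3, 4);
        Console.WriteLine(Dot(in v, in v));   // prints 30
        Console.WriteLine(Dot(v, Origin));    // prints 0
    }
}
```

Because `Vec4` is declared `readonly struct`, the compiler should not need defensive copies when its members are read through an `in` parameter.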

~~~
polskibus
Thank you for additional clarification, I can see that some of related issues
like
[https://github.com/dotnet/csharplang/issues/38](https://github.com/dotnet/csharplang/issues/38)
are not yet closed too, so it may cause additional confusion about its status.

I think that a C++'s const& equivalent would be very useful in C#, especially
to simplify making objects and value types immutable by default and also more
performant. I created
[https://github.com/dotnet/csharplang/issues/1118](https://github.com/dotnet/csharplang/issues/1118)
as a starter point for discussion, although I'm not sure if linking it to "ref
readonly" that's available for value types is the right association to make in
terms of more general C# roadmap and design.

------
WalterBright
D has had slicing since 2001 or so. I proposed them for C a while back:

[https://digitalmars.com/articles/b44.html](https://digitalmars.com/articles/b44.html)

Slicing is an incredibly useful feature.

~~~
sjmulder
I see the usefulness of spans, but I'm not sure if they fit well with the C
model. Consider these cases:

    
    
        void foo1(char *p, size_t sz);
        void foo2(char p[2]);
    

If I understand correctly, your proposal applies to the first case. The two
arguments could be reduced to one, which is currently not possible for the
general case without generic or template structs, but that alone would in my
view not be worth introducing fat data types to the standard. An
implementation doesn't need the new type either, it can already bounds check
dereferences of the pointer against the lvalue it was created from[1].

For the second case, I agree that would be nice but it can be worked around by
using a struct.

[1] "recalling that pointers are anchored to the lvalue from which they were
created" Kell, Some Were Meant for C, 2017
([https://sjmulder.nl/dl/pdf/2017%20-%20Kell%20-%20Some%20Were...](https://sjmulder.nl/dl/pdf/2017%20-%20Kell%20-%20Some%20Were%20Meant%20for%20C.pdf))

~~~
WalterBright
> which is currently not possible for the general case

Not sure what you're thinking, since it does not require templates or
generics, and works for the general case. I know it does, because I
implemented it for D without generics or templates.

> it can be worked around by using a struct

Yes, but nobody does since it is clumsy, hence all the array bounds overflows
in C. Even tiny bits of syntactic sugar, which is what this is, can have
dramatic and transformative effects.

> [1]

That inescapably makes all pointers fat pointers, which is far more overhead
than what I proposed. My proposal does not have any unavoidable overhead.

--

As for implementation experience, the D community finds they work extremely
well. It's much more than an "it would be nice" feature.

~~~
sjmulder
> Not sure what you're thinking, since it does not require templates or
> generics, and works for the general case. I know it does, because I
> implemented it for D without generics or templates.

This was in favour of your proposal, about how it isn't currently possible to
implement this construct generically as a user of the language, hence, it
would need to be implemented in the language itself.

I think I do see the point you're trying to make. However:

> Even tiny bits of syntactic sugar, which is what this is, can have dramatic
> and transformative effects.

This is true, but its use would come mainly from bounds checking at compile
time and runtime (otherwise it's just the struct) which is hardly even done
for regular arrays.

~~~
WalterBright
> its use would come mainly from bounds checking

That is where the memory safety comes from. But the use comes from things
like:

1. strlen, strcat, etc., become obsolete, replaced by far more efficient code

2. looping over an array contents becomes straightforward

3. documenting the extra variable holding the dimension of the array becomes
unnecessary

I.e. a host of benefits.

------
c-smile
Yet another incarnation of slices in D :
[https://dlang.org/spec/arrays.html#slicing](https://dlang.org/spec/arrays.html#slicing)
in a store near you this time.

Yet string_view, vector_view and other range<T>'s in C++. Yet Pythonic slices.
Yet Rust.

~~~
WaxProlix
I don't understand your comment, could you clarify a bit? Are you saying that
Python's slice operator is a copy of D's? Or that there's some sort of
contiguous memory guarantee with some sort of Sliceable in D? In Python?

------
legohead
Starting a new job and C# will be a part of it eventually. Reading through
this is really confusing -- can anyone recommend me a good book that is
relatively up to date? I am good with C and several other languages, so don't
need anything that is too newbie-friendly.

~~~
megaman22
Jon Skeet's C# in Depth is pretty solid. The fourth edition isn't done yet -
and at the rate of change lately may be out of date by the time it is
released, but it goes into some great detail on the guts of the language, and
how it has changed over time.

https://www.manning.com/books/c-sharp-in-depth-fourth-edition

~~~
tonyedgecombe
I think that book is more useful once you have been using C# for a while, I
wouldn't recommend it as a starting point.

------
zoom6628
This will make writing IoT apps easier, as one often has to scan/process
contiguous buffers. Using string was slow and onerous - this could be cool,
and combined with .NET Core it starts to make C# a contender in the device
space and other low-level uses.

------
verdex_phone
This is pretty cool. I've seen a surprising number of things hidden in C# that
can affect performance, for example stack allocation. Anyway, I'm looking
forward to the tools, frameworks, and abstractions that come from these
features.

------
nimish
Glad to see something out of Midori becoming widespread

~~~
pjmlp
Yes. .NET Native and async/await were also influenced by it.

------
fulafel
They might call the example something other than "SafeSum" as it produces
wrong answers on overflow :)

~~~
int_19h
Not if you compile with -checked+ - then it throws. Although the right way to
do this is to use a checked-block:

    
    
       checked { sum += bytes[i]; }
    

It's one obscure C# feature that I rarely see used, but which I find
indispensable, and really wish more languages adopted it. Integer overflow
vulnerabilities have become more prominent in the past few years, so perhaps
there will be some uptake. Interestingly, C# had this feature since the very
first release back in 2001.
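
A minimal sketch contrasting the two modes; `CheckedDemo`, `WrappingAdd`, and `CheckedAdd` are hypothetical names for illustration. The explicit `unchecked` block is there so the first method wraps even if the project is compiled with `-checked+`.

```csharp
using System;

static class CheckedDemo
{
    public static int WrappingAdd(int a, int b)
    {
        unchecked { return a + b; }   // silently wraps around on overflow
    }

    public static int CheckedAdd(int a, int b)
    {
        checked { return a + b; }     // throws OverflowException on overflow
    }

    static void Main()
    {
        Console.WriteLine(WrappingAdd(int.MaxValue, 1));   // prints -2147483648

        try
        {
            CheckedAdd(int.MaxValue, 1);
        }
        catch (OverflowException)
        {
            Console.WriteLine("overflow detected");
        }
    }
}
```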

~~~
wtetzner
It would be even better if checked was the default, and you had to use an
unchecked block to get unchecked arithmetic.

~~~
benaadams
Can set it at the project level; then you need to use unchecked instead:

https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/checked-and-unchecked

------
manojlds
Can someone update the link to
[https://github.com/dotnet/corefxlab/blob/49982d193b4853002a9...](https://github.com/dotnet/corefxlab/blob/49982d193b4853002a960770166713da858ef8da/docs/specs/span.md)

------
chiph
Glad to see that they'll help devs with the pinned memory issue (for buffers
used with IO Completion Ports). But had to look up what "RIO Sockets" were -
turns out the API helps you pre-register your buffers. Something you had to
"just know" to do before.

------
vinutheraj
Nice. Seems very similar to
https://github.com/abseil/abseil-cpp/blob/master/absl/types/span.h

------
teovoinea
Reminds me of slices in Rust.

------
alkonaut
C# 7.0 works for the most part in .NET 4.5 or even 4.0

Here it seems there is a dependency on the 4.7 runtime, is this the case?
Where can I see a matrix of what C# versions work on which (full/desktop)
runtimes?

~~~
kevingadd
Span definitely requires VM updates (Mono has been picking up commits to
support it) so that feature at least is not going to work on 4.5/4.0 without a
polyfill.

~~~
benaadams
You can use the NuGet package for downlevel support
[https://www.nuget.org/packages/System.Memory/](https://www.nuget.org/packages/System.Memory/)

The runtime changes make it faster, smaller and include Jit optimizations for
it (bounds check elimination etc)

------
sharpercoder
I read the blogposts, I have 10 years of C# under my belt (with very scarce
unsafe usage). Why is this important for me? What problems does this feature
solve?

~~~
mlazos
The fact that it can point to unmanaged memory is a benefit for me. Just last
week I’ve been porting some performance critical code to C++/CLI and this will
allow me to use a span of objects instead of having to pin arrays. One other
application is that it is now possible to implement a memory pool using a
contiguous piece of memory and only returning slices of it; previously you
would have to have multiple instances of a fixed size array in the pool.
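
A very rough sketch of the pool idea, assuming C# 7.2 or later. `SlicePool`, `Rent`, and `Reset` are hypothetical names, and this is a simple bump allocator with no per-slice return, not a production design. Since a `Span<T>` cannot be stored in a heap object, a real pool that tracks outstanding rentals would probably hand out `Memory<T>` instead.

```csharp
using System;

// Hypothetical bump-style pool: one contiguous array handed out in slices.
sealed class SlicePool
{
    private readonly byte[] buffer;
    private int next;

    public SlicePool(int capacity)
    {
        buffer = new byte[capacity];
    }

    // Returns a view into the shared buffer; no per-rental array is allocated.
    public Span<byte> Rent(int length)
    {
        if (length < 0 || length > buffer.Length - next)
            throw new InvalidOperationException("pool exhausted");

        var slice = new Span<byte>(buffer, next, length);
        next += length;
        return slice;
    }

    // Makes the whole buffer available again; earlier slices must no longer be used.
    public void Reset()
    {
        next = 0;
    }
}

static class PoolDemo
{
    static void Main()
    {
        var pool = new SlicePool(1024);
        Span<byte> a = pool.Rent(16);
        Span<byte> b = pool.Rent(16);
        a.Fill(0xAA);
        b.Fill(0xBB);
        Console.WriteLine(a[15]);   // prints 170
        Console.WriteLine(b[0]);    // prints 187
    }
}
```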

------
bhouston
I wrote an incredibly similar class for protected memory accesses for c++ back
in 2008. Led to beautiful code and in debug mode it could verify accesses.
Subsets, copies, type conversions etc were beautiful. Also combined it with a
shared pointer class as well. Probably should open source those...

------
kuschku
I’m not sure if I’m missing something, but isn’t this basically just the same
as Java 7’s Buffer, ByteBuffer, etc?

These, too, support the same types of backing memory (backed by native memory,
stack-allocated memory, or a Java array), same access pattern, etc, and are
also used by many third-party libraries now for such stuff (including many
networking libraries, and graphics libraries such as LWJGL3)

What’s so innovative about these?

~~~
zokier
> I’m not sure if I’m missing something, but isn’t this basically just the
> same as Java 7’s Buffer, ByteBuffer, etc?

Aren't those Java types heap-allocated and GC managed? I think the big deal is
that Span can be allocated anywhere (native, stack, heap).

~~~
kuschku
Nope, the wrapper is, but a ByteBuffer also allows different allocation
strategies, which is how different libraries use them.

For example, LWJGL3 has a separate stack where it allocates them.

~~~
pharrington
There is no way to use stack allocated memory in Java.

LWJGL's MemoryStack _emulates_ a C style stack by preallocating a set amount
of off-JVM-heap memory and decrementing/incrementing an index into it.

~~~
pjmlp
> There is no way to use stack allocated memory in Java.

Not directly, but most JIT compilers do it.

It is tricky and fragile, but one can make use of the JIT logs and try to re-
write the code so that it takes the right decision regarding escape analysis.

As for an actual support at language level, we need to wait for the outcome of
projects Valhalla and Panama.

