Is C# a low-level language? (mattwarren.org)
167 points by benaadams 18 days ago | 71 comments



If you really want to, you can program in a subset of C# that is similar to C. You can avoid the garbage collector by using a native memory allocator and then manipulate pointers with the usual * and -> operators.

While you probably don't want to write a whole program like that, sometimes it is helpful to use these features when porting code from C or for performance reasons. See for example this implementation of memmove that is used for copying strings and other things:

https://github.com/dotnet/coreclr/blob/7ddd038a33977b152e856...


Uh, wow.

    // Managed code is currently faster than glibc unoptimized memmove
    // TODO-ARM64-UNIX-OPT revisit when glibc optimized memmove is in Linux distros
    // https://github.com/dotnet/coreclr/issues/13844


This is basically what I do. We don't have real-time constraints, but we do have a limit on how long processing can take, so we need to shave as many milliseconds off our execution time as possible. We pretty much use just the bare minimum features of the language and profile the crap out of everything.

One could argue that C# wasn't the best choice, but that was out of my hands.


Are you able to share any examples of functions where writing C# like that gave measurable benefits? I'm over in web-land and it's rare that C# itself is the bottleneck.


One example from my line of work that could conceivably translate to web-land would be parsing numbers. I wrote a version of double.Parse() with fewer features for an order of magnitude speedup. The program was doing data analysis on a bunch of CSV files, and just parsing the files was way slower than EBS I/O.
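To make the "fewer features" idea concrete, here is a minimal sketch of a reduced number parser. It is written in C++ rather than C#, it is not the commenter's code, and the names `parse_simple_double` and `is_digit` are made up for illustration. It assumes the input only ever looks like `[-]digits[.digits]`, and it makes no attempt at the correctly rounded results a full library parser gives for long inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true for ASCII '0'..'9' only.
bool is_digit(char c) { return c >= '0' && c <= '9'; }

// Hypothetical reduced parser: accepts only [-]digits[.digits].
// No exponents, whitespace, NaN/Infinity, locale handling, or overflow checks.
// Returns false on anything it does not understand and leaves `out` untouched.
bool parse_simple_double(const std::string& text, double& out) {
    std::size_t i = 0;
    bool negative = false;
    if (i < text.size() && text[i] == '-') {
        negative = true;
        ++i;
    }

    double value = 0.0;
    std::size_t digits = 0;

    // Integer part.
    for (; i < text.size() && is_digit(text[i]); ++i, ++digits) {
        value = value * 10.0 + (text[i] - '0');
    }

    // Optional fractional part: accumulate the digits, divide once at the end.
    if (i < text.size() && text[i] == '.') {
        ++i;
        double fraction = 0.0;
        double scale = 1.0;
        for (; i < text.size() && is_digit(text[i]); ++i, ++digits) {
            fraction = fraction * 10.0 + (text[i] - '0');
            scale *= 10.0;
        }
        value += fraction / scale;
    }

    if (digits == 0 || i != text.size()) {
        return false;  // empty input, a lone "-" or ".", or trailing junk
    }
    out = negative ? -value : value;
    return true;
}

// Usage example.
void parse_example() {
    double v = 0.0;
    assert(parse_simple_double("12.5", v) && v == 12.5);
    assert(!parse_simple_double("1e5", v));  // exponents are deliberately unsupported
}
```

Whether a trade like this is acceptable depends entirely on what the input files can contain.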


You mean, something like this?

https://blogs.unity3d.com/2019/02/26/on-dots-c-c/

TLDR: arguably the most popular game engine of our time has already implemented exactly that (although it is still at an experimental and unstable stage).


Not quite. This has been possible in C# for 15 years with AllocHGlobal. It's handy for avoiding the speed and especially memory[1] overhead of managed objects.

1. https://codeblog.jonskeet.uk/2011/04/05/of-memory-and-string...


For those saying a GC is too high level, malloc is also a high level abstraction. There are relatively simple implementations of malloc, but allocators like jemalloc are starting to look quite a lot like a GC in the ways they manage memory.

The real issue with GCs is the lack of control that makes it very difficult if not impossible to do things like grouping allocations together. However, this capability comes at a cost: you then can’t move existing allocations around without going to extreme lengths, which can easily lead to memory fragmentation.


C# is managed code, so no, it’s not low-level by any reasonable definition of the term and calling it so would be confusing to anyone getting started in this stuff.


C# _targeting the .Net runtime_ is managed code. There are subsets, including Unity's new HPC#, that compile directly to assembly.

I think it's important to separate the language from the runtime so people come out less confused versus conflating them to cover the 90% use case.


Pedantically asserting that it's possible to use a subset of a language and feed it through a special compiler to write low level code is not the way to make it so people are less confused.

We live in an era where you can boot Linux in JavaScript. If every time you make a statement about computing you have to list the exceptions, that's sinking into the Turing tarpit for documentation.


Normally I'd agree with you, especially when concerned with languages like JavaScript. But C# is a different beast. It's being used quite frequently in large gaming projects and even operating systems where it is frequently less managed. You can even use it out of the box in a more, but obviously not entirely, unmanaged way to increase performance.

I don't consider this pedantic in this case. But the line isn't hard, so we'll have some disagreement about how to teach / not teach this.


Do you have any examples of projects where it's used in a "less managed" context? Are you just talking about avoiding GC, or something more advanced?

Even when you write in a low level style using the .Net/Mono VM, you're still fully protected by the guarantees of the language and runtime. You can't chase bad pointers around through memory or overflow the stack. It's impossible to access an array without a bounds check unless the JIT can prove it's not needed. The safety this provides is night and day to what it's like coding close to the metal (I develop games in C# and have written a fair amount of C/asm).

But as you say, we can agree to disagree about that...


> You can't chase bad pointers around through memory or overflow the stack. It's impossible to access an array without a bounds check unless the JIT can prove it's not needed.

Many native languages have these features without having a managed runtime.


You can absolutely do everything you described. You can use raw pointers in C#, to the extent that almost any C code can be converted 1:1 to C#.

The memory safety goes out the window when pointers are in play; this is why we have 'unsafe' blocks.


I don't think you're giving C# enough credit in this case. We're not talking about a totally new language that uses the same syntax. This is just an AOT compile of efficient C#. The inefficient things are flagged as compiler errors instead of lint warnings.

If you used the same code in a full .NET runtime deploy it would still be very efficient and most likely vectorized as well.

Unity's Burst compiler is not nearly as mystical as their marketing team would have you believe at first glance.


People are only confused if they keep mixing up languages with implementations, which are two different things.


C# can also be compiled directly to assembly via NGEN (which only supports dynamic linking, though) and via the AOT compiler for WP 8.x, which was later replaced by .NET Native for UWP.

.NET Core 3.0 will bring in CrossGen for AOT, and you also missed Xamarin/Mono support for AOT.


This conflates the language, a particular implementation of that language, and a particular runtime that said implementation uses.

As far as the language goes, you can do anything in C# that you can do in C in terms of memory-unsafe programming - raw pointers, pointer arithmetic, unions, allocating data on the stack - it's all there, and it's not any harder to use. Since semantics are the same, it should be possible to implement it just as efficiently as in C. At that point, the only overhead that you can't remove by avoiding some language feature or another is GC, although you can avoid making any allocations through it. So the real question is whether a language with GC can be considered low-level.


Unity is also going to do some C++ -> C# conversions https://blogs.unity3d.com/2019/02/26/on-dots-c-c/


Unity has gone far beyond cross compilation of C# to C++ now. Their Burst compiler is capable of compiling C# straight to assembly with vectorization optimizations. They’ve even ported core C++ components of the engine into bursted C# because the resulting code is faster and a quarter of the size. It’s impressive stuff.


I wonder why they chose C# over other languages common for scripting and game engines, like Lua, C++ or even Java. I think C# is good and well designed (more so than Java), and I wonder if they see it the same way.


https://blogs.unity3d.com/2019/02/26/on-dots-c-c/ has a high-level overview for why not C++.

Moreover, Unity has invested heavily in the c#/.net/mono ecosystem; the editor runs on mono, and the runtimes are all either IL2CPP (an in-house translation from .NET bytecode to C++) or mono as well. It would be a truly crazy amount of work to switch, for little or no clear benefit. If one were picking a language today to write an engine in, Rust might be a reasonable competitor, but it's also fairly new and is arguably less noob-friendly. C# offers a nice ramp from new programmer to performance-critical AAA with burst.


C# was the only choice they had of a typesafe language that had a runtime portable to many platforms (Mono).

C++ is terrible for fast iteration times and is capable of hard-crashing with simple mistakes, Lua is not typesafe and the library support is weak, and Java is still not a portable language in 2019 (PS4, Xbox, Switch and even iOS).

C# has ended up being a wonderful choice to build games in, and now that the 2019 version of Unity has an incremental GC we are entering a golden age with the tool.


Unity features three scripting languages: JavaScript/UnityScript, C#, and Boo.


This is technically true, but these days using anything other than C# is strongly discouraged.


UnityScript and Boo are deprecated, and most people strongly discourage using them.


They most likely chose C# because of Mono: they originally targeted Macs, and Mono compiled to that. Ultimately it was the best of the three languages they chose to support (JavaScript and Boo being the others).


I remember back when they called C a high level language- because it wasn’t assembler. Yes, I’m old. But, my point is that there probably isn’t a definitive answer to that question that won’t change with changing technology.


Speaking as another old person (50+): yes, I too remember when C started becoming in vogue for apps (shortly before VB4), and it was considered high-level by a lot of folks. Even in the COBOL enterprise system I worked on at the time, C was considered more high-level than low-level, although we used it to build a runtime for COBOL apps. I have seen this type of discussion go on for decades, literally, but to my simple mind low-level means at the machine instruction set level. Everything above that is something else, and I think the label of high or low level is frankly both arbitrary and meaningless. Having spent time in recent years working with C on Arduino and Python for apps/hacks, and having dabbled with Pascal (several variants of it), the only term I can think of to meaningfully describe any language is 'fit for purpose'. If your purpose is to do ray-tracing from an environment that needs to be supportive of programming noobs, then maybe C# with tweaks is the best fit. If your environment requires burning programs directly into silicon, then maybe Verilog is the best fit for purpose.

It's not about what you like, what you know, what is trendy, what is top of the pops this week. It's about getting shit done, and that happens when you use the right tools. The tech industry, like every other industry, ultimately involves people, and people are subject to fashion, peeves, and emotional attachment to their investments (i.e. the languages you know well). A comment on C# vs C++ seems to start wars just as religious as vi vs emacs, or as the ones we had back in the 80s over COBOL vs RPG3!

End of rant.


Well, C already "adds a lot of value". For example, the mere act of defaulting to a calling convention is already a commitment. The default that gets generated will not always be locally optimal. No matter how much the compiler rethinks its default choices ("optimization"), you will still end up with sequences of assembly instructions that will be considered stupid on closer inspection. Hence, you already unavoidably pay a price there. In the real world, nothing comes for free!


Some people say that C isn't low-level (though I disagree). I personally don't think of C# as even remotely low-level.


The author takes this into consideration:

  * yes, I know ‘low-level’ is a subjective term

I used to work in the embedded world, and there C has historically been considered a "high-level" language for many platforms. However, we have started to see languages like C# and Python make their way into the world of tiny micros thanks to more power coming to the platforms themselves and the clever work being done by MicroPython and Microsoft.


The problem is that most legacy code (especially firmware) is C and will never have a chance to be rewritten in another language. If you get a chance to work in the VLSI field, you will be amazed that they still rely on Tcl to do complicated stuff. Only part of the world's software (mostly at IT companies) can afford to be rewritten in a new language regardless of costs/upstreams.


I think that’s the point of this article. No, C# isn’t remotely a low-level language, yet it has all these features that allow it to drop pretty low when necessary.


>Some people say that C isn't low-level

Then again some people will say anything. It's how things are used in practice that matters.


I think the article is pointing out that C# offers many of the benefits of low level languages like C/C++, including some control over memory layout, passing by ref, calling conventions with P/invoke, and soon, SIMD intrinsics.


I tend to think being able to locate, describe and manipulate objects in native memory tends to define a low-level language, as does the ability to control latency. These are the reasons most garbage-collected languages aren't low level.

Garbage collection traditionally means you can't meet hard latency contracts. And by definition you have no control over where garbage-collected objects are in memory.


You can actually do some of that by pinning objects in unsafe blocks along with using the FieldOffset attribute.

This is where the term "low level" is a bit ambiguous and relative. If you're trying to bit bang some serial protocol on a microcontroller with tight timings, C# and the GC are going to give you a hard time. But even C might give you a hard time here. I've had to drop down to inline assembly because it's much easier to count the cycles it'll take and insert a few nops where necessary to get the timing right.

Then, if you're on a non-realtime OS like Windows or Linux, no language is going to give you hard latency constraints.

Going further, it would be difficult to bit bang with tight timing constraints on x86 altogether due to speculative execution.

So from that perspective, C, x86, and modern operating systems might all be too high level to achieve your goal.

But for other goals, you might have a different answer. One interesting example was Xbox 360 games.

Game developers have traditionally used C++, because they have performance and latency constraints: if you're targeting 60 fps, you don't want any frames taking more than 16.6 ms. Indie developers had to use C# to make games for the 360. Unfortunately, the .Net Compact Framework it used had an old, slow, non-generational GC, which would cause hitches. So games would do one of two things to avoid hitches. Either they'd design their games to make zero allocations per frame, or they'd design their memory layout to be very simple so the GC would run fast. Structs are very helpful here because they don't generally allocate on the heap.
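The "zero allocations per frame" approach can be sketched as a fixed-capacity pool of plain structs that is reserved once and reused. This is only an illustration under assumptions: it is written in C++ rather than C#, and `Particle` and `ParticlePool` are hypothetical names. In C# the same shape would typically be an array of structs allocated at load time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical plain-data particle; no pointers, nothing to collect.
struct Particle {
    float x = 0.0f, y = 0.0f;
    float vx = 0.0f, vy = 0.0f;
    bool alive = false;
};

// Hypothetical fixed-capacity pool: all storage is reserved up front, so
// spawning and updating during a frame never allocates.
template <std::size_t Capacity>
class ParticlePool {
public:
    // Reuses the first dead slot; returns nullptr when the pool is full.
    Particle* spawn(float x, float y, float vx, float vy) {
        for (Particle& p : slots_) {
            if (!p.alive) {
                p.x = x;
                p.y = y;
                p.vx = vx;
                p.vy = vy;
                p.alive = true;
                return &p;
            }
        }
        return nullptr;
    }

    // Advances every live particle by dt seconds.
    void update(float dt) {
        for (Particle& p : slots_) {
            if (p.alive) {
                p.x += p.vx * dt;
                p.y += p.vy * dt;
            }
        }
    }

    // Counts the slots currently in use.
    std::size_t live_count() const {
        std::size_t n = 0;
        for (const Particle& p : slots_) {
            if (p.alive) ++n;
        }
        return n;
    }

private:
    std::array<Particle, Capacity> slots_{};
};

// Usage example.
void pool_example() {
    ParticlePool<2> pool;
    Particle* a = pool.spawn(0.0f, 0.0f, 2.0f, 0.0f);
    assert(a != nullptr);
    pool.update(0.5f);
    assert(a->x == 1.0f);
}
```

The linear scan for a free slot keeps the sketch short; a real pool would more likely keep a free list.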

So was C# low level enough for game development on the Xbox 360? I'd say mostly. It gave you some hassle with the GC, but ultimately you had enough control over performance and latency to make it work.


I think there's a drift over time.

If you're inputting hand-assembled hex, then an assembler is high level. If you're writing ASM, C is high level.

C is now thought of as low level; I'm sure many formerly 'high' level languages will become low level also.


It can be considered low-level as long as you consider the CLR the target machine. In the same way, Java is equally low-level for the JVM and Julia is for LLVM. But this is not strictly correct, as the final target is still the physical machine (x86, ARM or whatever), and neither C# nor Java is low-level relative to it.


Slightly to the side of the topic, but this quote here:

>Yep, Bob uses i++ instead of ++i everywhere. Meh, what’s the difference.

I know what the difference is, but if you're just using it on its own line or in a for loop, why would you use `++i` instead of `i++`?


Because you should write what you mean, not just assume the compiler will convert what you wrote into what you mean.

    i++

means

    var temp = i;
    i = i + 1;
    return temp;

whereas

    ++i

means

    i = i + 1;
    return i;

So if you actually mean to create a temp, use i++; if you don't, use ++i.

See this example of actually implementing them:

https://stackoverflow.com/a/3846374
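For a user-defined type the same distinction shows up directly in the operator signatures. A minimal C++ sketch follows, with `Counter` as a hypothetical type that exists only for illustration. For a plain `int` used on its own line, a compiler will normally emit identical code for both forms.

```cpp
#include <cassert>

// Hypothetical counter type used only for illustration.
struct Counter {
    int value = 0;

    // Prefix form (++c): increment, then return the updated object by reference.
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Postfix form (c++): the unused int parameter marks it as postfix.
    // It has to copy the old state first so it can hand it back afterwards.
    Counter operator++(int) {
        Counter old = *this;  // the "temp" from the pseudocode above
        ++value;
        return old;
    }
};

// Usage example.
void increment_example() {
    Counter c;
    Counter before = c++;  // returns a copy holding the old value
    assert(before.value == 0 && c.value == 1);
    Counter& same = ++c;   // returns the object itself, already incremented
    assert(same.value == 2 && &same == &c);
}
```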


Why didn't you ask the question the other way around? :-)

For some inexplicable reason the postfix operators are usually taught and learnt first, and thus considered more natural, even though there are more objective reasons for the opposite, at least in C++ world.


> For some inexplicable reason the postfix operators are usually taught and learnt first, and thus considered more natural

The (unary) postfix operators might have semantics that are more surprising than the prefix variants, but the postfix syntax itself feels a bit more natural and fitting with other syntax of the Algol language family, IMHO, when compared to the prefix syntax. I think that's what leads people to introduce it first.

You can sort of read `i++` as an "abbreviation" of `i += 1`—in both cases, you've got an operator that follows an lvalue variable, and which both reads from and mutates the memory referred to by that variable. The only difference is that one has an additional "argument"—the amount to mutate the lvalue by—while the other has a "default" argument of +1.

`++i`, on the other hand, is a surprising bit of syntax, the first time you see it. It takes (and mutates) an lvalue... on its right! None of the (few) other unary prefix operators in Algol-like languages take lvalues, so it's kind of a shock.

(Also, for newcomers to programming, unary - [numeric negation] is commonly interpreted as part of the grammar of a number literal rather than being an expression; and unary ! is commonly seen as just "part of the language" (i.e. less like & and |; more like && and ||). Unary ~ is the only one I'd expect it would occur to any new programmer as even being a "unary prefix operator" rather than just "the way the language is." And it's a pure-functional transformation of its input.)


I too prefer prefix operators due to simpler/more intuitive semantics.

But postfix operators have simpler/more intuitive presentation.

++i and --i resemble + +i and - -i

Whereas i++ and i-- resemble abbreviations of i += 1 and i -= 1


> even though there are more objective reasons for the opposite, at least in C++ world.

That's my question. What are those reasons?


Google style guide has me sort of leaning to always using ++i:

https://google.github.io/styleguide/cppguide.html#Preincreme...


Habit?


This has a click-baity title. A mostly math/calculation-heavy C++ program is translated into a mostly math/calculation-heavy C# program. You could replace either side with any language and get the same outcome. At best, the end of the article lists some hacks you _could_ write in C# to get what you consider "low-level", but they ultimately defeat the purpose of using a "high-level" language in the first place. If you're using unsafe calls and platform intrinsics, it's more than a one-off situation, and the language doesn't encourage it, then maybe you're trying to hammer a nail with a shoe.


If they replaced C# with a language like ruby I suspect it would not get within the same order of magnitude of performance.

I agree that it may not be the best choice, but it is still interesting to see what is possible.


Sure, but I could write a computer for some new flavor of assembly tonight. The output would not be performant, but the language would certainly be low level.


Or python.


Yeah, agreed, the real problem does not occur in mere math calculations.

The problem occurs when you liberally allocate a dynamically-sized list of dynamically-sized lists (possibly of a union of different data types). That situation will always throw up an engineering trade-off between memory size, algorithm performance, and code complexity.

The solution will usually consist of severely constraining and restricting runaway flexibility in the data structure being managed, simply because doing so won't badly affect real-user use cases. Once in a while the strategy will exact a serious and real price anyway, but hey, you cannot please everybody.


People jump to "clickbait" way too quickly these days; the title is clearly tongue-in-cheek.


So you're saying JavaScript isn't a low level language because it has operators and function calls?


Yes. In the original usage, anything but assembly is high level, and anything that has automatic memory management is "very high". Words change in our field faster than in most, and those were just terms in the glossaries of textbooks that have since been made into other textbooks.


It was a joke. I agree with the GP.


As soon as you default to a particular policy of how to allocate, and most importantly free, memory on the heap, you have committed yourself. Low level means no commitment in that realm, aka do what you please. The fact that you can possibly overrule that default does not make much of a difference, because almost any language allows this.

C does not even have a standard (dynamically-sized) list concept built in, because that would amount to committing oneself to a default heap allocation/deallocation policy. All you can get is a contiguous range of bytes with some vague, default interpretation as to what it is supposed to be, through the use of malloc/free (possibly hacking it through realloc). That is why C is considered low level.

Still, in a true low-level solution, you would use the sbrk system call directly. So, in a sense, C may already "add too much value" to be considered truly low level.


You can easily call sbrk (or mmap) from C directly.
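As a rough illustration of going around malloc, the sketch below asks the kernel for anonymous pages with mmap and hands them back with munmap. It assumes a POSIX system (Linux, a BSD, or macOS); `os_alloc` and `os_free` are hypothetical helper names. It is written as C++, though the two system calls are used the same way from C.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>  // POSIX only: mmap, munmap

// Hypothetical helper: requests zero-filled, private, anonymous memory straight
// from the kernel, bypassing malloc. Returns nullptr on failure.
void* os_alloc(std::size_t bytes) {
    assert(bytes > 0);
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Hypothetical helper: returns a region obtained from os_alloc to the kernel.
// The caller has to remember the size, since there is no allocator bookkeeping.
void os_free(void* p, std::size_t bytes) {
    if (p != nullptr) {
        munmap(p, bytes);
    }
}

// Usage example.
void os_alloc_example() {
    const std::size_t size = 4096;
    unsigned char* page = static_cast<unsigned char*>(os_alloc(size));
    assert(page != nullptr);
    assert(page[0] == 0);  // anonymous mappings start out zero-filled
    page[0] = 42;
    assert(page[0] == 42);
    os_free(page, size);
}
```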


Yes, exactly. You can even do that from any arbitrary scripting engine that supports executing arbitrary C functions.

For example, here, complete access to the linux/bsd/osx system call interface, including sbrk, from lua: https://github.com/justincormack/ljsyscall

Just load libffi (https://sourceware.org/libffi) or a similar module in any arbitrary language, and off you go!


>That is why C is considered low level.

C isn't a low-level language https://queue.acm.org/detail.cfm?id=3212479


It's subjective, but recent language and runtime updates have brought massive improvements in performance and flexibility. There are projects like Unity, the RavenDB document store, the FASTER KV store, Trill analytics, ML.NET and many more that show that C# is incredibly capable of competing in low-level/high-performance scenarios.


Why was the conditional expression in the C++ code converted to an if statement in the C# code? C# has conditional expressions.


It's been a while since I've used C#, but I'd guess it's because the C++ is doing a "+=" with ",", to cram two statements in there, while C# considers "+=" a toplevel statement only and doesn't allow it to be used with ",".


Note the use of the comma operator in the C++ version. C# has no equivalent, hence the introduction of 'temp'. The first part is just adding to o.x, the bit after the comma is the result of the expression, which is what is passed to min. The += part would not evaluate to a value. It's ugly.
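A small C++ sketch of the pattern being described: this is not the article's code, and `Vec`, `with_comma`, and `with_temp` are hypothetical names. The left operand of the comma runs only for its side effect, and the right operand becomes the value of the parenthesized expression, so a translation without the comma operator has to split the statement and introduce a temporary.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the vector type being updated.
struct Vec {
    float x = 0.0f;
};

// Comma-operator form: `o.x += step` runs first for its side effect, then
// `o.x * 2.0f` is evaluated and becomes the argument passed to std::min.
float with_comma(Vec& o, float step, float limit) {
    return std::min((o.x += step, o.x * 2.0f), limit);
}

// Same behavior without the comma operator: a separate statement plus a temp.
float with_temp(Vec& o, float step, float limit) {
    o.x += step;
    float temp = o.x * 2.0f;
    return std::min(temp, limit);
}

// Usage example.
void comma_example() {
    Vec a;
    Vec b;
    assert(with_comma(a, 1.5f, 10.0f) == 3.0f);
    assert(with_temp(b, 1.5f, 10.0f) == 3.0f);
    assert(a.x == b.x);
}
```

The extra parentheses matter: without them the comma would separate two arguments to `std::min` instead of acting as an operator.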


I would say no, but it's all about lingo: low-level languages, or systems languages, can be used to bootstrap a computer.


Can't you just use P/Invoke? It is easy enough to do. Just design a good interface on the C side to pass the data.


[flagged]


Go uses a garbage collector, so I wouldn't classify it as a low-level language, since you have little control over GC pauses and the latency it creates.


Symbolics and Astrobe seem to have seen it differently.

Astrobe is still in business.


Are trees tall?



