Ancient JavaScript is a safer language for integer programming than modern C (noyu.me)
3 points by TazeTSchnitzel on April 2, 2022 | 8 comments



The more things change, the more they stay the same.

mc4ndr3 commented: "... Explicit width types have better defined behavior."

TazeTSchnitzel's story link is a reference to C and C type sizes.

Type widths/sizes do not impart encoding information; they just limit how many bits are referenced in a given programming instance. (Using 32 bits means you don't have to chain four 8-bit groups together.)

JavaScript still has the same type-size issues C has; it's just that instead of integer and (void), they're disguised in a visual package called Unicode.

https://seriot.ch/resources/talks_papers/20141106_asfws_unic...

Simple example:

Historically, you just had to worry about confusing the letter/digit pairs o/0 and l/1 (it was simple to just check the ASCII, EBCDIC, or other numerical value of the encoded character used).

Unicode raises that to a whole new level.

What's the collation order for a printed Unicode character that's defined by 4 different code points?

What's the collation order for a printed Unicode character that can be defined 6 different ways, using one or more different code points?

Does it really matter if it's UTF-16, UTF-32, or UTF-64?


My tl;dr of the article:

1. JavaScript's numbers behave better than C's! (All the behavior the author is praising is because the numbers are IEEE floating point numbers and for most of JavaScript's history there weren't integer types. This was fine...until you started dealing with integers that didn't fit in the mantissa and you expected them to behave like integers; see the sketch after this list.)

2. C works on too many kinds of computers and doesn't make assumptions I like! (Of course C can't guarantee that you have word lengths that are multiples of 8 bits, because it was—and probably still is somewhere—used on machines that don't have that. It can't provide standard ways to find out the endianness of the host because that may not be available from the host, or the host may support multiple endiannesses like x86 does. Of course C can't provide for 32 bit 2's complement integers because the hardware it runs on may not be 2's complement or 32 bit. Just pretend you're running on an exotic audio focused microcontroller with 12 bit words, 1's complement integers, and with endianness defined in the microcontroller's manual.)
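A minimal sketch of the mantissa cliff from point 1 (assuming any standard JS engine; Number is an IEEE 754 double, so integers are exact only up to 2^53 - 1, and the two Number.* helpers are ES2015 additions):

    Number.MAX_SAFE_INTEGER;                     // 9007199254740991, i.e. 2^53 - 1
    Math.pow(2, 53) === Math.pow(2, 53) + 1;     // true -- the +1 is silently rounded away
    Number.isSafeInteger(Math.pow(2, 53));       // false: no longer behaves like an integer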


Explicit width types have better defined behavior.

JavaScript doesn't so much have integers as reals.

JavaScript has exceptionally weak typing, see the WTF tech talk.


> Explicit width types have better defined behavior.

Integers in old JavaScript have an explicit width: 32 bits.

> JavaScript doesn't so much have integers as reals.

There is only one Number type, but it supports integers and integer operations, including two's-complement.
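For example (a quick sketch; this is ES3 behaviour, so it holds in the "ancient" engines too):

    ~0;                    // -1: all 32 bits set, read back as two's complement
    (0x7fffffff + 1) | 0;  // -2147483648: wraps at the int32 boundary
    -1 >>> 0;              // 4294967295: the same bits reinterpreted as unsigned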

> JavaScript has exceptionally weak typing

So does C.


No, JavaScript has two built-in number types, Number and BigInt (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Data...)

The Number type represents only its value, and it has only one integer value with multiple representations. (Hence doing bit manipulations in JavaScript overrides the whole concept of using the language's representation of a number's value; e.g., is a number that's all zeros except the last bit a float, an integer, or a boolean?)
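For what it's worth, a quick sketch of how the one Number type answers that question:

    typeof 1;   // "number" -- one type covers 1, 1.0, NaN, Infinity
    1 === 1.0;  // true: same Number value, there is no separate int/float
    1 === true; // false: strict equality does not coerce
    1 == true;  // true: loose equality coerces the boolean to 1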

With BigInts, you can safely store and operate on large integers even beyond the safe integer limit for Numbers. A BigInt is not strictly equal to a Number, but it is loosely so.
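E.g. (a sketch, assuming an engine with BigInt support):

    1n === 1;                             // false: different types, no coercion
    1n == 1;                              // true: loose equality compares values
    2n ** 64n;                            // 18446744073709551616n, exact
    BigInt(Number.MAX_SAFE_INTEGER) + 2n; // 9007199254740993n, still exact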

Two's complement is a numeric encoding scheme, not a numeric value. Messing with an encoding scheme means directly manipulating bits. That's not safe to do in JavaScript because of behind-the-scenes type coercions hidden from the programmer.

****

Ah, 'weak/strong typing' is pretty vague (https://stackoverflow.com/questions/430182/is-c-strongly-typ... && https://stackoverflow.com/questions/376611/why-interpreted-l...)

Scripting (JavaScript) is dynamically typed.

Compiled C is statically typed.

C++ can infer types at compile time.

**

Was whatever interprets/runs JavaScript on a given machine compiled as a 64-bit program or a 32-bit program?

Hence, is a JavaScript Number actually 32 bits, actually 64 bits, or is a 64-bit value really a 32-bit one with BigInt extensions behind the scenes?

During the execution of a JavaScript program, is a check done to see which JavaScript version (old/new) is in use before doing 32/64-bit bit manipulations, so there are no "is it safe to use bit manipulations with BigInts" surprises? Hint: this is the reason why JavaScript can no longer safely represent integers (not integer values).


> No, JavaScript has two built-in number types, Number and BigInt

BigInt is a new feature, I am only talking about Number. Hence “old”.

In the ECMAScript 3 specification (and later versions), it's specified that the bitwise operators convert the value to a 32-bit signed integer, with two's-complement wraparound behaviour. The input (Number) and output (Number) is notionally a 64-bit float, but this doesn't mean there aren't integer operations. And in fact in any modern JS engine, integer values of this kind will be stored as integers, not floats, for efficiency's sake.
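A small sketch of that ToInt32 conversion (matching the ECMAScript 3 rules; runnable in any engine):

    (0xffffffff + 1) | 0;  // 0: 4294967296 reduced mod 2^32
    0x80000000 | 0;        // -2147483648: the top bit becomes the sign bit
    NaN | 0;               // 0: ToInt32 maps NaN and Infinity to 0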

There is an entire 32-bit C-compatible abstract machine (asm.js) built on top of this primitive!
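For instance, asm.js code marks every value with |0 so the engine can keep it as a raw int32 (a minimal sketch in the asm.js idiom; the function name is made up):

    function int32Add(a, b) {
      a = a | 0;            // asm.js-style annotation: treat a as int32
      b = b | 0;
      return (a + b) | 0;   // result stays in int32 range, wrapping like 32-bit hardware
    }
    int32Add(0x7fffffff, 1); // -2147483648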


Higher-level languages are built on lower-level languages; i.e., you have to manually do more things in a lower-level language to get the 'built-in' features of the higher-level one.

C is a system/software-engineering language, not a data-science / "I don't do low-level OS/hardware stuff" / "someone else needs to write the higher-level stuff I want to use" language. So using C as a scripting language implies implementing a scripting language in C first. (vs. https://hackernoon.com/javascript-compilation-epoch-ebfb7b5b...)

C is basically assembler language macros.

There are many types of binary encodings. A binary format consists of a sequence of 0s and 1s, in any order, usually arranged in groups of eight.

Standards define how a binary-format encoding is interpreted/used: UTF-8, UTF-16, UTF-32, ASCII, etc.

Notes on https://hikari.noyu.me/blog/2022-04-01-javascript-is-a-safer...:

"... JavaScript has exactly one integer size supported by its integer operations: 32bits" Umm... on a 64bit machine, wouldn't this rehash the x86/dos/windows 80's/90's 8/16/32 etc issues? number is 64 bits & bigint goes beyond 64bits. JavaScript can no longer safely represent integers.

Bit operations are implemented as a cast to an integral type followed by a cast back to a double representation. It is therefore not a good idea to use bit-level operations in JavaScript.
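That round trip is observable (a quick sketch):

    var y = 3.9 | 0; // ToInt32 truncates to 3, then the result is converted back
    typeof y;        // "number": you get an ordinary double again, not an int
    y === 3.0;       // true -- no separate integer type survives the operation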

The C language is not the same as the C math library! You can check for NaN via the appropriate C math library function.

C's base type is the integer. All other types are built from the integer type, so you can bit-twiddle the base type to change endianness. Is there an ASN.1 JavaScript definition written in C?

https://developer.ibm.com/articles/au-endianc/ && https://commandcenter.blogspot.com/2012/04/byte-order-fallac... So, if C code isn't endian-neutral and endianness matters, better do the byte-order check.
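For the JavaScript side of that (a hedged sketch, not from the linked articles): typed arrays expose the host byte order the same way the classic C check does, by viewing one buffer at two widths:

    // Store a known 32-bit pattern, then look at which byte comes first.
    var bytes = new Uint8Array(new Uint32Array([0x01020304]).buffer);
    var littleEndian = (bytes[0] === 0x04); // true on x86; big-endian hosts store 0x01 first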

You can explicitly define a type's width in C via bit-fields without switching to assembler.

Struct layout is not endianness! And one can explicitly specify the ordering of fields in a C struct (vs. the default of letting the compiler decide, and/or writing the equivalent assembler code).

Removing integer inodes with JavaScript is "safer" than using C? I.e., how does JavaScript implement a hardware static-electricity protection device?

High-level scripting languages are written in C. So how can the C functions that a high-level scripting language uses be slower than the same C functions the scripting language bases its own functions on?

Certainly a higher-level programming language by default takes care of details behind the scenes which you would otherwise have to explicitly write out in C (per the difference between high- and low-level languages).

C was built for speed. If you want checks/guardrails similar to what JavaScript provides, use/add/program the correct C tool/feature (i.e., compile-time and run-time checks, lint tools, additional code that someone wrote in C for the higher-level language).

Scripting languages were set up so you can focus on specific domain aspects/tasks and not on how to implement language details (the typical emphasis is ease of use and immediate execution). https://web.archive.org/web/20041010125419/http://www.doc.ic...


Using text because it's better than a binary format?

Human-readable does not mean the text format is unambiguous; i.e., people don't perform checks/validations because there's no specification of what to check/validate.

See https://seriot.ch/projects/parsing_json.html && https://news.ycombinator.com/item?id=28826600

It's just as messy as ambiguously specified ASN.1 binary formats.

Very important to know the requirements/specs for what's being implemented!



