
I would go one step further and suggest that all physical quantities should either have the units in the identifier name or encoded in the type system. Meters, seconds, milliamps, bytes, blocks, sectors, pages, rpm, kPa, etc. Also it's often useful to explicitly distinguish between different unit qualifications or references. Seconds (duration) versus seconds-since-epoch, for example. Bytes versus page-aligned bytes, for another.

Having everything work this way makes it so much easier to review code for errors. Without it, as a reviewer you must either trust that the units and conversions are correct or do some spelunking to make sure that the inputs and outputs are all in the right units.




100%. This is a baseline requirement where I work. If you don't either include the units in the identifier name, or use the type system where possible, your code is not getting merged.

The only people I've ever met that think this is unnecessary are also the same people that ship a lot of bugs for other people to fix.


> The only people I've ever met that think this is unnecessary are also the same people that ship a lot of bugs for other people to fix.

I find feelings about this depend a lot on how much ceremony the type system involves; and just how many units and references to them there are in the system.

Asking bash coders to write "sleep 5s" instead of "sleep 5" - I doubt you'd get any objections at all.

But if you're putting a foo.bar.Duration on a getDefaultTimeoutFromEnvironment on a setConnectTimeout on a HttpRequest on a HttpRequestInitializer on a NetHttpTransport on a ProviderCredential to make a simple get request? People who've come from less ceremony-heavy languages might feel less productive, despite producing 10x the lines of code.


Well, ignoring the silly stuff, this would just boil down to something like:

    request = NetHttpTransport.Request()
    request.setConnectTimeout(getDefaultTimeoutFromEnvironment().seconds)
which is verbose, but at least it's fairly clear.

And yes I'm calling a class named HttpRequestInitializer silly, I don't care if some language decided it should exist.


    request.setConnectTimeout(getDefaultTimeoutFromEnvironment().seconds)
This is actually a very good example of what not to do, with the mistake being in whoever implemented setConnectTimeout().

I actually don't know this particular API, but I'm used to timeouts being in milliseconds, so that code looks wrong to me.

Much better API, and what the article is talking about, is to change this to:

    .setConnectTimeoutMilliseconds(getDefaultTimeoutFromEnvironment().seconds)
Now the mistake is obvious, and even if the original developer doesn't notice, it will stick out in a PR or even a casual glance.


That's one of the benefits of having it exposed as a type-enforced parameter. setConnectTimeout() could take a Duration, which contains the amount and the unit, and therefore wouldn't care if consumer A provided a timeout in seconds, and consumer B provided a timeout in milliseconds.
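
For illustration, a sketch of what that could look like in Python, with timedelta standing in for Duration (the Request class and method names here are invented):

    from datetime import timedelta

    class Request:
        def set_connect_timeout(self, timeout: timedelta) -> None:
            # store one canonical representation; callers pick their own unit
            self._timeout_seconds = timeout.total_seconds()

    req = Request()
    req.set_connect_timeout(timedelta(seconds=30))         # consumer A
    req.set_connect_timeout(timedelta(milliseconds=500))   # consumer B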


Totally agree, but then I would expect the code would be:

    request.setConnectTimeout(getDefaultTimeoutFromEnvironment())
with getDefaultTimeoutFromEnvironment() returning a Duration.

Ideally this is consistent throughout the codebase, so that anything that uses a primitive type for time is explicitly labelled, and anything using a Duration can just be called "Timeout" or whatever.


This example is nice, but if you put argument metadata in the function name, you have to have one main argument, and the function name can prove cumbersome if you have 3 or 4 arguments with units, like

    .setPricePerMassInCentsPerKilogramsWithTimeoutInMilliSeconds(100,2,300)


I think you should rather do

  .setPrice(priceInCents=100, massInKg=2, timeoutInMs=300)


I'd argue there are better API patterns for this though -- keeping in mind this values code readability (and correctness) over micro-optimization:

    .setPriceInCents(100);
    .setMassInKilograms(2);
    .setTimeoutInMilliseconds(300);
or

    .calculate({
        priceInCents: 100,
        massInKilograms: 2,
        timeoutInMilliseconds: 300,
    });


Values with an intrinsic unit scale up well to many units and values. If you declare the parameters of a "calculate" function as USACurrency, Weight and Duration you can write

    calculate(100cent, 2Kg, 300s)   
    calculate(0.1dollar,4.7lb/*approximate*/,5min)
    calculate(something.price(),whatever.weight(),options.getDuration("exampleTimeout"))
    calculate(USD(0.1),Kg(2),Minute(5))


All fair points. In this example I was suggesting what this might look like at the border of the application where you need to talk to some (standard) library which doesn't use the same convention.


A class named 'HttpRequestInitializer' and taking 10 lines to set a timeout on a HTTP request isn't merely hypothetical: https://developers.google.com/api-client-library/java/google... - and that's not counting any import statements.

(although getDefaultTimeoutFromEnvironment was artistic license on my part)


True, but like I said that doesn't make it not silly. Just harder to fix.

Edit: Also, overriding a class method dynamically inside a function? I usually program python these days and even I think that's wild.


It's an interface with only one method. I suspect that most of us would just use a lambda now.


I'll never quite understand why it wasn't simply a function in the first place.


Java pre-8 only had anonymous classes, there were no lambdas


And Java lambdas are still syntax sugar for one-method anonymous classes.


This is where I diverge. You just hard-coded seconds into your test. Now your tests that cover this must take seconds to finish; thankfully you were not testing hours or days!

My last shop was a Go shop and one test I think shows this off was an SMTP server and we needed to test the timeouts at different states. The test spun up half a dozen instances of the server, got each into the right smtp state and verified timeouts in under 10ms.

The environment that set the timeout would either be "timeout_ms=500" or "timeout=500ms" (or whatever). This is where you handle that :)


Not sure I fully understood your objection, but the reason I specified seconds is because I presumed the setConnectTimeout to be part of the default HTTP library, which likely doesn't adhere to the same conventions, and that it expected seconds (which seem to be the usual for http libraries as far as I can tell).

Of course if the setConnectTimeout method was part of the same application you could just pass the timeout directly, but at the boundary of your application you're still going to have to specify at some point which unit you want.


If you're testing things with timeouts it's often a good practice to mock out the system clock anyway. That allows testing your timeouts to be nearly instantaneous and also catches edge cases like the clock not moving forward, having a low resolution, moving backward, ... deterministically.


The test could simply mock the default value to something reasonable like '.1 seconds' and test that duration instead, so I don't think this is a real problem.


This is actually revealing a different problem: the system clock as an implicit dependency. YMMV depending on support of underlying libraries, but I will typically make the clock an explicit dependency of a struct/function. In Go, usually it’s a struct field `now func() time.Time` with a default value of `time.Now`.
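
The same pattern translates elsewhere too; a rough Python sketch (names invented), where tests can inject a fake clock:

    import time

    class Poller:
        def __init__(self, now=time.monotonic):
            self._now = now  # injectable clock; tests pass a fake

        def timed_out(self, started_at: float, timeout_s: float) -> bool:
            return self._now() - started_at >= timeout_s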


Many timeout functions take seconds as a floating point, so you could time out on 0.05 seconds (50 ms). But now the code is clear and less prone to bugs.


In Go, you don't pass a float, you pass a duration


Which is unitless, hence there is no problem. `time.Duration.Seconds()` returns a floating point.


The 80/20 approach of renaming "timeout" to "timeoutSeconds" or "timeoutMillis" is also valid. The key takeaway is to not make assumptions.


> I find feelings about this depend a lot on how much ceremony the type system involves; and just how many units and references to them there are in the system.

Ideally it'd be as simple as:

typealias Dollars = BigDecimal

typealias Cents = Long

that's valid Kotlin, but the equivalent is doable in most languages nowadays (Java being a big exception).


I recommend against that, and use proper (wrapper) types.

I don't know Kotlin, but in most languages if you alias 2 types to the same base type, for example Seconds and Minutes to Long, the compiler will happily allow you to mix all 3 of them, defeating the protection full types would bring.
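
For what it's worth, the same thing happens with plain aliases in Python (a sketch, names invented):

    # mypy treats all three of these as plain int:
    Seconds = int
    Minutes = int

    def wait(t: Seconds) -> None: ...

    m: Minutes = 5
    wait(m)      # type-checks fine, even though the unit is wrong
    wait(m + 7)  # so does mixing them with bare literals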


that's correct. typealiases are the wrong solution here. the better solution would be value classes, but of course, the unit shouldn't be the type.


> typealias Dollars = BigDecimal

> typealias Cents = Long

If I ever have to deal with monetary values in a program where someone thought this was a good idea, ... well, it really won't be the worst thing I’ve ever dealt with, but, still.

(If you have dollars and cents as types in the same program, they darn well better work properly together, which is unlikely to work if they are both just aliases for basic numeric types.)


don't do this!

    typealias Cents = Long
    typealias Meters = Long
    
    Cents(3) + Meters(4) = 7L
that's exactly the thing we want to prevent.

the classes shouldn't be for units at all, the type should represent the type of unit.

so instead of a class Seconds you should have a class Duration, instead of class Meters you should have a type Distance. that's because the unit of the different types can be converted between each other.

Dollars and Cents are a bit of a bad example because different currencies can't be easily converted between each other, as conversion is dependent on many factors that change over time. meters, yards, feet, lightyears, miles, chains, furlongs, whatever describe a multiple of the same thing though, so a different type for each unit isn't necessary, as the input that was used to create the instance isn't usually needed. the counter example would be datetime with timezones - a datetime with a timezone conveys a lot more information than the datetime converted to UTC.


What type of industry/product do you work in/on? And what sort of languages do you work in?


Not the commenter, but I work in scientific software development and it's just a minefield of different units, so being explicit is generally very useful. Even if you can assume you can stick to metric (you can't), what different users want varies across countries. For example, here in the UK we often want to represent data as millilitres, but in France the same measurements are often centilitres.

I don't use libraries to enforce it though, we did try this but found it quite clunky: https://pint.readthedocs.io/en/stable/


It varies across different users from the same city! The same family, even!

One piece of equipment I just finished working on measured and displayed vacuum at different sensors in PSIg, kPa, Pa, mmHg, and inHg. The same machine, the same measurement at different stages in the process, five different units!


Not OP, but I see that a lot in the aerospace and heavy industry sectors.

We keep laughing about "if we engineered bridges as we engineer software" ... the truth is that the areas where correct software matters tend to write very robust code, and the rest of the industry would be well advised to take notice and copy what they see.

Of course, writing robust code is a skill, and it takes extra time.


It takes time to learn, and to learn the value, and time to agree with the team that it's sensible. With this sort of thing - proper naming of variables - I disagree that it takes longer at point of use.


> and the rest of the industry would be well advised to take notice and copy what they see

I don't agree.

There is a good reason that the aerospace industry writes robust code - it invests time (money) to avoid disasters that could cause, among other things, loss of human life.

On the other hand if for example some webform validation fails because the code was written as fast (as cheap) as possible, who cares really.

That is just a tradeoff, in aerospace industry you spend more to lower the risk, somewhere else you don't.


>On the other hand if for example some webform validation fails because the code was written as fast (as cheap) as possible, who cares really.

Who knows. Maybe a billion dollar company that can't fulfill orders. Maybe a million people who suddenly can't use their bank online.


Yes, sure, but a "billion dollar company" in this case does not represent the whole industry.

You can probably find some specific non-critical case in the aerospace industry, but surely based on that example one would not suggest that the whole aerospace industry should just copy what they see in frontend dev.

Context matters, there are exceptions, but the standard practices are based on average scenario, not on extremes.


I'm not saying that all code should be developed under standards designed for embedded control system software. I'm just saying that "oh, it's just web stuff so it can't be important" is ridiculous.


>>> and the rest of the industry would be well advised to take notice and copy what they see

>> I don't agree.

>> [web form validation example]

>> That is just a tradeoff, in aerospace industry you spend more to lower the risk, somewhere else you don't.

> I'm just saying that "oh, it's just web stuff so it can't be important" is ridiculous.

This feels like a straw man.

The original argument is that the rest of the industry (that includes web, but a lot of other parts also) should copy what they see in aerospace industry.

I believe it would not be appropriate for the rest of the industry to just copy practices from any other part because each segment has its own risk factors and expected development costs and with that in mind developed their own standard practices. Nowhere did I state that "web stuff can't be important" nor that there is no example of web development (form validation) where the errors are insignificant.

That said, I will go back to the "billion dollar company that can't fulfill orders" / "million people who suddenly can't use their bank online" catastrophe; this happens all the time. Billion dollar company doesn't fulfill orders, error is fixed, orders are fulfilled again, 0.0..01% of revenue is (maybe) lost.

In the aerospace industry, a bug is deployed, people are dead. No bugfixing will matter after that moment.

How can these two industries have the same standards of development?


Yes, as a comparison, let's just take all the units (em, px, %) out of writing CSS and see how fun that becomes to review and troubleshoot.


I'm certain that the Mars Climate Orbiter had a lot to do with this practice.


Also not OP, but I work on graphics software and we frequently deal with different units and use strict naming systems and occasionally types to differentiate them.

Even more fun is that sometimes units alone aren't sufficient. We need to know which coordinate system the units are being used in: screen, world, document, object-local, etc. It's amazing how many different coordinate systems you can come up with...

Or which time stream a timestamp comes from, input times, draw times (both of which are usually uptime values from CLOCK_MONOTONIC) or wall times.


As a bonus, coordinate systems can be left-handed or right-handed, and the axes point in different directions (relevant when loading models for example).


What does a type for ‘seconds’ do that an ‘integer’ doesn’t?

I may be misunderstanding this.


The type is "duration", not "seconds". "seconds" is the unit. You can think of the unit as an operator that converts an integer to a duration.

The advantages are:

- An integer doesn't tell you if you are talking about seconds, milliseconds or anything like that. What does sleep(500) mean? Sleep 500s or 500ms? sleep(500_ms) is explicit

- It provides an abstraction. The internal representation may be a 64-bit number of nanoseconds, or a 32-bit number of milliseconds, the code will be the same.

- Conversions can be done for you, no more "*24*60*60" littered around your code if you want to convert from seconds to days, do fun(1_day) instead.

- Safety, prevents adding seconds to meters for instance. Ideally, it should also handle dividing a distance by a duration and give you a speed and things like that.

Under the hood, it is all integers of course (or floats), which is all machine code, but handling units is one of the things a high level language can do to make life easier on the human writing the code.
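
A minimal sketch of such a type in Python (a hypothetical Duration, not any particular library):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Duration:
        _ns: int  # internal representation: nanoseconds

        @classmethod
        def milliseconds(cls, n: int) -> "Duration":
            return cls(n * 1_000_000)

        @classmethod
        def days(cls, n: int) -> "Duration":
            return cls(n * 86_400 * 1_000_000_000)

        def __add__(self, other: "Duration") -> "Duration":
            return Duration(self._ns + other._ns)

    def sleep(duration: Duration) -> None: ...

    sleep(Duration.milliseconds(500))  # explicit: no 500s/500ms ambiguity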


A function might take multiple integer arguments, each in different units. Separate types for each unit guarantees you won't pass the wrong integer into the wrong argument.

Eg

  func transferRegularly(dollars(10000), days(30)) 
Meaning clear.

  func transferRegularly(10000, 30)
Meaning obscure, error prone and potentially costly


With some languages like Python, you can use keyword arguments too (even out of order).

Eg. you could simply do

  transferRegularly(amount=10000, period_in_days=30)
I am always amazed how new languages never pick up this most amazing feature of Python.

Though obviously, this code smells anyway because 1. repetitive transfers are usually in calendar units (eg. monthly, weekly, yearly — not all of which can be represented with an exact number of days), so in Python you'd probably pass in a timedelta and thus a distinct type anyway, and 2. amounts are usually done in decimal notation to keep adequate precision and currency rounding rules (or simply `amount_in_cents`).

Still, I am in favour of higher order types (eg. "timedelta" from Python), and use of them should be equally obligatory for unit-based stuff (eg. volume, so following the traditional rule of doing conversions only at the edges — when reading input, and ultimately printing it out).
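
Putting those two points together, a hypothetical Python signature might look like:

    from datetime import timedelta
    from decimal import Decimal

    def transferRegularly(amount: Decimal, period: timedelta) -> None: ...

    transferRegularly(amount=Decimal("100.00"), period=timedelta(days=30))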


I see keyword arguments as slightly different though. The keyword is like the parameter name. The value is still a plain integer and (theoretically) susceptible to being given the wrong integer. In contrast, unit types allow for hard checking by the compiler.

In practice, with good naming it won't make much difference and only shows up when comparing the docs (or intellisense) for an API with how it is actually used.


> transferRegularly(amount=10000, period_in_days=30)

dollars(10000) is still better than this example, because: 10000 what? Pennies? USD? EUR?


Someone else caught me out on that too by suggesting making it `amount_in_dollars` elsewhere in the thread ;)

Now you can say how there are also AUD, CAD...

The point was simply that if units are needed due to lack of specific type being used, it's nicer to have that in the API when language allows it.


PHP got this feature in PHP 8.

One problem with this approach is refactoring. If you wanted to refactor your example with the parameter "amount_in_dollars" then you would either have to continue maintaining the legacy "amount" argument, or break existing code.


So you mean just like with, eg. renaming a function? I agree it's an issue compared to not doing it, but a very, very minor one IMHO, and legibility improvements far outweigh it.


Renaming a function comes with the explicit implication that the API has changed. But it might not be clear to someone maintaining a Python application that changing a parameter name might change an argument - that is not the case in any other language (until PHP 8).

Guess how I discovered this issue :)


Well, if the approach was more pervasive, you'd be used to it just like seasoned Python developers are. :)


if you type check, then it ensures that only a 'second' can be passed to the function. This requires you to either create a second, or explicitly cast to one, making it clear what unit a function requires.

As per the article, if you don't have proper names and just an 'int', that int can represent any scale of time... seconds, days, whatever.

In Python you'd need something like mypy, but in Rust you could have the compiler ensure you are passing the right types.
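
For example, with mypy and typing.NewType (names invented):

    from typing import NewType

    Seconds = NewType("Seconds", int)

    def sleep_for(duration: Seconds) -> None: ...

    sleep_for(Seconds(5))  # OK: the explicit cast documents the unit
    sleep_for(5)           # mypy error: int is not Seconds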


Having a type system to figure this out for us would be great, but there are languages where this may not be possible. As far as I know, Typescript is one such example, isn't it?


Depends. Yes, newtyping is pretty awful in TS due to its structural typing (instead of nominal like Rust for example).

You could perhaps newtype using a class (so you can instanceof) or tag via { unit: 'seconds', value: 30 } but that feels awful and seems to go against the ecosystem's established practices.

This is indeed one of my gripes with TS typing. I'm spoiled by other languages, but I understand the design choice.


I was recently dealing with some React components at work, where the components would accept as an input the width or the height of the element.

Originally, the type signature was

    type Props = {height: number}
This naturally raises the question - number of what? Does "50" mean `50px`, `50vw`, `50em`, `50rem`, `50%`?

I've ended up changing the component to only accept pixels and changed the argument to be a string like this:

    type Props = {height: `${number}px`}
Of course, if passing a "0" makes sense, you could also allow that. If you want to also accept, say, `50em`, you could use a union-type for that.

I think this could actually work for other units as well. Instead of having `delay(200)`, you could instead have `delay("200ms")`, and have the "ms" validated by type system.

Maybe the future will see this getting more popular:

    type WeightUnit = 'g' | 'grams' | 'kg' | ...;
    type WeightString = `${number}${WeightUnit}`;
    function registerPackage(weight: WeightString): void;


> This naturally raises the question - number of what? Does "50" mean `50px`, `50vw`, `50em`, `50rem`, `50%`?

Because it's React, the expectation is "px", using a string with a suffix to override it: https://reactjs.org/docs/dom-elements.html#style


You can do something like this:

    export interface OpaqueTag<UID> {
        readonly __TAG__: UID;
    }
    export type WeakOpaque<T, UID> = T & OpaqueTag<UID>;
And then use it like this to create a newtype:

    export type PublicKey = WeakOpaque<Uint8Array, {readonly PublicKey: unique symbol}>;
To create this newtype, you need to use unsafe casting (`as PublicKey`), but you can use a PublicKey directly in APIs where a Uint8Array is needed (thus "weak opaque").


I tried to solve this problem in a way that's reasonably pleasant here https://github.com/spion/branded-types


The pattern you want here is branding.


You can simulate type opaqueness/nominal typing with unique tag/branded types in TS. We're using it on a high-stakes trading platform and it works very well.


Most answers here are answering your question literally, and explaining why you'd add a type for "seconds". But in reality you shouldn't create a type for seconds, the whole premise of the question is wrong.

Instead of a type for seconds, you'd want a type for all kinds of duration regardless of the time unit, and with easy conversion to integers of a specific time unit. So your foo() function would be taking a Duration as an input rather than seconds, and work correctly no matter what unit you created that duration from:

    foo(Duration.seconds(3600))
    foo(Duration.hours(1))
    foo(Duration.hours(1) + Duration.seconds(5))


You can construct a linter that will prevent you from trying to add seconds to dollars.


A better example would be seconds to minutes. I think I recall a JWT-related CVE involving timestamps being misinterpreted between seconds and milliseconds, for example.


Adding seconds to minutes can actually make sense. A minute plus 30 seconds is 90 seconds, or 1.5 minutes. Whether your type system allows this, though, is up to the project.

You can't add seconds to kilometers, or to temperature, or to how blue something is.
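
Python's timedelta allows it, for instance, since both units construct the same duration type:

    from datetime import timedelta

    total = timedelta(minutes=1) + timedelta(seconds=30)
    assert total == timedelta(seconds=90)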


Ideally you have a compiler that simply refuses to compile ambiguous code


Ideally, there is no compiler.


Ideally, you would detect all errors before runtime. Usually the compiler is the last gate to make that happen.


Ideally, hardware execution would happen on a language that can be proven correct and does not allow the programmer to make syntactic or semantic errors.

The world is far, far from ideal.


Isn't there an implicit conversion?


Well your function could then accept multiple types (Milliseconds, Seconds, Minutes, Hours) and do the conversion between those implicitly.

The units are also extremely clear when they are carried by the type instead of the variable name, where a developer could e.g. change some functionality and end up with a variable called `timeout_in_ms` while the function that eats this variable might expect seconds for some reason.

If it is typed out you can just check if the function performs the right action when passed a value of each time unit type, and then you only ever have to worry about mistakenly declaring the wrong type somewhere.

But whether you should really do all that typing depends on how central units are to what you are doing. If you have one delay somewhere, who cares if it is typed or not. If you are building a CNC system where a unit conversion error could result in death and destruction, maybe it would be worth thinking about.


Tells you what the integer represents, so you aren't off by several orders of magnitude.


What does the integer represent? Nano, milli, microseconds, seconds, minutes, hours, days?


I _really_, _really_ like F#'s 'unit of measure': https://docs.microsoft.com/en-us/dotnet/fsharp/language-refe...


This. I really miss it when working outside of F#. I work on scientific code bases with a lot of different units and have been burned by improper conversions. Even with a high automated test coverage and good naming practices, such problems can go undetected.


It's a big omission that it doesn't support fractional units. These come up in things like fracture mechanics for stress intensity.


Are you talking about ksi√in or MPa√m? That is not a problem.


F#'s units don't support 1/2 dimensions. So you can't do dimensional analysis through the type system.


God I love F#'s unit types. C# is okay and all, but F# is, IMO, the best FP-language-on-a-corporate-backed-VM ever, even if the integration with the underlying VM and interop can get a bit fiddly in places (Caveat, my opinion is about 7 years old).

Yeah, you heard me Scala.


Sadly it hardly got better, C# gets all the new .NET toys, VB less so, C++/CLI still gets some occasional love (mostly due to Forms / WPF dependencies), and then there is F#.


>Having everything work this way makes it so much easier to review code for errors.

Had one making me pull my hair out the other day in C#. C# DateTimes are measured in increments of 100 nanoseconds elapsed since January 1st, 1 AD, or something like that. I was trying to convert Unix time in milliseconds to a C# DateTime and didn't realize they were using different units. My fault for not reading the docs, but having it in the name would have saved me a lot of trouble.


What the heck is with MS and weird datetime precision? I figured out some bug with a third party API was due to SQL Server using 1/300th seconds. Who would even think to check for that if you’re not using their products?


1/300 seconds is an odd one for sure. In the case of DateTime however, I'd say it is designed to hit as many use cases as possible with a 64 bit data structure. Using a 1970 epoch as is (logically) used for Unix system times naturally misses even basic use cases like capturing a birthdate.

It is quite hard actually to disagree with the 100ns tick size that they did use. 1 microsecond may have also been reasonable as it would have provided a larger range, but there are not many use cases for microsecond-accurate times very far in the past or in the future. Similarly, using 1 nanosecond may have increased the applicability to higher-precision use cases but would have reduced the range to 100 years. Alternatively, they could have used a 128-bit structure providing picosecond precision from big bang to solar-system flame-out, with the resultant size/performance implications.


Godspeed trying to parse dates from Excel. They have bugs _intentionally built in_


Who would even think to check if the system is counting from 1970/01/01 if you're not using their products?


If they’re using ISO format I don’t really care what they’re counting from. But I care if some ISO dates are preserved exactly and some are rounded to another value… especially when that rounding is non-obvious. It took me months to even identify the pattern clearly enough to find an explanation. Up to that point we just had an arbitrary allowance that the value might vary by some unknown amount, and looked for the closest one.


This is an established standard.

It's really not any stranger than starting dates at 1 CE, or array indexes starting at 0.


> This is an established standard.

Established doesn't mean it is understandable without documentation. Anyone who is not familiar with it doesn't know why it starts in 1970 and counts seconds. You need to actually open the documentation to know about that, and its name (be it unix time, epoch, epoch time or whatever) doesn't help in understanding what it is and what unit it is using.


The metric system is also not understandable without documentation, but you don't need to explain it every time because every living person should have gotten that documentation drilled into them at age ten.

UNIX time is an easy international standard, everybody with computer experience knows what it is and how to work with it.


> everybody with computer experience knows what it is and how to work with it.

Thanks for the laugh.

The only ones who need to know about unixtime are:

developers when they do something which takes it/produces it

*nix sysadmins

Everyone else with "computer experience" could live all their life without the need to know what unixtime is.


Yeah, that's what I meant. People who program or do sysadmin, ie anybody who will ever need to call a sleep function, should know what a unixtime is.


> anybody who will ever need to call a sleep function

Bullshit.

I never needed to know what unixtime is when I wrote anything with sleep().

All I needed to know is how long in seconds I want the execution to pause for; never did I need to manually calculate something from/to unixtime, even when working with datetime types.


No, but as a person who has had a need for sleep() before, you are also the type of person who could be expected to know what unix time is.

Nobody is saying that the two things need to be connected, the point is that it can be name dropped in a type definition, and you would know what it means.


https://devblogs.microsoft.com/oldnewthing/20090306-00/?p=18...

Windows uses the Gregorian Calendar as its epoch.


I'm not sure I agree. When I convert fields to DateTime, I remove the unit suffix from the name. The DateTime is supposed to be an implementation-agnostic point in time. It shouldn't come with any units, and nor should they be exposed by the internal implementation.

The factory method used to convert e.g. Unix timestamps to DateTimes, now that should indicate whether we're talking seconds or milliseconds since epoch, for example, and when the epoch really was.


They do:

    .ToUnixTimeSeconds()
    .ToUnixTimeMilliseconds()
    DateTimeOffset.FromUnixTimeSeconds(Int64)
    DateTimeOffset.FromUnixTimeMilliseconds(Int64)

https://docs.microsoft.com/en-us/dotnet/api/system.datetimeo...

https://docs.microsoft.com/en-us/dotnet/api/system.datetimeo...


How does the C# DateTime type distinguish between dates and a point in time whose time just happens to be midnight?

Much of the C# code I've seen uses names like start_date with no indication of whether it really is a date (with no timezone), a date (in one particular timezone), or a datetime where the time is significant.

I'm certainly not a C# developer, though my quick reading of the docs suggests that the DateOnly type was only introduced recently in .NET6.


Yeah, before the new DateOnly (and TimeOnly) types, there was no built-in way in C# to specify a plain date. NodaTime[1] (a popular third-party library for datetime operations) did have such types though.

[1]: https://nodatime.org/


F# has unit support in the type system :)


An example for unfamiliar folks.

Many Languages

  var lengthInFeet = 2;
  var weightInKg = 2;
  
  var sum = lengthInFeet + weightInKg; // Runs without issue but is an error
F#

  [<Measure>] type ft
  [<Measure>] type kg
  
  let lengthInFeet = 2<ft>
  let weightInKg = 2<kg>

  let sum = lengthInFeet + weightInKg // Compile time error
More info at https://fsharpforfunandprofit.com/posts/units-of-measure/


What precisely could have been changed to make you realize that C# DateTime is not the same as Unix time? Perhaps Ticks could be renamed to Ticks100ns but I'm not sure how to encode the epoch date such that it is not necessary to read any documentation. I suppose the class could have been named something like DateTimeStartingAtJan1_0001 but obviously would have been ridiculous.

Naming is an optimization problem: minimize identifier length while maximizing comprehension.


> how to encode the epoch date such that it is not necessary to read any documentation

And you need to read the docs to know why some systems use 1970 as the reference point. Should we rename it to Unix1970datetimeepoch everywhere?


The wonderful elm-units is such a pleasure to work with and does just that.

https://package.elm-lang.org/packages/ianmackenzie/elm-units...

Even if you don’t work in Elm take a moment to look at it.


Nitpicks:

Why would your type system have an encoded unit for kilo-pascal, but not hecto-pascal, mega-pascal, micro-pascal etc?

If you only encode base units (e.g. seconds), then we should use exact-precision arithmetic instead of f32 or f64, which is sometimes overkill.

If encoding all the prefixes (kilo/milli/mega etc), I feel like some units may have name clashes (e.g. "Gy" -- is it giga-years, or gray)?

Should we encode only SI units, or pounds/ounces/pints as well?


(e.g. "Gy" -- is it giga-years, or gray)

In my opinion this is not a real problem, since ideally no-one should use such meaningless abbreviations in code. Just write giga_years or GigaYears or whatever your style is; problem solved, doesn't get any clearer than that.


In defence of Gy, it isn't meaningless in astronomy - it's a very well used unit. Though I do agree that it might be less common in code.


I myself can't remember seeing Gy in astronomical papers, but I've seen Ga (gigaannum).

https://en.wikipedia.org/wiki/Year#SI_prefix_multipliers



Isn’t Gy a bit much even in astronomy? With 15 Gy you have the age of the universe right?


Your comment piqued my curiosity, and I looked at https://en.m.wikipedia.org/wiki/Future_of_an_expanding_unive... and found:

"Stars are expected to form normally for 10^12 to 10^14 (1–100 trillion) years"

So it seems Gy and even Ty units will be a reasonable scale for events during the period of star formation.


Ah, that is a good point. For some reason I was thinking only backwards. I never considered that there’s orders of magnitude more time in front of us.


Does the type system handle equivalent units (dimensional analysis)? eg, N.m = kg.m^2.s^-2.

Does the type system do orientational analysis? If not, it allows you to assign a value of work to a value of torque and vice versa, as they both have the above unit.

There are several other similar gotchas with the SI. I think descriptive names are better than everyone attempting to implement an incomplete/broken type system.


The Python package astropy does all these things. There's a graph of equivalencies between units.

0. https://docs.astropy.org/en/stable/units/index.html


Speaking of astropy units. I had a hilarious issue last week, which was quite hard to identify (simplified code to reproduce):

  from astropy import units as u
  a = 3 * u.s
  b = 2 * u.s
  c = 1 * u.s
  d = 4 * u.s
  m = min([a, b, c, d])
  a -= m
  b -= m
  c -= m
  d -= m
  print(a,b,c,d)
Output: 2.0 s 1.0 s 0.0 s 4.0 s

Note the last 4.0, while min value is 1.0

The issue is that a, b, c, d are objects when astropy units are applied, and min (or max) returns not the value but the object with the minimal value. Thus m is c (in this particular case c has the smallest value), so c -= m zeroes m as well, and d remains unchanged. It was very hard to spot, especially when values change and occasionally either one of a, b, c or d has the smallest value.

In-place augmentation of a working code with units may be very tricky and can create unexpected bugs.
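
(If it helps anyone hitting the same thing: the aliasing can be sidestepped by detaching m from the winning object, or by using the non-augmented form, which allocates new Quantities instead of mutating shared ones. A sketch:)

  m = min([a, b, c, d]).copy()  # m no longer aliases a, b, c or d
  # or use non-augmented assignment, which creates new objects:
  a = a - m
  b = b - m
  c = c - m
  d = d - m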


This is really an issue with Python in general (specifically, mutable types).

You'd get the exact same behavior with numpy ndarrays (of which astropy Quantities are a subclass).


> This is really an issue with Python in general (specifically, mutable types).

Unit-aware values as a type where augmented assignment mutates in place is an odd choice though (normal mutable types do not exhibit this behavior; it's a whole separate behavior which has to be deliberately implemented). It may make sense in the expected use case (and, as you note, it reflects the behavior of the underlying type), but more generally it's not what someone wanting unit-aware values would probably expect.


That sounds like a bug in astropy type definitions: did you get a chance to report it as one?

While it can sometimes be undefined behavior (a minimum of incompatible units), in cases like these it should DTRT.


Mutable types are hard.


I see dimensional analysis, but in this table[1], torque and work have the same unit, and that unit is J.

The SI itself states[2]: "...For example, the quantity torque is the cross product of a position vector and a force vector. The SI unit is newton metre. Even though torque has the same dimension as energy (SI unit joule), the joule is never used for expressing torque."

[1]:https://docs.astropy.org/en/stable/units/index.html#module-a... [2]:https://www.bipm.org/documents/20126/41483022/SI-Brochure-9-...


This is the answer from boost units:

https://www.boost.org/doc/libs/1_78_0/doc/html/boost_units/F...

tl;dr: it uses some sort of pseudo-units to make dimensionally similar but incompatible quantities, well, incompatible.


Time units are messy.

Once we get to Gigayears, what is the size of a single year? Just 365 days? Or which of Julian, Gregorian, Tropical, Sidereal? Even at the kilo prefix the differences add up. Or would you need to specify it?

Days, weeks and months are also fun mess to think of.


> Once we get to Gigayears, what is the size of single year?

The appropriate system of units is context-dependent. The astronomical system, for instance, has Days of 86,400 SI seconds and Julian years of exactly 365.25 Days; if you have a general and extensible units library, then this isn't really a difficulty, you just need to make a choice based on requirements.


This is going to depend on the precision that you need.

(Calculations without associated uncertainty calculations are worse than worthless anyway - misleading due to the inherent trust we tend to put in numbers regardless of whether they are garbage.)


You'll want to leave the point floating to floating point numbers, but whenever you interact with legacy APIs or protocols you want a type representing the scale they use natively. You wouldn't want to deal with anything based on TCP for example with only seconds (even if client code holds them in doubles), or with only nanoseconds. But you certainly won't ever miss a type for kiloseconds.


Isn't floating point specifically for dealing with large orders of magnitude?

(Economics of floating vs fixed point chips might distort things though?)

Also, in the case that you meant this: you might need fractions of even base units for internal calculations:

IIRC banking, which doesn't accept any level of uncertainty, and so uses exclusively fixed precision, uses tenths of cents rather than cents as a base unit?


The standard symbol for a year is "a". Using "y" or "yr" is non-standard and ambiguous, and they should be avoided in situations where precision and clarity matter.


Thanks, didn't know. Although in astronomy they use Gy often. PS: Don't know why people downvote, your comment is useful.


Given a powerful enough type system, you can parameterize your types by the ratio to the unit and any exponent. Then you can allow only the conversions that make sense.


I don't think the parent meant to exclude hecto-pascals from their hypothetical type system.


Why wouldn't the type system be able to take care of that?


In my software project, all measurements exist as engineering units since they all come from various systems (ADCs or DSP boxes). We pass the values around to other pieces of software but are displayed as both the original values and converted units. We have a config file that contains both units and conversion polynomials, ranging from linear to cubic polynomials. Some of the DSP-derived values are at best an approximation so these have special flags that basically mean "for reference only". Having the unit is helpful for these but are effectively meaningless since the numbers are not precise enough, it would be like trying to determine lumens of a desk lamp from a photo taken outside of a building with the shades drawn.


I love when types are used to narrow down primitive values. A user id and a post id are both numbers, but it never makes sense to take a user id and pass it to a function expecting a post id. The code will technically function but it's not something that's ever correct.
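
A minimal sketch of what that can look like in Python (invented names; a statically checked language makes this stricter):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class UserId:
        value: int

    @dataclass(frozen=True)
    class PostId:
        value: int

    def fetch_post(post_id: PostId) -> None: ...

    fetch_post(PostId(42))    # OK
    # fetch_post(UserId(42))  # rejected by the type checker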


Having units as part of your types improves legibility for whoever's writing code too, not just reviewing. You won't make (as many) silly mistakes like adding a value in meters to another in seconds.


If a value lives for so long that it's no longer clear what unit it is in or should be in (eg. its use spans more than 20 lines of code), I think you've got a bigger problem with code encapsulation.

I think the practical problem stems from widespread APIs which are not communicating their units, and that's what we should be fixing instead: if APIs are clearer (and also self-documenting more), the risks you talk of rarely exist other than in badly structured code.

Basically, instead of having `sleep(wait_in_seconds)` one should have `sleep_for_seconds(wait_time)` or even `sleep(duration_in_seconds=wait_time)` if your language allows that.

Certainly, use of proper semantic types would be a net win, but they usually lose out due to the inconvenience of typing them out (and sometimes constructing them if they don't already exist in your language).


The Python package astropy extends numpy arrays to include units.[0]

It can convert between equivalent units (e.g., centimeters and kilometers), and will complain if you try to add quantities that aren't commensurate (e.g., grams and seconds).

The nice thing about this is that you can write functions that expect unit-ful quantities, and all the conversions will be done automatically for you. And if someone passes an incorrect unit, the system will automatically spit out an error.

0. https://docs.astropy.org/en/stable/units/index.html
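
For a quick taste:

    from astropy import units as u

    (1.5 * u.km).to(u.cm)   # <Quantity 150000. cm>
    3 * u.g + 2 * u.s       # raises UnitConversionError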


I think you would enjoy programming in Ada.


When I first started to learn Ada, I found it one of the most verbose languages I’ve ever used. Now I find myself practicing those wordy conventions in other languages too.


Have you come across a system or language that handles units and their combinations / conversions well? I have a project I want to undertake but I feel like every time I start to deal with units in the way I feel is "proper" I end up starting to write a unit of measure library and give up.


> physical quantities should [...] be encoded in the type system.

But types are not just useful to specify the content of a variable, they are also useful to specify the required precision.

So, if there is a type for seconds in the type system, should it be a 32-bit int or a 64-bit float? Only the user can say.


A generic type would be useful then:

type DurationSeconds<Value> where Value: Numeric { ... }

DurationSeconds<UInt32>

DurationSeconds<Float64>


This is exactly what C++ 11 did with std::chrono [0] except it goes one step further and makes the period generic too.

[0] https://en.cppreference.com/w/cpp/chrono/duration


And let's not forget money. Martin Fowler has been (correctly) banging this drum for years: https://martinfowler.com/eaaCatalog/money.html


You don't need a type system to do this. Generic operators/functions and composite structures are sufficient and more flexible. Some languages let you encode this in type systems as well, but that's an orthogonal feature.


Add to this any type of conversion or relations between units.

Naming constants PIXELS_PER_INCH or BITS_PER_LITER rather than SCREEN_SCALE_FACTOR and VOLUME_RESOLUTION avoids all kinds of mistakes like inverting ratios etc.
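
A tiny sketch of why the ratio-style names help (constants invented):

    PIXELS_PER_INCH = 96.0

    def inches_to_pixels(inches: float) -> float:
        # reads like dimensional analysis: in * (px/in) = px
        return inches * PIXELS_PER_INCH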


For everyone using Python and willing to try this:

https://pint.readthedocs.io/en/stable/


I'm sorry, but this whole comment section is in some collective psychosis from 2010. You don't need to mangle variable names or create custom types (and then the insane machinery to have them all interact properly)

Have none of you ever refactored code..? Or are you all writing C?

Your library should just expose an interface/protocol signature. Program to interfaces. Anything that is passed in should just implement the protocol. The sleep function should not be aware or care about the input type AT ALL. All it requires is the input to implement an "as-milliseconds()" function that returns a millisecond value. You can then change any internal representation as your requirements change.
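
In Python terms, that argument might look like this sketch (typing.Protocol, invented names):

    import time
    from typing import Protocol

    class SupportsMilliseconds(Protocol):
        def as_milliseconds(self) -> int: ...

    def sleep(duration: SupportsMilliseconds) -> None:
        # the function doesn't care what the input type is,
        # only that it can yield a millisecond value
        time.sleep(duration.as_milliseconds() / 1000)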


> You don't need to ... create custom types
>
> All it requires is the input to implement an "as-milliseconds()" function

A.k.a. custom type!


That's also a custom type. But for libraries, custom types for everything in interface definitions are problematic. Let's say you use library A which has some functions that take a time delta, and another library B that also has functions which take time deltas. Now both libraries would define their own time delta type, and you have two types for time deltas in your application, and most likely need to add an additional 'meta time delta type' which can be converted to the library-specific time delta types. This will explode very quickly into lots of boilerplate code and is a good reason to use 'primitive types' in interfaces.

If you replace 'time delta' with 'strings' then the problem becomes more apparent. Each library defining its own string type would be a mess.


If you are working with a crappy language then that's probably true. But it doesn't have to be a lot of boilerplate and it can be local in scope. In Clojure it's a one-liner.

    (extend-protocol libA/stringable
      myLib/InputThing
      (as-string [x] (myLib/to-string x)))

    (libA/some-func (myLib/make-a-thing))
And sure it can just be doing something as simple as returning an internal value directly or calling some common stringifying method. You do have a good point that you may have redundant stringifying protocols across libraries - which sucks.

> This will explode very quickly into lots of boilerplate code and is a good reason to use 'primitive types' in interfaces.

I feel you were so close and missed :) The natural conclusion would be to have primitive interfaces - not primitive types. This way libraries only need to agree to adhere to an interface (and not an implementation)


> Have none of you ever refactored code..? Or all of your writing C?

You seem to ignore the much broader context here: the article wasn't just about code. It included configuration files and HTTP requests.

It's also worth noting that depending on the language in question and the usage pattern of the function/method, using interfaces or protocols can cause considerable overhead when simply using a proper descriptive name is free.


> [...] create custom types (and then the insane machinery to have them all interact properly)

That's an argument against languages with type-systems that make this a PITA, not against the idea itself.


'Why do at build time what you could do with bugs at runtime'?


premature abstraction is the root of all evil.


You might find this interesting: https://github.com/SciNim/Unchained


There are usually libraries for doing this. For Python, there are several, for instance “Pint”.


I was excited when I found Pint, but then was disappointed: too much extra overhead for the project I was then working on.

(I settled on units in variable names instead, which was essential when some inputs were literally in different units of the same physical dimension.)

EDIT: more on this, and on astropy (which I was not aware of and/or didn't exist back then):


> Option 2: use strong types


This is just about the best justification I have heard for type calculus.



