
I've been programming my own Ethereum smart contract (virtual currency) for a while now. Here are some gotchas off the top of my head:

- You have about 500 lines of code to work with. This of course varies, but smart contracts have to be really small to fit within the block gas limit (6.7 million gas).

- You can't pass strings between contracts (coming soon).

- There are no floating point numbers. Since you're probably working with "money", this can make things tricky.

- You can break up your code into multiple contracts, but the tradeoff is an increased attack area.

- Dumb code is more secure than smart code.

- The tooling is very immature. You'll probably use Truffle, which just released version 4. It makes some things easier, some harder. Its version of web3 (1.0) may differ from what you were expecting (0.2).

- The Ethereum testnet (Ropsten) has a different gas limit than the main net (4.7 million vs 6.7 million).




> There are no floating point numbers. Since you're probably working with "money", this can make things tricky.

That is actually a feature when it comes to working with money. You don't ever want to use floating-point arithmetic with monetary values due to its inability to represent all possible decimal fractions of your base unit. This is just as true for over-hyped blockchain stuff as it is for any imaginable application in the "classic" financial sector.

What you need is either an integer data type plus a fixed, implied number of digits that you want to handle (so, for example, 105 represents 1 dollar and 5 cents), or a fixed-point numeric type (like BigDecimal in Java; most other languages have an equivalent, usually with "decimal" somewhere in the name), which essentially stores the integer value together with the number of digits and provides the mathematical operations between pairs of such values.
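To make that concrete, here's a minimal Java sketch of both options (class and variable names are just for illustration):

    import java.math.BigDecimal;

    public class MoneyBasics {
        public static void main(String[] args) {
            // Binary floating point cannot represent most decimal fractions exactly:
            System.out.println(0.1 + 0.2);  // prints 0.30000000000000004

            // Option 1: integer amount with an implied number of digits (here: cents).
            long priceCents = 105;              // 1 dollar and 5 cents
            long totalCents = priceCents * 3;   // 315 cents, exact
            System.out.println(String.format("%d.%02d", totalCents / 100, totalCents % 100));  // 3.15

            // Option 2: a decimal type that stores the integer value and the scale together.
            BigDecimal a = new BigDecimal("0.10");
            BigDecimal b = new BigDecimal("0.20");
            System.out.println(a.add(b));       // prints 0.30, exact
        }
    }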


>That is actually a feature when it comes to working with money. You don't ever want to use floating-point arithmetic with monetary values due to its inability to represent all possible decimal fractions of your base unit. This is just as true for over-hyped blockchain stuff as it is for any imaginable application in the "classic" financial sector.

I feel like this is constantly repeated but is simply untrue. Working in the financial sector (high frequency trading), I don't know anyone who actually uses fixed digit number systems to represent money. What I normally see are people outside of finance who need to represent money, who read the common mantra about using a fixed digit representation, and who eventually run into all the issues that floating point was invented to solve. When they encounter those issues they end up adapting their code until they have basically re-invented a broken, unstable quasi-floating point system, when they would have been much better off using the IEEE floating point system and actually taking the time to understand how it works.

BigDecimal will solve the issue but it's absolute overkill both in terms of space and time, we're talking many many orders of magnitude in terms of performance degradation and it's entirely unneeded.

One simple solution that works very well for up to 6 decimal places is to take your idea of having an implied number of digits, but instead of using an integer data type, you use a floating point data type. So 1 dollar and 5 cents becomes (double)(1050000.0). This lets you represent a range from 0.000001 up to 999999999.999999 exactly, without any loss of precision. Values outside of that range can still be represented, but you may have some precision issues.

Another solution which is less efficient but more flexible is to use decimal floating point numbers instead of binary floating point numbers. There is an IEEE decimal floating point standard that can represent an enormous range of decimal values exactly. Intel even provides a C/C++ implementation that meets formal accounting standards.

Both of the solutions I list above are orders of magnitude more efficient than using fixed-point numeric types and are by no means difficult to use.
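To illustrate the first approach, here is a minimal Java sketch assuming a scale of 10^-6 dollars per unit (illustrative names, and it ignores rounding of division results):

    public class MicroDollars {
        // One unit is 10^-6 dollars, so amounts are whole numbers of micro-dollars
        // stored in a double. Integers up to 2^53 are exact in a double, which is
        // what keeps sums exact as long as they stay below ~9e15 units.
        static final double UNITS_PER_DOLLAR = 1_000_000.0;

        static double fromDollarsAndCents(long dollars, long cents) {
            return dollars * UNITS_PER_DOLLAR + cents * 10_000.0;  // 1 cent = 10,000 units
        }

        public static void main(String[] args) {
            double price = fromDollarsAndCents(1, 5);      // $1.05 -> 1,050,000.0 units
            double total = price * 3;                      // 3,150,000.0 units, still exact
            System.out.println(total / UNITS_PER_DOLLAR);  // 3.15
        }
    }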


But you are working in HF trading. Speed is paramount there.

The common programmer has no clue about the numerical instabilities that floating point numbers cause.

Your examples are exactly what I would expect from a capable engineer that is pondering the different pros and cons when you have to optimize your code.

Calling BigDecimals entirely unneeded tells me you are in a luxury position where you can draw from a pool of the best programmers that are out there.

For lesser programmers, and for areas where speed is not as important, BigDecimal solves the issue completely and without any clever thinking. Most importantly, it is consistently correct without having to rely on the skills of the programmer.


Not every single aspect of HFT is insanely speed critical and even in non-critical areas we still don't use fixed decimal point arithmetic on principle. For example we do a lot of profit/loss reporting and analytics which are not speed critical, we write user interfaces and web apps in Javascript that work with money, so on so forth.

I will agree BigDecimal is fine to use if speed really doesn't matter; it's at least correct, even if it's a nuclear option of sorts. What I really want to discourage is using a fixed integer to represent some smallest unit of money, like using an int where 100 = 1 dollar and 1010 = 10 dollars and 10 cents. That is fundamentally problematic and objectively worse than using a floating point value.

But sure, if you genuinely don't want to think at all about the issue, use your language's BigDecimal. If you decide you want the much improved performance, then use a floating point value, either the IEEE decimal floating point standard or use binary floating point to represent some reasonably small unit fraction of a dollar, like 1.0 = 10^-6 dollars.


Honest question - why is that something to avoid? What does the floating point representation buy you that say an int64 doesn't, with the same choice of units ie. 1 = 10^-6 dollars?


What if you need to calculate the percentage difference between two monetary values?

Ints won't cut it there.


Sure they will. For truncated whole percents, (N2-N1)×100÷N1 using integer ops gets you that; for whole percents rounded half up, ((N2-N1)×100+50)÷N1 does it. For additional decimal places, increase the fixed multiplier (and the added round-half-up factor) by the appropriate power of 10.

Even if you are dealing in units of hundreds of billions of dollars tracked to mils, this gets you up to, I think, a hundred-thousandth of a percent with int64.
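A quick Java sketch of that (hypothetical values; the +50 trick as written assumes a non-negative difference):

    public class IntegerPercent {
        // Percentage change from n1 to n2 in whole percents, rounded half up,
        // using only integer operations.
        static long percentChangeHalfUp(long n1, long n2) {
            return ((n2 - n1) * 100 + 50) / n1;
        }

        public static void main(String[] args) {
            long before = 1_000_000;  // e.g. an amount in cents
            long after  = 1_024_900;
            System.out.println(percentChangeHalfUp(before, after));  // 2 (2.49% rounded half up)
        }
    }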


But with floats you just use a divide operation.


“Won’t cut it” and “will require 1-2 more operations” are very much not the same thing.


Tracking the decimal point is all that comes to my mind.


To clarify for myself and maybe others: are you arguing against using a single integer scaled to the smallest precision someone thought they'd need N year(s) ago?

If so: totally agreed. Inflexible single-scaled-ints have been a plague in nearly every piece of production code I've seen them in. I'll take floats (well, doubles) any day over this.

Flexible scale, though (arbitrary precision / BigDecimal equivalents; I'd prefer BigDecimal over home-grown two-int variants, obviously), usually works out. And it tends to avoid the semi-frequent-though-usually-unimportant/unnoticed[1] issues with imprecise floats (~16 million for exact int representation[2] is awfully small, though doubles are generally Good Enough™).

[1]: usually. adding doubles/fractions often produces garbage-looking output unless you remember to round. but in my lines of work, nobody cares[1.1] if it's a couple cents off in analytics.

[2]: https://stackoverflow.com/questions/3793838/which-is-the-fir...

[1.1]: yet


Yes, we are on the same page. BigDecimal is appropriate for representing money, fixed point arithmetic is not. I simply argue that in situations where you care about performance, you can actually drop BigDecimal and use floating point decimal or even floating point binary and get perfectly good results that have exact precision so long as you model your domain.


Yeah, if you know your domain and know you're creating only an acceptable amount of error, by all means float it up. Floats are easy to use and predictable nearly everywhere, since nearly everyone follows the same spec.

Generally though, I think people don't quite do due-diligence here. And/or the domain changes enough that numbers get large enough to invalidate the earlier analysis. There are obviously exceptions to this though, and for them, they know what they're doing and why they're doing it and it's all good.


Self-quote: > any imaginable application in the "classic" financial sector

Kranar: > working in the financial sector (high frequency trading)

Okay, you got me there - you have pointed me to that single application from the financial sector where it is not totally acceptable to "waste" a few thousand processor cycles to compute some money-related stuff. Granted, in HFT applications, these cycles may allow you to get your trades in front of those of the other HFT companies. You are forced to use CPU-accelerated (or maybe even GPU?) computations in this case, which automatically means "floating point".

But there's another difference that allows you to do this: you mostly don't have to care about exact, accurate-up-to-the-penny results. I assume most of your calculations are done to eventually arrive at a conclusion of whether to buy or sell some stock or not, and at which price. You have to take care not to accumulate too much rounding error in the process, of course, but the threshold for these errors is set by yourself, and you can give yourself a bit of leeway on this, because it's all internal stuff, mostly probabilistic and statistics-based. The only stuff that may be audited by someone else, and thus has to match the real numbers up to the last penny, are the trades you do and the money you move around - and I bet all of this accounting-style stuff is recorded using...decimal numbers :D

I work in the retail industry (think cash registers, retail sale accounting, that kind of stuff) and pretty much any legislation on this planet would obliterate us if we'd tell them that the result of the computations of our systems may be some cents up or down from the real result - the one someone would get who just scribbled the numbers on a sheet of paper and added them up manually. Our customers have to pay taxes based on the sales that they account for using our systems, and the tax calculations, as well as the entire chain of processes they use to arrive at the final numbers, are regularly audited by the respective governments. There are specific rules (of course differing by country) as to how many decimal places have to be used and how to round in the particular cases where a calculation would require more decimals. We waste an enormous amount of CPU cycles just to strictly adhere to these rules - and that is not only absolutely necessary, but also totally okay; modern CPUs can easily accommodate this in our scenario.
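In Java terms, that kind of mandated rounding boils down to picking a scale and a RoundingMode (the 19% rate below is only a placeholder for whatever a given country prescribes):

    import java.math.BigDecimal;
    import java.math.RoundingMode;

    public class TaxRounding {
        public static void main(String[] args) {
            BigDecimal net = new BigDecimal("17.13");
            BigDecimal vatRate = new BigDecimal("0.19");  // illustrative rate

            // Exact intermediate result: 3.2547
            BigDecimal vatExact = net.multiply(vatRate);

            // Rounded to 2 decimal places, half up, as a typical rule might require.
            BigDecimal vat = vatExact.setScale(2, RoundingMode.HALF_UP);  // 3.25

            System.out.println(net.add(vat));  // 20.38
        }
    }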


>I work in the retail industry (think cash registers, retail sale accounting, that kind of stuff) and pretty much any legislation on this planet would obliterate us if we'd tell them that the result of the computations of our systems may be some cents up or down from the real result - the one someone would get who just scribbled the numbers on a sheet of paper and added them up manually.

It is especially in cases like this that you absolutely should not use fixed decimal point systems to represent money. It is exactly in these circumstances that your fixed decimal point system will eventually encounter a situation where it fails catastrophically.

Intel provides an IEEE decimal floating point implementation specifically for the purpose of adhering to legal requirements; they say so right at the top of their website:

https://software.intel.com/en-us/articles/intel-decimal-floa...

If fixed-point math had solved this issue, they'd have provided a solution involving that, but fixed-point math is simply the absolute worst solution to this problem, even worse than naively using a 64-bit binary floating point number.

BigDecimal is also a viable solution, but it's entirely unnecessary and the cost really is enormous especially if you need to handle a large number of transactions. But sure, if you say performance genuinely isn't an issue go with BigDecimal.


> BigDecimal is also a viable solution, but it's entirely unnecessary and the cost really is enormous especially if you need to handle a large number of transactions. But sure, if you say performance genuinely isn't an issue go with BigDecimal.

That is exactly what we currently do, and speed of computations is definitely not an issue at all. If we have performance related problems (and we do sometimes), they have never in 10 years originated from numerical computation just taking too long, but always from other inefficiencies, quite often of architectural nature that uselessly burn a thousand times more cycles than the numeric part of the job.

BigDecimal provides exactly what we need: a no-brainer, easy-to-use, easy-to-understand and always correct vehicle for the calculation of monetary sums. It lets our developers focus on implementing the business logic, and also on making less of the described high-level failures that get us into actual performance troubles, instead of consuming a lot of mental share just to “get the computation right“ in all circumstances.


I think it's generally assumed that the unqualified term "floating point" refers to binary floating point, which is not the same as decimal floating point. You don't want to use binary floating point for representing money. Furthermore, fixed-point refers to a value with a fixed number of bits for the integral and fractional parts, which is not the same as the previously suggested system with an implicit decimal point (e.g. multiplying by 10^N for N places of decimal precision). Real fixed-point math can be a better and more accurate system if you're working within a well defined range. Regardless of the system you're using, there are only 2^N unique values that can be represented in N bits; binary floating point distributes them unevenly across a huge range. If you are representing a smaller range with a fixed-point system, it will necessarily give you better precision.


How on earth is using the wonkiest possible representation a better solution when you need precise results? In this case, even integer overflow would be a better failure mode than being a few cents off - because it would get noticed pretty quickly in comparison.

Hell, floating point isn't even a good way of doing imprecise arithmetic on larger numbers. What you would actually want is log arithmetic, but that's not what the hardware implements (even though it's simpler). It gives you roughly the same overall precision over each range but stops it from suddenly being cut in half as the exponent changes.


> pretty much any legislation on this planet would obliterate us if we'd tell them that the result of the computations of our systems may be some cents up or down from the real result

But you are not some cents up or down. Even if you are calculating with billions, you still have 7 significant digits left; that's a lot of headroom for calculations.

Are there any financially sensible calculations where large numbers (order of magnitude billions) are multiplied with each other?


> [...] only to eventually encounter all the issues that floating point was invented to solve. When they encounter those issues they then end up having to adapt their code only to basically re-invent a broken, unstable quasi-floating point system when they would have been much better off using the IEEE floating point system [...]

Would you care to elaborate on some of these issues? I'm genuinely curious.


Sure, I'll embarrass myself here and use myself as an example since this is how I actually came to even learn about this to begin with.

When I started working in finance straight out of school (not in HFT at the time), I naively accepted the dogma that money should never be represented using floating points. I mean everywhere I went I would read in bold letters don't use floating point! Don't use floating point! So I just accepted it to be true and didn't question it. When I wrote my first financial application I used an int64 to represent currencies with a resolution of up to 10^-6 because that was what everyone said to do.

And well... all was good in life. Then one day I extended my system to work with currencies other than U.S. dollars, like Mexican pesos, Japanese yen, and currencies X where 1 X is either much less than 1 USD or much greater than 1 USD.

Then things started to fail, rounding errors became noticeable, especially when doing currency conversions, things started to fall apart real bad.

So I took the next logical step and used a BigDecimal. Now my rounding issues were solved, but the performance of my applications suffered immensely across the board. Instead of storing a 64-bit int in a database I'm now storing a BigDecimal in Postgresql, and that slowed my queries immensely. Instead of just serializing 64 raw bits of data across a network in network byte order, I now have to convert my BigDecimal to a string, send it across the wire, and then parse the string back. Every operation I perform now potentially requires allocating memory on the heap, whereas before everything was minimal and blazingly fast. I feel like there is a general attitude that performance doesn't matter, premature optimization is evil, programmer time is more expensive than hardware, and so on... but honestly nothing feels more demoralizing to me as a programmer than having an application run really, really fast one day, and then the next day it's really, really slow. Performance is one of those things that when you have it and know what it feels like, you don't want to give it up.

So not knowing any better I decided to come up with a scheme to regain the lost performance... I realized that for U.S. dollars, 10^-6 was perfectly fine. For currencies that are small compared to the U.S. dollar, I needed fewer decimal places, so for Yen, 1 unit would represent 10^-4 Yen. For currencies bigger than the U.S. dollar, 1 unit would represent 10^-8...

So my "genius" younger self decided that my Money class would store both an integer value and a 'scale' factor. When doing operations, if the scale factor was the same, then I could perform operations as is. When the scale factor was different, I would have to rescale the value with lower precision to the value with higher precision and then perform the operation.

This actually worked to a degree, I regained a lot of my speed and didn't have any precision issues, but all I did was reinvent a crappy floating point system in software without knowing it.
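In rough Java terms, what I had built looked something like this (a simplified reconstruction with illustrative names, ignoring overflow and rounding):

    public class ScaledMoney {
        final long value;  // integer amount
        final int scale;   // implied decimal places, i.e. the amount is value * 10^-scale

        ScaledMoney(long value, int scale) {
            this.value = value;
            this.scale = scale;
        }

        // Rescale the coarser operand to the finer scale, then add the raw integers.
        ScaledMoney add(ScaledMoney other) {
            int target = Math.max(this.scale, other.scale);
            long a = this.value * pow10(target - this.scale);
            long b = other.value * pow10(target - other.scale);
            return new ScaledMoney(a + b, target);
        }

        static long pow10(int n) {
            long r = 1;
            for (int i = 0; i < n; i++) r *= 10;
            return r;
        }
    }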

Eventually I ended up reading about actual floating points and I could see clearly the relationship between what I was doing and what actual experts had realized was the proper way to handle working with values whose scales could vary wildly.

And once I realized that I could then sympathize with why people were against using binary floating point values for money, but the solution wasn't to abandon them, it was to actually take the time to understand how floating point works and then use floating points properly for my domain.

So my Money class does use a 64-bit floating point value, but instead of (double)(1.0) representing 1 unit, it represents 10^-6 units. And it doesn't matter what currency I need to represent: from currencies as small as the Russian ruble to ones as large as Bitcoin, it all just works and works very fast.

64-bit doubles give me 15 digits of guaranteed precision, so as long as my values are within the range 0.000001 up to 999999999.999999 I am guaranteed to get exact results.

For values outside of that range, I still get my 15 digits of precision but I will have a very small margin of error. But here's the thing... that margin of error would have been unavoidable if I used fixed decimal arithmetic.

Now I say this as a personal anecdote but I know for a fact I'm not the only one who has done this. I just did a Google search that led to this:

https://stackoverflow.com/questions/224462/storing-money-in-...

The second top answer with 77 points yells in bold letters about not using floating points as a currency and suggests using a scheme almost identical to the one I described, where you store a raw number and a scaling factor, basically implementing a poor-man's version of floating point numbers.


Thanks for the detailed explanation of your thought process.

I guess you fall into the few percent of developers who are NOT meant to be addressed by the general rule of "don't use float for money". Like all "general rules" in software development, it is intended to guide the >90% of devs who don't want to, and sometimes also aren't able to, fully grasp the domain they're working in and the inner workings of the technology they use. The majority of devs need simple, clear guidelines that prevent them from making expensive mistakes while wielding technology that's made up of layers and layers of abstractions, some (or even most) of which are entirely black boxes to them. "Playing it safe" often comes with other shortcomings, like worse performance, and if those are not acceptable in a specific scenario, you need a developer who fully grasps the problem domain and technology stack and who is thus capable of ignoring the "general rules", because he knows exactly why they exist and why he won't get into the troubles they are intended to protect you from.

I have long thought that every developer under the sun should strive to get up to this point, and I still think that it is an admirable goal, but I came to understand that not all developers share this goal, and that even if someone tries to learn as much as possible about every technology that he comes in contact with, he will never be able to reach this state of deep understanding in every technical domain imaginable, as there are just too many of them nowadays. We all, no matter how smart we are, sometimes need to rely on "general rules" in order to not make stupid mistakes.


You know there's a numeric type in Postgres? Not quite arbitrary precision, but large enough for pretty much all practical purposes: "up to 131072 digits before the decimal point; up to 16383 digits after the decimal point"

https://www.postgresql.org/docs/current/static/datatype-nume...


NUMERIC (without a precision or scale) is a variable-length datatype, and suffers all the performance problems mentioned above.

If you specify a precision and scale, performance of NUMERIC improves quite a bit, but now you can’t store values with vastly different magnitudes (USD and JPY) in the same column without wasting tons of storage on each row. You’re back to square one.


Thanks a lot for this answer -- the real-world scenario including your thought process is very insightful.


I worked on the exchange side (CME Group) and we used BigDecimal (and an in-house version that worked in a very similar way) quite often... Worked just fine for us as far as I know.

That being said, I agree with your sentiment that there are faster/more efficient ways to accomplish this. For something like smart contracts, I'd think they should be able to abstract this line of reasoning into an API that's consumable by those of us who aren't as familiar with it. Or at least it's worth having the option.


Additionally, the most widely used programming language in finance, Microsoft Excel, is based entirely on IEEE binary floating point arithmetic (though with a few hacks on top [1]).

[1]: https://stackoverflow.com/a/43046570/392585


True, but the number of weird mistakes due to floating point inaccuracies that have happened in Excel spreadsheets is probably too large to be accurately representable in a 32-bit float.


What you are saying may be true of finance. But it's not true of accounting, in which exact figures are paramount.


I think a good system to look at copying is the Czech koruna or the Japanese yen: real-world examples of currencies that don't use decimal values. This simplifies the math involved, and by having each unit carry a very small value you can replace cents, so you don't have money "disappear" due to being rounded off.


True, but the extra code around those solutions costs gas, of which you're looking at around 500 lines' worth.

How long is BigDecimal.class?


Providing a decimal type would of course not be the job of the developer of a smart contract - I would expect that the smart contract platform would anticipate the necessity of convenient decimal number handling and thus provide a suitable abstraction.

In the case of Ethereum, it could either be provided in the language or on the virtual machine level. I'm not that much into the EVM's inner workings to judge whether a machine-level implementation would be a good idea from an architectural viewpoint, but it would definitely deliver the best possible performance and lead to the smallest amount of contract code, thereby saving on gas to deploy the contract. As a second-best solution, the language could provide such an abstraction - it would at least be able to apply optimizations when compiling the code down to EVM opcodes, which an implementation purely on the contract level would not be able to do.


The EVM is a 256-bit architecture, so it probably doesn't need a BigDecimal equivalent for financial transactions.


Likely you don't need a generic implementation of Decimal.

Whatever money you want to represent, it has an "atomic value".

For instance, when working with dollars store cents. When working with Eth, store wei. etc.

I can't think of a use case for money that needs decimals, except maybe computing ownership percentages, but that should never be stored, only computed.

Anything that needs to be converted is a "front end" view; all computations should use the atomic store of value, so no conversion is needed in contracts.


You might need to work with fractions of "atomic values" to get the desired level of accuracy. So you have to store not logical cents, but 1/100ths of a cent or something like that. It's much easier and more straightforward to just use a decimal-like type.


That's what happens in e.g. the Maker stablecoin project. Decimal fixed point to give sub-wei precision when prorating per-second compounding fees. Other than the normal simple arithmetic operations, we use "exponentiation by squaring" to take a decimal fixed point raised to an integer power.
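For illustration, that kind of fixed-point exponentiation by squaring might look like this in Java (a sketch assuming a 10^27 scale, not the actual Maker code):

    import java.math.BigInteger;

    public class FixedPow {
        // Fixed-point base unit: values are integers scaled by 10^27.
        static final BigInteger ONE = BigInteger.TEN.pow(27);
        static final BigInteger HALF = ONE.divide(BigInteger.valueOf(2));

        // Multiply two fixed-point values, rounding the result half up.
        static BigInteger rmul(BigInteger a, BigInteger b) {
            return a.multiply(b).add(HALF).divide(ONE);
        }

        // Raise a fixed-point value x to an integer power n by squaring.
        static BigInteger rpow(BigInteger x, long n) {
            BigInteger result = ONE;
            while (n > 0) {
                if ((n & 1) == 1) result = rmul(result, x);
                x = rmul(x, x);
                n >>= 1;
            }
            return result;
        }

        public static void main(String[] args) {
            // A per-second rate of 1.000000001 (in 10^27 fixed point), compounded over one day.
            BigInteger rate = ONE.add(BigInteger.TEN.pow(18));
            System.out.println(rpow(rate, 86_400));  // roughly 1.0000864 * 10^27
        }
    }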


Yep, when doing financial stuff, those half cents can count.


> What money do you want to represent, they all have a "atomic value".

This is mostly not true.

> For instance, when working with dollars store cents.

While external transactions often must occur in cents, internal accounts and unit prices often involve smaller amounts. If you don't believe me, visit any US gas station: the prices will be in mils, not cents (and will usually be one mil less than some full-cent value per gallon). Atomic units for financial applications are application-specific, if there is an appropriate value at all; they aren't trivially determinable by looking at the base currency.

In some cases you really want an arbitrary precision decimal (or even rational) representation.


>For instance, when working with dollars store cents.

How does this work if you are, say, coding for a gas station where the price is in 1/10ths of a cent?

Alternatively, how does your banking application handle adding 10% interest to a bank account with 1c in it?


Some very good points. I recall that the mainnet once had a gas-per-block limit of 4.7 million before it was increased. The increase is something that is voted on by the miners, and it has been fluid over the years. (Btw, it is very similar to the block size limit in BTC, although it limits storage + CPU usage rather than just storage like in BTC.)

About the point on using floats: one should never use floats when working with money, because floats are not precise and result in rounding errors (e.g. 0.1 + 0.2 results in 0.30000000000000004; see http://0.30000000000000004.com for details). One of the simplest approaches to solve it is to work with the smallest units, so if working with dollars you use cents, which means you can use integers and avoid the rounding issues. That is how Solidity currently deals with it, by working with the smallest unit of Ether, which is 'wei'. Some of the units are listed here: https://etherconverter.online


And it's exacerbated by the fact that most tokens use 18 decimal places of precision. So "1" would be 1000000000000000000. As a programmer, I just found it harder to reason about numbers like that.

I ran into problems testing my solidity contract with javascript, because javascript isn't great at big numbers. Or division.
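For comparison, in a language with big-integer support the usual approach is to keep everything in base units and only move the decimal point for display; a Java sketch with illustrative names:

    import java.math.BigDecimal;
    import java.math.BigInteger;

    public class TokenUnits {
        // Many ERC-20 style tokens use 18 decimals: 1 token = 10^18 base units.
        static final int DECIMALS = 18;

        public static void main(String[] args) {
            BigInteger balance = new BigInteger("1500000000000000000");  // 1.5 tokens

            // Integer math on base units stays exact...
            BigInteger doubled = balance.multiply(BigInteger.valueOf(2));

            // ...and only the display layer shifts the decimal point.
            System.out.println(new BigDecimal(doubled).movePointLeft(DECIMALS));  // 3.000000000000000000
        }
    }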


Dang you got there 3 minutes before me!


Don't worry, I'm sitting in the same boat ;-)


> You can break up your code into multiple contracts, but the tradeoff is an increased attack area.

This is the way to get programs over 500 lines, and it doesn't increase the attack area as much as you'd think, since you can essentially hard-code the addresses of outside contracts into your code--it's not a combinatorial explosion.


> and it doesn't increase the attack area as much as you'd think

You should call Gavin Wood and tell him that :D


That makes sense, thanks. I hadn't thought of making it hard-coded (or just set at construction). The examples I'd seen were "upgradable" tokens.


Yeah, I did a lot of upgradeable stuff early, but have since moved to more hardcoded stuff for that exact reason.


A question about the 500 lines of code and tooling limitations. Is this something that is inherently an issue of the underlying technology or do you predict that this is something that is likely to be improved upon over time? And if you think it'll be improved, are we years away from big improvements? Less?


The 500 lines of code is, to be clear, just a guesstimate. It really is related to how those lines of code are compiled into Ethereum opcodes. Maybe you can get a thousand lines of code. My point is that it's pretty small, compared to what most developers are used to.

The max gas limit is voted on by Ethereum miners. They recently raised it to 6.7 million (I believe it was 4.5 million not so long ago). The gas is what is used to store the contract (thus increasing the blockchain size) or to execute code (thus increasing the workload on the miners), so it's up to them.

More Eth = more $$, but they have to weigh that against the downsides. If the price of Eth keeps going up, they have less incentive to increase the gas limit, because they get more money for doing the same work.


Aside from the gas limit there's a hard 24KB limit on bytecode size, to deal with a denial-of-service vulnerability that cropped up last year. Possibly that will be improved in some future upgrade.


re: tooling

I've been using Truffle too. My biggest issue when getting started was the cognitive overload. If you want to try Truffle, I wrote a couple of simple guides that help you deploy your first Ethereum smart contract to a test net: https://blog.abuiles.com/blog/2017/07/09/deploying-truffle-c...


Interesting point on the lack of floating-point support. I've dealt with storing fiat in real-world software, and using floating point is a big no-no because it is only an approximation by its very design. We therefore always stored money as integer cents/pennies etc. How is this approached with digital "money" where there is no smallest divisible unit?


Most of the time by using 18 decimals. One "Ether" is 10^18 "wei", the smallest unit of currency. Most cryptotokens built on ethereum also use that many units...but not all.

Bitcoin is, IIRC, 8 units of precision.


Ah that makes sense - if Bitcoin keeps going at its present rate could 8 units of precision simply not be enough? Would a hard-fork be the solution to this?


If one bitcoin were worth $100 million, then one "satoshi" (the smallest unit in bitcoin) would be worth a buck. So, 8 units of precision is probably plenty.


I'm not sure if there are infinitely divisible currencies? The cryptocurrencies I know (Ethereum, Bitcoin) _do_ have a smallest divisible unit (wei and satoshi respectively).


For passing strings between contracts you can return the string as bytes32 and let the remote contract query a function to access it.


Wait you can’t call another contract’s function (from a contract) if it takes a string as argument?


just curious, what are you doing with ETH contracts? working on a product or just hobby?



