Hacker News new | past | comments | ask | show | jobs | submit | victor82's comments login

The solution exposed don't seems competitive enough to solve with good rank the problem exposed here https://highload.fun/tasks/14

That site is really good to play with SIMD code.


The solution codes are not available as you can continuously update your solution.


What is actually the task here? Parsing is clear and it looks like one has to sum all parsed timestamps, but what does the additional comment about the shortest time mean?


It means that your solution is scored on how fast it runs.

Frequently we end up with solutions that use tricks that would not be applicable to the n=1 problem in TFA ("parse 1 timestamp") when we try to get the fastest solutions for "find the sum of all these timestamps."


How can I see Yuriy’s top solution for this task?


You can ask him but he probably won't share


Kind of a weird leaderboard if you can’t even check to see if answers satisfy correctness. I’ve seen plenty of leaderboards where the top answers would fail cases that weren’t covered by the leaderboards testing.


I think it's not such a big deal because the input generators are usually pretty good and because most solutions that give up some correctness for performance can pretty cheaply detect that they done goofed and fall back to a slow path where they start over and do it some other way. This ends up being useful because scoring is the median runtime of 3 successful runs (so if you fail one of the 3 then your submission fails completely). It also means that probabilistic solutions that would be correct in non-adversarial environments (like using a bloom filter or N bloom filters tuned to give an exact result given the input constraints with probability >99.9%) are admissible.


This particular solution does satisfy correctness, although I can't share the source code (as the competition is ongoing). Feel free to provide your input data and I'll run it to compare with your expected result.


Yeah, most of the leetcode top solutions are a hardcoded table of answers for the scoring questions followed by the general solution.


Is the code for some of the solutions available? Thanks for the link. That's a fun place.


Seems there is not bytearray implemented, can't test further :(


It is not explained why points 31 to 36 are valid, unlike HDL synthesis tools, C compilers are pretty deterministic, same input same output, except timestamps, etc.

If an UB adopt X behavior one time, a reproducible build will take same X behavior the next time.

Of course I am not negating the fact that this undefined in the first instance nor condoning the UB usage.


That is the observed behavior of the compiler not the defined behavior. (and while there’s some logical reason to assume that will likely continue to hold, the list is falsehoods programmers believe about undefined behavior, not observed behavior.)


(Post author here.) Agreed with the parent comment. To restate the same thing in another way in the hope of avoiding confusion:

It isn't a bug in the compiler if the compiler is non-deterministic in the presence of UB. Such behavior does not violate any guarantee provided by the compiler or language spec.

In fact, it's still not a bug if the compiler is non-deterministic, full stop. Many (most?) compilers have non-deterministic behavior due to a variety of factors. To get a sense of what's involved in getting determinism, look up some articles on why it's difficult to get reproducible builds working. It's a lot harder than it might seem at first.


Determinism is a QoI issue. It is an important QoI feature though and I think most compilers meant for production use at least strive to implement it. I guess things like parallel, distributed compilation make it quite hard.


"but it worked fine before" talks about the observed behavior


Yes. That is the sub-category of “among all the falsehoods about UB, these are the falsehoods that can be categorized as [improper reliance on] OB”.


> C compilers are pretty deterministic, same input same output, except timestamps, etc.

"Sometimes gcc will opt to use a randomized model to guess branch probabilities, when none are available from either profiling feedback (-fprofile-arcs) or __builtin_expect. This means that different runs of the compiler on the same program may produce different object code." (from the gcc manual)


Looked into this and it appears it was not the case; the docs were confusingly written, and the docs appear to have been rewritten some time ago

https://marc.info/?l=gcc&m=102068997020453&w=2


I was curious about that as well. If it is just an extension of 37-40, then that makes sense - since UB may or may not effect different runs of the same executable, then certainly the same is true of different runs on different builds of the same executable. But that is a confusing way to put it - it sounds like it is saying that reproducible builds are not possible, which I am skeptical of.


Location: Santiago, Chile

Remote: Yes

Willing to relocate: to USA/Singapore/Netherlands/Australia/UAE/Germany

Technologies: FPGAs, Verilog, AmaranthHDL, C/C++, Python, Golang

Résumé/CV: https://www.linkedin.com/in/victor-mu%C3%B1oz-79955093/

Github: https://www.github.com/vmunoz82/

Email: who_hn11@avoid.contact

Looking for HFT/ultra low latency, RISC-V, RTL or high scalability related position.


You can avoid using abstractions and write it directly on the silicon, it's not really difficult, it turns into a bunch of muxes, registers and adders (as shown in the link below) and solves the problem in just 333 clock cycles, using a minimum of power

https://gist.github.com/vmunoz82/49de4c63bee1768283162ec5406...

  class Euler(Elaboratable):
    def __init__(self):
        self.output = Signal(19)

    def elaborate(self, platform):
        m = Module()

        count5 = Signal(3)
        c3, c5 = Signal(17), Signal(18)

        cond3, cond5 = (c3 < 1000) & (count5 != 0), (c5 < 1000) & (count5 < 3)

        m.d.sync += count5.eq(Mux(count5 == 4, 0, count5+1))
        m.d.sync += [
            c3.eq(c3+3),
            c5.eq(Mux(cond5, c5+5, c5))
        ]
        m.d.sync += self.output.eq(self.output +
                                   Mux(cond3, c3, 0)+Mux(cond5, c5, 0))
        return m


Amaranth (previously nmigen) is super awesome declarative ways to build circuits. I like to think of it as a superior verilog alternative that lets you do RTL design in Python. Being able to use python constructs to dynamically build logic like in the example is super useful. You can integrate it with other things like scikit-learn, numpy, and script to build things like filters and hardware accelerators.


This is pretty neat! Is there a way to have arbitrary input for the number of cycles and actually emit the answer? It looks like the answer is being read in software from a register, if I understand the .vcd output correctly.


Well, that code is module without any context (ie: connected to nothig), it just have 2 inputs (clk and rst), and an "output" with the current sum, after it reach the desired value it keep stuck there (until reset).

So as the simulation code shows, it just tick the clock (with the "yield" statement) and read the "output" register and print it.

Of course you could connect this "Project Euler 1 accelerator" to a CPU, a SoC or whatever, or just connect LEDs directly to "output"


I'm not sure if I think this superiour to Verilog. It's way less readable but it does have the advantage of having insane metaprogramming ability. It's basically a circuit templating language. Really sucks the syntax is so unreadable.


According to golem.de's report, Prodigy has its own VLIW ISA, and Qemu achieves compatibility with the other mentioned ISAs.


  Location: Chile
  Remote: Yes
  Willing to relocate: Could be, depends on the place and project
  Technologies: C, Golang, Python, VHDL, Verilog, Amaranth HDL, FPGAs
  Résumé/CV: https://www.linkedin.com/in/victor-mu%C3%B1oz-79955093/
  Email: victor at avoid dot contact
  Github: https://github.com/vmunoz82
 
Hi! I am Victor, I have 20 years of experience working in technology, in this moment I enjoy working developing algorithms directly in RTL (FPGAs), I love to design scalable stuff from scratch.

If you are looking for hands on co-founder engineer, CTO, or want to implement algorithms in hardware, would be good to get in touch.


Thank you for sharing you all digital radio project!

It is a nice example of nMigen usage, sadly there are not many ultra-cheap FPGA boards with 5GHz SerDes to try it.


Thank you for the write-up, I love the way you use CMBC to construct your algorithm, seem like a way to synthesize code :) Thanks also for point CMBC, I was looking for something like that for years!


If you are already familiar with Python I could recommend start with this tutorial: https://github.com/RobertBaruch/nmigen-tutorial


Also, icestudio (https://icestudio.io/) is a gentle way to get introduced with minimal knowledge to the verilog side of things. As others have posted, the learning curve for HW can be steep and development tools tend to be in the dark ages as compared to those for SW development. eg. you will spend a lot of time hitting your head on a wall with debugging HW if you arent careful to follow a process of linting and simulating your design well before going to HW.

nMigen is highly recommended.


Excellent. Thanks to you both!


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: