An ARM killer from IIT-M? (factordaily.com)
187 points by prando 7 months ago | 124 comments

To be honest though, building a processor these days is not exactly difficult. That is especially true if you start with someone else's ISA and they have already created a GCC or LLVM back end for it. I was at Sun during the development of SPARC, and as part of the Systems group we got to see a lot of the trade-offs up front, but these days transistors are not nearly so scarce. If you stick to 30-50 MHz for your first round you can simulate pretty much power-on to shell prompt in a reasonable amount of time. Then there are the process-specific issues that TSMC or GlobalFoundries or whomever will help you with, translating your HDL into their most reliable node (probably 45nm or 90nm at the moment). Then you'll build a test chip, and it will likely do everything you want it to do, and it will cost 10x what an equally powerful ARM chip does, so you will really, really want to use your chip instead of that one if you're going to design it into something. And that something has to be popular enough that you sell at least a million of them, otherwise you're going to eat a lot of non-recurring engineering (NRE) cost.
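That volume threshold is just NRE amortization arithmetic. A minimal sketch, with entirely hypothetical numbers (the $3M NRE bill and $1 die cost are assumptions for illustration, not figures from the comment):

```python
def unit_cost(nre_dollars: float, per_die_dollars: float, units: int) -> float:
    """Effective per-chip cost once one-time NRE is spread over the volume shipped."""
    return per_die_dollars + nre_dollars / units

# At a million units the NRE adds only $3 per chip...
assert unit_cost(3e6, 1.0, 1_000_000) == 4.0
# ...but at 10k units the NRE dominates the cost of the part entirely.
assert unit_cost(3e6, 1.0, 10_000) == 301.0
```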

It is certainly possible, but it is a long game and you have to survive the early years. Go back and read the history of ARM (and Acorn), Intel (and IBM), Motorola (and Sun and Apple), and PowerPC (and Cisco). Then read the history of the Z8000, the NS32032, the AMD 29000, and the TI 9900. When you look at the history from where we sit today you will see that building a new CPU is the easiest thing in the world; staying alive until it is relevant is exceptionally difficult and requires quite a bit of luck in addition to good design.

I wish these guys a lot of luck, I would love to see a fully open architecture be even half as successful as ARM has been. They have to walk into this with their eyes open though, it is not going to be easy.

That is simply not true.

Building a CPU can be "easy" if you want to build something that works. If you want to build something that is power efficient and, at the same time, quite performant, then the complexity of the problem escalates very quickly.

Sticking at 30-50 MHz? What is this even supposed to mean? Digital hardware design hardly takes clock speeds into consideration, except during physical implementation in the later stages of the hardware development.

TSMC or GlobalFoundries will help translate HDL into a reliable node? Do you even know what you are talking about? This "translation" that you refer to is called synthesis and physical implementation, and it is done by the design house. The foundry only provides the logic cells of their technology, and they manufacture the final die that is going to ship in the product.

The CPUs you mentioned were simple microcontrollers; they were not designed for heavy processing tasks. They don't even have a branch predictor, which is fundamental for any significant performance.

Nothing about your post made any shred of sense in terms of hardware design.

I'm not sure why pedroaraujo is being downvoted, other than in response to his criticism of HN top commenter Chuck McManis. Pedro is correct; toy CPUs, like what Chuck mentioned building, are so trivial that undergraduate students can build functioning models of them in a few days or hours. They are quite literally thousands of times less complex than the CPUs (even RISC arches) that are discussed in the article.

I also concur with Pedro on the clock speed part - clock speed has absolutely nothing to do with your hardware design until you have written all the HDL, simulated its behavioral characteristics, actually synthesized the logic, completed the analog and I/O components of the design, and are actually preparing for tapeout. There's a reason only AMD and Intel make large CPUs for the consumer market, and it's not because college students are too busy.

I'm going to fade into the grey background for this, but ChuckMcM's comment is yet another disappointing step toward the total calcification of this forum. It's not a commentary on the article; it simply states, lazily and ignorantly, that whatever it is they're attempting at IIT-M, it can't be too hard. What an incredibly condescending and blasé thing to say, especially on a public forum where what you say can't be erased.

I'm actually quite glad to see another implementation of RISC-V from outside the US. The horde of nearly identical ARM cores on the market is terribly uninspiring.

> clock speed has absolutely nothing to do with your hardware design until you have written all the HDL, simulated its behavioral characteristics, actually synthesized the logic, completed the analog and I/O components of the design, and are actually preparing for tapeout

So you're going to do all of that without even thinking about your target clock rate? Unlikely.

But your comment I can respect, as you explained things politely.

> I'm actually quite glad to see another implementation of RISC-V from outside the US. The horde of nearly identical ARM cores on the market is terribly uninspiring.

Totally agree

"ChuckMcM's comment is yet another disappointing step toward the total calcification of this forum. It's not a commentary on the article; it simply states, lazily and ignorantly, that whatever it is they're attempting at IIT-M, it can't be too hard. What an incredibly condescending and blasé thing to say, especially on a public forum where what you say can't be erased."

I suppose it could be deleted though. I would hate to be a contributor to calcification.

One of the things that stuck out to me about the article was that every one of their stated goals would be met "instantly" if they bought an ARM architecture license. Such licenses aren't cheap, but in terms of man-years of effort they aren't expensive either.

As a result of owning such a license they could make processors of their own design which leveraged the ARM infrastructure (this is what Apple does, for example). That could meet their 'openness of implementation' goals, their 'made in India' goals, and their 'range of architectures from controller to server' goals. But they aren't doing that.

I understand you read it as condescending, but my intention was to illustrate the part of the problem they missed in the grand vision they wrote about in the article. How would you suggest I say that such that you would not hear it as a dismissal?

It wouldn't really fit with the PR narrative for them to just license IP from ARM, and they probably wouldn't develop nearly as much technical knowledge as they would developing the architecture themselves.

Aside, I hope my comment didn't seem too abrasive. It's hard not to get attached to something you work closely on most of your life.

I expect you are exactly correct on the narrative angle. Which is sad because there are so many great stories of people who worked on someone else's technology and then branched out and started their own. The Fairchild => Intel story is just one such example. And it makes me sad when the 'narrative' interferes with the goal. If the goal is to develop a vibrant CPU/semiconductor capability in India then that effort is cultivated by minimizing risk where possible and leveraging other work. And leveraging ARM would be a great way to train up on all the things you have to have/do in order to support a range of offerings for a given ISA.

As for abrasiveness, I read it as a passion more than insult. I also read it as a signal of frustration and I like to understand the roots of the frustration in order to improve things.

> I'm not sure why pedroaraujo is being downvoted other than in response to his criticism of HN topcommenter Chuck McManis.

Because he phrased too much of his disagreement as personal attack. That's against the HN rules.

And because he took the example of chips from two generations ago (which is about how far back you have to go to find non-mainstream but reasonable chips) and said that they "were simple microcontrollers". IIRC, the NS32032 was close to equivalent to the Intel offerings of the time. Didn't even have a branch predictor? Nobody else did in those days, either.

pedroaraujo's points get lost in the vitriol and in trying to apply today's expectations to examples from 20 years ago. Most of the discussion seems to be centering on his points, which is good. But the tone of the post is certainly downvote-worthy.

It seems neither of you has read the article itself. The new CPU design is not about the high-end CPUs you expect from Intel or AMD. It is about CPUs for the IoT. We are talking about Cortex-M-like CPUs you find in microcontrollers from Infineon, NXP, Renesas, and the like. BTW, there is competition to the ARM design: the microcontrollers from NXP (formerly Freescale), which have their own CPU design heavily used in automotive.

IITM will start with the tapeout of a microcontroller-class (C-class) CPU. But IITM is also designing mobile-, server-, and HPC-class CPUs.


I am not sure where we have indicated that the focus is only on IoT-class cores. The E and C classes are aimed at that segment, and this segment probably will account for the vast majority of cores shipped, so they are the first ones to be shipped. The I class, for example, an early version of which is online, will be a full out-of-order core - quad-issue, dual-threaded. Server, HPC, and vector-based cores are equally important for us. The core road-map is clearly outlined on the codeline.

The comment applies both to high-end CPUs and to microcontrollers. Power and performance are even more critical on a microcontroller than on a high-end CPU.

And yet you yourself said:

> The CPUs you mentioned were simple micro-controllers, they were not designed for heavy processing tasks, they don't even have a branch predictor which is something fundamental to have some significant performance.

So it seems his comparison was fair, after all? Given that the article is about microcontrollers too.

Do you think ARMs don't have branch predictors or multi-level caching? Did you know some supercomputers are being designed around ARM cores? x86_64 is not king anymore.

When you post a rude reply about a topic not many people have expertise in, the only thing up/downvoters have to go on is how you carry yourself.

My first gut reaction to his reply was "troll."

Funny, my reaction to the first comment was "douchebag"...

That's fair. I suppose I overestimated the significance of Chuck's username.

You have been exceptionally rude to a commenter who has not been rude at all, but has given some insight into the true difficulties of making a viable CPU business.

> What is this even supposed to mean? Digital hardware design hardly takes clock speeds into consideration

You purport to correct him and then you drop this phrase, which is questionable at best and wrong at worst.

You do design something differently for different clock speeds (pipelining, power consumption, delays across different parts of the circuit). This is not anything too fancy; this is digital design/IC design 101.

Your post makes a lot less sense than his.

For a well known example of how this matters, consider the Pentium 4. Intel designed it with a deep pipeline on the assumption that they'd be able to reach higher clock speeds than was actually possible. At the intended 10GHz, the Pentium 4 would have had impressive performance for the time. But they couldn't achieve that clock speed, so they had to revert to the simpler pipeline design first used in the Pentium Pro and continue development from there.

> delays across different parts of the circuit

Guess that's not "digital". After all, I expect, say, GIMP to run on my old Linux 2.3 ThinkPad or on an orders-of-magnitude faster i7, but of course with respect to the feasibility of heavy effects and huge memory loads.

> Sticking at 30-50 MHz? What is this even supposed to mean? Digital hardware design hardly takes clock speeds into consideration, except during physical implementation in the later stages of the hardware development.

I think what the OP means is that you can simulate in software a 30-50 MHz version of your own CPU, without having the hardware ready, so you can start testing whether, e.g., Linux can boot.

I thought that what OP meant is that if you can only run the simulation for a relatively simple CPU at, say, 1 MHz, you can run and debug firmware for a 30 MHz CPU effectively, even if it's a little annoyingly slow - your 1-second init routine takes 30 seconds, but that's OK. If you have a 1 GHz CPU, whose more complicated architecture can only be simulated at 500 kHz, then a simulation is really tedious. A 20-second boot IRL of, say, Android takes half a day on your simulated processor.
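The slowdown arithmetic behind those numbers is straightforward; here's a small sketch using the figures from the comment above:

```python
def sim_wall_clock_s(target_hz: float, sim_hz: float, guest_seconds: float) -> float:
    """Wall-clock seconds needed to simulate `guest_seconds` of a CPU
    clocked at `target_hz` when the simulator retires cycles at `sim_hz`."""
    return guest_seconds * (target_hz / sim_hz)

# 30 MHz CPU simulated at 1 MHz: the 1-second init routine takes 30 s.
assert sim_wall_clock_s(30e6, 1e6, 1.0) == 30.0

# 1 GHz CPU simulated at 500 kHz: a 20-second Android boot takes
# 40,000 s, i.e. about 11 hours -- "half a day".
assert sim_wall_clock_s(1e9, 500e3, 20.0) == 40_000.0
```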

That's not to say there's not someone coming in this morning to step through the overnight simulation they ran, but it's a lot easier to work with when your simulation runs nearer real time.

For what it's worth, this captures it well. A processor design in the 30-50 MHz range can be usefully simulated on hardware on the engineer's desk (a workstation-class PC).

Ok then.

We clearly see things a bit differently, but I also want to separate what I said, and what you read. I don't think they were necessarily the same thing. My intention here is to share with you some of the thinking behind what I wrote.

The article states, "The Shakti project is based on the RISC-V ISA started at UC Berkeley. ... The first chip of the Shakti series will be a C-class controller chip, an entry-level processor, which would find use-cases in IoT, smart cards, and security applications."

The current CPUs in this space (IoT, Smart Cards, and Security Applications) are the ATMega AVR series, the ARM Cortex M0/M0+ and a handful of others like the PIC32 and the MSP series from TI.

Designing the logic for such a CPU is an undergraduate exercise. It is "easy" in the sense that, if you have been taught VHDL or Verilog and someone hands you a RISC-style instruction set architecture which defines the registers, instructions, and flags, a group of students should be able to get something working in a semester which simulates on a VHDL test bench. Here is the call-out for Stanford's EE108b class from the Fall of 2014[1], which states --

"EE108b introduces students to architecture and design of efficient computing and storage systems. The main topics include: overview of key techniques for efficient systems, efficiency metrics (performance, power and energy, cost), hardware/software interface (instruction set, data and thread-level parallelism), processor design (pipelining and vectors), custom accelerator design, cache memories, main memory, basic I/O techniques, and architectural support for operating systems. The lab assignments involve the detailed design of a processor on a FPGA prototyping system. The programming assignments involve optimizing the efficiency of image processing applications on a portable computing platform."

I wasn't dismissing the challenge as something kids in nursery school can do, but I do see the challenge as something of a 'solved problem' for new EE's.

You asked about "Sticking at 30-50 MHz? What is this even supposed to mean? Digital hardware design hardly takes clock speeds into consideration, except during physical implementation in the later stages of the hardware development." That might make a bit more sense in the context of my previous comment about it being a well-solved problem.

When you are building chips in FPGAs and ASICs, the "challenge" after you have it running in a test bench is getting timing closure. In case you are new to this, that means making sure that signals from different gates arrive in time to be used in the next stage of the operation. This is important because the propagation of signals takes finite time and, assuming you are doing synchronous design, you want all the inputs settled, given their worst-case timing, before the clock edge looks at them and does the next step.
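That constraint reduces to simple worst-case path arithmetic. A minimal sketch; the delay numbers below are made up for illustration, not taken from any real process:

```python
def f_max_mhz(t_clk_to_q_ns: float, t_logic_ns: float, t_setup_ns: float) -> float:
    """Maximum clock frequency of a synchronous path: the worst-case
    clock-to-Q, combinational-logic, and setup delays must all fit
    inside one clock period."""
    period_ns = t_clk_to_q_ns + t_logic_ns + t_setup_ns
    return 1000.0 / period_ns  # period in ns -> frequency in MHz

# Hypothetical path: 1 ns clock-to-Q + 15 ns of logic + 0.5 ns setup.
# The design closes timing only up to roughly 60 MHz; shaving logic
# levels or wire delay off the critical path is what raises f_max.
assert 60.0 < f_max_mhz(1.0, 15.0, 0.5) < 61.0
```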

It has been my experience that when people do their first FPGA designs and the tool tells them how fast they can reliably run, they are disappointed. So they spend some amount of time floorplanning and organizing their layout to speed that up. That problem is exacerbated in silicon because the velocity factor of polysilicon wires is much lower than that of metal wires. Typically, to keep down costs and complexity, you will want to run your signals on the metal layers; but sticking to that can spread out your gates, and size is an issue as well when it comes to cost. There is a reason that the majority of the chips for the "IoT, Security, and Smart Cards" market top out at clock rates under 50 MHz. It is a 'sweet spot' for cost versus chip complexity versus circuit design. If you want to build cost-effective chips in that market you might find yourself in the same design space.

As for design houses vs. foundries, I am sure you can use your own design house if you choose to; I expect that if you use the foundry's [2] you might avoid the delays associated with disagreements between what the design house feels the process should be able to do and what the foundry feels it can do. There is nothing quite so disappointing as a design team saying the foundry "should" be able to do something and the foundry saying that it is "impossible." If you want some interesting stories around that, you can read the history of the SPARC 10 and Sun's interaction with Texas Instruments (the foundry). But my assumption was that any fabless effort today would go to the design services of the foundry it chose, to minimize both schedule risk and the risk that the two won't get along. It may still be possible to show up at a foundry with a shipping box of masks and say "here, make me some of these," but it would not be my starting strategy.

And then you said (and this was me being unclear), "The CPUs you mentioned were simple micro-controllers, they were not designed for heavy processing tasks, they don't even have a branch predictor which is something fundamental to have some significant performance."

Perhaps the reason we saw this differently is that I have seen multi-billion-dollar companies depending on a computer without a branch predictor to run their back-office business. Before my time, the machines didn't even have virtual memory. As a result, I don't think the first CPU out of the gate is going to have multi-issue superscalar pipelined performance. I don't think the folks in the article do either. You see, they mentioned they were starting simpler, like ARM did, like Intel did, like IBM did, like Sun and Apple did, and like every single company that has ever designed and deployed a new CPU design. It is absolutely a huge undertaking to improve the micro-architecture of a modern Pentium or the POWER9 or the ARM A5x 64-bit architecture. If IIT-M gets to that point, that will be at least 10 years from now, maybe 15 or 20. And understanding that as I do, I know that they need to get traction with their systems early on.

> "Nothing about your post made any shred of sense in terms of hardware design."

I can certainly agree with that, because what IIT-M is taking on is not hardware design. All of the hardware design between now and their first $100M is all well understood and doesn't push the envelope at all. They are trying to bring a new system into the world, one that has an ecosystem of software, peripherals, and a variety of CPU implementations. The article gave me the impression that the author of the article felt that if they just built a new CPU it would usher in a new era of computers from India. I think that is a necessary but insufficient step along the road, and that there will be many challenges unrelated to hardware design that will threaten the success of this effort decades before they have engineers wondering which form of register coloring provides fewer bubbles in a hyper-threaded execution pipeline.

[1] https://lagunita.stanford.edu/courses/Engineering/EE108b/Win...

[2] http://www.tsmc.com/english/dedicatedFoundry/services/design...

AMD 29000. Motorola 88000. And DEC Alpha. But this is long overdue for India.

Recall that the US semiconductor industry did have plenty of govt help at its inception. The Indian govt can similarly provide early support by being a customer. But still, it is going to be a tough row to hoe.

> And DEC alpha.

I really wish all these "open source" chips would just steal the DEC Alpha microarchitecture and implement it. It's been almost 20 years; even the patents on EV7 should have run out by now.

It was a really clean, mostly orthogonal microarchitecture that had most of the really annoying implementation corners filed down. EV5 is a very straightforward chip and maps very well to low power implementations (see StrongARM). EV6 is pretty straightforward in modern technology nodes.

And, best of all, it's documented. Compilers exist. Operating systems exist. Old hardware exists that can be used for validation and initial development. It wouldn't require you to build an entire ecosystem up front in order to bootstrap.

IIRC, one of those homegrown Chinese HPC chips was rumored to more or less use the Alpha ISA.

That being said, I wonder if at this point all the low level software (kernels, toolchains etc.) have bitrotted to the point that you're not buying much, if anything, compared to using RISC-V.

And it's not like Alpha doesn't have those irritating ISA idiosyncrasies. E.g. the most crazily relaxed memory model known to man. No byte load/store until, uh, EV56(?). And it's big endian, which per se isn't wrong in any way, you'll just be dealing with endian bugs forever and ever when porting since the rest of the computing universe, for better or worse, has converged on little endian (x86, ARM, even Linux on OpenPOWER)

Regarding the big-endian nature of it: kinda, but also kinda not. Network byte order is generally big-endian, so it'd work well in those instances without needing to do any juggling. Plus (speaking from experience many decades ago on the 68k), reading big-endian memory was so much easier than trying to decipher things in a hex editor on ix86. Though, granted, this is both anecdotal and a minor thing to worry about.

You'd be surprised how many workloads end up spending large amounts of time in memcmp / strcmp. The chief advantage of big-endianness is that a 64-bit read is already in the order necessary for strcmp or memcmp. This means you don't need specialized instructions for fast strcmp / memcmp. Though, I would guess in a pipelined implementation with good pre-fetch logic, you'll usually not pay any time penalty (and only a single instruction in size penalty) for a byte order swap instruction in the middle of strcmp / memcmp.
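To make that concrete, here's a small Python sketch (an illustration, not code from the thread) showing that a big-endian 64-bit read compares in memcmp order, while a little-endian one needs a byte swap first:

```python
def memcmp_sign(a: bytes, b: bytes) -> int:
    """Reference lexicographic comparison: the sign of C memcmp."""
    return (a > b) - (a < b)

a, b = b"ba000000", b"ab000000"  # 8-byte strings differing in the first two bytes

# Big-endian: one 64-bit integer compare preserves lexicographic order,
# so it can replace the byte-by-byte loop.
a_be, b_be = int.from_bytes(a, "big"), int.from_bytes(b, "big")
assert ((a_be > b_be) - (a_be < b_be)) == memcmp_sign(a, b)

# Little-endian: the raw integer compare weights the *last* byte most
# heavily, so it disagrees with memcmp here...
a_le, b_le = int.from_bytes(a, "little"), int.from_bytes(b, "little")
assert ((a_le > b_le) - (a_le < b_le)) != memcmp_sign(a, b)

# ...unless each word is byte-swapped first (the extra instruction
# mentioned above).
def bswap64(v: int) -> int:
    return int.from_bytes(v.to_bytes(8, "little"), "big")

assert ((bswap64(a_le) > bswap64(b_le)) - (bswap64(a_le) < bswap64(b_le))) == memcmp_sign(a, b)
```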

> No byte load/store until, uh, EV56(?).

I believe that you can blame a MIPS patent for that, IIRC.

> And it's big endian, which per se isn't wrong in any way, you'll just be dealing with endian bugs forever and ever

Actually, that was one of the biggest benefits to porting Linux to Alpha. A HUGE number of assumptions got cleaned up. I suspect the current ARM ports of Linux would be almost a decade behind without that.

That's what some people were doing with SuperH; I don't know how the project has progressed.


[If you wanted to resurrect the Alpha, though, please do not resurrect the crazy memory consistency semantics, where you need a full memory barrier between

    r0 = load(&x);
    r1 = load(r0);

even though the second load depends on the value produced by the first.]

Do you have any pointers to good documentation on the Alpha memory semantics?

The memory semantics are of course so lax precisely so that the processor doesn't need to check the computed addresses of all in-flight stores before issuing a concurrent memory operation. My understanding is that it's mostly a footgun for assembly programmers, but I'd love to see any estimates of the speed and power-efficiency benefits of the lax memory semantics.

This is one of the raging debates within the RISC-V group and elsewhere: are weak memory semantics really worth it, or can x86-style TSO be implemented optimally? Lots of anecdotal evidence but no conclusive experiment. As part of our work, we hope to try multiple memory models in our I-class and get some definitive answers. RISC-V, by the way, supports an ARM/PPC-style weak memory model or, optionally, x86-style TSO.

ARM/PPC is fine, but Alpha took it too far in my opinion (and also didn't have a read memory barrier instruction; nowadays acquire and release memory barriers would probably be preferable).

How else will I be able to bank the lower-level cache for the absolute highest in performance?

+1 for using the archaic phrase "tough row to hoe", and for using it correctly. :-) In the few times I've heard someone say it, they say "road to hoe" which doesn't really make sense.

Also, MIPS, founded by the original RISC paper guys. Thirty years ago DEC was selling DECstations with MIPS R3000 CPUs, where you could enjoy the occasional program crash (and "core dump") due to memory alignment errors, under BSD 4.3 or so.


Depends a lot on what you are building. If you're talking about a von Neumann machine with symmetric registers -- basically a PDP-11 derivative aka "C machine" -- then you're right: you "simply" have the hard part of, you know, designing a chip, which is an art, and at any appreciable clock frequency, de facto mixes digital and analog design.

But as soon as you stray off that path, you're on your own (and you still have the chip design problems as well). The Mill guys have been at it 14+ years (admittedly none full-time, AFAICT).

The Mill folks are a good example of going right for the high end.

Well, first of all, this is for an IoT CPU, so there will probably be less competition there and it can probably be tuned to specific niches. But I have to disagree that building a new CPU is easy. This is only true if we're talking about "toy" CPUs like some of the RISC-V implementations put out by UCB undergrads; these run at ~200 MHz clocks and with sub-1 ILP. If you want to make a high-end and energy-efficient CPU, it's still quite a difficult task, and most novel CPU projects (e.g. the Mill; I guess the Adapteva Epiphany might count, although it's not really a CPU) don't have an existing GCC/LLVM backend.

I think the best way RISC-V can be successful is to market to people who want open hardware, which they are doing. IoT would be a good place to market it, or just electronic prototyping for makers. The HiFive does this.

Then why do so many companies spend so much money buying ARM cores? They have the ISA (more or less) and the compiler, so why doesn't every chip design company make its own processor? I know for a fact that some companies that had their own lines of ARM competitors ended up using ARM in the end.

Adapteva has nice write-ups on the challenges and a solution that worked for them in their market:




There's also companies that are a cross between ASIC's and FPGA's called Structured ASIC's that can get the initial NRE down for a prototype. eASIC and Triad Semiconductor are two. eASIC has a lot of IP on it already.

The last point is that operations such as DARPA show that government funding can knock out the worst of the NRE. India is using government funding. The reason we normally see things done that way still being very expensive is that those receiving the government funding are trying to maximize profit, get lots of patents, and eliminate competition that threatens that profit. An alternative for those aiming for public benefit is to work out the bugs from an initial CPU or SoC whose NRE was government-backed, then release it just over cost or at a decent markup. Each new wave of funding that produces new iterations soaks up the high NRE, with steady sales revenue doing the rest. Might work.

OpenPITON and PULP are two things that came to mind as having potential there, with one an alternative to SMP/multicore on the server side and the other for embedded. PULP actually used that model, as far as I can tell.



Glad to hear that they'll be taping out one of the Shakti chips so soon. Can't wait to see what comes of the higher-end server (or workstation, if you're me) chips.

That said...

> “The capitalist computing bourgeoisie want to enslave us all with proprietary processing architectures, but the proletariat eventually produces its own processor alternative – an ISA for and by the people, where instruction sets aren’t subject to the whim of the royalty-driven class, and where licensing fees don’t oppress the workers’ BOMs (bill of materials),” writes Kevin Morris in the Electronics Engineering journal, lending colour and gravitas to what’s at stake in the processor industry.

Comparing RISC-V to communism is pretty grody. RISC-V is precisely a capitalist revolution, shedding a layer of state protection of the ISA from the market, and unleashing the greed, passion, and proprietary zeal of the world on the task of bringing designs to market.

RISC-V was developed by the state, in particular, by UC Berkeley, a public school. And IIT Madras is also another public school. So again, a branch of the Indian state is developing this Shakti processor.

There's really nothing capitalist about this at all.

I'm sure Bluespec, TSMC, SiFive, NXP, Samsung, Qualcomm, Microsemi, Micron, Huawei, and NVIDIA don't think of it like that; the RISC-V Foundation is also not UC Berkeley. Also, strictly speaking, the UC system ain't the state; if it were, it would not be allowed to generate copyrights from work produced by employees. IBM at least partially (if not completely) covered the fabrication of the original UCB RISC-V chips.

Though I'd generally agree in the case of IIT Madras, which is specifically engaging in a state effort to reduce import dependence in microelectronics (especially for defense and national infrastructure).

Strictly speaking, the UC System is the state. That IBM funded some of RISC-V doesn’t make RISC-V a capitalist product nor does TSMC manufacturing some silicon. The US Government and the University of California can claim intellectual property rights including patents, copyrights and trademarks. That SiFive is incorporated to exploit RISC-V as was Sun Microsystems to exploit 4.2 changes nothing.

The U.S. government cannot claim copyright on works produced by government employees. Read the statute before commenting on it, especially if you're using it as a point of disagreement.

The one way the U.S. government can come to own copyrights is by purchasing them from the original rights holders. No copyright exists for any work of a U.S. government employee.

You may wish to read 15 USC § 290e. There are other exceptions.

That is why they will most likely not succeed in an economic sense. The commercial tech space is just way too fast for governments to follow efficiently.

On the contrary, we started Shakti because our lab can track technology faster than a commercial outfit! I am an ex-industry architect and a startup CEO who came back to academia and started this effort because the private sector was moving too slowly w.r.t. CPU research and deployment. Done well, govt entities can move really fast once you figure out how to deal with bureaucracy. And we can dream way bigger.

I agree, having worked in the American national lab system I can say that government research can easily keep pace with industry. Especially in risky or inherently costly industries, I'd argue that government funded research labs that work in collaboration with industry and academia are possibly the best equipped.

Thank you for the effort to free knowledge.

That is a compliment I will gladly accept. Look, I come from a long line of theologians and philosophers dating back a millennium plus. The Indian ethos of not charging for knowledge has prevailed this long, and I am not about to break the tradition! It is not that we are averse to the ways of capitalism; folks from my family have funded and run some of the biggest tech startups in the Valley. I have also run startups earlier and am incubating a couple now. So I know what free money implies. But the core ethos of Shakti is openness, and that will never change. See my other effort, finlibre.org: open source in banking and insurance.

I hold that there is no incompatibility between charity and capitalism. The market is an emergent behaviour of the personal desires and values of individuals. There's nothing anti-market about wanting to contribute to a public good voluntarily, nor to taking work that is ideologically more attractive to you (even at reduced pay).

It is not charity. Computing has become public infrastructure, like roads and water supply, so it is by nature a public good. The underlying foundation has to be based on the principle of the commons and not controlled by one entity. In processors just an ISA will not do; there need to be open, state-of-the-art implementations in every segment. That is the goal of Shakti. For example, we also have GPS/NAVIC receivers and LTE modems in the works. Our comm research groups are far larger than our compute groups, so no manpower shortage. They in fact came up with a short-lived WiMAX-like standard which saw wide deployment via IIT-incubated startups.

We shall see. Free money has hidden costs.

Really? Free and open source software—in this case, an unencumbered ISA—seems like an easy parallel to communally managed means of production.

Of course an implementation would be far more useful, as it would be the likely target for lawsuits.

I wonder if the C-bashers were saying something similar when Lisp machines were reigning unchallenged.

It's definitely to be welcomed, but India barely has the kind of interests, state-apparatus or companies that'd want something like this to succeed. India neither has Baidu, nor Tencent, nor Wechat. They have Flipkart, which is barely an Alibaba, and is on a long drawn collision course towards merging with Amazon. The startups I've seen generally seem to target foreign markets, or to service people who service foreign markets; this is inherently a tiny subset of India's population.

Considering the 'prestige' and money that comes with working in the US, I'd be very surprised if the country can ever accumulate enough talent to do anything fundamentally significant (esp. since all of the relevant Education/Industry is entirely Anglophone).

I also feel there is generally a lot of self-flattering that goes on, often for this very reason. I'm old enough to remember the embarrassment that was to be 'India's answer to OLPC' (also conceived at an IIT of note).

Of course any new venture has high probability of failure.

However:

> But India barely has the kind of interests, state-apparatus or companies that'd want something like this to succeed. They have Flipkart, which is barely an Alibaba...

Flipkart last week announced a "Designed and Engineered in India" smartphone [1].

And this was done by a homegrown company, smartron.com. Their website shows how heavily invested they are in IoT (apart from mobile phones). The company was founded by Mahesh Lingareddy, who it seems was a co-founder of a company that was acquired by Intel for $250 million [2], so he has all the resources (or the credibility to raise funds) to put in place the execution plan for their vision of "India's first global brand with focus on innovation, design, engineering, products and platforms." [3]

Or in other words, things are changing; the past is not an indicator of future failure.

[1] www.flipkart.com/billion-smartphone-launch-store

[2] https://www.msn.com/en-us/finance/technologyinvesting/2-acqu...

[3] http://www.smartron.com/about.html

More about Smartron: www.thehindubusinessline.com/info-tech/smartron-plans-rs-1500cr-investment-to-expand-operations/article9622268.ece

Thanks for pointing this out. I was much too cynical in the above post.

Why do Baidu and Tencent count, but Infosys and TCS not count? Each of these is a company with >10B USD of revenue, so it's not like the Chinese companies are any larger. And don't tell me it's because of the difference between product and service companies, because that is a meaningless distinction from the perspective of a hardware maker. They all need hardware to run on, and arguably a service company needs more.

> Considering the 'prestige' and money that comes with working in the US, I'd be very surprised if the country can ever accumulate enough talent to do anything fundamentally significant (esp. since all of the relevant Education/Industry is entirely Anglophone).

That was true maybe 10 or 20 years ago. Right now, thanks to Trump and the Republicans, the US is a much less attractive destination, especially for top-tier researchers. NSF funding rates are in the single-digit percentage range, and every year seems to see a new assault from state-level Republicans on state university systems. Rising xenophobia and white nationalism obviously don't help the cause.

The situation in India is way better. There is more money available for research: the Shakti project is one example but many other big centers are being funded in various "institutes of national importance." There is a bi-partisan commitment to increasing funding and hiring for Indian research and there is a growing talent pool emerging from the non-IIT undergraduate colleges.

Twenty years ago, every R1 university in the US was better than every university in India. Now, once you get past the top 10 or 15, beyond the Berkeleys, MITs, Princetons and maybe Purdues, it's not at all clear that it is a good idea for a young assistant professor to go to, say, Virginia Tech over IISc or IIT Bombay. And in fact, these Indian institutes are consistently hiring people who would have been a shoo-in for a faculty position at places like VT.

> Why do Baidu and Tencent count, but Infosys and TCS not count? Each of these have companies with >10B USD of revenue...

umm, probably because both of these companies are not exactly known for their cutting edge r&d.

On the contrary, both Infosys Research and the TCS Innovation Lab are organically growing research labs who have been steadily producing quality scholarly work for some years now.

I'm not aware of anything of note coming from Baidu research. All they've done is a splashy celebrity hire of Andrew Ng, which is the exact opposite of how one would set up a serious research organization.

Even these people didn't claim they have cutting-edge R&D. IIT-M has built an implementation that is already open source, and they are continuing the work in the open.

> the US is much less attractive destination

The US is still extremely attractive, especially for STEM talent. Indian-educated migrants think of Trump as a temporary phase; and given how migrants power the American tech industry, they feel impervious to most Trump-related threats.

If we're trading anecdotes, then I'm an Indian-educated migrant with a PhD from one elite US school who is currently a postdoc at another elite US school, and I've accepted an offer from an IIT.

Trump is by no means a temporary phase. The demographic trends that led to Trump will continue for a long while.

The "big sort" means that 70% of US population will be represented by 30 senators by 2050. Guess which party the other 70 senators representing primarily rural populations will belong to? The republican advantage in the electoral college will continue to grow as the rust belt becomes older and whiter. The supposed democratic flips of Arizona, Georgia and Texas due to rising Latino population are still far away in future and in the meantime, republican voter id legislation and gerrymandering of state-level districts to control state-level legislatures will ensure they remain red for the foreseeable future.

What all this really means is that the current situation -- where winning the primary is far more important than the general for republican politicians -- will continue for at least a decade or two more. Which means we WILL have more Trumps. The scary part of this is that the next guy won't be as incompetent at implementing his own agenda as this one.

With luck, Trump, the tightening H-1B/GC situation and the recent wave of xenophobia worldwide will push some talented people who would've otherwise left to stay in India. Modi has also pushed to make starting a business a quick process w.r.t paperwork.

Yes, most Americans don't realise just how many Indian people work building chips there; IIT and its ilk train more than 25% of the chip designers in the US (my rough guess from working in the industry). If Trump chases them all home and they go on to build great things, as I'm sure they will, Trump may be responsible for destroying the US semiconductor industry single-handedly.


> Eschew flamebait. Don't introduce flamewar topics unless you have something genuinely new to say. Avoid unrelated controversies and generic tangents.


Sure, except immigrants under work visas are not entitled to any welfare such as unemployment/medical - in fact we lose our status as legal immigrants as soon as we become unemployed so we must leave the country. Keep the hate flowing though.

I don't have any objections to a country looking out for its own interest, resisting illegal immigration or immigrants that otherwise cannot contribute much to the economy, but they are not the only ones the government is cracking down on.

> the recent wave of xenophobia worldwide

your welfare state is not the world model


Germany Sees Welfare Benefit Costs More Than Double

Asylum seekers received nearly $5.91 billion in welfare benefits in 2015


and your assessment of employment rates among legal immigrants in Europe is simply not borne out by the statistics


If you read my comment, I was talking about restrictions on WORK visas in the US, not refugees in Europe.

Resisting immigration by a particular demographic, one that arguably contributes the most to the coffers of the said welfare state, is well within the semantic domain of the word 'xenophobia'.

you're talking about the US

I'm responding to

> the recent wave of xenophobia worldwide

The costs of the refugee crisis

The German government will have to spend 50 billion euros on refugees during this year and next, a new study estimates. In order to balance these costs, researchers urge financial restraint elsewhere.


I agree with this comment so much, it hurts.

Very unfortunate but true. Here's to hoping at least this time will be different. We need to look past the "if it's India it has to be ridiculously cheap" factor. Every time the Indian space agency reports an achievement, I look for the "done for the first time, learnt about X" rather than the "look, I have done the common space thing at 1/10th of the cost". I wonder what the talented ISRO would achieve if they had resources on the scale of NASA's...

> I always look for the “done for the first time, learnt about X” than the “look, I have done the common Space thing at 1/10th of the cost”

Water on the Moon

As the lead architect of Shakti and the guy who helped kick-start the project, I figure I am owed my 2 cents!

1. We never positioned it as an ARM killer! That was the imagination of the reporter who wrote the article.

2. Shakti is not a state-only project. Parts of Shakti are funded by the govt; these relate to cores and SoCs needed by the govt. The defense and strategic sector procurement is huge; it runs into the tens of billions of USD. There is significant funding in terms of manpower, tools and free foundry shuttles provided by the private sector. In fact Shakti has more traction with the private sector than the govt sector in terms of immediate deployments.

3. The CPU eco-system, including ARM's, is a bit sclerotic. It is not the license cost that is the problem; it is the inherent lack of flexibility in the model.

4. Shakti is not only a CPU. Other components include: a new interconnect based on SRIO and GenZ with our extensions, accompanied by open source silicon; a new NVMe+ based storage standard, again based on open source SSD controller silicon (using Shakti cores of course); an open source, Rust-based microkernel OS for supporting tagged ISAs for secure Shakti variants; fault-tolerant variants for aerospace and ADAS applications; and ML/AI accelerators based on our AI research (we are one of the top RL ML labs around).

5. The Shakti program will also deliver a whole host of IPs, including the smaller trivial ones and, as needed, bigger blocks like SRIO, PCIe and DDR4. All open source of course.

6. We are also doing our own 10G and 25G PHYs.

7. A few startups will come out of this, but that can wait till we have a good open source base.

8. The standard cores coming out of IIT will be production grade and not research chips.

And building a processor is still tough these days. Try building a 16-core, quad-wide server monster with 4 DDR4 channels, 4x25G I/O ports and 2 ports for multi-socket support, all connected via a power-optimized mesh fabric. Of course you have to develop the on-chip and off-chip cache coherency stuff too!

9. And yes, we are in talks with AMD about using the EPYC socket. But I don't think they will bite.

Just ignore the India bit and look at what Shakti aims to achieve; then you will get a better picture. I have no idea how successful we will be and I frankly do not care. What we will achieve (and to some extent already have) is:

- create a critical mass of CPU architects in India
- create a concept-to-fab eco-system in India for designing any class of CPUs
- add a good dose of practical CPU design knowhow into the engineering curriculum
- become one of the top 5 CPU architecture labs around

Shakti is already going into production. The first design is actually in the control system of an experimental civilian nuclear reactor. IIT is within the fallout zone so you can be sure we will get the design right. If you want any further info, mail me. My email is on the Shakti site. G S Madhusudan

Also, everyone is welcome to review our code and to contribute to the codebase. It is an open source project. IIT-M incubated it but we want it to be community driven. FPGA-based dev packages should be announced by January, based on standard low-cost FPGA boards. Dev parts based on ASIC parts will probably be announced in the second quarter of next year, assuming a Feb/March tapeout. We would be glad to help other universities do their own tapeouts. One of the goals of Shakti is also to demystify the backend process.

Congratulation on the project. Where can one find the source? Is this the one? https://bitbucket.org/casl/shakti_public

Yes, we will update the C class next month, since our private line has a lot of foundry-specific code that needs to be removed. The I class needs more work but the design is in place. It will also move to quad issue and would be a Cortex A72/75-class core. More importantly, the basic slow IPs (UART, I2C, quad/octal SPI, SDRAM controller, JTAG, DMA, PLIC) will be FPGA- and silicon-proven and production quality. They will be very useful to other developers (non-RISC-V also), as would the AXI bus.

You said "open source Rust based MK OS..." Does that mean an OS written in Rust? I had not heard of that. As I recall, Rust support for RISC-V is quite limited for now, until the LLVM patches go upstream (which will be soon).

I agree with your statement "I have no idea how successful we will be and I frankly do not care." There will be a lot of benefits as a result of this work regardless of the immediate outcome. Keep building the future!

There are a lot of comments here on HN that seem completely uninformed of what's actually happening with RISC-V. They are about to be surprised.

Rust support is getting better. Our Rust OS is a lower priority, but it is needed for our tagged-ISA support, since we need language extensions to add security semantics to pointers. Also, Rust is a much safer language. We need all this since safety-critical systems are a key Shakti target.
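To make the "security semantics on pointers" idea concrete, here is a hedged sketch in plain Rust with invented names (this is not Shakti's actual design). It models a tagged word whose tag is checked on every load; on a tagged-ISA machine the same check would be enforced in hardware.

```rust
// Illustrative only: a software model of hardware word tags.
// All names (Tag, TaggedWord, load) are invented for this sketch.

#[derive(Clone, Copy, PartialEq, Debug)]
enum Tag {
    Data,    // plain data, freely readable
    CodePtr, // may be read (e.g. as a jump target)
    Sealed,  // must be unsealed before any dereference
}

// A word of memory paired with its tag, as tagged hardware would store it.
struct TaggedWord<T> {
    tag: Tag,
    value: T,
}

impl<T> TaggedWord<T> {
    fn new(tag: Tag, value: T) -> Self {
        TaggedWord { tag, value }
    }

    // A load through a Sealed word traps, mirroring a hardware tag check.
    fn load(&self) -> Result<&T, &'static str> {
        match self.tag {
            Tag::Sealed => Err("tag violation: sealed word dereferenced"),
            _ => Ok(&self.value),
        }
    }
}

fn main() {
    let ok = TaggedWord::new(Tag::Data, 42u64);
    assert_eq!(*ok.load().unwrap(), 42);

    let sealed = TaggedWord::new(Tag::Sealed, 7u64);
    assert!(sealed.load().is_err());
    println!("tag checks behaved as expected");
}
```

The point of the sketch is only the shape of the mechanism: every access consults a tag the programmer cannot forge, which is what language-level extensions would have to express and the hardware would enforce.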

Is this similar to the stuff lowrisc is doing?

Yes, we are also collaborating with them. The HW is the simpler problem to solve. The SW life-cycle is the challenge: creating tags, embedding them in the code, ensuring that the binary has not been tampered with, and ensuring portable tag semantics. We have a dedicated security group working on it. Unlike US universities, we have carte blanche for hiring faculty and for MS/PhD scholarships. Funding is not the issue; getting good faculty and students is! Some of you should head over here for an MS or PhD.
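For a rough feel of that tamper-detection step, the sketch below (invented names, not Shakti's actual flow) shows the shape of a build-time/load-time integrity check. A real deployment would use cryptographic signatures; FNV-1a here is only a stand-in checksum so the example stays dependency-free.

```rust
// Illustrative only: FNV-1a is NOT cryptographically secure; it stands in
// for a real signature scheme to show the record-then-verify life-cycle.

fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV-1a 64-bit offset basis
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3); // FNV 64-bit prime
    }
    hash
}

// At build time, record the digest alongside the binary...
fn sign(binary: &[u8]) -> u64 {
    fnv1a(binary)
}

// ...and at load time, refuse to run anything whose digest has drifted.
fn verify(binary: &[u8], recorded: u64) -> bool {
    fnv1a(binary) == recorded
}

fn main() {
    let binary = b"imaginary firmware image".to_vec();
    let digest = sign(&binary);
    assert!(verify(&binary, digest));

    let mut tampered = binary.clone();
    tampered[0] ^= 0xff; // flip one bit
    assert!(!verify(&tampered, digest));
    println!("integrity check behaved as expected");
}
```

With tags in the picture, the digest would cover the tag metadata as well as the instructions, which is exactly what makes the tooling side harder than the hardware side.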

Thanks for the clarifications - always good to hear from the source.

I think there are many people waiting for some real RISC-V silicon. The existing chips are really underpowered samples at this point. So this article made me glad to see the announcement.

That is what the team is spending its time on: PPA optimization, so that we have a good idea how far we can go, iterating with as many process corners as possible to get the right compromise. But we have to be careful; it is a new low-power process node. The backend team is a very accomplished team from a leading VLSI design services entity who are helping us out. That team is also doing a 7nm tapeout simultaneously for a commercial customer, so they know how to route a chip! Take apart a flagship mobile phone and you will see their handiwork. It is not a bunch of students trying their hand at PnR. Our team sticks to the architecture, design, coding and verification, since these are our core strengths. We also get our work audited by external entities. Bottom line: we are as professional and process-oriented as any commercial outfit.

I got a lot of emails regarding Shakti's positioning. As can be seen from our webpage, specific cores have been positioned to offer alternatives to commercial cores, including ARM's. Whether ARM gets affected by this or not is not for me to say. But to conclude that Shakti and the RISC-V eco-system in general could affect ARM significantly is not an unreasonable conclusion. So the reporter is well within the realm of reason to arrive at the conclusion he did. I am just making a distinction between causal reactions and intent!

>3. The CPU eco-system including ARM's is a bit sclerotic. It is not the lic cost that is the problem, it is the inherent lack of flexibility in the model.

When not even Google can get a CPU without built in IME/PSP "secure" processors running insecure code, that seems like an understatement.

It seems like almost everyone is ignoring the epic fallout of a truly malicious entity discovering and silently exploiting a remotely exploitable security hole in for example Intel ME. They could more or less shut down whole countries by a key press.

And that would be just the start. When you can't be sure if there is malicious software remaining deep inside your processors, you pretty much have to shred them. And the motherboard, and the disk drives (http://spritesmods.com/?art=hddhack), etc. You simply would have to replace your whole IT-infrastructure. Then you need to try to figure out what sensitive data was lost... It would be a nightmare for sure. If I were Google or Amazon I would be absolutely paranoid.

How much will it cost for USA to implement a replacement for Social Security Numbers after the Equifax leak?

As our dependence on IT increases it becomes a more attractive target, so a well protected IT-infrastructure will be an absolute necessity in the near future. To be able to build a secure infrastructure, we need a hardware foundation that we can trust. We simply don't have that today!

Shakti sounds like it can be the foundation we need, I very much hope so.

if you haven't seen it: https://www.youtube.com/watch?v=PLJJY5UFtqY

He's a security expert working for Google. Here he is explaining just how bad the situation is.

Here he talks about the lack of trustable CPUs: https://youtu.be/JCa3PBt4r-k?t=7m22s

Thank you for all the efforts. I am interested in MCU synthesis on FPGAs and this is a really interesting project.

I just have one request though. If you are interested in makers taking an interest in your project, please make sure your documentation is top quality. Also try to provide cheap dev boards so that the community can provide Linux or RTOS ports. Being in India, it would help immensely if your dev resources came at lower cost (FPGAs, which I really doubt, but dev boards with your processor, yes) so that people like me can help you with some OS ports. You could also follow the route of Adapteva Inc, which provides a template for a good quality product (i.e. good documentation, source code, etc.).

I am very much interested in the maker community and would certainly do whatever it takes to engage that segment. Documentation is the Achilles heel of efforts like ours and we intend to address that. Our code, for example, is extremely well commented and structured. We have 20-character variable names! We work with Xilinx, Altera and Microsemi and will target low-cost dev boards, but more importantly will also tape out ASICs for Arduino boards. But first we need to focus on our tapeouts.

This looks very interesting. It's high time an Indian institution had a framework for designing semiconductor chips from scratch, and perhaps even for mass-manufacturing them for sale to the rest of the world.

My question is about Bluespec SV and the random-logic synthesis of the written code. Is it good enough for good power consumption figures, or is that not really a concern at this point? Do you know how its reception has been in industry over the last 4-5 years? Is it still the 'technology of the future'?

Another question I have is: this project being open source, are you seeing contributions from other educational institutions or industry? What would be needed to get more institutions involved?

That sounds like an ambitious project, I hope it becomes a success!

I think the world can use some competition in the server space between different ISAs, the current Intel(-and-sometimes-AMD) monoculture is problematic for a number of reasons.

> 3. The CPU eco-system including ARM's is a bit sclerotic.

Silly question, but what do you mean by "sclerotic" here? I had to look up the word and only found medical definitions.

I looked it up too. It also means 'rigid and unresponsive'

These folks have no qualms with ARM. They are building their own chips using the RISC-V ISA. This literally has nothing to do with ARM or Intel.

The ISA isn't the hard part of building a chip; building the chip is. What they have going for them in using an existing ISA is that they get kernel and toolchain support.

They have many millions in support for the project, and I can see reasons why they will succeed. What would be awesome is if AMD gave them assistance in getting their parts running in EPYC sockets.

The actual ARM license and IP is relatively tiny in the overall cost of the chip. It is a very small amount of money for time to market and ecosystem.

ARMv8 (specifically the AArch64 part) is like a clean start. I doubt RISC-V offers any advantage, especially since we should know by now that the ISA is only a tiny part of the equation; the implementation matters a lot.

Apart from the fear of lock-in or ARM suddenly hiking the price 100x, what exactly are the benefits and motives for RISC-V?

Or is this more of NIH Syndrome?

The RISC-V ISA is far more modular and extensible. Also, we needed the 128-bit support for our NVM storage research and single-address-space OS work: no MMU, T-ISA based security, VIVT caches. Other govt projects do have ARM licenses; we chose not to go that way for purely technical reasons. Our lab does have academic collaboration with ARM.
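On the modularity point: a RISC-V core advertises its feature set in its ISA string (e.g. RV64IMAC), and extensions compose instead of being baked into one monolithic spec. A toy parser, handling only the single-letter extensions (not the Z*/X* multi-letter ones), illustrates that naming scheme:

```rust
// Illustrative toy: parse a RISC-V ISA string such as "RV64IMAC" into its
// base width and single-letter extension list. Real ISA strings also allow
// multi-letter Zxx/Xxx extensions, which this sketch ignores.

fn parse_isa(isa: &str) -> Option<(u32, Vec<char>)> {
    let isa = isa.to_ascii_lowercase();
    let rest = isa.strip_prefix("rv")?;
    // The base integer ISA fixes the register width: 32, 64 or 128 bits.
    let (xlen, exts) = if let Some(r) = rest.strip_prefix("32") {
        (32, r)
    } else if let Some(r) = rest.strip_prefix("64") {
        (64, r)
    } else if let Some(r) = rest.strip_prefix("128") {
        (128, r)
    } else {
        return None;
    };
    // Each remaining letter names one standard extension (I, M, A, F, D, C...).
    Some((xlen, exts.chars().collect()))
}

fn main() {
    let (xlen, exts) = parse_isa("RV64IMAC").unwrap();
    println!("rv{} with extensions {:?}", xlen, exts);
    assert_eq!(xlen, 64);
    assert_eq!(exts, vec!['i', 'm', 'a', 'c']);
    assert!(parse_isa("x86").is_none());
}
```

The same scheme is why a vendor can ship, say, an RV64IMAC microcontroller core and an RV64GC application core from one spec, picking only the extensions each segment needs.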

Good to see these fellas in action. It is nice to see that India is trying to set up a chip manufacturing ecosystem. There will be a lot of pitfalls though. From my long stint in chip design, I predict that the initial phases will be difficult: the first few efforts might be defective, have odd timing bugs, or have power/reset problems, and it will be very discouraging and frustrating. If they can push through this phase, then I can already imagine India's rise in the IoT/strategic or whatever industry they are targeting. I also hope that this effort pushes the government/private sector to set up fabs in India. Stay strong; all of us are rooting for you guys. A quick lookup on the internet shows that SHAKTI is from the R.I.S.E. group. We are waiting to see you rise!

Glad to see open source catching up to proprietary in this space.

I’m assuming all their code is in this repo: https://bitbucket.org/casl/

Very inspiring news! :)

Looking forward to the day when we would be able to run popular open source SW frameworks like NodeMCU and Linux on open hardware like this.

Submitted my first PR to the bitbucket repo (a minor one) https://bitbucket.org/casl/shakti_public/pull-requests/1

> The landscape is full of thick trees that have set down roots for decades. Bats, deers, and monkeys, seemingly comfortable with the few thousand bicycle-riding students, academics, and staff on campus, are easy to spot. The air, thanks to a recent bout of rains in Chennai, is as good as a hill station

In addition to the above description, a picture should have been inserted above or below the paragraph.

http://iglc2018.com/images/iit/4.jpg https://qph.ec.quoracdn.net/main-qimg-2200d147343aa841a43a07... http://2.bp.blogspot.com/_OgX4b2N9a8E/S7OVv7jxhmI/AAAAAAAAAh... http://www.mbaskool.com/2013_images/photography/landscape_na...

We are one of the last refuges of the endangered black buck. The spotted deer and bonnet macaques are more common in India. Odd that a university campus in the middle of a city is a refuge. The pic is that of an albino black buck, an even rarer mutant.


I am happy when I can see a kestrel or a roe deer from my office. It must be amazing to work in such a beautiful scenery!

Everything has to be monkey-proofed; going to the refreshment room when a monkey gang is around is like running the gauntlet! We are talking really aggressive simians. But it's totally worth it; I could spend hours watching the monkeys. Surprised that we do not have a simian cognition research group on campus. The slender loris are really beautiful too, but I have never seen them.


Now if I could get the monkeys to learn bluespec ...

Thank you very much. They're beautiful! An ideal academic environment imo.

Did I miss the privileged ISA becoming standard?

Surely they don't want to release a chip using the draft.

It reminds me of the Loongson/Godson effort.

These guys will have to drop out and start a funded company if they want this tech to go anywhere. I don't expect IIT-M to be able to manufacture and sell the finished design.

I'm not aware of universities doing manufacturing and selling finished design, they more or less license IP.

ARM can’t be killed at this point. Positioning themselves as an ARM killer would not be the wisest move on anyone’s part.

One way to dethrone the competition might be to focus on openness:

There are also security reasons for wanting to make a processor for India. “We don’t know really whether the processor we are getting from outside is trustworthy. Is it secure?” asks Kamakoti. “Suppose I want variants of a processor, for different needs – not just strategic, even civilian needs – I have to basically rely on the processor available to me, and fit my application to that. It is something like I bought the slipper and I am cutting my feet to fit into it.”

But in the case of ARM, you _do_ know, if you manufacture it yourself. ARM licenses intellectual property; it doesn’t sell chips. Once you buy a license, you can inspect it and make sure it doesn’t have the bits you don’t like.

That means I have to develop all the code from scratch anyway; the architectural license just gets me an ISA spec doc. Might as well start with another ISA that is free. But the issue is bigger. Shakti is also an exercise in using a high-level HDL. We would not exist but for Bluespec; we really get a 6x to 10x productivity increase over Verilog. I also get the ability to do true formal proving of the correctness of my code. There is a reason why DARPA chose Bluespec for their secure processor work.

No, an ARM license gets you the implementation of a core in source form. You are free to modify it if needed, as many licensees do.

Depends on the license.

My understanding is that you also need to license from intel if you have a chip.

I think there's a "law" (Betteridge's law of headlines) that goes something like "if there's a question in the title, the answer is no".

I agree with this. I was just looking at the microcontroller market, and I can tell you that AVR is going to be around a long time (like PIC).
