The “terrible” 3 cent MCU – a short survey of sub $0.10 microcontrollers (cpldcpu.wordpress.com)
251 points by jerryr 7 days ago | 94 comments




These have their uses.

A friend who frequently does contract development in the toy space has (or at least used to have) a favorite go-to MCU that costs under $0.06 in bare die. It is essentially a 6502 with 100 bytes of RAM and a metric butt-load of mask-programmable ROM. It was originally designed for greeting cards. He has designed it into toys.

It is hard to use: you need a dev kit and a good relationship with the distributor to get the documentation. It only makes sense in high-volume products, since it comes as passivated bare die, so assembly requires a die bonder and an epoxy encapsulation depositor.

Not for everyday use. But as my friend says: “You haven’t lived until you have spent an entire afternoon arguing over $0.05 on the BOM.”


That sounds very similar to my experience with toy development. For a toy that played a bunch of pre-recorded sounds, we used a 4-bit Winbond MCU (their MCU division is now Nuvoton) that had a tiny bit of RAM and a ton of mask ROM. Firmware development was done in assembly and targeted a huge (physically large) emulator for test/debug. When we were satisfied with the firmware, we'd send it off to our CM, who would then order the parts with our FW in ROM. They'd get back bare die parts, which were wire bonded to the PCB and then epoxied over (that miserable "glop top" packaging, which is the bane of many teardowns). Development was a bit painful, but high volume production was extremely cheap.

Edit: Oops. I conflated projects. The toy project actually used a SunPlus MCU, not a Winbond MCU. It was an 8-bit RISC CPU running at 5MHz with 128 bytes RAM and 256KB mask ROM. The ROM held both the program and audio samples. I don't recall what encoding was used for the audio.


Well, I'm curious which project used a 4-bitter. Jack Ganssle and Robert Cravotta did a survey a while back:

http://www.embeddedinsights.com/channels/2010/12/10/consider...

http://www.ganssle.com/rants/is4bitsdead.htm

The two examples were timepiece designs and the Gillette Fusion ProGlide. On top of getting yours, I'm curious whether any of the cheap MCUs in the article could today have met whatever your requirements were for a 4-bitter?


It was also for a low-cost audio application, but it wasn't a toy. This was back in 2001 or so. The MCUs in this article all only have ~1KB ROM, which wouldn't have been enough for our audio samples. We needed >256KB. The "4-bitness" was just incidentally what Winbond offered with a large ROM at the time. However, the SunPlus that we later used in the toy also offered a large ROM with an 8-bit CPU for a similar cost. So, while I can't authoritatively say that 4-bit is dead, it does seem like there are a lot of alternatives in similar price ranges now.

Great story, thanks for sharing. I for one pretty much constantly live the life he alludes to at my company bomquote.com. In China, we actually negotiate in increments of ~1/6th of a penny.

Why not a tenth?

The yuan-usd exchange rate has floated around 6:1 until recently, with yuan denominated down to 0.01. So possibly that.


Don't you mean fen? https://en.wikipedia.org/wiki/Fen_(currency)

1/6 of a penny is 0.0016 USD, or 0.01 CNY: https://www.google.com/search?q=0.0016+usd+to+cny


I guess the 1/6 comes from a currency conversion

Ah, yes, there was an article here a year back about the original Furby using that same configuration. The article actually had the annotated 6502 source code.

https://news.ycombinator.com/item?id=17751599


It's very common to find those chips in cheap home alarm systems.

In Brazil, Holtek has a huge presence in that niche.


>6502

Without question, one of the nicest platforms to have in multitudes of thousands, at low energy and cost ..


Honestly I'd prefer an ARM or an 8086 or an AVR. I imagine I'd prefer a J1A too but haven't tried. The 6502 makes it a pain to do anything in any language higher-level than assembly, even C, and being 8-bit means you're constantly facing tradeoffs between making things fast or making them correct for more than 128 or 256 items.

Is an "epoxy encapsulation depositor" the thing that makes those black blobs?

Yes, but it's really two machines: one that places the raw die on the board and then stitches the bonding wires, and one that drops the epoxy blob on top of that.

"metric butt-load of mask-programmable ROM"

What would that be in this context? I'm guessing we're talking KB rather than TB?


I'd guess about 1 MB order of magnitude. That would be a butt-load even for samples.

1 MB would be enough for 45 seconds audio at 8 bit PCM @22 kHz. If they're half competent, they could use ADPCM or much better (and cheaper to decode too!) vector quantization.

With VQ, you could go even down to ~1 bits (~6 minutes of audio per megabyte) per sample while maintaining good audio quality. Decoding is very simple, so 6502 would have plenty of oomph to do that.
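For anyone who wants to check those two figures, here is a rough sketch of the arithmetic. It assumes "1 MB" means 10**6 bytes and "22 kHz" means exactly 22,000 samples per second, mono; `seconds_of_audio` is just an illustrative helper name.

```python
def seconds_of_audio(rom_bytes: int, sample_rate_hz: int, bits_per_sample: float) -> float:
    """Playback time (seconds) of a mono stream that fits in rom_bytes."""
    total_bits = rom_bytes * 8
    return total_bits / (sample_rate_hz * bits_per_sample)


ROM_BYTES = 1_000_000  # assumption: "1 MB" taken as 10**6 bytes
SAMPLE_RATE = 22_000   # assumption: "22 kHz" taken as 22,000 Hz

pcm_seconds = seconds_of_audio(ROM_BYTES, SAMPLE_RATE, 8)  # 8-bit PCM
vq_seconds = seconds_of_audio(ROM_BYTES, SAMPLE_RATE, 1)   # ~1 bit per sample

print(f"8-bit PCM: {pcm_seconds:.0f} s")
print(f"1 bit/sample: {vq_seconds / 60:.1f} min")
```

Under those assumptions both numbers in the comments hold up: roughly 45 seconds of plain PCM, and roughly 6 minutes at 1 bit per sample.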


Using the trivial BTc encoding, you could achieve a good compression ratio and generate audio without using a DAC: https://www.romanblack.com/picsound.htm

MB == Metric Buttload.

Makes sense.


> [Vector quantization] Decoding is very simple.

Do you have an encoder/decoder to point us to?


VQ is a very well known technique. For something quick, you might be interested to check this blog post about (unmodified) C64 playing high quality audio at 44.1/48 kHz, VQ compressed down to 96-128 kbit/s: https://brokenbytes.blogspot.com/2018/03/a-48khz-digital-mus...

There's some decoder code there as well.
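To give a feel for why VQ decoding is cheap, here is a minimal sketch of the general idea. It is not the code from the linked post: the codebook values are made up, and the vector length and codebook size are toy assumptions. Each stored index simply selects a short vector of samples from a table.

```python
from typing import List, Sequence


def vq_decode(indices: Sequence[int], codebook: Sequence[Sequence[int]]) -> List[int]:
    """Expand each codebook index into its vector of PCM samples."""
    samples: List[int] = []
    for index in indices:
        samples.extend(codebook[index])  # decoding is only a table lookup
    return samples


# Toy codebook (made-up values): 4 entries of 4 unsigned 8-bit samples.
# Each index needs 2 bits and yields 4 samples, i.e. 0.5 bit per sample.
CODEBOOK = [
    [128, 128, 128, 128],  # silence
    [128, 160, 192, 160],  # rising bump
    [128, 96, 64, 96],     # falling dip
    [200, 60, 200, 60],    # buzz
]

decoded = vq_decode([0, 1, 2, 1], CODEBOOK)
print(decoded)
```

The hard part (building a good codebook) happens offline in the encoder. A 256-entry codebook of 8-sample vectors would cost one byte per 8 samples, which is the "~1 bit per sample" ballpark mentioned above.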


Arithmetic coded uniform quantization works very well and is straightforward to implement!

The poster mentioned it was for greeting cards. I remember there used to be a lot of greeting cards that would play a recording when you opened them. I bet they loaded low-quality samples into the ROM for that.

The 6502 can only address 64K, so I'd assume 32K of ROM at 8000-FFFF with 128 (not 100) bytes of RAM mapped at both 80-FF (zero page) and 180-1FF (for the stack) and 128 MMIO bytes at 00-7F. It depends on how much they've modified the base 6502 design, though; the above is for "not at all".
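As a sketch of that guessed memory map (purely the layout assumed in the comment above, not anything from a datasheet), the address decoding could be modelled like this. `decode_address` and the region names are hypothetical.

```python
from typing import Tuple


def decode_address(addr: int) -> Tuple[str, int]:
    """Map a 16-bit CPU address to (region, offset) for the guessed layout."""
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("6502 addresses are 16 bits wide")
    if addr >= 0x8000:
        return ("rom", addr - 0x8000)   # 32K ROM at 8000-FFFF
    if addr <= 0x007F:
        return ("mmio", addr)           # 128 MMIO bytes at 00-7F
    if 0x0080 <= addr <= 0x00FF:
        return ("ram", addr - 0x0080)   # zero-page window onto the RAM
    if 0x0180 <= addr <= 0x01FF:
        return ("ram", addr - 0x0180)   # stack window onto the same 128 bytes
    return ("unmapped", addr)


print(decode_address(0x00FF))  # top of zero page
print(decode_address(0x01FF))  # top of stack page, same RAM byte
```

The point of the mirroring is that zero-page variables and the stack share one 128-byte RAM, so the stack growing down from 1FF eventually collides with zero-page data.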

In such an integrated design 6502 can address exactly as much as you want/need it to.

Just implement bank switching for top 32 kB of address space. 256 banks (= one 8-bit register) * 32 kB = 8 MB. Need more? Add another bank switch register and you got up to 2 GB.
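A quick sketch of the bank-switching arithmetic, assuming the whole top 32 kB window is banked as described; `physical_rom_address` is a hypothetical helper, not anything from a real part.

```python
BANK_WINDOW_BASE = 0x8000  # banked window covers 8000-FFFF
BANK_SIZE = 0x8000         # 32 KiB per bank


def physical_rom_address(bank: int, cpu_addr: int) -> int:
    """Translate (bank register value, CPU address) to a flat ROM address."""
    if not BANK_WINDOW_BASE <= cpu_addr <= 0xFFFF:
        raise ValueError("address is outside the banked window")
    return bank * BANK_SIZE + (cpu_addr - BANK_WINDOW_BASE)


one_register_bytes = (2 ** 8) * BANK_SIZE    # one 8-bit bank register
two_register_bytes = (2 ** 16) * BANK_SIZE   # two 8-bit bank registers

print(one_register_bytes // 2 ** 20, "MiB")
print(two_register_bytes // 2 ** 30, "GiB")
```

One caveat with this simplified model: the 6502 fetches its reset and interrupt vectors from FFFA-FFFF, so a real design would presumably keep at least that part of the window fixed rather than banked.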


For a $0.06 part? Uh, clearly not TB...

Amazing numbers. Can you give me a more accurate number than metric buttload? Closer to 8K or 64K, for example?

500k to 1M IIRC. Originally intended for audio assets.

Absolutely amazing! There are some very credible sampled synths that used 1M of ROM.

Some Philips Sonicare toothbrushes use(d) a 4-bit microcontroller from an obscure Swiss company. (From memory, since I can't find the EEVblog video teardown:) 52 bytes of RAM, custom-size mask ROM, ? kilohertz clock speed. It makes sense: they just needed a timer for the "2 minutes of brushing is up" feature, and maybe some battery management. It still surprised me that it was worth the hassle to save a few cents, even if they sell millions of the things. They must make insane margins: $80 for a vibration motor and $25 brush refills.

It's pretty easy to make the hassle worthwhile, if you are selling (particularly many) millions of things. Even saving a dime on BOM for a million units shipped will pay an engineer year, roughly.
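The arithmetic behind that claim is trivial but worth seeing once. This sketch works in integer cents to avoid float rounding; whether the result really equals "an engineer year" depends on salary assumptions that are not stated in the thread.

```python
def bom_savings_usd(saving_cents_per_unit: int, units: int) -> float:
    """Total savings in dollars for a per-unit BOM reduction given in cents."""
    return saving_cents_per_unit * units / 100


dime_on_a_million = bom_savings_usd(10, 1_000_000)
print(f"${dime_on_a_million:,.0f}")
```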

I still remember a friend relating with a mixture of horror and fondness that in 15 years probably the biggest impact he ever had to the bottom line of [big computer manufacturer no longer in existence] was re-routing a PCB in a way that let them make it smaller with no functional change. The materials cost savings over product lifetime was in the 7-8 figures range, he claimed.


A product I was working on had different variants based on what parts are populated. There were config resistors to help route the signals based on what was populated. I replaced all of that with a simple software lookup table in firmware to reroute the signals correctly. Furthermore, I figured out how to auto-detect the dozens of supported configurations on first bootup. The PCBs were very small, so I made the board designer's job a lot easier by eliminating resistors. The factory folks were thrilled that they didn't have to manage the jumpers.
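A very rough sketch of what such a lookup table might look like. Everything here is hypothetical (the part names, the pin numbers, the `select_routing` helper); the comment gives no details about the real product, and real firmware would be C or assembly rather than Python.

```python
from typing import Dict, FrozenSet

# Hypothetical: the set of optional parts that answered a probe at first boot.
Detected = FrozenSet[str]
# Hypothetical: logical signal name -> physical pin number.
Routing = Dict[str, int]

ROUTING_TABLE: Dict[Detected, Routing] = {
    frozenset({"sensor_a"}): {"data": 3, "clock": 4},
    frozenset({"sensor_b"}): {"data": 5, "clock": 4},
    frozenset({"sensor_a", "sensor_b"}): {"data": 3, "clock": 4, "data_b": 5},
}


def select_routing(detected_parts: Detected) -> Routing:
    """Pick the signal routing for the detected board variant."""
    try:
        return ROUTING_TABLE[detected_parts]
    except KeyError:
        names = sorted(detected_parts)
        raise ValueError(f"unsupported configuration: {names}") from None


print(select_routing(frozenset({"sensor_b"})))
```

The table replaces the config resistors: one firmware image covers every variant, and adding a variant means adding a row rather than changing the board.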

I would hire you. These techniques should be applied pervasively.

Could you write a guide? I would buy it.


Thanks for the kind praise. Unfortunately I'm not much of a writer, though I have lots of interesting experiences that I tell my friends and coworkers.

A big part of how I am today is due to my boss at that time, one of the most brilliant engineers I have worked with. He also gave us a lot of time and freedom to think of these kinds of approaches. Sadly, these days a lot of companies are always in a rush, so one can't think things through properly.


I did a teardown of a Sonicare toothbrush that used an 8-bit PIC 16F1516 microcontroller. There's a lot more going on in the toothbrush than I expected. I expected a simple motor, but there's a mechanically-complex resonant coil mechanism, driven by an H-bridge. There's some expensive manufacturing in there. Another interesting thing was the toothbrush has a "pressure sensor" to tell if you're brushing too hard, but it's really a Hall-effect sensor.

http://www.righto.com/2016/09/sonicare-toothbrush-teardown.h...


Maybe one of these?

https://www.emmicroelectronic.com/catalog?title=&term_node_t...

EM Microelectronic is actually not so obscure. They belong to the Swatch Group and specialize in ultra-low-power analog and mixed-signal circuits. Obviously, first for watches.


Are you thinking of this Braun teardown? https://www.youtube.com/watch?v=JJgKfTW53uo That indeed uses a 4-bit micro from a Swiss company. (The only Sonicare teardown I found was a forum post.)

Yep, that must be it. Thanks!


I don’t know how, or why, but I bought the same Philips toothbrush for $20 from AliExpress when my original broke.

And I buy the heads from Asia too for a fraction of the store price.


The heads aren’t Philips, just clones, right? I would expect the brush you are buying to be a clone also, or at best a good knockoff. Philips is pretty disciplined about their pricing internationally.

I buy cloned heads (but not the main brushes) and find them completely adequate.


The base was legit (or a 3rd shift kinda thing?), but the heads are definitely generics.

Agreed, they work fine.


I’m very doubtful they were legit; anything labeled third shift is almost always fake. I can’t find any Sonicare on Alibaba (just a lot of clones); they have stuff on Tmall, but all the prices are normal.

It is interesting that it's possible to produce chips at this price for more than a short time period. Also, the fact that they are all basically PIC clones or variants suggests that the core competency of these companies is manufacturing and operational efficiency, not MCU design.

I think a fascinating experiment here would be to invest some time in an unencumbered scalable design that could be implemented very inexpensively (say less than 10K gates). Would these manufacturers pick up that design and run with it, making variants and parts that people could buy? It would get Microchip off their back (several have been sued apparently).


> I think a fascinating experiment here would be to invest some time in an unencumbered scalable design that could be implemented very inexpensively (say less than 10K gates). Would these manufacturers pick up that design and run with it,

I have done that thought experiment many times. I have an ISA that I sketched out many years ago but never did anything with. Often, I have thought it would be a real hoot to put up a working Verilog model on GitHub with a public domain license just to see if I could bait somebody in China into manufacturing it for me so that I could buy it off Digi-Key :)

Of course, the CPU isn't really the value any more. One salesman for an ARM licensee said it best: "Look at the die photos from any of the ARM licensees. We are all just selling value-added flash." And as far as that goes, it isn't the CPU that drives the part design-in decision. It is having good, bug-free peripherals in the right mix and a reasonable tool chain.

So the "free CPU design as wild oats" idea has appeal, but it would need an LLVM back-end to go with it, at minimum, and then unencumbered Verilog for a collection of basic peripherals.


I think you should still post it to GitHub :-). Agreed on the peripherals as well; a good set of open source peripheral models in Verilog would be a useful addition to the mix.

Have you heard of opencores.org? It's been around for a long time.

Here's this experiment sort-of running:

WCH, which sells cheap MCUs (like a $0.25 8051 MCU with USB and 16K flash), is working on a RISC-V Bluetooth MCU:

https://www.cnx-software.com/2019/02/16/wch-ch572-risc-v-mcu...

And since risc-v has versions somewhere around 5K-10K gates, well, a lower end mcu isn't far, probably.

Still, it would be interesting to think how we can get from that, to standard peripherals, and standard pinouts, used by at least 2 mcu makers. That would be interesting.


> And since risc-v has versions somewhere around 5K-10K gates, well, a lower end mcu isn't far, probably.

5-10k? Seriously? That's amazing.

Z80 had 8500. According to Wikipedia, even truly simple and bare bones 6502 had 3510 or 3218.


The Z80 had that many transistors, not gates

The CMOS rule of thumb is that one gate is 4 transistors (e.g. a 2-input NAND). But the first Z80 was NMOS, so the equivalent count is harder to state. 3 transistors per 2-input NMOS gate is a rough first-order guess, but in NMOS, wide gates still need only a single load resistor, and designers sometimes used dynamic logic to save even more transistors.
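Applying those rules of thumb to the transistor counts quoted upthread gives a ballpark only. The divisors 4 and 3 are the rough guesses from the comment above, and `gate_equivalents` is an illustrative name.

```python
def gate_equivalents(transistors: int, transistors_per_gate: int) -> float:
    """Rough gate-equivalent count from a transistor count."""
    return transistors / transistors_per_gate


# Transistor counts as quoted in the thread.
CHIPS = {"Z80": 8500, "6502": 3510}

for name, count in CHIPS.items():
    cmos_style = gate_equivalents(count, 4)  # 4 transistors per 2-input NAND
    nmos_style = gate_equivalents(count, 3)  # rough first-order NMOS guess
    print(f"{name}: ~{cmos_style:.0f} to ~{nmos_style:.0f} gate equivalents")
```

Either way, both chips land well under the 5K-10K gate range being discussed for small RISC-V cores.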


Ohh... This is embarrassing! I work in embedded/low level and I have always thought that gate == transistor. It's really obvious in retrospect...

I stand corrected.


It's not a crazy idea, there are indeed logic types that implement some gate types with a single transistor. For example, https://en.wikipedia.org/wiki/Resistor–transistor_logic

These were more commonly used back when transistors were expensive.


Designs like the GreenArrays F18A core show that you can do dramatically more than the 6502 or Z80 did in a similar number of transistors. The J1 and J1A are free descendants. The MOS Technology and Zilog hackers were wizards, but they were working under serious time and, in the Zilog case, compatibility constraints. We know enough to do better now.

(https://news.ycombinator.com/item?id=20688443 is on the same topic.)

(As others commented, those are transistor counts.)


Smallest RISC-V core I know: https://devhub.io/repos/kammoh-picorv32

~1000 LUTs. 1 LUT = 6-24 gates on average. A bit more, but still pretty close.


A LUT can be 10 gates and it can be 100+ gates. You just can't compare FPGA LUTs to gates like that.

FPGAs have things like block RAMs and multipliers. Those require a ton of gates, but don't increase required FPGA LUT count by much.


Chips inspired by Chuck Moore's designs would fit this niche.

I would think a lot of manufacturers are looking to RISC-V for that kind of eventuality. But in the meantime, and looking at that chart, it seems a lot more convenient to just rip off the PIC.

I don't think they are "ripping off the PIC"; I think they are reimplementing the PIC ISA, just like 8051s are reimplemented all over the place.

I don't think a RISC-V would fit in this gate count niche. Just the register file for an RV32-C core is half these chips' RAM.

Yes, the PICs don't have a register file at all. They also have quite a simple instruction set, again providing gate savings, but then the size of program ROM may be a bit larger as a result.

The ALU can be a reasonable chunk of the processor size, and an 8-bit ALU is going to be much smaller than a RISCV ALU. Although I read somewhere that some of the Z80s, although an 8-bit processor, had a 4-bit ALU, and also I read somewhere that the 32-bit NIOS processor has a 16-bit ALU. But whether that's true or not ...

I designed a size-optimised 16-bit MSP430 clone for small, low-cost MachXO3 FPGAs that used an 8-bit ALU. It's a good way of keeping the number of LUTs down when optimising for size over speed.

For something like the low cost ice40 FPGAs, a PIC would probably be a very good match for those too compared to RISCV, because ice40 doesn't have distributed memory, which is what you'd like for register files (otherwise I expect one of the block RAMs would be used for the register file, and ideally you wouldn't want that to happen).


They're using long-obsolete process nodes, so they can use surplus equipment, but at the cost of power efficiency. (Yes, that obsolete.)

Could MCUs made using modern processes be as cheap?

I don't know, but so far nobody has managed to do it. These microcontrollers aren't just cheaper than other microcontrollers. They're cheaper than most discrete transistors, cheaper than 555s, cheaper than shitty op-amps, cheaper than linear regulators, cheaper than many diodes.

The other issue is that the older nodes are more reliable. They're already matured where they understand how to make everything work right. On top of that, each process shrink introduces new challenges that can cause components to fail. The most modern nodes seem like they make everything somewhat broken coming out the door with designers building in mitigations for that. They don't last as long.

The older nodes don't seem obsolete if component reliability is a concern. All my concepts consider their potential. They're quite limited in performance, storage size, and energy, though. There's a tradeoff. Lots of companies want a cheap, reliable, simple CPU/MCU. That's where the oldest nodes shine. That said, the newest nodes are tiny enough that one might make a 2 out of 3 setup with extra error correction like Rockwell-Collins' AAMP7G CPU. Might still be pretty cheap... per unit (not development cost)... on 28nm CMOS or SOI process. Haven't seen an attempt.


I think Padauk is using 1.3-micron or something? Can't find my notes. I don't think you have to go quite that far into the stone age to get reliability, at least not for digital.

Do you have funding to do a MOSIS run or two? I wonder if we could find some.


No funding at the moment: researching on the side while working a main job. The older nodes for MOSIS were 350nm and 500nm. The fabs for these penny chips can be stone age in comparison. You're right that you don't need to go that far, given that the highly reliable POWERs and Alphas weren't built on stone-age nodes. Edit to add that, based on the MHz I remember, the Alphas were 350nm-500nm. Fits MOSIS well. :)

If you do a project, I suggest 350nm since it's the last node that you can visually inspect the resulting silicon. It's as fast as you go before you need electron microscopes and such. It's also more likely that the open tools for hardware will be able to handle such a node instead of deep sub-micron. Finally, there's old research in transistor-level optimization that might be applied to it in new, open tooling. Might let people do standard cell that inches a bit closer to performance and energy use of custom designs.


At a certain point the packaging is the biggest cost of production.

When getting into a project, I usually work with someone in the hardware design space (I can do larger electronics myself and thus prototypes, but I'm hired for very small circuit projects), and after creating a prototype, we usually start searching for the cheapest MCU (and other components) that fits the project. Spending a lot of time doing assembly and making it fit the constrained memory will pay off when doing million-unit factory runs. A lot of MCUs I work with are faster (or only marginally slower) than my beloved Z80, which I grew up with and programmed for many years, but usually have less (sometimes far less) memory. I have not worked with kilohertz since the early 80s, but hey, if it fits (it almost never does), I am all for the lowest-priced and most power-friendly MCUs I can find.

It's funny, but I've been driving a Z80 with an Arduino Mega recently, and I actually timed it for the first time a couple of days ago. I'm getting 6000 clock cycles a second: 6 kHz, rather than the 6 MHz it would be capable of running standalone.

Why is it so slow?

Mostly because I'm doing a lot of manual work reading/writing to the address/data-bus and there is some overhead in the code I've got for emulating RAM & I/O code.

It should be faster, and could be if I reworked it. But I'm mostly using the arduino as a crutch right now until I get hooked up to real driving circuitry so I'm not overly concerned.


Any consideration as to why a chip might be cheaper (e.g. worse labor practices, worse environmental practices, etc.)?

I'm going to hazard a guess that a lot of these cheapo microcontrollers have cores, uh, not developed in-house.

Around the year 2000, a PIC 12C508A was about 1 Dutch guilder @ 5000 pcs.

Today, Mouser lists 1 pcs @ € 0,908, 100 pcs @ € 0,863.

The guilder/€ is 2,20371 ( muscle memory ).

The MCUs in the article are 12C508A class, one is an actual clone.

So for ~ € 0,80 at quantity, the 12C508A currently costs about 1.75 guilder. 20 years later.
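The conversion above can be checked in a couple of lines. This is a nominal comparison only (no inflation adjustment), using the prices and the fixed guilder/euro rate quoted in the comment.

```python
GUILDERS_PER_EURO = 2.20371  # fixed conversion rate quoted above

price_2000_nlg = 1.0         # ~1 guilder at 5000 pcs, per the comment
price_now_eur = 0.80         # ~EUR 0.80 at quantity, per the comment

price_now_nlg = price_now_eur * GUILDERS_PER_EURO
nominal_increase = price_now_nlg / price_2000_nlg

print(f"{price_now_nlg:.2f} guilders, {nominal_increase:.2f}x the old price")
```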


I wonder how much of Microchip revenue is from old chips at very profitable prices.

I did a Digikey comparison once, and Microchip alone provided roughly 30% of all MCUs on digikey. It has, for example, an incredible 175 MCU models with exactly 64B of RAM, 1.75KB of ROM and just 5 I/Os.


That is just a combinatorial explosion of a fairly small number of features. I took a peek at the example you provided, and drilling down there seem to be, for example, 39 SKUs for the PIC12x615 chip:

* Two voltage options (although why on earth do they need 2-5.5 and 2-5 separately?)

* Two main packaging options: tape and tube, and tape is furthermore available as full reel, digireel, and cut tape

* Three different temperature ratings

* Five different device packages from tiny 3x3mm DFN to full-sized DIP

So already from these fairly simple options you get a total of 2x4x3x5 = 120 different combinations for essentially the same chip. They don't quite have all the combinations, but there are a few outlier options, and overall the combination coverage is pretty wide, which explains the inflated SKU count. I'm not sure what conclusions can be drawn from this exercise besides that Microchip seems to be willing to provide their chips exactly as the customer wants them.
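The same count, spelled out as a sketch. The option labels for temperature grades and device packages are placeholders, since the comment only gives how many of each there are.

```python
from itertools import product

voltage_ranges = ["2-5.5 V", "2-5 V"]
packaging = ["tube", "full reel", "Digi-Reel", "cut tape"]
temperature_grades = ["temp 1", "temp 2", "temp 3"]   # placeholder labels
device_packages = ["pkg 1", "pkg 2", "pkg 3", "pkg 4", "pkg 5"]  # placeholders

all_combinations = list(
    product(voltage_ranges, packaging, temperature_grades, device_packages)
)

listed_skus = 39  # SKU count observed for the PIC12x615 in the comment above
coverage = listed_skus / len(all_combinations)

print(len(all_combinations), f"{coverage:.1%}")
```

So the 39 listed SKUs cover about a third of the theoretical option space for that one chip.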


> They address a specific category of low-cost, high volume, non-serviceable products with limited functionality. You need to wait for the push of a button and then let an LED flash exactly five times? You need to control a battery-operated night light? The sub $0.10 MCU is your friend to reduce BOM and shorten development time.

If you're allergic to 555s, I guess?


The logic you'd need to add to a 555 to flash an LED exactly five times would likely cost more than doing it in an MCU.

And at the last moment management would ask you if you can make it flash six times.
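To make the "six times" point concrete, here is a host-side simulation of the firmware logic, not code for any of the parts in the article. It assumes one loop iteration per timer tick and a flash of one tick on, one tick off; `led_states` and `FLASH_COUNT` are made-up names.

```python
from typing import Iterable, Iterator, List

FLASH_COUNT = 5  # the one constant to change when management asks for six


def led_states(button_samples: Iterable[bool],
               flash_count: int = FLASH_COUNT) -> Iterator[bool]:
    """Yield the LED state per tick: idle until a press, then blink N times."""
    remaining_ticks = 0
    for pressed in button_samples:
        if remaining_ticks == 0 and pressed:
            remaining_ticks = 2 * flash_count  # one on-tick plus one off-tick each
        if remaining_ticks > 0:
            yield remaining_ticks % 2 == 0     # even count = LED on
            remaining_ticks -= 1
        else:
            yield False                        # idle: LED off


samples: List[bool] = [False, True] + [False] * 11
demo = list(led_states(samples))
print("".join("#" if on else "." for on in demo))
```

With a 555 the count would live in extra counter and latch hardware; here it is a single constant, which is the whole argument for the MCU.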

These are cheaper than 555s, more easily available in various footprints, need almost no supporting passive circuitry or glue logic, use less power, and are more flexible to boot.

Popularity of the 555 waned a long, long time ago.


More accurate timers, too. It's bizarre to think that "program a microcontroller" is actually cheaper - even in small quantity - than the 555 designs I remember building as a kid.

(In single quantities from LCSC the cheapest 555 is $0.07 and the cheapest Padauk microcontroller is $0.04; in quantity the 555 is $0.04 and the Padauk is $0.025. Plus fewer support passives with the Padauk.)



For the push button example I can envision using a pair of 555s or a 556 as a timer for the flashing and a latch to store the success condition, but how does it count to 5? You need more logic in here, so why not a cheap programmable chip?

The microcontrollers we're talking about here cost the same as a 555, or less, and don't need external timing components.

>Due to lack of programming tools and evaluation boards I was only able to review most devices by datasheet

hmm. Well I don't blame the author but that kinda killed my interest


I wonder what the lowest-priced chip is that you can pick up with a reasonable toolchain that is accessible to hobbyists?

PIC10F200 in 8-pin DIP is about 41 pence (~50 cents) in 1-off here in the UK; great for quick hacks (e.g. I just used one as a 60 kHz clock generator). The IDE is free, works on Windows/Linux/Mac, and there's an online version too.

ATtinys cost like $0.50 apiece and are supported by avr-gcc.

You would be surprised by the places sub-$1 MCUs appear, even down to a dime. Car manufacturers use them all over the place, even in ECUs (at least until recently; I'm no longer "in the biz"). Yes, shaving a few pennies from the BOM of a $30K device is still worth doing when you produce them in huge volume.

The top-end Padauk ("PFS173") actually seems like a pretty reasonable chip, comparable to bottom-end ATtinys at only a quarter of the price.

After I got over being spoiled by ARMs, I fell for Padauk's stuff, which I now think is awesome and makes almost everything that leaves the building a candidate for a micro. It breaks the old rules where you had to have a certain level of complexity in the product to justify using a processor. Their parts are economical replacements for discrete logic and a no-brainer for the ubiquitous ADC->serial use cases.

My hardware guy came down really hard on the Padauk stuff; it was put to me that the temperature range wasn't wide enough, particularly at the high end, where supposedly we had to operate at 125 °C. I actually think ATtinys made it into the design instead of the Padauks just because of personal preference and an unwillingness to share the project, rather than unavailability in the automotive temperature range.

So the tubes of Padauks and the pair of ICEs I brought in to make them easy to play with sit unused, and I am now at a different place that puts more emphasis on what all of the engineers think rather than just whoever happened to draw the long straw for the project.


I wrote code for an MCU that did power sequencing on a consumer product. 4K of code, 2K of RAM, some EEPROM, a 16 MHz clock . . . 18 cents.

It was a fun little project: http://www.dadhacker.com/blog/?p=1911


Sounds like the size where Forth code might be used. Has anyone done that?


