Show HN: MicroTCP, a minimal TCP/IP stack (github.com/cozis)
204 points by cozis on Oct 31, 2023 | 41 comments



Hello HN! This is a project I've started this year to learn about sockets and network programming. Nothing serious, just a hobby project! I'd love to hear your opinions about it and feel free to ask questions


What is the licensing for this code?


(I'm not OP)

1. There is no license, so it's proprietary code.

2. It's a student project, you shouldn't use it for anything.


> 2. It's a student project, you shouldn't use it for anything.

I'm a graduate, should you use it if I write it?


Not if you call it a student project! Probably your TCP/IP stack should not be someone's hobby project. A good threshold test: do you need to care about the license? If so...


I think we're making the same point, really. I meant that OP being a student isn't what should stop you from actually using it for something; that framing just seems a bit mean/gatekeepy, and also naïve, to me.

I don't think there's any reason not to experiment with this any more than similar ShowHN hobby work from anyone else. i.e. follow its progress, maybe toy with it in your own hobby thing.

And honestly, I knew more about how to write a TCP/IP stack when I was a student than now. If I could do a better job now it would only be from some experience writing other code to RFC spec.


It's surprisingly good code for a student hobby project. I dipped in earlier hoping to find some silly gotcha to preen about, but it's well structured, follows a close read of the RFCs, and for its problem domain it probably needs to be in C anyways. I agree, the author shouldn't sell themselves short if they don't want to. But I also kind of took the "what's the license" question as a bit snippy, which I assume that preceding commenter did too.


Cool project! I also created a userspace network stack for Wallpunch, a censorship circumvention tool I built. It wasn't clear from the article if you were planning on working on this further or leaving it as is, but if you want to keep perfecting it I have a suggestion for testing:

An HTTP/echo server you control is great for the bare essentials, but there are so many mind-boggling ways things can go wrong in the real world. I think the only way to catch those edge cases (and even so you'll never catch them all!) is to use it on as many different devices for as many real world applications as possible.


Hey, happy you liked it!!

> It wasn't clear from the article if you were planning on working on this further or leaving it as is

At the moment I'm working on it on and off in my spare time. I think I will continue that way until all major features are done and stable

> An HTTP/echo server you control is great for the bare essentials, but there are so many mind-boggling ways things can go wrong in the real world. I think the only way to catch those edge cases (and even so you'll never catch them all!) is to use it on as many different devices for as many real world applications as possible.

Yes, I'm realizing that. From the beginning I've been thinking about going the unit test route but couldn't find a way to make it work. Thanks for the feedback!


Cool project. It reminds me of another minimal TCP implementation from Viewpoints Research Institute back in 2007. They managed to do it in under 200 lines of code by writing a parser that directly interprets the ASCII art diagrams in the IETF RFC. Which is kind of a bizarre approach, but very clever and somewhat self documenting.

http://www.vpri.org/pdf/tr2007008_steps.pdf

It's not really intended for production use, more as a demo of a new experimental programming language.
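
A rough sketch of the idea, in Python rather than the report's own language: read field names and bit widths straight off an RFC-style header diagram, then use them to decode bytes. It assumes the usual RFC drawing convention of two characters per bit. `parse_header_diagram`, `unpack_fields`, and the sample rows (modeled on the start of the IPv4 header diagram) are illustrative names and data, not taken from the VPRI work.

```python
BORDER = "+-" * 32 + "+"
SAMPLE = "\n".join([
    BORDER,
    "|Version|  IHL  |Type of Service|          Total Length         |",
    BORDER,
    "|         Identification        |Flags|      Fragment Offset    |",
    BORDER,
])


def parse_header_diagram(diagram):
    """Return (field name, width in bits) pairs, in wire order."""
    fields = []
    for line in diagram.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip bit rulers and +-+-+ border rows
        for cell in line.strip("|").split("|"):
            # A field of n bits is drawn 2*n - 1 characters wide.
            fields.append((cell.strip(), (len(cell) + 1) // 2))
    return fields


def unpack_fields(fields, data):
    """Decode big-endian bit fields from the front of `data`."""
    total_bits = sum(bits for _, bits in fields)
    value = int.from_bytes(data[: total_bits // 8], "big")
    decoded = {}
    shift = total_bits
    for name, bits in fields:
        shift -= bits
        decoded[name] = (value >> shift) & ((1 << bits) - 1)
    return decoded


fields = parse_header_diagram(SAMPLE)
header = unpack_fields(fields, bytes.fromhex("450000541c464000"))
```

Here the diagram itself is the only place the layout is written down, which is roughly what makes the approach self-documenting.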


Thank you so much for sharing this.

What would you call this paradigm?

Ian Piumarta (this researcher) calls it "self implementing". https://www.piumarta.com/cv/bio.html

My best effort so far is "example driven programming", which is turrible.

I did something similar for HL7 specifications, at about the same time. Our consultants (what the kids today would probably call business analysts) would hammer out an "HL7 interface" document. (Think human readable OpenAPI for the time.) It'd serve as an actual contract, between us and the hospital, of sorts.

Being lazy, I wrote a parser (not BNF based, like Piumarta's work) which scraped the "interface" and generated code. Implemented in minutes, instead of days or weeks.

I've since applied that strategy to other domains, with similar results.

But I have the hardest time describing this paradigm. Even when people see demos, they don't quite get it. This stuff is supposed to be hard, right? Surely I'm cheating somehow.

Naively, I've been thinking that if I coined an insipid new phrase, maybe it could become an inscrutable meme. Like "Agile Methodology" and "Extreme Programming" did. Catnip for PHBs.


Amazing! I would have NEVER thought of that! That's just another level of "following the standards"


"The dream is to serve my blog from an STM32 board!"

That (uControllers and small systems in general) should be among the best use cases, as there is obviously much higher demand for a low-footprint network stack in that field. It's also worth keeping an eye on Single Pair Ethernet, should it become cheaper and widely supported in common uCs in the future.


As a networking nerd I respect the project OP. Very good learning exercise. There are still many unsolved problems in networking, too. It's a bit of an esoteric field.


I love these tiny stacks because they show people that not everything needs to be a super complicated implementation that takes years.

Of course it needs TAP, but it might not be too hard to get it to work with a PHY driver directly.



How did you even get started with this? There are many systems I'd love to learn to implement from scratch like databases, network stacks, and caches, but the task seems so vast and daunting that it's hard to get started. Learning from existing codebases is also difficult b/c there's so much code to go over and understand. Can you elaborate on what your process was to build this without any help?


The strategy that works best for me is to just start at the simplest step & iterate from there. Do research online to find blog posts / articles describing how to solve specific problems. When reading, make sure to note terms of art that repeat so that you know what to look for.

So the first step I would do to start on a TCP/IP stack would be "how to open a raw socket" (granted you have to know the magic phrase here). There's also tons of "how to implement TCP stack" articles (same for ethernet etc) that probably are good starting points if I didn't know that magic phrase. The other thing is to find people to talk to who know more than you to answer those questions.

Once you have that, you can set up a real TCP socket on one end and a raw socket client on the other & then the same thing in reverse. Then look up the TCP/IP framing on Wikipedia (+ read up on the broad overview of how TCP works and the various parts). Once you have the framing implemented, try implementing the basic sequence to establish a connection. Then keep noting what features a full TCP stack has, which are required, & which are missing from your implementation (that's when I'd start reaching for the standards that everyone references).

Of course, if you want to do something novel beyond just "hey I have confidence that I can implement such a stack myself" (e.g. implementing something with certain performance characteristics), that requires a deeper understanding of how things work and a good filter on possible ideas & which ones are likely to work out best (that is gained through expertise, creativity & intelligence).
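
To make the "framing" and "basic sequence" steps concrete, here is a minimal Python sketch of the fixed 20-byte TCP header and the fields a SYN-ACK reply would carry. `TcpHeader`, `parse_tcp_header`, `build_tcp_header`, and `syn_ack_for` are names made up for this example. The checksum is left at zero because computing it needs the IP pseudo-header, which is out of scope here, so the output is not a valid on-the-wire segment as written.

```python
import struct
from typing import NamedTuple

FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10
TCP_FORMAT = "!HHIIBBHHH"  # network byte order, 20 bytes, no options


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    data_offset: int  # header length in 32-bit words (5 when no options)
    flags: int
    window: int
    checksum: int
    urgent: int


def parse_tcp_header(segment: bytes) -> TcpHeader:
    src, dst, seq, ack, off, flags, win, csum, urg = struct.unpack(
        TCP_FORMAT, segment[:20]
    )
    return TcpHeader(src, dst, seq, ack, off >> 4, flags, win, csum, urg)


def build_tcp_header(h: TcpHeader) -> bytes:
    return struct.pack(
        TCP_FORMAT, h.src_port, h.dst_port, h.seq, h.ack,
        h.data_offset << 4, h.flags, h.window, h.checksum, h.urgent,
    )


def syn_ack_for(syn: TcpHeader, our_isn: int) -> TcpHeader:
    """Second step of the three-way handshake: acknowledge the peer's SYN."""
    return TcpHeader(
        src_port=syn.dst_port,
        dst_port=syn.src_port,
        seq=our_isn,
        ack=(syn.seq + 1) & 0xFFFFFFFF,  # a SYN consumes one sequence number
        data_offset=5,
        flags=SYN | ACK,
        window=65535,
        checksum=0,  # placeholder, see the note above
        urgent=0,
    )
```

Feeding bytes captured from a raw socket through `parse_tcp_header` and comparing them with what Wireshark shows is a cheap way to check the framing before attempting the handshake.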


If you made this into a book, I'd buy it! Especially if it was structured so that each chapter had a coding exercise that built on the previous one so that by the end I had a working stack.

My favourite programming book is structured like this: "The Elements of Computing Systems", also known as "From Nand to Tetris".


I think it boils down to the ability to divide a big problem into smaller parts and understanding in what order you need to tackle them. The more you try building big things the better you get at doing it.

After getting into university I decided to build an interpreter[0]. For someone who didn't even have the notion of a parser, it just felt like an unapproachable task. Even so, I stuck with it and the architecture became clear in time. That's another thing: even if you try to build something and miserably fail at it at each iteration, you still get better at it. The feeling of the task being too vast and daunting is just a feeling, and knowing there's something on the other side makes it easier to power through it.

Hope to see your database on HN someday! :^)

[0] https://github.com/cozis/Noja


I wrote a little embedded TCP stack a while back as a learning project. I just read the RFCs and coded away. I'm sure it's the world's least efficient stack, but it wasn't too hard to get basic functionality.


I am not very familiar with this kind of stuff either, but I have read through "Beej's Guide to Network Programming" (https://beej.us/guide/bgnet/html/split/), which seems like enough to implement something like what OP posted.

As for db's/other interesting things, I haven't read them myself but this site seems solid https://build-your-own.org/. If anyone has any real experience with this site, I would love to hear it!


Not OP, but I think there are three parts to really getting TCP/IP:

- Read the spec(s)

- Learn the OS APIs deeply

- Look at a bunch of real network data in e.g. Wireguard


> Look at a bunch of real network data in e.g. Wireguard

I think you probably meant data in e.g. Wireshark?

But another good suggestion (or maybe what you meant) might be to look at 'real' (production) networking code in e.g. Wireguard, I imagine.


Bene!


Something similar from... a while ago. How does it compare?

https://web.archive.org/web/20060615041317/http://www.sics.s...

uIP is an implementation of the TCP/IP protocol stack intended for small 8-bit and 16-bit microcontrollers. It provides the necessary protocols for Internet communication, with a very small code footprint and RAM requirements - the uIP code size is on the order of a few kilobytes and RAM usage is on the order of a few hundred bytes.

https://github.com/adamdunkels/uip

https://en.wikipedia.org/wiki/UIP_(software)

(I believe uIP was extracted and improved upon from Contiki, a C64 OS with TCP/IP support written in C in 2002: https://www.c64-wiki.com/wiki/Contiki)


Haha, oof, it's evil to compare to that! Adam is a "10x programmer" genius.


Yeah, it felt a bit evil, but I also really wanted to share this because it predates HN and deserves sharing. And then the MicroTCP poster went AWOL...


Was about to post the same. Contiki actually came to fame because it was commonly used for sensor networks, an IoT application. In particular, integrating 6LoWPAN (ZigBee IPv6) made it interesting for us at the time.


[flagged]


> Wtf is a TCP/IP stack?

You may want to consider toning it down. Readers here would be expected to know all this.

And yes TAP is a user mode raw interface to the NIC.


> You may want to consider toning it down.

FWIW I didn’t take OP’s comment to be of a criticizing tone, I just assumed they were legitimately curious and looking to improve their understanding. I don’t think there’s anything for them to “tone down” in that regard.


> Readers here would be expected to know all this.

I'm not asking about the OSI model. I've long moved beyond that. Sorry that wasn't clear from my message; I was typing in haste. I suspect very few readers here have implemented a TCP/IP stack or would even know where to begin. So, I'm acknowledging this.

There is a comment in this thread asking, "How did you even get started with this?" and it was able to successfully solicit the response that I was looking for.


TCP was designed in 1974. Ethernet was designed in 1980. UDP was designed in 1980. ICMP was designed in 1981. ARP was designed in 1982. Think about the computing power available in 1974 or 1980-1982. If you break it down into chunks and follow the RFCs, it's manageable. TAP gives you easy access to read/write Ethernet packets. From there you can start sending ARP queries and responding to ARP requests. ICMP is simple: add support for that and now you can ping. UDP? Just as simple. TCP is more work, but most of that is error handling and timeouts, so skip them for now.
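
As a small illustration of how little ICMP needs, here is a Python sketch that answers an echo request (a ping). It uses the standard Internet checksum (one's-complement sum of 16-bit words, as described in RFC 1071). The function names are made up for this example, and the input is assumed to be the ICMP message only, with the IP header already stripped.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def icmp_echo_reply(request: bytes) -> bytes:
    """Build the echo reply (type 0) for an echo request (type 8)."""
    icmp_type, code, _checksum, ident, seq = struct.unpack("!BBHHH", request[:8])
    if icmp_type != 8 or code != 0:
        raise ValueError("not an ICMP echo request")
    payload = request[8:]
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", 0, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 0, 0, checksum, ident, seq) + payload
```

A handy property for testing: running `internet_checksum` over a message that already contains a correct checksum yields 0.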


Start at the hardware/IP layer. Easiest would be a serial port for the hardware, and there are at least two methods of sending IP over serial, SLIP and PPP---there are documents out there that describe how those work. Then work on IP. IP itself is fairly easy---it's just best effort and IP options are just that---options. Then ICMP. Once you have that, then you can at least ping the device. UDP is the next easiest to implement; it just adds port numbers to an otherwise plain IP packet.
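
SLIP in particular is small enough to show whole. The sketch below follows the byte-stuffing scheme from RFC 1055 (END = 0xC0, ESC = 0xDB); the function names are made up for this example, and the decoder assumes it is handed a complete chunk of the serial stream rather than reading from a port.

```python
END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD


def slip_encode(packet: bytes) -> bytes:
    """Frame one IP packet for a serial line."""
    out = bytearray([END])  # leading END flushes any line noise
    for byte in packet:
        if byte == END:
            out += bytes([ESC, ESC_END])
        elif byte == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(byte)
    out.append(END)
    return bytes(out)


def slip_decode(stream: bytes) -> list[bytes]:
    """Split a received byte stream back into packets."""
    packets = []
    current = bytearray()
    escaped = False
    for byte in stream:
        if escaped:
            # Anything other than ESC_END/ESC_ESC is a protocol violation;
            # like the RFC's reference code, just keep the byte.
            current.append({ESC_END: END, ESC_ESC: ESC}.get(byte, byte))
            escaped = False
        elif byte == ESC:
            escaped = True
        elif byte == END:
            if current:  # empty frames between back-to-back ENDs are dropped
                packets.append(bytes(current))
                current = bytearray()
        else:
            current.append(byte)
    return packets
```

With something like this on top of a serial driver, the "hardware layer" is done and the rest of the work is IP and above.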


The OSI model layers describe exactly what is going on in embedded systems (well, all systems), though. Look at an MCU board with Ethernet. Find the magjack. Then the PHY, then the MAC, then...


> the osi model layers describe exactly what is going on in embedded systems

Unless it doesn't. OSI describes a non-existent stack from the long-gone past.


The OSI model describes the OSI protocols; it does not really describe exactly what goes on.


That's a big assumption. Who says they are expected to know this?


Personally I say yes but who cares what I say. Let's see what this poll says in a few hours.

https://news.ycombinator.com/item?id=38092364


Your poll is asking a different question. You said people are expected to know this, implying that there is a minimum level of knowledge required to participate in Hacker News.

Your poll asks whether they know what a TCP/IP stack is. Even if 100 percent of respondents know, it doesn't validate your claim that Hacker News requires this knowledge.

Btw I got really good responses to my questions about networking recently. Should I not be asking these types of questions?

https://news.ycombinator.com/item?id=37842863


The purpose of using TAP is to be able to write an IP stack in user-mode code. However, TAP is not the only way on Linux: you could also use a raw (AF_PACKET) or XDP socket; they also let you bypass the kernel IP stack.
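
For the TAP route, a minimal Linux-only sketch in Python looks roughly like this. The constants mirror `linux/if_tun.h`; the `TUNSETIFF` value shown is the one used on common architectures such as x86-64 and ARM and may differ elsewhere. The interface name `tap0` and the helper names are assumptions for this example, and opening the device normally needs root or CAP_NET_ADMIN.

```python
import fcntl
import os
import struct

TUNSETIFF = 0x400454CA  # ioctl request number (x86-64/ARM encoding)
IFF_TAP = 0x0002        # Ethernet-level device (as opposed to IFF_TUN)
IFF_NO_PI = 0x1000      # do not prepend the packet-info header


def build_ifreq(name: str) -> bytes:
    """Pack a 40-byte struct ifreq: 16-byte name, flags, padding."""
    return struct.pack("16sH22x", name.encode(), IFF_TAP | IFF_NO_PI)


def open_tap(name: str = "tap0") -> int:
    """Attach to (or create) a TAP interface and return its file descriptor."""
    fd = os.open("/dev/net/tun", os.O_RDWR)
    fcntl.ioctl(fd, TUNSETIFF, build_ifreq(name))
    return fd


def parse_ethernet(frame: bytes):
    """Split off the 14-byte Ethernet header: (dst MAC, src MAC, EtherType)."""
    return struct.unpack("!6s6sH", frame[:14])
```

After `fd = open_tap()`, each `os.read(fd, 2048)` returns one whole Ethernet frame and each `os.write(fd, frame)` injects one, which is all a user-mode stack needs from the kernel.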



