The Basics of Apache Kafka (pankajtanwar.in)
81 points by the2ndfloorguy on Sept 7, 2021 | 11 comments



At the computer science level, there's "The Log: What every software engineer should know about real-time data's unifying abstraction" from 2013:

https://engineering.linkedin.com/distributed-systems/log-wha...

Reading it was an aha moment. At its core, the log is the simplest thing that might work in a lot of cases.


Not a CS person myself, but that was a fascinating read (and very approachable). Thanks!


We use Kafka a lot within the company I work for. I think it's great. The only thing I miss is fast lookups based on some key and/or the ability for subscribers to only receive messages for certain keys.


That’s most likely the trade-off between a queue and a log. A property of the log is that a consumer has to look at every message.

Prefix-based subscriptions can be approximated with a custom partitioner, but the lookup requires another technology. The only way to know the final value under a key is to read the complete log.
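
To make the "read the complete log" point concrete, here is a minimal sketch in plain Python, not Kafka client code. The `(key, value)` record shape, the use of `None` as a delete marker, and the name `materialize` are assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical record shape: (key, value). A value of None is treated
# here as a delete marker for that key (an assumption of this sketch).
Record = Tuple[str, Optional[str]]


def materialize(log: Iterable[Record]) -> Dict[str, str]:
    """Replay every record in order and keep only the latest value per key."""
    table: Dict[str, str] = {}
    for key, value in log:
        if value is None:
            table.pop(key, None)  # delete marker: the key no longer has a value
        else:
            table[key] = value    # a later record overwrites an earlier one
    return table


log = [
    ("user:1", "alice"),
    ("user:2", "bob"),
    ("user:1", "alicia"),  # supersedes the first record for user:1
    ("user:2", None),      # removes user:2
]
table = materialize(log)
print(table.get("user:1"))
```

The lookup itself is fast once the table exists, but building it means touching every record, which is why a separate store usually sits next to the log for keyed reads.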


oddly enough, zeromq has the prefix/key style subscription.

you might look at ksql for filtering purposes or just try to work such things into your topics/partitions?
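
One way to read "work such things into your topics/partitions" is to route records by key prefix, so a subscriber only has to read one partition. The sketch below is plain Python with in-memory lists standing in for partitions; `partition_for`, the `:` separator, and the CRC32 hash are assumptions of this example, not what Kafka's built-in partitioner does.

```python
import zlib
from typing import List, Tuple


def partition_for(key: str, num_partitions: int, sep: str = ":") -> int:
    """Pick a partition from the key's prefix (the text before `sep`)."""
    prefix = key.split(sep, 1)[0]
    # CRC32 is a stand-in hash chosen only because it is deterministic.
    return zlib.crc32(prefix.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 4
partitions: List[List[Tuple[str, str]]] = [[] for _ in range(NUM_PARTITIONS)]

# "Producer" side: every record goes to the partition chosen by its prefix.
for key, value in [("orders:1", "a"), ("users:7", "b"), ("orders:2", "c")]:
    partitions[partition_for(key, NUM_PARTITIONS)].append((key, value))

# "Subscriber" side: read only the partition that holds the wanted prefix.
# Other prefixes can hash to the same partition, so a filter is still needed.
wanted = "orders"
p = partition_for(wanted + ":", NUM_PARTITIONS)
orders = [(k, v) for k, v in partitions[p] if k.startswith(wanted + ":")]
print(orders)
```

This narrows what a subscriber reads but does not remove the filtering step, which matches the "somewhat" in the parent comment.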


Kafka 2.8.0 removes the mandatory ZooKeeper dependency (KRaft mode, shipped as early access in that release).


This should be labelled as Show HN. One comment - towards the end you say

> You might think that Kafka would be using queue data structure internally. It’s not true. Kafka uses a “log” data structure. It is a persistent data structure which allows only appends, no editing, no deletion. In detail, we will cover this some other day.

Almost immediately followed by:

> Ok, I have a confession . I lied. A Kafka topic is not just a single queue. It’s a combination of queues which helps kafka scale. Every queue is called partition.
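
For what it's worth, the two quoted statements fit together if each partition is itself an append-only log. A rough sketch in plain Python follows; the `Topic` class, its method names, and the CRC32-based partition choice are made up for illustration and are not Kafka's API.

```python
import zlib
from typing import List, Tuple


class Topic:
    """A topic modelled as several append-only logs, one per partition."""

    def __init__(self, num_partitions: int) -> None:
        self.partitions: List[List[Tuple[str, str]]] = [
            [] for _ in range(num_partitions)
        ]

    def append(self, key: str, value: str) -> Tuple[int, int]:
        """Append a record and return (partition, offset). Nothing is edited."""
        p = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1

    def read(self, partition: int, offset: int) -> List[Tuple[str, str]]:
        """Return records from `offset` onward. Reading removes nothing."""
        return self.partitions[partition][offset:]


topic = Topic(num_partitions=3)
p1, off1 = topic.append("k", "v1")
p2, off2 = topic.append("k", "v2")

# Unlike popping from a queue, reading twice yields the same records.
first = topic.read(p1, 0)
second = topic.read(p1, 0)
print(first == second)
```

The same key always lands in the same partition here, so ordering holds per key rather than across the whole topic.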


> This should be labelled as Show HN

No, it's a simple blog post.


Yes, posted by the author of the blog post.


From the FAQ [0] -

> Show HN is for sharing your personal work and has special rules.

The first line of the rules [1] -

> Show HN is for something you've made that other people can play with. HN users can try it out, give you feedback, and ask questions in the thread.

[0]: https://news.ycombinator.com/newsfaq.html

[1]: https://news.ycombinator.com/showhn.html


You’re correct. I have learned something new:

> Off topic: blog posts, sign-up pages, newsletters, lists, and other reading material. Those can't be tried out, so can't be Show HNs. Make a regular submission instead.



