Hacker News | hijinks's comments

Location: Southern CA | Remote: Yes, remote only | Willing to relocate: No

Technologies: Kubernetes, Terraform, AWS, Go, Python, Docker, Helm, Argo, Kong, Prometheus, VictoriaMetrics, Vector, Quickwit

Resume/CV: [available on request]

Email: open4work2026@gmail.com (want to keep it anonymous because coworkers are on HN)

24 years in infrastructure, currently Lead Platform Engineer at a security AI startup where I built the platform team from scratch (0 to 5 engineers) and architected multi-tenant Kubernetes infrastructure across 40+ clusters with per-tenant KMS encryption and Argo-based progressive delivery.

Before that, spent nearly 6 years as Staff SRE at Iterable where most of my work centered on making infrastructure cheaper and more reliable. Replaced ELK with Quickwit for 65TB/day log ingestion (saved $1.2M/year), migrated Lambda pipelines to Vector ($60k/mo down to $8k/mo), and stood up VictoriaMetrics to replace Datadog for 33M metrics ($30k/mo savings). I tend to find the project that saves the most money and then actually build it.

I write Go -- built Kubernetes operators for RabbitMQ, Redis, and Vector (open source: github.com/zcentric/vector-operator). Terraform is my IaC of choice and I've used it everywhere from a global Predix platform (Japan, EU regions) to scaling Kubernetes clusters from 10 to 300 nodes on spot instances for satellite image processing at Urthecast.

I've managed teams at three companies (up to 6 engineers) and served as product owner for SRE groups, but I'm open to both IC and management roles at the Staff/Principal/Director level.

Looking for: DevOps, SRE, Platform Engineering, or Cloud Infrastructure roles. $200K+ total comp. Remote only, US-based.


i had to do this for ssh

Host *
  SetEnv TERM=xterm-256color


is there a better way than bloom filters to handle needle in the haystack type searches where the haystack might be terabytes of data and you only want a few lines?


There are a lot of "better than Bloom" filters that work similarly in some aspects. I have used Cuckoo [1] and Ribbon [2] filters for Bloom-type applications. If you have an application where you do a lot of one kind of searching, it may also be worth implementing a specialized variant of a data structure. I needed a Cuckoo-type filter on the JVM but only for 64 bit integers and I was able to make a smaller, faster code base that was specialized to this data type instead of handling generic objects.

You need to know up front whether you need to be able to dynamically add entries to the filter or if your application can tolerate rebuilding the filter entirely whenever the underlying data changes. In the latter case you have more freedom to choose data structures; many of the modern "better than Bloom" filters are more compact but don't support dynamic updates.

[1] https://en.wikipedia.org/wiki/Cuckoo_filter

[2] https://engineering.fb.com/2021/07/09/core-infra/ribbon-filt...


I wonder how often in the wild people are tuning for a 1% false positive rate versus a much lower one, like .1%. You do quickly reach data set sizes where even 1% introduces some strain on resources or responsiveness.

Cuckoo claims 70% of the size of bloom for the same error rate, and the space is logarithmic in the error rate. Looks like about 6.6 bits per record versus 9.56 bits for bloom at 1%. But at a .5% error rate a cuckoo is 7.6 bpr. In fact you can get to about a .13% error rate for a cuckoo only a hair larger than the equivalent bloom filter (2^9.567 = 758.5, so roughly 1 in 758).
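A minimal Python sketch of the arithmetic above. It assumes the textbook formula for an optimally tuned Bloom filter (about 1.44 * log2(1/p) bits per key) and takes log2(1/p) bits per record for the cuckoo side, which is what the 6.6 and 7.6 figures correspond to. That second formula is really the information-theoretic floor, so a real cuckoo filter will likely need somewhat more once fingerprint overhead and load factor are counted. The function names are made up for illustration.

```python
import math


def bloom_bits_per_key(fp_rate: float) -> float:
    """Bits per key for an optimally tuned Bloom filter: log2(1/p) / ln 2."""
    return math.log2(1.0 / fp_rate) / math.log(2.0)


def floor_bits_per_key(fp_rate: float) -> float:
    """Information-theoretic floor, log2(1/p), used above as the cuckoo estimate."""
    return math.log2(1.0 / fp_rate)


def fp_rate_for_bits(bits_per_key: float) -> float:
    """Inverse of floor_bits_per_key: the error rate that a given bit budget buys."""
    return 2.0 ** -bits_per_key


if __name__ == "__main__":
    for p in (0.01, 0.005, 0.001):
        print(f"p={p}: bloom={bloom_bits_per_key(p):.2f} bits, "
              f"floor={floor_bits_per_key(p):.2f} bits")
    # Spend the 1% Bloom budget (about 9.567 bits) at log2(1/p) bits per record.
    print(f"error rate at 9.567 bits: {fp_rate_for_bits(9.567):.5f}")
```

The exact Bloom figure at 1% comes out slightly above 9.567 because 1.44 is a rounded 1/ln 2.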


Cuckoo filters can do even better with the small adjustment of using windows instead of buckets. See "3.5-Way Cuckoo Hashing for the Price of 2-and-a-Bit": https://scispace.com/pdf/3-5-way-cuckoo-hashing-for-the-pric.... (This significantly improves load factors rather than changing anything else about the filter, and ends up smaller than the semi-sorted variant for typical configurations, without the rigmarole.)

My fairly niche use case for these kinds of data structures was hardware firewalls running mostly on SRAM, which needed a sub one-in-a-billion false positive rate.


thanks.. i'll read up on these.. always amazes me that companies like datadog somehow made log search quick


can't wait.. i guess on the 27th they are dropping support for SAML


people seem to have forgotten that some people in LA used to walk outside in gas masks because of all the air pollution.


if op was working at Meta for 10 years and they started in 2013 they probably have more than enough money stashed away to not worry about bills for a few years.


highly likely they have millions from the stock alone, assuming they didn't sell


correct, the guy is not exactly strapped for cash after 10 sweet years at Facebook. He's 10M+ NW probably.


they are planning to let you keep your logs in your own datacenter/cloud, with something like a proxy there (or something built into quickwit) so that your logs show up in the datadog UI

My guess is you will be billed per gig or something but not nearly the cost of shipping your logs to DD


too many people and not enough land in areas where people don't have to drive 3 hours to work.

Want pricing to go down then we need to build more dense housing even an hour drive from the city. The days of wanting a big backyard are coming to an end for most home owners.


> Want pricing to go down then we need to build more dense housing

You need mass transit and transport integration. House density can only move you so far (and it's not very far).


Luckily, density is what makes mass transit viable. It's more cost effective to run a driverless metro every three minutes in an urban core than to run a mostly-empty bus once an hour in a distant suburb.


> Luckily, density is what makes mass transit viable.

Up to some point, and it's not even that high...

What really makes mass transit viable is integration.


What if we put the jobs closer to the people instead of making the people get closer to the jobs? Just drop a big ol’ tech park in the middle of Oakland?


Who is the “we” in that sentence? Is there a Central Planning Bureau that forces “jobs” to be placed in certain locations? What jobs would you place near the people, whatever that means?


I often see comments in the theme of "dense housing is the panacea."

You can't really run power tools in dense housing, correct? Or fix stuff yourself? Sounds awful.


I was born, raised, and lived in dense cities. I've lived a semi-suburban life as well. Unless you're into hobbies that require such tools, you just never use them? And when you have to... you just use them? I live in an apartment building in a city, and once a month or so, during daytime, people use tools and it's no biggie.

To each their own though. I definitely grew to understand that if someone was raised in rural or suburban life, it would be extremely hard to adjust to hardcore city life, and vice versa. But I don't think we should be blocking build ups for one, if there's demand.


We just bought a place in a dense area of The Hague, and I run a table saw + shop vac frequently as we renovate. No complaints yet, just keep my hours between 10-6. Lots of other neighbors doing similar stuff too.

There are lots of benefits to density. Our grocery store and day care are less than ten minutes away on foot, because there's a ton of people so we can support these kinds of businesses (also weed, hair salons, bars, cafes, boutiques, secondhand stores, restaurants, play cafes, etc etc.)


you can use cline in vscode to give it a task and it uses an LLM to go out and create the app for you.

In django i had it create a backend, set up an admin user, create requirements.txt and then do a whole frontend in vue as a test. It can even do screen testing, and it tested what happens if it puts a wrong login in.


does anyone use this?

I'm really starting to get sick of companies that claim they operate at petabyte scale, and then you find you need to spend 400k a month to support that scale.


Thousands of active deployments globally.

How many open source log systems work at PB scale given any number of resources? Also FWIW, OpenObserve can ingest data at 28 MB/Sec/Core (We are working on optimizing it even more) and ingesting 1 PB of data would cost just $435 based on on-demand prices (AWS m7g family).
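For anyone wanting to check the $435 figure, here is a rough Python sketch of the arithmetic. The per-core price is an assumption, not something stated above: it uses $0.0408 per vCPU-hour (an m7g.large at an assumed $0.0816/hour for 2 vCPUs), treats 1 PB as 2^50 bytes and 1 MB as 2^20 bytes, and counts compute only, with no storage or network cost. The function name is hypothetical.

```python
def ingest_compute_cost_usd(total_bytes: float,
                            bytes_per_sec_per_core: float,
                            usd_per_core_hour: float) -> float:
    """Compute-only cost of ingesting total_bytes at a fixed per-core rate."""
    core_seconds = total_bytes / bytes_per_sec_per_core
    core_hours = core_seconds / 3600.0
    return core_hours * usd_per_core_hour


if __name__ == "__main__":
    PB = 2 ** 50                   # assumed binary petabyte
    MB = 2 ** 20                   # assumed binary megabyte
    ASSUMED_USD_PER_VCPU_HOUR = 0.0408  # assumption: m7g.large price / 2 vCPUs

    cost = ingest_compute_cost_usd(PB, 28 * MB, ASSUMED_USD_PER_VCPU_HOUR)
    core_hours = PB / (28 * MB) / 3600.0
    print(f"core-hours: {core_hours:,.0f}")
    print(f"compute cost: ${cost:,.2f}")
```

Under those assumptions the result lands within a dollar of the quoted number, which suggests the figure covers ingest compute at full per-core utilization and nothing else.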


That doesn't answer the question of who? A (rightly) cynical reading of what you posted could just be "thousands of active deployments" you did for yourself to prove benchmarks.


Machines I would use for benchmarking would go down after some time and wouldn't be active.


Still didn't answer the "who" part.


We will publish many names on our website soon.


Why is it only 28 MB/core-second?

Is that production rate, inbound bandwidth, rate to persistence, rate to processed, or rate to display?


Compute power is required to process and store the incoming data.

It's not "only 28 MB/Sec/Core". Try doing the same with Splunk/Elasticsearch - you won't go past 5 MB/Sec/Core (typically it will be lower) on their best day.


To what state?

Suppose I have 28 GB of trace data in memory on a machine and then I fire that off. What do I have after 1000 seconds?

Do I just have a file of 28 GB of raw trace?

Do I have 28 GB of raw trace in memory ready to be indexed?

Do I have a data structure in memory ready to be searched?

Do I have the full trace information rendered on my screen (or an aggregated visualization derived after processing all the data)?

If it is the first, that would be ridiculously slow. If it is one of the latter ones, then it would depend on what querying operations are fast.

28 MB/core-second makes no sense without the context of what you can do quickly after the “processing” is done.


Too much to give all the details in an HN thread. To simplify the conversation: data will be persisted and usable for individual searches and aggregations. I would welcome you to our Slack workspace for any further questions you may have - https://short.openobserve.ai/community

