
I've built my personal side project as microservices. I started with an initial POC in Python and then I had a clear vision for what services to build.


> I’d have the readme on Github, and often in an hour or maybe a few I’d be up and running when I started on a new project.

I can deploy all of my services with one command. It's trivial - and I can often just deploy the small bit that I want to.

I don't use K8s or anything like that. Just AWS Lambdas and SQS based event triggers.
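For readers who haven't seen this setup, here is a minimal sketch of what an SQS-triggered Lambda handler can look like in Python. The `Records`, `body`, and `messageId` fields are the standard shape of the SQS event that Lambda delivers. The `process` function and its payload fields are hypothetical stand-ins, not code from the project discussed here. Returning `batchItemFailures` only has an effect if partial batch responses (`ReportBatchItemFailures`) are enabled on the event source mapping.

```python
import json


def process(payload):
    # Hypothetical business logic: reject payloads without an "id".
    if "id" not in payload:
        raise ValueError("payload is missing 'id'")
    return {"id": payload["id"], "status": "processed"}


def handler(event, context):
    """Lambda entry point, invoked with a batch of SQS messages."""
    failures = []
    for record in event.get("Records", []):
        try:
            process(json.loads(record["body"]))
        except Exception:
            # Report only this message as failed so SQS can redeliver it alone.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Because the handler is a plain function, it can be called locally with a hand-built event dictionary, with no AWS account involved.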

One thing I found was that by defining what a "service" was upfront, I made life a lot easier. I don't have snowflakes - everything uses the same service abstraction, with only one or two small caveats.

I don't imagine a junior developer would have a hard time with this - I'd just show them the service abstraction (it exists in code using AWS-CDK)[0].

> This in contrast to my standard consolidated log, and lets not forget my interactive terminal/debugger for when I wanted to go step by step through the process.

It's true, distributed logging is inherently more complex. I haven't run into major issues with this myself. Correlation IDs go a really long way.
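To illustrate the correlation ID idea, here is a rough sketch: every event carries an ID that downstream events inherit, and every log line includes it, so one search pulls up the whole flow across services. The event shape and the `new_event` and `log_line` helpers are assumptions made for this example, not a specific library's API.

```python
import json
import uuid


def new_event(payload, parent=None):
    """Create an event, inheriting the parent's correlation ID if there is one."""
    correlation_id = parent["correlation_id"] if parent else str(uuid.uuid4())
    return {"correlation_id": correlation_id, "payload": payload}


def log_line(service, event, message):
    """Format a structured log line that can be searched by correlation ID."""
    return json.dumps(
        {
            "service": service,
            "correlation_id": event["correlation_id"],
            "message": message,
        },
        sort_keys=True,
    )


# Two services handling the same logical request share one ID.
first = new_event({"step": "parse"})
second = new_event({"step": "analyze"}, parent=first)
print(log_line("parser", first, "parsed input"))
print(log_line("analyzer", second, "analysis started"))
```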

Due to serverless I can't just drop into a debugger though - that's annoying if you need to. But also, I've never needed to.

> But now to really test my service I have to bring up a complete working version of my application.

I have never seen this as necessary. You just mock out service dependencies like you would a DB or anything else. I don't see this as a meaningful regression tbh.
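A small sketch of that kind of test, using Python's standard `unittest.mock`. `EnrichmentService` and its `owner_client` dependency are made-up names for illustration; the point is only that the downstream service gets replaced the same way a database client would be.

```python
from unittest.mock import Mock


class EnrichmentService:
    """Hypothetical service that depends on another service's client."""

    def __init__(self, owner_client):
        self.owner_client = owner_client

    def enrich(self, event):
        owner = self.owner_client.get_owner(event["host"])
        return {**event, "owner": owner}


def test_enrich_adds_owner():
    # Stand-in for the real downstream service; no network involved.
    owner_client = Mock()
    owner_client.get_owner.return_value = "team-a"

    service = EnrichmentService(owner_client)
    result = service.enrich({"host": "web-1"})

    assert result == {"host": "web-1", "owner": "team-a"}
    owner_client.get_owner.assert_called_once_with("web-1")


test_enrich_adds_owner()
```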

> That is probably a bit too much effort so we’re just going to test each piece in isolation, I’m sure our specs were good enough that APIs are clean and service failure is isolated and won’t impact others.

Honestly, enforcing failure isolation is trivial. Avoid synchronous communication like the plague. My services all communicate via async events - if a service fails the events just queue up. The interface is just a protobuf-defined data format (which is, incidentally, one of the only pieces of shared code across the services).
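The "events just queue up" behavior can be sketched with an in-memory queue. This is a toy stand-in for a durable queue such as SQS, with hypothetical names throughout; it only shows why a failing consumer doesn't take the producer down with it.

```python
from collections import deque


class EventQueue:
    """In-memory stand-in for a durable queue such as SQS."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def drain(self, consumer):
        """Deliver messages in order; stop at the first failure and keep the rest."""
        delivered = 0
        while self._messages:
            try:
                consumer(self._messages[0])
            except Exception:
                break  # Consumer is down: leave the message queued for a retry.
            self._messages.popleft()
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._messages)


def broken_consumer(message):
    raise RuntimeError("service is down")


handled = []
queue = EventQueue()
for i in range(3):
    queue.publish({"n": i})  # The producer never calls the consumer directly.

queue.drain(broken_consumer)  # Nothing is lost while the consumer is failing.
queue.drain(handled.append)   # After recovery, the backlog is processed.
```

The producer only ever talks to the queue, so an outage downstream shows up as a growing backlog rather than as errors upstream.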

Honestly, I didn't find the road to microservices particularly bumpy. I had to invest early on in ensuring I had deployment scripts and the ability to run local tests. That was about it.

I'm quite glad I started with microservices. I've been able to think about services in isolation, without ever worrying about accidental coupling or accidentally having shared state. Failure isolation and scale isolation are not small things that I'd be happy to throw away.

My project is very exploratory - things have evolved over time. Having boundaries has allowed me to isolate complexity and it's been extremely easy to rewrite small services as my requirements and vision change. I don't think this would have been easy in a monolith at all.

I think I'm likely going to combine two of my microservices - I split up two areas early on, only to realize later that they're not truly isolated components. Merging microservices seems radically simpler than splitting them, so I'm unconcerned about this - I can put it off for a very long time and I still suspect it will be easy to merge. I intend to perform a rewrite of one of them before the merge anyways.

I've suffered quite a lot from distributed monolith setups. I'm not likely to jump into one again if I can help it.

[0] https://github.com/insanitybit/grapl/blob/master/grapl-cdk/i...

Grapl looks quite interesting. I'm looking for something similar for public cloud (e.g. cloudtrail+config+?? for building graph+events). Is there a general pattern you employ for creating the temporal relationship between events? For example, Word executing a subprocess and then making a connection to some external service. Do you just timestamp them, or is there something else?

I think what you're getting at is Grapl's identification process. It's timestamp based, primarily, at the moment, yes.

A bit of the algorithm is described here: https://insanitybit.github.io/2019/03/09/grapl

More specifically Grapl defines a type of identity called a Session - this is an ID that is valid for a time, such as a PID on every major OS.

Sessions are tracked or otherwise guessed based on logs, such as process creation or termination logs. Because Grapl assumes that logs will be dropped, arrive out of order, or be extremely delayed, it makes an effort to "guess" at identities. It's been quite accurate in my experience but the algorithm has many areas for improvement - it's a bit naive right now.
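As a rough illustration of the timestamp-based idea (this is not Grapl's actual algorithm or code, just a simplified sketch under stated assumptions): keep the known creation times per PID sorted, and attribute an event to the most recent creation at or before the event's timestamp. The `SessionTable` class and the `pid@time` ID format are invented for this example, and termination logs are ignored entirely.

```python
import bisect


class SessionTable:
    """Maps (pid, timestamp) to a guessed session ID for processes."""

    def __init__(self):
        self._times = {}  # pid -> sorted creation timestamps
        self._ids = {}    # pid -> session IDs, parallel to _times

    def record_creation(self, pid, created_at):
        """Register a process creation log; logs may arrive in any order."""
        times = self._times.setdefault(pid, [])
        ids = self._ids.setdefault(pid, [])
        pos = bisect.bisect_right(times, created_at)
        session_id = f"{pid}@{created_at}"
        times.insert(pos, created_at)
        ids.insert(pos, session_id)
        return session_id

    def resolve(self, pid, timestamp):
        """Guess the session: the latest creation at or before the timestamp."""
        times = self._times.get(pid, [])
        pos = bisect.bisect_right(times, timestamp)
        return self._ids[pid][pos - 1] if pos else None


table = SessionTable()
table.record_creation(42, 100)
table.record_creation(42, 500)   # PID 42 reused by a later process
print(table.resolve(42, 150))    # 42@100
print(table.resolve(42, 600))    # 42@500
```

Because insertion keeps the list sorted, a creation log that shows up late still lands in the right place, and later lookups pick it up. A `None` result means no creation has been seen yet for that PID at that time, which is where a real system would have to guess.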

Happy to answer more questions about it though.

Based on what you seem to be interested in, I'd like to recommend CloudMapper by Scott Piper.


The blog post is super helpful! I think the session concept is the thing I needed. Thank you!

I tried running CloudMapper but I think I would need to replace the backend with a graph database and scrap the UI parts. We've got hundreds of AWS accounts and I'm having trouble just getting it to process all the resources in one of them.

FWIW, Scott Piper, who builds CloudMapper, also consults.

Glad I could help.
