Show HN: Posthook – Job Scheduling as a Service (posthook.io)
162 points by cgenuity on June 19, 2018 | 75 comments

Hello all! I built Posthook as a simple solution for web applications to schedule one-off tasks. It lets you schedule a request back to your application for a certain time, with an optional JSON payload.

It can be an alternative to running your own systems like Sidekiq, Celery, or Quartz and the operational overhead that comes along with them. Cron jobs and cloud provider tools like CloudWatch Events are also used for job scheduling but lack observability and may force you to frequently make expensive queries to your data store just to see if there is any work to do.

Questions and feedback are greatly appreciated :)

Neat! We've been doing something like this at work, and another simple solution (curious if your Google cloud infrastructure is built on something similar) is Azure Service Bus queues with a function app trigger.

Unlike AWS SQS, Azure SB queues can have items scheduled to be queued at an arbitrary time, see https://docs.microsoft.com/en-us/dotnet/api/microsoft.azure..... SQS can only delay up to 15 minutes, so you have to implement your own hack to schedule for later than that.

And also unlike SQS, they can trigger a function so you don't have to deal with the logic to listen yourself - or even the transactional message handling, if the function fails the message is automatically re-queued.
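One possible shape for the "own hack" mentioned above, as a minimal sketch: keep re-enqueueing the message with the largest delay the queue allows until the due time is reached. The 900-second cap is the SQS limit named in the comment; the function name and the simulated loop are assumptions for illustration, and no real queue client is called.

```python
import math
from datetime import datetime, timedelta, timezone

SQS_MAX_DELAY_SECONDS = 900  # the 15-minute per-message cap mentioned above


def next_delay_seconds(due_at: datetime, now: datetime) -> int:
    """Delay to use when (re-)enqueueing a message that is due at `due_at`."""
    remaining = math.ceil((due_at - now).total_seconds())
    return max(0, min(remaining, SQS_MAX_DELAY_SECONDS))


# Simulate the hops for a job due in one hour (hypothetical clock, no real queue).
start = datetime(2018, 6, 19, 12, 0, tzinfo=timezone.utc)
due = start + timedelta(hours=1)
clock, hops = start, 0
while clock < due:
    clock += timedelta(seconds=next_delay_seconds(due, clock))
    hops += 1
print(hops)  # 4 hops of 900 seconds each
```

Each consumer that receives the message before it is due would re-send it with `next_delay_seconds(...)` instead of processing it.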

AWS just announced "ECS Daemon Scheduling": https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-ec... It sounds promising.

Thanks! I do heavily use Google Cloud Pub/Sub but it does not support scheduling items so I only schedule items when they are due for processing.

Wouldn't it be good to have your own pub/sub mechanism instead of relying on a third party like GCE for a critical component? Take for example https://github.com/grpc/grpc/issues/13327. It took them a fair amount of time to fix that issue.

Interesting... AWS leaked that subscriptions to SQS are in the works, but it's good to see that Azure has that already.

> AWS leaked that subscriptions to SQS are in the works

Interesting, where did you hear that?

I'm unclear about the part about expensive queries to the data store. What cron/celery/quartz job would require a check to the data store for a check that wouldn't need to be done with posthook? It seems like if work depends on a decision made with data from my data store, that doesn't change based on what I use as a timer for the task. I'm not sure that I see a clear value-add here.

With a recurring job say every minute, the query triggered by the job to send out event reminders would be something like "get me all the events starting in one hour." With Posthook, you are able to start jobs only when needed and the query changes to something like "do I still need to send out a reminder for this event id."
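To make the difference between the two queries concrete, here is a minimal sketch using an in-memory SQLite table. The `events` schema, column names, and function names are assumptions for illustration and are not taken from Posthook.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE events ("
    " id INTEGER PRIMARY KEY, starts_at INTEGER,"
    " reminder_sent INTEGER DEFAULT 0, deleted INTEGER DEFAULT 0)"
)
db.executemany(
    "INSERT INTO events (id, starts_at) VALUES (?, ?)",
    [(1, 3600), (2, 7200), (3, 90000)],
)


def poll_due_reminders(now: int, window: int = 3600) -> list:
    """Recurring-job style: scan for every event starting within the next hour."""
    rows = db.execute(
        "SELECT id FROM events"
        " WHERE deleted = 0 AND reminder_sent = 0"
        " AND starts_at BETWEEN ? AND ? ORDER BY id",
        (now, now + window),
    ).fetchall()
    return [row[0] for row in rows]


def should_send_reminder(event_id: int) -> bool:
    """Scheduled-hook style: point lookup for the one event named in the payload."""
    row = db.execute(
        "SELECT deleted, reminder_sent FROM events WHERE id = ?", (event_id,)
    ).fetchone()
    return row is not None and row[0] == 0 and row[1] == 0
```

The first function runs every minute whether or not anything is due; the second runs only when a hook fires and touches a single row by primary key.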

Hey cgenuity, great to see the simplicity. I am working on a very similar product, launched a couple of days back on HN (https://send.rest). Best of luck.

But I have added a couple of extra features in send.rest:

1) Calling external APIs ( Think : pull data from Facebook while running this task in future and post data on my API or send SMS )

2) Reports and retries ( Try 3 times or send me an SMS that it's failing )

3) Recurrence ( Call this task every Monday 9:00 pm )

4) SMS and email come built in ( Send SMS or email, pick any service as your backend )

I guess without some extra features it will be difficult to get market acceptance.

Thank you, good luck to you as well!

I've found that with scheduling tasks for the future it's important to do a final check before fulfilling the action, whether it's sending an email, push notification, etc. You don't want to send a reminder for an event that has been deleted, for example. That is why I have decided to keep the scope small and let developers make the final decision there.

Reports and retries are definitely things that I think add value though, and I have plans to expand on that.

I also think there might be a market for this kind of product. Great to see more people thinking along the same lines. Got some paid signups from last week also.

You might be right about email/push, but it certainly comes in handy if minimal config is required.

There are already mature products in this market, e.g. RunMyJobs: https://rmj.redwood.com/

This started out as a distributed cron over 20 years ago, available as a saas product for a few years already.

With send.rest and using hook, can I specify a POST and POST data and custom request headers?

send.rest supports HAR1.2 calls, you can technically call any API

Thank you for this. I've been on the hunt for this exact service for a while now but a lot of the options were lacking. Excited to give this a try.

Thanks! Happy to provide value, feel free to use the live chat or the support email for any specific questions or requests you might have.

Does Posthook support recurring scheduling (i.e. every 15 minutes)?

Also, is there retry logic? I.E. 3 retries, delay 60 seconds between retries?

It does not support recurring scheduling at the moment.

Right now the retry logic is just one retry 5 seconds after the first failure. If that retry also fails, the hook gets set to a failed status and failure notifications get sent out. Retries are tricky because, depending on how the job is implemented, they can cause more harm than good. So I plan to refine that more based on customer feedback.

> Retries are tricky because depending on how the job is implemented they can cause more harm than good.

Ah, yes, there's nothing like bringing capacity back online only to have it crushed by all your customers retrying at the same time.

AWS got bit hard by this[3] but there's a blog post[1] about it, which is linked to by the docs for their client software[2].

[1] https://aws.amazon.com/blogs/architecture/exponential-backof...

[2] https://docs.aws.amazon.com/general/latest/gr/api-retries.ht...

[3] https://aws.amazon.com/message/5467D2/ ... basically DynamoDB is a fundamental service for AWS and had implemented some new streams features. This all appeared to be working, but they were running closer to capacity than intended, and when a cluster went out this caused a cascading failure.

And see this which linked to that RCA: https://blog.scalyr.com/2015/09/irreversible-failures-lesson...
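The backoff-and-jitter idea behind the linked AWS blog post can be sketched as follows. This uses the "full jitter" variant, where each delay is drawn uniformly between zero and a capped exponential; the base, cap, and attempt count are arbitrary assumptions, not values from any of the services discussed.

```python
import random


def backoff_delays(base: float, cap: float, attempts: int, rng: random.Random) -> list:
    """Full-jitter delays: attempt n waits uniform(0, min(cap, base * 2**n)) seconds."""
    return [
        rng.uniform(0, min(cap, base * 2 ** attempt))
        for attempt in range(attempts)
    ]


# Seeded only so the sketch is repeatable; real clients would not share a seed.
delays = backoff_delays(base=1.0, cap=30.0, attempts=6, rng=random.Random(0))
for attempt, delay in enumerate(delays):
    print(f"attempt {attempt}: wait {delay:.2f}s")
```

Because every client draws its own random delay, retries after an outage spread out instead of arriving in synchronized waves.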

Seems like allowing the user to specify retry logic (if any) covers the "tricky" pieces. Let the user define the number of retries and delay between.
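A user-defined policy of this kind could look roughly like the sketch below. `RetryPolicy`, `deliver`, and the "completed" status are hypothetical names for illustration; only the "failed" status and the current behavior of one retry after 5 seconds come from the thread.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RetryPolicy:
    max_retries: int = 3          # retries after the first attempt
    delay_seconds: float = 60.0   # fixed wait between attempts


def deliver(
    send: Callable[[], bool],
    policy: RetryPolicy,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `send` once, then up to `max_retries` more times; return the final status."""
    for attempt in range(policy.max_retries + 1):
        if send():
            return "completed"
        if attempt < policy.max_retries:
            sleep(policy.delay_seconds)
    return "failed"


# The behavior described earlier in the thread: one retry, 5 seconds later.
current_default = RetryPolicy(max_retries=1, delay_seconds=5)
```

Passing `sleep` as a parameter keeps the function testable without real waiting. Whether retries are safe at all still depends on the receiving endpoint being idempotent.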

Agreed, thank you :). Added to the board.

What happens when you have a service outage? Do you directly mark the hooks as failed, or do you retry once after your service has been restored?

If the service outage is on Posthook's side, they would be retried.

If the outage is on the customer's side, all hooks that were attempted during the outage would be marked as failed. I plan on adding a feature that will allow the developer to fire off again all failed hooks in a given time period.

I just posted a scalable service that does support recurring scheduling ( every 15 minutes, as fast as every second, or as long as every n months ). Show HN here: https://news.ycombinator.com/item?id=17353486

I use cron-job.org for recurring hooks. It's not as developer friendly as posthook but it gets the job done.

What is the absolute easiest way to do this on your own machine?

systemd timers most likely. You already have systemd and unit configurations are small.

For dynamic stuff using normal systemd templates should work.

This is definitely a needed service. As a business, if something isn't core to the business model, it's better to pay money and avoid distractions, at least until it's clear that something is actually core to the business.

That said, I've wanted something akin to a distributed cron, without the complexities of workflow engines like airflow or azkaban, so I started writing a little HTTP API for scheduling jobs [1], which is mostly complete except for configurable retry logic and failure notifications. To schedule a periodic job that runs every 30 seconds,

    curl http://$HOSTNAME/jobs \
    -d '{ "title"      : "Healthcheck"
        , "code"       : "curl example.com"
        , "frequency"  : "30 seconds" }'
[1] https://github.com/finix-payments/jobs

Hey. I just made this. It's a coincidence that the OP and I are building this at the same time, but AFAICT the OP's service does not support recurring tasks, while Pocketwatch does. Per your example, try:

   curl https://api.pocketwatch.xyz/v1/timer/new \
   -H "Content-type: application/json" \
   -d '{ "apiKey"             : "i_am_hn_and_proud" 
       , "url"                : "https://host.tld/?my_param=my_val" 
       , "method"             : "POST"
       , "interval_unit_type" : "second" 
       , "interval_unit_count": 1 
       , "duration_unit_type" : "minute" 
       , "duration_unit_count": 6 }'
You can use curl but the Node.js client library[1] is probably easier. I've made a free demo API key for HN readers that should be enough to let everyone try it out a bit[2].

[1] https://www.npmjs.com/package/%40dosy%2Fpocketwatch

[2] https://news.ycombinator.com/item?id=17353486

> it's better to pay money and avoid distractions, at least until it's clear that something is actually core to the business.

Sure, but what about avoiding liabilities?

Just curious, why would I use this instead of Sidekiq, Celery or any other job / background worker service that runs along side my app?

Almost all popular job processing libraries in all major frameworks support executing 1 off tasks with robust retry policies and even support recurring tasks.

The operational complexity is really just running the sidekiq or celery command in another Docker container or if you don't use Docker, then setting up 1 extra systemd unit file.

I once had a Celery / Flask app running on a $20 / month digitalocean server. It handled over 3 million background jobs per month and it also hosted my DB server, cache server and web front end.

On your pricing page, your highest tier supports up to 100,000 requests per month at $129/month. How much would it cost to do 3 million requests per month?

Also how would you set up custom retry policies?

I understand Posthook may not be the best solution for everyone's scheduled tasks. It seems like for that project you had the expertise, time, and risk tolerance to set up that one DO server to handle all your needs. Other developers may be in a different situation and may find Posthook useful.

> How much would it cost to do 3 million requests per month?

If you want to use Posthook to schedule 3 million requests a month please email support [at] posthook.io and we can work out a special plan and support contract. Do keep in mind that background jobs and scheduled jobs are two different things and what I aim for Posthook to solve are scheduled jobs. I suspect a big part of those 3 million were background jobs.

> Also how would you set up custom retry policies?

The retry policy is fixed at the moment but it seems like this is a common request so I want to roll that feature out soon.

> I suspect a big part of those 3 million were background jobs.

Yep, almost all of them were tasks that should be executed as soon as possible, but Celery also allows for you to schedule tasks at a future date, and deal with running scheduled tasks at a repeated time (every Sunday at 3am, etc.).

As for the price, I was just curious for the sake of wanting to compare it to the $20 / month DO set up.

Thanks for the answers.

Once you have a retry policy, you pretty much by definition have scheduled jobs, so just slap an API on there and you’re all set. (jk)

> How much would it cost to do 3 million requests a month?

Hey! Coincidentally at the same time as the OP, I just built a recurring task hook as a service, and I can do 1 request every second (2.6 million per month) for USD$26

You can see that pricing if you plug your requirement (or if you're just curious) into the console[1]. That console is for buying once-off timers.

But I've also made a free API key for people on HN to try out using the API[2]. (those free demo jobs will be capped at 2 weeks no matter what duration is requested tho, and I might end up capping individual timers a certain amount as well, but not yet).

There's also a console for trying out this free API key (which doesn't include pricing/cost info)[3], and I just put this free demo on Show HN[4].

[1] https://pocketwatch.xyz

[2] https://dosyago-corp.github.io/pocketwatch-api/

[3] https://api.pocketwatch.xyz/fancy.html

[4] https://news.ycombinator.com/item?id=17353486

I would be really interested in a distributed job running system with a scheduler. I want a UI where I can set jobs via a schedule that run on one of many workers. In that same UI I want to be able to look at failed jobs and see all of the output from that job. I don't want to have to ssh over to the box where the job ran to retrieve a log file. I once had a job where we used Tidal Enterprise Scheduler which poorly implemented this. What solutions exist for this workflow in the open source world?

Look into airflow: https://airflow.apache.org/

Not quite distributed, but https://www.rundeck.com/open-source is pretty nifty.

Rundeck, as others have indicated, will do just that.

I've used Tidal too, it's utterly horrible and universally hated by anyone I know who has used it. What took months to set up and debug in Tidal, took us hours in Rundeck.

Rundeck does exactly this.

If you already have something like Jenkins you could probably bend that to your will, too.

Would you pay for this? How much and what pricing scheme would you prefer?

As a service that runs on Heroku I'd pay around $25/month. That'd be $25/month * 2 (stage+prod). At the bootstrapping stage I'm at now, that's what I'd pay. If I were profitable, I'd pay up to $500 - $1000 a month and happily so; figure out market segmentation so that I'd be forced to pay that much. I would probably run around 1000 jobs a month, which seems like a number that would put me in a higher pricing range. This service would be super useful for around 60 jobs a month even. Output from the jobs would be tiny, maybe 10k at most. I would be happy if output was saved on private S3 buckets.

I guess the big point is, I'm looking to use this for regularly scheduled jobs, not just as a queuing service to spread load.

My email is in my profile if you want to get in touch.

This is a good niche with the rise of serverless as there isn't a super simple way to do this if you want to hit some job every so often. You could set a header key that all requests require and bam, you got security for an ongoing task. I use kue.js myself with jobs for node and glue stuff together with redis across servers but serverless seems like it could work for a lot of use cases coupled with this idea with less code overall.

Thanks sebringj! I agree and the rise of serverless was definitely one of the motivating factors to me building this service. Security is handled through signing the payload that includes a timestamp with an HMAC to prevent replay attacks as well.
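A receiver-side check along these lines might look like the sketch below. The message layout (timestamp, a dot, then the body), the SHA-256 choice, and the 300-second tolerance are assumptions for illustration; Posthook's actual signature format is not described in this thread.

```python
import hashlib
import hmac
import time


def sign(secret: bytes, timestamp: int, body: bytes) -> str:
    """HMAC-SHA256 over the timestamp and body, hex encoded (assumed layout)."""
    message = str(timestamp).encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, timestamp: int, body: bytes, signature: str,
           now: float = None, tolerance: int = 300) -> bool:
    """Reject stale timestamps first, then compare signatures in constant time."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > tolerance:
        return False
    return hmac.compare_digest(sign(secret, timestamp, body), signature)


secret = b"hypothetical-shared-secret"
body = b'{"event_id": 42}'
sig = sign(secret, 1529400000, body)
print(verify(secret, 1529400000, body, sig, now=1529400010))  # True
```

Because the timestamp is covered by the signature, an attacker cannot refresh it on a captured request. A replay inside the tolerance window would still pass, so handlers that must run once should also deduplicate on a request identifier.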

I wonder how you ensure it's reliable 24/7? I am definitely interested in using it.

ah ok -> https://aws.amazon.com/batch/ that's probably how

How do you compare/compete against something like Azure Scheduler, which gives you 500 jobs with unlimited executions (max 1 per minute) for $14/month?

The Azure service also supports logging (60 days), retry options, and advanced recurrence schedules (every other Tue, etc.).

Hey. That's interesting. I'll answer as well since coincidentally at same time as OP I just built a recurring webhook scheduler[1].

That Azure pricing is very competitive / cheap! That's 22 million requests per month for USD 14.

1 timer on my service can do at most 1 request per second (2.6 million per month) and that 1 timer would cost $26.

On my biggest plan you can get 4000 timers and 100 million requests per month for $495.

Still Azure is beating my prices by about 10 times. But they can't do second resolution (yet?), so they're still only a "distributed cron".
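The figures quoted above can be checked with a few lines of arithmetic. This sketch assumes a 30-day month and every job or timer firing at its maximum rate; the prices are the ones stated in the comments.

```python
MINUTES_PER_MONTH = 60 * 24 * 30          # 43,200
SECONDS_PER_MONTH = 60 * MINUTES_PER_MONTH  # 2,592,000

azure_requests = 500 * MINUTES_PER_MONTH   # 500 jobs at 1/minute: 21,600,000
timer_requests = 1 * SECONDS_PER_MONTH     # 1 timer at 1/second: 2,592,000


def usd_per_million(price_usd: float, requests: int) -> float:
    """Price per one million requests."""
    return price_usd / (requests / 1_000_000)


azure = usd_per_million(14, azure_requests)        # about 0.65
single_timer = usd_per_million(26, timer_requests)  # about 10.03
biggest_plan = usd_per_million(495, 100_000_000)    # 4.95

print(f"Azure: ${azure:.2f}, single timer: ${single_timer:.2f}, "
      f"biggest plan: ${biggest_plan:.2f} per million requests")
print(f"ratio vs Azure: {biggest_plan / azure:.1f}x to {single_timer / azure:.1f}x")
```

Under these assumptions the gap is roughly 8x on the biggest plan and 15x for a single timer, which brackets the "about 10 times" estimate.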

Azure Scheduler seems to be aimed towards recurring tasks. So a scheduled request from Posthook is more actionable than a recurring execution from Azure Scheduler for the use cases I'm aiming to help solve with Posthook (think reminders). Also I'm hoping Posthook is a simpler offering in terms of pricing and integration.

Nice simple solution to a sure problem. Kudos, and good luck!

Lately, we've had great experience using Hangfire (https://www.hangfire.io) for async job processing in .NET. It's an awesome piece of software.

I use cloudwatch events, you can configure cron/frequency and define json to send out.

We use CloudWatch events too. It works really well but does require a bit more technical know-how and debugging than a simple cURL so there is that.

This is neat, nice work OP.

I haven't seen this mentioned yet but you could leverage Zapier.com for this type of work. They have a "schedule" app you can pipe it into an outgoing webhook to achieve the same result.

It's pretty powerful. For example, at work, we wanted to run a specific task at different times: nightly, after deploys, and using a Slack command. This was all achieved using 3 Zaps (Zapier apps) and it took no more than 30 minutes for the whole thing.

The status page is currently timing out for me: https://status.posthook.io/

Seeing the same, thanks. Looks like updown.io is having an issue with TLS. It also lives here if you wanted to take a look: https://updown.io/fj8q

That's something I will definitely use soon! Great simple idea indeed. Just one question regarding the criteria used to come up with the current pricing tiers: isn't it a bit too expensive for one-off scheduling? I mean, over time I could totally see this as a commodity service tracking the price of popular VPS offerings (in tandem: they increase theirs, you increase yours; their prices go down, yours go down too).

Thank you! Pricing is still subject to change as I get more data from the market but the aim was to price it by the value it provides developers over just going off what the servers cost to run.

This is pretty brilliant and fits an immediate need we have. Signing up today.

Thank you! I was surprised something like this didn't exist when I needed it. Feel free to use the live chat or email support for any specific questions you have.

This is a perfect addition to a Serverless stack. Thank you.

How does this compare to http://atrigger.com?

Nice work! I built https://cronally.com a few years ago for a similar use-case.

Keep it up!


Seems more like at-as-a-service?

How does it compare to https://www.setcronjob.com?

Does anyone know of a free service that lets you run a few scheduled CURL requests each week?

Just set up a handler to `/wp-admin/index.php` guaranteed to be called multiple times a day for free.

Check EasyCron https://www.easycron.com/

Awesome. Thank you

For free on app engine, you can have a cron trigger a local URL which can then invoke a remote URL. I do this every 8 hours (I actually go one further and have that external URL be a Google Cloud function which does a bit of serverless work for me, also at no cost). See here: https://github.com/cretz/badads/tree/master/cron

You could leverage Zapier to do this. Their pricing suggests you can make two-step Zaps for free and that's all you'd need.

Step 1 - schedule
Step 2 - outgoing webhook

It takes all of 5 minutes to set things up but we have a pro account so I haven't tried the free tier yet.

Also check out https://cronally.com -- you can schedule HTTP requests or hook directly into SNS.

We do something similar at Iron.io but with Docker containers.
