
Not a very deep CS-y one, but still one of my favourite data structures: Promise Maps.

It only works in languages where promises/futures/tasks are a first-class citizen. Eg JavaScript.

When caching the result of an expensive computation or a network call, don't actually cache the result, but cache the promise that awaits the result. Ie don't make a

    Map<Key, Result>
but a

    Map<Key, Promise<Result>>
This way, if a new, uncached key gets requested twice in rapid succession, ie faster than the computation takes, you avoid computing/fetching the same value twice. This trick works because:

- promises hold on to their result value indefinitely (until they're GC'ed)

- you can await (or .then()) an existing promise as many times as you want

- awaiting an already-resolved promise is a very low-overhead operation.

In other words, the promise acts as a mutex around the computation, and the resulting code is understandable even by people unfamiliar with mutexes, locks and so on.
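
A minimal sketch of the idea in JavaScript (the names `getCached` and `slowDouble` are illustrative, not from the post):

```javascript
// Cache the promise itself, not the resolved value, so concurrent
// requests for the same key share one in-flight computation.
const cache = new Map();

function getCached(key, compute) {
  if (!cache.has(key)) {
    cache.set(key, compute(key));
  }
  return cache.get(key);
}

// Example: a stand-in for an "expensive" async computation.
let calls = 0;
const slowDouble = async (n) => {
  calls += 1;
  return n * 2;
};

// Two rapid requests for the same key trigger only one computation.
const [a, b] = await Promise.all([
  getCached(3, slowDouble),
  getCached(3, slowDouble),
]);
console.log(a, b, calls); // 6 6 1
```

Both callers end up awaiting the same promise object, which is exactly the "promise as mutex" behavior described above.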




FYI, if you plan to do this in ASP netcore, combine with AsyncLazy for optimal results https://github.com/davidfowl/AspNetCoreDiagnosticScenarios/b...


Neat page from David Fowler. I didn't know about that one. Thank you.

To further explain why you want to use an AsyncLazy (or Lazy for synchronous programming) when working with ConcurrentDictionary I find this article to be helpful: https://andrewlock.net/making-getoradd-on-concurrentdictiona...


Be careful with this data structure. If the language allows async exceptions, or you have a bug where the promise never becomes determined, there are a lot of edge cases. Examples of edge cases:

- if the promise never becomes determined (eg bug, async exception) your app will wait forever

- if the promise has high tail latency things can be bad

- if the language eagerly binds on determined promises (ie it doesn’t schedule your .then function) you can get weird semantic bugs.

- changing the keys of the table incrementally instead of just using it for memoisation can be really messy


I have a couple pieces of code where we had to add rate limiting because it was just too easy to make too many async calls all at once, and things only got 'worse' as I fixed performance issues in the traversal code that were creating an ersatz backpressure situation.

Philosophically, the main problem is that promise caches are a primary enabler of the whole cache anti-pattern situation. People make the mistake of thinking that 'dynamic programming' means memoization and 'memoization' means caching. "Everything you said is wrong." Trying to explain this to people has been a recurring challenge, because it's such a common, shallow but strongly-held misunderstanding.

Youtube recently suggested this video to me, which does a pretty good job of explaining beginning and intermediate DP:

https://www.youtube.com/watch?v=oBt53YbR9Kk

What I love most about this video is that it introduces memoization about 10 minutes in, but by 20% of the way through it has already abandoned it for something better: tabulation. Tabulation is an iterative traversal of the problem that gathers the important data not only in one place but within a single stack frame. It telegraphs an information architecture, while recursive calls represent emergent behavior. The reliance on a cache to function not only obscures the information architecture, it does so by introducing global shared state. Global shared state and emergent behavior are both poison to the long-term health of a project, and here we have a single data structure that represents both in spades.

We are supposed to be fighting accidental coupling, and especially global variables, not engaging in word games to hide what we are doing.

So I think my answer to OP's question is 'tabulation', which is part data structure and part algorithm.


I wrote some Python code recently that uses a similar data structure (Futures instead of Promises, without knowing necessarily about the data structure's use in JavaScript). It wasn't really for caching.

I modeled mine on the Django ASGI reference library's server implementation, which uses the data structure for maintaining references to stateful event consumers. Exception handling is done with a pre-scheduled long-running coroutine that looks at the map.

I'm curious about your second point -- why exactly do things get bad with high tail latency? Is it only a weakness of the data structure when used for caching? I'm having trouble picturing that.


Suppose the call to evaluate has a p95 of 5 seconds (this is very large of course). If your first call to compute a value hits it, all the subsequent requests for that cell are blocked for 5s. If you didn’t try to cache, only one might block for 5s and the rest could go through fast. On the other hand if you do 20 requests then about one of them will get the p95 latency.


Best practice in my experience is to use a timeout on all async operations to handle edge cases 1 & 2 above. The third case isn't possible with JavaScript AFAIK.
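
That timeout practice can be sketched with `Promise.race`; the `withTimeout` helper and the durations below are illustrative assumptions, not anything from the thread:

```javascript
// Hypothetical helper: wrap the cached promise in a timeout so a
// never-resolving computation can't wedge every caller forever.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("timed out")), ms);
  });
  // Clear the timer either way so a pending setTimeout doesn't linger.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Usage sketch: cache the wrapped promise, and evict the key on failure.
// const p = withTimeout(fetchThing(key), 5000);
// cache.set(key, p);
// p.catch(() => cache.delete(key));
```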


- if the language eagerly binds on determined promises (ie it doesn’t schedule your .then function) you can get weird semantic bugs.

What would be an example of this? If the promise has been determined, why not just immediately run the .then function?


The reason not to do it is to limit the non-determinism of the order of execution.

In Javascript

    f1();
    p.then(f3);
    f2();
functions are always called in the relative order f1, f2, f3 and never f1, f3, f2 even if p had already resolved.


I know this as memoization with lazy evaluation, nothing new, but at the same time very useful, and I would argue, it is very CS.


not really quite lazy evaluation, though, at least in JavaScript. Promises begin execution as soon as they are created and there is no way to delay that.


Promises are an implementation of lazy evaluation for Javascript. This is exactly lazy evaluation.

By the way, lazy evaluation is the one that offers no guarantees about when your code will be executed. If you can delay the execution, it's not lazy.


Wait. I thought lazy evaluation is defined as evaluation at the time a value is needed. After that I think it will then be in Weak Head Normal Form, which can be thought of as "evaluated at its top level"... but I'm a bit rusty. Basically, an expression gets used somewhere (e.g. pattern matched, in the ADT sense). If it's in WHNF, cool, it's evaluated already (subexpressions within may not yet, but that's their problem--that's exactly what WHNF says). If not, we evaluate it down to WHNF.

Wikipedia states it as: "When using delayed evaluation, an expression is not evaluated as soon as it gets bound to a variable, but when the evaluator is forced to produce the expression's value."

So if evaluation begins effectively at the time the promise is issued, not at the time the promise is awaited, then I would not call that lazy evaluation, and what hn_throwaway_99 said sounds correct to me.

Is my rusty old PL understanding wrong?


> Wait. I thought lazy evaluation is defined as evaluation at the time a value is needed.

Correct.

You've got it right, GP comment has it wrong.

Lazy does not simply mean "evaluated later". From what I understand, in JS, if you call an async function without `await`, it essentially gets added to a queue of functions to execute. Once the calling function stack completes, the async function will be executed.

A queue of functions to execute does not constitute "lazy evaluation" on its own.


Definition on Wikipedia says otherwise:

“In programming language theory, lazy evaluation, or call-by-need,[1] is an evaluation strategy which delays the evaluation of an expression until its value is needed (non-strict evaluation) and which also avoids repeated evaluations (sharing).”

Haskell would be an apt example here.


Promises in javascript are absolutely not considered 'an implementation of lazy evaluation.' Promises are invoked eagerly in javascript.
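
This eager behavior is easy to demonstrate in a few lines; contrast a promise's executor (runs at construction) with a thunk (runs only when forced):

```javascript
const log = [];

// A Promise's executor runs immediately at construction (eager)...
const p = new Promise((resolve) => {
  log.push("promise body ran");
  resolve(1);
});
log.push("after construction");

// ...whereas a thunk only runs when it is explicitly forced (lazy).
const thunk = () => {
  log.push("thunk body ran");
  return 2;
};
log.push("after thunk definition");
thunk(); // forced here

console.log(log);
// ["promise body ran", "after construction",
//  "after thunk definition", "thunk body ran"]
```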


I've always called it request piggybacking (atop memoization). Our db abstraction at my last gig absolutely required it for scaling purposes. Great tool for the tool belt.


Can someone explain to me why the second example is better?

To me it seems to be the same thing. Replace result with int and I literally do not see a problem with the first one.

Also why is a mutex or lock needed for Result in javascript? As far as I know... In a single threaded application, mutexes and locks are only needed for memory operations on two or more results.

With a single value, say an int, within javascript you are completely safe in terms of concurrent access. Only 2 or more variables can cause a race condition. Or am I wrong here?

--edit:

Thanks for the replies. I see now. The mutex concept is not around memory access or memory safety. It's solely around the computation itself. When the Promise is "pending". It has nothing to do with safety. The parent did mention this, but I completely glossed over it.


You can put the promise into the cache immediately but you can only put the result from the promise into the cache once the promise resolves. So if an identical request comes in a second time before the promise has been resolved, then if you are caching the promise you have a cache hit but if you are caching the result then you have a cache miss and you end up doing the work twice.


I am still not understanding the purpose of this as I believe it is grounded on the wrong assumption.

Pretty much every single asynchronous operation other than some `Promise.resolve(foo)` where foo is a static value can fail. Reading from the file system, calling an api, connecting to some database, etc.

If the original promise fails you're gonna return a cached failure.

Mind you, I'm not stating this might be completely useless, but at the end of the day you will be forced to add code logic to check all of those asynchronous computation results, which will eventually outweigh the cost of only saving the resolved data.


> If the original promise fails you're gonna return a cached failure.

This is usually a good thing, and even in cases where it isn't, it's often a worthwhile trade-off.

In the common case a failure will either be persistent, or - if load/traffic related - will benefit from a reduction in requests (waiting a while to try again). In both of these cases, where your first key request fails, you want the immediately-following cases to fail fast: "caching" the promise caches the failure for the duration of one request (presumably the cache is emptied once the promise resolves, allowing subsequent key accesses to retry).

The less common case where the above isn't true is where you have very unstable key access (frequent one-off failures). In those cases you might want a cache miss on the second key request, but successful key retrieval usually isn't as critical in such systems which makes the trade off very worthwhile.


> If the original promise fails you're gonna return a cached failure.

In the scenario where that's an issue, you would need to add (extremely trivial 3-5 lines) logic to handle retrying a cached failure. The underlying data structure would continue to be a promise map.
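
A sketch of that retry logic, assuming the simple evict-on-rejection policy described here (the names are illustrative):

```javascript
const cache = new Map();

function getCached(key, compute) {
  let p = cache.get(key);
  if (!p) {
    p = compute(key);
    // On rejection, drop the entry so a later request can retry.
    // Concurrent in-flight callers still share (and see) the failure.
    p.catch(() => cache.delete(key));
    cache.set(key, p);
  }
  return p;
}
```

Note the trade-off: callers who joined while the promise was pending all observe the failure, but the very next caller after rejection triggers a fresh attempt.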


> If the original promise fails you're gonna return a cached failure.

In many cases, another near-in-time request would also fail, so returning a cached failure rather than failing separately is probably a good idea (if you need retry logic, you do it within the promise, and you still need only a single instance.)

(If you are in a system with async and parallel computation both available, you can also use this for expensive to compute pure functions of the key.)


You do one IO operation instead of two.

It's unlikely that always doing two is going to be more successful than just trying once and dealing with failure


It's a tiny optimization.

When the VERY FIRST ASYNC operation is inflight the cache is immediately loaded with a Promise, which blocks all other calls while the FIRST async operation is in flight. This is only relevant to the very FIRST call. That's it. After that the promise is pointless.

As for the Promise failure you can just think of that as equivalent of the value not existing in the cache. The logic should interpret the promise this way.


It's not always a tiny optimization. If you have an expensive query or operation, this prevents potentially many duplicative calls.

A practical example of this was an analytics dashboard I was working on years ago -- the UI would trigger a few waves of requests as parts of it loaded (batching was used, but would not be used across the entire page). It was likely that for a given load, there would be four or more of these requests in-flight at once. Each request needed the result of an expensive (~3s+) query and computation. Promise caching allowed operations past the first to trivially reuse the cached promise.

There are certainly other approaches that can be taken, but this works very well as a mostly-drop-in replacement for a traditional cache that shouldn't cause much of a ripple to the codebase.


Yep, I've been bitten by exactly this failure mode.

You have to invalidate/bust the cache when a failure is returned (which is racy, but since cache busting is on the sad path it's a fine place to protect with a plain ol' mutex, assuming you even have real parallelism/preemption in the mix).

Alternatively you can cache promises to functions that will never return failure and will instead internally retry until they succeed. This approach generalizes less well to arbitrary promises, but is more friendly to implementing the custom back off/retry scheme of your choice.


It's not the cost of saving the resolved data.

If I understand the pattern correctly, it is to avoid multiple asynchronous requests to a resource that has yet to be cached.


Yeah, that's my understanding too.

It seems like an optimization to prevent lots of downstream requests that occur in rapid succession before the first request would have finished. I'd also suspect that the entry would be removed from the map on failure.


I believe the idea is that if you stored the result instead, you would have to wait for the promise to resolve the first time. If you made two requests in quick succession, the second request wouldn't see anything in the result map for the first one (as it hasn't resolved yet) and would then need to make its own new request.

If you store the promise, then not only does the second attempt know that it doesn't need to make a new request, but it can also await on the promise it finds in the map directly.


Wait a minute.

Under JavaScript, the behavior induced by this pattern is EXACTLY equivalent to that of CACHING the result directly and a BLOCKING call. The pattern is completely pointless if you view it from that angle. Might as well use blocking calls and a regular cache rather than async calls and cached promises.

So then the benefit of this pattern is essentially it's a way for you to turn your async functions into cached non-async functions.

Is this general intuition correct?

--edit: It's not correct. Different async functions will still be in parallel. It's only all blocking and serial under the multiple calls to the SAME function.


Say you have three elements in your UI that need to fetch some user info. You have two major options:

1/ Have a higher level source of truth that will fetch it once from your repository (how does it know it needs to fetch? How is cache handled ?) and distribute it to those three elements. Complex, makes components more independent but also more dumb. It's fine to have pure elements, but sometimes you just want to write <UserInfoHeader/> and let it handle its stuff.

2/ Your repository keeps this Map<Key, Promise<T>>, and every time you call getUserInfo(), it checks the map at key "userinfo" and either returns the promise (which might be ongoing, or already resolved) or sees that it's not there and does the call, writing the promise back into the map. This way, your three components can just call getUserInfo() without giving a damn about any other ones. The first one that calls it pre-resolves it for others.

As to why a promise instead of just the raw result: one can potentially return null (so you need to call again later to refresh, or it straight up blocks during the entire call), while the other just gives you back promises that you can listen to and update your UI whenever it's ready (which might be in 5 seconds, or right now because the promise has already been resolved)

It's a bad implementation of a cached repository (because it ignores TTL and invalidation as well as many problems that need to be handled) that any junior developer could figure out (so it's anything but obscure), but sometimes, eh, you don't need much more.


Because the computation of Result is _expensive_ and _async_.

So if you get two requests for the computation that computes Result in close succession the Map doesn't have the result in place for the second call and as a result you get no caching effect and make 2 expensive requests.

By caching the Promise instead the second request is caught by the cache and simply awaits the Promise resolving so now you only get 1 expensive request.


I didn't know this was a formal design pattern; I do recall having implemented this myself once, as a means to avoid double requests from different components.

Later on I switched to using react-query which has something like this built-in, it's been really good.


This is only tangentially related to:

> you avoid computing/fetching the same value twice

But this problem comes up in reverse proxies too. In Nginx, you can set the `proxy_cache_lock`[0] directive to achieve the same effect (avoiding double requests).

[0]: https://nginx.org/en/docs/http/ngx_http_proxy_module.html#pr...


That right there is Cache Stampede[0] prevention.

[0]: https://en.wikipedia.org/wiki/Cache_stampede


Interesting! This is an issue I had with an external API which I intended to cache on my serverless workers infra.

I hit the API's rate limiter when the workers invalidated their cache, even though I staggered the lifetime of the cache keys for each replicated instance. This is how I found out how many Cloudflare workers run on a single edge instance. Hint: It's many.

Didn't know it had a name. I'm delighted, thanks!


You’re welcome, happy to help. If you are in the .NET space I suggest you to take a look at FusionCache. It has cache stampede protection built in, plus some other nice features like a fail-safe mechanism and soft/hard timeouts https://github.com/jodydonetti/ZiggyCreatures.FusionCache


I love this approach and have used it many times in JavaScript. I often end up adding an additional map in front with the resolved values to check first, because awaiting or then'ing a Promise always means you will wait until the next microtask for the value, instead of getting it immediately. With a framework like React, this means you'll have a flash of missing content even when it is already cached.
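
A rough sketch of that two-map arrangement; `getSync`/`getAsync` are illustrative names, not an established API:

```javascript
// A synchronous map of resolved values sits in front of the promise map,
// so already-cached values can be read without waiting a microtask.
const resolved = new Map();
const pending = new Map();

function getSync(key) {
  // Synchronous fast path: undefined means "not resolved yet".
  return resolved.get(key);
}

function getAsync(key, compute) {
  if (resolved.has(key)) return Promise.resolve(resolved.get(key));
  let p = pending.get(key);
  if (!p) {
    p = compute(key).then((value) => {
      resolved.set(key, value); // promote to the synchronous map
      pending.delete(key);
      return value;
    });
    pending.set(key, p);
  }
  return p;
}
```

A UI component can then consult `getSync` first and render immediately on a hit, falling back to `getAsync` (and a loading state) on a miss.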


Replacing the promise with the result in the map (using a sum type as the map value type) might be more efficient than maintaining two maps?


This surprises me. I had expected the microtask to be executed right after the current one, ie before any layouting etc - isn't that the whole point of the "micro" aspect?


I was able to reproduce this in a simple example [1]. If you refresh it a few times you will be able to see it flash (at least I did in Chrome on Mac). It is probably more noticeable if you set it up as an SPA with a page transition, but I wanted to keep the example simple.

[1] https://codesandbox.io/s/recursing-glitter-h4c83u?file=/src/...


I am deeply disturbed and confused. Do you understand why this happens?


The event loop shouldn't continue the cycle until the microtask queue is empty, from what I remember.

Browsers probably used to squeeze layouting and such things in when the event loop completed an iteration or became idle. However nowadays it's entirely possible that we have, or will have, browsers that do work in parallel while the javascript event loop is still running, possibly even beginning to paint while javascript is still running synchronously (haven't seen that yet).

I've seen instances where browsers seem to be like "nah, I'll render this part later" and only would render some parts of the page, but certain layers (scrolling containers) would not be filled on the first paint - despite everything being changed in one synchronous operation!* And I've seen them do that at least 5 years ago.

It seems to me the situation is complicated.

* There was a loading overlay and a list being filled with data (possibly thousands of elements). The loading overlay would be removed after the list was filled, revealing it. What often happened was that the overlay would disappear, but the list still looking empty for a frame or two.


Yeah, saves you from thundering herd problems

https://github.com/ben-manes/caffeine cache does something similar by caching the future that gets returned on look-ups if the value is still being computed


I have a similar pattern when using an rxjs pipeline to do async tasks in response to changes in other values. The typical pattern would be

    readonly foo = bar.pipe(
        switchMap(v => doTask(v)),
        shareReplay(1),
    );
Where doTask returns a promise or an AsyncSubject. The shareReplay allows foo to cache the last value so any late subscriber will get the most recent value. This has a problem of returning cached values while new work is happening (eg while a network request is in progress), so instead I like to write

    readonly foo = bar.pipe(
        map(v => doTask(v)),
        shareReplay(1),
        switchMap(v => v),
    );
This way the shareReplay is caching the promise instead of the result. With this pattern I can interact with this pipeline from code like so

    async someEventHandler(userInput) {
        this.bar.next(userInput);
        const result = await firstValueFrom(this.foo);
        …
    }
Without the pattern, this code would get the old result if there was ever any cached value


It’s also the clearest and least buggy way to iterate over the results: map the items to promises, then await Promise.all on the resulting array.


Partly off-topic, but in JS I tend to avoid iterating over promises with maps because it's always confusing what will/won't work, so I use a basic FOR loop because async/await works in there.

How bad is this? Should I switch to Promise.all instead?


Iterating with async/await means you wait for every result before making another call. You basically execute the async calls one by one.

Promise.all runs them all simultaneously and waits until they are all complete. It returns an array of results in the same order as the calls, so it's usually pretty straightforward to use.

Both approaches have a purpose, so it's not like you "should" strictly use one. But you should be aware of both and be able to use the one that fits the need.
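
The two approaches can be sketched side by side (`sequential`/`concurrent` are illustrative names):

```javascript
// Sequential: each await blocks the loop until that call finishes,
// so the async calls run one by one.
async function sequential(items, fn) {
  const results = [];
  for (const item of items) {
    results.push(await fn(item));
  }
  return results;
}

// Concurrent: start every call up front, then wait for all of them.
// Promise.all preserves the input order in its result array.
async function concurrent(items, fn) {
  return Promise.all(items.map(fn));
}
```

With a network call that takes ~100ms, the sequential version over N items takes roughly N × 100ms, while the concurrent one takes roughly 100ms total (subject to rate limits on the other end).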


The third major strategy being Promise.any():

- as soon as any of them resolve (any)

- when all of them resolve (all)

- resolve each in order (for)


Promise.all/allSettled/etc will allow you to resolve your promises concurrently rather than awaiting each result in the for loop. Depends what your use case is, really.


Use my approach or yours, but never use forEach with await. I don’t remember exactly why, but it just doesn’t work and will lead to very confusing errors.


Because Array.forEach doesn’t return a value or anything that could be awaited. If you use Array.map you can create an Array of promises that can be awaited with Promise.all


Just make sure Map access is thread safe - awaiting promises is - but updating/reading map keys usually isn't.


Yep, in multithreaded langs you just want to use a ConcurrentMap or whatever the stdlib has for this. The write is very brief because you only write the promise and not eg keep a proper lock on the key until the result has arrived, so no need for a fancier datastructure.


js is single-threaded so no problem there.


Since OP mentioned Mutexes I assumed he's dealing with multithreaded code, but with JS this works as advertised :)


This is clever, otherwise you have to coordinate your cache setter and your async function call between potentially many concurrent calls. There are ways to do this coordination though, one implementation I've borrowed in practice in python is modes stampede decorator: https://github.com/ask/mode/blob/master/mode/utils/futures.p...


I believe that any time you await/then you wind up going back to the event loop. It's cheap ish but definitely not free. This is, of course, to make the behavior more predictable, but in this case it might be undesirable. Unfortunately, I don't think there's any obvious way around it. It can sorta be done with vanilla callbacks, but that takes you out of the async/promises world.


I have written many tools that do something like this, very useful.

Just gotta be careful with memory leaks, so (for a TS example) you might want to do something like this:

    const promiseMap = new Map<string, Promise<any>>();

    async function keyedDebounce<R>(key: string, fn: () => Promise<R>): Promise<R> {
      const existingPromise = promiseMap.get(key);
      if (existingPromise) return existingPromise;

      const promise = new Promise<R>(async (resolve) => {
        const result = await fn(); // ignore lack of error handling

        // avoid memory leak by cleaning up
        promiseMap.delete(key);

        // this will call the .then or awaited promises everywhere
        resolve(result);
      });

      promiseMap.set(key, promise);

      return promise;
    }
So that the promise sticks around for that key while the function is active, but then it clears it so that you're not just building up keys in a map.


This is a great solution to the thundering herd problem, but isn't OP explicitly trying to cache the results for later? In that case, the promise map is a clever way to make sure the cachable value is fetched at most once, and you want the keys to build up in the map, so GC'ing them is counterproductive.


Ya it obviously depends on the intended goal, but caching items in a map indefinitely better be done on a set of values that is known to be limited to a certain size over the lifetime of the server or it'll lead to a memory leak.


If the result of the async function is not necessarily the same value every time you call it, you may want to recompute it


Please correct me if I'm wrong. But you'd better avoid creating extra promises for performance reasons

  const promiseMap = new Map<string, Promise<any>>();

  async function keyedDebounce<R>(key: string, fn: () => Promise<R>): Promise<R> {
    const existingPromise = promiseMap.get(key);
    if (existingPromise) return existingPromise;

    const promise = fn().then(result => {
      promiseMap.delete(key);

      return result;
    });

    promiseMap.set(key, promise);

    return promise;
  }


You check for if(promiseMap.get(key)) and in the NO case you do promiseMap.delete(key)?

Could you explain why that's necessary? (sorry probs a stupid question)


So if the promise still exists, that means that there is an active call out for a promise using this key already, so we'll just return it. We know this because when the function that is executing finishes, we delete the promise from the map.

A purpose of this might be, let's say you rebuild some structure very frequently, like hundreds of times per second or whatever. You can call this function for a given key as many times as you want while that async process is executing, and it will only execute it once, always returning the same promise.


In the no case, the promise is created, and then deleted after the function has resolved.

I’m not entirely sure of the use of this pattern where you're using a map and deleting the keys as you go - seems like you’d end up doing more work kinda randomly depending on how the keys were added. I’d just release the whole map when I was done with it.


Won’t JS garbage collect orphaned references? Why is this necessary?


there is some confusion here.

OP is intentionally caching results of, presumably, expensive computations or network calls. It's a cache. There is no memory leak. OP just did not detail how cache invalidation/replacement happens. The 2nd comment adds a rather rudimentary mechanism of removing items from cache as soon as they are ready. You get the benefit of batching requests that occur before the resolve, but you get the downside of requests coming in right after the resolve hitting the expensive process again. Requests don't neatly line up at the door and stop coming in as soon as your database query returns.

Typically you would use an LRU mechanism. All caches (memcached, redis, etc.) have a memory limit (whether fixed or unbounded, i.e. all RAM) and a cache replacement policy.
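
For illustration, a toy LRU on top of a plain Map, relying on Map's insertion-order iteration; this is a sketch of the idea, not production cache code:

```javascript
// Rough LRU: on each get, re-insert the key to mark it most-recently-used;
// on insert past capacity, evict the oldest (first-iterated) entry.
class LruMap {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key); // move to most-recently-used position
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.map.keys().next().value;
      this.map.delete(oldest);
    }
  }
}
```

The values stored could be the promises themselves, giving a bounded promise map.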


It does, but the map has a reference to it, so it will "leak" (in gc languages an unwanted reference is considered a leak). If this map got rather large, you could end up with a rather large heap and it would be un-obvious why at first.


Do Promises hold a reference to the chain of functions that end in the result? If so, that seems like a bug.


A promise is just an object, it does not contain any reference to the chained functions.


Could you use a WeakMap for this instead?


If the key is a type that you expect to be GC'ed, yes, totally. If the key is a simple value type eg a string then it behaves the same as a regular Map or an object.


Ah yeah, good point.

I've mostly used this overall pattern with something like Dataloader, where you throw away the entire cache on every request, so the GC problem doesn't crop up.


https://github.com/tc39/proposal-function-memo

This might be relevant to this exact pattern

  const getThing = (async function (arg1, arg2) {
    console.log({ arg1, arg2 });
    return arg1 + (arg2 || 0);
  }).memo();

  console.log(await getThing(1)); // New promise
  console.log(await getThing(2)); // New promise
  console.log(await getThing(1)); // Returns promise from earlier


>> Not a very deep CS-y one

yes, in fact this is more like a pattern than an actual data structure, since you might as well replace it with a list of futures. And a comparison between future-based concurrency and thread-based parallelism is like apples to oranges.

But it isn't surprising that this pattern ended up at the top since it is what the users of this site would be most familiar with.


Vitess does something similar, "query coalescing", which was designed for loads that are very read-heavy. If 1000 requests come in that end up making the same read query from the database, and they all came in in the same 50ms that it takes to get the results from the database, you can just return the same answer to all of them.


I wrote a version of this with Elixir: https://github.com/bschaeffer/til/tree/master/elixir/gen_ser...

Didn't know what to call it but PromiseMaps is nice name for it.


Promise Caching has a nicer name to it since that's what it is if well-implemented.


> It only works in languages where promises/futures/tasks are a first-class citizen. Eg JavaScript.

Why would that be true?


I think the actual answer is that it only works in languages where you can eagerly execute async code disconnected from the Promise/Future construct. It would've worked in ES3 using a Promise polyfill. (Which is notable because it does not fit the aforementioned criteria—it had no real async support.)


It isn’t. The first implementation of promises I worked on was in Java. We used it as part of verifying certificate chains. If you use threads it doesn’t matter if the operations are blocking, as long as you move them off-thread.


Anybody know if there is a similar pattern for Python's asyncio?


How about a dict whose values are asyncio.Future?

https://docs.python.org/3/library/asyncio-future.html#asynci...



I wonder if Swift's AsyncAwait could be used in such a way.


Sure it's possible: we await the Task if it already exists in the dictionary. For example, we could write something like this inside an actor to make it thread-safe.

    private var tasksDictionary: [String: Task<Data, Error>] = [:]

    func getData(at urlString: String) async throws -> Data {
        if let currentTask = tasksDictionary[urlString] {
            return try await currentTask.value
        }
        let currentTask = Task {
            // data(from:) returns (Data, URLResponse); keep just the data
            try await URLSession.shared.data(from: URL(string: urlString)!).0
        }
        tasksDictionary[urlString] = currentTask
        let result = try await currentTask.value
        tasksDictionary[urlString] = nil
        return result
    }


Hmm, why set `tasksDictionary[urlString] = nil` at the bottom there? If the whole point is to cache the result, shouldn't you leave the value there for other requests to pick up?


Yup, nice catch: no need to reset it to nil if you wanna keep it in memory indefinitely.

I guess setting it to nil is useful when you don't wanna make the same request while one is already in flight, but you don't wanna cache the result indefinitely either. For example, if you get a 403 error and need to request a refresh token, you don't wanna make two simultaneous refresh requests, but you also don't wanna cache the response forever.


Great! So much nicer than the Combine (or Rx) version.


This is great. I'm actually running into this problem, but across distributed clients. Is there a distributed way to achieve something like this with, say, Redis?


Technically yes, the promise can first do an RPC to a distributed key/value store, and only then do the expensive computation (which typically is itself an RPC to some other system, perhaps a traditional database). Promise libraries should make this pretty simple to express (though always uglier than blocking calls).

I say expensive because even in a data center, an RPC is going to take on the order of 1ms. The round-trip from a client like a browser is much greater especially where the client (if this is what you meant by "client") has a poor internet connection.

Therefore this pattern is usually done inside an API server in the data center. An additional benefit of the server-side approach is that an API server can better control the cache keys (you could put the API server "build number" in the cache key so you can do rolling updates of your jobs), TTL, etc. Also, you don't want a rogue computer you don't control to talk directly to a shared cache, since a rogue computer could put dangerous values into your cache, remove elements from the cache it shouldn't, etc.
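To make the shape concrete, here is a rough sketch of that server-side flow, assuming a key/value client with plain async GET/SET (the `kv` object is a hypothetical stand-in, stubbed in memory here so the sketch is self-contained; a real implementation would also want TTL enforcement and something like Redis SET NX to close the race between processes):

```javascript
// Combining the local promise map with a shared store. `kv` is a
// hypothetical async key/value client (think a thin wrapper over Redis
// GET/SET); it is stubbed in-memory here so the sketch runs standalone.
const kv = (() => {
  const store = new Map();
  return {
    async get(key) { return store.has(key) ? store.get(key) : null; },
    async set(key, value, ttlMs) { store.set(key, value); }, // TTL elided
  };
})();

const localPromises = new Map(); // the per-process promise map from the OP

function getOrCompute(key, compute) {
  if (!localPromises.has(key)) {
    localPromises.set(key, (async () => {
      const cached = await kv.get(key); // ~1ms RPC against a real store
      if (cached !== null) return cached;
      const value = await compute(key); // the expensive computation
      await kv.set(key, value, 60_000); // share with other processes
      return value;
    })());
  }
  return localPromises.get(key);
}
```

The local promise map dedupes concurrent callers within one process; the shared store dedupes across processes (modulo the race window noted above).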


What do you mean by first class citizen, here? I’m pretty sure this works in all languages with promises, but I might be misunderstanding something.


The data structure here is a Map<Key, Promise<Result>>


Can you please explain a bit more how this avoids fetching the same value twice? Maybe with an example.


It sounds like: if you have a process that takes a long time, you check whether you have already initiated it (i.e. whether the key is in the map of promises) rather than firing off another promise for the same request.

If you get 10 requests to call a function lookUpSomethingThatTakesAWhile(id) that returns a promise, then rather than firing off 10 promises, you fire it off once, look up the id in your map for the other nine, and return that same promise to them.
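A minimal sketch of that in JavaScript (names are illustrative; the counter is only there to demonstrate the deduplication):

```javascript
// Cache the promise, not the result. `lookUpSomethingThatTakesAWhile`
// is a stand-in for the slow call.
const promiseCache = new Map();

let fetchCount = 0;
function lookUpSomethingThatTakesAWhile(id) {
  fetchCount++;
  return new Promise((resolve) => setTimeout(() => resolve(`result-${id}`), 50));
}

function cachedLookUp(id) {
  if (!promiseCache.has(id)) {
    promiseCache.set(id, lookUpSomethingThatTakesAWhile(id));
  }
  return promiseCache.get(id);
}

// Ten rapid-fire requests for the same id share one underlying fetch:
const results = await Promise.all(
  Array.from({ length: 10 }, () => cachedLookUp(42))
);
// fetchCount is 1, and all ten entries in `results` are "result-42".
```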


What is a promise in this context?


A promise is an object representing an asynchronous result. Some languages, such as Python, call them futures. A promise object will typically have methods for determining whether the result is available, getting the result if it is, and registering callback functions to be called with the result once it's available.
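For example, in JavaScript:

```javascript
// A promise starts pending and later settles with a value.
const promise = new Promise((resolve) => {
  setTimeout(() => resolve(42), 100); // result becomes available later
});

// Register a callback for when the result is ready:
promise.then((value) => console.log(value)); // logs 42 after ~100ms

// The same promise can be awaited any number of times, even long after
// it has resolved; later awaits complete immediately with the same value.
const first = await promise;  // 42
const again = await promise;  // 42
```

That last property is what makes the promise-map caching trick in the parent thread work.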


Thank you!


More generally, I believe this is a monad structure.


Promises are not monads, for one simple reason: they're not referentially transparent. The whole point of OP's example is to double down and take advantage of that.


Referential transparency is a property of expressions, not type classes. Referential transparency is nowhere in the definition of a monad because a monad is obviously not an expression but a type class.

It is definitely true that Promise is a data type that _could_ admit a monad instance. It has:

- a data type M a which is Promise a

- a function pure with the signature a => M a, which is x => Promise.resolve(x)

- a bind function with the signature M a => (a => M b) => M b, which is essentially `then`

But a monad requires three properties to hold true: Right identity, Left identity and associativity. Right identity holds for every function `f: a => M b`, but left identity and associativity do not hold.
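The left-identity failure is easy to demonstrate: `Promise.resolve` and `.then` auto-flatten nested promises (any thenable gets adopted), so `pure(a).then(f)` and `f(a)` disagree whenever `a` is itself a promise. A small sketch:

```javascript
// Left identity requires pure(a).then(f) to behave like f(a) for all a
// and f. Choose a to be a promise itself, and f a function that inspects
// its argument:
const a = Promise.resolve(1);
const f = (x) => Promise.resolve(x instanceof Promise);

const direct = await f(a);                        // true: f receives the promise
const viaBind = await Promise.resolve(a).then(f); // false: a is unwrapped to 1 first
// direct !== viaBind, so left identity fails.
```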


Maybe comonads fit better


promise map?

do you mean

    setmetatable({}, {__index = function(tab, key)
          local co = coroutine.running()
          uv.read(fd, function(err, result)
               tab[key] = result
               return coroutine.resume(co, result)
            end)
          return coroutine.yield()
      end })
This is an incomplete implementation, clearly, but I've never missed Promises with coroutines and libuv.



