Reversing Safeway's private APIs to automate coupon collection (jonlu.ca)
308 points by jonluca on Oct 14, 2019 | 145 comments

I enjoyed the article and would recommend it.

Aside: At the end of the article they have a "Bonus: Speeding things up" section where they automate adding 300~ coupons via 300 HTTP connections in 5 seconds (instead of 60~ seconds).

In my opinion if you're going to automate stuff like this, you should do so with the goal of minimizing disruption (and frankly, detection). They run the script automatically at midnight in the background via cron, so why was going slower problematic? 300 requests in a span of 5 seconds seems much more likely to trigger an IDS[0], get flagged as unusual in the logs, or similar than 300 over 60 seconds. Particularly at midnight.

I'd be trying to look as human as possible and not set off automated security systems. Heck, you could add a randomized delay (e.g. 1~2 seconds) between requests and it would still be completed inside of 10 minutes. Plus, then nobody can reasonably accuse you of trying to "DoS" them or of violating the CFAA.

[0] https://en.wikipedia.org/wiki/Intrusion_detection_system
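To put numbers on the randomized-delay idea, here is a minimal sketch. The function names are made up for illustration, and it only models the waiting time, not the requests themselves or their latency.

```javascript
// Build a list of randomized gaps (in ms), one per request.
// minMs/maxMs default to the 1~2 second range suggested above;
// rng is injectable so the result can be made deterministic.
function jitteredDelays(count, minMs = 1000, maxMs = 2000, rng = Math.random) {
  const delays = [];
  for (let i = 0; i < count; i++) {
    delays.push(Math.round(minMs + rng() * (maxMs - minMs)));
  }
  return delays;
}

// Total waiting time if every request sleeps for its gap first.
function totalMs(delays) {
  return delays.reduce((sum, d) => sum + d, 0);
}

// Worst case for 300 coupons: every gap is the full 2 seconds.
const worstCase = totalMs(jitteredDelays(300, 1000, 2000, () => 1));
console.log(worstCase / 60000); // 10 (minutes; an upper bound, since Math.random() never returns 1)
```

With real randomness the average gap is about 1.5 seconds, so 300 coupons would take roughly 7.5 minutes of waiting.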

Yeah, from my experience they'll run into problems real fast making 300 requests in 5 seconds (roughly 60req/s). It's so "noisy" and potentially disruptive.

I'm honestly surprised they weren't rate limited. I mean, your SPA would have to be really messed up to make more than a handful of requests a second (even then why aren't you using sockets?) - So it's super reasonable to say that if anyone is making over 100req/s then maybe they should take an hour timeout.
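For what that kind of rule might look like on the server side, here is a rough sketch of a fixed-window limiter with a timeout. The class name, the one-second window, and the in-memory map are all assumptions for illustration; nothing here reflects how Safeway actually does it.

```javascript
// Hypothetical per-client limiter: more than `limit` requests inside one
// 1-second window earns a ban lasting `banMs`. The current time is passed
// in by the caller so the behaviour is easy to test.
class RateLimiter {
  constructor(limit = 100, banMs = 60 * 60 * 1000) {
    this.limit = limit;
    this.banMs = banMs;
    this.clients = new Map(); // clientId -> { windowStart, count, bannedUntil }
  }

  // Returns true if the request is allowed, false if it should be rejected.
  allow(clientId, nowMs) {
    let c = this.clients.get(clientId);
    if (!c) {
      c = { windowStart: nowMs, count: 0, bannedUntil: 0 };
      this.clients.set(clientId, c);
    }
    if (nowMs < c.bannedUntil) return false; // still in timeout
    if (nowMs - c.windowStart >= 1000) {
      c.windowStart = nowMs; // start a fresh one-second window
      c.count = 0;
    }
    c.count += 1;
    if (c.count > this.limit) {
      c.bannedUntil = nowMs + this.banMs; // over the limit: hand out the timeout
      return false;
    }
    return true;
  }
}
```

A fixed window is the simplest variant; a sliding window or token bucket would smooth out bursts that straddle a window boundary.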

If you don't want to get caught doing this then you'll want a randomized delay like you said, and preferably a pool of IP addresses you proxy through. This is to prevent the automated stuff from catching you though - a human can still look at your account and wonder why you have every coupon activated 24/7.

I agree - I actually built the thread pool stuff for work while I was testing, since I had the code from a previous project.

If I was more worried about Safeway catching on I'd probably do something as you suggested (at the very least I'd add user agent headers and the other cookies expected from a real session, as right now it's trivial to detect my requests).

Sneaker bots do this exceedingly well - it's a constant cat and mouse game to make the requests look as human as possible, very interesting space right now.

I highly recommend not making more than, say, 5req/s - and preferably do it at like 2am.

These projects are fun IMO, but it's best to not hammer anyone's servers if it isn't necessary.

But at 2AM your traffic will create a more noticeable spike...

It doesn’t blend in as well...

In my experience, unless they have excellent monitoring and a team looking to ban you, they won't notice. Running off-peak is half courtesy; the other half is that traffic which doesn't affect their business operations (by slamming the servers when they're busy) is less likely to be noticed or cared about.

> you should do so with the goal of minimizing disruption (and frankly, detection)

Q: If you're worried about detection, why would one blog about it/post it on HN?

Isn't the fastest way to shut down a loophole to make it public?

Totally, and plus, it runs at midnight, and people aren’t going to be shopping so early in the morning... so span it over 6 hours. At least.

While I agree with spanning it over six+ hours, you might be surprised at the activity level in shopping centers at midnight. Once upon a time, I worked night shift at Walmart stocking shelves. On my days off, I'd still have to be awake all night to maintain a sleep schedule. To retain my sanity, I'd shop at other grocery stores so it didn't feel like I lived at Walmart, and I was pretty surprised by how many shoppers were out and about at 1 and 2 am at stores like Ridley's or Smith's.

I did something similar with Safeway's API back in the day. But rather than chew up their API with unsupported usage--CFAA case territory--I now just log into Safeway's website and issue a one-liner on the coupon page:

$(".grid-coupon-clip-button button").click();

I did that for a while too, but realized the mental overhead wasn't really worth it - occasionally I'd get the super generic coupons of "Spend $20, get $2 back" and those were worth it, but most of the time I was adding coupons I'd never use.

The mental effort of going to the site at all was what I was trying to circumvent - now I don't even need to think about it (the clearer abstraction is that going from "2 to 1" is a much less drastic jump than going from "1 to 0" - completely removing the overhead is what makes this worthwhile, IMO, not the actual speed of the actions)

I used to use a similar bookmarklet that I would run on the Safeway Coupons page. But it stopped working after they moved away from Angular and I was too lazy to update it.


This will click the "Load more" button once a second (up to 30 times) and also click the coupons, waiting 500ms per click.

// Click "Load more" once per second, up to 30 times.
for (let i = 0; i < 30; ++i) setTimeout(function () {
  var btn = document.querySelector("#coupon-grid_0 > div.coupon-grid-container > div.load-more-container > button");
  btn && btn.click();
}, 1e3 * i);
// Clip the coupons present right now, one every 500 ms.
var elems = document.querySelectorAll(".grid-coupon-clip-button button");
for (let i = 0; i < elems.length; ++i) setTimeout(function () { elems[i].click(); }, 500 * i);

This is genius and much more of a "hacker" way to do it... Sometimes simple is better.

This is what I was thinking the whole time reading and looking at the site images... why script all that, when you could simply click the buttons in the browser...

Rube Goldberg machines are not good in software projects...

jQuery? Shouldn't you be running a React front-end from a bookmarklet and figuring out a way to Dockerize it?

what does this do exactly?

> $(".grid-coupon-clip-button button").click();

> what does this do exactly?

The command allows you to click all buttons in the page at once:

• The dollar sign is an alias for jQuery [1];

• The text between double quotes is a CSS selector;

• Select every DOM element with a “grid-coupon-clip-button” and “button” CSS class;

• The thing at the end is a JavaScript function call which triggers an “onclick” event [2];

[1] https://jquery.com/

[2] https://api.jquery.com/click/

This is a great little write up, but the third bullet point should be:

- Select every button with a parent element that has the css class “grid-coupon-clip-button”

To be pedantic, it clicks on elements that are a button element that is the child (at any depth) of an element (of any type) that has at least the class of "grid-coupon-clip-button"
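The descendant rule is easy to check on a toy tree. This is only a sketch: the node shape (`tag`, `classes`, `children`) and the function name are invented for illustration and are not a real DOM API.

```javascript
// Toy DOM: each node is { tag, classes: [...], children: [...] }.
// Collect every <button> that has an ancestor (at any depth) carrying `cls`,
// mimicking the descendant combinator in ".grid-coupon-clip-button button".
function buttonsUnderClass(node, cls, insideMatch = false, out = []) {
  if (insideMatch && node.tag === "button") out.push(node);
  const nowInside = insideMatch || (node.classes || []).includes(cls);
  for (const child of node.children || []) {
    buttonsUnderClass(child, cls, nowInside, out);
  }
  return out;
}

const page = {
  tag: "div", classes: [], children: [
    // Class sits on the button itself, with no matching ancestor: no match.
    { tag: "button", classes: ["grid-coupon-clip-button"], children: [] },
    { tag: "div", classes: ["grid-coupon-clip-button"], children: [
      { tag: "span", classes: [], children: [
        // Button nested two levels below the classed element: match.
        { tag: "button", classes: ["clip"], children: [] },
      ] },
    ] },
    // A button with a class literally named "button": no match.
    { tag: "button", classes: ["button"], children: [] },
  ],
};

console.log(buttonsUnderClass(page, "grid-coupon-clip-button").length); // 1
```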

Uses javascript to find all the page elements with the coupon add button styling attributes and then uses javascript to click each add button.

*uses jquery

`$` is a built-in wrapper around `document.querySelector` in many popular browser consoles (along with `$$` for `document.querySelectorAll`).

Though of course if Safeway's website does have a global `$` from jquery then that would take precedence.
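The difference matters here: the console's `$` returns only the first match, so `.click()` on it would clip a single coupon, while jQuery's `$` returns the whole set. A jQuery-free version that works either way might look like this sketch (`clickAll` is a made-up helper name; the selector is the one from the one-liner above).

```javascript
// Click every element matching the selector under `root` (e.g. `document`).
// Returns how many elements were clicked.
function clickAll(root, selector = ".grid-coupon-clip-button button") {
  const matches = Array.from(root.querySelectorAll(selector));
  matches.forEach((el) => el.click());
  return matches.length;
}

// In a browser console: clickAll(document);
```

Taking `root` as a parameter keeps the helper independent of any global, so it behaves the same whether or not the page ships jQuery.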

This reminds me of a very similar case a decade ago I never figured out.

Anyone remember Coupon Guy from 4chan in the 2009/2010 timeframe? You could make your own "Buy 1 Get 1 Free" coupon or a variant (changing either the quantity, like "buy 1 get 10", or the kind of coupon, like "buy 1 get half off") for a TV, an Xbox 360, nearly anything, at Walmart, Best Buy, and other chains. The tools were passed around steganographically in the instruction images. Since coupons weren't properly accounted for until they hit a place in Texas, whole threads full of Anons spent weeks fabricating working coupons.

Until they stopped working, and of course rumors of "the FBI" apparently grabbing the guy.

I never did figure out what happened tech wise under the hood there.

>and of course rumors of "the FBI" apparently grabbing the guy.

They did get him: http://www.thesmokinggun.com/documents/internet/fbi-busts-4c...

This story about a similar arrest has a good explanation about how to fake coupons: https://www.wired.com/2015/05/inside-a-million-dollar-dark-w...

If I had to guess it would be something such as concatenating a bar code.

In the UK, we have items where, when the price is reduced, all they do is add the new price to the end of the bar code.

i.e. if a product has the barcode 3035555074225 and it's then reduced in price, the reduced price plus a checksum is appended.

So, if the new price of 3035555074225 is 50p, the code becomes 303555507422500502 (where we add 0050 for the price, and 2 is the checksum).

Next time you are in the supermarket just look at the barcode and you'll spot the pattern.
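Taking the example above at face value, the layout can be pulled apart like this. It is a sketch only: the field widths (13 + 4 + 1) are inferred from that single example, the function name is invented, and the checksum algorithm isn't given, so the check digit is just extracted, not verified.

```javascript
// Split a reduced-price code into the parts described above.
// Assumed layout: 13-digit base barcode + 4-digit price + 1 check digit.
function parseReducedBarcode(code) {
  if (!/^\d{18}$/.test(code)) {
    throw new Error("expected 18 digits: 13 base + 4 price + 1 check");
  }
  return {
    base: code.slice(0, 13),           // original product barcode
    pence: Number(code.slice(13, 17)), // new price in pence, zero-padded
    check: code.slice(17),             // trailing check digit (algorithm unknown)
  };
}

console.log(parseReducedBarcode("303555507422500502"));
```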

So for your example, maybe the guy had inside knowledge of how some EPOS system handled coupon barcodes and could easily construct matching ones.

>> The tools were passed around steganographically in the instruction images

Whoa, really? I remember the coupons flying around 4chan of course. Had no idea the tools were in the images.

It's easy to make a file that is both a valid image and a valid zip file, because zip readers start from the end-of-central-directory at EOF and most other things start from a header at the start of the file. Many such images included a winrar icon.

There might have also been some steganographic images, but I know for sure I saw several of the image+archive kind around then.
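The image+archive trick can be shown with a few bytes. In this sketch the "image" is only the 8-byte PNG signature and the "archive" is an empty ZIP (a lone end-of-central-directory record), so it demonstrates the layout rather than a usable file; `findEocd` is a made-up helper, and a real reader does more validation than this.

```javascript
// Find the ZIP end-of-central-directory record by scanning backward from EOF
// for its signature bytes "PK\x05\x06". Returns the offset, or -1 if absent.
function findEocd(bytes) {
  // The fixed part of the record is 22 bytes, so start 22 bytes before EOF.
  for (let i = bytes.length - 22; i >= 0; i--) {
    if (bytes[i] === 0x50 && bytes[i + 1] === 0x4b &&
        bytes[i + 2] === 0x05 && bytes[i + 3] === 0x06) {
      return i;
    }
  }
  return -1;
}

// Stand-in "image": just the PNG signature, not a decodable picture.
const pngSignature = Uint8Array.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
// An empty ZIP archive: the signature followed by 18 zero bytes.
const emptyZip = new Uint8Array(22);
emptyZip.set([0x50, 0x4b, 0x05, 0x06], 0);

// Concatenate: image bytes first, archive bytes last.
const polyglot = new Uint8Array(pngSignature.length + emptyZip.length);
polyglot.set(pngSignature, 0);
polyglot.set(emptyZip, pngSignature.length);

console.log(findEocd(polyglot)); // 8
```

An image viewer reads the header at offset 0 and stops caring once the picture ends, while the backward scan finds the archive record at the tail.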

This is the kind of stuff where I tell people that coding can be a real practical skill (that gives you an unfair advantage these days).

I did something similar (read-only) for Home Depot Truck Rentals. To check if the truck was available at my local store, each time, you had to put in your zip code, and click a couple of times. Once I found there was an API, I rebuilt the call in Postman and kept hitting that endpoint until a truck was available.

That way I could check really fast.

The twist: None of it mattered because their data itself didn't update accurately (I saw one in the parking lot and they had one available and never updated their site). :)

Exactly this.

I did the same thing to find available camping spots in Hawaii because they are super difficult to get. Wrote a script that would query their "API" every 5 minutes and alert me if a spot became available anywhere.
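The core of a watcher like that is just remembering what was available last time and alerting on the difference. A minimal sketch follows; `fetchAvailable` and `notify` are hypothetical callbacks, since the actual endpoint and alert channel aren't described here, and the fetch is kept synchronous for brevity.

```javascript
// Given the set of site IDs seen as available on the previous poll and the
// list from the current poll, return only the ones that just opened up.
function newlyAvailable(previous, current) {
  return current.filter((id) => !previous.has(id));
}

// One polling step. `state.seen` is a Set carried between calls.
function pollOnce(state, fetchAvailable, notify) {
  const current = fetchAvailable();
  const opened = newlyAvailable(state.seen, current);
  if (opened.length > 0) notify(opened); // alert only on new openings
  state.seen = new Set(current);
  return opened;
}

// A scheduler would then run it every 5 minutes, for example:
//   const state = { seen: new Set() };
//   setInterval(() => pollOnce(state, fetchAvailable, notify), 5 * 60 * 1000);
```

Diffing against the previous poll avoids re-alerting every 5 minutes for a spot that is still open.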

Have you considered whether doing this might be...wrong?

Presumably those campsites are permitted by some government agency (NPS, BLM, the state of Hawaii, etc.), and presumably that agency designed the permitting system with the assumption that people with limited time and attention would be vying for the permits by having to visit the site themselves to get one.

This encodes a particular definition of fairness: that those who register early, or are very motivated, or simply those with a lot of free time to refresh the site, will get permits.

I can also whip up a quick script to replace refreshing an unprotected HTTP API with a notification email. Does that make me more deserving of the camping spot?

> presumably that agency designed the permitting system with the assumption that people with limited time and attention would be vying for the permits by having to visit the site themselves to get one.

That's quite a large assumption that I don't think we can accept as fact.

> that those who register early, or are very motivated, or simply those with a lot of free time to refresh the site, will get permits.

Perhaps someone who writes a script would count as "very motivated"?

> Does that make me more deserving of the camping spot?

"Deserving" has nothing to do with this in the first place. The only way that works is if you define "those who register early", "motivated", "a lot of free time" as "deserving". I could maybe see an argument for the first two classes of people as being "deserving", but I don't think you can justify "I have a lot of free time" as a reason for deserving anything, really.

Doing first-come, first-served based on availability and the random possibility of a cancellation isn't ever going to be a "fair" system. This sort of system is put into place because it requires very little coordination and work on the part of the agency that maintains the reservations. Holding that up as some sort of standard for fairness, and suggesting that anyone who thinks outside the process is wrong... is a little much.

I'll bite.

I think you're right that the agency probably didn't sit down, write down a definition of fairness, then design a permitting system around it. They probably implemented the cheapest/easiest digital analog they could find to a traditional fax-in/walk-up first-come/first-served permitting system.

However, the intentionality of the implementers was not central to my argument.

The system was created by (probably) non-technical people under a certain set of assumptions: namely, that this digital first-come first-served system would function approximately like the old paper one, but with fewer dead trees and toil. The old one was rate-limited by having to call an office and probably talk to a human, and the assumption that if you call every 5 minutes that human will probably get annoyed with you and stop answering your calls. The new one is rate-limited by the assumption that most campers simply can't spend all day refreshing a website.

The traditional first-come first-served system isn't intrinsically "ethical" or "fair" for some classes of people (as you've astutely pointed out), at best it's a crude approximation of some version of fairness. While crude, it was established by a democratically-elected government tasked with allocating a shared resource. "People who can automate HTTP API calls" and the nearby "people who can hire people to automate HTTP API calls" (as has actually happened with some outdoor permits) were almost certainly not in the groups of people the government was seeking to advantage by choosing this system, and I think most engineers are smart enough to be able to intuit that.

So the root of my comment was this: GP is using special knowledge they have (and probably worked hard for) to extract more of a public good than the public really intended to have access to.

* Is that fair to everyone who doesn't have GP's knowledge? Do people like the GP deserve more camping spots than others? This is a public resource, not sneakers, so fairness is important.

* If everyone with programming knowledge acted the way GP acts, would that maximize the public good?

* If everyone with programming knowledge acted the way GP acts, would the system even function at all?

My answers are basically:

* No. Everyone who wants one deserves an equal chance at the spots. If there's more demand than spots, it's the government's job to decide. Random programmers on the internet intentionally subverting the government's intentions is wrong regardless of the fairness (or lack thereof) of the original system.

* It would nearly minimize the public good. Only programmers and people who can hire them would get popular camping spots. This is a real problem in popular outdoor areas around tech hubs. There's a reason NPS will only accept old-fashioned faxes provably not sent from a free online relay for the most popular Sierras routes in CA (e.g. the JMT and much of Yosemite), and it's not because they want to rock like it's the 80's or because the government is backwards. It's because assholes tried to spam the process with automation. I think some (like Half Dome) were migrated to a new lottery system on outdoor.gov this year.

* The system would completely collapse, and most smart programmers could predict that. The government would have to spend more money on servers just to serve bots pinging the registration system constantly, or it would crash. Even if they did that, the people who got slots would be a vanishingly small subset of the population (programmers) or people who can hire them. Likely, a grey market for "scalped" permits would arise.

If you want to see a world governed by a non-human understanding of fairness and one in which a highly technical cohort was entrusted to make these sorts of judgment calls, look no further than the domain industry in the early 2000s.

Specifically, consider Snapnames -- a company born out of the notion that snapping up a domain name coming up for renewal was something that was legitimately fairly awarded to the automated process that was fastest.

Unfortunately, I found this wholly unconvincing. I sincerely doubt it is a grand ideal, carefully thought-out, or well-funded. Expecting an idealized set of well-crafted axioms for accessing public spaces with demand outstripping supply at an accelerated pace will almost certainly lead to disappointment. "Special knowledge" can be as simple as knowing the right person to befriend to add your name to a list. The example of regressing to ancient non-internet fax instead of even straightforward USPS snail mail tells me they don't have the correct answer either.

You make an interesting point but I don't think that system was ever made to be ethically fair.

Without a script it relies on your free time and refreshing the page every 15 minutes. What if you have a full time job far away from a computer, are you less deserving than anyone with a lot of free time to refresh the page?

>This encodes a particular definition of fairness: that those who register early, or are very motivated, or simply those with a lot of free time to refresh the site, will get permits.

Why should it be considered wrong for this definition to shift with knowledge? The ability to do this work isn't gated to certain people except by knowledge, and the knowledge itself isn't gated. For the longest time this has been considered a fair way of getting an item in limited supply.

This is a cat-and-mouse game. The assumption that people are time-limited, and that a UI could be designed to ensure fairness on that assumption, is along the same lines as security through obscurity.

It would fit one definition of "are very motivated".

I can't afford to have a holiday in Hawaii, so a particular definition of fairness is already encoded.

The definition of "wrong" in this case is "what would be the social impact if everyone acted like me". In this case, I think the answer is "nothing", and that it passes the moral test (for me personally, at least).

Set something like this up for spots at Point Reyes and I'd be your first customer. Been trying for more than a year to snag a spot there.

I've been doing it for years now.


Yep, I do this for fun custom license plates.

I am interested in looking at your script and/or code.

I did something similar to check if tickets are available on Resident Advisor. https://github.com/jmcmullen/ra-resell-check/blob/master/src...

When working in a kitchen in California, I had to take the ServSafe Food Handler course online, which consisted of long, patronizing, unskippable videos explaining basic food safety concepts (which I already knew from previous experience) and little "check your knowledge" multiple-choice questions peppered throughout. The only thing that actually affected whether you got the certification was a quiz at the end.

I just poked at the main JS file for a couple of minutes until I found the statement I needed:

This turned what would have been a 3+-hour slog into 5 minutes, and I passed the quiz just fine.

That is so true. Even if your algorithm works it's a human factor that breaks it.

Never underestimate the error introduced by a lack of incentive for humans to comply with the API (or the API itself being human-hostile).

My favorite example is the Domino's "Your pizza is ready" signal. Since the data feeding the signal also feeds the store's performance analysis (i.e. they track how fast employees are getting pizzas ready), there's significant incentive for employees to lie to the algorithm and hit "It's ready" before it's physically ready, on the assumption that customers will take nonzero time to wander over and show up for pickup.

> there's significant incentive for employees to lie to the algorithm and hit "It's ready" before it's physically ready

It's the same with fast food in general. One of my first jobs was at a McDonald's where the standard practice was to hit the 'finished' button for any order that was taking 'too long', at which point the people doing food prep would have to keep track of a sequence of three or four orders in their head. As you can imagine, errors were constant, but customer satisfaction was less important to management than satisfying this imaginary metric.

I managed at Domino's back about a decade ago when they forced the transition to the updated in-store POS that was required for this.

To add a bit of color to what's happening:

- You don't actually hit an "It's ready" button. When you knock the pizza off of the makeline screen, it transitions from "so and so is making your food" to "it's cooking".

- The "It's cooking" phase is just a simple timer. It's supposed to be adjusted to the time of the conveyor oven (which can vary from ~3-9 minutes depending on the particular oven they use). Most stores never customize that setting in the system, and it stays at the default. And other stores may have ovens running at two different speeds.

- Average make time is one of the metrics monitored by corporate audits, so you do have gaming of the system by knocking pizzas off the make screen early. But you can also have a backup at the oven during rushes, where food sits at the end of the makeline ready to go in the oven when capacity opens. At the same time, some items such as wings have to go through the oven twice (depending on the oven configuration). In both of these cases, even without gaming the system there will be dissonance between actual cook time and the cook time shown in the pizza tracker.

- For those that did try to game the performance metrics, it came with an equal headache internally beyond just that "It's cooking" timer. The scheduling system used that average make time as an input into calculating labor needs. The more you knocked stuff off the makeline early, the more optimistic the scheduler became and the more you'd have to override the suggested schedule and the more "Labor Waste" you'd create by having more people working than the system thought you needed.

It was really quite fascinating to see that system transition before I started my career. I witnessed it at two stores under two different franchises - one did the bare minimum to comply and the other one embraced the new system and its capabilities. There were a lot of incredible capabilities and forecasting optimizations that were made possible by the update. But they all presumed accurate data in the system at all times. In many cases, managers were either disincentivized to do what was required to maintain that level of accuracy or transparency, or a component of the system would be designed in such an idealistic fashion that it didn't allow for the amount of pragmatic flexibility it needed. Both of these have been really valuable lessons I've taken forward with me.

Future of this: computer vision detector on top of frequently-updated satellite data?

There are companies[0] that provide realtime (I think) satellite imagery APIs. One use case I saw was stock traders tracking parking lots of large retailers to gain early insights into sales metrics.

[0]: http://www.digitalglobe.com/products/satellite-imagery

Retail was doing this back in the 1990s. "Parking lot counting"

Bonus points if you implement it with your own remote controlled drone.

I was going to say just mechanical turk parking lot pictures.

Another tip: (area code) 867-5309 works for Safeway club card discounts in most area codes while piping tracking of your purchases to essentially null.

Yes, Jenny’s number.

At risk of y'all taking "my" safeway gas rewards, this is almost always good for whatever the maximum discount off per gallon is at any safeway that also has a gas station.

It's amazing to me the degree to which people will go to save 3% or less on gas once a month. If you fill up 4 times a month you're saving about $4, and all it costs is a profile of everything you buy attached to your phone number.

Given that they're probably already tracking everything via my credit card number (whether I sign up for anything or not), I don't feel like much privacy is being given up.

Though as a rule I don't bother with most of these programs, mainly due to the inconvenience factor.

I've lost hope of maintaining my privacy through any actions I can take individually. A legislative solution is required.

Correct me if I'm wrong, but if you're using a credit card the bank is only getting a receipt that shows the sum total of the transaction and not an itemized receipt. The club cards are consuming and tracking the data of the entire transaction.

The club card acts as a unique ID to track you. The credit card swiped at the machine also has a unique ID that they can use to track you.

The club card works across multiple credit cards and cash. That’s the main difference. And you might possibly track 2 people who control the household (wife gets one, husband gets one)

Using Apple Pay would circumvent the CC tracking because it only transmits a one-use number.

Unfortunately (from some perspectives), this isn’t true. Each instance of a card in Apple Pay has a long term Device Account Number (DAN) that functions the same as a physical card’s number (though there is no link between the DAN and actual card number from the merchant’s perspective). The same DAN is used across every transaction. The only way to get a new one (which is pretty easy, relatively speaking) is to unload the card from Apple Pay then re-enroll.

Was not aware, thanks for the clarification.

I've been wondering about returns lately using Google Pay consistently. It seems with returns they often ask 'do you have the card used for the purchase?', but with one-time numbers I don't know if the card will be recognized. My last several purchases I don't recognize the last 4 at all, and wouldn't be able to exclaim, 'yes, that's my card'

While returns are made more complicated by Google and Apple Pay, the card numbers aren’t single-use (which I’m not sure if you mean to imply is the case with google pay today). On both iOS and Android you can view the last few digits of the Device/Virtual Account Number.

I don’t have an Android device to test these instructions on but they seem plausible and corroborated: https://supportcentre.natwestinternational.com/Searchable/91...

When I made a return for an Apple Pay purchase at Target, they saw that the transaction was marked “Tapped” as card type and took my word for which card was correct. Who knows how long that’ll last. Where I really expect to run into problems down the line is when I try to take advantage of the insurance benefits offered by cards and can’t produce an invoice/receipt with the expected card number on it.

The receipt has the last four digits on the card. Localizing that specific card number to the store and you are going to get pretty complete profiles on people. There will obviously be some duplicates but some smart algorithms could match up the common purchases. If I always buy the same boxed tea and the same deli meat, patterns can arise. The guy who buys a 1/2lb of swiss and 1/2lb of roast beef every week is probably a different person than the one that buys 1lb of cheddar and 1lb of turkey each week. Associate my other purchases with my deli meat choices and they can track when I came in for some special dinner ingredients I've never bought and a box of my usual tea bags. Even purchases that look unremarkable at first glance, like bread, eggs, and milk, have trends (bread brand, egg size/brand, 1%/2%/whole/skim/brand of milk). Having the last four digits makes it a lot easier to track customers but even without that, there are probably unique customer trends you can pull out of the seemingly anonymous data. At minimum the store could print out some coupons that it thinks you would be interested based on your latest purchase and what other people buy with the things you buy. A mom buying capri-suns and lunch snacks would probably be interested in similar kid lunch items.

The store is able to build customer profiles using credit card numbers. The bank doesn't need to be involved at all.

Yes, it won't be perfect (especially if one uses multiple cards), but many stores are getting extremely sophisticated with their techniques for profiling customers.

The grocery store has access to your credit card number when you swipe it. It can associate that card (or a hash of it) with your itemized receipt.
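As a rough sketch of what "associate a hash of the card with the itemized receipt" could mean in code: the receipt shape and function names below are invented, the card numbers in any usage would be test values, and plain SHA-256 is used only for illustration (card numbers are guessable, so an unkeyed hash is weak protection).

```javascript
const { createHash } = require("crypto");

// Derive a stable pseudonymous key from a card number.
// Illustration only: a real system would want a keyed hash or tokenization.
function cardKey(cardNumber) {
  return createHash("sha256").update(cardNumber).digest("hex");
}

// Group itemized receipts by card key to build per-card purchase histories.
// Each receipt is assumed to look like { cardNumber: "...", items: ["..."] }.
function buildProfiles(receipts) {
  const profiles = new Map(); // cardKey -> array of item names
  for (const { cardNumber, items } of receipts) {
    const key = cardKey(cardNumber);
    if (!profiles.has(key)) profiles.set(key, []);
    profiles.get(key).push(...items);
  }
  return profiles;
}
```

The bank never enters into it: everything needed is already present at the register.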

The discount is markedly more substantial than 3%. If it's a Safeway brand gas station the max discount is a dollar off per gallon. The safeway gas station I used to do this at currently has regular for $2.69 a gallon.

The max discount I've ever seen is $0.20 for Chevron, but tends to be $0.10. Chevron is much less rare than a Safeway-branded gas station.

Currently west coast gasoline is ~$4.00/g, so that's about a 2.5-5.0% discount, which is in line with what GP said.

For that $1 off/gallon gasoline, you need to have spent $1000 on groceries. So if your fillup is 10 gallons and gas costs $3/gallon, you will have spent $1000 to save $10. You aren't likely to spend $1000 between each fillup.
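The numbers quoted in these comments work out as follows. This is a small sketch with made-up function names; the inputs are exactly the figures given above.

```javascript
// Fraction of grocery spend that comes back as fuel savings.
function effectiveReturn(grocerySpend, discountPerGallon, gallons) {
  return (discountPerGallon * gallons) / grocerySpend;
}

// Discount expressed as a fraction of the pump price.
function discountShare(discountPerGallon, pricePerGallon) {
  return discountPerGallon / pricePerGallon;
}

console.log(effectiveReturn(1000, 1, 10)); // 0.01, i.e. $10 back on $1000 of groceries
console.log(discountShare(0.10, 4.00));    // 0.025
console.log(discountShare(0.20, 4.00));    // 0.05
```

So the $1/gallon tier is about a 1% return on the groceries needed to earn it, and the usual $0.10 to $0.20 Chevron discount is 2.5% to 5% of a $4.00 pump price.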

I'm not, but that's the glory of using (XXX)867-5309. Tons of people know to use that number when buying groceries, far fewer when buying gas.

But if you are using someone else’s phone number then maybe, especially if it’s a widely used generic number like in the grandparent comment.

I got 30 cents off yesterday at a Pacifica gas station. So far this purchase data hasn't hurt me in any fashion. They're welcome to it.

Some people don’t, in fact, live in the Bay or make telephone number salaries hacking React all day.

Whenever I give that number to the cashier, I half expect them to say "hey, wait...."

I thought that number was drilled into everyone's brain, but I guess you have to be a certain age.

According to my last receipt, my YTD "savings" has been $959.68

So I am not the only Jenny out there!

I've never heard of it. I think the most famous phone number from a song currently is 281-330-8004

Ha, never thought to try that. I've been using my dorm room phone number from freshman year of college for 20 years now.

At the self checkout machines at Giant there is a "Forgot my card" option. It gives you all the discounts without entering anything.

According to google it is a number from an early 80's pop song that at the time "caused a fad of people dialing 867-5309 and asking for "Jenny""


Similar trick works at AMC movie theaters with their rewards program. You can use it to get discount tickets on Tuesdays and popcorn discounts.

Similarly for all these other rewards programs you see at restaurants nowadays. I will never understand the idea of using a phone number as authentication without any additional PIN or text message or anything. If I have an acquaintance that I know goes to a lot of movies, and I either know or can find their phone number, I can drain their rewards account. Or you can drain their Safeway rewards account, etc. I wonder how much longer the situation will last?

I think they recently cleared this. I tried 10+ area code variants just this weekend and none worked.

Works at my local Safeway, as of yesterday. There was a period of time over a year ago during which Jenny's number didn't work at a different Safeway but it's been reliable at this one, so far.

I use this one when I'm traveling. It almost always works.

510 recently stopped working :/

Ok now I know to type that in any time getting gas at safeway.

Too bad? I don’t have a car

Just get a Jerry can

Is there a private API for a particular store's prices?

It seems like coupled with couponing, you can build a decent price tracker that can tell you if you're actually getting a good deal. (like https://camelcamelcamel.com/ for amazon, https://steamdb.info/sales/ for steam games)

Otherwise I've noticed that many (though not all) coupons are for items whose base price was recently increased, which makes the coupon look like a better deal than it actually is.
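A rough sketch of the check such a tracker could run, assuming you already have a price history from somewhere (no price API is confirmed in this thread). The function names and the sample prices are made up for illustration, and prices are kept in integer cents to avoid float rounding:

```python
from statistics import median

def effective_price(shelf_cents: int, coupon_cents: int) -> int:
    """Price actually paid after the coupon, never below zero."""
    return max(shelf_cents - coupon_cents, 0)

def is_real_deal(history_cents: list[int], shelf_cents: int, coupon_cents: int) -> bool:
    """True if the post-coupon price beats the typical (median) past price."""
    if not history_cents:
        raise ValueError("need at least one past price to compare against")
    return effective_price(shelf_cents, coupon_cents) < median(history_cents)

# Hypothetical data: shelf price rose from $3.99 to $4.99 right before a $1 coupon appeared.
past_prices = [399, 399, 389, 399]
print(is_real_deal(past_prices, shelf_cents=499, coupon_cents=100))  # coupon only cancels the hike
print(is_real_deal(past_prices, shelf_cents=399, coupon_cents=100))  # genuine discount
```

Comparing against the median rather than the latest price is one possible choice; it keeps a single recent price bump from hiding the fact that the coupon just restores the old price.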

When I see access to private APIs like this, I wonder how judges would interpret these actions as they relate to the CFAA. By accessing "private" APIs like this, are you knowingly accessing a computer without authorization? Are you exceeding your authorized access?

The fact that Aaron's Law never went through has disturbed me...

In the EU you are allowed to automate everything you can legally do manually, even if they have a large sign on their page saying that automation is not allowed. The only thing that limits you from selling is copyright, but by-products of ordinary business are always legal.

any reliable legal source for that ? thanks

This is the best summary from the “horse racing case”: https://eur-lex.europa.eu/legal-content/GA/TXT/?uri=CELEX:62...

There was also a review of the database directive last year that summarizes all the precedents on this: http://data.consilium.europa.eu/doc/document/ST-8466-2018-IN...

I know you are just playing devil's advocate but why "without authorization"? You are using your account and your cookies so it should be fair game?

The US Gov't has tried to apply CFAA to someone they claimed was breaking a website's TOS[0]. The EFF filed an amicus brief in support of the defendant about this[1]. Who knows what the TOS is for grabbing coupons?

Sure, the judge who heard this case said this would be an "overly broad" interpretation of the law at the time, but the question has come up in subsequent criminal cases as well. I'd feel better if that was actually codified and not left up for interpretation by other judges or courts.

0 - https://en.wikipedia.org/wiki/United_States_v._Drew#Indictme...

1 - https://www.eff.org/cases/united-states-v-drew

+1 you're only doing things you're already authorized to do, just with a different client.

Many sites view that as "unauthorized" because they like that top-down control

It's not any individual site's view that matters to me. It's the view of the courts that's the most concerning/pressing.

Probably not...

There's an ongoing case Linkedin v HiQ where Linkedin said HiQ was scraping publicly available linkedin profiles but there was a robots.txt that told them not to. HiQ kept doing it until Linkedin threatened them under the CFAA. HiQ just won a preliminary injunction to get to continue where the court said it was unlikely that they were violating the CFAA but they might change their minds as the case progresses:


The Safeway API in question requires a user to be logged-in, authorized, and have explicitly agreed to a TOS (clickwrap). The HiQ v. LinkedIn case was centered around publicly accessible content which did not require the user to be authenticated nor have explicitly agreed to a TOS (browsewrap is not enforceable).

I'd argue the above case does not apply here.

ahhh, yeah that is certainly quite distinct. Thanks for pointing that out.

HiQ v Linkedin kind of hinges on the implicit authorization given by making an API public. Here Safeway gave explicit authorization so I suppose it might come down to the TOS. Then again all Safeway has to do is revoke your account and you're gone, so I don't really know why anyone would be too worried about them coming after you with the CFAA. Can't be worth the effort.

On a slightly unrelated note, I was trying to figure out why this guy is using a .ca domain when I realized, his name is Jon Luca. At first I thought his name was Jon Lu (and therefore asian). Good use of domain name.

I'm in Canada and shop at Sobeys, which owns the Safeway brand up here, so I thought there might be some relevance to me thanks to the .ca domain name. Unfortunately, there was no relevance but it was still an interesting read.

On a side note, Safeway and Sobeys in Canada don't have a loyalty program, instead they piggy back off of Air Miles. All of the special offers available just amount to bonus Air Miles, so they're not actually that worthwhile (IMO).

What a blast from the past. I am actually Canadian but have been living in the Bay Area for over 10 years so I've already forgotten that you can collect Air Miles through Safeway (which was why I clicked on this article in the first place thinking of the .ca)

I don't think Air Miles are completely useless. I do remember redeeming air miles at least once in the past for a one-way trip somewhere...

They're not completely useless, but I do find that I don't seem to get much benefit from them.

For example, I needed to rent two cars on a trip last year. The first car I was able to rent through Air Miles, but I had to pay the taxes and any fees beyond the base price myself so it didn't feel like much of a deal. The second car required more Air Miles than I had, so I had to pay for the whole thing myself, I couldn't do part on Air Miles and part cash.

I ended up renting the second car through Costco and felt like I got a better deal overall than I did with the first car.

Maybe rental cars aren't the best way to use them.

Not really a good use since the .ca TLD is supposed to be reserved for Canadian entities (individuals and businesses) and the site would seem to break policy.

If it's indeed the case that he has no residency, he's one report to CIRA away from having the domain cancelled.

I saw someone use this for their email address a few years back. Say their name was Jane smith (anonymised for obvious reasons), their email address was j@nesmi.th (actually using a .ch domain).

I own locutusofb.org, a similar idea for any Trek fans.

Haha, I did something similar a while ago - I just invoked a click() on every "add coupon" button on their website, I didn't want to reverse engineer their APIs - way easier but also it requires a desktop browser with the page open.

I'm not sure if using this API is any different but a few months ago Safeway made a change that only lets you have 20(?) coupons at any time. After you add more it kicks off your oldest one. Which sounds like plenty but if you're adding every single coupon you're gonna get a ton that you have no desire for ($1 off diapers when you don't have a kid etc).

It worked for a while, though!
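To illustrate why clipping everything backfires under that kind of cap, here is a small simulation. It assumes the limit really is 20 and that eviction is strictly oldest-first, both of which are guesses from the description above rather than confirmed behavior; the coupon ids are made up:

```python
from collections import deque

COUPON_CAP = 20  # assumption: the exact limit is uncertain

def clip_all(coupons, cap=COUPON_CAP):
    """Clip coupons in order; a bounded deque drops the oldest once it is full."""
    clipped = deque(maxlen=cap)
    for coupon in coupons:
        clipped.append(coupon)
    return list(clipped)

available = [f"coupon-{i}" for i in range(300)]  # hypothetical coupon ids
kept = clip_all(available)
print(len(kept), kept[0], kept[-1])
```

Under those assumptions only the last 20 coupons clipped survive, so a clip-everything script would need to order or filter its requests to keep the ones you care about.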

You can use Selenium to do the same thing in a headless browser, no need for a desktop browser with a page open.


In which case you probably want to use their headless Docker containers. https://github.com/SeleniumHQ/docker-selenium

You can also use Puppeteer for this purpose (using headless Chrome) https://github.com/GoogleChrome/puppeteer

That's probably what he did though. I like Puppeteer more than Selenium FWIW, especially because Node.js allows you to prototype a lot quicker than other languages imo.

Curious, why implement your own threadpool? Python has two built in already: concurrent.futures.ThreadPoolExecutor and multiprocessing.pool.ThreadPool.
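For reference, a minimal sketch of what the standard-library version might look like. This is not the article's code: clip_coupon is a hypothetical stand-in for whatever HTTP request clips one coupon, and the worker count is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor

def clip_coupon(coupon_id: str) -> tuple[str, bool]:
    """Hypothetical stand-in for the HTTP request that clips one coupon."""
    return coupon_id, True

def clip_many(coupon_ids: list[str], workers: int = 8) -> dict[str, bool]:
    # executor.map spreads the calls across the pool and yields results in input order
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return dict(executor.map(clip_coupon, coupon_ids))

results = clip_many([f"coupon-{i}" for i in range(300)])
print(sum(results.values()), "coupons clipped")
```

The with block waits for all submitted work to finish before returning, which covers most of what a hand-rolled pool would have to manage itself.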

For a while I actually tried doing something like this -- the coupon page included jQuery, so a quick one-liner self-XSS clicking the right elements would iterate through all the pagination, and then you could call the click handler on all the Add buttons (which share a CSS class for styling).

The problem came at checkout -- whenever I typed in my phone number (to apply the coupons from the loyalty account), the point-of-sale system would hang for tens of seconds while loading and then trying to apply all those coupons.

This is cool but I'm not sure the purpose. I use the Safeway app every time I go to Safeway. I scan all the offers and just add the ones I want, which builds a shopping list for me.

This gives me a list of things to buy but more importantly I know what's on sale. If I just added all the coupons I'd still have to scan the list of things to find what I want to buy, but then I'd have to track them elsewhere, because the built in list would be useless.

Safeway actually made a decent app that helps me shop faster. This feels like it would ruin that.

The author of the article points out that he's not really a "couponer"; he wouldn't bother with the coupons if he had to go through and look through them all. He buys what he wants to buy (decided ahead of time, not through a process of looking at the app), and his daily batch job having added all the coupons and possibly getting him a discount on the stuff he was already going to buy is just a bonus.

You have a different objective than the author of this article, who says

> I’m not someone that “needs” to get the best deal - if what I’m buying has a coupon then great, but I’m not going to change my purchases based on coupons

The author will buy what the author will buy, and he may happen to get a discount if any coupons apply.

This is more for someone who buys what they always buy (or whatever they feel like in each instance), but doesn't want to miss out on the deal just because they didn't click the coupon.

One of the other benefits he mentions is that sometimes Safeway offers coupons like "Buy $20 worth of groceries, Get $2". Unless you look for and add that coupon, you'd miss out on it; with the script, it's always activated.

I also use the Safeway app, which I think is one of the better apps out there, but sometimes they add in a coupon for something free. Unless I'm looking for that, I might miss it. I'd like to modify the script to look for and notify me when it finds them.
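Something along these lines might work as a starting point. The coupon dictionaries and their "description" field are assumptions for illustration (the real response format isn't shown in this thread), and print stands in for whatever notification you'd actually want:

```python
import re

# Match descriptions that start with the word "free", so "gluten-free bread" is not a hit.
FREE_ITEM = re.compile(r"^\s*free\b", re.IGNORECASE)

def find_free_offers(coupons):
    """Return the coupons whose description begins with 'free'."""
    return [c for c in coupons if FREE_ITEM.search(c["description"])]

# Hypothetical coupon data
coupons = [
    {"id": "a1", "description": "FREE dozen eggs"},
    {"id": "b2", "description": "$1.00 off gluten-free bread"},
    {"id": "c3", "description": "Free yogurt cup, limit 1"},
]

for offer in find_free_offers(coupons):
    print(f"Free item found: {offer['description']}")
```

Matching on the description text is crude; if the real data exposes a price or discount field, filtering on that would likely be more reliable.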

Coupons have two purposes, which seem contrary, but are essential to understand how coupons are a form of advertising:

1) Saves you money (versus the retail price of the item)

2) Makes you more likely to buy the product on the coupon.

Your use of coupons adheres to both value #1 (saves you money, because you know what's on sale) and value #2 (advertises products on sale, increasing your chances of buying them). This is the core advertising model of coupons and why they've been popular forever.

This automated use of coupons saves you money if you happen to purchase an item for which a coupon exists, without requiring you to do any extra work to save that money — but also without having the desired advertising outcome of making you more likely to purchase that item.

This compares well to adblocking. Internet advertising on websites:

1) Saves you money (versus subscribers-only paywalls)

2) Makes you more likely to buy the product on the advertisement

And in that analogy, ad-blockers are directly equivalent to coupon auto-adders: they allow you to save money without having the desired advertising outcome of making you more likely to purchase that item, in a fully-automated manner that doesn't require you to exert any effort doing so.

Coupons are a precursor form of advertising ("pay-per-clip" :) where you are paid money to view the advertisement, and are more likely to commit to buying the product when you 'clip' the coupon — it's a marketing psych thing. They also have perfect tracking, since retailers provide coupons to manufacturers along with purchase date and location.

Coupon autoclippers break that agreement, such that you're 'paid' for being influenced by the advertisement without ever having been influenced. The coupons are no longer valid for tracking the effectiveness of advertising A/B tests in different markets (your Safeway account's zip code is surely part of that data). They are no longer proof that you viewed an ad at all.

You're not wrong that this app would ruin how you shop quickly, but you're also shopping quickly using a list of products that were predetermined by Safeway and/or other marketing divisions to be of maximal interest to them for you to purchase. As long as you're okay with that, coupon clipping is an excellent approach. For others, autoclippers would minimize the price paid without changing their purchasing methods (which may be paper-based, brand-focused, or random-chaotic)

> ("pay-per-clip" :)

You have a well-written and accurate comment, but I'm overwhelmingly distracted by the combination of the adorable "pay-per-clip" and the historied debate about how to handle an emoticon smiley at the end of a parenthetical. (conclusion: you're doing it wrong and are a terrible person!....but, "pay-per-clip"...tee-hee)

Thank you, I do my best :).

> They also have perfect tracking

They have perfect tracking of completed purchases, but have no idea how effective their ad was for the people who didn't buy it. Did they miss the ad entirely because of poor placement? Did they see the ad but not value the deal? Did they clip the coupon but not make it to the store before expiration? Did they clip the coupon but find a better deal in the store? Did they clip the coupon, find the store was out of stock, and not feel like dealing with the store's rain-check process (if any)?

Arguably the tracking isn't perfect for completed purchases, even. They get purchase date and store location, but that's not all that much: online retailers get much more information when someone clicks on a banner ad and then completes the purchase. They can even get some information out of a failed sale, depending on how far the potential customer made it through the process before bailing.

This is a great analogy. I was a little worried in writing this that it might be "ruining" a good thing, and that publishing it would kill it, but I figured enough people over time have tried to take advantage of coupon-ing (and the fact that it's inherently people trying to get deals) that it was ok.

Interesting way to compare this auto script to adblocking.

I had written a similar program for US supermarket Publix, however I used it as an exercise to learn puppeteer[0] rather than using their private API directly.

Reading this inspired me to finally release it here: https://github.com/davecardwell/publix-coupon-clipper

[0] https://github.com/GoogleChrome/puppeteer

> This brought what used to be a painstaking process of clipping coupons from magazines to a slightly less painful process of having to click “Add” on all the coupons every week after logging onto their site.

This is an insane trend in supermarket usability. You want to offer low prices, but only to people who go through a practice round of online shopping before doing it in person?

Why are we doing this to ourselves?

They wouldn't make as much money if everyone clipped the coupon. However, most of the time, the coupons aren't really special. The loyalty club member price is already shown on the shelf anyway. Most "coupons" are more of an advertisement for the old folks who still do that. The incentive for the actually good coupons is to get you to come into the store and buy other stuff alongside the on-sale item. If I only have a few key items on my shopping list, I might be more inclined to buy more if the coupon app shows buy one get one half off for something. Or I might buy a different brand than I normally would because there is a coupon only on the app.

> They wouldn't make as much money if everyone clipped the coupon.

I get on a basic level it's a form of price discrimination, it just seems unfathomable that this bizarre skeuomorph, totally dependent on a series of random historical developments, would actually happen to line up with the way to extract the most return from your customers and/or give them the best experience.

But hey, I found the perpetual fake "50% off" sales at JC Penney's crazy making, and when they reversed that for more honest pricing they almost went out of business. I know I'm just shaking my fist at irrationality for no reason.

It still bothers me though.

This seems like prime territory for a Mac menu bar / Windows system tray app that runs this process once a day.

Why would it need a visible UI component? Just put it in a cron job

Ordinary people don't use cron jobs.

I suspect that's because of bad reviews. I give all crons 5 *'s.

This is not made for ordinary people I would say

Ordinary people would absolutely use an app that lives in the menu bar/system tray, has a Safeway login page and a "run on startup" option, and once a day does magic stuff in the background that adds all the available online coupons.

Right but now Safeway will update something that will cause this to break, and now people who use this super handy app will expect the maintainer to update it - even if it was just a one-off project that the original developer had no interest in maintaining.

So now people will complain and your inbox is flooded with "please fix this" because people feel entitled when all you really wanted to do was just try some cool thing.

Nice article... I'd have probably just used node+puppeteer for this.

Great article. Not sure why the author re-implements a thread pool when there are not one but two in the standard library (concurrent.futures and multiprocessing.pool.ThreadPool).

It needs some tweaking to work now, but here is what I was using (in Python):


Running it via cron every day, at a time before I would make a potential grocery run, meant the coupons were already waiting on my account when I went to the store.

Anyone know if the dev lurks here? I'm wondering if there's a way to fork his code or add Kroger functionality to it. If I'm not mistaken, Kroger is a Safeway acquisition that is a popular grocer in the south/southwest of the USA. This would be very helpful for those of us without Safeways.

Kroger is the second largest supermarket in the US (after Walmart) and has many subsidiary grocery chains that it has acquired in its 100+ years of existence. Pretty much, Kroger bought up other "family" grocery chains and then kept those chains' local names, such as Dillons, Frys, King Soopers, City Market, Gerbes, Kwik Mart, etc.

Safeway, by contrast, is part of Albertsons, and Albertsons with its subsidiaries is the 3rd largest supermarket chain.

Kroger is a separate company unrelated to Safeway.

Next time you are at the store look at the circular. There is a barcode you can scan with their app that adds everything to your app that week.

That doesn’t fix the information overload which the OP’s solution solves.

digression - if anyone has evidence that Safeway Inc gave back to the postgres project, it would be great to hear about it (in public)!

btw- how many people here have contributed directly to PostgreSQL in any way.. "food for thought"

Can I just subscribe to the coupons? Please. I give you my email, and using this - email the coupons to me in a pdf that I can print out? (Or is this a stupid request)?
