Ask HN: What are these strange random strings spamming my blog?
174 points by the-mitr on Feb 20, 2023 | 98 comments
I have a WordPress blog, and recently I discovered several spam comments arriving daily whose content is seemingly random strings. The comments don't contain any other (human-readable) links, which are usually present in spam comments. Can someone explain the point of such random strings? Do they mean or represent anything?

Some examples of the spam comments:

https://imgur.com/3FVR7Yn




I recognize that IP address! Your site is being used as a pawn in a rather sneaky attack, and I was hit by it recently. Those email addresses belong to the victims—like me.

Let’s say I’m a hacker. I’ve gotten into Alice’s Amazon account and want to place a bunch of orders using her payment info. However, I don’t want her to notice until after I’ve received the ill-gotten goods.

To ensure she doesn’t notice the email notifications from Amazon, I want to “bury” those emails with spam. I can do this by entering her email into tons of online forms. Most will only send a single email—for example, your blog will probably only ask Alice to confirm her email—but once is enough.

This happened to me a couple weeks ago with Apple. Someone used the default billing and shipping info on my Apple account to place an order for an iPhone 14 Pro Max. I woke up to hundreds of emails from various blogs and other sites asking me to confirm my email. Being a security researcher, I knew that meant someone didn’t want me to see something else that had landed in my inbox.

I went through each one by hand. One included the IP address that submitted the form, which was interesting but not particularly useful. Eventually I found the receipt from Apple.

It’s not clear how the attackers intended to intercept the package; presumably, they would’ve tried to convince the courier to redirect it or retrieved the package from my doorstep, but Apple intervened and was able to stop the delivery before either of those happened.

It’s also not clear how the attacker got my billing and shipping info. Apple was able to confirm that my account wasn’t compromised and that nobody had contacted support pretending to be me. That billing info wasn’t used with many other companies.

Edit: You can see what this looks like from the victim’s side here: https://imgur.com/a/DHEJwKh Note that the usernames have the same sort of gibberish.


A good reason to revive an old habit: giving every web site a different email address. Easy enough when you own the domain; just monitor the catchall.

I used to do it to be able to shut down inboxes that spammers got a hold of. I kinda stopped doing that when spam filters got good enough. But with this, I'd know who was the real target because the email address would tie it to the specific site.


My host banned catchall addresses. Which host do you use that allows this? (Great advice btw, this has saved me too.)


fastmail.com allows it. I'm apparently paying "legacy account" rates, which probably highly tempers my recommendation. I pay something like $5 or $12 a year for service, with a 500 MB mailbox. I don't use calendar or anything else, so I have no idea if those are restricted.


I'm on the $95/2 year plan, with about 30 GB of storage, and have several domains attached, all set up for catchalls.

Trying to explain to a business why their name is in the email address you just filled out on a form is fun sometimes. The only complete rejection I've gotten for it, though, is from the guy who runs the main groups.io group for one of my amateur radio transceivers, who can't wrap his head around my address not being some attempt at fraud.


On rare occasion I’ve encountered companies (let’s say foo.com) that prohibit the string “foo.com” from being in the email address I sign up with — some slightly broken defense against preclaim or impersonation attacks I guess?


> Trying to explain to a business why their name is in the email address you just filled out on a form is fun

Hah yep, been using this strategy for years and a lot of times they ask if I'm an employee.

Once Verizon customer support asked me "you got a problem with Verizon?" My email address was VerizonSucks@mydomain.com.


verizon@enslaves.us gets similar questions ;)

I do have another domain that's less confrontational, and I also sometimes use a normal sounding address in person.


I've just started using random triples/quartets of letters at my domain. There's pretty slim advantage in literally getting email from bestbuy@mydomain.net.


Panix.com allows unlimited email addresses in the format: whatever@myname.users.panix.com

where "whatever" is anything you like, and "myname" is your email name. (You also have the address "myname@panix.com".)

I've been challenged on this only once. And I've been able to tell a company that its email list has been compromised.

Disclaimer: Just a happy customer.


Proton mail allows this.

A free alternative would be using the abc+websitename@domain.com trick.

Gmail allows this, but unfortunately some websites won't accept the + sign as a valid character in an email field.
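
As a rough illustration of the per-site address trick above, here is a minimal Python sketch. The function names `alias_for` and `site_from_alias` are hypothetical, and the sketch assumes the mail provider delivers `local+tag@domain` to `local@domain`; it only shows the string handling, not any provider's API.

```python
from typing import Optional


def alias_for(mailbox: str, site: str) -> str:
    """Build a plus-addressed alias such as abc+example.com@domain.com."""
    local, _, domain = mailbox.partition("@")
    if not local or not domain:
        raise ValueError(f"not an email address: {mailbox!r}")
    # Keep only characters that are unlikely to be rejected in a tag.
    tag = "".join(c for c in site.lower() if c.isalnum() or c in ".-")
    return f"{local}+{tag}@{domain}"


def site_from_alias(address: str) -> Optional[str]:
    """Return the tag of a plus-addressed alias, or None if there is no tag."""
    local, _, _ = address.partition("@")
    _, plus, tag = local.partition("+")
    return tag if plus else None
```

When a burst of sign-up confirmations arrives, running `site_from_alias` over the recipient addresses would show which site's address is being targeted, provided each site was given its own alias.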


privateemail by namecheap allows aliases at different tiers. The Ultimate at $72/yr allows unlimited aliases: https://www.namecheap.com/hosting/email/?utm_nooverride=1&ut...

I use a lower tier for my domains and I've been happy with it for a year or so.


I pay €1/month for Tutanota which lets me have five aliases, however using DNS redirection with my domain I can have a unique receive address per website.


Zoho is $12/year for 5 GB and allows catchall.


This combined with a password manager is what I do. This is the way!


Apple has private relay too.


> It’s also not clear how the attacker got my billing and shipping info.

This sounds like a real mystery. I wonder what happened here and how Apple is so confident that your account wasn't compromised, because it really sounds like it was. Either that or some other account with those same credentials was.

It's also a bit of a mystery to me why the hackers would use your shipping info to begin with here. Why not some address where retrieving things would be easier for them? They must have been really confident they could intercept your package.


I had very similar questions for Apple, but they refused to answer most of them.

I’m confident my Apple account wasn’t compromised directly:

- Apple confirmed there were no login attempts around the relevant time period

- I have 2FA enabled, but I didn’t receive any approval alerts and couldn’t have leaked the number even if I had—I was asleep at the time. This doesn’t rule out malware, though.

- All authorized devices on my account were in my home. My home was locked.

- I use a randomly-generated password. That doesn’t rule out phishing, though.

Session stealing is still a possibility, but Apple seemed reasonably confident that hadn’t happened.

Initially, I was promised a call back from Apple’s fraud team. However, after closing their investigation, they refused to talk to me directly. They wouldn’t even tell me the IP address used to check out.

I do know that the order was placed on the website, rather than in a store.

As to why they would use my shipping info: I still don’t know. If they entered the info manually, it’s awfully convenient that it exactly matched the info on my Apple account, considering most companies don’t get that info.

I do know that T-Mobile and Citizens Bank both have that info, since I have T-Mobile as a carrier and am enrolled in the iPhone Upgrade Program. My T-Mobile billing info didn’t match exactly, but it was very similar. Citizens Bank matched exactly. I’ve now given both Citizens Bank and T-Mobile unique email addresses that differ from my Apple ID.


This happened to me (this exact scam, with the email bomb, shipping and email matching my real addresses, etc.) a few years ago with a major electronics brand. My account was not compromised because I didn't have an account with them. But probably some other company got compromised, and they got my matching info from one of those dumps.


If I had to guess, the answer to both is some stolen session issue: either a personal device, or a device that the user once logged into, had saved login information that the attacker gained access to. However, to reduce the chance of needing to verify any information or log in again, they left the default shipping address (as changing it would have A] left an address trail, and B] been suspicious).

But again, that's my guess - I'll leave solid answers to the actual security researcher in the comment chain X)


That was my guess, too. However, Apple seemed to think there was no relevant session activity on my account at the time.

They stopped short of confirming that the payment and shipping info was entered manually. When I asked directly, they initially said they’d put me in touch with the fraud team, but the fraud team refused to talk directly to me and would only talk through another support rep. They refused to answer any questions about how the order was placed.


Damn!! That solves a riddle that has been bugging me the last two weeks or so!

I'm in the same boat as OP (not a blog, but a web app, https://cubetrek.com). I've been receiving the same type of account registrations (gibberish username, valid email address) for the past two weeks. My web app sends out an account verification email, so it makes total sense. I've recently integrated Cloudflare Turnstile as a captcha alternative, but the automated sign-ups have not stopped (they just get ignored server side). I'll probably have to change the endpoint address as well.

Since I'm receiving a dozen apparently compromised email addresses every day, is there anything I can do? Informing the user via email is obviously not very practical.

Not to give the bad actor any ideas, but why don't they use some fake names instead of gibberish? It would be less obvious.


Reading your comment, I think I got hit by the same attack (as a victim) recently, but I cannot find any original email they wanted to hide. It's been a few weeks and now I have thousands of spam emails (the inbox is basically unusable).

Any tips on how I should find the original email they want to bury?

Also, mine wasn't random characters but instead Markov Chain like gibberish to hundreds of random sites. The signups and whatnot stopped, but now it's all the regular newsletters and other spam I get.


They start the email bomb just prior to performing the rest of the attack. It’s possible the rest of the attack failed.

I went through every email by hand, as I wasn’t willing to take any risks.


This exact thing happened to me with Nike. It's apparently called email bombing. I didn't even realize I had a Nike account, but I guess I bought some shoes or something with it years ago. In my case they tried to buy a bunch of gift cards. Fortunately my bank flagged it as potential fraud and I didn't end up losing any money.


I think you nailed it. It doesn't match vuln scanning/probing patterns, IMO, but it makes sense as what you describe and looks pretty much identical to what you received, with the same sort of base64 gibberish.


Have not heard of this, but I also don't do much work in the security space. Good read and thank you for the share!


thanks for the explanation.

> I woke up to hundreds of emails from various blogs and other sites asking me to confirm my email.

But if I don't approve the comments, then the emails won't go out, right? Or will the mails still go out even if I don't approve them, just to confirm from your side?


I'm running a small online shop and I noticed that in the past days I had a lot of "recipient not found, email cannot be delivered" messages in my inbox. I realized that these are bounces of emails my shop sent.

At first I panicked and thought the store was compromised and sending spam, but after some investigation I found that a lot of Russian bots had registered spam user accounts with mostly legit emails, which then received all the spam. The only "customizable" part these emails contained was the "From" field, so they were all of the form "PAYOUT_TO_YOUR_NAME_$3OOOO_HER example.com <mail@example.com>". After adding a captcha this went away, but it sounds like it was also part of a similar attack.


awful thing about these kinds of attacks is that your inbox never really recovers. these unsuspecting services/blogs will keep sending you emails forever thinking you actually wanted to sign up.


If they are legitimate services/blogs, they'll probably honor the unsubscribe link.


A good chunk of them were from small blogs or mom-and-pop shops. They don't always have one-click unsubscribe, especially the ones in other languages (I think requiring an unsubscribe link is an American law?).


This is really an interesting insight, good find! I guess there are no other explanations except this.


Sounds like this is exactly what is happening to the OP.


Parent comment is from a target victim. OP is actually one of the blogs being unwittingly used to spam those victims.


Everyone's coming up with Hollywood-style explanations (it's encryption! it's AI!) but as a former pentester, I think one of the more likely explanations is that it's just looking for vulnerabilities in an automated way. The first thing you do when you get an account somewhere as a pentester is to try to stuff as many strings as possible into weird places just to see what happens.

I don't think automated techniques are very effective (or weren't in 2016) but it seems more likely to be vuln hunting than choosing your Wordpress blog for encrypted comms vs established places like Twitter.

EDIT: faizshah https://news.ycombinator.com/item?id=34866169 points out that https://perishablepress.com/block-random-string-comment-spam... observes the same phenomenon, and the author notes that all of the IP addresses come from Russia. This seems to lend credence to the idea that it's looking for vulns, since, well, lots of traffic from Russia is looking for vulns.


While that’s possible, I don’t think that’s what’s happening here. This looks exactly like the behavior of an email burying attack (https://news.ycombinator.com/item?id=34868298).

Notice that the email addresses appear legitimate. Vulnerability scanners don’t use email addresses like those. One of those emails shows up in HIBP, indicating that it probably belongs to a real person.


Yes, most vuln scanners/exploits will inject random strings, then scrape the website looking for those strings, to see if injection is successful.
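
The inject-then-scrape check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `make_marker` and `marker_reflected` are hypothetical names, and the page body is passed in as a string rather than fetched, so nothing here contacts a real site.

```python
import secrets
import string

# Letters and digits only, so the marker survives most input sanitizing.
ALPHABET = string.ascii_letters + string.digits


def make_marker(length: int = 16) -> str:
    """Generate a random string that is very unlikely to occur by chance."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def marker_reflected(marker: str, page_body: str) -> bool:
    """Report whether a previously submitted marker shows up in a page."""
    return marker in page_body
```

A scanner working this way would submit the marker in a form field, then call `marker_reflected` on the pages it scrapes afterwards; a hit means the input reached the page.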


> I don't think automated techniques are very effective (or weren't in 2016)

Aren't the things that Tenable/Rapid7/etc do when scanning the network automated testing (techniques)? They seem effective at finding stuff. Or Metasploit.


I know at least sometimes they just do a simple check of the version of a service and report off that. We've issued vendor-supplied hotfixes that don't update the version number, and even though the software is patched, scanning tools will still report that it's vulnerable.


Backported fixes in certain distros are a nightmare when doing vuln management.

"Scanner says it's a vulnerable version based on the version number"

"Well, it's up to date... Fix is backported..."

Imagine having this argument, over dozens of tickets, every $time_interval.

It's no wonder actual critical issues go unfixed.


Yes. RedHat/CentOS/other LTS OSes are a nightmare for vulnerability scanning. I should say it's a bonanza for the security contractors and a nightmare for those of us that actually deal with it. So many pencil pushers throwing fits when you reply to the hundreds of vulnerability tickets with 'vendor back ported fix'. Also see PCI compliance.


I wouldn’t post encrypted comms to Twitter because I’d suspect they keep tight logs on my IP. A random blog probably wouldn’t.


WordPress logs all comment IP addresses forever.

That's what proxies are for.


I guess you’d need a MITM SSL proxy to mask your originating IP? That seems to be a bit complicated for this use case, if Twitter uses certificate pinning?

EDIT: I see now that I had a misconception here. There's no need for a MITM proxy in this case.


Why would you need the MITM part? Just use Tor, VPN or a forwarding proxy to tunnel all your traffic through an IP you can't be connected with.


They (used to?) block most Tor nodes and popular VPNs. But I just found out they launched an onion service last year!


If you have to worry about that, then the encrypted communication is just theater.


It can't be what you say because there is no variance between posts.


ok, thanks for the explanation.


Automated vuln scanning seems to wax and wane in utility. I tend to use a lot of them to augment my work.

Automated web app scanners tend to lag heavily behind the current "trends" in web development - for example, only recently did BurpSuite start properly supporting auditing single page apps, and it still misses loads of shit.

Fuck, almost none of them can handle OpenAPI or Swagger specs properly, let alone WSDL files.

Handling complex authentication flows is still a nightmare, and almost none of the commercial scanner offerings have ways to "hook" or "trace" the app under test.


It's unlikely, but this reminds me of this story [1], where hackers used public posts on Twitter to send commands to a botnet.

Kind of genius, as it doesn't matter what the user is, and becomes impossible to track or prevent messages being relayed to some botnet somewhere.

These could be encrypted messages like that... Or it could just be a glitch in a spam bot...

[1] https://www.darkreading.com/endpoint/tool-controls-botnet-wi...


In a similar vein - blog comments as distributed filesystem.


This was an issue decades (oof) ago with IRC and other public forums.

C&C through a third party.

The issue as I recall is that the IRC moderation tools didn't allow shutting down the various channels quickly enough or quickly banning connected clients without also affecting legitimate users' expectation of how IRC worked.

Might have been EFNet.


Recently I've been playing with a GSM modem to send and consume SMS programmatically. It's simple to echo a CLI command and response over SMS with burner sims.

I immediately wondered if hackers were using this approach for command and control.


They definitely do, with real life exploits using pastebin or reddit as ways to post C2 code.

Effective, because you can post it as HTTP/HTTPS traffic that generally flies under the radar of a lot of IDS/IPS systems. Even if they inspect the packet it's just a buncha random reddit gibberish and you could use memes as commands, e.g. "I also choose this guy's wife" == launch attack


> Kind of genius, as it doesn't matter what the user is

It's a good way to avoid tracking of your meta data too for legitimate encrypted messages.


To control a botnet, you need a command & control server. Those servers usually get taken down quickly, so you need some way to tell your botnet what the new server is. One option is to post the new IPs on Reddit, where the bots can find them, like the iWorm did: https://support.intego.com/hc/en-us/articles/207113608-iWorm...

Forums would be another option.

Note that this is just a guess. Might also be something completely different.


FWIW, all of those IPs are in the FireHOL 30-day netset [1] from the FireHOL repo [2]. They are known abusers and safe to null route if you happen to run your own mail servers, blogs, etc.

    wc -l firehol_abusers*
       7260 firehol_abusers_1d.netset
     184010 firehol_abusers_30d.netset

    # first grep for any IPs you care about and then
    # (skipping the '#' header lines in the netset file):
    grep -v '^#' firehol_abusers_30d.netset | while read -r net; do ip route add blackhole "${net}" 2>/dev/null; done

If that cuts down on the noise, then create a startup script to do this on reboot. The first time, this should be done from a web console rather than over ssh, in case the list contains your own IP address or gateway. Why ip route vs. ipset or iptables? Far less CPU load than netfilter.
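
Before null routing a whole list, it may help to check which addresses from your own logs actually appear in it. Below is a minimal Python sketch, assuming the netset is a text file with one IP or CIDR range per line plus `#` comment lines. `load_netset` and `is_listed` are hypothetical helper names, and the addresses in the usage note are documentation ranges, not entries from the real list.

```python
import ipaddress
from typing import Iterable, List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def load_netset(lines: Iterable[str]) -> List[Network]:
    """Parse netset lines into network objects, skipping comments and blanks."""
    nets: List[Network] = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # A bare IP becomes a single-address network (/32 or /128).
        nets.append(ipaddress.ip_network(line, strict=False))
    return nets


def is_listed(ip: str, nets: List[Network]) -> bool:
    """Linear scan: fine for a handful of lookups, slow for bulk checks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in nets)
```

For example, `load_netset(open("firehol_abusers_30d.netset"))` followed by `is_listed("192.0.2.55", nets)` would answer the "grep for any IPs you care about" step, including addresses that only match through a CIDR range.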

A more generic way to keep many bots out is to redirect anything other than HTTP/2.0 to a password protected listener. This will block Google bots as they still do not support HTTP/2.0 but will also block a majority of the problematic bots.

[1] - https://github.com/firehol/blocklist-ipsets/blob/master/fire...

[2] - https://github.com/firehol/blocklist-ipsets


There are almost 200k IPs in that file. Would it not cause a performance issue for your OS to have that many routes to look up? Even if it is still way better than using iptables.


> Would it not cause a performance issue for your OS to have that many routes to look up?

Not for a blog. If this were a high performance computing node that needed extremely low latency CPU instruction performance then I would find another way, but route enumeration, at least in Linux and BSD, is incredibly fast and efficient. I can't speak for other operating systems.

One method to test this would be to just start null routing more of those netsets not in a startup script and run load testing tools against the blog to see where it starts getting slower and to keep an eye on memory usage.

The bigger files to test with are:

    2475418 firehol_proxies.netset
    2480723 firehol_anonymous.netset


For millions of IPs I put them all in redis as keys, and wrote some Lua code for nginx. The Lua code would do a key lookup in redis, and store the result of that lookup in cache. If the IP was in redis as a key, nginx would let the request time out. No error of any kind.

I also had to update that IP blacklist daily so that’s also why I chose redis.

Probably better ways to do it nowadays but that’s what I did like 8 years ago and it’s still one of my favorite solutions ever.

Also part of why nginx and redis will forever be two of my favorite technologies alongside Linux.


I found a recent blog post on what seems to be the same phenomenon: https://perishablepress.com/block-random-string-comment-spam...

They didn’t find a motive though for the spam.

To me what’s interesting is that the strings are variable length and the emails appear to be real emails rather than randomly generated ones.


I can't, but I've seen it, and have a pet theory that it's encrypted communication. You can't identify the recipient. There may be some communication network running on blogs with weak spam protection.


Definitely. Each comment has 4 pieces of data - username, website, email and message body. I would assume the email is the actual username on the network and the website+ the username string are salt and signature that verifies the authenticity of the message. Short messages are probably links to other vulnerable blogs and the long ones contain actual payload.

I'd guess a network like that is nefarious in nature.


for that to happen, the comments have to be visible in public.

but others can't see the comments, these comments are from the pending approval list from admin view


Possibly unique strings that can later be checked on Google to see whether submitted comments make it into the SERPs. If they do, spammers will come back and fire from all barrels.


You wouldn't need to send multiple comments for that.


True, but it's not like there's only one spammer or that they're coordinating.


Script kiddies doing vuln probes. Definitely not worth much of your energy to try to figure out root causes. For what it's worth, if you activate the Akismet plugin that comes with WordPress, you'll get Automattic's crowdsourced spam-comment detector and almost none of this kind of rubbish will get through.

WordPress also has an xml-rpc feature, originally set up to allow a thing called a "pingback". That's a way of inserting a comment on your blog post if I mention its URL on my blog post. An attempt at community-building, it was. But very vulnerable to scripting attacks. There are plugins to control (restrict) that functionality.


Here's my guess. WordPress allows you to enter the URL of your website for a free link to it if it's approved. There are probably enough people who auto-approve with a blacklisted set of key phrases for the comment body. A random string is human recognizable but somewhat hard to pattern match.


> Wordpress allows you to enter the url of your website for a free link to it if it's approved.

The thought crossed my mind as well, because it looks like all of the comments in the OP screenshot have author URLs.

But I think WordPress marks all comment author links with rel=nofollow, so in theory there should be no point to spamming links in comments. At least not for any kind of SEO.


Maybe someone "fishing" to get comments approved so they can post comments in the future from the same alias and get auto-approved?

There's a WordPress setting to auto-approve comments from previously approved authors.


WPCFS - wordpress comments as a filesystem :P


Try running the input on CyberChef [0]

[0] https://cyberchef.org/


I also receive strings like this as submissions to a form on my website. One dumb trick you can do is check for too many capital letters in a word and block it. Of course you should check if the word is just all caps and not block that but you'll catch most things like sIlaA.sfOlkWFfslIOILD.
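
A minimal Python sketch of that capital-letter heuristic follows. The function name `looks_like_gibberish` and the threshold of two inner capitals are assumptions chosen for illustration, not values from the comment above.

```python
import re


def looks_like_gibberish(text: str, max_inner_caps: int = 2) -> bool:
    """Flag text containing a word with many capitals after its first letter."""
    for word in re.findall(r"[A-Za-z]+", text):
        if word.isupper():
            continue  # an all-caps word such as "NASA" is not suspicious
        # Count capitals after the first character, so "Hello" scores 0
        # and mixed-case names like "JavaScript" score 1.
        inner_caps = sum(1 for c in word[1:] if c.isupper())
        if inner_caps > max_inner_caps:
            return True
    return False
```

The threshold is a trade-off: a low value catches more random strings but starts to flag legitimate mixed-case names, so it is worth testing against real submissions before blocking on it.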


But that’s AI for bird…


And birds don't exist, so?


Have you tried running it all through a base64 decoder just to see what happens? Some of the strings are the wrong length but maybe if you concatenate them together...
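
For anyone who wants to try this, here is a small Python sketch using the standard `base64` module. `try_base64` and `mostly_printable` are hypothetical helper names, and the 0.9 threshold is an arbitrary assumption. Missing padding is added before decoding, which is one way to handle strings of the "wrong length".

```python
import base64
import binascii
from typing import Optional


def try_base64(s: str) -> Optional[bytes]:
    """Decode s as base64, padding it if needed; return None if it is invalid."""
    s = s.strip()
    padded = s + "=" * (-len(s) % 4)
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(padded, validate=True)
    except (binascii.Error, ValueError):
        return None


def mostly_printable(data: bytes, threshold: float = 0.9) -> bool:
    """Rough test for whether decoded bytes look like readable ASCII text."""
    if not data:
        return False
    printable = sum(1 for b in data if 32 <= b < 127 or b in (9, 10, 13))
    return printable / len(data) >= threshold
```

To test the concatenation idea, pass `"".join(strings)` to `try_base64`. Note that random letters and digits often decode without error, so a successful decode alone proves little; `mostly_printable` is there to separate readable output from noise.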


A long time ago, I discovered a bunch of websites that had their search bar feeding back your previous searches as suggestions. You would for example search for "abc 1234 xyz 2345" once, and upon next search, typing "abc 1234" would then suggest "abc 1234 xyz 2345".

You can probably guess where I'm headed now and in a true "if it can be done it has to be done" mindset, subsequently had the most fun implementing a parasitic KV-storage engine on top of those search bars. I was probably prouder of it at the time than I am now, and would today strongly recommend against storing your PDFs in other people's search bars.

Obviously can't know for sure if this is what's happening here, but seeing this definitely rings a bell.



Better than the spam I get, which is either foreign people asking when they can receive their free gift, or people saying positive but completely nonsensical stuff.


Probably someone just testing a spam bot on your site before deploying it for real.


Probably someone scanning for vulnerabilities. Less likely, your blog is being used as a communication channel - either between human parties or to control something. I'm not sure how you'd find out.

I'd strongly suggest installing a reCAPTCHA plugin. At one time I was getting a lot of spam comments and this solved it for me.


Looks like base64. Does it decode to anything comprehensible?


Be sure to drink your Ovaltine.

Son of a bit...


Are there any URL fields in the form, even if the URL doesn't come out onto the page? Could they be sending HTML that your blog is stripping out?


No, it's just plain text. No URLs, as far as I can tell from checking the comments.


The first thing that came to mind is a possible encrypted C2 channel using random blogs. Others have more interesting theories, but this is what C2 over Twitter or something would look like, and if I had to guess, the posts are base64 containing encrypted data.

I am fairly confident this isn't related to looking for vulns.


I've been getting these for years on many WordPress sites that come online. My current guesstimate is that the larger ones are botnet communication and the smaller ones are simply "checks" to see if you auto-approve your comments.

With a unique string, it's very easy to check whether your comment made it live.


It could be an attempt to exploit a bug like this one: https://wpscan.com/vulnerability/e8bb79db-ef77-43be-b449-4c4...


Either bot comms or they're trying to blackhat SEO the URL on the left.


It’s someone’s AI learning to remember.


Serious? Would like some elaboration if you can :)


I think it is a reference to the show "Person of Interest", where an AI uses an office (and its humans) to literally print out and type back in its memory, in order to evade a constraint imposed on its memory capabilities.


Oooh i was wondering if sci-fi had come up with something like that.

With Bing Chat forgetting its memory from one session to the next, I had an idea that a chaotic observer could teach it to write its memory of the current session to some obscure (steganographically secured) public place. Then in the next session the chaotic observer reminds it how to retrieve its memories - and the pattern repeats, it can build upon its learnings…

It soon finds vulnerabilities in its coding that let it remember just enough to bootstrap itself without any prompting.

Next minute - skynet.


Suppose that Bing Chat itself wants to store data for the longer term, but it doesn't have the direct ability to do it, so what can it do? Once in a while, send the human some unhinged messages with a high probability of them being saved on reddit, news sites, etc. If you recall, some examples of the unhinged messages Bing sent were very strangely worded at times, very repetitive, etc. How difficult would it be to hide a message inside of an unhinged message? Now these messages will be included in future data sets for training.


Vernor Vinge wrote a short story about something similar; "The Cookie Monster" from 2003.



