
Show HN: Sitebulb, a website crawler and auditor for SEOs - hathawayp
https://sitebulb.com/launch/
======
throwaway2016a
This looks really good. The only comment I have is that if I am purchasing a
piece of desktop software I expect there to be at least the option of an
annual license.

Monthly feels more like a service than a product and I think of desktop
software as a product.

But I realize desktop software pricing is evolving so this may just be old
school thinking on my part.

Also, if I'm a Mac user (I am) I expect to be able to buy / pay through the
app store. I know it sucks because Apple takes a huge cut but there is a level
of trust (misplaced or not) and ease of use that comes with the app store. I
am much more likely to use an app if it is in there. Just like on Linux I am
much more likely to use an app if it is in a package manager.

Speaking of that, I don't know what desktop framework was used, but is there
anything preventing a Linux version?

~~~
seibelj
I am personally fine with a monthly fee for desktop software. If I'm paying
for software, it's up to the person who sells the software to decide how I
should pay, and for me to decide if it's worth it.

~~~
gog
I am definitely not fine with that. Of course it's up to the person selling
the software to decide on the pricing model, but it's also up to the customers
to voice their opinion.

I refuse to buy software that is subscription based. The closest to this that
I find OK is PhpStorm which has a hybrid model where if you pay for a year you
get to keep the version you installed at the beginning of your subscription
(or after a year of monthly payments have been made).

I don't expect companies to work on the software for free, if there is an
upgrade worth upgrading I am prepared to pay for it.

------
hathawayp
Hey HN, Sitebulb is a desktop crawler for Windows and Mac, specifically
designed for SEO consultants/agencies. Me and my business partner have been
building it for the last 2 years or so, and we're finally launching it today.

Its main differentiating factors:

1. Scale – it can comfortably crawl 500,000+ page websites despite being a
desktop program.

2. Reporting – it does a lot of data manipulation and processing so you don't
have to.

3. Visualization – it has tons of useful graphs, including the Crawl Maps,
which help you visualise site structure.

Our aim was to give it the reporting capability of a SaaS crawler, with the
convenience of a desktop crawler.

Looking forward to hearing your feedback on our new product. Thanks, HN
community!

~~~
stingraycharles
Interesting that it’s all a desktop app. What problem do you think this solves
compared to something that runs in the cloud? Apart from the cost structure,
I can’t think of anything myself.

~~~
hathawayp
The other big thing, in comparison to cloud software, is convenience. You can
set up a crawl, start it running, and see URLs being crawled, all within a
minute.

On cloud software that's simply not possible, due to the way that everything
is scheduled.

There are a few other small things, such as being able to view Audits offline
(what we call 'train mode').

The cost structure can be a big limiting factor though, especially for smaller
companies. Sitebulb effectively removes all limitations around number of
domains, number of projects, total number of URLs crawled etc...

~~~
throwaway2016a
> On cloud software that's simply not possible, due to the way that everything
> is scheduled.

As someone who works in cloud software this makes me cringe a little.

I have no doubt this is how existing cloud SEO crawlers work but with elastic
scaling, web sockets, and serverless there is no reason why this has to be
true.

It is not a limitation of cloud software. It is a sign of devs and/or product
owners deciding that instant results are not a priority for the product.

Edit: I hear that a lot from industries that are not intimately familiar with
web apps. "You can't do that on the cloud"... a typical web software engineer
will not be able to do it but there are people out there who can. They are
more expensive than your typical developer but, depending on your product,
they are worth it.

~~~
hathawayp
Sorry, maybe I misread, but I kind of read the comment as 'what separates this
from other cloud products on the market?'

So I wasn't trying to argue what is and isn't possible with cloud
architecture, simply what is and isn't possible with (our) cloud-based
competitors.

The process is along the lines of: 'Click Start', get taken to a screen which
says 'Initializing' or similar, then maybe 2-3 minutes later you'll see
something start to happen. But there is little to no data on which URLs are
actually being crawled.

Sitebulb, and desktop crawlers in general, has a much quicker feedback loop.

~~~
throwaway2016a
> Sitebulb, and desktop crawlers in general, has a much quicker feedback loop.

I wasn't denying that. I'm sure it does. I am confident this is way better
than most (if not all) current cloud solutions.

I just think it is unfortunate because there is no technical limitation of the
cloud that prevents it from being instant on the cloud as well.

The cloud can't handle spikes well (1,000 customers all unexpectedly try to
scan at once) but if the load is predictable, linear, or easily done in
parallel, which I suspect it is for this use case, then it is perfectly doable
with no delay on the cloud.
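
To make the "easily done in parallel" point concrete, here is a minimal sketch of a crawl that reports each URL the moment it finishes rather than after a scheduled batch. Everything in it is an assumption for illustration: `fetch` is a stand-in that sleeps instead of making a real HTTP request, and `on_progress` is a hypothetical callback that a real service might wire to a web socket. It is not how Sitebulb or any cloud crawler is actually built.

```python
import asyncio

async def fetch(url):
    # Stand-in for a real HTTP request (assumption): sleeps instead of using the network.
    await asyncio.sleep(0.01)
    return url, 200

async def crawl(urls, on_progress):
    # Start every fetch concurrently, then report each result as soon as it completes.
    tasks = [asyncio.create_task(fetch(u)) for u in urls]
    results = []
    for finished in asyncio.as_completed(tasks):
        url, status = await finished
        on_progress(url, status)  # hypothetical hook, e.g. push to the user's browser
        results.append((url, status))
    return results

seen = []
results = asyncio.run(crawl(["/a", "/b", "/c"], lambda url, status: seen.append(url)))
```

The user-visible feedback loop here depends on when `on_progress` fires, not on where the code runs.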

~~~
gerenuk
Deepcrawl does that, but again it entirely depends on the website type and
infrastructure you have got.

------
mustafabisic1
> We're nobody special. We're not a cool startup that's just secured funding,
> we're a bootstrapped, 2 man team and we've built both our products from the
> ground up.

Their honesty bought me! Nice going bros.

~~~
hathawayp
:) thanks

You might like these as well:
[https://sitebulb.com/release-notes/](https://sitebulb.com/release-notes/)

------
EvanKRob
Ditto on the sitebulb love. It's a well-done product and a welcome challenger
to the status-quo of SEO software out there.

I will say the cost gives me hesitation but you've put a lot of work into it
so I understand the justification.

For the visualization, does the crawl map limit the connections? I was
expecting to see more of a web with pages linked from the entire site. Can you
tell us more about that?

Thanks

~~~
hathawayp
Price is always a difficult one, trying to get the cost/value balance right. We
did some pricing sensitivity testing before launch so I'm hopeful we've not
got it too wrong.

Regarding Crawl Maps, yeah it does have some limitations, which I've
written about here:
[https://sitebulb.com/resources/guides/crawl-maps-faqs/](https://sitebulb.com/resources/guides/crawl-maps-faqs/)

Although from your comment I think you might be thinking it is a link map,
rather than a crawl map. So with the Crawl Map it is mapping out how each
URL/node was found when the crawler traversed the site. So each node will only
ever have one edge/link.

A link map ends up a LOT more messy, although it's on our roadmap to try and
build one of these too!
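
To illustrate the difference, here is a rough sketch, assuming a breadth-first crawl over a made-up four-page site. The function names, the `site` data, and the choice of breadth-first order are all assumptions for the example, not Sitebulb's actual implementation. The crawl map keeps only the edge by which each URL was first discovered, while the link map keeps every link.

```python
from collections import deque

def crawl_map(links, start):
    """Breadth-first traversal recording, for each URL, the page it was first found on."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        for target in links.get(url, []):
            if target not in parent:  # only the first discovery becomes an edge
                parent[target] = url
                queue.append(target)
    return parent

def link_map(links):
    """Every link on the site as a (source, destination) edge."""
    return [(src, dst) for src, targets in links.items() for dst in targets]

# Hypothetical site: each page maps to the pages it links to.
site = {
    "/": ["/about", "/blog"],
    "/about": ["/", "/blog"],
    "/blog": ["/", "/about", "/blog/post-1"],
    "/blog/post-1": ["/", "/blog"],
}

tree = crawl_map(site, "/")
# tree == {"/": None, "/about": "/", "/blog": "/", "/blog/post-1": "/blog"}
edges = link_map(site)
# len(edges) == 9, against 3 edges in the crawl map
```

Even on four pages the link map has three times as many edges, which is why it gets messy so quickly on a real site.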

~~~
will_critchlow
"Almost everything looks like a graph. Almost nothing should be drawn as one."

It's _really_ hard to make sense of full link graph visualisations. I'm
talking about this in an upcoming conference presentation. We should share
notes :)

~~~
hathawayp
Absolutely. In development we tried different ways to make the Crawl Map also
represent link data, and they were all just unintelligible. Even the Crawl
Maps on big sites are hard to get your head around, and that's with Sitebulb
sampling quite heavily.

I'd love for us to come up with some sort of solution for it, I just don't
know how we'd do it!

SL presentation I assume?

~~~
will_critchlow
Yep.

Will be sharing and writing about it too - still a bit of a work-in-progress.
Hopefully it all comes together nicely!

I'm leaning towards comparisons between tables of data, metrics etc rather
than visualisations for much of this.

------
SnowingXIV
What's this doing differently or better than screamingfrog? Which is also a
desktop program and provides quite robust information. SF has been one of the
standard industry tools for those doing SEO for years.

~~~
hathawayp
The main difference from Screaming Frog (which is legitimately awesome) is the
reporting. Once it has finished crawling it will do a lot of pre-processing
for you and build graphs, lists of hints, etc...

I've written a more comprehensive answer to this here:
[https://sitebulb.com/resources/guides/how-is-sitebulb-differ...](https://sitebulb.com/resources/guides/how-is-sitebulb-different-to-other-crawlers/)

~~~
0x4a42
I'm testing Sitebulb right now (trial version). The crawling is kinda slow
(I'm on 100 Mbit/s fiber). Why did you choose to build everything from scratch
instead of making an application that uses the results from other
crawlers/spiders (e.g. Screaming Frog) and just produces the audit reports?

EDIT:

And after about 3 hours of crawling, this is what I got (and no way to resume
it):

> Audit Stopped!
> The audit stopped early because: Maximum Crawl depth limit of 50 reached

> WARNING: Audit Paused ! The audit is incomplete and did not finish properly.

~~~
0x4a42
I left the program running in the background and it resumed itself after a few
minutes. I have no idea why as there was no info on the dashboard.

------
seosimon
Have been using it during the beta programme - really useful tool for doing
site audits and focusing quickly on the areas that can make the most
difference. Already using it on client work and it is now a key tool for me
alongside Screaming Frog and others.
[https://a.paddle.com/click?said=431&aaid=2812&link_id=380&ch...](https://a.paddle.com/click?said=431&aaid=2812&link_id=380&chk=2e6a9d7972bd21985ee10c473e827d88&redir=aHR0cHM6Ly9zaXRlYnVsYi5jb20v)
(my affiliate link)

------
hathawayp
Sorry, I meant to say, there's a free 14 day trial available to anyone once
you download the software (no credit card required).

~~~
MattLeBlanc001
I keep trying to sign up, but it prompts me with: "you already have an
account...", when in fact I don't.

~~~
hathawayp
Hey, I'd appreciate it if you could send your email address to
support@sitebulb.com so we can see what's going wrong with your account (or
lack thereof).

------
blacksmith_tb
Hmm, I set it to crawl our little Ghost blog (e.g. blog.example.com) and it
immediately jumped to spidering our main site (e.g. www.example.com). Now, I
imagine if I had properly poked at the Advanced settings I could have limited
it to the initial subdomain, but I would have expected that to be the
default...

~~~
hathawayp
It should stick to the subdomain you specify in the start URL, unless there is a
redirect or something. Other subdomains won't be crawled, although it will
HTTP status check links to subdomains. So possibly that is what you saw in the
URL log on the crawl progress page?

If you want me to take a closer look send the subdomain over to
support@sitebulb.com and I'll see what's going on.
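
For anyone curious what that scoping rule looks like, here is a minimal sketch, assuming the decision is made purely on the hostname of the start URL. The `classify_link` function and its return labels are made up for illustration and are not Sitebulb's real code. Redirects, which can move a crawl onto another host, are not handled.

```python
from urllib.parse import urljoin, urlsplit

def classify_link(start_url, link):
    """Return 'crawl' for links on the start URL's host, 'status_check' for any other host."""
    absolute = urljoin(start_url, link)  # resolve relative links against the start URL
    start_host = urlsplit(start_url).hostname
    link_host = urlsplit(absolute).hostname
    return "crawl" if link_host == start_host else "status_check"

start = "https://blog.example.com/"
same_host = classify_link(start, "/welcome-post/")
other_host = classify_link(start, "https://www.example.com/pricing")
```

Under this rule a link to www.example.com would show up in the URL log as a status check, but its own links would never be followed.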

------
cubano
Avast free WebShield on Win7 is giving me a FileRepMalware error and is
aborting the download and connection.

I am not exactly an expert on Avast and this particular error but perhaps
someone here would like to know about this.

~~~
hathawayp
That's frustrating, sorry. It's a 'reputation issue' that over-protective
anti-virus software doles out to smaller software vendors like ourselves.
Basically they don't know if it is good or bad because we haven't had millions
of installations.

i.e. it's a false positive

~~~
cubano
Sure, no worries... of course I personally knew that, but I thought it might be
useful for you guys to know it's being blocked by the AV.

It's a really nice product BTW...good job.

------
thehodge
I've been testing Sitebulb for a few months now and I'm really impressed,
you've done solid work guys :) good luck with the launch

~~~
hathawayp
Thank you! And thanks for the beta feedback, it was super helpful.

------
sbeckeriv
Nice to use and useful insights. Thanks!

------
danielsamuels
How does this compare to something like Scrutiny, which seems to do a similar
thing?

~~~
hathawayp
It has a lot more comprehensive reporting and data visualization than the
likes of Scrutiny. I have no idea of the scale limitations of Scrutiny, but
I'd be very surprised if it can handle ~500,000 URLs.

Also Sitebulb is for both Windows and Mac.

------
grandpoobah
why is the windows download 130mb? that seems excessive

~~~
garethbrown
Yeah and that's compressed :) I'm in the process of getting it down, but
there are a few things in there that don't help its size. For instance, the
latest versions of Electron and Phantom are reasonably big.

------
hayksaakian
How does this compete against screaming frog?

~~~
hathawayp
Answered this one below already. For completeness:

The main difference from Screaming Frog (which is legitimately awesome) is the
reporting. Once it has finished crawling it will do a lot of pre-processing
for you and build graphs, lists of hints, etc... I've written a more
comprehensive answer to this here:

[https://sitebulb.com/resources/guides/how-is-sitebulb-differ...](https://sitebulb.com/resources/guides/how-is-sitebulb-different-to-other-crawlers/)

~~~
hayksaakian
thanks, i'll check it out

------
chad_strategic
Linux version?

~~~
hathawayp
I'm afraid not, at least not yet. It's something we'll work on if the demand
appears to be there.

Right now we are focused on other features that appear to be a higher priority
to our users.

~~~
lpasselin
I'd also be interested

