
Show HN: Crawlab: Open-Source Web Crawler Admin Platform That Runs Any Language - tikazyq
https://github.com/tikazyq/crawlab
======
tikazyq
Hi,

Thanks for the upvotes.

Crawlab is a golang-based distributed web crawler management platform,
supporting various languages including Python, NodeJS, Go, Java, PHP and
various web crawler frameworks including Scrapy, Puppeteer, Selenium.
Technically you can run any spider on it. It has both English and Chinese
language support.

GitHub repo:
[https://github.com/tikazyq/crawlab](https://github.com/tikazyq/crawlab)

Demo:
[http://crawlab.cn/demo](http://crawlab.cn/demo)

Since its launch in March, Crawlab has received a lot of positive feedback,
especially about its flexibility and appealing web UI. Crawlab is evolving
fast, and we have developed many features through continuous iteration.

------
dragonsh
A good project. How does it compare to the Python-based Scrapy (scrapy.org)?

For admin, you can use Scrapinghub's open-source tools, or another project:
[https://github.com/Gerapy/Gerapy](https://github.com/Gerapy/Gerapy)

~~~
chrischen
Scrapy is a web crawler. This project is a web crawler management UI/platform,
so it presumably manages your scrapy crawlers/instances and schedules them.

~~~
dragonsh
So if I understand correctly, Crawlab is another simple, easy-to-use admin
tool for managing web crawlers; one still needs to use Scrapy or write their
own crawlers. It should be similar to the admin tools I mentioned in my
earlier comment and those listed at:

[https://github.com/topics/scrapy-ui](https://github.com/topics/scrapy-ui)

[https://github.com/topics/scrapyd-ui](https://github.com/topics/scrapyd-ui)

~~~
tikazyq
There are a couple of crawler management projects: scrapydweb, spiderkeeper,
gerapy, crawlab. The first three are based on scrapyd.

------
realty_geek
Great to see an awesome product - primarily in Chinese!!! That will teach me
to not take English language domination for granted!!

~~~
pattusk
Certainly interesting to see English's domination increasingly challenged on
open source tech projects. However, this makes contributing harder for
non-Chinese speakers. I had a look at the GitHub issues page and all the
discussion is in Chinese. Google Translate can help, but I'm not sure it would
be enough for some subtle problems. I'm also not sure how communication would
go with PRs if part of the team is strictly sinophone.

Great project nonetheless. Will likely give it a try. Keep up the excellent
work!

~~~
VvR-Ox
We'll have to get used to this and I think it's actually a good thing.

Of course a lot of folks speak English but Chinese is also very important and
will be more so in the world.

~~~
dvdkon
I really appreciate having a "lingua franca" of programming. Projects in other
languages are certainly interesting to see, but I also appreciate that most
authors use English; it contributes to a larger worldwide community.

------
tikazyq
Thanks, all, for the upvotes and positive feedback on Crawlab. Crawlab is
mainly focused on Chinese because it was initially promoted on mainland China
tech sites (Juejin, V2ex, etc.). Due to the GFW, we cannot access information
from outside China, so it is difficult for us to hear feedback from
non-Chinese developers.

We would definitely be happy if more contributors joined Crawlab development,
so we will be working on improving multi-language support, including English
documentation, a Code of Conduct, Contributing.md and English-language
communities. Our team is small (please check out the Contributors section),
but its members come from top companies in China, and we would be happy to
share knowledge between Chinese and non-Chinese developers.

Btw, what is the best tech community? (In China we have a WeChat group.)

------
atymic
Looks like a cool project, however I can't seem to get into the demo (it seems
to indicate using admin/admin but that doesn't work).

Would be great to have an English language option on the demo login :)

~~~
tikazyq
Thanks @atymic for the feedback. The initial admin password has been changed
so that no harmful actions can be taken on the demo. Instead, you can sign up
to check out the demo.

We do have an English version, but not on the login page. We will definitely
add it there.

------
captainmarble
Sounds like you're using redis as a message broker for tasks here. Are you
using redis streams?

~~~
tikazyq
No, we are using Pub/Sub for message communication between nodes. For tasks,
we are using a hashed list. The English documentation is missing, but we will
add it later.
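
To illustrate the split described above (broadcast messages between nodes on
one side, a list of queued tasks on the other), here is a rough in-memory
sketch. It is not Crawlab's code and does not talk to Redis: the
`InMemoryBroker` class, the channel name `nodes:worker-1` and the queue name
`tasks` are all hypothetical, and the list is modeled as a plain FIFO in the
spirit of Redis LPUSH/RPOP rather than whatever "hashed list" refers to.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Optional


class InMemoryBroker:
    """Hypothetical stand-in for Redis: pub/sub channels plus named lists."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)
        self._lists: Dict[str, Deque[str]] = defaultdict(deque)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        # Register a callback that fires for every message on the channel.
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: str) -> int:
        # Deliver to current subscribers only; nothing is stored, which is
        # the fire-and-forget behaviour of pub/sub.
        handlers = self._subscribers.get(channel, [])
        for handler in handlers:
            handler(message)
        return len(handlers)

    def push_task(self, queue: str, task_id: str) -> None:
        # Add to the head of the list (analogous to LPUSH).
        self._lists[queue].appendleft(task_id)

    def pop_task(self, queue: str) -> Optional[str]:
        # Take from the tail (analogous to RPOP), giving FIFO order.
        tasks = self._lists.get(queue)
        return tasks.pop() if tasks else None


if __name__ == "__main__":
    broker = InMemoryBroker()
    broker.subscribe("nodes:worker-1", lambda msg: print("worker-1 got:", msg))
    broker.publish("nodes:worker-1", "cancel task 42")
    broker.push_task("tasks", "task-a")
    broker.push_task("tasks", "task-b")
    print(broker.pop_task("tasks"))
```

The point of the split is that a published message is lost if no node is
listening, while a queued task stays in the list until some worker pops it.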

------
lidHanteyk
Cool stuff. Does it really run any language, or only languages that have had
integrations written?

~~~
tikazyq
Crawlab is based on shell execution, so basically anything that can be run in
a shell can be run on Crawlab, i.e. any language.
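
As a minimal sketch of what "based on shell execution" can look like (written
in Python for brevity; Crawlab itself is Go-based, and this is not its
implementation), a runner only needs to hand a command string to the shell and
capture the exit code and output. The function name `run_spider` and its
parameters are assumptions made for this example.

```python
import subprocess
from typing import Optional, Tuple


def run_spider(command: str, cwd: Optional[str] = None,
               timeout: Optional[float] = None) -> Tuple[int, str]:
    """Run a spider's shell command; return (exit code, combined output)."""
    result = subprocess.run(
        command,
        shell=True,                # let the shell interpret the command line
        cwd=cwd,                   # directory holding the spider's files
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout for one log
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Any command works the same way, e.g. "scrapy crawl myspider" or
    # "node spider.js"; echo is used here so the demo has no dependencies.
    code, output = run_spider("echo spider finished")
    print(code, output.strip())
```

Because the runner never inspects the command, the language of the spider does
not matter as long as its interpreter or binary is installed on the node.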

