

New pagination and crawling features - pranade
http://kimonify.kimonolabs.com/kimload?url=http%3A%2F%2Fkimonolabs.com%2Fcrawlblog%2F1

======
nfm
I'd never heard of this site before today. This looks cool and like something
I would actually want to use, and I think you've got a very promising way to
demo your product while doing content marketing.

That said, I had absolutely no idea how to make it work. I clicked around, and
recognized the effects of some of the things I was doing, but it didn't really
make sense to me and I'm still confused. If you're trying to acquire users
from this kind of page, I'd recommend having a giant call to action somewhere
at the top of the page with a link to some introductory explainer material.

~~~
pranade
Thanks a ton for the feedback... that's super helpful. We'll make some tweaks
to this page now.

~~~
nfm
You're welcome :) By the way, I've found that running things through e.g.
UserTesting.com will reveal this kind of info in a way that you won't
necessarily get via feedback from friends or HN. Lots of facepalm moments, but
it's definitely worth doing to get a new perspective on how people perceive
your marketing and product.

~~~
pranade
Awesome, thanks for the recommendation!

------
adwf
Looks cool, but I couldn't find a definitive answer about whether it obeys
robots.txt or not, just that it's up to the end user to determine which pages
get crawled.

I'm not too fussed about people crawling my sites (as you say, it's gonna
happen anyway), but I do worry about certain dynamically built sections of
websites that are off-limits to bots for good technical reasons.

~~~
pranade
Although we do put the onus on users to pay attention to robots.txt, we
realize the reality is that some of them won't necessarily respect it. That's
one of the primary reasons we designed our crawler the way we did: requiring
people to actually specify the links it will visit (as opposed to spidering
around sites following all links). Our hope, at least, is that this requires
people to put a little thought into the data they want (and where they want it
from) before hitting a site.

~~~
adwf
Sounds good enough to me. I might have to keep an eye out for it; hopefully
your users will all be good ;)

Maybe you could design a feature that gives them a friendly reminder if
they're about to cross someone's robots.txt, and lets them check that they
aren't about to do something seriously annoying. Sometimes a robots.txt is
overzealous and should be crossed, but most of the time it isn't.

~~~
pranade
Ah, yeah that's not a bad idea. Still keep the responsibility on the user, but
do our best to keep them honest :)
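The reminder feature adwf describes could be built on a standard robots.txt parser. A minimal sketch in Python using the standard library's `urllib.robotparser` (the rule text and URLs here are invented for illustration, not kimono's actual implementation):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content the crawler might have fetched from a site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /search
"""

def check_urls(urls, robots_txt=ROBOTS_TXT, agent="*"):
    """Return the subset of urls that robots.txt asks crawlers to skip,
    so the UI can warn the user before the crawl starts."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]

flagged = check_urls([
    "http://example.com/jobs/1",
    "http://example.com/private/admin",
])
print(flagged)  # only the /private/ URL is flagged
```

Since the user already lists the exact links to visit, a check like this could run once over that list and surface the flagged URLs before the crawl begins.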

------
zyxley
This really needs actual docs instead of just videos.

Also, there's apparently no way to delete any properties or actually manage
collections using the kimonify GUI, which is pretty annoying.

~~~
pranade
Thanks for the suggestion. You're right, videos aren't the best way to
comprehensively describe how to use this. We're working to put together some
deeper documentation (and improving the API editor) so you'll have more
control over how things come out.

------
lukasm
This is an awesome product. Two requests:

- add the target URL to the JSON
- merge a few APIs into one, e.g. I'm scraping 20 job postings and would like
  to make one GET that gets me all the jobs

~~~
pranade
Thanks, great suggestions! We're working on a more powerful API editor that
will let you do just that.

~~~
lukasm
Also, is it safe to use the API key in JavaScript that runs on the client?
Maybe you should do signed URLs in the same format as S3, or have a public
read-only key.

~~~
trip41
It's safe in the sense that we don't support 'private' APIs yet, don't charge
money yet, and don't allow you to authenticate any other parts of the service
with your API key. But yes, you're right, it will have to be dealt with
eventually. It's on our radar to roll this out well before we actually start
charging people or offer different types of security features. It will
probably be something like public-key/private-key.
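For reference, S3-style signed URLs boil down to an HMAC over the request details, computed with a secret that never leaves the server. A rough sketch of the idea in Python (the parameter names and secret are invented for illustration, not kimono's or S3's actual scheme):

```python
import hashlib
import hmac
import time

SECRET_KEY = b"server-side-secret"  # never shipped to the client

def sign_url(path, expires_at, secret=SECRET_KEY):
    """Append an expiry timestamp and an HMAC-SHA256 signature to a URL path."""
    message = f"{path}?expires={expires_at}".encode()
    sig = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{path}?expires={expires_at}&signature={sig}"

def verify_url(signed_url, now=None, secret=SECRET_KEY):
    """Server-side check: recompute the signature and reject tampered or expired URLs."""
    path_and_expiry, _, sig = signed_url.rpartition("&signature=")
    expected = hmac.new(secret, path_and_expiry.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    expires_at = int(path_and_expiry.rpartition("expires=")[2])
    now = time.time() if now is None else now
    return now < expires_at
```

The client only ever sees the finished URL, so it can't mint new ones, and the expiry bounds how long a leaked link stays usable.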

------
tonystark
You guys are pumping out hero features. Keep it up.

------
doesnt_know
I'm guessing the new feature is not supposed to look like this:
[http://i.imgur.com/cKFgQcg.png](http://i.imgur.com/cKFgQcg.png)

It's pretty broken. Firefox 27.0.1 on Arch if someone from kimono sees this.

~~~
pranade
Thanks for the feedback. Just curious, did you see this after you moved/
resized the window or did this happen right away?

~~~
doesnt_know
It happened straight away. After clicking on any of the data, that's what
happens.

------
theGimp
Man. If you manage to make it work and cover edge cases, you will be solving a
big problem for a lot of people, I imagine.

I hope your service flourishes and continues to exist because it will
definitely make my life a lot better.

Best of luck to you!

~~~
pranade
That's the plan.. thanks!

------
oniTony
Any way to pick up elements that are on the page, but not necessarily
clickable? E.g. the <title> element of the page.

~~~
pranade
It's unfortunately not possible right now, but we know people want it, so
we're working on it. We'll probably have to tuck it into our advanced mode
feature, as it's not something we'll ever be able to solve with the
point-and-click UI.

~~~
k__
So this is a scraper that just scrapes links?

~~~
pranade
No. You can extract any content you want from the page (incl. text, images,
links), just not meta elements or invisible ones (e.g. <title>, or anything
else that usually shows up in <head>, for that matter).

------
kirillzubovsky
This is so good, it's scary.

~~~
pranade
Awesome, glad you like it!

~~~
robertp
Great work. It's a big innovation to take a clunky method of crawling info and
turn it into something extremely elegant in how the output is presented and
how you can work with it.

------
level09
Nice and impressive work. However, I'm a bit old school and still prefer to do
it in a few lines of Python code.
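The kind of few-line Python approach level09 means usually looks something like this, shown here with only the standard library's `html.parser` (the HTML is an inline stand-in for a fetched page):

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect href attributes from <a> tags, the core of a hand-rolled scraper."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

# In practice the HTML would come from urllib.request.urlopen(url).read();
# a literal string keeps the example self-contained.
html = '<ul><li><a href="/jobs/1">Job 1</a></li><li><a href="/jobs/2">Job 2</a></li></ul>'
parser = LinkExtractor()
parser.feed(html)
print(parser.links)  # ['/jobs/1', '/jobs/2']
```

Hand-rolled scrapers like this offer full control, which is exactly the trade-off a point-and-click tool like kimono abstracts away.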

