
Show HN: Xtrct.io – Automatic product data extraction - static416
https://xtrct.io
======
darrenwestall
Awesome - I’ll try this tomorrow and let you know if it works for our use
case.

We currently pay a different provider £299 per month for 10k requests so we
will definitely migrate if it does what we need.

A few questions: does it handle pages that are loaded by JS? And would it find
something like a house price or job salary?

~~~
static416
We originally attempted to build something that works on any type of
information on any page, but that was a little too broad to be shippable in a
reasonable time.

So the current version will work on anything that looks like a traditional
product page. Prices, variants, primary photos, comments/reviews (some of the
time), etc.

Other pages may also produce results, but it's really tailored to product-like
pages at the moment.

It runs a full Chrome browser, so if you can see it in Chrome, it'll be
extractable by the engine. It renders JS, and dismisses any JS popups or
modals so they don't contaminate the data.
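
To illustrate the idea of keeping popup content out of the extracted data,
here is a minimal sketch. It is not the engine's actual method, which isn't
described in this thread: instead of driving a browser, it filters HTML that
has already been rendered, skipping any element marked with role="dialog" or
aria-modal="true". The names ModalStripper and visible_text are hypothetical,
and the sketch assumes well-formed HTML with matching closing tags.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ModalStripper(HTMLParser):
    """Collects text, skipping anything nested inside a modal dialog.

    Hypothetical helper for illustration only; assumes well-formed HTML.
    """

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while we are inside a modal subtree
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        is_modal = (attrs.get("role") == "dialog"
                    or attrs.get("aria-modal") == "true")
        # Once inside a modal, every nested element deepens the skip.
        if self.skip_depth or is_modal:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def visible_text(html):
    """Return the text chunks of `html` that sit outside modal dialogs."""
    parser = ModalStripper()
    parser.feed(html)
    parser.close()
    return parser.chunks
```

Fed a product page whose markup contains a newsletter popup, visible_text
would return the title and price text but none of the popup's text. A real
browser-based setup would more likely close or remove the modal in the live
page before reading it.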

If you give it a try and it doesn't produce the results you're hoping for,
definitely let me know and I'll try to add your examples to the next batch of
training data. Once a site is added to the training data, the accuracy is very
high.

But please give it a try and let me know what you think! eric@xtrct.io

~~~
darrenwestall
I've dropped you an email. It doesn't seem to handle jobs at the moment, but
I'd love to explore this with you further.

------
static416
Just launched. The ML components still need some improvement, but it works
pretty well.

I want to gather some input so I know what kind of training data to add to the
ML components, and whether it's missing any features people are interested in.

Any feedback, advice, or thoughts would be great.

~~~
jazoom
What technologies do you use to power this? Something like YOLO?

------
artur_makly
Nice work. What IP service do you use? I've found most of them to be of really
poor quality.

