Show HN: IP Geolocation and Threat Data API (ipdata.co)
79 points by jonathan-kosgei 4 months ago | 65 comments



Looks neat, very clean.

What factors go into determining whether an IP is a threat, and how often is this reviewed?

One of the problems I have with most public intel (whether it's FireHOL, Crowdstrike, Alienvault or US-CERT) is that inevitably some GoDaddy site (for example) gets used by an APT, so a GoDaddy IP makes it onto a public blacklist, flagged as abusive. But a billion other sites share that IP on that host (or it gets reassigned, as in AWS), which leads to a deluge of false positives whenever anyone else reaches it through a legitimate site. Some of these IPs remain blacklisted for years even though the malicious infrastructure was dismantled long ago.

Do you add any value by mitigating this, or do you just suck down the same public blacklists every other product uses?


Thanks!

You make a valid point. A number of VPN providers, for example, use GCE/AWS/Softlayer to host their services, and these IPs do get reassigned. One of the ways we mitigate this is by limiting the age of IP addresses in our lists to a maximum of 30 days. If an IP address hasn't been reported for malicious behavior within the last 30 days, it shouldn't be in our lists.
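A minimal sketch of that 30-day aging rule, for readers who want to apply the same idea to their own blocklists. The `last_reported` mapping and the function name are assumptions for illustration, not ipdata's actual implementation.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=30)  # retention window quoted in the comment above


def prune_stale(last_reported, now=None):
    """Return only the entries reported within MAX_AGE of `now`.

    `last_reported` maps an IP string to the timezone-aware time of its most
    recent abuse report. This structure is assumed for illustration.
    """
    now = now or datetime.now(timezone.utc)
    return {ip: seen for ip, seen in last_reported.items() if now - seen <= MAX_AGE}


now = datetime(2018, 6, 1, tzinfo=timezone.utc)
reports = {
    "192.0.2.10": now - timedelta(days=3),     # recent report: kept
    "198.51.100.7": now - timedelta(days=45),  # stale report: dropped
}
fresh = prune_stale(reports, now)
print(sorted(fresh))
```

Keying on the most recent report rather than the first one means an address that keeps misbehaving stays listed, while a reassigned address ages out on its own.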

We're also mulling adding an is_cloud_provider field to the threat response object.
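One way such a flag could be computed is a membership test against published provider ranges. This is only a sketch: the ranges below are placeholder documentation blocks, and the provider names and helper functions are hypothetical.

```python
import ipaddress

# Placeholder ranges from the RFC 5737 documentation blocks, NOT real provider ranges.
CLOUD_RANGES = {
    "example-cloud-a": [ipaddress.ip_network("192.0.2.0/24")],
    "example-cloud-b": [ipaddress.ip_network("198.51.100.0/24")],
}


def cloud_provider(ip):
    """Return the provider name whose range contains `ip`, or None."""
    addr = ipaddress.ip_address(ip)
    for name, networks in CLOUD_RANGES.items():
        if any(addr in net for net in networks):
            return name
    return None


def threat_fields(ip):
    """Build the flag discussed above (field name taken from the comment)."""
    return {"is_cloud_provider": cloud_provider(ip) is not None}


print(threat_fields("192.0.2.55"))
```

A consumer could then treat a blacklist hit on a cloud address with more suspicion of being a false positive, since those addresses are shared or reassigned often.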


is_cloud_provider, would be great!


Thanks for the feedback! We'll probably push this by the weekend :)


Why is is_anonymous = true for a tor relay? Why are relays and exit nodes given the same flag? As a tor relay operator, I anticipate this being misused, like so many before it, to arbitrarily block all tor relays by people who don't know or care how tor works.


Hi f2n, thank you for raising this concern. is_tor and is_anonymous are true for any and all nodes on the Tor network. I'd love to hear more about your concerns, please send me an email at jonathan at ipdata dot co


For traffic to leave the tor network, it must go via an exit node. Many nodes are just 'relays' (they allow no exiting).

A few years ago I ran a no-exit relay from home. Eventually Hulu blocked my IP, even though all traffic from my IP to hulu was from me, and not via tor. Hulu couldn't be bothered to differentiate, and just assumed that a tor node at my IP address meant that traffic coming from that IP must be something routed via Tor.

By not differentiating, you are making the same mistake, and will punish people for no reason.

Edit: Assuming you are pulling the 'exit node list' you link below, I don't see how that is happening, as it does at least claim to verify that you can actually exit through the node.
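To make the distinction concrete, here is a rough sketch of flags that separate "runs a Tor node" from "traffic can exit here". The field names `is_tor_node` and `is_tor_exit` and both input sets are assumptions, not fields the API is known to return.

```python
def tor_flags(ip, relay_ips, exit_ips):
    """Classify an address using two separately maintained sets.

    relay_ips: addresses of all known Tor nodes (assumed input)
    exit_ips:  addresses that permit exiting (assumed input, e.g. parsed
               from the bulk exit list)
    """
    is_exit = ip in exit_ips
    return {
        "is_tor_node": is_exit or ip in relay_ips,
        "is_tor_exit": is_exit,
        # Only exits can originate Tor traffic towards a website.
        "is_anonymous": is_exit,
    }


relays = {"192.0.2.1", "192.0.2.2"}  # made-up addresses
exits = {"192.0.2.2"}
print(tor_flags("192.0.2.1", relays, exits))
```

With this split, a no-exit relay run from a home connection would not be flagged as anonymous traffic.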


Hi colonelxc, we'll probably only include Tor exit nodes in our lists and are working to fix this. Thanks for sharing this.


This looks great! I'll give it a shot for my web game (it's a neverending battle dealing with troll accounts).

I've been using Cloudflare to detect TOR, but couldn't find a good way to detect proxies/etc.


You'll love our data :)


Hey all! An interesting way to test our API would be to pick a random IP address at http://check.torproject.org/cgi-bin/TorBulkExitList.py?ip=1....

And append it to https://api.ipdata.co. For example: https://api.ipdata.co/185.10.68.114
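A small sketch of building that URL in code, validating the address first. The helper name is made up, and the thread does not say whether the endpoint needs an API key or other parameters.

```python
import ipaddress

API_BASE = "https://api.ipdata.co"  # base URL taken from the comment above


def lookup_url(ip):
    """Validate `ip` and append it to the API base URL."""
    addr = ipaddress.ip_address(ip.strip())  # raises ValueError on malformed input
    return f"{API_BASE}/{addr}"


print(lookup_url("185.10.68.114"))
```

Validating before the request avoids sending garbage to the service and gives a clear local error for typos.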


Do you provide a local database? Making a web service call for every request seems like a performance bottleneck.


I've always had good luck with Maxmind's local database [1] offering. It bewilders me how many companies today create SaaS offerings and refuse to offer on-prem versions. It's like they intentionally want to avoid customers with serious needs (speed and security being the most common need for on-prem) who are willing to pay serious amounts of money.

[1] https://www.maxmind.com/en/geoip2-databases


Let's say it like this:

- https://api.ipdata.co/1.1.1.1

City name: Research

- https://www.maxmind.com/en/geoip2-precision-demo?ip=1.1.1.1

City name: Research

Oh, that must be quite a coincidence.

Nuh nuh nuh, nobody would ever want to launch a SaaS with a database they have stolen.


They aren't using a copy of MaxMind's DB, see:

https://www.maxmind.com/en/geoip2-precision-demo?ip=185.10.6... (No data)

https://api.ipdata.co/185.10.68.114 (lots of data)


I wonder if Maxmind have put in some "Trap Streets" in their dataset. https://en.wikipedia.org/wiki/Trap_street


Yes, I believe they have. It is very common in the industry.


Interesting! Both are making the same mistake.


Hi, unfortunately we don't. However, performance is very important to us, which is why we have 11 endpoints around the world and average ~65ms response times; see status.ipdata.co.


65ms feels like a lot. I guess you can cache it so you're only calling once per IP.
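A rough sketch of that caching idea: a per-IP cache with a time-to-live so each address triggers at most one remote call per window. The class, the one-hour TTL, and `fake_fetch` are all assumptions for illustration.

```python
import time


class TTLCache:
    """Cache lookup results per IP so each address is fetched once per TTL."""

    def __init__(self, fetch, ttl_seconds=3600.0, clock=time.monotonic):
        self._fetch = fetch      # function that performs the real (slow) lookup
        self._ttl = ttl_seconds  # assumed value; tune to how fresh the data must be
        self._clock = clock
        self._entries = {}       # ip -> (expiry time, result)

    def get(self, ip):
        now = self._clock()
        hit = self._entries.get(ip)
        if hit is not None and hit[0] > now:
            return hit[1]        # still fresh: no network call
        result = self._fetch(ip)
        self._entries[ip] = (now + self._ttl, result)
        return result


calls = []


def fake_fetch(ip):
    """Stand-in for the API call; records how often it is hit."""
    calls.append(ip)
    return {"ip": ip}


cache = TTLCache(fake_fetch)
cache.get("192.0.2.1")
cache.get("192.0.2.1")
print(len(calls))  # prints 1: the second get() was served from the cache
```

The TTL matters for threat data in particular, since a cached "clean" answer goes stale as lists are updated.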


I understand what you mean, but in the tests I've done every other provider takes about twice as long as we do, some of them even over plain HTTP. We only serve requests over HTTPS.

More importantly the performance is consistent and you'd get the same performance wherever you were in the world.


Why compare to another API endpoint? You should compare it against accessing a local database.


Comparing a network call vs filesystem I/O wouldn't be a useful comparison for someone deciding between different third-party API providers.


Why not? I can load an IP database locally, so it's an option and a "competitor" to a 3rd party API. That's what I'm comparing it to, not against other APIs.
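For a sense of what the local option involves, here is a minimal sketch of an in-memory range lookup using binary search. The rows are made-up documentation ranges and the table layout is an assumption; real databases such as MaxMind's use their own binary format and reader libraries.

```python
import bisect
import ipaddress

# Made-up rows (start, end, label) over documentation ranges.
ROWS = sorted(
    (int(ipaddress.ip_address(start)), int(ipaddress.ip_address(end)), label)
    for start, end, label in [
        ("192.0.2.0", "192.0.2.255", "example-net-1"),
        ("198.51.100.0", "198.51.100.255", "example-net-2"),
        ("203.0.113.0", "203.0.113.255", "example-net-3"),
    ]
)
STARTS = [row[0] for row in ROWS]


def local_lookup(ip):
    """Binary-search the non-overlapping ranges; return the label or None."""
    value = int(ipaddress.ip_address(ip))
    i = bisect.bisect_right(STARTS, value) - 1  # last range starting at or before value
    if i >= 0 and value <= ROWS[i][1]:
        return ROWS[i][2]
    return None


print(local_lookup("198.51.100.42"))
```

A lookup like this costs a handful of comparisons in memory, which is the baseline a remote API's latency gets compared against.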


It's pretty obvious that hitting your local disk is going to be a lot faster than making a network call. Make the right decision for your use case.


Maybe in this case. But historically, exactly this decision point has oscillated. It's caused reversals in distributed computer design for decades. First the networks were slow and disks were (relatively) fast - so each machine had one. Then networks went Ethernet, and disks started disappearing. Disks got down to a few ms access time and they came back. Then Gigabit came around. Then SSD.

Today I'd say it depends upon exactly what your network data source latency measures out to. The answer could go either way.


That's a pretty interesting history and put in that perspective I see what you mean and totally agree with you.


This is definitely really cool and something I imagine myself using for future projects. The price points are very reasonable and the website is very usable. One question I have, though: what best practices do you recommend for protecting against IP address spoofing?


I don't have a resource to point you to, but feel free to reach out via email: jonathan at ipdata dot co. Would love to discuss this.
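One common reading of the spoofing question is forged `X-Forwarded-For` headers, which any client can set. A hedged sketch of the usual precaution follows: only honor the header when the direct peer is one of your own proxies. The trusted range and function names are assumptions, and this does not cover other kinds of spoofing.

```python
import ipaddress

TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]  # assumed: your own load balancers


def _trusted(ip):
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    return any(addr in net for net in TRUSTED_PROXIES)


def client_ip(peer_ip, forwarded_for=""):
    """Pick the address to geolocate.

    peer_ip:       address of the direct TCP peer
    forwarded_for: raw X-Forwarded-For header value, which clients can forge
    """
    if not _trusted(peer_ip):
        return peer_ip  # header came straight from an untrusted client: ignore it
    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    # Walk right to left: entries appended by our proxies are reliable,
    # and the first untrusted one is the closest address we can vouch for.
    for hop in reversed(hops):
        try:
            if not _trusted(hop):
                return hop
        except ValueError:
            break  # malformed entry: stop trusting the rest of the chain
    return peer_ip


print(client_ip("10.0.0.5", "198.51.100.1, 203.0.113.7, 10.0.0.9"))
```

Here the leftmost entry is ignored on purpose, because it is the part of the header a client controls.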


How often is the threat data updated? Any way to truncate the response to just what we want?


The data is updated as often as every 15 minutes, though we aggregate all those changes over the course of an hour. Are you interested in only the threat data? We have been considering making it possible to query for individual fields.
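Until individual fields can be queried, the response can be trimmed on the client side. A minimal sketch, assuming a response where the threat flags sit under a `threat` key; that nesting and the sample values are hypothetical.

```python
def pick_fields(response, wanted):
    """Return a copy of `response` containing only the keys listed in `wanted`."""
    return {key: response[key] for key in wanted if key in response}


# Hypothetical response shape; only is_tor and is_anonymous are named in the thread.
full = {
    "ip": "192.0.2.1",
    "city": "Example City",
    "threat": {"is_tor": False, "is_anonymous": False},
}
print(pick_fields(full, ["threat"]))
```

This saves nothing on the wire, so server-side field selection would still be the better fix for bandwidth or latency.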


We'd only be interested in threat data for our use case.


Please send me an email at jonathan at ipdata dot co.

We'll work on exposing the individual fields and I can let you know once that is live.


One of the few services that geolocates my VPN correctly to Germany. Too many of them place it in France and I get shown lots of ads I don't understand. And no threat flag either (some blacklist me for sitting on an OVH network).

Good work!


Thanks! :)


I tried sharing your page on FB Messenger with a friend who is interested in these kinds of APIs, but Messenger blocked it...man I hate this blacklisting crap...back to IRC it is.


Hey sorry about that! Thanks for sharing!


What tool do they use for https://status.ipdata.co ?




I will answer myself: https://updown.io

I think I will use it in https://apility.io, a service that competes with this to some extent.


https://apility.io/search/127.0.0.1

Not sure I would trust Apility.


That means 127.0.0.1 is in these lists. We compile many different sources into one list you can customize. If you don't trust one or several of them, you can disable them using the API or the dashboard. There are over 100 now.

A cyber intel engineer would check these lists to find out why this is a false positive. In this case some of them are bogon lists, and they are correct. The other hits mean that a popular malicious domain has changed its public IP to an address in a private range, so the domain should be removed from those lists, which normally happens automatically within hours.


Seems nice.

Small suggestion: 1500 API requests for free and 2500 API requests for 10€.

I think that's a huge leap in pricing. Either reduce the requests for free users, or reduce the price / increase the requests for the first paid plan.


It's not really that huge, 1000 requests is a rounding error for our other plans :)


Great job! We have a similar service, https://ip-api.io: IP geolocation and IP intelligence / abuse prevention.


Lots of 404 for the images, such as https://ipdata.co/img/seworks.svg


That link works for me. Could you try to reload the page? Ctrl + R


Tried in a new private browser window, even. Here's part of the page source on that URL:

<header class="section-header"> <small>Oops</small> <h2>Page Not Found!</h2> <hr> <p class="lead">Sorry. That page doesn't exist.</p> </header>

Edit: The images are loading now. But https://api.ipdata.co returns:

  {"message": "Internal server error"}

and the demo on the right hand side of the frontpage shows:

  Your IP Address is {}0 items


Could you try visiting https://api.ipdata.co. I'm positive the API is up https://status.ipdata.co


It still errors out:

  % curl --header "Accept: application/json" https://api.ipdata.co
  {"message": "Internal server error"}

Also, this request doesn't seem to return valid JSON:

  % curl --header "Accept: application/json" https://api.ipdata.co/224.0.0.1
  224.0.0.1 is a bogon address.


That's an expected response for bogon IP addresses, but you make a good point: we should make our error responses JSON.

Please email me your IP address at jonathan at ipdata dot co and I'll help you out.
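A sketch of what a JSON error for bogons might look like, using the standard library's address classification as a rough bogon test. The `message` key mirrors the shape of the error body quoted earlier in the thread; everything else (function names, which address classes count as bogons) is an assumption rather than ipdata's actual behavior.

```python
import ipaddress
import json


def is_bogon(ip):
    """Rough bogon test based on the stdlib's address classification."""
    addr = ipaddress.ip_address(ip)
    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_multicast
        or addr.is_reserved
        or addr.is_unspecified
    )


def bogon_error(ip):
    """Return a JSON error body for a bogon, or None for a routable address."""
    if not is_bogon(ip):
        return None
    # The message text mirrors the plain-text reply quoted above.
    return json.dumps({"message": f"{ip} is a bogon address."})


print(bogon_error("224.0.0.1"))
```

Returning the same envelope for every error lets clients that send `Accept: application/json` parse failures without special cases.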


Broken for me too


Should be fixed, please try a hard reload.


Have you thought about how (or if) GDPR will affect your product?


Hi sphix0r, yes. First off we only store logs from user requests for 24hrs and only for analytics. Otherwise our GDPR compliance is still something we're perfecting but something we believe we're already on the right side of.


getting a 403 on https://ipdata.co/docs.html


Fixed! Do a hard reload i.e. Ctrl + R


How are you identifying Tor?


There's a huge list at http://check.torproject.org/cgi-bin/TorBulkExitList.py?ip=1.... of Tor Exit nodes.


Jonathan's been doing a great job of ipdata.co! I'd like to also shout out to my own service https://ipinfo.io here though, where we've recently launched new plans that include company details, carrier details, and IP type - we have a custom classifier that labels each IP as isp, business, or hosting, which can be really useful for a bunch of use cases. Here's sample output from the pro plan:

    {
      "ip": "66.87.125.72",
      "hostname": "66-87-125-72.pools.spcsdns.net",
      "city": "Southbridge",
      "region": "Massachusetts",
      "country": "US",
      "loc": "42.0707,-72.0440",
      "postal": "01550",
      "asn": {
        "asn": "AS10507",
        "name": "Sprint Personal Communications Systems",
        "domain": "spcsdns.net",
        "route": "66.87.125.0/24",
        "type": "isp"
      },
      "company": {
        "name": "Sprint",
        "domain": "sprint.com",
        "type": "isp"
      },  
      "carrier": {
        "name": "Sprint",
        "mcc": "310",
        "mnc": "120"
      }
    }

See https://ipinfo.io/responses for more of an overview of the differences between our plans.


Do you think it's appropriate to shill your own service every time a competitor has a Show HN?

You made the same post during OP's last Show HN: https://news.ycombinator.com/item?id=15881463

I also wonder how mature a project has to be before it seems sheepish to "Show HN".


I think it's always appropriate. May the best service win in any Show HN. Feature disparity quickly reaches equilibrium when missing features are highlighted, making all products better (in theory).

Everyone's just trying to do the best they can with what they have.


No hard feelings :)

We're still less than a year old, so not at all sheepish about sharing whenever we can :)


Hey Ben! Thank you :)



