Hacker News
Show HN: Dog API (dogapi.dog)
275 points by kinduff on Dec 27, 2022 | hide | past | favorite | 73 comments
Hello there, happy holidays.

I've been maintaining this Dog API, which only returned facts, for six years. I recently rewrote the project to make it more flexible [1] and I had a blast doing so.

This API has been used by a lot of computer science students, as well as bots and other third-party services that have integrated with it in the past. The old endpoint receives around 1,000-1,500 requests per day, which makes me happy.

The goal is to extend it and make it more interesting and usable; I collect dog data in my spare time. I'm not looking to monetize it, it's just for the love of education.

Feel free to use it, and share it!

[1]: https://github.com/kinduff/dogapi.dog

From a fun / teaching / learning point of view - I get it. As others are saying however, the secondary learning that you want is that sometimes a database is the right solution.

For example, a student could learn that they need to use an API (with all of the API problems: latency, server down, throttling, blah blah blah) when all they really need to do is download a SQLite file, hook up SQLite to their application, run one query - and have a vastly simpler application. This could be as little as 10-50 lines of code. Never underestimate the power of shipping data via SQLite files.
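A minimal sketch of the "ship data as a SQLite file" idea, using Python's built-in sqlite3 module. The `dog_facts` table and its contents here are made up for illustration, not the actual Dog API schema:

```python
import sqlite3

# In practice you'd open the downloaded file: sqlite3.connect("dogfacts.db").
# An in-memory database is used here so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dog_facts (id INTEGER PRIMARY KEY, fact TEXT)")
conn.executemany(
    "INSERT INTO dog_facts (fact) VALUES (?)",
    [("Dogs have three eyelids.",), ("A Greyhound can reach 45 mph.",)],
)

# The whole "API": one local query. No network, no downtime, no throttling.
row = conn.execute(
    "SELECT fact FROM dog_facts ORDER BY RANDOM() LIMIT 1"
).fetchone()
print(row[0])
```

That's the entire client: a connect call and one SELECT, versus HTTP plumbing, retries, and error handling.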

Considering a set of relatively infrequently changing facts about the external world, publishing them as RDF can be a good choice. Developers can use it in whatever way they need, including loading it into SQLite—and it’s self-documenting, so they don’t need some extra information such as a spec to explain what everything means. It can link to external data for things beyond its scope that someone else already took care of. Other data can link to it, too—so it benefits open information exchange.

(Yes, RDF has got a bit of a bad rap in some circles, but I don’t think it’s entirely justified… More importantly, RDF is incredibly boring, so publishing such a dataset is likely to gather much less attention than releasing a free public API, at least here.)

I agree, but also don't underestimate the problems and risks in real systems that keep changing.

I've shipped some data in a JSON file bundled with an application, because it didn't change a lot, and we had frequent (biweekly) releases anyway. I mistakenly assumed those two things would always be true.

As time went on, the frequency of necessary changes increased, but our overall product's release cadence got worse and worse (many months) as the product grew. So customers had continually outdated data.

An additional problem I didn't expect is that since we wanted to get customers the latest data (because the next release was so far out), we had to update the data set as close to release as possible. So there was always a change at the last minute, which is already a busy time and means it's not really tested well. It was also extra toil; we had a script for data generation but it still needed a manual run and a pull request. In the time I spent running this I could have easily written an API-based version several times over.

Another problem was that this change had to be made for every release, including bugfix ones. This caused frequent discussion (do we need this?), was forgotten a few times, and caused merge conflicts later which were annoying because in this case the proper way to merge was to ignore both ancestors and rerun the script, which people weren't used to.

Of course I rewrote it eventually (after convincing PM that I need time for this supposedly long done feature again), but then I had to deal with the extra problem of customers who had local customer specific modifications of the file (because that was the workaround our support staff came up with as a way to deal with the outdatedness).

Thankfully, a problem I did not have was binary data in git. The commits were ugly but at least still got OK diffs (once we sorted the JSON file properly). But you mentioned SQLite, which is not a good match for git. Of course one can work around it by merging the raw SQL, but still, yet another pitfall.
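The "sorted the JSON file properly" trick above can be sketched in a couple of lines; the `facts` dict is a hypothetical stand-in for the bundled data file. With sorted keys and a fixed indent, re-running the generation script produces byte-identical output for unchanged data, so git diffs stay small:

```python
import json

# Hypothetical bundled data; key order in the source dict doesn't matter.
facts = {"version": 3, "breeds": ["Beagle", "Akita"]}

# sort_keys + a fixed indent gives deterministic, line-oriented output,
# which is what keeps git diffs readable across regenerations.
stable = json.dumps(facts, sort_keys=True, indent=2) + "\n"
print(stable)
```

Note that only the keys are sorted; list order is preserved, so lists that should diff cleanly need to be sorted by the generation script itself.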

Files/dumps as an API is vastly underrated.

For this use case that works well, but a big part of that learning is that the data is fairly static and likely doesn't update often.

Thanks! As soon as I find the time I will use it in my useless project, https://caolendario.pt, a pun on the Portuguese word cão (dog) + calendário (calendar).

Photos are taken from https://dog.ceo/dog-api/, a similar project.

More FE-relevant, but also relevant: https://placekitten.com/

> I lost ownership of the Slack application due to Slack's policy. I tried to regain ownership of my own application but it was not possible.

Would love to hear more about this.

Long story short: I built the app under my ex-employer's Slack organization, and apps are tied to organizations. When I was removed from the organization, I lost access to the app. Couldn't recover it; I've been trying for years.

I thought Slack was just a way to share your screen. Is it a platform for apps?

Slack is like IRC and Slack apps are bots.

Slack primarily is a group chat, but it has support for integrations they call 'Apps'.

Nice. I love silly things like this.

An example on the homepage would be nice. Maybe something about Laika (https://en.wikipedia.org/wiki/Laika)

Unrelated to OP, but is it SOP for Russian agencies to deliberately obfuscate or outright lie on otherwise uncontroversial data?

> Laika died within hours from overheating, possibly caused by a failure of the central R‑7 sustainer to separate from the payload. The true cause and time of her death were not made public until 2002; instead, it was widely reported that she died when her oxygen ran out on day six or, as the Soviet government initially claimed, she was euthanised prior to oxygen depletion.

There seems to be little reason to lie about this, given Laika's chances of survival were already nil.

Having the first living thing in orbit survive for a week vs. a few hours makes a pretty big impact on the perception of your new space program. "Total success, now we just have to figure out re-entry" vs. "failure, our spacecraft cannot sustain life yet".

> is it SOP for Russian agencies to deliberately obfuscate or outright lie


Laika was launched into orbit in 1957. At that time "Russia" was a member of the USSR (soviet union) called the "Russian Soviet Federative Socialist Republic" (https://en.wikipedia.org/wiki/Russian_Soviet_Federative_Soci...). That government collapsed in 1991, and was replaced by the current "Russian Federation" government.

Which is relevant history to your question, because SOP for the USSR/RSFSR is very different from SOP for the current government. The current gov is corrupt and lies, but has entirely different motivations than the government that lied for 34 years about Laika.

Of course they are gonna say things worked out as planned.

http://TheCatApi.com gives me cat pictures.

Can this dog API do that? If so, that would be a huge "selling" point

I'm gathering some images for breeds, already have everything in place to upload them. This feature will come soon.

If you're interested, might be fun to hook it up to stable diffusion

It would be amusing, but since the purpose of this API is factual accuracy, I can't imagine a worse fit than coupling it with stable diffusion.

I love the fact that Cat API offers an Enterprise plan.

That's because cat pictures are the 2nd most viewed type of picture on the interwebs.

“Cat as a Service” is my go to for these kinds of things.


> With V2, I'm planning to add new features including breeds, pictures, and other cool dog data.

There was a dog api that sent dog images, it was a few years ago that I last saw it in action though.

I think you should open source the data, too, by uploading a SQL dump to Github. It provides more learning vectors.

Will give it a thought, thanks for your comment.

One possible advantage here depends on how you do this, but you could make it easy for others to add/edit dog facts and thus help you maintain the data (with moderation, if wanted). Is this something you are interested in? This can be complementary to the API - it doesn't have to replace it.

(I'm working on an open-source tool to help people do this with files in Git repos. https://pypi.org/project/DataTig/#data )

sounds like hooking the API up to Wikidata would be a sustainable long-term solution (presuming Wikidata has the relevant coverage).

The swagger doc for version 2 points to localhost and so the "Try it out" buttons fail.

I mention this only because it's not very clear to me whether you'll be providing an endpoint for v2, or whether people are now expected to run their own server, as the README in the repository suggests.

Other than that, thanks. It looks like a nice tool to have for teaching :)

Thank you for pointing that out, I've updated the Swagger file to make it work as expected.

Ah, ok. Great :)

Excellent. This goes straight into my list of nice APIs for student work and coding exercises. If there's any way to contribute or support, you should advertise it.

Thank you! Feel free to contribute to the main repository [1], using "buy me a coffee" in the FAQ, or GH Sponsors program [2] to support the server.

[1]: https://github.com/kinduff/dogapi.dog#contributing

[2]: https://github.com/sponsors/kinduff

It would be awesome to see that list you have!

You might also like:


Okay, when you said "dog API", I thought you had hooked up an RPC API to a voice assistant that says "sit", "come", "take", etc., and trained your dog to respond to it.

This is what I was hoping for - and of course there would be a cat option but it would simply return “not implemented” for everything.

Expectations. I was expecting an API for dogs to use. Bark once for a treat, put paw on human for some affection, scratch at the door to go out, etc.

402 Treats Required

I was expecting it to be dog slow.

I did the same thing but dumped it into a table. Small enough dataset that I can load the entire thing and just use jQuery to filter via a drop-down selector.

To make it useful I tried to filter on characteristics that a buyer would be interested in, such as long-haired vs. short-haired.

I also gathered the data manually. I'll have to compare datasets.


Cool project. Noob question here, the homepage mentions that V2 follows some kind of standard, and the API docs page mentions "JSON API schema", does that refer to this? https://jsonapi.org/format/

That is correct, it is referring to that standard.

The glory of “ERROR” documents as a concept!

JSONAPI is too complicated at first, then exactly as complicated as is useful over time.

I'm learning more about RESTful API design and was wondering why there is a generic "data" wrapper, "type: breed/group/facts", and generic "attributes" structure, rather than just returning an object designed for the specific type of data being described.

Take a look at the JSON:API specification https://jsonapi.org
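To make the envelope concrete, here is a minimal JSON:API-style document parsed in Python. The `breed` type and its attributes are illustrative, not necessarily the real Dog API payload; the point is that every resource carries the same `id` / `type` / `attributes` shape, so a generic client can walk any response without knowing the schema in advance:

```python
import json

# Illustrative JSON:API document (https://jsonapi.org/format/):
# resources live under "data", each with id, type, and attributes.
doc = json.loads("""
{
  "data": [
    {
      "id": "1",
      "type": "breed",
      "attributes": {"name": "Akita", "life": {"min": 10, "max": 13}}
    }
  ]
}
""")

# A schema-agnostic walk over the response envelope.
for resource in doc["data"]:
    print(resource["type"], resource["id"], resource["attributes"]["name"])
```

The uniform wrapper is exactly what enables tooling (pagination helpers, caches, client generators) to work across unrelated APIs.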

Kind of off-topic: I built a single-HTML-file website to fetch dog images from Flickr years ago for my girlfriend. Totally forgot about it until now. It still seems to be functioning.


I suppose the style sheet for the site could be named dog-style.css?

Wait - maybe not a good name after all.

Is the API aiming to be a high quality source of dog related facts? Or is the focus mostly on exposing an API for testing code and the content is not a priority?

Also: is the data open source? Can we contribute to that too, e.g. send a PR?

I've tried to curate the information as well as possible, but some facts are hard to confirm. Same for breed descriptions and other data.

The main goal is to expose a fun API, and then be as accurate as possible. I'm working on the latter.

This reminds me of the Capybara API.


(The other one seems to give a CDN error these days)

Really like the documentation; will try something similar in addition to the Swagger UI.

Will make a Datadog check for it in Python to pull data and display the results on a dashboard. Maybe also some alerting, if the breed count suddenly drops or something.

Ok I know it's not in vogue right now but this is actually a good use for the blockchain. Like the database here could just be placed on-chain in a smart contract, and anyone could read it (for free, only writes would cost money) forever, as long as at least one person is operating a node on ethereum. No ongoing hosting or serving costs. Anyone could build an API that did like photo generation or whatever on top of it, and builders could always access the raw data directly.

You seem to be describing something like a classic P2P content-distribution system - BitTorrent w/ magnet links, IPFS, CoralCDN, etc. Could you explain a bit more what part of your proposal would be incompatible with those existing systems?

Not too familiar with CoralCDN, will check that out. I'd say the comparison with IPFS is apt. BitTorrent, though, requires people to seed on a per-file basis, right? So versioning for the source of truth becomes tough, and you need people to care about your data specifically.

IPFS is an alternate solution, but as addresses are not updatable someone somewhere needs to keep track of where the canonical source is. Further, there is no API built in - the data would live at some address, but someone still needs to construct a way to get it without just downloading the whole data dump at once. Perhaps you shard across different files but that has its own issues.

On chain with Ethereum, you can use a smart contract which allows CRUD operations on the data on a per item basis. The latest version of the smart contract data is clear. The API is basically set with the smart contract - infra to get that has already been built with libraries like ethers, and will continue to be maintained since there are other contracts/products out there that rely on the same infra.

The blockchain does not need to be thrown at every source of data store.

Sometimes a tiny lil' db is all you need to serve some nice cacheable data :)

I always use your API to place images when I need to show something! Thanks haha

Gotta love the commitment, I'm sure someone could make use of this

Is anyone surprised that there's a .dog top level domain? When did they add all these extra domains? Also where's the .god TLD?

> On June 20, 2011, ICANN's board voted to end most restrictions on the creation of generic top-level domain names (gTLDs) -- at which time 22 gTLDs were available. Companies and organizations would be able to choose essentially arbitrary top-level Internet domains.

> The initial price to apply for a new gTLD was $185,000. ICANN expected the new rules to significantly change the face of the internet. Industry analysts predicted 500–1000 new gTLDs, mostly reflecting names of companies and products, but also cities, and generic names like bank and sport.

> …Esther Dyson, the founding chairwoman of ICANN, wrote that the expansion "will create jobs [for lawyers, marketers and others] but little extra value."


How big is the overall dataset?

Why does this need to be an API? Why not just provide a file with the relevant information?

I wanted to build this as an API. What are the advantages - learning wise - of using a file with the relevant information?

The implementation of dogdump (a Dog API to SQLite scraper) left as an exercise to the reader ;)

Learning-wise there is no advantage, but for it to be actually useful for people, that's a better option imo.

You could bulk import it into your own database and run arbitrary queries locally.

Of course if you want to use this as a learning project for writing and using APIs... that takes precedence ;-)

Excellent API.


Personally lost interest as soon as I saw the awful AI "art" at the linked page. Not expecting everyone else to have the same opinion, but mine is mine.

"Before you speak ask yourself if what you are going to say is true, is kind, is necessary, is helpful. If the answer is no, maybe what you are about to say should be left unsaid."
