We've been working on the product for over a year and we are just out of beta today! One of CC's goals is to encourage the use and remixing of CC-licensed content, and we hope that CC Search will help make that content more discoverable. The current version is very much an MVP and only searches images, but we plan to add more content types in the future and index the ~1.4 billion works out there under a CC license. We would love any feedback you might have.
Also, CC Search, the associated API, and the scripts we use to index data are all open-source and developed completely openly. Our sprints and roadmap are public and we welcome contributions from the community.
CC Search code: https://github.com/creativecommons/cccatalog-frontend/
CC Catalog API code: https://github.com/creativecommons/cccatalog-api/
CC Catalog code: https://github.com/creativecommons/cccatalog/
2019 Vision: https://creativecommons.org/2019/03/19/cc-search/
Active sprint: https://github.com/orgs/creativecommons/projects/7
How to contribute to CC projects: https://creativecommons.github.io/contributing-code/
Encouraging use of ccREL embedding like aldenpage mentioned in another response is also in our long-term vision since that would require no additional actions from the creators.
An annual transparency report is probably mandatory with that process, like these ones:
In my mind, the dream is to have the user embed an asset from our servers on their web page (like an updated version of these old CC license buttons), read the referrer headers from the server logs, and then dispatch crawlers that read ccREL data embedded on the page, which would allow us to instantly index content as soon as it is published. Performing broad web crawls for ccREL data is also possible but probably not what we're looking to do in the near term.
We have a ways to go before we are able to do this, since there's no easy way for end users to create and embed ccREL at the moment, and there are of course lots of other unanswered questions about how we would moderate incorrect attribution, how these tools might be abused, etc.
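To make the referrer idea above a bit more concrete, here is a rough Python sketch of the two halves: pulling referring pages out of server logs, and finding license links in a page's HTML. It is only an illustration under assumptions, not how CC Search works: the log format (Apache/nginx "combined" style), the asset path prefix, and the function names are all hypothetical, and it only looks for rel="license" links, which is the most basic ccREL marker rather than the full RDFa vocabulary. Fetching the referring pages is left out.

```python
import re
from html.parser import HTMLParser

# Assumption: access logs use the common "combined" format, e.g.
# ... "GET /path HTTP/1.1" 200 1234 "https://referrer.example/page" "User-Agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "(?P<referrer>[^"]*)"'
)


def referrers_for_asset(log_lines, asset_prefix):
    """Return unique referrer URLs for requests whose path starts with asset_prefix."""
    referrers = []
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue
        referrer = match.group("referrer")
        if not match.group("path").startswith(asset_prefix):
            continue
        # "-" (or an empty string) means the browser sent no referrer.
        if referrer in ("", "-") or referrer in referrers:
            continue
        referrers.append(referrer)
    return referrers


class LicenseLinkParser(HTMLParser):
    """Collect href values of <a>/<link> elements whose rel includes "license"."""

    def __init__(self):
        super().__init__()
        self.licenses = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "license" in rel_values and attributes.get("href"):
            self.licenses.append(attributes["href"])


def extract_licenses(html):
    """Return the license URLs declared in an HTML document."""
    parser = LicenseLinkParser()
    parser.feed(html)
    return parser.licenses
```

A crawler would call referrers_for_asset on new log lines, fetch each returned URL, and pass the response body to extract_licenses. Pages that return no license URL would be skipped rather than indexed.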
The two image filters I use most often there are image size (larger than) and orientation (portrait or landscape). If these were included here, it would be perfect.
It would be great if there were more search filters, especially size ("larger than") and date ("within x-y", "last 1 day/week/month/year").
And when you have filters anyway, filtering by EXIF data (geoposition, and typical photography details like camera model / aperture / focal length) would be pretty cool.
We recently set up a tech blog and we have upcoming posts that will talk about the architecture and decisions we made. Here's a post about the original proof-of-concept (we are no longer using that version): https://hackernoon.com/cc-search-developer-notes-and-reflect...
This search result for 'satellites' is a screenshot of a Google Maps result, which includes the copyright information for the imagery within the CC image [see the bottom right attribution text in the imagery].
How can that really be in CC?
How could we be sure that the images themselves are genuinely licensed, rather than someone having copied them from another source and slapped on the licence?
We don't yet have a way of dealing with content that the user licensed incorrectly. I think we could add a "report image" function that would help us identify and remove this type of content. There's a disclaimer if you scroll down that says "Verify at the source: Flickr" with an explanation of why we can't guarantee the license; maybe we should make that more prominent.
In the parallel situation here the 'selling' company [CC Search] is saying "this isn't stolen", and you're using it in good faith. Seems like your liability _should_ be zero. If CC Search also did due diligence then their liability should probably be zero too.
Media mega-corps probably see things differently so I imagine their copyright laws mightn't be as liberal.
Similar to the hypothetical eBay case: you're going to be asked to reimburse the owner (or return his property); whether you can collect from the seller is a different story.
I was recently trying to find music for a side project, and searching for CC-licensed music appears to have gone from somewhat onerous to very onerous on sites like Jamendo.com, Free Music Archive, etc. This is off topic, but if someone can point to a website with CC-licensed music where it’s possible to search based on license type (e.g. BY-NC-SA) I would love to know.
(Via https://news.ycombinator.com/item?id=19791801, which we merged into this thread.)
Similarly, if the default CC license were "NC" then I imagine many more shared images would be excluded from commercial use.
My suggestion is that people probably wouldn't mind modification of their CC images as a default.
I usually use CC-BY-SA.
Edit: looking afresh at the CC license material, it appears my understanding of the licenses is weak (e.g. see down-thread) and that the default does allow modification. Which makes it weirder that people would go out of their way to specify that their pretty poor quality images could only be used as-is and not modified.
We have images from a lot of other collections, such as the Met, Rijksmuseum, Behance, Thingiverse etc. Flickr has more images than any of them by a couple of orders of magnitude, though.
Thank you for this.
Terms of Service: https://api.creativecommons.engineering/terms_of_service.htm...
- Browser navigation doesn't work. Going back doesn't take you to your previous search query.
- Search seems to be solely keyword based? And keywords are kind of hit and miss on many images.
My wishlist would also include videos and music/audio, plus an API to access it all. I'm sure that's a big ask though.
Adding audio is planned for Q4. Video will likely be in 2020. We already have an API, although we're not encouraging use of it until we build a few more features in.
Thanks so much for your work @kgodey and to your team.
Same thing happened to me (no results) and that was my initial thought but, in my case, I had forgotten to check uMatrix. Some vital CDNs and subdomains were allowed but uMatrix was blocking api.creativecommons.engineering.
I've also seen the database :).
You may proceed.
1. It results in tracking and could violate the GDPR and other data privacy laws.
2. It is third-party content and might get blocked by ad / content blockers.
I suggest offering a default HTML source code without CC icons.