
Photo Metadata and Search on MacOS - jjwiseman
https://28mm.github.io/notes/osx-photo-search
======
haikuginger
Your script has what looks to be a pretty big inefficiency here[1] that's
slowing it down. Looping over a set kills the constant-time presence-checking
that you get from using that data structure; it will likely be much faster to
do something like this:

    
    
        tags = list(set(descr.split()) & categories)
    

The following would also work:

    
    
        tags = [x for x in descr.split() if x in categories]
    
    

[1] [https://gist.github.com/28mm/9820bd8b6eb27555efe9d6f46dd95a8...](https://gist.github.com/28mm/9820bd8b6eb27555efe9d6f46dd95a81#file-gistfile1-txt-L97)
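
The two suggestions above are not quite interchangeable. The following is a minimal sketch of the difference, assuming `categories` is a `set` of single-word tags; the sample data is hypothetical and not taken from the script.

```python
# Hypothetical sample data; the real script's categories and descriptions differ.
categories = {"beach", "dog", "sunset"}
descr = "dog running on a beach with another dog"

# Set intersection: each matching tag appears once, in no guaranteed order.
tags_from_set = set(descr.split()) & categories

# List comprehension: keeps the order of the description and any repeats.
# Each "word in categories" test is a constant-time set lookup on average.
tags_from_list = [word for word in descr.split() if word in categories]

print(sorted(tags_from_set))
print(tags_from_list)
```

If duplicates or ordering matter downstream, the list comprehension is probably the safer choice; otherwise the intersection is the shorter one.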

~~~
28mm
Ah, interesting observation. I’ll look at changing it to something more like
what you’ve suggested.

If memory serves, the reason it doesn’t first split the description on white
space is that some categories contain whitespace, and would never match.

Thanks!

~~~
haikuginger
If you need to check against multiword tags, I'd suggest a utility function to
expand a list of words into each possible one-or-more-word subset. Should
still be substantially faster than the current state, and you can improve it
even more by limiting it to phrases with no more words than the tag with the
maximum number of words.

    
    
        def get_all_phrases(descr):
            words = descr.split()
            if len(words) == 1:
                return words
            phrases = []
            for i in range(2, len(words) + 1):
                phrases += get_phrases_of_len(i, words)
            return words + phrases
    
        def get_phrases_of_len(length, words):
            return [' '.join(words[i:i+length]) for i in range((len(words) - length) + 1)]
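
The length limit mentioned above could look roughly like the sketch below. It assumes categories are separated by single spaces, and the names `phrases_up_to` and `match_tags` are hypothetical helpers introduced here for illustration, not part of the original script.

```python
def phrases_up_to(words, max_words):
    """Return every run of 1..max_words consecutive words, joined by spaces."""
    phrases = []
    for length in range(1, min(max_words, len(words)) + 1):
        for start in range(len(words) - length + 1):
            phrases.append(" ".join(words[start:start + length]))
    return phrases


def match_tags(descr, categories):
    """Return the categories that occur as whole-word phrases in descr."""
    if not categories:
        return set()
    # The longest category bounds how long a candidate phrase needs to be.
    max_words = max(len(category.split()) for category in categories)
    return set(phrases_up_to(descr.split(), max_words)) & set(categories)


# Hypothetical sample data for illustration.
categories = {"golden retriever", "beach", "hot air balloon"}
found = match_tags("a golden retriever on the beach", categories)
print(sorted(found))
```

With the cap in place, the number of candidate phrases grows roughly with the description length times the longest tag length, rather than with the square of the description length.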

------
saagarjha
> Undoubtedly, launchd can be told to disable photolibraryd, but the approved
> mechanism wasn’t immediately obvious to me.

You want "launchctl unload
/System/Library/LaunchAgents/com.apple.photolibraryd.plist".

~~~
28mm
Author here. That is vastly better, thanks for pointing it out.

~~~
dunham
Why not just copy the database (and wal) to tmp before accessing it?

~~~
28mm
Lack of familiarity with SQLite, and maybe some mistaken assumptions about its
behavior :)

~~~
dunham
I often copy first because I'm afraid that my locking the database is going to
interfere with the app. For my Apple Notes extraction, though, I am hitting the
database directly.
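
A copy-first approach might be sketched as follows. This is an illustration on a throwaway database, not the Photos library; `copy_sqlite_db` is a hypothetical helper, and it assumes the usual SQLite convention that the write-ahead log sits next to the database with a `-wal` suffix. A copy taken while another process is writing is not guaranteed to be a consistent snapshot.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def copy_sqlite_db(db_path, dest_dir):
    """Copy a SQLite database and its -wal file (if present) into dest_dir."""
    db_path = Path(db_path)
    dest_dir = Path(dest_dir)
    for suffix in ("", "-wal"):
        source = db_path.with_name(db_path.name + suffix)
        if source.exists():
            shutil.copy2(source, dest_dir / source.name)
    return dest_dir / db_path.name


# Demonstration on a throwaway database.
with tempfile.TemporaryDirectory() as work:
    work = Path(work)
    original = work / "original.db"
    conn = sqlite3.connect(original)
    conn.execute("CREATE TABLE photos (name TEXT)")
    conn.execute("INSERT INTO photos VALUES ('IMG_0001.jpg')")
    conn.commit()
    conn.close()

    copy_dir = work / "copy"
    copy_dir.mkdir()
    copied = copy_sqlite_db(original, copy_dir)

    # Queries now run against the copy, so the original is never locked.
    conn = sqlite3.connect(copied)
    rows = conn.execute("SELECT name FROM photos").fetchall()
    conn.close()

print(rows)
```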

------
kennethfriedman
It's also absolute madness that the native Photos app on iOS will not search
the metadata of the photos, only the Apple-restricted "scene recognition"
objects and dates.

Add a description to a photo, search for that description in Photos for iOS,
and there will be no results.

~~~
aikinai
Apple’s Photos team seems to be actively working against metadata support. It
used to respect EXIF/IPTC information on import, but High Sierra, for example,
completely ignores all metadata dates and sets the time to the file's
modification time. ಠ_ಠ

~~~
beamatronic
I wish there was an open source Photos/Picasa type of app for power users.
Something that could create and export indexes in many formats.

~~~
woolvalley
There are some, such as darktable, but they don't work well.

Out of all the photo library management apps, Apple Photos is by far the most
performant.

------
reaperducer
This is a major shortcoming of MacOS to those of us who use it heavily for
photography. Or even those with large cameraphone libraries.

The data is there. It should be available to all applications!

------
Someone
_“Instead, I opted for an egregious hack. In order to lock its database,
photoslibraryd needs to be able to write to it. I simply removed write
permissions. […] Now, it’s possible to open $photodb, and poke around.”_

If photolibraryd has that file open permanently for exclusive access, I think
that shouldn’t work. Either photolibraryd keeps its write access and nobody else
can open the file while it has it open, or you run the risk of database
corruption (probably recoverable) when photolibraryd can only write half of what
it wants to write. (What happens in that case is, I think, implementation-defined
because of the existence of NFS.)

I also think it is unlikely that photolibraryd has code that checks whether the
file has become read-only, so apparently photolibraryd periodically reopens that
file for writing. If so, one should be able to race it to open the file
exclusively.

Also, I’m curious. Photos.app must have a way to open that database for
writing. How does it do that? Does it stop photolibraryd? Send it a signal?

~~~
soneil
Doesn't it seem more likely Photos.app simply communicates with photolibraryd?

------
giobox
This is a clever idea. I had just assumed that the photoanalysisd tags were
indexed by Spotlight already!

------
woolvalley
I hope he files a radar about this :)

