notsylver
I was planning to do this myself lol. I was going to use SQLite as the index, and use `sqlite-vec` or something similar to query for similar files directly. The only other thing I was planning was more filters: `"positive term" -"negative term"` to negate results, `>90"search"` to find images that match by more than 90%, and some generic filters like `--size >1mb` to help narrow things down when you are looking for a specific image. Quantizing embeddings to make them smaller/faster also seemed interesting, but I haven't tried doing it yet.
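
For the quantization idea, here is a rough sketch of what it could look like, assuming simple symmetric int8 scaling. The function names are made up for illustration, and nothing here is taken from the project or from `sqlite-vec`:

```python
# Hypothetical sketch: symmetric int8 quantization of an embedding.
# Each float becomes an integer in [-127, 127], plus one float scale per vector.

def quantize_int8(vec):
    """Return (quantized integers, scale) for a list of floats."""
    peak = max((abs(x) for x in vec), default=0.0)
    scale = peak / 127 if peak else 1.0  # avoid dividing by zero for all-zero vectors
    quantized = [max(-127, min(127, round(x / scale))) for x in vec]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the quantized integers."""
    return [v * scale for v in quantized]


def dot_int8(qa, scale_a, qb, scale_b):
    """Approximate the dot product of the two original vectors."""
    return sum(a * b for a, b in zip(qa, qb)) * scale_a * scale_b


embedding = [0.12, -0.80, 0.33, 0.05]  # toy values, not real CLIP output
q, scale = quantize_int8(embedding)
restored = dequantize(q, scale)
```

Each value then fits in one byte instead of the four a float32 needs, at the cost of a rounding error of at most half a scale step per component. Whether search quality holds up would need testing on real embeddings.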
progx
It uses only one core at 100% under Linux. Can this be changed?

Indexing 10 images, each about 20 KB, took more than 10 minutes. Is that normal without GPU acceleration?

spullara
Very cool! Here is a similar Python version.

https://github.com/spullara/photoindex

Oh, and if you want to run something locally on your iPhone, you can use my app, which I am still testing:

https://x.com/getrememberwhen

sureIy
This is cool. Is there also a way to show the contents of each image as indexed, e.g. image 1 has a cat and a dog?

There are a lot of tools/apps that let you “search images” but not many that let you just as easily “read images”.

kjeldsendk
I have wanted to clean up my photo collection for ages and remove any NSFW pictures that might be hiding somewhere.

Would this be able to do that, and how likely is it to see a PC release?

petesergeant
I've been enjoying https://github.com/mazzzystar/Queryable on iPhone
y04nn
How does CLIP compare to YOLO[1]? I haven't looked into image classification/object recognition for a while, but I remember that YOLO was quite good and worked on real-time video too.

[1]: https://pjreddie.com/darknet/yolo/

yburkov
netdur
I have made a similar Android app for semantic image search. It works offline too. I am still gathering feedback and polishing the UI, but it works. If you are brave enough, here it is: https://drive.google.com/file/d/1tE0cY6umj5h5zCY_Jvaou1M8sCf...
ivanjermakov
In Russian, "sisi" is a variation of "tits".

Are there jobs/services that confirm that branding is appropriate across different languages? Seems like a nontrivial problem to solve.

Jack5500
Isn't CLIP superseded by multimodal LLMs?