Show HN: Model2Vec: make sentence transformers 500x faster on CPU, 15x smaller

stephantul

1d ago

github.com

8

tuanmount2

I saw another repo that uses weights from Llama 3 to construct an embedding model. I think the use case would be to search with this small model first and then re-rank the results with a bigger model. So the question is how this approach compares to BM25.
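
To make the comparison concrete, here is a minimal sketch of the retrieve-then-rerank setup described above, with BM25 as the cheap first stage. Everything in it is an assumption for illustration: the `BM25` class is a small hand-written Okapi BM25 (not a library), `overlap_reranker` is a hypothetical stand-in for a larger model, and the three documents are made up. A static embedding model could in principle be dropped in as `first_stage` by supplying a scorer with the same `(query, doc_index)` signature, which is what a head-to-head comparison with BM25 would need.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would do better.
    return text.lower().split()


class BM25:
    """Small Okapi BM25 scorer over an in-memory list of documents."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        tokens = [tokenize(d) for d in docs]
        self.tf = [Counter(t) for t in tokens]
        self.doc_len = [len(t) for t in tokens]
        self.avgdl = sum(self.doc_len) / len(docs)
        n = len(docs)
        # Document frequency: number of documents containing each term.
        df = Counter(term for t in tokens for term in set(t))
        self.idf = {
            term: math.log(1 + (n - f + 0.5) / (f + 0.5))
            for term, f in df.items()
        }

    def score(self, query, i):
        total = 0.0
        for term in tokenize(query):
            f = self.tf[i].get(term, 0)
            if f == 0:
                continue
            length_norm = 1 - self.b + self.b * self.doc_len[i] / self.avgdl
            total += self.idf[term] * f * (self.k1 + 1) / (f + self.k1 * length_norm)
        return total


def overlap_reranker(query, doc):
    # Hypothetical placeholder for an expensive model: Jaccard token overlap.
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q | d)


def retrieve_then_rerank(query, docs, first_stage, reranker, k=2):
    # Stage 1: cheap scorer over all documents, keep the top k indices.
    candidates = sorted(
        range(len(docs)), key=lambda i: first_stage(query, i), reverse=True
    )[:k]
    # Stage 2: expensive scorer over the k candidates only.
    return sorted(candidates, key=lambda i: reranker(query, docs[i]), reverse=True)


docs = [
    "static embeddings are fast on cpu",
    "bm25 is a strong lexical baseline for search",
    "large cross encoders re-rank search results",
]
bm25 = BM25(docs)
ranked = retrieve_then_rerank("fast search on cpu", docs, bm25.score, overlap_reranker)
print(ranked)
```

The point of the split is that the expensive scorer only sees `k` candidates, so the first stage mostly has to be fast and have good recall. Whether a small static embedding model beats BM25 on recall is an empirical question this sketch does not answer.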