Tokens Per Second is NOT All You Need. Goodhart’s Law: when a measure becomes a target, it ceases to be a good measure. We're excited to have a guest post
Hybrid search, New OpenAI Embedding Models, Multimodal RAG for Video Processing Highlights Hybrid search with custom reranking (included in LanceDB Python version 0.6.0 release) * Explore the potential of reranking to enhance retrieval quality and
Hybrid search: RAG for real-life production-grade applications by Mahesh Deshwal What is Hybrid Search, and what’s the need for it? With the increasing usage of LLMs in RAG settings, there’s
LanceDB Community News — January 2024 We’re kicking off 2024 with a new LanceDB community newsletter to showcase all the updates in the LanceDB ecosystem, news, blogs, and important links.
Substrait Powered Filter Pushdown in Lance by Weston Pace Filter pushdown is one of the more fundamental optimizations in any data engineering pipeline. The premise is simple: the earlier you filter
LanceDB + Polars: A (near) perfect match. A spiritual successor to pandas, Polars is a new blazing-fast DataFrame library for Python written in Rust. At LanceDB, we
Efficient RAG with Compression and Filtering by Kaushal Choudhary Why Contextual Compressors and Filters? RAG (Retrieval Augmented Generation) is a technique that helps add additional data sources to our existing LLM
Using column statistics to make Lance scans 30x faster by Will Jones In Lance v0.8.21, we introduced column statistics and statistics-based page pruning. This enhancement reduces the number of IO calls needed
Benchmarking LanceDB I came upon a blog post yesterday benchmarking LanceDB. The numbers looked very surprising to me, so I decided to do a quick investigation on
Modified RAG: Parent Document & Bigger chunk Retriever by Mahesh Deshwal If you’re interested in modifying and improving the retrieval accuracy of RAG pipelines, you should check out the re-ranking post. What’s it
Multi-Lingual Search With Cohere and LanceDB by Kaushal Choudhary Overview Cohere provides a multilingual embedding model that promises to cross the language barriers of language models, which were predominantly based on
Multi-Modal AI made easy with LanceDB & CLIP by Kaushal Choudhary One of the most exciting areas of research in deep learning currently is multi-modality and its applications. Kick-started by the open sourcing of