InfraRed 100, Vector Datalake, Lance + Mosaic Streaming Dataset

🔥 LanceDB Makes the 2024 InfraRed 100 List 🔥

The InfraRed 100 list is updated every year and features the most impactful and fastest-growing private infrastructure companies! LanceDB is honored to be recognized on this respected list as one of the early-stage (seed/Series A) startups.

🤝 LanceDB Joins the Open Unity Catalog Ecosystem 🤝

Databricks announced the open-source Unity Catalog. LanceDB is proud to be one of the Data & AI platform ecosystem partners out of the gate, as we share the vision of open access to data across platforms and vendors. We are working on support for Lance datasets as one of the default volumes.

Community contributions

💡
Spark-Lance: Spark-Lance is cooking! This is another integration the community has been asking for, and we are getting it done (https://github.com/lancedb/lance/pull/2500). For Spark users who are interested in building a production-scale private RAG pipeline with LanceDB, check out the recording of LanceDB’s talk at the Data+AI Summit for some inspiration!
💡
A heartfelt thank you to our community contributors to lance and lancedb this month: @NickDarvey @heiher @LuQQiu @joshua-auchincloss @josca42 @beinan @harsha-mangena @paulwalsh-sonrai @paulrinaldi

Good reads

Last month, we shared two deep-dive posts from our columnar file readers series. We are bringing you another one this month, along with a comprehensive guide to choosing the right vector search system and our own recommendation for a modular AI software development stack.

Event recap

Running Trino SQL on a vector data lake powered by Lance

Supercharging AI Training with Mosaic Streaming and Lance Columnar Format

Latest releases

Rust and JS users no longer need to manually generate embeddings as both languages now support the embedding registry!