this post was submitted on 05 Sep 2023

InternetIsBeautiful


GIF demo

Source: Wikipedia search-by-vibes through millions of pages offline | Little Short Bulletins

RSS Feed: https://www.leebutterman.com/feed.xml

What is this?


This is a browser-based search engine for Wikipedia, where you can search for “the reddish tall trees on the san francisco coast” and find results like “Sequoia sempervirens” (the scientific name of a redwood tree). The browser downloads the database, and search happens offline. The database of two million Wikipedia pages with their titles is a roughly 100MB download, and final results appear in under 50 milliseconds. This uses sentence transformers to embed documents, product quantization to compress the embeddings, pq.js to run the distance computation in the browser, and transformers.js to run the sentence transformer in the browser for queries.
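For anyone wondering how two million pages fit in roughly 100MB (about 50 bytes per page on average, titles included), here is a minimal sketch of the product quantization idea in plain Python. It is not the code from pq.js: the helper names (`split`, `encode`, `distance_table`, `approx_distance`) are made up for illustration, the vectors are random toy data, and the centroids are simply copied from the first few documents, where a real setup would train them with k-means on actual sentence embeddings.

```python
import random

def split(vec, m):
    """Split a vector into m equal-length subvectors."""
    step = len(vec) // m
    return [vec[i * step:(i + 1) * step] for i in range(m)]

def sqdist(a, b):
    """Squared Euclidean distance between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode(vec, codebooks):
    """Compress a vector to one small centroid index per subspace."""
    subs = split(vec, len(codebooks))
    return [min(range(len(book)), key=lambda k: sqdist(sub, book[k]))
            for sub, book in zip(subs, codebooks)]

def distance_table(query, codebooks):
    """Precompute query-to-centroid distances once per query."""
    subs = split(query, len(codebooks))
    return [[sqdist(sub, c) for c in book] for sub, book in zip(subs, codebooks)]

def approx_distance(code, table):
    """Approximate distance to a compressed document: one lookup per subspace."""
    return sum(table[j][k] for j, k in enumerate(code))

# Toy data (assumption): 200 random 8-dimensional "embeddings".
rng = random.Random(0)
DIM, M, K = 8, 4, 16  # vector size, number of subspaces, centroids per subspace
docs = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(200)]

# Stand-in for k-means training (assumption): reuse the first K documents'
# subvectors as the centroids of each subspace.
codebooks = [[split(d, M)[j] for d in docs[:K]] for j in range(M)]

# Each document is now M small integers instead of DIM floats.
codes = [encode(d, codebooks) for d in docs]

# Search: the query stays uncompressed, the documents stay compressed.
query = docs[0]
table = distance_table(query, codebooks)
best = min(range(len(docs)), key=lambda i: approx_distance(codes[i], table))
```

The point of the table is that scoring a document costs only `M` lookups and additions, with no floating-point vector for the document ever being reconstructed, which is presumably what makes scanning millions of entries in a browser feasible.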

Is this good?

Yes.

Search over millions of documents happens in real time, completely offline. Results stream back every 10ms on a mobile device, and the result list updates gradually as the database is scanned sequentially.
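The "results update gradually" behavior can be pictured as a sequential scan that reports its best matches so far after every chunk. The sketch below is only an illustration of that idea, not the site's implementation: `streaming_search`, the chunk size, and the toy distances are all assumptions.

```python
import heapq

def streaming_search(scores, k, chunk_size):
    """Scan (doc_id, distance) pairs in order and yield the k closest
    documents seen so far after each chunk."""
    heap = []  # stores (-distance, doc_id); heap[0] is the worst match kept
    for start in range(0, len(scores), chunk_size):
        for doc_id, dist in scores[start:start + chunk_size]:
            item = (-dist, doc_id)
            if len(heap) < k:
                heapq.heappush(heap, item)
            elif item > heap[0]:
                # Closer than the worst match currently kept: swap it in.
                heapq.heapreplace(heap, item)
        # Snapshot of the current top k, closest first.
        yield sorted((-neg, doc_id) for neg, doc_id in heap)

# Toy distances (assumption): doc 5 is the closest, doc 3 the runner-up.
scores = [(0, 5.0), (1, 3.0), (2, 4.0), (3, 1.0), (4, 2.0), (5, 0.5)]
snapshots = list(streaming_search(scores, k=2, chunk_size=2))
```

Each snapshot is what a page could render immediately, so early results show up after the first chunk and get replaced as better matches turn up later in the scan.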

Via Simon Willison’s Weblog: Wikipedia search-by-vibes through millions of pages offline

RSS Feed: https://simonwillison.net/atom/everything/
