core
Building blocks for thedu
Introduction
Getting documents processed and ready for search often means jumping through a lot of hoops. litesearch aims to make this as easy as possible by providing simple building blocks for setting up a database with FTS5 and vector search capabilities.
Database.query
Database.query (sql:str, params:Union[Iterable,dict,NoneType]=None)
Execute a query and return results as a list of AttrDict
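The behaviour described above can be approximated with the standard library alone. The sketch below is not litesearch's implementation: `AttrDict` and `query` here are hypothetical stand-ins built on `sqlite3`, and the `docs` table is invented for illustration. It only shows what "results as a list of AttrDict" looks like from the caller's side.

```python
import sqlite3

class AttrDict(dict):
    """Dict whose keys can also be read as attributes (illustrative stand-in)."""
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError as e:
            raise AttributeError(name) from e

def query(conn, sql, params=None):
    """Run `sql` with optional positional or named params; return a list of AttrDict rows."""
    cur = conn.execute(sql, params if params is not None else ())
    # cursor.description is None for statements that return no rows
    cols = [d[0] for d in cur.description or []]
    return [AttrDict(zip(cols, row)) for row in cur.fetchall()]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO docs (title) VALUES (?)", [("alpha",), ("beta",)])

# Named parameters are passed as a dict, positional ones as a tuple or list
rows = query(conn, "SELECT id, title FROM docs WHERE title = :t", {"t": "beta"})
```

Each row supports both `rows[0]["title"]` and `rows[0].title`.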
Simple Docs table setup
Database.mk_store
Database.mk_store (name:str='content', **kw)
Create a SQL table for content storage with FTS5 and vector search capabilities
| | Type | Default | Details |
|---|---|---|---|
| name | str | content | table name |
| kw | VAR_KEYWORD | | |
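To give a feel for what a content store with full-text indexing involves, here is a minimal sketch using only `sqlite3` from the standard library. The schema is an assumption made for illustration (the `body` and `embedding` columns, the `content_fts` index and the `content_ai` trigger are all hypothetical names); the table that `mk_store` actually creates may differ. It assumes your SQLite build includes FTS5, which most do.

```python
import sqlite3
from array import array

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE content (
    id INTEGER PRIMARY KEY,
    body TEXT,
    embedding BLOB            -- raw float32 bytes
);
-- External-content FTS5 index over content.body
CREATE VIRTUAL TABLE content_fts USING fts5(
    body, content='content', content_rowid='id'
);
-- Keep the index in sync on insert
CREATE TRIGGER content_ai AFTER INSERT ON content BEGIN
    INSERT INTO content_fts(rowid, body) VALUES (new.id, new.body);
END;
""")

def to_blob(vec):
    """Pack a list of floats into float32 bytes for storage in a BLOB column."""
    return array("f", vec).tobytes()

docs = [
    ("SQLite ships with the FTS5 full-text extension", [0.1, 0.2, 0.3]),
    ("Vector search compares embeddings", [0.9, 0.1, 0.0]),
    ("Full-text and vector search can be combined", [0.5, 0.5, 0.1]),
]
conn.executemany(
    "INSERT INTO content (body, embedding) VALUES (?, ?)",
    [(body, to_blob(vec)) for body, vec in docs],
)

# Full-text lookup, best matches first
hit_ids = [r[0] for r in conn.execute(
    "SELECT rowid FROM content_fts WHERE content_fts MATCH ? ORDER BY rank",
    ("vector",),
)]
blob = conn.execute("SELECT embedding FROM content WHERE id = 1").fetchone()[0]
```

Storing embeddings as raw bytes matches the `emb:bytes` parameter that `Database.search` expects, although the exact encoding litesearch uses is not stated here.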
setup_db
setup_db (pth_or_uri:str=':memory:', wal:bool=True, sem_search:bool=True, **kw)
Set up a database connection and load the usearch extensions. See the usearch docs on SQLite plugins: https://unum-cloud.github.io/USearch/sqlite/index.html
| | Type | Default | Details |
|---|---|---|---|
| pth_or_uri | str | :memory: | the database name or URL |
| wal | bool | True | use WAL mode |
| sem_search | bool | True | enable usearch extensions |
| kw | VAR_KEYWORD | | additional args to pass to apswutils database |
| Returns | Database | | |
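The `wal` flag can be illustrated with plain `sqlite3`. This is a rough sketch, not `setup_db` itself: `open_db` and `journal_mode` are hypothetical helpers, and loading the usearch extension is left out because it needs the downloaded binary. Note that WAL only applies to file-backed databases; an in-memory database reports the `memory` journal mode regardless.

```python
import os
import sqlite3
import tempfile

def open_db(path=":memory:", wal=True):
    """Open a SQLite connection and optionally switch it to WAL journaling."""
    conn = sqlite3.connect(path)
    if wal:
        # The pragma returns one row holding the journal mode now in effect
        conn.execute("PRAGMA journal_mode=WAL").fetchone()
    return conn

def journal_mode(conn):
    """Return the journal mode currently used by this connection."""
    return conn.execute("PRAGMA journal_mode").fetchone()[0]

# File-backed database: WAL is applied
with tempfile.TemporaryDirectory() as tmp:
    conn = open_db(os.path.join(tmp, "docs.db"))
    file_mode = journal_mode(conn)
    conn.close()

# In-memory database: the request is ignored
mem = open_db()
mem_mode = journal_mode(mem)
mem.close()
```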
Database.search
Database.search (q:str, emb:bytes, columns:list=None, where:str=None, where_args:dict=None, lim=50, tbl='content', rrf=True)
| | Type | Default | Details |
|---|---|---|---|
| q | str | | query string |
| emb | bytes | | embedding vector |
| columns | list | None | columns to return |
| where | str | None | additional where clause |
| where_args | dict | None | args for where clause |
| lim | int | 50 | limit on number of results |
| tbl | str | content | table name |
| rrf | bool | True | whether to rerank results with reciprocal rank fusion |
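Reciprocal rank fusion merges several ranked lists by giving each document a score of `1 / (k + rank)` per list and summing. The sketch below shows the general technique in pure Python; it is not taken from litesearch, the function name `rrf_fuse` is hypothetical, and `k=60` is the commonly used constant rather than a value confirmed for this library.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of ids with reciprocal rank fusion.

    rankings: iterable of lists, each ordered best-first.
    Returns ids ordered by descending fused score.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fts_hits = [3, 1, 2]   # ids ranked by full-text relevance (made-up data)
vec_hits = [1, 4, 3]   # ids ranked by vector similarity (made-up data)

fused = rrf_fuse([fts_hits, vec_hits])
# → [1, 3, 4, 2]
```

Documents that appear near the top of both lists (1 and 3 here) rise above those found by only one retriever, which is why the approach suits combining FTS5 and vector results.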