Python SDK

EngramClient is a lightweight HTTP client for a single Engram miner. It stores text, images, PDFs, URLs, and conversations, and it can query with metadata filters, retrieve by CID, delete, and list records. Text ingestion needs no extra dependencies; PDF ingestion requires pypdf.

Install

bash
pip install engram-subnet
# For PDF support
pip install engram-subnet pypdf

EngramClient

python
from engram.sdk import EngramClient

client = EngramClient(
    miner_url="http://72.62.2.34:8091",  # or use from_subnet() for auto-discovery
    timeout=30.0,
)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| miner_url | str | "http://127.0.0.1:8091" | Base URL of the miner's HTTP server |
| timeout | float | 30.0 | Request timeout in seconds |
| namespace | str \| None | None | Private collection name; enables encryption |
| namespace_key | str \| None | None | Secret key for the namespace (min 16 chars) |

from_subnet()

Auto-discovers the best available miner from the Bittensor metagraph. Probes the top miners by incentive score in parallel and returns a client pointed at the fastest responsive one.

python
# One line — no miner URL needed
client = EngramClient.from_subnet(netuid=450)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| netuid | int | 450 | Subnet UID to query |
| network | str | "finney" | Subtensor network: "finney", "test", or a ws:// endpoint |
| timeout | float | 30.0 | Timeout for the returned client |
| probe_timeout | float | 3.0 | Timeout for each health probe during discovery |
| top_n | int | 5 | Number of top miners to probe (picked by incentive rank) |

Note
Requires bittensor to be installed. Raises RuntimeError if no miners are reachable.
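
The discovery step described above can be pictured as a parallel health probe. The sketch below is not the SDK's implementation: `pick_fastest` and the `probe` callable are hypothetical names, and the probe is passed in so the example runs without a network or bittensor installed.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable, Optional


def pick_fastest(
    urls: Iterable[str],
    probe: Callable[[str], bool],
    max_workers: int = 5,
) -> Optional[str]:
    """Probe every URL in parallel and return the first one that responds.

    `probe` is a hypothetical health check: it returns True when the miner
    answers, and returns False or raises otherwise.
    """
    urls = list(urls)
    if not urls:
        return None
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(probe, url): url for url in urls}
        # as_completed yields probes in the order they finish, so the first
        # successful result belongs to the fastest responsive URL.
        for future in as_completed(futures):
            try:
                if future.result():
                    return futures[future]
            except Exception:
                continue  # unreachable miner: look at the next finished probe
    return None
```

This helper returns None when nothing answers, whereas from_subnet() is documented to raise RuntimeError in that case. Leaving the `with` block also waits for the slower probes to finish, so a real implementation would likely pair this with a per-probe timeout such as probe_timeout.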

Private namespaces

Pass namespace and namespace_key to store data in an encrypted, private collection. Text is encrypted with AES-256-GCM client-side before being sent to any miner.

python
private = EngramClient(
    "http://miner:8091",
    namespace="company-docs",
    namespace_key="your-secret-key-min-16-chars",
)
cid = private.ingest("Q4 revenue was $4.2M")  # encrypted before leaving your machine
results = private.query("revenue figures")  # decrypted client-side

See Private Namespaces for the full encryption spec and threat model.

ingest()

python
def ingest(self, text: str, metadata: dict | None = None) -> str: ...

Embed and store text on the miner. Returns a CID string.

python
cid = client.ingest(
    "BERT uses bidirectional encoder representations.",
    metadata={"source": "arxiv", "year": "2018"},
)
print(cid)  # v1::a3f2b1c4d5e6f7...

| Parameter | Type | Description |
| --- | --- | --- |
| text | str | Text to embed and store (max 8192 chars) |
| metadata | dict \| None | Optional key-value metadata (max 4 KB JSON) |

Raises: MinerOfflineError, IngestError, InvalidCIDError
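
The size limits in the parameter table can be checked locally before a request is sent. This is a minimal sketch, not part of the SDK: `check_ingest_payload` is a hypothetical helper, and reading "4 KB JSON" as 4096 bytes of UTF-8 encoded JSON is an assumption.

```python
import json
from typing import Optional

MAX_TEXT_CHARS = 8192          # limit quoted in the parameter table
MAX_METADATA_BYTES = 4 * 1024  # "4 KB JSON", read here as 4096 bytes (assumption)


def check_ingest_payload(text: str, metadata: Optional[dict] = None) -> None:
    """Raise ValueError if the payload exceeds the documented limits."""
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text is {len(text)} chars, limit is {MAX_TEXT_CHARS}")
    if metadata is not None:
        # Measure the metadata the way it would travel: as encoded JSON.
        size = len(json.dumps(metadata).encode("utf-8"))
        if size > MAX_METADATA_BYTES:
            raise ValueError(f"metadata is {size} bytes, limit is {MAX_METADATA_BYTES}")
```

Calling `check_ingest_payload(text, metadata)` before `client.ingest(text, metadata=metadata)` surfaces oversized payloads without a round trip; the miner's own validation remains authoritative.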

ingest_image()

Describe an image with Grok Vision (xAI) and store the description as a searchable memory. The raw image bytes are never sent to the miner — only the AI-generated description is embedded and stored. A content_cid (SHA-256 of the image) is stored as metadata for integrity verification.

python
result = client.ingest_image(
    "photo.jpg",                    # path, or raw bytes
    xai_api_key="xai-...",          # get one at console.x.ai
    metadata={"user_id": "u_123"},  # optional extra metadata
)
print(result["cid"])          # v1::a3f2b1... (use this for search)
print(result["description"])  # "A photograph of a whiteboard showing..."
print(result["content_cid"])  # sha256:abc123... (integrity check)
print(result["filename"])     # "photo.jpg"

# Search by what's in the image later:
results = client.query("whiteboard diagram with architecture")

| Parameter | Type | Description |
| --- | --- | --- |
| source | str \| Path \| bytes | Image file path or raw bytes |
| xai_api_key | str | xAI API key for Grok Vision (required) |
| mime_type | str \| None | MIME type, e.g. "image/jpeg"; auto-detected from the extension if omitted |
| metadata | dict \| None | Optional extra metadata |

Returns: dict with cid, description, content_cid, filename
Raises: MinerOfflineError, IngestError, RuntimeError (Grok API failure)

Note
Get a free xAI API key at console.x.ai. Grok Vision supports JPEG, PNG, GIF, and WebP.
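
Because content_cid is described as the SHA-256 of the image, a local file can be checked against a stored record. The sketch below assumes the value is the literal prefix `sha256:` followed by the lowercase hex digest, which matches the example output but is not confirmed by the spec. Both helper names are hypothetical.

```python
import hashlib
import mimetypes
from typing import Optional


def content_cid(data: bytes) -> str:
    """Return 'sha256:<hex digest>' for raw bytes (assumed content_cid format)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def guess_image_mime(filename: str) -> Optional[str]:
    """Guess a MIME type from the file extension, e.g. 'photo.jpg' -> 'image/jpeg'."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime
```

To verify integrity, read the original file's bytes, compute `content_cid(...)`, and compare it with the `content_cid` returned by ingest_image().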

ingest_pdf()

Extract text from a PDF and store it as a searchable memory. Requires pypdf. Up to 8192 characters of the extracted text are embedded; the SHA-256 of the raw PDF is stored as content_cid.

bash
pip install pypdf

python
result = client.ingest_pdf(
    "research_paper.pdf",  # path, or raw bytes
    metadata={"category": "research"},
)
print(result["cid"])          # v1::...
print(result["pages"])        # 12
print(result["chars"])        # 48293
print(result["content_cid"])  # sha256:...

# Search the PDF content later:
results = client.query("transformer attention mechanism")

| Parameter | Type | Description |
| --- | --- | --- |
| source | str \| Path \| bytes | PDF file path or raw bytes |
| metadata | dict \| None | Optional extra metadata |

Returns: dict with cid, pages, chars, content_cid, filename
Raises: MinerOfflineError, IngestError, ImportError (pypdf missing), ValueError (image-only PDF)

Note
Image-only / scanned PDFs have no extractable text. Run OCR first (e.g. pytesseract) or use ingest_image() per page.
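
The example above reports 48293 extracted characters against an 8192-character embedding limit. If text past the limit is not embedded (an assumption; the page does not say how the excess is handled), one workaround is to extract the text yourself and ingest it in chunks. `split_text` below is a hypothetical helper, not an SDK function.

```python
def split_text(text: str, max_chars: int = 8192) -> list[str]:
    """Split text into chunks of at most max_chars, breaking at spaces when possible."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks = []
    remaining = text.strip()
    while len(remaining) > max_chars:
        # Look for the last space that keeps the chunk within the limit.
        cut = remaining.rfind(" ", 0, max_chars + 1)
        if cut <= 0:
            cut = max_chars  # no space available: fall back to a hard split
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each chunk could then go through `client.ingest(chunk, metadata={...})`, for instance with text obtained from pypdf's `PdfReader(path).pages` and `page.extract_text()`.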

ingest_url()

Fetch a web page, strip navigation and boilerplate, and store the readable text as a memory. SSRF protection is built in — private/loopback addresses are blocked.

python
result = client.ingest_url(
    "https://arxiv.org/abs/1706.03762",
    metadata={"category": "research"},
)
print(result["cid"])    # v1::...
print(result["title"])  # "Attention Is All You Need"
print(result["chars"])  # 6842
print(result["url"])    # final URL after redirects

# Search later:
results = client.query("transformer architecture paper")

| Parameter | Type | Description |
| --- | --- | --- |
| url | str | HTTP or HTTPS URL to fetch |
| metadata | dict \| None | Optional extra metadata merged with auto-extracted title/source |

Returns: dict with cid, url, title, chars
Raises: ValueError (invalid URL, private address), RuntimeError (fetch failure, no readable text)
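
To illustrate the kind of check SSRF protection involves, here is a minimal sketch using only the standard library. It is not the SDK's guard: `check_url` and `is_blocked_address` are hypothetical names, and the sketch only inspects literal IP hosts and `localhost`. A production guard would also resolve DNS names and re-check every redirect target.

```python
import ipaddress
from urllib.parse import urlparse


def is_blocked_address(host: str) -> bool:
    """True if host is 'localhost' or a literal non-public IP address."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; this sketch does no DNS resolution.
        return host.lower() == "localhost"
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_unspecified
    )


def check_url(url: str) -> str:
    """Return url unchanged if it looks safe to fetch, else raise ValueError."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("URL has no host")
    if is_blocked_address(parsed.hostname):
        raise ValueError(f"blocked address: {parsed.hostname}")
    return url
```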

ingest_conversation()

Store a conversation thread as individual turn memories. Each message is embedded separately so individual turns are semantically searchable. A shared session_id links them.

python
messages = [
    {"role": "user", "content": "What's the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "Tell me more about Paris."},
]
cids = client.ingest_conversation(
    messages,
    session_id="session_abc123",
    metadata={"user_id": "u_456"},
)
print(cids)
# ["v1::a3f2...", "v1::b2e8...", "v1::c9f4..."]

# Retrieve conversation turns later:
results = client.query("capital city France")
# Returns the turn that mentioned Paris

| Parameter | Type | Description |
| --- | --- | --- |
| messages | list[dict] | List of {"role": ..., "content": ...} dicts |
| session_id | str | Shared ID linking all turns; stored as metadata |
| metadata | dict \| None | Optional extra metadata added to every turn |

Returns: list of CID strings — one per message turn

Note
Filters empty messages automatically. Each stored record includes role, session_id, turn, and timestamp in its metadata.
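
The per-turn records described in the note can be pictured with the sketch below. It is an illustration rather than the SDK's code: `build_turn_records` is a hypothetical name, the timestamp is assumed to be Unix seconds, and numbering turns only over the messages that are kept is also an assumption.

```python
import time
from typing import Optional


def build_turn_records(
    messages: list[dict],
    session_id: str,
    metadata: Optional[dict] = None,
    now: Optional[float] = None,
) -> list[tuple[str, dict]]:
    """Return one (text, metadata) pair per non-empty message."""
    timestamp = time.time() if now is None else now
    records = []
    for message in messages:
        content = (message.get("content") or "").strip()
        if not content:
            continue  # empty messages are dropped
        turn_meta = dict(metadata or {})  # caller metadata is copied to every turn
        turn_meta.update(
            {
                "role": message.get("role", ""),
                "session_id": session_id,
                "turn": len(records),  # index among kept turns (assumption)
                "timestamp": timestamp,
            }
        )
        records.append((content, turn_meta))
    return records
```

Each pair corresponds to one `ingest(text, metadata=...)` call, which is why a query filtered on `session_id` can return individual turns.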

query()

python
def query(self, text: str, top_k: int = 10, filter: dict | None = None) -> list[dict]: ...

Semantic search over stored embeddings — works across text, images, PDFs, URLs, and conversation turns.

python
# Basic search
results = client.query("how does self-attention work?", top_k=10)
# [
#     {"cid": "v1::a3f2b1...", "score": 0.9821, "metadata": {"source": "arxiv"}},
#     {"cid": "v1::b2e8c1...", "score": 0.8847, "metadata": {"type": "url"}},
# ]

# Filter by metadata: AND semantics (all conditions must match)
results = client.query(
    "revenue figures",
    top_k=5,
    filter={"user_id": "u_123", "type": "text"},
)

# Only assistant turns from a specific session
turns = client.query(
    "Paris",
    filter={"session_id": "session_abc123", "role": "assistant"},
)

| Parameter | Type | Description |
| --- | --- | --- |
| text | str | Natural language query |
| top_k | int | Maximum results to return (default 10) |
| filter | dict \| None | AND-match metadata filter; all key/value pairs must match |
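
AND-match semantics mean a record is kept only when every key in the filter is present in its metadata with an equal value. A minimal sketch of that rule follows; `matches_filter` is a hypothetical name, and exact equality on values is an assumption about how the miner compares them.

```python
from typing import Optional


def matches_filter(metadata: dict, conditions: Optional[dict]) -> bool:
    """True if every key/value pair in conditions also appears in metadata."""
    if not conditions:
        return True  # no filter: everything matches
    return all(
        key in metadata and metadata[key] == value
        for key, value in conditions.items()
    )
```

The same check can be applied client-side to results from query() or list() when a narrower filter is needed after the fact.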

get()

Retrieve a stored record by its CID. Returns the metadata (not the raw embedding vector).

python
record = client.get("v1::a3f2b1c4d5e6f7...")
if record:
    print(record["cid"])       # v1::a3f2b1...
    print(record["metadata"])  # {"source": "arxiv", "title": "Attention Is All You Need"}
else:
    print("Not found")

Returns: dict with cid and metadata, or None if not found

delete()

Remove a stored record by its CID. The operation is idempotent: deleting a CID that no longer exists returns False instead of raising.

python
deleted = client.delete("v1::a3f2b1c4d5e6f7...")
print(deleted) # True if it existed, False if not found

Returns: bool. True if deleted, False if the CID was not found
Raises: MinerOfflineError

list()

List stored records with optional metadata filtering and pagination.

python
# All records (first page)
records = client.list(limit=50, offset=0)

# Filter by type
image_records = client.list(filter={"type": "image"})

# All memories for a user, paginated
page1 = client.list(filter={"user_id": "u_123"}, limit=20, offset=0)
page2 = client.list(filter={"user_id": "u_123"}, limit=20, offset=20)
for r in page1:
    print(r["cid"], r["metadata"].get("title", ""))

| Parameter | Type | Description |
| --- | --- | --- |
| filter | dict \| None | AND-match metadata filter |
| limit | int | Max records per page (default 50) |
| offset | int | Records to skip (default 0) |

Returns: list of dicts with cid and metadata
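
Walking every page by hand gets repetitive, so the limit/offset pattern can be wrapped in a generator. This is a sketch under one assumption: that an empty or short page marks the end of the collection. `iter_records` is a hypothetical helper that accepts any callable with the same keyword arguments as list().

```python
from typing import Callable, Iterator, Optional


def iter_records(
    list_page: Callable[..., list],
    filter: Optional[dict] = None,
    page_size: int = 50,
) -> Iterator[dict]:
    """Yield every record by requesting pages until one comes back short or empty."""
    offset = 0
    while True:
        page = list_page(filter=filter, limit=page_size, offset=offset)
        if not page:
            return
        yield from page
        if len(page) < page_size:
            return  # a short page means there is nothing left (assumption)
        offset += page_size
```

With a real client this would be called as `iter_records(client.list, filter={"user_id": "u_123"}, page_size=20)`.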

batch_ingest_file()

Ingest all records from a JSONL file. Each line must be a JSON object with a text key.

python
# corpus.jsonl format:
# {"text": "First entry"}
# {"text": "Second entry", "metadata": {"category": "ml"}}
cids = client.batch_ingest_file("corpus.jsonl")
print(f"Ingested {len(cids)} records")

# With error tracking
cids, errors = client.batch_ingest_file("corpus.jsonl", return_errors=True)
for err in errors:
    print(f"Skipped: {err}")
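
A file in the expected shape, one JSON object per line with a text key, can be produced with the standard library. `write_jsonl` is a hypothetical helper shown only to illustrate the format.

```python
import json
from typing import Iterable


def write_jsonl(path: str, records: Iterable[dict]) -> int:
    """Write one JSON object per line; every record must carry a 'text' key."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            if "text" not in record:
                raise ValueError(f"record {count} has no 'text' key")
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

For example, `write_jsonl("corpus.jsonl", [{"text": "First entry"}, {"text": "Second entry", "metadata": {"category": "ml"}}])` produces the two-line file shown in the comments above.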

health() / is_online()

python
# Check liveness; raises MinerOfflineError if unreachable
info = client.health()
# {"status": "ok", "vectors": 42156, "uid": 7}

# Safe check; never raises
if client.is_online():
    cid = client.ingest("...")

Multi-miner pattern

For redundancy, ingest the same text on several miners and keep going when one of them is offline.

python
from engram.sdk import EngramClient, MinerOfflineError

miners = [
    EngramClient("http://miner1:8091"),
    EngramClient("http://miner2:8091"),
    EngramClient("http://miner3:8091"),
]
cids = []
for miner in miners:
    try:
        cids.append(miner.ingest("Critical knowledge."))
    except MinerOfflineError:
        print(f"Miner offline: {miner.miner_url}")
print(f"Stored on {len(cids)}/{len(miners)} miners")

Note
The same text always produces the same CID across every miner, because CIDs are content-addressed, not location-addressed.
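
Since every miner should return the same CID for the same text, a replication helper can use that as a consistency check. The sketch below is an illustration, not SDK code: `replicate` is a hypothetical name, and it works with any objects that expose `ingest(text, metadata=...)`.

```python
from typing import Optional


def replicate(
    clients: list,
    text: str,
    metadata: Optional[dict] = None,
    errors: tuple = (Exception,),
) -> tuple[list[str], list[tuple[object, BaseException]]]:
    """Ingest text on every client and return (cids, failures).

    `errors` lists the exception types treated as "miner unavailable";
    with the SDK this would be (MinerOfflineError,).
    """
    cids, failures = [], []
    for client in clients:
        try:
            cids.append(client.ingest(text, metadata=metadata))
        except errors as exc:
            failures.append((client, exc))
    # Content addressing implies identical CIDs; a mismatch signals a problem.
    if len(set(cids)) > 1:
        raise RuntimeError("miners returned different CIDs for the same text")
    return cids, failures
```

A caller can then decide how many successful copies count as durable, for example by requiring `len(cids) >= 2` before treating the write as complete.
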
engram docs · v0.1