mud_server.translation.renderer
Ollama HTTP renderer for the OOC→IC translation layer.
OllamaRenderer is a thin, synchronous wrapper around the Ollama
/api/chat endpoint. It is the only place in the translation layer
that makes a network call.
Sync vs async
The renderer uses the synchronous requests library (already a pinned
dependency at requests==2.32.5). The GameEngine is fully
synchronous, and FastAPI runs sync endpoint handlers inside a thread-pool
executor, so a blocking HTTP call here does not stall the event loop.
When the engine is eventually asyncified the upgrade path is:
1. Replace requests.post with await httpx.AsyncClient().post.
2. Mark render as async def.
3. Mark OOCToICTranslationService.translate as async def.
4. Propagate await up through engine.chat/yell/whisper.
httpx is already in the project dependencies (>=0.28.1) so no
new dep is required at that point.
Request structure
The /api/chat payload includes a top-level keep_alive field
(default "5m") that tells Ollama how long to keep the model loaded
in memory after the request completes. Without this field Ollama uses
its server default (typically 5 minutes), but after a cold-start the
model may be unloaded before the next request arrives, causing a full
reload on every call. Setting keep_alive explicitly avoids this.
Deterministic mode
When set_deterministic(seed_int) is called (by the service, after
deriving a seed from the IPC hash), temperature is clamped to 0.0 and
the seed is forwarded to Ollama’s options.seed field.
IPC hash sourcing (FUTURE — axis engine integration)
set_deterministic will be called from OOCToICTranslationService
once the axis engine passes a concrete ipc_hash through
service.translate(..., ipc_hash=ipc_hash). The service converts the
first 16 hex characters of the hash to an integer:
seed_int = int(ipc_hash[:16], 16)
Until then set_deterministic is never called and the renderer uses
the configured temperature from TranslationLayerConfig.
Attributes
Classes
Synchronous renderer that calls the Ollama |
Module Contents
- mud_server.translation.renderer.logger
- class mud_server.translation.renderer.OllamaRenderer(*, api_endpoint, model, timeout_seconds, temperature=_DEFAULT_TEMPERATURE, keep_alive='5m')[source]
Synchronous renderer that calls the Ollama
/api/chatendpoint.One
OllamaRendererinstance is created perOOCToICTranslationServiceand reused across all translation calls. The renderer is stateful in one way only: deterministic mode can be armed viaset_deterministic, which persists for the lifetime of the object. This is by design — the axis engine arms it at the start of a deterministic turn and the service then callsrenderfor each character in that turn.- _api_endpoint
Full
/api/chatURL.
- _model
Ollama model tag (e.g.
"gemma2:2b").
- _timeout
HTTP request timeout in seconds.
- _keep_alive
Ollama
keep_aliveduration string (e.g."5m"). Controls how long the model stays loaded in GPU/CPU memory after each request.
- _temperature
Sampling temperature; clamped to 0.0 in deterministic mode.
- _seed
Integer seed forwarded to Ollama when deterministic;
Nonewhen non-deterministic.
Initialise the renderer.
- Parameters:
api_endpoint (str) – Full Ollama
/api/chatURL.model (str) – Ollama model tag.
timeout_seconds (float) – HTTP request timeout.
temperature (float) – Default sampling temperature.
keep_alive (str) – Ollama
keep_aliveduration string. Controls how long the model stays loaded after each request."5m"(default) keeps it warm for 5 minutes;"0"unloads immediately.
- set_deterministic(seed_int)[source]
Arm deterministic mode for subsequent
rendercalls.Clamps temperature to 0.0 and stores the seed so that identical inputs produce identical outputs across runs. This is called by
OOCToICTranslationServicewhen a non-Noneipc_hashis provided andconfig.deterministicisTrue.The seed is derived from the IPC hash by the service, not here, to keep hashing logic out of the renderer.
- Parameters:
seed_int (int) – Integer seed forwarded to Ollama’s
options.seed.
- render(system_prompt, user_message)[source]
Call Ollama and return the raw response content.
Builds the Ollama request payload, executes a synchronous POST, and returns the
message.contentstring from the JSON response.Returns
Noneon any network-level failure (timeout, connection error, non-2xx status). Content-level validation (PASSTHROUGH sentinel, multi-line output, etc.) is handled byOutputValidator.