emergence-mini-dilles/tests/test_time.py
Jeuners 919866e50d Time Dilation framework + OpenRouter multi-LLM
Implements core pieces of 'Time Dilation in LLM Agent Systems'
(Dillenberg 2026) and adds OpenRouter as a second LLM provider.

ENGINE
- engine/time.py: AgentClock with cumulative proper time tau
  (weighted by op type), EWMA pace (alpha=0.3, dt clamped 0.1-60s),
  ClockRegistry singleton, gamma_{src->dst} frame transformation,
  drift_report with per-pair divergence and threshold flag.
- engine/turn.py: ticks tau on reasoning/tool/memory/reactive;
  broadcasts tau+pace+model in every WebSocket message.
- engine/db.py: schema adds turn_log.tau, turn_log.pace,
  turn_log.model, agent_clocks table; dev-mode auto-migrate
  drops+recreates if old schema detected.
- engine/llm.py: full refactor for two providers.
    Ollama: native tool-calling via /api/chat
    OpenRouter: OpenAI-compatible /api/v1/chat/completions
  Auto mode picks OpenRouter if OPENROUTER_API_KEY is set.
  Per-agent model via EMERGENCE_AGENT_<ID>_MODEL env var.
  .env loader with empty-line guard.
  decide_tool returns (name, args, meta) with cost_usd for OR.
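
A minimal sketch of the tau and pace bookkeeping listed above; it is not the code in engine/time.py. `SketchClock` and the module-level `gamma` helper are hypothetical names. alpha=0.3 and the 0.1-60 s clamp come from this message, and the pace ratio used for gamma follows the comment in tests/test_time.py. Seeding the first tick with a nominal 1 s gap is an assumption made here so that pace is positive after a single tick.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

ALPHA = 0.3                 # EWMA smoothing factor (from the commit message)
DT_MIN, DT_MAX = 0.1, 60.0  # clamp on the gap between ticks, in seconds


@dataclass
class SketchClock:
    """Hypothetical stand-in for AgentClock, for illustration only."""
    agent_id: str
    tau: float = 0.0    # cumulative proper time, sum of op weights
    pace: float = 0.0   # EWMA of ops per second
    n_ops: int = 0
    _last: Optional[float] = field(default=None, repr=False)

    def tick(self, op: str, weight: float, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self.tau += weight
        self.n_ops += 1
        if self._last is None:
            dt = 1.0  # assumption: nominal gap for the very first tick
        else:
            dt = min(max(now - self._last, DT_MIN), DT_MAX)
        sample = 1.0 / dt  # instantaneous rate in ops per second
        if self.n_ops == 1:
            self.pace = sample
        else:
            self.pace = ALPHA * sample + (1.0 - ALPHA) * self.pace
        self._last = now


def gamma(src: SketchClock, dst: SketchClock) -> float:
    """Frame factor gamma_{src->dst} as a pace ratio (falls back to 1.0)."""
    return src.pace / dst.pace if dst.pace > 0 else 1.0
```

With this definition gamma(p, q) * gamma(q, p) is 1 whenever both paces are positive, so a fast agent seen from a slow one has gamma > 1 and the reverse direction has gamma < 1.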

FRONTEND
- web/: new 'Time Dilation · Eigenzeit tau' section with per-agent
  tau bars, pace, op count. Drift warning when any pair exceeds
  threshold. LLM provider info in header.

TESTS
- 14 new tests in tests/test_time.py (tau monotonic, EWMA convergence,
  gamma asymmetry, drift detection).
- 4 new LLM tests: openrouter response parsing, per-agent override,
  provider_info, is_available.
- All 99 tests green.

LIVE-VERIFIED
- 4 different OpenRouter models running in parallel:
  - anchor: anthropic/claude-3.5-haiku
  - flora:  openai/gpt-4o-mini
  - lovely: meta-llama/llama-3.3-70b-instruct
  - spark:  google/gemma-3-4b-it
- All 4 produce turns, all 4 have different tau values,
  drift_report shows the Frame-Transformation gamma values.
- Observation: gamma ~ 1.00 because the explicit Round-Robin +
  sleep(2) schedule keeps the frames coherent. This is consistent
  with the paper's claim that dilation emerges in non-synchronized
  systems, but it does not yet demonstrate dilation itself; that
  needs a run without the synchronized schedule.

SECRETS
- .env added, OPENROUTER_API_KEY live. .env is git-ignored.
- .env.example documents the config without exposing any key.
- .gitignore now blocks .env, .env.local, *.key, *.pem.

README
- New 'Time Dilation' section explaining tau, pace, CDC, drift
- New 'Multi-LLM via OpenRouter' section with cost table
- Per-agent model config documented
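
A hedged sketch of how the auto provider choice and the per-agent model override could be resolved from the environment. `pick_provider` and `model_for` are hypothetical helpers, not functions from engine/llm.py, and upper-casing the agent id to build the variable name is an assumption about how <ID> is filled in.

```python
import os
from typing import Mapping, Optional


def pick_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """Auto mode: prefer OpenRouter when an API key is present, else Ollama."""
    env = os.environ if env is None else env
    return "openrouter" if env.get("OPENROUTER_API_KEY") else "ollama"


def model_for(agent_id: str, default_model: str,
              env: Optional[Mapping[str, str]] = None) -> str:
    """Return the per-agent model override, or default_model if unset or empty."""
    env = os.environ if env is None else env
    key = f"EMERGENCE_AGENT_{agent_id.upper()}_MODEL"  # assumed casing of <ID>
    return env.get(key) or default_model
```

Passing the environment as a mapping keeps both helpers testable without touching the real process environment.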
2026-06-15 02:27:11 +02:00

"""Time-dilation framework tests (τ tracker, EWMA pace, CDC)."""
import time


def test_tau_starts_at_zero():
    from engine import time as time_mod
    c = time_mod.AgentClock("a")
    assert c.tau == 0.0
    assert c.pace == 0.0
    assert c.n_ops == 0


def test_tau_monotonic():
    from engine import time as time_mod
    c = time_mod.AgentClock("a")
    t0 = 1000.0
    seen = [c.tau]
    c.tick("reasoning", 1.0, now=t0)
    seen.append(c.tau)
    c.tick("tool_call", 0.5, now=t0 + 1)
    seen.append(c.tau)
    c.tick("reasoning", 1.0, now=t0 + 2)
    seen.append(c.tau)
    assert c.tau == 2.5
    assert c.n_ops == 3
    # monotonic non-decreasing: no tick may lower tau
    assert all(a <= b for a, b in zip(seen, seen[1:]))


def test_tau_weight_per_op():
    from engine import time as time_mod
    c = time_mod.AgentClock("a")
    c.tick("reasoning", time_mod.W_REASONING_STEP)
    c.tick("tool_call", time_mod.W_TOOL_CALL)
    c.tick("memory_lookup", time_mod.W_MEMORY_LOOKUP)
    c.tick("reactive", time_mod.W_REACTIVE)
    expected = (time_mod.W_REASONING_STEP + time_mod.W_TOOL_CALL
                + time_mod.W_MEMORY_LOOKUP + time_mod.W_REACTIVE)
    assert abs(c.tau - expected) < 1e-6


def test_pace_ewma_initialized_on_first_tick():
    from engine import time as time_mod
    c = time_mod.AgentClock("a")
    c.tick("reasoning", 1.0, now=1000.0)
    assert c.pace > 0


def test_pace_ewma_converges():
    from engine import time as time_mod
    c = time_mod.AgentClock("a")
    # Simulate 10 ops at 1 second apart -> pace ~1.0 op/s
    for i in range(10):
        c.tick("r", 1.0, now=1000.0 + i)
    # EWMA should be near 1.0
    assert 0.5 < c.pace < 1.5, f"pace={c.pace}"


def test_pace_ewma_reacts_to_burst():
    from engine import time as time_mod
    c = time_mod.AgentClock("a")
    # slow start
    for i in range(5):
        c.tick("r", 1.0, now=1000.0 + i * 10)  # 10s apart
    slow_pace = c.pace
    # burst
    for i in range(5):
        c.tick("r", 1.0, now=1050.0 + i * 0.1)  # 0.1s apart
    fast_pace = c.pace
    assert fast_pace > slow_pace


def test_history_bounded():
    from engine import time as time_mod
    c = time_mod.AgentClock("a")
    for i in range(500):
        c.tick("r", 0.1, now=1000.0 + i)
    assert len(c.history) <= 200


def test_snapshot_roundtrips():
    from engine import time as time_mod
    c = time_mod.AgentClock("anchor")
    c.tick("r", 1.0, now=1000.0)
    snap = c.snapshot()
    assert snap["agent_id"] == "anchor"
    assert "tau" in snap
    assert "pace" in snap
    assert "n_ops" in snap


def test_registry_get_creates():
    from engine import time as time_mod
    time_mod.registry.reset("a")
    c = time_mod.registry.get("a")
    assert c.agent_id == "a"


def test_registry_singleton_state():
    from engine import time as time_mod
    time_mod.registry.reset("zzz_test")
    time_mod.record_reasoning("zzz_test")
    time_mod.record_tool_call("zzz_test")
    snap = time_mod.registry.snapshot_all()
    assert "zzz_test" in snap
    assert snap["zzz_test"]["tau"] == (time_mod.W_REASONING_STEP + time_mod.W_TOOL_CALL)
    time_mod.registry.reset("zzz_test")


def test_gamma_symmetry_inverse():
    from engine import time as time_mod
    time_mod.registry.reset("p")
    time_mod.registry.reset("q")
    # give them different paces
    for i in range(5):
        time_mod.registry.get("p").tick("r", 1.0, now=1000.0 + i * 0.5)  # 2 ops/s
        time_mod.registry.get("q").tick("r", 1.0, now=1000.0 + i * 2.0)  # 0.5 ops/s
    g_pq = time_mod.registry.gamma("p", "q")
    g_qp = time_mod.registry.gamma("q", "p")
    # γ_pq ≈ 4, γ_qp ≈ 0.25
    assert g_pq > 2.0
    assert g_qp < 0.5
    time_mod.registry.reset("p")
    time_mod.registry.reset("q")


def test_transform_uses_gamma():
    from engine import time as time_mod
    time_mod.registry.reset("a")
    time_mod.registry.reset("b")
    time_mod.registry.get("a").tau = 10.0
    time_mod.registry.get("a").pace = 2.0
    time_mod.registry.get("b").tau = 5.0
    time_mod.registry.get("b").pace = 1.0
    # γ_a->b = 2.0/1.0 = 2.0
    transformed = time_mod.registry.transform("a", "b")
    assert abs(transformed - 20.0) < 1e-6
    time_mod.registry.reset("a")
    time_mod.registry.reset("b")


def test_drift_report_with_divergence():
    from engine import time as time_mod
    time_mod.registry.reset("x")
    time_mod.registry.reset("y")
    # give x much more τ than y, similar pace
    time_mod.registry.get("x").tau = 100.0
    time_mod.registry.get("x").pace = 1.0
    time_mod.registry.get("y").tau = 5.0
    time_mod.registry.get("y").pace = 1.0
    report = time_mod.registry.drift_report()
    assert report["max_drift"] > 90
    assert any(p["divergent"] for p in report["pairs"])
    time_mod.registry.reset("x")
    time_mod.registry.reset("y")


def test_drift_report_empty_with_one_agent():
    from engine import time as time_mod
    # reset everything to ensure isolation; copy to a list first so that
    # reset() cannot mutate the collection while it is being iterated
    for c in list(time_mod.registry.all()):
        time_mod.registry.reset(c.agent_id)
    time_mod.registry.get("only")
    time_mod.registry.get("only").tau = 10.0
    report = time_mod.registry.drift_report()
    assert report["pairs"] == []
    assert report["max_drift"] == 0.0
    time_mod.registry.reset("only")