ADR: DataSource Protocol for Federation
Status: Accepted
Date: 2026-01-29
Decision
Introduce a DataSource protocol that abstracts data access, enabling CivicOS to query civic data without knowing whether it's local or from a federated instance. For the pilot, only a local implementation exists; federation support will be added when a second jurisdiction joins.
Context
CivicOS queries civic data through a StorageBackend that talks directly to the database. This works for a single jurisdiction, but federation requires querying data that may live on another operator's instance. Rather than retrofit this abstraction later, we introduced it during the pilot phase as a thin delegation layer with zero behavior change.
Protocol Design
from typing import Dict, List, Protocol, runtime_checkable

@runtime_checkable
class DataSource(Protocol):
    """Read-only interface for civic data access (local or federated)."""

    @property
    def source_type(self) -> str:
        """Returns 'local', 'federated', or 'hybrid'."""
        ...

    def get_meetings(self, jurisdiction_id, since, until, limit) -> List[Dict]: ...
    def get_decisions(self, jurisdiction_id, since, until, limit) -> List[Dict]: ...
    def get_elections(self, jurisdiction_id, include_past, limit) -> List[Dict]: ...
    def get_budget_items(self, jurisdiction_id, fiscal_year, department, limit) -> List[Dict]: ...
    # ... additional query methods
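Because the protocol is marked @runtime_checkable, conformance can be checked structurally with isinstance, with no inheritance required. The sketch below uses a trimmed copy of the protocol and two hypothetical classes (InMemorySource and NotASource are illustrations, not part of CivicOS). The parameter annotations are an assumption based on the simple types listed under Key Design Choices.

```python
from datetime import datetime
from typing import Dict, List, Protocol, runtime_checkable


@runtime_checkable
class DataSource(Protocol):
    """Trimmed copy of the protocol: one property, one query method."""

    @property
    def source_type(self) -> str: ...

    def get_meetings(
        self, jurisdiction_id: str, since: datetime, until: datetime, limit: int
    ) -> List[Dict]: ...


class InMemorySource:
    """Hypothetical implementation; it does not inherit from DataSource."""

    @property
    def source_type(self) -> str:
        return "local"

    def get_meetings(self, jurisdiction_id, since, until, limit):
        return []


class NotASource:
    """Lacks get_meetings, so it fails the structural check."""

    source_type = "local"


conforms = isinstance(InMemorySource(), DataSource)
rejects = isinstance(NotASource(), DataSource)
```

A runtime isinstance check only confirms that the members exist. It does not validate signatures or return types, so a static type checker is still the stronger guarantee.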
Key Design Choices
- Read-only: DataSource handles queries only. Write operations stay on StorageBackend because only local data should be written.
- Dict-based returns: Methods return List[Dict] for JSON serialization and cross-process compatibility (important for MCP relay communication).
- Simple parameter types: Query methods use str, datetime, int, and bool, types that serialize cleanly over the MCP protocol.
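The serialization point can be illustrated with the standard library alone. The sketch below builds a JSON payload for a get_meetings query. Sending datetime values as ISO 8601 strings is an assumption made here (the json module does not serialize datetime objects directly), and the helper name build_query_payload and the jurisdiction ID are hypothetical, not part of civicos-relay.

```python
import json
from datetime import datetime, timezone
from typing import Any


def build_query_payload(**params: Any) -> str:
    """Encode query parameters as JSON, converting datetimes to ISO 8601 strings."""

    def encode(value: Any) -> Any:
        # Assumption: datetimes travel as ISO 8601 text; str, int, and bool pass through.
        if isinstance(value, datetime):
            return value.isoformat()
        return value

    encoded = {name: encode(value) for name, value in params.items()}
    return json.dumps(encoded, sort_keys=True)


payload = build_query_payload(
    jurisdiction_id="example-city",  # hypothetical ID
    since=datetime(2026, 1, 1, tzinfo=timezone.utc),
    until=datetime(2026, 2, 1, tzinfo=timezone.utc),
    limit=10,
)
decoded = json.loads(payload)
```

On the receiving side, datetime.fromisoformat can turn the strings back into datetime objects before the query runs.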
Implementations
LocalDataSource (Current)
Wraps StorageBackend with zero behavior change:
class LocalDataSource:
    def __init__(self, storage: StorageBackend):
        self._storage = storage

    @property
    def source_type(self) -> str:
        return "local"

    def get_meetings(self, jurisdiction_id, since, until, limit):
        # Pure delegation: arguments pass through unchanged.
        return self._storage.get_meetings(
            jurisdiction_id=jurisdiction_id,
            since=since, until=until, limit=limit,
        )
CivicOS routes all queries through _data_source:
class CivicOS:
    def __post_init__(self):
        self._storage = get_storage_backend(database_url)
        self._data_source = LocalDataSource(self._storage)

    def whats_next(self, topics=None, days=30):
        meetings = self._data_source.get_meetings(...)
FederatedDataSource (Future)
Will use the civicos-relay MCP protocol to fan out queries to remote instances:
class FederatedDataSource:
    def __init__(self, relay_urls: List[str]):
        self._relays = [MCPClient(url) for url in relay_urls]

    def get_meetings(self, jurisdiction_id, since, until, limit):
        relay = self._get_relay_for_jurisdiction(jurisdiction_id)
        return relay.call_tool("get_meetings", {...})
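How a jurisdiction is mapped to a relay is not specified in this ADR. As one possible shape, the sketch below routes through a fixed jurisdiction-to-relay table. Everything in it is hypothetical: StubRelay stands in for the MCP client and records calls instead of using the network, RoutedDataSource is an illustrative name, and the argument dictionary passed to call_tool is an assumed layout.

```python
from typing import Any, Dict, List, Tuple


class StubRelay:
    """Stand-in for an MCP client; records calls instead of using the network."""

    def __init__(self, url: str):
        self.url = url
        self.calls: List[Tuple[str, Dict[str, Any]]] = []

    def call_tool(self, name: str, arguments: Dict[str, Any]) -> List[Dict]:
        self.calls.append((name, arguments))
        return [{"relay": self.url, "tool": name}]


class RoutedDataSource:
    """Hypothetical routing: a fixed jurisdiction-to-relay table."""

    def __init__(self, relays_by_jurisdiction: Dict[str, StubRelay]):
        self._relays = relays_by_jurisdiction

    @property
    def source_type(self) -> str:
        return "federated"

    def _get_relay_for_jurisdiction(self, jurisdiction_id: str) -> StubRelay:
        try:
            return self._relays[jurisdiction_id]
        except KeyError:
            # Fail loudly rather than silently returning an empty result.
            raise LookupError(f"no relay configured for {jurisdiction_id!r}") from None

    def get_meetings(self, jurisdiction_id, since, until, limit):
        relay = self._get_relay_for_jurisdiction(jurisdiction_id)
        # Assumed argument layout; the real tool schema may differ.
        return relay.call_tool(
            "get_meetings",
            {"jurisdiction_id": jurisdiction_id, "since": since, "until": until, "limit": limit},
        )


relay_a = StubRelay("https://relay-a.example")
source = RoutedDataSource({"city-a": relay_a})
rows = source.get_meetings("city-a", "2026-01-01", "2026-02-01", 5)
```

Raising on an unknown jurisdiction is a design choice in this sketch. A hybrid source could instead fall back to local storage.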
A factory function will select the appropriate implementation based on configuration:
def get_data_source(storage=None, relay_urls=None) -> DataSource:
    if relay_urls:
        return FederatedDataSource(relay_urls)
    return LocalDataSource(storage or get_storage_backend())
Rationale
- Federation readiness — A second city joining would require this abstraction anyway
- Low risk — Simple delegation pattern with zero behavior change for existing code
- Clear boundaries — Separates where data lives from how it's queried
- Testability — Enables mocking data sources without a database
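The testability point can be made concrete with a small in-memory fake. This is a sketch under assumptions: FakeDataSource, the upcoming_titles helper, and the "title" and "date" keys are all hypothetical, since the shape of the returned dicts is not specified here.

```python
from datetime import datetime
from typing import Dict, List


class FakeDataSource:
    """Hypothetical in-memory source exposing the protocol's member names."""

    def __init__(self, meetings: List[Dict]):
        self._meetings = meetings

    @property
    def source_type(self) -> str:
        return "local"

    def get_meetings(self, jurisdiction_id, since, until, limit):
        # Filter by jurisdiction and inclusive date window, then apply the limit.
        matches = [
            m for m in self._meetings
            if m["jurisdiction_id"] == jurisdiction_id and since <= m["date"] <= until
        ]
        return matches[:limit]


def upcoming_titles(source, jurisdiction_id, since, until, limit=10):
    """Code under test: depends only on the data source interface."""
    return [m["title"] for m in source.get_meetings(jurisdiction_id, since, until, limit)]


fake = FakeDataSource([
    {"jurisdiction_id": "city-a", "title": "Budget hearing", "date": datetime(2026, 2, 3)},
    {"jurisdiction_id": "city-a", "title": "Zoning board", "date": datetime(2026, 5, 1)},
    {"jurisdiction_id": "city-b", "title": "Council session", "date": datetime(2026, 2, 4)},
])
titles = upcoming_titles(fake, "city-a", datetime(2026, 2, 1), datetime(2026, 3, 1))
```

No database or StorageBackend is involved, so a test like this can run anywhere the code itself runs.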
Alternatives Considered
- Query remote databases directly — Rejected: Requires VPN/firewall complexity, doesn't scale
- Replicate all data locally — Rejected: Expensive, sync complexity, stale data risk
- Wait until second city joins — Rejected: Harder to retrofit the abstraction into existing query paths
References
- Entity ID Namespacing — How namespaced IDs enable federation
- Federation Domain Architecture — Domain and operator model
- civicos-relay package — Relay protocol details