Compare commits


2 Commits

Author SHA1 Message Date
9d5fb3f5be first to addDB service 2025-06-15 20:20:54 +01:00
873280d027 updated to reflect refactoring 2025-05-04 18:55:18 +01:00
7 changed files with 365 additions and 79 deletions


@@ -12,3 +12,13 @@ AI_HANDLER_URL="http://ai_service:8000"
# OpenAI API Key
OPENAI_API_KEY=your_openai_api_key_here
AI_HANDLER_TOKEN=common_token_here
# db_service
LOG_TOKEN=your_log_token_here
# PostgreSQL Configuration
POSTGRES_USER=database_user
POSTGRES_PASSWORD=database_password
POSTGRES_DB=database_name
POSTGRES_HOST=postgres

README.md

@@ -2,108 +2,130 @@
*Botbot, your not-so-friendly Bot*
A Matrix chat bot that listens to specified rooms, records conversations, leverages OpenAI for AI-driven summaries, and assists with answering questions.
**Botbot** aims to be a multi-service chat assistant for [Matrix](https://matrix.org) that captures room conversations, stores them, and uses Large Language Models to provide concise summaries, answer follow-up questions, and surface action points.
## Objectives
### Roadmap / Aspirational goals
- Record message history in the rooms it participates in.
- Create discussion summaries that capture actionable items, deadlines, and decisions for the subjects discussed in those rooms.
- Use the collected knowledge to answer questions from participants that remain unanswered after some time, and respond when directly addressed via its @handle.
* Persistent conversation store (PostgreSQL) for long-range context.
* Pluggable LLM backends (local models, Ollama, etc.).
* Structured meeting summaries with action items and deadlines.
* Additional chat frontends (Telegram, WhatsApp, …).
---
- Support additional AI backends beyond OpenAI (e.g., local LLMs, alternative APIs).
- Possibly support other chat services beyond Matrix, such as Telegram, Teams, or WhatsApp.
## Architecture
```mermaid
flowchart LR
MS["matrix_service<br/>(Python & nio)"]
AI["ai_service<br/>(FastAPI)"]
Redis["Redis<br/>Deduplicates replies / guarantees idempotency"]
MS -- "HTTP (Bearer token)" --> AI
AI -- "Matrix Events" --> MS
MS -.-> Redis
AI -.-> Redis
```
| Component | Image / EntryPoint | Purpose |
| ------------------- | ------------------------------------- | ------------------------------------------------------------------------------------- |
| **matrix\_service** | `python matrix_service/main.py` | Listens to Matrix rooms, forwards each message to `ai_service`, posts the reply back. |
| **ai\_service** | `python ai_service/main.py` (FastAPI) | Builds a prompt, calls the configured LLM (OpenAI today), caches the reply in Redis. |
| **redis** | `redis:7` | Reply cache & simple key/value store. |
The services talk to each other over the internal Docker network. Authentication between them is a static bearer token (`AI_HANDLER_TOKEN`).
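The bearer-token handshake can be sketched as follows. This is an illustrative snippet, not code from the repo: `build_message_request` is a hypothetical helper, and only the env-var names (`AI_HANDLER_URL`, `AI_HANDLER_TOKEN`) and endpoint path are taken from the project.

```python
# Illustrative sketch (not repo code): how matrix_service could assemble the
# authenticated POST to ai_service. `build_message_request` is hypothetical;
# the env-var names mirror .env.example.
import json

def build_message_request(base_url: str, token: str, event: dict) -> dict:
    """Return the url, headers, and body for the ai_service call."""
    return {
        "url": f"{base_url}/api/v1/message",
        "headers": {
            "Authorization": f"Bearer {token}",   # static shared secret
            "Content-Type": "application/json",
        },
        "body": json.dumps(event),
    }

req = build_message_request(
    "http://ai_service:8000",
    "common_token_here",
    {"roomId": "!foo:matrix.org", "content": "Hello there"},
)
print(req["url"])                          # http://ai_service:8000/api/v1/message
print(req["headers"]["Authorization"])     # Bearer common_token_here
```

Because the token is a static shared secret, anything on the internal Docker network that knows it can call `ai_service`; keep `AI_HANDLER_TOKEN` out of version control.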
---
## Current Features
- **Auto-join Rooms**: Automatically joins rooms when invited.
- **Message Callbacks**: Responds to basic commands:
- `!ping` → `Pong!`
- `hello botbot` → Greeting
* **Auto-Join on Invite**: secure E2EE join, including device verification.
* **Stateless AI handler**: FastAPI endpoint `/api/v1/message` receiving a JSON payload.
* **Idempotent replies**: duplicate Matrix events reuse the cached answer.
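The idempotent-reply idea can be sketched like this: the Matrix event ID keys a cache, so a redelivered event reuses the stored answer instead of triggering a second LLM call. A plain dict stands in for Redis in this illustration; it is not the service's actual code.

```python
# Sketch of the idempotent-reply pattern (illustrative, not repo code):
# a dict stands in for Redis, and a format string stands in for the LLM.
cache: dict[str, str] = {}
llm_calls = 0

def reply_for(event_id: str, prompt: str) -> str:
    global llm_calls
    if event_id in cache:          # duplicate delivery: serve cached answer
        return cache[event_id]
    llm_calls += 1                 # stand-in for the expensive LLM request
    answer = f"echo: {prompt}"
    cache[event_id] = answer
    return answer

first = reply_for("$abc123", "Hello there")
second = reply_for("$abc123", "Hello there")   # same event redelivered
print(first == second, llm_calls)              # True 1
```

The real service would use `redis-py` with the same get-then-set shape, typically adding a TTL so the cache does not grow without bound.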
## Prerequisites
- Docker & Docker Compose (or Python 3.8+)
- A Matrix account for the bot
- OpenAI API key
---
## Installation
1. **Clone the repository**
```bash
git clone https://gitea.alluna.pt/jfig/botbot.git
cd botbot
```
2. **Configure environment variables**
Copy the example file and edit it:
```bash
cp .env.example .env
```
Then open `.env` and set:
```ini
LOG_LEVEL=INFO
HOMESERVER_URL=https://matrix.org
USER_ID=@botbot_user:matrix.org
PASSWORD=your_matrix_password
OPENAI_API_KEY=your_openai_api_key
```
## Usage
### Using Docker Compose (development/hot-reload)
## Quick Start (development)
```bash
docker-compose up --build
# clone & cd
$ git clone https://gitea.alluna.pt/jfig/botbot.git
$ cd botbot
# copy environment template
$ cp .env.example .env
# edit .env with your homeserver, credentials and OpenAI key
# start everything (hot-reload volumes mounted)
$ docker compose up --build
```
### Building and Running Manually (production)
The default compose file launches three containers:
* `matrix_service` watches your rooms
* `ai_service` handles AI prompts
* `redis` reply cache
Stop with <kbd>Ctrl</kbd>+<kbd>C</kbd> or `docker compose down`.
### Production (single image per service)
Build once, then deploy with your orchestrator of choice:
```bash
# Build container
docker build -t botbot .
# Run container
docker run -d --env-file .env \
-v matrix_data:/app/data \
--restart unless-stopped \
botbot
$ docker compose -f docker-compose.yml --profile prod build
$ docker compose -f docker-compose.yml --profile prod up -d
```
## Configuration Options
---
- `LOG_LEVEL`: One of `CRITICAL`, `ERROR`, `WARNING`, `INFO`, `DEBUG`, `NOTSET`.
- `HOMESERVER_URL`: Matrix homeserver endpoint (e.g., `https://matrix.org`).
- `USER_ID`: Bot's full Matrix user ID (e.g., `@botbot_user:matrix.org`).
- `PASSWORD`: Password for the bot account.
- `OPENAI_API_KEY`: API key for OpenAI usage.
## Configuration
## How It Works
All settings are environment variables. The table below reflects the current codebase (commit `ae27a2c`).
1. **Startup**: Loads environment and logs into the Matrix homeserver.
2. **Callbacks**:
- `message_callback`: Handles text messages and triggers AI logic.
- `invite_cb`: Joins rooms on invitation.
3. **AI Integration**: Future development will:
- Pull recent chat history.
- Call OpenAI endpoints to generate summaries or answers.
| Variable                       | Service         | Default                  | Description                           |
| ------------------------------ | --------------- | ------------------------ | ------------------------------------- |
| `LOG_LEVEL`                    | both            | `INFO`                   | Python logging level.                 |
| `MATRIX_HOMESERVER_URL`        | matrix\_service |                          | Matrix homeserver base URL.           |
| `MATRIX_USER_ID`               | matrix\_service |                          | Full user ID of the bot.              |
| `MATRIX_PASSWORD`              | matrix\_service |                          | Password for the bot account.         |
| `MATRIX_LOGIN_TRIES`           | matrix\_service | `5`                      | Number of login attempts before exit. |
| `MATRIX_LOGIN_DELAY_INCREMENT` | matrix\_service | `5`                      | Seconds added per retry.              |
| `AI_HANDLER_URL`               | matrix\_service | `http://ai_service:8000` | Where to POST messages.               |
| `AI_HANDLER_TOKEN`             | both            |                          | Shared bearer token (keep secret).    |
| `OPENAI_API_KEY`               | ai\_service     |                          | Key for the `openai` Python SDK.      |
| `REDIS_URL`                    | ai\_service     | `redis://redis:6379`     | Connection string used by `redis-py`. |
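As a sketch of how a service consumes these settings (illustrative, not repo code; it only uses the stdlib and the defaults from the table):

```python
# Illustrative: reading the table's settings with os.getenv, applying the
# same fallback defaults the services use when a variable is unset.
import os

LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO").upper()
AI_HANDLER_URL = os.getenv("AI_HANDLER_URL", "http://ai_service:8000")
REDIS_URL = os.getenv("REDIS_URL", "redis://redis:6379")

print(LOG_LEVEL, AI_HANDLER_URL, REDIS_URL)
```

Variables without a default in the table (credentials, tokens, API keys) must be set explicitly in `.env`.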
---
## API (ai\_service)
```http
POST /api/v1/message
Authorization: Bearer <AI_HANDLER_TOKEN>
Content-Type: application/json
{
"roomId": "!foo:matrix.org",
"userId": "@alice:matrix.org",
"eventId": "$abc123",
"serverTimestamp": 1714821630123,
"content": "Hello there"
}
```
Returns `{"reply": "Hello Alice!"}` or HTTP 401/500 on error.
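A minimal client-side check of that payload shape can be sketched as follows; `missing_fields` is a hypothetical helper, not part of either service, and only the field names come from the example above.

```python
# Hypothetical client-side validation (not repo code): verify a payload
# carries every field the /api/v1/message example expects.
REQUIRED_FIELDS = {"roomId", "userId", "eventId", "serverTimestamp", "content"}

def missing_fields(payload: dict) -> set[str]:
    """Return the required fields absent from the payload."""
    return REQUIRED_FIELDS - payload.keys()

payload = {
    "roomId": "!foo:matrix.org",
    "userId": "@alice:matrix.org",
    "eventId": "$abc123",
    "serverTimestamp": 1714821630123,
    "content": "Hello there",
}
print(missing_fields(payload))           # set() when the payload is complete
print(missing_fields({"content": "hi"}))
```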
---
## Contributing
1. Fork the repository.
2. Create a feature branch (`git checkout -b feature/xyz`).
3. Commit changes and push (`git push origin feature/xyz`).
4. Open a pull request with a description of your changes.
Pull requests are welcome! Please open an issue first to discuss what you want to change. All source code is formatted with **ruff** / **black**; run `pre-commit run --all-files` before pushing.
---
## License
This project is licensed under the [MIT License](LICENSE).
[MIT](LICENSE)

db_service/Dockerfile

@@ -0,0 +1,20 @@
FROM python:3.11-slim
# Prevent Python from buffering stdout/stderr so logs appear immediately
ENV PYTHONUNBUFFERED=1
# Install system dependencies (none needed for asyncpg on slim)
WORKDIR /app
# Install Python dependencies first to leverage Docker layer caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy service source code only after dependencies
COPY . /app
EXPOSE 8000
# Use Uvicorn as the ASGI server
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

db_service/main.py

@@ -0,0 +1,182 @@
from __future__ import annotations

import logging
import os
from typing import AsyncGenerator

from dotenv import load_dotenv
from fastapi import Depends, FastAPI, Header, HTTPException, status
from pydantic import BaseModel, Field
from sqlalchemy import BigInteger, Column, MetaData, Table, Text, text
from sqlalchemy.dialects.postgresql import insert as pg_insert
from sqlalchemy.exc import OperationalError
from sqlalchemy.ext.asyncio import (
    AsyncEngine,
    AsyncSession,
    async_sessionmaker,
    create_async_engine,
)

# ---------------------------------------------------------------------------
# Environment & logging setup
# ---------------------------------------------------------------------------
load_dotenv()

LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO").upper()
logging.basicConfig(level=LOG_LEVEL, format="%(levelname)s | %(name)s | %(message)s")
log = logging.getLogger("db_service")

LOG_TOKEN = os.getenv("LOG_TOKEN", "changeme")
POSTGRES_USER = os.getenv("POSTGRES_USER", "postgres")
POSTGRES_PASSWORD = os.getenv("POSTGRES_PASSWORD", "postgres")
POSTGRES_DB = os.getenv("POSTGRES_DB", "botbot")
POSTGRES_HOST = os.getenv("POSTGRES_HOST", "postgres")
POSTGRES_PORT = os.getenv("POSTGRES_PORT", "5432")

DATABASE_URL = (
    f"postgresql+asyncpg://{POSTGRES_USER}:{POSTGRES_PASSWORD}"
    f"@{POSTGRES_HOST}:{POSTGRES_PORT}/{POSTGRES_DB}"
)
ADMIN_URL = (
    f"postgresql+asyncpg://{POSTGRES_USER}:{POSTGRES_PASSWORD}"
    f"@{POSTGRES_HOST}:{POSTGRES_PORT}/postgres"
)

# ---------------------------------------------------------------------------
# SQLAlchemy table definition (metadata)
# ---------------------------------------------------------------------------
metadata = MetaData()
messages = Table(
    "messages",
    metadata,
    Column("event_id", Text, primary_key=True),
    Column("room_id", Text, nullable=False),
    Column("user_id", Text, nullable=False),
    Column("ts_ms", BigInteger, nullable=False),
    Column("body", Text, nullable=False),
)

# ---------------------------------------------------------------------------
# FastAPI app
# ---------------------------------------------------------------------------
app = FastAPI(title="Botbot Logging Service", version="1.1.0")


class MessageIn(BaseModel):
    """Payload received from matrix_service."""

    event_id: str = Field(..., example="$14327358242610PhrSn:matrix.org")
    room_id: str = Field(..., example="!someroomid:matrix.org")
    user_id: str = Field(..., example="@alice:matrix.org")
    ts_ms: int = Field(..., example=1713866689000, description="Matrix server_timestamp in ms since epoch")
    body: str = Field(..., example="Hello, world!")


# ---------------------------------------------------------------------------
# Database engine/session factories (populated on startup)
# ---------------------------------------------------------------------------
engine: AsyncEngine | None = None
SessionLocal: async_sessionmaker[AsyncSession] | None = None


async def ensure_database_exists() -> None:
    """Connect to the admin DB and create `POSTGRES_DB` if it is missing."""
    log.info("Checking whether database %s exists", POSTGRES_DB)
    # CREATE DATABASE cannot run inside a transaction block, so use an
    # AUTOCOMMIT connection instead of engine.begin().
    admin_engine = create_async_engine(
        ADMIN_URL, pool_pre_ping=True, isolation_level="AUTOCOMMIT"
    )
    try:
        async with admin_engine.connect() as conn:
            db_exists = await conn.scalar(
                text("SELECT 1 FROM pg_database WHERE datname = :db"),
                {"db": POSTGRES_DB},
            )
            if not db_exists:
                log.warning("Database %s not found, creating it", POSTGRES_DB)
                await conn.execute(text(f'CREATE DATABASE "{POSTGRES_DB}"'))
                log.info("Database %s created", POSTGRES_DB)
    finally:
        await admin_engine.dispose()


async def create_engine_and_tables() -> None:
    """Initialise the SQLAlchemy engine and create the `messages` table if needed."""
    global engine, SessionLocal  # noqa: PLW0603
    engine = create_async_engine(DATABASE_URL, pool_pre_ping=True)
    async with engine.begin() as conn:
        await conn.run_sync(metadata.create_all)
    SessionLocal = async_sessionmaker(engine, expire_on_commit=False)
    log.info("Database initialised and tables ensured.")


async def get_session() -> AsyncGenerator[AsyncSession, None]:
    async with SessionLocal() as session:  # type: ignore[arg-type]
        yield session


# ---------------------------------------------------------------------------
# Lifespan events
# ---------------------------------------------------------------------------
@app.on_event("startup")
async def on_startup() -> None:  # noqa: D401 (imperative mood)
    """Ensure the database *and* table exist before serving traffic."""
    log.info("Starting up")
    try:
        await create_engine_and_tables()
    except OperationalError as err:
        # Common case: the database itself does not yet exist.
        if "does not exist" in str(err):
            log.warning("Primary database missing, attempting to create it")
            await ensure_database_exists()
            # Retry now that the DB exists
            await create_engine_and_tables()
        else:
            log.error("Database connection failed: %s", err)
            raise


@app.on_event("shutdown")
async def on_shutdown() -> None:  # noqa: D401
    if engine:
        await engine.dispose()


# ---------------------------------------------------------------------------
# API endpoints
# ---------------------------------------------------------------------------
@app.get("/healthz", tags=["health"])
async def healthz() -> dict[str, str]:
    return {"status": "ok"}


@app.post("/api/v1/log", status_code=status.HTTP_202_ACCEPTED, tags=["log"])
async def log_message(
    payload: MessageIn,
    x_log_token: str = Header(..., alias="X-Log-Token"),
    session: AsyncSession = Depends(get_session),
) -> dict[str, str]:
    """Persist one Matrix message to Postgres.

    Requires header `X-Log-Token` matching the `LOG_TOKEN` env var.
    """
    if x_log_token != LOG_TOKEN:
        raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="invalid token")
    # `on_conflict_do_nothing` is PostgreSQL-specific, so build the statement
    # with the dialect's insert() rather than the generic Core insert.
    stmt = (
        pg_insert(messages)
        .values(**payload.model_dump())
        .on_conflict_do_nothing(index_elements=[messages.c.event_id])
    )
    await session.execute(stmt)
    await session.commit()
    return {"status": "accepted"}
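The insert-or-ignore semantics of that endpoint can be illustrated with an in-memory stand-in for the `messages` table. This is a sketch only; the real service relies on PostgreSQL's `ON CONFLICT (event_id) DO NOTHING`.

```python
# In-memory illustration (not repo code) of ON CONFLICT DO NOTHING: a second
# log of the same event_id is silently ignored because event_id is the
# primary key. A dict keyed by event_id stands in for the messages table.
table: dict[str, dict] = {}

def log_row(row: dict) -> str:
    if row["event_id"] not in table:   # conflicting primary key -> do nothing
        table[row["event_id"]] = row
    return "accepted"                  # the endpoint replies 202 either way

row = {
    "event_id": "$abc123",
    "room_id": "!foo:matrix.org",
    "user_id": "@alice:matrix.org",
    "ts_ms": 1713866689000,
    "body": "hi",
}
log_row(row)
log_row(row)          # duplicate delivery from matrix_service
print(len(table))     # 1
```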


@@ -0,0 +1,10 @@
# Web framework & ASGI server
fastapi
uvicorn[standard]
# Database access
sqlalchemy[asyncio]>=2.0
asyncpg>=0.29
# Environment / configuration helpers
python-dotenv>=1.0


@@ -1,4 +1,42 @@
services:
  # -----------------------
  # Database (PostgreSQL)
  # -----------------------
  postgres:
    image: postgres:16-alpine
    restart: unless-stopped
    environment:
      - POSTGRES_USER
      - POSTGRES_PASSWORD
      - POSTGRES_DB
    volumes:
      - postgres-data:/var/lib/postgresql/data
    healthcheck:
      test: "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"
      interval: 10s
      timeout: 5s
      retries: 5

  # -----------------------
  # Conversation-logging microservice
  # -----------------------
  db_service:
    build:
      context: ./db_service
    restart: unless-stopped
    environment:
      - POSTGRES_USER
      - POSTGRES_PASSWORD
      - POSTGRES_DB
      - POSTGRES_HOST
      - LOG_TOKEN
    depends_on:
      postgres:
        condition: service_healthy
    ports:
      - "8000:8000"  # expose externally only if needed

  matrix_service:
    build: ./matrix_service
    environment:
@@ -25,6 +63,10 @@ services:
  redis:
    image: redis:7
    restart: unless-stopped
    volumes:
      - redis-data:/data

volumes:
  matrix_data:
  redis-data:
  postgres-data:


@@ -1,4 +1,4 @@
matrix-nio[e2e]>=0.25.0
matrix-nio[e2e]>=0.25.2
python-dotenv>=1.0.0
httpx>=0.23.0
pydantic>=1.10