Compare commits


2 Commits

Author SHA1 Message Date
9d5fb3f5be first to addDB service 2025-06-15 20:20:54 +01:00
873280d027 updated to reflect refactoring 2025-05-04 18:55:18 +01:00
7 changed files with 365 additions and 79 deletions


@@ -12,3 +12,13 @@ AI_HANDLER_URL="http://ai_service:8000"
# OpenAI API Key
OPENAI_API_KEY=your_openai_api_key_here
AI_HANDLER_TOKEN=common_token_here
# db_service
LOG_TOKEN=your_log_token_here
# PostgreSQL Configuration
POSTGRES_USER=database_user
POSTGRES_PASSWORD=database_password
POSTGRES_DB=database_name
POSTGRES_HOST=postgres

README.md

@@ -2,108 +2,130 @@
*Botbot, your not-so-friendly Bot*
A Matrix chat bot that listens to specified rooms, records conversations, leverages OpenAI for AI-driven summaries, and assists with answering questions.
**Botbot** aims to be a multi-service chat assistant for [Matrix](https://matrix.org) that captures room conversations, stores them, and uses Large Language Models to provide concise summaries, answer follow-up questions, and surface action points.
## Objectives
### Roadmap / Aspirational goals
- Record message history in the rooms it participates in.
- Create discussion summaries that capture actionable items, deadlines, and decisions for the subjects discussed in those rooms.
- Use the collected knowledge to answer questions from participants that remain unanswered after some time, and respond when directly addressed via its @handle.
* Persistent conversation store (PostgreSQL) for long-range context.
* Pluggable LLM backends (local models, Ollama, etc.).
* Structured meeting summaries with action items and deadlines.
* Additional chat frontends (Telegram, WhatsApp, …).
---
- Support additional AI backends beyond OpenAI (e.g., local LLMs, alternative APIs).
- Possibly support other chat services beyond Matrix, such as Telegram, Teams, or WhatsApp.
## Architecture
```mermaid
flowchart LR
MS["matrix_service<br/>(Python & nio)"]
AI["ai_service<br/>(FastAPI)"]
Redis["Redis<br/>Deduplicates replies / guarantees idempotency"]
MS -- "HTTP (Bearer token)" --> AI
AI -- "Matrix Events" --> MS
MS -.-> Redis
AI -.-> Redis
```
| Component | Image / EntryPoint | Purpose |
| ------------------- | ------------------------------------- | ------------------------------------------------------------------------------------- |
| **matrix\_service** | `python matrix_service/main.py` | Listens to Matrix rooms, forwards each message to `ai_service`, posts the reply back. |
| **ai\_service** | `python ai_service/main.py` (FastAPI) | Builds a prompt, calls the configured LLM (OpenAI today), caches the reply in Redis. |
| **redis** | `redis:7` | Reply cache & simple key/value store. |
The services talk to each other over the internal Docker network. Authentication between them is a static bearer token (`AI_HANDLER_TOKEN`).
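The bearer-token handshake can be sketched as follows. This is an illustrative snippet, not code from the repo: `build_message_request` is a hypothetical helper, and only the env-var names (`AI_HANDLER_URL`, `AI_HANDLER_TOKEN`) and endpoint path are taken from the project.

```python
# Illustrative sketch (not repo code): how matrix_service could assemble the
# authenticated POST to ai_service. `build_message_request` is hypothetical;
# the env-var names mirror .env.example.
import json

def build_message_request(base_url: str, token: str, event: dict) -> dict:
    """Return the url, headers, and body for the ai_service call."""
    return {
        "url": f"{base_url}/api/v1/message",
        "headers": {
            "Authorization": f"Bearer {token}",   # static shared secret
            "Content-Type": "application/json",
        },
        "body": json.dumps(event),
    }

req = build_message_request(
    "http://ai_service:8000",
    "common_token_here",
    {"roomId": "!foo:matrix.org", "content": "Hello there"},
)
print(req["url"])                          # http://ai_service:8000/api/v1/message
print(req["headers"]["Authorization"])     # Bearer common_token_here
```

Because the token is a static shared secret, anything on the internal Docker network that knows it can call `ai_service`; keep `AI_HANDLER_TOKEN` out of version control.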
---
## Current Features
- **Auto-join Rooms**: Automatically joins rooms when invited.
- **Message Callbacks**: Responds to basic commands:
- `!ping` → `Pong!`
- `hello botbot` → Greeting
* **Auto-Join on Invite**: secure E2EE join, including device verification.
* **Stateless AI handler**: FastAPI endpoint `/api/v1/message` receiving a JSON payload.
* **Idempotent replies**: duplicate Matrix events reuse the cached answer.
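The idempotent-reply idea can be sketched like this: the Matrix event ID keys a cache, so a redelivered event reuses the stored answer instead of triggering a second LLM call. A plain dict stands in for Redis in this illustration; it is not the service's actual code.

```python
# Sketch of the idempotent-reply pattern (illustrative, not repo code):
# a dict stands in for Redis, and a format string stands in for the LLM.
cache: dict[str, str] = {}
llm_calls = 0

def reply_for(event_id: str, prompt: str) -> str:
    global llm_calls
    if event_id in cache:          # duplicate delivery: serve cached answer
        return cache[event_id]
    llm_calls += 1                 # stand-in for the expensive LLM request
    answer = f"echo: {prompt}"
    cache[event_id] = answer
    return answer

first = reply_for("$abc123", "Hello there")
second = reply_for("$abc123", "Hello there")   # same event redelivered
print(first == second, llm_calls)              # True 1
```

The real service would use `redis-py` with the same get-then-set shape, typically adding a TTL so the cache does not grow without bound.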
## Prerequisites
- Docker & Docker Compose (or Python 3.8+)
- A Matrix account for the bot
- OpenAI API key
---
## Installation
1. **Clone the repository**
```bash
git clone https://gitea.alluna.pt/jfig/botbot.git
cd botbot
```
2. **Configure environment variables**
Copy the example file and edit it:
```bash
cp .env.example .env
```
Then open `.env` and set:
```ini
LOG_LEVEL=INFO
HOMESERVER_URL=https://matrix.org
USER_ID=@botbot_user:matrix.org
PASSWORD=your_matrix_password
OPENAI_API_KEY=your_openai_api_key
```
## Usage
### Using Docker Compose (development/hot-reload)
## Quick Start (development)
```bash
docker-compose up --build
# clone & cd
$ git clone https://gitea.alluna.pt/jfig/botbot.git
$ cd botbot
# copy environment template
$ cp .env.example .env
# edit .env with your homeserver, credentials and OpenAI key
# start everything (hot-reload volumes mounted)
$ docker compose up --build
```
### Building and Running Manually (production)
The default compose file launches three containers:
* `matrix_service` watches your rooms
* `ai_service` handles AI prompts
* `redis` reply cache
Stop with <kbd>Ctrl</kbd>+<kbd>C</kbd> or `docker compose down`.
### Production (single image per service)
Build once, then deploy with your orchestrator of choice:
```bash
# Build container
docker build -t botbot .
# Run container
docker run -d --env-file .env \
-v matrix_data:/app/data \
--restart unless-stopped \
botbot
$ docker compose -f docker-compose.yml --profile prod build
$ docker compose -f docker-compose.yml --profile prod up -d
```
## Configuration Options
---
- `LOG_LEVEL`: One of `CRITICAL`, `ERROR`, `WARNING`, `INFO`, `DEBUG`, `NOTSET`.
- `HOMESERVER_URL`: Matrix homeserver endpoint (e.g., `https://matrix.org`).
- `USER_ID`: Bot's full Matrix user ID (e.g., `@botbot_user:matrix.org`).
- `PASSWORD`: Password for the bot account.
- `OPENAI_API_KEY`: API key for OpenAI usage.
## Configuration
## How It Works
All settings are environment variables. The table below reflects the current codebase (commit `ae27a2c`).
1. **Startup**: Loads environment and logs into the Matrix homeserver.
2. **Callbacks**:
- `message_callback`: Handles text messages and triggers AI logic.
- `invite_cb`: Joins rooms on invitation.
3. **AI Integration**: Future development will:
- Pull recent chat history.
- Call OpenAI endpoints to generate summaries or answers.
| Variable                       | Service         | Default                  | Description                           |
| ------------------------------ | --------------- | ------------------------ | ------------------------------------- |
| `LOG_LEVEL`                    | both            | `INFO`                   | Python logging level.                 |
| `MATRIX_HOMESERVER_URL`        | matrix\_service |                          | Matrix homeserver base URL.           |
| `MATRIX_USER_ID`               | matrix\_service |                          | Full user ID of the bot.              |
| `MATRIX_PASSWORD`              | matrix\_service |                          | Password for the bot account.         |
| `MATRIX_LOGIN_TRIES`           | matrix\_service | `5`                      | Number of login attempts before exit. |
| `MATRIX_LOGIN_DELAY_INCREMENT` | matrix\_service | `5`                      | Seconds added per retry.              |
| `AI_HANDLER_URL`               | matrix\_service | `http://ai_service:8000` | Where to POST messages.               |
| `AI_HANDLER_TOKEN`             | both            |                          | Shared bearer token (keep secret).    |
| `OPENAI_API_KEY`               | ai\_service     |                          | Key for the `openai` Python SDK.      |
| `REDIS_URL`                    | ai\_service     | `redis://redis:6379`     | Connection string used by `redis-py`. |
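As a sketch of how a service consumes these settings (illustrative, not repo code; it only uses the stdlib and the defaults from the table):

```python
# Illustrative: reading the table's settings with os.getenv, applying the
# same fallback defaults the services use when a variable is unset.
import os

LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO").upper()
AI_HANDLER_URL = os.getenv("AI_HANDLER_URL", "http://ai_service:8000")
REDIS_URL = os.getenv("REDIS_URL", "redis://redis:6379")

print(LOG_LEVEL, AI_HANDLER_URL, REDIS_URL)
```

Variables without a default in the table (credentials, tokens, API keys) must be set explicitly in `.env`.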
---
## API (ai\_service)
```http
POST /api/v1/message
Authorization: Bearer <AI_HANDLER_TOKEN>
Content-Type: application/json
{
"roomId": "!foo:matrix.org",
"userId": "@alice:matrix.org",
"eventId": "$abc123",
"serverTimestamp": 1714821630123,
"content": "Hello there"
}
```
Returns `{"reply": "Hello Alice!"}` or HTTP 401/500 on error.
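A minimal client-side check of that payload shape can be sketched as follows; `missing_fields` is a hypothetical helper, not part of either service, and only the field names come from the example above.

```python
# Hypothetical client-side validation (not repo code): verify a payload
# carries every field the /api/v1/message example expects.
REQUIRED_FIELDS = {"roomId", "userId", "eventId", "serverTimestamp", "content"}

def missing_fields(payload: dict) -> set[str]:
    """Return the required fields absent from the payload."""
    return REQUIRED_FIELDS - payload.keys()

payload = {
    "roomId": "!foo:matrix.org",
    "userId": "@alice:matrix.org",
    "eventId": "$abc123",
    "serverTimestamp": 1714821630123,
    "content": "Hello there",
}
print(missing_fields(payload))           # set() when the payload is complete
print(missing_fields({"content": "hi"}))
```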
---
## Contributing
1. Fork the repository.
2. Create a feature branch (`git checkout -b feature/xyz`).
3. Commit changes and push (`git push origin feature/xyz`).
4. Open a pull request with a description of your changes.
Pull requests are welcome! Please open an issue first to discuss what you want to change. All source code is formatted with **ruff** / **black**; run `pre-commit run --all-files` before pushing.
---
## License
This project is licensed under the [MIT License](LICENSE).
[MIT](LICENSE)

db_service/Dockerfile

@@ -0,0 +1,20 @@
FROM python:3.11-slim
# Prevent Python from buffering stdout/stderr so logs appear immediately
ENV PYTHONUNBUFFERED=1
# Install system dependencies (none needed for asyncpg on slim)
WORKDIR /app
# Install Python dependencies first to leverage Docker layer caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy service source code only after dependencies
COPY . /app
EXPOSE 8000
# Use Uvicorn as the ASGI server
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

db_service/main.py

@@ -0,0 +1,182 @@
from __future__ import annotations

import logging
import os
from typing import AsyncGenerator

from dotenv import load_dotenv
from fastapi import Depends, FastAPI, Header, HTTPException, status
from pydantic import BaseModel, Field
from sqlalchemy import BigInteger, Column, MetaData, Table, Text, text
from sqlalchemy.dialects.postgresql import insert as pg_insert
from sqlalchemy.exc import OperationalError
from sqlalchemy.ext.asyncio import (
    AsyncEngine,
    AsyncSession,
    async_sessionmaker,
    create_async_engine,
)

# ---------------------------------------------------------------------------
# Environment & logging setup
# ---------------------------------------------------------------------------
load_dotenv()

LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO").upper()
logging.basicConfig(level=LOG_LEVEL, format="%(levelname)s | %(name)s | %(message)s")
log = logging.getLogger("db_service")

LOG_TOKEN = os.getenv("LOG_TOKEN", "changeme")
POSTGRES_USER = os.getenv("POSTGRES_USER", "postgres")
POSTGRES_PASSWORD = os.getenv("POSTGRES_PASSWORD", "postgres")
POSTGRES_DB = os.getenv("POSTGRES_DB", "botbot")
POSTGRES_HOST = os.getenv("POSTGRES_HOST", "postgres")
POSTGRES_PORT = os.getenv("POSTGRES_PORT", "5432")

DATABASE_URL = (
    f"postgresql+asyncpg://{POSTGRES_USER}:{POSTGRES_PASSWORD}"
    f"@{POSTGRES_HOST}:{POSTGRES_PORT}/{POSTGRES_DB}"
)
ADMIN_URL = (
    f"postgresql+asyncpg://{POSTGRES_USER}:{POSTGRES_PASSWORD}"
    f"@{POSTGRES_HOST}:{POSTGRES_PORT}/postgres"
)

# ---------------------------------------------------------------------------
# SQLAlchemy table definition (metadata)
# ---------------------------------------------------------------------------
metadata = MetaData()
messages = Table(
    "messages",
    metadata,
    Column("event_id", Text, primary_key=True),
    Column("room_id", Text, nullable=False),
    Column("user_id", Text, nullable=False),
    Column("ts_ms", BigInteger, nullable=False),
    Column("body", Text, nullable=False),
)

# ---------------------------------------------------------------------------
# FastAPI app
# ---------------------------------------------------------------------------
app = FastAPI(title="Botbot Logging Service", version="1.1.0")


class MessageIn(BaseModel):
    """Payload received from matrix_service."""

    event_id: str = Field(..., example="$14327358242610PhrSn:matrix.org")
    room_id: str = Field(..., example="!someroomid:matrix.org")
    user_id: str = Field(..., example="@alice:matrix.org")
    ts_ms: int = Field(..., example=1713866689000, description="Matrix server_timestamp in ms since epoch")
    body: str = Field(..., example="Hello, world!")


# ---------------------------------------------------------------------------
# Database engine/session factories (populated on startup)
# ---------------------------------------------------------------------------
engine: AsyncEngine | None = None
SessionLocal: async_sessionmaker[AsyncSession] | None = None


async def ensure_database_exists() -> None:
    """Connect to the admin DB and create `POSTGRES_DB` if it is missing."""
    log.info("Checking whether database %s exists", POSTGRES_DB)
    # CREATE DATABASE cannot run inside a transaction block, so use an
    # AUTOCOMMIT connection instead of engine.begin().
    admin_engine = create_async_engine(
        ADMIN_URL, pool_pre_ping=True, isolation_level="AUTOCOMMIT"
    )
    try:
        async with admin_engine.connect() as conn:
            db_exists = await conn.scalar(
                text("SELECT 1 FROM pg_database WHERE datname = :db"),
                {"db": POSTGRES_DB},
            )
            if not db_exists:
                log.warning("Database %s not found, creating it", POSTGRES_DB)
                await conn.execute(text(f'CREATE DATABASE "{POSTGRES_DB}"'))
                log.info("Database %s created", POSTGRES_DB)
    finally:
        await admin_engine.dispose()


async def create_engine_and_tables() -> None:
    """Initialise the SQLAlchemy engine and create the `messages` table if needed."""
    global engine, SessionLocal  # noqa: PLW0603
    engine = create_async_engine(DATABASE_URL, pool_pre_ping=True)
    async with engine.begin() as conn:
        await conn.run_sync(metadata.create_all)
    SessionLocal = async_sessionmaker(engine, expire_on_commit=False)
    log.info("Database initialised and tables ensured.")


async def get_session() -> AsyncGenerator[AsyncSession, None]:
    async with SessionLocal() as session:  # type: ignore[arg-type]
        yield session


# ---------------------------------------------------------------------------
# Lifespan events
# ---------------------------------------------------------------------------
@app.on_event("startup")
async def on_startup() -> None:  # noqa: D401 (imperative mood)
    """Ensure the database *and* table exist before serving traffic."""
    log.info("Starting up")
    try:
        await create_engine_and_tables()
    except OperationalError as err:
        # Common case: the database itself does not yet exist.
        if "does not exist" in str(err):
            log.warning("Primary database missing, attempting to create it")
            await ensure_database_exists()
            # Retry now that the DB exists
            await create_engine_and_tables()
        else:
            log.error("Database connection failed: %s", err)
            raise


@app.on_event("shutdown")
async def on_shutdown() -> None:  # noqa: D401
    if engine:
        await engine.dispose()


# ---------------------------------------------------------------------------
# API endpoints
# ---------------------------------------------------------------------------
@app.get("/healthz", tags=["health"])
async def healthz() -> dict[str, str]:
    return {"status": "ok"}


@app.post("/api/v1/log", status_code=status.HTTP_202_ACCEPTED, tags=["log"])
async def log_message(
    payload: MessageIn,
    x_log_token: str = Header(..., alias="X-Log-Token"),
    session: AsyncSession = Depends(get_session),
) -> dict[str, str]:
    """Persist one Matrix message to Postgres.

    Requires header `X-Log-Token` matching the `LOG_TOKEN` env var.
    """
    if x_log_token != LOG_TOKEN:
        raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="invalid token")
    # `on_conflict_do_nothing` is PostgreSQL-specific, so build the statement
    # with the dialect's insert() rather than the generic Core insert.
    stmt = (
        pg_insert(messages)
        .values(**payload.model_dump())
        .on_conflict_do_nothing(index_elements=[messages.c.event_id])
    )
    await session.execute(stmt)
    await session.commit()
    return {"status": "accepted"}
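The insert-or-ignore semantics of that endpoint can be illustrated with an in-memory stand-in for the `messages` table. This is a sketch only; the real service relies on PostgreSQL's `ON CONFLICT (event_id) DO NOTHING`.

```python
# In-memory illustration (not repo code) of ON CONFLICT DO NOTHING: a second
# log of the same event_id is silently ignored because event_id is the
# primary key. A dict keyed by event_id stands in for the messages table.
table: dict[str, dict] = {}

def log_row(row: dict) -> str:
    if row["event_id"] not in table:   # conflicting primary key -> do nothing
        table[row["event_id"]] = row
    return "accepted"                  # the endpoint replies 202 either way

row = {
    "event_id": "$abc123",
    "room_id": "!foo:matrix.org",
    "user_id": "@alice:matrix.org",
    "ts_ms": 1713866689000,
    "body": "hi",
}
log_row(row)
log_row(row)          # duplicate delivery from matrix_service
print(len(table))     # 1
```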


@@ -0,0 +1,10 @@
# Web framework & ASGI server
fastapi
uvicorn[standard]
# Database access
sqlalchemy[asyncio]>=2.0
asyncpg>=0.29
# Environment / configuration helpers
python-dotenv>=1.0


@@ -1,4 +1,42 @@
services:
  # -----------------------
  # Database (PostgreSQL)
  # -----------------------
  postgres:
    image: postgres:16-alpine
    restart: unless-stopped
    environment:
      - POSTGRES_USER
      - POSTGRES_PASSWORD
      - POSTGRES_DB
    volumes:
      - postgres-data:/var/lib/postgresql/data
    healthcheck:
      test: "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"
      interval: 10s
      timeout: 5s
      retries: 5

  # -----------------------
  # Conversation-logging microservice
  # -----------------------
  db_service:
    build:
      context: ./db_service
    restart: unless-stopped
    environment:
      - POSTGRES_USER
      - POSTGRES_PASSWORD
      - POSTGRES_DB
      - POSTGRES_HOST
      - LOG_TOKEN
    depends_on:
      postgres:
        condition: service_healthy
    ports:
      - "8000:8000"  # expose externally only if needed

  matrix_service:
    build: ./matrix_service
    environment:
@@ -25,6 +63,10 @@ services:
  redis:
    image: redis:7
    restart: unless-stopped
    volumes:
      - redis-data:/data

volumes:
  matrix_data:
  redis-data:
  postgres-data:


@@ -1,4 +1,4 @@
matrix-nio[e2e]>=0.25.0
matrix-nio[e2e]>=0.25.2
python-dotenv>=1.0.0
httpx>=0.23.0
pydantic>=1.10