Building a Municipal AI Complaint System with FastAPI, ML, PostgreSQL, and Docker

I have been working on a project that I think is actually useful in a real-world context: a municipal complaint management system that uses machine learning to automatically categorize, score, and prioritize citizen-reported issues. Think pothole reports, drainage problems, broken streetlights. Instead of someone manually triaging these, the system does it for you.
In this post I want to walk through how I built the backend engine using FastAPI, connected it to a trained ML model, set up PostgreSQL for persistence, and wrapped the whole thing in Docker so it is easy to run anywhere.
What the system actually does
At a high level, a citizen submits a complaint through the frontend. The backend receives it and runs it through two ML models, one to classify the issue category and one to predict its priority level. The priority prediction is enriched with geospatial context, which comes from querying nearby hospitals and schools through OpenStreetMap's Overpass API. The final result is stored in PostgreSQL and returned to the frontend with a full priority breakdown.
The reason I added geospatial context is that a broken road near a hospital is genuinely more urgent than one in a low-traffic area. That kind of signal is not in the complaint text, so I had to pull it from the map.
The ML prediction engine
The core logic lives in predictor.py. When a complaint comes in, it does four things in sequence: encodes the issue type, geocodes the location string into lat/lon coordinates, queries nearby amenities, and then runs the two models.
import random

async def predict_complaint(data, citizen_count):
    issue_type_encoded = encode_issue_type(data.issue_type)
    lat, lon = await geocode_address(data.location)
    nearby = await get_nearby_places(lat, lon)

    # Model 1: classify the issue category from the free-text description.
    predicted_category = category_model.predict([data.issue_description])[0]

    # Keyword-based severity: keywords pick a band, the value inside it is random.
    desc = data.issue_description.lower()
    if any(w in desc for w in ['accident', 'dangerous', 'injured', 'flooding', 'fire', 'collapse']):
        severity = random.randint(8, 10)
    elif any(w in desc for w in ['overflowing', 'broken', 'large', 'not working']):
        severity = random.randint(5, 7)
    else:
        severity = random.randint(1, 4)

    # Area importance: hospitals outrank schools, everything else gets a low score.
    hospital_count = nearby["hospital_count"]
    school_count = nearby["school_count"]
    if hospital_count > 0:
        area_imp = 10
    elif school_count > 0:
        area_imp = 9
    else:
        area_imp = random.randint(2, 5)

    # Model 2: predict the priority level from the engineered features.
    features = [[severity, area_imp, citizen_count, issue_type_encoded]]
    priority_level = priority_model.predict(features)[0]

    # Scale the report count to roughly 0-10 before combining the weighted terms.
    norm_count = (citizen_count / 50) * 10
    priority_score = round((0.4 * severity) + (0.3 * norm_count) + (0.3 * area_imp), 2)

    return { ... }
The priority score formula gives 40% weight to severity, 30% to citizen report count, and 30% to area importance. The report count is first scaled by dividing by 50 and multiplying by 10, so all three terms sit on a comparable scale. That weighting was a deliberate design decision: severity matters most, but a complaint reported by 30 people carries more weight than one reported by a single person, even if the descriptions sound similar.
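To make the weighting concrete, here is a minimal sketch of the score as a standalone function. The name `compute_priority_score`, the `max_reports` parameter, and the clamping step are assumptions added for illustration; the code above divides by 50 without clamping, so more than 50 reports would push the normalized count past 10.

```python
def compute_priority_score(
    severity: float,
    citizen_count: int,
    area_imp: float,
    max_reports: int = 50,
) -> float:
    """Weighted score: 40% severity, 30% report count, 30% area importance."""
    # Assumption: clamp the report count so its normalized value stays in [0, 10].
    capped = min(max(citizen_count, 0), max_reports)
    norm_count = (capped / max_reports) * 10
    return round(0.4 * severity + 0.3 * norm_count + 0.3 * area_imp, 2)


if __name__ == "__main__":
    # Severity 8, 30 reports, hospital nearby (area importance 10):
    # 0.4 * 8 + 0.3 * 6 + 0.3 * 10 = 3.2 + 1.8 + 3.0
    print(compute_priority_score(8, 30, 10))
```

With these inputs the three terms contribute 3.2, 1.8, and 3.0, so a well-reported, severe complaint near a hospital lands near the top of the scale.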
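One thing I want to be transparent about: the severity scoring right now is keyword-based, not model-based. The keywords select a band (8 to 10, 5 to 7, or 1 to 4) and the value within that band is drawn at random, so the same description can score differently between runs. That is a known limitation. The idea is to eventually replace this with a fine-tuned text classifier that scores severity directly from the description.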
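Until a classifier exists, one interim option is to make the keyword rule deterministic so that identical descriptions always get the same severity. This is a minimal sketch, not the project's code: the function name `keyword_severity` and the fixed scores 9, 6, and 3 are hypothetical choices, and the keyword lists are copied from the snippet above.

```python
HIGH_KEYWORDS = ("accident", "dangerous", "injured", "flooding", "fire", "collapse")
MEDIUM_KEYWORDS = ("overflowing", "broken", "large", "not working")


def keyword_severity(description: str) -> int:
    """Map a complaint description to a fixed severity score per keyword band."""
    desc = description.lower()
    # Substring matching, same as the original rule ("fire" also matches "fired").
    if any(word in desc for word in HIGH_KEYWORDS):
        return 9  # assumed representative value for the 8-10 band
    if any(word in desc for word in MEDIUM_KEYWORDS):
        return 6  # assumed representative value for the 5-7 band
    return 3      # assumed representative value for the 1-4 band


if __name__ == "__main__":
    for text in ("Road collapse near the bridge", "Streetlight not working", "Graffiti on a wall"):
        print(text, "->", keyword_severity(text))
```

A deterministic rule is also easier to unit test, which helps when comparing it against a learned classifier later.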
Geocoding and the Overpass API
To get coordinates from a location string, I used OpenStreetMap's Nominatim endpoint. From there, I query the Overpass API within a 500-meter radius to count hospitals and schools.
import httpx

async def get_nearby_places(lat: float, lon: float, radius: int = 500):
    # Overpass QL: union of hospital and school nodes within `radius` meters.
    query = f"""
    [out:json];
    (
      node["amenity"="hospital"](around:{radius},{lat},{lon});
      node["amenity"="school"](around:{radius},{lat},{lon});
    );
    out;
    """
    async with httpx.AsyncClient(timeout=10.0) as client:
        response = await client.post(OVERPASS_URL, data=query, headers={...})
        data = response.json()

    # Split the returned elements by their amenity tag.
    hospitals, schools = [], []
    for element in data.get("elements", []):
        tags = element.get("tags", {})
        if tags.get("amenity") == "hospital":
            hospitals.append(element)
        elif tags.get("amenity") == "school":
            schools.append(element)

    return {
        "hospitals": format_places(hospitals),
        "schools": format_places(schools),
        "hospital_count": len(hospitals),
        "school_count": len(schools)
    }
I also handle geocoding failures gracefully: if lat/lon comes back as None, the function returns empty lists with zero counts so the rest of the prediction pipeline does not break.
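The fallback described above can be sketched as two small helpers. This is an illustration under assumptions, not the project's code: `parse_nominatim_results`, `empty_nearby`, and `nearby_or_empty` are hypothetical names, and the fetch function is passed in so the sketch runs without any network access. It relies on Nominatim's JSON search output being a list of results whose `lat` and `lon` fields are strings.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Tuple

Coordinates = Tuple[Optional[float], Optional[float]]


def parse_nominatim_results(results: list) -> Coordinates:
    """Return (lat, lon) from the first search result, or (None, None) on failure."""
    if not results:
        return None, None
    try:
        return float(results[0]["lat"]), float(results[0]["lon"])
    except (KeyError, TypeError, ValueError):
        return None, None


def empty_nearby() -> dict:
    """Same shape as a successful lookup, with nothing found."""
    return {"hospitals": [], "schools": [], "hospital_count": 0, "school_count": 0}


async def nearby_or_empty(
    lat: Optional[float],
    lon: Optional[float],
    fetch: Callable[[float, float], Awaitable[dict]],
) -> dict:
    """Skip the remote lookup entirely when geocoding produced no coordinates."""
    if lat is None or lon is None:
        return empty_nearby()
    return await fetch(lat, lon)
```

Keeping the empty result in the same shape as a real one means the downstream code can read `hospital_count` and `school_count` without any special-casing.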
Database schema
For persistence I used PostgreSQL with asyncpg. The schema is straightforward: one table that stores the full prediction output alongside the original complaint data.
async def create_tables(db_pool):
    query = """
    CREATE TABLE IF NOT EXISTS complaints (
        id SERIAL PRIMARY KEY,
        report_id VARCHAR(50) UNIQUE,
        issue_description TEXT NOT NULL,
        issue_type VARCHAR(100),
        priority_level VARCHAR(20),
        priority_score FLOAT,
        severity_score FLOAT,
        area_importance FLOAT,
        citizen_reports_count INT,
        location VARCHAR(255),
        issue_status VARCHAR(50) DEFAULT 'submitted',
        report_datetime TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
    """
    async with db_pool.acquire() as conn:
        await conn.execute(query)
        print("Database tables checked/created.")
I call this function at application startup via FastAPI's lifespan event, so the table is always guaranteed to exist before the first request comes in. The issue_status field defaults to submitted and is meant to be updated as the municipality acts on the complaint.
Dockerizing the setup
The Dockerfile is pretty lean. I am using the official Python 3.13 slim image, copying in requirements, then the app code, and starting uvicorn.
FROM python:3.13-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]
And the compose file wires the backend to a PostgreSQL 15 container.
version: "3.9"
services:
  backend:
    build: .
    container_name: municipal_backend
    ports:
      - "8000:8000"
    volumes:
      - .:/app
    env_file:
      - .env
    depends_on:
      - db
  db:
    image: postgres:15
    container_name: municipal_db
    restart: always
    environment:
      POSTGRES_USER: wajahat
      POSTGRES_PASSWORD: municipal123
      POSTGRES_DB: municipal_db
    ports:
      - "5433:5432"
Note that I mapped host port 5433 to the container's 5432 instead of using 5432 on both sides. I already had a local PostgreSQL instance running on 5432, so this avoids a port conflict without touching the container config at all. The mapping only affects connections from the host; inside the Compose network the backend still reaches Postgres at db:5432.
One thing to keep in mind: the depends_on field only waits for the container to start, not for Postgres to be ready to accept connections. In production I would add a healthcheck or a retry loop in the app startup to handle that gap.
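A retry loop for that gap could look like the following minimal sketch. The helper name `connect_with_retry` and its parameters are hypothetical. It takes any zero-argument coroutine function, so it does not depend on asyncpg directly; in the real app the callable would presumably wrap the pool creation, and `retry_on` would need to include whichever driver exceptions show up while Postgres is still starting.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def connect_with_retry(
    connect: Callable[[], Awaitable[T]],
    attempts: int = 5,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
) -> T:
    """Await `connect()` until it succeeds, doubling the delay after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await connect()
        except retry_on:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            # Exponential backoff: base_delay, 2x, 4x, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

With the defaults this waits 0.5, 1, 2, and 4 seconds between the five attempts, which covers a typical Postgres container start without masking a genuinely broken configuration for long.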
Connecting the frontend
The frontend is a React app that hits the FastAPI endpoints. The main flow is: the user fills out a complaint form, it posts to /predict, and the response comes back with the full priority breakdown including the nearby places data and the map coordinates, which I render on an OpenStreetMap embed.
Because the prediction involves two async HTTP calls (geocoding + Overpass), I made sure the FastAPI route is fully async end-to-end. Blocking anywhere in that chain would tank response times under load.
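One place where blocking can sneak in is model inference: a synchronous `predict` call inside an `async def` runs on the event loop thread and holds up other requests until it returns. A minimal sketch of moving such a call to a worker thread follows. `SlowModel` is a stand-in with an artificial delay, not the project's model, and `predict_off_loop` is a hypothetical helper name.

```python
import asyncio
import time


class SlowModel:
    """Stand-in for a model exposing a blocking predict() method (hypothetical)."""

    def predict(self, rows):
        time.sleep(0.05)  # simulate slow, synchronous inference
        return ["High" for _ in rows]


async def predict_off_loop(model, rows):
    # asyncio.to_thread (Python 3.9+) runs the blocking call in a worker thread,
    # so the event loop can keep serving other requests in the meantime.
    return await asyncio.to_thread(model.predict, rows)


async def main():
    model = SlowModel()
    # Two predictions issued together; neither blocks the event loop.
    return await asyncio.gather(
        predict_off_loop(model, [[8, 10, 30, 2]]),
        predict_off_loop(model, [[3, 4, 1, 0]]),
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Whether this helps in practice depends on how long inference actually takes; for very fast models the thread hop may not be worth it.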
What I would do differently next time
A few honest things I noted while building this:
The keyword-based severity scoring should be replaced with a proper classifier. It works well enough for demo purposes, but it is fragile.
The Overpass API can be slow. Caching the nearby places result by lat/lon would make a big difference in response times.
The depends_on issue in Docker Compose is a known gotcha that should be handled with a proper startup probe rather than hoping Postgres is ready in time.
Credentials currently live in the .env file, and the Postgres credentials are also written directly in the compose file. That is fine locally, but they should be moved to a secrets manager before this goes anywhere near production.
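For the caching idea in the list above, a minimal in-memory sketch might look like this. The class name `NearbyCache`, the one-hour TTL, and the rounding precision are assumptions chosen for illustration. Rounding to three decimal places groups coordinates into cells of roughly 100 meters in latitude, so complaints from the same block share one Overpass lookup.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Dict, Tuple


class NearbyCache:
    """TTL cache for nearby-places lookups, keyed by rounded (lat, lon)."""

    def __init__(
        self,
        fetch: Callable[[float, float], Awaitable[Any]],
        ttl_seconds: float = 3600.0,
        precision: int = 3,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._precision = precision
        self._entries: Dict[Tuple[float, float], Tuple[float, Any]] = {}

    async def get(self, lat: float, lon: float) -> Any:
        key = (round(lat, self._precision), round(lon, self._precision))
        now = time.monotonic()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # fresh entry: skip the remote call
        value = await self._fetch(lat, lon)
        self._entries[key] = (now, value)
        return value
```

This cache lives inside a single process, so it is not shared between multiple workers and is lost on restart; a shared store would be needed for that.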
The full code is on my GitHub. If you have questions or feedback, drop them in the comments. I am always happy to talk through the decisions I made here.
GitHub link for the backend: https://github.com/saeed-taj/municipal-ml-system.git
GitHub link for the frontend: https://github.com/saeed-taj/municipal-ml-system-frontend.git



