Systems builder. Tool maker.
Senior backend engineer at Kiwi.com.
Author of Repid. Building dicexdice.io. PyCon speaker.
By the numbers
The project I'm most proud of
Repid is an asyncio-native Python task queue. I built it at Paragraphe, my previous startup, because scaling Celery to our scraping volume meant infra costs we couldn't justify. Today it runs in production at Kiwi.com, processing 5M+ messages daily. In benchmarks it's 23× faster than Celery on I/O-bound work. But speed isn't why people pick it up:
- Generates AsyncAPI 3.0 schemas automatically from your actors. Your task queue is self-documenting.
- Strict Pydantic validation of payloads and headers before execution. Bad messages fail fast.
- Dependency injection in actor signatures. Testable by default.
- Producer-only, consumer-only, or fully end-to-end. It fits in your current infra.
- InMemoryServer for unit tests. No broker required.
- Wide broker support out-of-the-box: RabbitMQ, NATS, Redis, Google Pub/Sub, SQS, Kafka.
from repid import Repid, Router, InMemoryServer

app = Repid()
app.servers.register_server(
    "default",
    InMemoryServer(),
    is_default=True,
)

router = Router()

@router.actor(channel="articles")
async def fetch_article(url: str) -> None:
    article = await scraper.fetch(url)
    await db.save(article)

app.include_router(router)

# Enqueue from anywhere:
# payload validated by Pydantic
await app.send_message_json(
    channel="articles",
    payload={"url": "https://example.com"},
    headers={"topic": "fetch_article"},
)
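To give a feel for why asyncio suits I/O-bound work, here is a minimal standard-library sketch. It does not use Repid and is not the benchmark behind the 23× figure. The `fake_fetch` coroutine and its 50 ms delay are assumptions standing in for a real network call. It compares awaiting calls one by one with running them concurrently on a single event loop:

```python
import asyncio
import time


async def fake_fetch(url: str, delay: float = 0.05) -> str:
    # Hypothetical stand-in for a network request: sleeping yields control
    # to the event loop, just as waiting on a socket would.
    await asyncio.sleep(delay)
    return f"fetched {url}"


async def run_sequential(urls: list[str]) -> list[str]:
    # Each call waits for the previous one to finish.
    return [await fake_fetch(url) for url in urls]


async def run_concurrent(urls: list[str]) -> list[str]:
    # All calls wait at the same time; gather preserves input order.
    return list(await asyncio.gather(*(fake_fetch(url) for url in urls)))


def timed(coro_fn, urls: list[str]) -> tuple[list[str], float]:
    # Run a coroutine function to completion and measure wall-clock time.
    start = time.perf_counter()
    results = asyncio.run(coro_fn(urls))
    return results, time.perf_counter() - start


URLS = [f"https://example.com/{i}" for i in range(20)]

if __name__ == "__main__":
    _, seq_time = timed(run_sequential, URLS)
    _, con_time = timed(run_concurrent, URLS)
    print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

The sequential run takes roughly the sum of all delays, while the concurrent run takes roughly the longest single delay. Real speedups depend on the broker, the network, and the workload.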
Why me, specifically
I build infrastructure other teams run on, and I know what it costs
Repid replaced Celery at a pre-revenue startup once Celery's infrastructure costs became unsustainable. It is now adopted at Kiwi.com for production workloads. At dicexdice.io I run 99.9% uptime on Hetzner with a minimal budget. I don't separate engineering decisions from economic constraints. Infrastructure cost is a design requirement.
I learned resilience on hardware before applying it in the cloud
I started with embedded C and PCB design, where a bug doesn't throw an exception; it bricks a device. That mindset transferred: when I redesigned Kiwi.com's auth backend, I assumed it would be attacked and planned for how it could fail. It passed multiple external security audits and keeps 99.999% uptime. I design for failure modes first, not as an afterthought.
I make teams ship faster and sleep better
Cut CI/CD deploy time from 15 minutes to 90 seconds. Built the observability stack for 10 microservices: distributed tracing, structured logging, and dashboards, giving the team visibility it didn't have before. Drove a 20× reduction in weekly error rate through systematic root cause analysis. We went from being paged several times a week to once every few months.
I build consensus, then systems
Authored RFCs that aligned frontend, mobile, security, and data science teams behind a single architecture. Led 10+ cross-company initiatives affecting millions of monthly active users, secured long-term stakeholder buy-in, and shipped on schedule. Spoke at PyCon Poland and Lithuania. Writing the code is rarely the hard part; getting five teams to agree on what to build and why is. I do both.
Where I've worked
Kiwi.com · one of the biggest flight aggregators in Europe · kiwi.com
Senior Software Engineer, Customer-Core Team (formerly Account Team) – present ~4 years
- Redesigned auth backend through end-to-end system design and API contract overhaul, cutting 40ms from every page request and reducing core service load by 70% via optimistic caching. Passed multiple external security audits and reduced scraping and account takeover attacks by 90%.
- Built a customer 360° data platform on Google Pub/Sub using CQRS projections, unifying user data across 10+ microservices for real-time personalization.
- Maintained a 99.99999% success-rate SLO (error-rate target) and a 99.995%+ uptime SLO across 600+ production deployments. Drove a 20× reduction in weekly error rate through systematic root cause analysis and performance optimization.
- Championed reliability engineering across services. Built full observability stack for 10 microservices with distributed tracing, structured logging, and Datadog/Grafana dashboards.
- Resolved 5+ critical production incidents with significant revenue impact; improved on-call processes and runbooks as part of broader reliability engineering efforts.
- Iteratively optimized PostgreSQL schema design and query patterns, achieving 10× storage reduction and 2× latency improvement.
- Cut deployment time from 15 minutes to 90 seconds by redesigning CI/CD pipelines with Docker, Kubernetes, and containerization best practices.
- Led 10+ cross-company initiatives through RFCs and architecture reviews, aligning frontend, mobile, security, and data science teams to ship the auth redesign on schedule.
- Integrated AI-powered tooling into development workflows and introduced OpenAPI schemas to standardize service interfaces and enforce API design contracts.
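For a sense of scale, the uptime figures quoted on this page (99.9%, 99.995%, 99.999%) can be turned into downtime budgets. This is a small arithmetic sketch, assuming a 365-day measurement window and treating the SLO as a simple fraction of wall-clock time. The `downtime_budget` helper is a name introduced here for illustration only:

```python
from datetime import timedelta


def downtime_budget(
    slo_percent: float,
    period: timedelta = timedelta(days=365),  # assumed measurement window
) -> timedelta:
    """Return the maximum downtime allowed by an uptime SLO over a period."""
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in the range (0, 100]")
    # The budget is the share of the period that may be spent unavailable.
    return period * ((100 - slo_percent) / 100)


if __name__ == "__main__":
    for slo in (99.9, 99.995, 99.999):
        minutes = downtime_budget(slo).total_seconds() / 60
        print(f"{slo}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, 99.9% allows roughly 8.76 hours of downtime per year, 99.995% roughly 26 minutes, and 99.999% roughly 5 minutes.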
Dice x Dice · LLM-assisted D&D-style game master · dicexdice.io
Founder & Backend Engineer (side project) – present ~10 months
- Architected and shipped an LLM-powered tabletop RPG game master serving personalized campaigns via prompt routing, state machines, and game-state projections.
- Built the entire auth stack from scratch using Passkeys and OIDC/OAuth2, enabling passwordless login and eliminating credential-spraying attack vectors.
- Built a GenAI observability pipeline tracking token usage, latency distributions, and prompt tracing, cutting LLM failure debugging time from hours to minutes.
- Operate production infrastructure on Hetzner (Kubernetes, PostgreSQL, MongoDB, Redis, RabbitMQ, Grafana stack), maintaining 99.9% uptime on a minimal budget.
Paragraphe · news aggregation startup · TechnoSpark-backed
Founder & Backend Engineer – 1 year
- Architected and deployed a scalable RESTful API using Python and FastAPI with asyncio, achieving 100+ RPS per instance at sub-50ms latency.
- Built horizontally scalable web and RSS scrapers processing 10,000+ pages daily to feed a personalization pipeline; powered a recommendation engine that classified users by content category and delivered daily top-10 article recommendations.
- Deployed and operated production HA infrastructure using Kubernetes, Docker, Consul, Vault via Infrastructure-as-Code (Terraform, Ansible). CI/CD deploys took ~3 minutes from day one.
- Created Repid, an asyncio-native task queue, to replace Celery in the scraping pipeline, delivering 23× higher throughput on I/O-bound workloads and cutting infrastructure costs.
- Developed a cross-platform mobile app with Flutter, shipping from system design through beta testing with real users.
FGD · indie hypercasual game studio
Lead Developer – 6 months
- Led development of 2 hypercasual mobile games in the Godot engine as part of a 15-person team.
- Drove the team to 2nd place at a GameJam competition.
- Introduced GitLab CI/CD pipelines with automated testing and deployment, accelerating the team's iteration cycles.
- Performed 200+ code reviews and refactoring contributions across the codebase, improving overall code quality.
Freelance · various clients · Python + embedded
Python developer · embedded software & hardware – ~2 years
- Implemented an automated proxy rotation pool to bypass geo-restrictions, improving scraper reliability and availability.
- Built an automated local weather forecast analytics and prediction pipeline, collecting ~1,400 data points daily and generating visual reports.
- Created and maintained 5+ Telegram bots using both raw API and framework-based approaches, sustaining 99.9% uptime.
- Established CI/CD pipelines including Docker-based cross-compiled arm64 image builds using Docker Buildx, applying containerization best practices.
- Developed embedded firmware for STM32/AVR, designed KiCad PCBs, and built BLE/Wi-Fi devices and small robotics projects.
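The proxy rotation pool mentioned in the list above can be sketched as follows. This is a minimal illustration and not the original implementation. It assumes round-robin rotation and eviction after a fixed number of consecutive failures; the `ProxyPool` class, its method names, and the `max_failures` threshold are all hypothetical:

```python
from collections import deque


class ProxyPool:
    """Round-robin proxy pool that evicts proxies after repeated failures."""

    def __init__(self, proxies: list[str], max_failures: int = 3) -> None:
        unique = list(dict.fromkeys(proxies))  # drop duplicates, keep order
        self._queue: deque[str] = deque(unique)
        self._failures: dict[str, int] = {proxy: 0 for proxy in unique}
        self._max_failures = max_failures

    def acquire(self) -> str:
        # Hand out the proxy at the front, then move it to the back.
        if not self._queue:
            raise RuntimeError("no healthy proxies left")
        proxy = self._queue[0]
        self._queue.rotate(-1)
        return proxy

    def report_success(self, proxy: str) -> None:
        # A success resets the consecutive-failure counter.
        if proxy in self._failures:
            self._failures[proxy] = 0

    def report_failure(self, proxy: str) -> None:
        # Evict the proxy once it reaches max_failures failures in a row.
        if proxy not in self._failures:
            return
        self._failures[proxy] += 1
        if self._failures[proxy] >= self._max_failures:
            self._queue.remove(proxy)
            del self._failures[proxy]


if __name__ == "__main__":
    pool = ProxyPool(["proxy-a:8080", "proxy-b:8080"], max_failures=1)
    first = pool.acquire()
    pool.report_failure(first)  # evicted immediately with max_failures=1
    print(pool.acquire())
```

A scraper would call `acquire()` before each request and report the outcome afterwards, so unhealthy proxies drop out of rotation without manual intervention.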
What I work with, and what I want to work with
Daily drivers
Also comfortable with
Languages I'm exploring besides Python
Where I'd like to be based
Public speaking
Also: Python Belgrade Meetup 2022.
Writing
I write on my blog, mostly about Python, async patterns, distributed systems, and what I've broken lately.
Open source
Beyond Repid, I've contributed to OpenTelemetry, Pydantic, and a handful of smaller libraries. I also publish bots, scrapers, some custom PCBs, and robotics projects. Find them on GitHub.
Spoken languages
Russian (native) · English (fluent) · French (intermediate) · Serbian (beginner)
Want to talk?
I'm looking for senior or staff backend roles on a product people actually use, where I can drive meaningful work across code, infra, and cross-team delivery.
Aleksandr Sulimov · me@aleksul.space · Belgrade, Serbia