Lubo Bali - Senior Data & AI Platform Engineer | Builder of Production AI Systems
Chicago, IL • (312)-358-0008 • data@lubobali.com
- 3+ years data & AI engineering + 12 years financial operations ($500K+ annual revenue)
- Endorsed by Zach Wilson (ex-Netflix, Airbnb, Meta): "Best project in the bootcamp, ready to make money"
- Shipping production multi-tenant B2B SaaS at HelloPayments and solo-built LuBot.ai (168K lines, 9 NVIDIA models, 6,683 tests)
WORK EXPERIENCE
HELLOPAYMENTS
Senior Data & AI Platform Engineer Mar 2026 - Present
- Lead engineer for AutomateWithUs, a multi-product B2B SaaS for Independent Sales Organizations (ISOs) in payment processing. Own backend, frontend, and architecture for the flagship product DataSwim, an automated residual and commission engine replacing manual Excel workflows. 588 backend tests at 91% coverage, 102 frontend tests at 90% coverage.
- Built the DataSwim Residual Engine end-to-end: pure compute layer (residual math, rules repo, ledger writer, scheduler) and 7 API endpoints (compute, rules CRUD, statements, agent commissions, monthly performance, portfolio valuation). Live pilot tenant has 6 months of ledger and $31K+ in computed agent splits.
- Engineered the AI Column Mapper service: LLM-assisted mapping of unknown processor CSV schemas to canonical columns with auto-apply or re-propose. Universal versioned column-map schema means zero code change to onboard a new payment processor.
- Built a 3-tier LLM gateway on LiteLLM: self-hosted GPU server (DeepSeek R1 70B via Tailscale, $0) -> NVIDIA NIM free tier -> OpenRouter paid fallback. 80% of analytics queries served from SQL templates at $0 LLM cost.
- Designed and shipped multi-tenant SaaS architecture for the payment-industry use case: physical per-customer database isolation (one Postgres per tenant, one schema per product). Approved by CEO as the foundation for 6 future products. Compliance-ready (SOC 2 / PCI / HIPAA scope).
- Migrated PaymentsHub processor ingestion from a brittle Playwright/Chromium scrape to Keycloak OIDC + REST API with token refresh. Eliminated an entire class of reliability incidents.
- Rebuilt team management on Clerk-native APIs: 12+ endpoints, append-only audit log, webhook handlers for organization invitations. Resolved multiple race-condition bugs in production.
- Shipped 4 customer-facing dashboards in one sprint (Portfolio Valuation, Monthly Performance, per-merchant Statement, Agent Commissions). Caught a residual-math bug (annual vs monthly multiplier) that would have understated portfolio value across all customers.
- Authored the company's Platform Security Architecture v1.1 aligned to NIST CSF 2.0, OWASP LLM Top 10, and CIS Controls v8.1. Hardened deployment with container security, encrypted credentials, and svix webhook signature verification.
- Led solo incident response on a production host compromise (SSH brute-force into a Go-based botnet): contained the attack, removed malware, ran a 12-point persistence audit, rotated 13 production secrets, hardened SSH to Tailscale-only access, and authored a reusable Linux hardening protocol now used as the baseline for new servers.
- Tech stack: Clerk, LiteLLM, OpenRouter, Keycloak OIDC, Playwright, Next.js, Resend, Hypothesis
LUBOT.AI
Founder & Lead Engineer Jun 2025 - Present
- Solo-built a production-ready multi-tenant AI SaaS platform. 168K lines of production code, 6,683 tests at zero failures, CI green on every push. Two product lines on one platform: SMB AI analytics and AI Financial Advisor (Stock & Crypto).
- Engineered the LLM 120B Smart Router with a 4-tier intent cascade (Deterministic -> Regex -> Embeddings -> NVIDIA Nemotron 3 Super 120B function calling). 95%+ of queries routed at zero LLM cost; 80% of analytics queries served from SQL templates.
- Architected a Netflix-style personalization engine across 9 phases: implicit signal columns (dwell time, query source, mode-switch intent), nightly engagement-metrics worker, conversation middleware, and follow-up rewriting. Agent learns each user's behavior over time.
- Built 9 intelligence modules with a zero-hallucination pipeline and source citations (anomaly detection, driver analysis, Prophet forecasting, concentration risk, FRED macro, options flow, sector rotation, sentiment, catalysts). All arithmetic pre-computed in Python; LLM only writes narrative around named fields.
- Designed the multi-mode SaaS platform powering both product lines on shared infrastructure: identity SSO, central control plane, per-org data stores across 4 schemas, Stripe-webhook provisioning, trial expiry, per-org quota enforcement, usage event writer. 60+ commits across 7 phases.
- Maintained shipping velocity through a 15-step refactor of an 8,000-line monolith into 32 focused modules at 100% test coverage. Zero regressions; new features shipped in parallel during the refactor.
- 100% NVIDIA Nemotron stack with 9 models: Ultra 253B (PhD analysis), Super 120B-a12b (1M-context MoE, agentic), Nano 8B (fast analytics), Vision 12B VL, NV-EmbedQA-E5-V5 (1024-dim embeddings for RAG), plus self-hosted Nemotron-3-Nano 30B and Nemotron-mini 2.7B on RTX 4090.
- Built production observability and security: Sentry, Langfuse LLM tracing, Prometheus + Grafana metrics on every route, Uptime Kuma, Request-ID middleware, structured JSON logs, RS256 JWT verification, distroless containers + CIS hardening, svix webhook signature verification.
- Deployed on Hetzner VPS with self-hosted CI/CD (Forgejo Actions + GitHub mirror, Tailscale-secured staging-to-prod pipeline, pip + npm caching, real PostgreSQL service container in CI). 6,600+ tests run on every push.
- Tech stack: NVIDIA Nemotron stack, FAISS, AdalFlow, Langfuse, Sentry, Prometheus + Grafana, Vite + React 19, TradingView Charts
REMAX
Data Engineer Jan 2024 - Feb 2025
- Built a Python ETL pipeline processing 10K+ residential property listings from MLS APIs into MySQL. Real estate agents needed fast access to market changes.
- Cut 3+ hours of daily manual data work to 15 minutes with automated refreshes and validation checks.
- Created Tableau dashboards tracking property inventory and pricing trends across 20+ neighborhoods. Helped 15+ agents spot opportunities faster than competitors.
- Tech stack: Python, MySQL, Tableau, MLS APIs
2828 ARTHINGTON CONDOMINIUM ASSOCIATION
Financial Data Manager Jan 2013 - Jan 2024
- Tracked $500K+ annual revenue across 50 units for 12 consecutive years. Managed financial operations using SQL and Excel for lease agreements, vendor contracts, and budget planning.
- Cut month-end close from 3 days to under 2 days by automating reconciliation in Excel. Eliminated manual errors from copy-paste workflows.
- Tech stack: PostgreSQL, MySQL, SQL, Excel, Power BI, Tableau, financial reporting
TECHNICAL PROFICIENCIES
Data & Pipelines: PostgreSQL (Neon multi-tenant), Snowflake, SQL, dbt, Databricks, Delta Lake, Apache Iceberg, Airflow, Soda, Apache Spark, PySpark, Flink, Kafka, Databricks DLT, Python, TypeScript, FastAPI, REST APIs, SSE streaming, Redis
AI/ML: NVIDIA Nemotron (Ultra 253B, Super 120B-a12b, Nano 8B, Vision 12B VL, NV-EmbedQA-E5-V5), NIM API, LiteLLM, Ollama (DeepSeek R1 70B), OpenRouter (Claude Sonnet 4.5), AdalFlow, FAISS, Prophet, Langfuse
Multi-Tenant SaaS: Clerk SSO (RS256 JWT), Supabase control plane, per-tenant database isolation, organization-scoped RBAC, Stripe webhooks, Svix signature verification
Frontend: Vite, React 19, Next.js, TypeScript, Tailwind v4, shadcn/ui, TanStack Query, Zustand, Plotly, TradingView Lightweight Charts
Infrastructure: Hetzner VPS, Linux, Nginx, Docker (distroless), Tailscale, Forgejo Actions (self-hosted CI/CD), Backblaze B2, Git
Tooling: Sentry, Langfuse, Prometheus, Grafana, Uptime Kuma, structured JSON logging, Request-ID middleware, Pytest, Hypothesis (property-based), Vitest, mypy --strict, Ruff, pre-commit hooks, branch coverage gates
Security: NIST CSF 2.0, NIST IR 8596 (AI), OWASP LLM Top 10, OWASP Agentic Apps 2026, MITRE ATLAS v5.1, CIS Controls v8.1, Zero Trust (NIST SP 800-207), Keycloak OIDC, encrypted credentials, incident response
Data Patterns: CDC, SCD, Dimensional Modeling, Growth Accounting, Write-Audit-Publish, Medallion Architecture
EDUCATION
Associate of Applied Science, Computer Systems Graduated May 2026
Lincoln Land Community College
AI Engineering Track (Highest Distinction) May 2026
DataExpert.io
Data Engineering Track (Highest Distinction) Apr 2026
DataExpert.io
Analytics Engineering Track (Highest Distinction) Mar 2026
DataExpert.io
PROJECTS
SENTINEL-AI - Production AI Inbox Triage Agent (Public on GitHub, LinkedIn-launched)
- Real-time email triage with allowlist, prompt-injection defense, idempotency, and fail-closed safety. 181 tests, 93% coverage, ruff + mypy --strict + bandit + pip-audit clean.
- Tech stack: Python 3.13, FastAPI, Gmail API, Google Calendar API, NVIDIA NIM fallback, Postgres, Hetzner
- GitHub: github.com/lubobali/Sentinel-AI
MERGEAI - 5-Agent AI Data Analyst (Scored 100/100)
- Upload any CSV, ask questions in plain English. 3 NVIDIA agents collaborate to write SQL, validate results, and return insights.
- Tech stack: TypeScript, Next.js 15, NVIDIA NIM API, Neon PostgreSQL, Drizzle ORM, Vercel
- Live: merge-ai-omega.vercel.app | Demo: youtu.be/Yr0CkXKNF0M | GitHub: github.com/lubobali/mergeAI
BITCOIN-DLT-STREAMING-PIPELINE - Real-time Crypto Trade Pipeline
- Real-time BTC-USD trade ingestion from Polygon.io WebSocket through Databricks Delta Live Tables, sub-second end-to-end latency. Bronze/Silver/Gold medallion architecture with quarantine pattern for data quality.
- Tech stack: Databricks DLT, PySpark, Delta Lake, Polygon.io WebSocket, Plotly
- GitHub: github.com/lubobali/bitcoin-dlt-streaming-pipeline
SPARK-STOCK-PIPELINE-OPTIMISATION - PySpark Performance Engineering
- 8 anti-patterns fixed in production PySpark code, 4x runtime speedup, correctness bugs caught and resolved during real-world tuning.
- Tech stack: PySpark, Delta Lake, Polygon.io
- GitHub: github.com/lubobali/spark-stock-pipeline-optimisation
ADDITIONAL PROJECTS
View all at: GitHub | LuboBali.com | LuBot.ai
Lubo Bali
Chicago, IL • data@lubobali.com • See My Portfolio
14 years data experience: 12 in financial operations ($500K+ annual revenue) + 2 in data engineering
Endorsed by Zach Wilson (ex-Netflix, Airbnb, Meta): "Best project in the bootcamp, ready to make money"
Solo-built LuBot.ai - production AI analytics platform (112K lines, 36 tables, 6 NVIDIA models)
WORK EXPERIENCE
LUBOT.AI Jun 2025 – Present
AI/Data Engineer (Founder)
Serves real users with personalized AI analytics - built the entire platform solo: 112K lines Python, 248 files, 36 database tables delivering insights that adapt to each user
Designed 36-table PostgreSQL schema tracking user behavior patterns - clicks, queries, preferences, interaction history powering the personalization engine
Implemented 18 nightly workers that learn while users sleep - route optimization, user profiling, baseline calculations make each session smarter than the last
Built 9-module Intelligence Engine: anomaly detection, driver analysis, correlation discovery, forecasting, concentration risk - all with domain-aware context
Engineered zero-hallucination pipeline with source citations - every insight traces back to actual data
100% NVIDIA Nemotron stack - 6 models + self-hosted RTX 4090. 4-tier intent routing, 3-tier response system
Tech stack: Python, PostgreSQL, FastAPI, Docker, NVIDIA NIM API, FAISS, Redis, Next.js, React
REMAX Jan 2024 – Jan 2025
Data Engineer
Built Python ETL pipeline processing 10K+ residential property listings from MLS APIs into MySQL - real estate agents needed fast access to market changes
Cut 3+ hours of daily manual data work to 15 minutes with automated refreshes and validation checks
Created Tableau dashboards tracking property inventory and pricing trends across 20+ neighborhoods - helped 15+ agents spot opportunities faster than competitors
Tech stack: Python, MySQL, Tableau, MLS APIs
2828 ARTHINGTON CONDOMINIUM ASSOCIATION Jan 2013 – Jan 2024
Financial Data Manager
Tracked $500K+ annual revenue across 50 units for 12 consecutive years - managed financial operations using SQL and Excel for lease agreements, vendor contracts, and budget planning
Generated monthly/quarterly reports on revenue trends, expense patterns, and budget variances - board used these to make decisions on $100K+ capital improvements
Cut month-end close from 3 days to under 2 days by automating reconciliation in Excel - eliminated manual errors from copy-paste workflows
Tech stack: PostgreSQL, MySQL, SQL, Excel, Power BI, Tableau, financial reporting
TECHNICAL PROFICIENCIES
Data & Pipelines: PostgreSQL, Neon, Snowflake, SQL, dbt, Databricks, Delta Lake, Airflow, Soda, Apache Iceberg, Python, TypeScript, FastAPI, REST APIs, Redis
Streaming: Apache Spark, Flink, Kafka
AI/ML: NVIDIA Nemotron (Ultra 253B, Nano 8B, Vision 12B), NIM API, AdalFlow, FAISS, Prophet, Ollama
Data Patterns: Change Data Capture (CDC), SCD, Dimensional Modeling, Growth Accounting
Infrastructure: AWS, Hetzner, Linux, Nginx, Docker, Distroless Containers, Backblaze B2, Git, Tableau
Frontend: Next.js, React, Tailwind CSS, Plotly
Certifications: DataExpert.io Analytics Engineering Excellence (Feb 2026), Data Engineering (Aug 2025), Data Analytics Accelerator (Jun 2025)
EDUCATION
Associate of Applied Science in Computer Systems Expected Jul 2026
Lincoln Land Community College
Data/AI Analytics Engineering Bootcamp Apr 2026
DataExpert.io - Snowflake, dbt, Airflow, Apache Iceberg, Databricks, Delta Lake, Lakehouse architecture
Data Engineering Bootcamp Aug 2025
DataExpert.io - Dimensional modeling, Apache Spark, Flink, Kafka, Airflow, data quality
Data Analytics Bootcamp Jun 2025
Data Career Jumpstart
PROJECTS
MERGEAI
Upload any CSV, ask questions in plain English - 3 NVIDIA agents collaborate to write SQL, validate results, and return insights.
Tech stack: TypeScript, Next.js 15, NVIDIA NIM API, Neon PostgreSQL, Drizzle ORM, Vercel
Live: merge-ai-omega.vercel.app | Demo: youtu.be/Yr0CkXKNF0M | GitHub: github.com/lubobali/mergeAI
AIRFLOW DATA QUALITY PIPELINE
Production-grade Airflow DAG with write-audit-publish pattern - catches bad data before it hits production.
Tech stack: Airflow, Soda, PostgreSQL, Python | GitHub: github.com/lubobali/airflow-dq-pipeline
ADDITIONAL PROJECTS
View all at: GitHub | LuboBali.com | LuBot.ai


