An analytical solution for a jewelry and accessories e-commerce platform built from scratch. It enabled the Chief Product Officer to shape the product, understand and optimize traffic funnels, and discover new business opportunities.
(metrics shown are adjusted for demo purposes)

Stakeholders had two main goals: understanding the overall performance of the e-commerce store, and mapping the shopper journey to identify which funnel steps require attention.

The next step is extending the funnel to include marketing data and transforming it into end-to-end analytics.
Implementation details
The system processes raw API interaction data, transforms it into analytical tables, and serves business intelligence dashboards through Metabase. The architecture supports real-time data ingestion, hourly ETL updates, and historical data tracking, enabling comprehensive analytics of user behavior, catalog interactions, product views, and conversion funnels.
Data Ingestion
Raw API calls are captured in real time and stored in the pd.api_calls table, containing request/response JSON payloads, user sessions, endpoints, and timestamps. Data is written in batches with millisecond-precision timestamps.
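A minimal sketch of this ingestion path using the clickhouse-driver Python client; the DDL and column names are assumptions inferred from the description above, not the actual pd.api_calls schema:

```python
from datetime import datetime

from clickhouse_driver import Client  # pip install clickhouse-driver

client = Client("localhost")

# Hypothetical DDL inferred from the description above; the real
# pd.api_calls schema may differ.
client.execute("CREATE DATABASE IF NOT EXISTS pd")
client.execute("""
    CREATE TABLE IF NOT EXISTS pd.api_calls (
        ts          DateTime64(3),  -- millisecond-precision timestamp
        session_id  String,
        endpoint    String,
        request     String,         -- raw request JSON
        response    String          -- raw response JSON
    ) ENGINE = MergeTree ORDER BY (endpoint, ts)
""")

def write_batch(rows):
    """Insert a batch of captured calls in a single round trip."""
    client.execute(
        "INSERT INTO pd.api_calls (ts, session_id, endpoint, request, response) VALUES",
        rows,
    )

write_batch([(datetime.utcnow(), "sess-42", "/api/v1/catalog", "{}", "{}")])
```

Writing whole batches in one INSERT rather than row by row keeps ClickHouse's MergeTree part count low, which matters at real-time write rates.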
ETL Processing
Python-based ETL scripts organized in jobs.py process raw data into analytical tables:
- sessions: Aggregated session data with device information, user types, and behavioral metrics
- catalog_sessions: Enriched catalog browsing data including search queries, filters (material, section, metal, price ranges), pagination, and sorting preferences
- product_sessions: Product-level analytics tracking views, favorites, cart additions, and purchase events
- snapshots: Versioned snapshots of production database entities (cart, items, favorites, stock) with change tracking
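As an illustration, a jobs.py-style job could rebuild one of these tables for a single date with an INSERT ... SELECT over the raw calls. The aggregation below is a simplified sketch with assumed column names, not the production logic:

```python
from clickhouse_driver import Client

client = Client("localhost")

def build_sessions(ds):
    """Recompute dwh.sessions for one date, so point-in-time reruns are safe."""
    # Assumes dwh.sessions is partitioned by a Date column, so dropping the
    # day's partition first makes the job idempotent.
    client.execute("ALTER TABLE dwh.sessions DROP PARTITION %(ds)s", {"ds": ds})
    client.execute(
        """
        INSERT INTO dwh.sessions
        SELECT
            session_id,
            min(ts)             AS session_start,
            max(ts)             AS session_end,
            count()             AS api_calls,
            uniqExact(endpoint) AS distinct_endpoints
        FROM pd.api_calls
        WHERE toDate(ts) = %(ds)s
        GROUP BY session_id
        """,
        {"ds": ds},
    )
```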
Data Warehouse Schema
The ClickHouse database is organized into multiple schemas:
- pd: Raw source data from API calls
- dwh: Analytical tables optimized for BI tools and ad-hoc analysis
- history: Historical snapshots with versioning and deletion tracking
- mysql: Database links for seamless data migration from production MySQL
- etl: Temporary tables for ETL processing
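For example, the mysql link layer can be implemented with ClickHouse's built-in MySQL database engine, which makes the production tables queryable in place; the host, credentials, and table names below are placeholders:

```python
from clickhouse_driver import Client

client = Client("localhost")

# ClickHouse's MySQL database engine proxies the production schema, so a
# migration becomes a plain INSERT ... SELECT. Host, credentials, and
# table names here are placeholders, not the real configuration.
client.execute("""
    CREATE DATABASE IF NOT EXISTS mysql
    ENGINE = MySQL('prod-host:3306', 'shop', 'reader', 'secret')
""")

# Hypothetical snapshot load into the history schema:
client.execute("""
    INSERT INTO history.items
    SELECT *, now64(3) AS snapshot_ts FROM mysql.items
""")
```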
Scheduled Jobs
main.py orchestrates scheduled ETL jobs running hourly, with systemd service integration for production deployment. Jobs are designed to support point-in-time reprocessing for specific dates.
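A sketch of how main.py's orchestration might be structured: a long-running hourly loop for the systemd service, plus a --date flag for point-in-time reruns. The job names reuse the hypothetical functions from the sketches above:

```python
import argparse
import time
from datetime import date

from jobs import build_sessions  # hypothetical job from the ETL sketch above

def run_all(ds):
    """Run every ETL job for one date (YYYY-MM-DD)."""
    build_sessions(ds)
    # ... catalog_sessions, product_sessions, and snapshot jobs follow

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--date", help="reprocess a specific date (YYYY-MM-DD)")
    args = parser.parse_args()

    if args.date:                    # one-off point-in-time rerun
        run_all(args.date)
        return

    while True:                      # long-running mode under systemd
        run_all(date.today().isoformat())
        time.sleep(3600)             # hourly cadence

if __name__ == "__main__":
    main()
```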
Tech Stack
- Language: Python
- Database: ClickHouse
- BI Tool: Metabase
- Orchestration: systemd service for scheduled jobs
- Data Sources: production MySQL database, API call logs