A Python project that transforms raw XLS and PDF files into standardized analytical tables. It collects crop yield forecasts from different producers in one place and integrates maritime (vessel) data alongside them. Data is extracted from numerous unstructured files and unified into a single final table ready for analysis. The solution saved managers many hours of manual collection work and supported large-scale, multimillion-dollar decisions.

Features
The crop module automatically extracts and cleans agricultural production data from messy XLS files, standardizing crop names, regions, and units into a unified schema.
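
As a rough illustration of the standardization step, here is a minimal pandas sketch. It is not the project's actual code: the column names (crop, region, volume, unit), the alias table, the choice of tonnes as the target unit, and the function name standardize_crops are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical lookup tables; the project's real mappings are not shown in this README.
CROP_ALIASES = {"wheat": "wheat", "winter wheat": "wheat", "maize": "corn", "corn": "corn"}
# Conversion factors to tonnes (assumed target unit).
UNIT_TO_TONNES = {"kg": 0.001, "t": 1.0, "kt": 1000.0}


def standardize_crops(raw: pd.DataFrame) -> pd.DataFrame:
    """Map a raw sheet onto an assumed unified schema: crop, region, volume_t."""
    # Normalize header spelling so " Crop " and "crop" refer to the same column.
    df = raw.rename(columns=lambda c: str(c).strip().lower())
    crop_key = df["crop"].astype(str).str.strip().str.lower()
    unit_key = df["unit"].astype(str).str.strip().str.lower()
    out = pd.DataFrame({
        "crop": crop_key.map(CROP_ALIASES),
        "region": df["region"].astype(str).str.strip().str.title(),
        # Unparseable volumes become NaN instead of raising.
        "volume_t": pd.to_numeric(df["volume"], errors="coerce") * unit_key.map(UNIT_TO_TONNES),
    })
    # Drop rows whose crop, unit, or volume could not be interpreted.
    return out.dropna(subset=["crop", "volume_t"]).reset_index(drop=True)


# Made-up sample rows imitating a messy sheet.
raw = pd.DataFrame({
    " Crop ": ["Winter Wheat", "MAIZE", "unknown crop"],
    "Region": [" north ", "south", "east"],
    "Volume": ["1.5", "200", "n/a"],
    "Unit": ["kt", "t", "t"],
})
clean = standardize_crops(raw)
```

In practice the raw frame would come from something like pd.read_excel on each producer's file. Rows that cannot be mapped are dropped here for brevity; a real pipeline might log them for review instead.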
The vessel module scrapes port calls from ShippingExplorer, VesselTracking, and VesselFinder, resolves MMSI–IMO inconsistencies, collects full vessel characteristics from multiple maritime sources, and merges everything into a unified vessel database for analytics and fleet segmentation.
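
One way MMSI–IMO inconsistencies could be handled is sketched below. This is an assumption, not the project's documented method: records with an invalid IMO check digit are discarded, and for each remaining IMO the most frequently reported MMSI wins. The record layout, the function names, and the sample MMSI values are made up for illustration; the check-digit rule itself is the standard one for seven-digit IMO numbers.

```python
from collections import Counter, defaultdict


def is_valid_imo(imo: str) -> bool:
    """Validate a 7-digit IMO number using its check digit."""
    if len(imo) != 7 or not (imo.isascii() and imo.isdigit()):
        return False
    # The first six digits are weighted 7, 6, 5, 4, 3, 2; the last digit of
    # the weighted sum must equal the seventh digit.
    checksum = sum(int(d) * w for d, w in zip(imo[:6], range(7, 1, -1)))
    return checksum % 10 == int(imo[6])


def resolve_mmsi(records):
    """Pick the most frequently reported MMSI for each valid IMO (assumed rule)."""
    seen = defaultdict(Counter)
    for rec in records:
        imo = str(rec["imo"]).strip()
        if is_valid_imo(imo):
            seen[imo][str(rec["mmsi"]).strip()] += 1
    return {imo: counts.most_common(1)[0][0] for imo, counts in seen.items()}


# Made-up records imitating three sources that disagree about one vessel.
records = [
    {"source": "A", "imo": "9074729", "mmsi": "211000001"},
    {"source": "B", "imo": "9074729", "mmsi": "211000001"},
    {"source": "C", "imo": "9074729", "mmsi": "211000002"},
    {"source": "C", "imo": "9074728", "mmsi": "211000003"},  # fails the check digit
]
resolved = resolve_mmsi(records)
```

Keying on the IMO reflects that it stays with the hull for the vessel's lifetime, while an MMSI can change with flag or ownership. A majority vote is only one possible tie-breaking rule; preferring the most recent report would be another.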
Tech Stack
Python with the pandas library