Background
The problem
In multi-dimensional datasets, standard sorting only works along a single axis. The skyline problem asks: which objects are not dominated by any other object across all N dimensions simultaneously?
This comes up in database query optimization and multi-criteria decision making. A classic example is choosing hotels by price and distance: a hotel belongs to the skyline if no other hotel is at least as cheap and at least as close while being strictly better on one of the two.
Solution
What I built
A self-contained Python class that accepts a dataset and N category dimensions, then returns the skyline — objects not dominated by any other object across all N categories.
- Supports arbitrary N-dimensional inputs, not limited to 2 categories.
- Clean class interface, drop-in ready for any Python project.
- No external dependencies — pure Python stdlib.
- Returns skyline set with dominance metadata for downstream processing.
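To make the interface above concrete, here is a minimal sketch of how such a class could look. It is not the published library's API: the class name `SkylineQuery`, the `compute` method, the result keys `record` and `dominates`, and the sample hotel data are all hypothetical, and the sketch assumes lower values are better on every dimension. It uses plain pairwise comparison for clarity rather than any pruning.

```python
from typing import Any, Dict, List, Mapping, Sequence


class SkylineQuery:
    """Hypothetical interface: records are mappings, dimensions are key names."""

    def __init__(
        self,
        records: Sequence[Mapping[str, Any]],
        dimensions: Sequence[str],
    ) -> None:
        if not dimensions:
            raise ValueError("at least one dimension is required")
        self.records = list(records)
        self.dimensions = list(dimensions)

    def _dominates(self, a: Mapping[str, Any], b: Mapping[str, Any]) -> bool:
        # Assumption: lower is better on every dimension.
        no_worse = all(a[d] <= b[d] for d in self.dimensions)
        better = any(a[d] < b[d] for d in self.dimensions)
        return no_worse and better

    def compute(self) -> List[Dict[str, Any]]:
        """Return each skyline record together with the records it dominates."""
        results: List[Dict[str, Any]] = []
        for i, candidate in enumerate(self.records):
            others = [r for j, r in enumerate(self.records) if j != i]
            if any(self._dominates(o, candidate) for o in others):
                continue  # candidate is dominated, so it is not in the skyline
            dominated = [o for o in others if self._dominates(candidate, o)]
            results.append({"record": candidate, "dominates": dominated})
        return results


# Illustrative data with three dimensions (all "lower is better").
hotels = [
    {"name": "A", "price": 50, "distance": 8.0, "noise": 3},
    {"name": "B", "price": 80, "distance": 2.0, "noise": 2},
    {"name": "C", "price": 60, "distance": 9.0, "noise": 4},
    {"name": "D", "price": 90, "distance": 3.0, "noise": 1},
]
result = SkylineQuery(hotels, ["price", "distance", "noise"]).compute()
print([entry["record"]["name"] for entry in result])  # ['A', 'B', 'D']
```

Hotel C drops out because A is cheaper, closer, and quieter. The `dominates` list attached to A contains C, which is the kind of metadata a downstream step could use for ranking or explanation.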
Engineering
Technical decisions
Object A dominates object B if A is at least as good as B on every category and strictly better on at least one. A single dominance check costs O(k) for k dimensions, so the naive approach of comparing every pair among n objects is O(n²·k). The implementation prunes dominated candidates early to keep this tractable, and its class structure separates the skyline computation from result formatting.
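The dominance definition and the idea of early pruning can be sketched as follows. This is an illustration, not the library's code: the function names `dominates` and `skyline` are hypothetical, points are plain tuples, and lower values are assumed to be better on every dimension. The pruning shown is one common approach (keeping a window of current candidates, in the style of a block-nested-loop scan), so each new point is compared only against surviving candidates instead of the whole dataset.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere.

    Assumption: lower values are better on every dimension.
    """
    at_least_as_good = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def skyline(points: Sequence[Point]) -> List[Point]:
    """Compute the skyline while discarding dominated candidates as early as possible."""
    window: List[Point] = []  # current non-dominated candidates
    for p in points:
        # Drop p immediately if a surviving candidate already dominates it.
        if any(dominates(w, p) for w in window):
            continue
        # Otherwise p survives; evict any candidates that p dominates.
        window = [w for w in window if not dominates(p, w)]
        window.append(p)
    return window


# (price, distance) pairs for five illustrative hotels.
hotels = [(50, 8.0), (80, 2.0), (60, 9.0), (120, 1.0), (90, 3.0)]
print(skyline(hotels))  # [(50, 8.0), (80, 2.0), (120, 1.0)]
```

Because dominance is transitive, a point rejected against the window can never re-enter the skyline, so it is safe to forget it. The worst case is still quadratic when almost nothing is dominated, but on typical data the window stays much smaller than the input.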
Results
Outcome
- Published as an open source library on GitHub.
- Reusable across any multi-dimensional ranking or filtering use case.
- A clean, readable reference implementation of skyline computation in pure Python.