0. Introduction
Space-filling curves (SFCs) are fascinating mathematical constructs with many practical applications in data science and data engineering. While they may sound abstract, they're often hiding in plain sight, behind terms like Z-ordering or Liquid Clustering (used, for example, in platforms like Databricks). If you've worked with large-scale data platforms, chances are you've already used SFCs without realizing it.
Despite their relevance in modern systems, information on this topic is often fragmented, making it difficult to connect theory and practice. This article aims to bridge that gap, focusing on the Hilbert curve.
My goal is to offer a condensed and accessible overview of SFCs: starting with their mathematical origins, moving through practical implementation techniques, and ending with real-world applications in data processing and optimization. The intent is not to replace existing sources but to reference them throughout the text for terminology and further detail.
You might ask: What is so fascinating about curves? After all, a regular curve is easy to understand and probably not the first topic I would pick up a book about. But SFCs are different. They traverse every point in a continuous space, have fractal properties, and produce visually striking patterns when plotted in 2D or 3D, especially at lower iterations. So, let us take a closer look.
(If you want to start with visualization and animations directly, check out my GitHub repository)
1. History and Theory of Space-Filling Curves
The study of SFCs dates back to the 19th century, when Georg Cantor made a groundbreaking discovery. He showed that "two finite-dimensional smooth manifolds have the same cardinality, regardless of their dimensions." [1]
To illustrate this, consider the unit interval [0, 1] ⊂ R and the unit square [0, 1]² ⊂ R². Intuitively, one might expect the square to have a larger cardinality than the line segment. However, Cantor demonstrated that both sets actually have the same cardinality, using his method of interleaving decimals.
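To make the idea concrete: writing x = 0.x₁x₂x₃… and y = 0.y₁y₂y₃… as decimal expansions, the interleaving method maps the pair (x, y) to the single number 0.x₁y₁x₂y₂x₃y₃… in [0, 1]. For example, (0.12…, 0.89…) is mapped to 0.1829….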
This result implies the existence of a bijection between the interval and the square, meaning there is a one-to-one correspondence between their elements. Following Cantor’s discovery, a natural question arose: Is there also a continuous bijection between these sets? Eugen Netto answered this question in the negative.
In this context, continuity can be interpreted geometrically: a continuous mapping would allow one to "draw" the image in 2D or 3D without lifting the pen, forming a curve. This insight laid the groundwork for the later development of SFCs: curves that, while continuous, can come arbitrarily close to filling a higher-dimensional space.
2. Peano Curve: The Discovery of Space-Filling Curves
After Netto's sobering discovery, the question arose whether such a mapping, if not bijective, could at least be surjective. The first to define such a mapping was Giuseppe Peano, who constructed the so-called Peano curve.
The Peano curve is defined recursively. Its domain is the unit interval [0, 1] ⊂ R, and its image lies in the unit square [0, 1]² ⊂ R². By repeatedly subdividing the interval [0, 1] into thirds, and correspondingly partitioning the square in R² into a 3 × 3 grid, the construction converges to the actual space-filling curve as the number of iterations tends to infinity. [1]
The basic pattern of the first-order Peano curve reappears in higher orders, where it is copied and mirrored in every second iteration. This alternating process of mirroring and rotating the basic element is a feature shared by other SFCs as well.
Figure 1: Graphs of the Peano curve at finite iterations (image from Wikipedia under public domain license, modified by author)
Thus, the graphs of the Peano curve at finite iterations (Figure 1) do not represent the "final" SFC. Only in the limit, as the number of iterations of this recursive mapping approaches infinity, does the actual SFC emerge, which traverses every point in [0, 1]². Visually, in this limit, the curve would essentially appear as a filled square spanning from (0, 0) to (1, 1).
This observation raises an initially counterintuitive question: By definition, a curve is one-dimensional. While it can be embedded in a higher-dimensional space (n > 1), its intrinsic parameter domain remains one-dimensional. Yet, if the Peano curve passes through every point in [0, 1]² and thus completely fills the plane, can its image still be regarded as one-dimensional? The answer is no: the image of the Peano curve has Hausdorff dimension 2. Another characteristic of an SFC is that its image has positive Jordan content (Peano–Jordan measure). These facts may seem surprising, but they align with the properties of fractals: many such sets have Hausdorff dimensions greater than 1, and some even have non-integer Hausdorff dimensions.
3. The Hilbert Curve – Still Popular Today!
Although Peano was the first to construct an SFC, a much more famous example is the Hilbert curve, defined by David Hilbert in 1891. Its definition is slightly simpler and begins with a 2 × 2 grid. Like the Peano curve, the mapping of the Hilbert curve recursively subdivides each interval in [0, 1] and each square in [0, 1]² into four smaller intervals/squares at each step. As with the Peano curve, the Hilbert curve converges to a true SFC in the limit as the number of iterations approaches infinity.
(Image by author)
For the purposes of this article, we will focus on the Hilbert curve, as its properties make it a valuable tool in modern data platforms.
3.1 Formal Definition of the Hilbert Curve
Starting with the interval [0, 1] as the domain of the Hilbert curve, each recursion step divides the current interval into four equal subintervals. If a is the left endpoint and h the interval width, the subintervals are [a, a + h/4], [a + h/4, a + h/2], [a + h/2, a + 3h/4], and [a + 3h/4, a + h].
For any chosen point in [0, 1], exactly one of these subintervals contains the point. This interval can then be subdivided again using the same rule, producing a finer interval that still contains the point. This process can be continued indefinitely, yielding an arbitrarily precise location of the point along the curve. The same recursive subdivision is applied in [0, 1]² in parallel, splitting each square into four congruent subsquares of half the side length, so that the point's position in the square is refined at the same rate.
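To make the refinement tangible, here is a minimal Python sketch of the 1D subdivision (the function name and return format are my own, not part of any formal definition):

```python
def nested_intervals(t: float, depth: int):
    """Return the chain of nested quarter-intervals of [0, 1] containing t."""
    a, h = 0.0, 1.0
    chain = []
    for _ in range(depth):
        k = min(int((t - a) / (h / 4)), 3)  # which of the four subintervals holds t
        a, h = a + k * (h / 4), h / 4
        chain.append((a, a + h))
    return chain

# nested_intervals(0.3, 2) -> [(0.25, 0.5), (0.25, 0.3125)]
```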
General properties:
- Surjective: From its recursive definition it follows that the Hilbert curve is surjective: every point in [0, 1]² is covered in the limit. The nested intervals are compact, and adjacent intervals share boundary points (e.g., a + h/4 is both the right endpoint of the first subinterval and the left endpoint of the second). Thus the entire square is filled. The mapping, however, is not injective; attempts to enforce bijectivity (e.g., by opening the intervals) break continuity.
- Continuous: This property is apparent from visual representations: the curve can be drawn without lifting the pen. Formally, it can be established by showing that the Hilbert curve arises as the uniform limit of continuous functions, and uniform convergence preserves continuity.
- Nowhere differentiable: Graphs of the Hilbert curve already suggest that it is nowhere differentiable. A proof of this property was given by H. Sagan using the difference quotient. [1]
- Locality preserving: In contrast to simpler mappings such as the Z-order curve, the Hilbert curve tends to preserve locality: points that are close in the one-dimensional parameter are often mapped to nearby points in the square. This aspect is crucial for applications in big data platforms.
- Positive Jordan content: In the limit of infinitely many iterations, the image of the Hilbert curve has positive Jordan content (Peano–Jordan measure), meaning that it occupies a nonzero area of the plane.
- Hausdorff dimension 2: Correspondingly, the Hilbert curve does not behave like a usual one-dimensional line; its image has Hausdorff dimension 2, reflecting that it fully fills the unit square.
Even though early definitions of the Hilbert curve were given in 2D, higher dimensions are also feasible. The algorithm we discuss in the next section works in any finite dimension.
4 Computing the Hilbert Curve With Skilling’s Algorithm
The definition of the Hilbert curve was given in a geometrical manner, without an algebraic rule for computing coordinates on a given grid for a given point in I = [0, 1]. It took almost 100 years after Hilbert published his idea before mathematicians thought about ways to compute coordinates for a given Hilbert index. Who could blame them? After all, for a long time there were no computers that could draw curves with hundreds or thousands of points. While researching, I discovered multiple ways to compute the Hilbert curve, from complex numbers to L-systems. While some are quite involved, others preserve the iterative approach for computing single points of the curve. What I was looking for was something simple:
- A function that takes a Hilbert index (i.e. any number like 1, 2, 3 in 1D space) and returns its coordinates. You can consider the Hilbert index as the number of the interval, counted from left to right, for a Hilbert curve of a given order.
- A function that does the inverse, mapping a coordinate back to its Hilbert index.
While searching the internet for possible implementations, I came across a GitHub repository from Princeton University implementing the algorithm of John Skilling, published in a 2004 paper called Programming the Hilbert Curve. Unfortunately, this paper is not freely available to the public, so I decided to analyze the code from the Princeton repository.
4.1 Skilling’s Algorithm – Overview
Skilling observed that mapping Hilbert indices to coordinates can be expressed elegantly in terms of binary operations. For example, consider the indices 0, 1, 2, 3 in one dimension. These correspond to the coordinates (0, 0), (1, 0), (1, 1), (0, 1) in a 2 × 2 grid. Here, the values 0, 1, 2, 3 no longer represent fractional points in the unit interval (like 1/3), but instead discrete interval numbers. With a 2 × 2 grid, there are exactly four intervals in [0, 1] and four corresponding squares in [0, 1]². Skilling's algorithm generalizes this idea. It computes the mapping from a Hilbert index to its corresponding coordinate (and vice versa) in any finite dimension using binary transformations. The essential steps are:
- Convert the Hilbert index from decimal to binary.
- Transform the binary number into its Gray code representation.
- Disentangle the Gray code into a coordinate structure.
- Apply rotations and reflections using XOR operations.
- Convert the binary coordinates back to decimal.
4.2 Binary Representation
To understand why binary representations are much better suited for computing points of the Hilbert curve from Hilbert indices (and vice versa), the following examples might help. (We discuss everything in 2D, but the algorithm works in any dimension.)
The Hilbert curve is defined on a 2×2, 4×4, 8×8, 16×16, etc. grid (remember the definition above and its recursive approach).
By looking at these numbers, one might discover that the number of grid cells per axis grows as 2ⁿ, where n is the order of the curve. This matches binary encoding perfectly: for an n-th order curve, we need exactly n bits per axis to describe the grid.
Take the 4 × 4 grid (second order) as an example. Two bits per axis are sufficient:
- The first bit identifies the major quadrant (lower left, upper left, lower right, or upper right).
- The second bit specifies the position inside that quadrant.
For instance, Hilbert index 2 has the binary form 0010. Interpreting this:
- 00 selects the lower-left quadrant.
- 10 selects the upper-right subquadrant inside it.
Note the repetitive pattern of 00, 01, 10, 11 in every quadrant, each tracing a Hilbert curve of order 1. (Image by author)
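In code, this split of an index into bit pairs is a one-liner (a small sketch for the order-2 case; variable names are mine):

```python
h = 2
bits = format(h, "04b")             # '0010': four bits describe the 4x4 grid
major, minor = bits[:2], bits[2:]   # '00' -> lower-left quadrant, '10' -> cell inside it
```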
However, if we continue this process for indices greater than 3, we encounter a challenge: the orientation of the curve changes from one quadrant to the next. Correctly handling these rotations and reflections is exactly where Gray code and XOR operations (as in Skilling’s algorithm) become essential.
4.3 Gray Code Representation
The next step in Skilling's algorithm is a transformation from binary to Gray code. The key difference is that in Gray code, consecutive numbers differ in only one bit. This property is crucial: it ensures that the curve moves smoothly from one quadrant to the next (even though the orientation of the curve in each quadrant is still not correct).
Looking at the binary numbers and the orientation of the different sections of the curve, we can see that the curve is still not correct, but the last cell of each quadrant now connects to the first cell of the next. (Image by author)
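The conversion itself is a single XOR (a minimal sketch; the inverse is included for completeness):

```python
def to_gray(n: int) -> int:
    # Consecutive integers differ in exactly one bit in Gray code.
    return n ^ (n >> 1)

def from_gray(g: int) -> int:
    # Invert by cumulatively XOR-ing the shifted value.
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# [bin(to_gray(i)) for i in range(4)] -> ['0b0', '0b1', '0b11', '0b10']
```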
4.4 Disentanglement of the Bits
The real "magic" of Skilling's method begins with a reordering of the Gray-coded bits, a step called disentanglement. In our 4 × 4 example, we originally interpreted the four bits as (bitx1, bity1, bitx2, bity2), where the first pair encodes the major quadrant and the second pair the sub-quadrant. For coordinate computation, however, we need the structure (bitx1, bitx2, bity1, bity2), so that all x-bits and all y-bits can later be combined into the respective decimal coordinates (x, y).
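A minimal numpy sketch of this reordering for a single index (the variable names are mine; the Princeton code vectorizes this over many indices at once):

```python
import numpy as np

# Gray-coded bits of one index in interleaved order (x1, y1, x2, y2);
# 0110 is the Gray code of index 4 from the example below
interleaved = np.array([0, 1, 1, 0])
ndims, nbits = 2, 2

# Reshape to (nbits, ndims), then transpose: each row now holds one axis's bits
per_axis = interleaved.reshape(nbits, ndims).T
# per_axis -> [[0, 1],   x-bits (x1, x2)
#              [1, 0]]   y-bits (y1, y2)
```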
4.5 Corrective Transformations
After disentangling the bits, the final step of Skilling’s algorithm is to rotate and mirror the subcurves within each quadrant so that they connect seamlessly into the Hilbert curve of order n.
Figure 6 illustrates this process for the 4 × 4 case. The table on the left shows how Gray-coded coordinates are converted into standard binary numbers by applying simple transformations: swaps and bit-flips.
The diagram on the right visualizes the effect: the upper quadrants are rotated by 180°, the lower quadrants are mirrored along the diagonal, and in some cases (e.g. the yellow quadrant) no transformation is needed at all.
The key insight is that after these corrective transformations, the coordinates are once again in standard binary form. This means that the output of Skilling's algorithm can be converted directly to decimal coordinates in the format (x, y), without further adjustment.
Skilling's algorithm, key transformations. Input: Gray-code formatted bits in disentangled order (bitx1, bitx2, bity1, bity2); in Python, the array shape would be [-1, ndims, nbits]. Example: the number 4 would be represented as the following list/np-array: [[0, 1], [1, 0]]. For the x-dimension, 1 is the least significant bit (LSB) and 0 the most significant bit (MSB).
- Loop from the most significant bit (MSB) down to the least significant bit (LSB).
- Inner loop from the highest dimension (y in 2D) down to the lowest dimension.
- Look at the current bit. If it is 1: flip every lower bit in dimension 0 (usually x). If it is 0: swap the lower bits between the current dimension and dimension 0 (where they differ).
Step 3 can be easily computed with numpy using XOR operations. The whole process of flipping and swapping bits in each iteration is visualized in the following animations.
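To make all of the steps concrete, below is a minimal pure-Python sketch following Skilling's transpose-based method (function names are mine; the Princeton repository implements the same logic vectorized with numpy). Depending on the axis convention, the resulting orientation may differ from the figures above by a reflection:

```python
def transpose_to_axes(x, nbits):
    """Transposed Hilbert index -> coordinates, in place (after Skilling, 2004)."""
    n = len(x)
    top = 2 << (nbits - 1)              # 2**nbits
    # Gray decode: equivalent to H ^ (H >> 1) on the interleaved bits
    t = x[n - 1] >> 1
    for i in range(n - 1, 0, -1):
        x[i] ^= x[i - 1]
    x[0] ^= t
    # Undo the per-level rotations/reflections via XOR flips and swaps
    q = 2
    while q != top:
        p = q - 1
        for i in range(n - 1, -1, -1):
            if x[i] & q:
                x[0] ^= p               # flip the low bits of axis 0
            else:
                t = (x[0] ^ x[i]) & p   # swap the low bits of axes 0 and i
                x[0] ^= t
                x[i] ^= t
        q <<= 1
    return x


def axes_to_transpose(x, nbits):
    """Coordinates -> transposed Hilbert index, in place (inverse of the above)."""
    n = len(x)
    q = 1 << (nbits - 1)
    # Apply the per-level rotations/reflections in reverse order
    while q > 1:
        p = q - 1
        for i in range(n):
            if x[i] & q:
                x[0] ^= p
            else:
                t = (x[0] ^ x[i]) & p
                x[0] ^= t
                x[i] ^= t
        q >>= 1
    # Gray encode
    for i in range(1, n):
        x[i] ^= x[i - 1]
    t, q = 0, 1 << (nbits - 1)
    while q > 1:
        if x[n - 1] & q:
            t ^= q - 1
        q >>= 1
    for i in range(n):
        x[i] ^= t
    return x


def index_to_coords(h, nbits, ndims):
    """Hilbert index in [0, 2**(nbits*ndims)) -> tuple of grid coordinates."""
    x = [0] * ndims
    for bit in range(nbits * ndims):    # distribute the index bits, MSB first
        b = (h >> (nbits * ndims - 1 - bit)) & 1
        x[bit % ndims] |= b << (nbits - 1 - bit // ndims)
    return tuple(transpose_to_axes(x, nbits))


def coords_to_index(coords, nbits):
    """Grid coordinates -> Hilbert index (the inverse mapping)."""
    x = axes_to_transpose(list(coords), nbits)
    ndims, h = len(x), 0
    for bit in range(nbits * ndims):    # re-interleave the bits, MSB first
        h = (h << 1) | ((x[bit % ndims] >> (nbits - 1 - bit // ndims)) & 1)
    return h


# Order-1 curve in 2D: the four cells in Hilbert order
print([index_to_coords(h, nbits=1, ndims=2) for h in range(4)])
# -> [(0, 0), (0, 1), (1, 1), (1, 0)]

# Round trip: coordinates back to the index
assert coords_to_index(index_to_coords(7, nbits=2, ndims=2), nbits=2) == 7
```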
If you want to analyze the algorithm in more detail or simply generate your own animations in 2D or 3D, check out my GitHub Repository
5 Applications of Space-Filling Curves
After discussing the theoretical aspects and implementation details of the Hilbert curve, the question arises where it can be applied. During the implementation, we saw how to transform Hilbert indices into coordinates. For the following applications, the inverse of this process is more interesting.
One valuable aspect of the Hilbert curve is that it maps a 1D ordered set (i.e. 1, 2, 3, …) to coordinates in an n-dimensional space. It gives an order to the points it traverses, and it can live in vector spaces of arbitrary (finite) dimension. Thus, the Hilbert curve is used for data partitioning and clustering, for image compression, and also for building features in machine learning when dealing with spatial data.
5.1 Data Partitioning/Clustering using SFCs
One of the most prominent applications of SFCs is data partitioning. For example, in Databricks, Z-ordering is based on the Z-curve, while liquid clustering relies on the Hilbert Curve. The reason is simple:
the Hilbert curve preserves locality better than the Z-curve, which is crucial when indexing and partitioning multidimensional data. Figure 9 shows how exemplary data points are mapped to points of the Hilbert curve, with each point assigned to a partition defined by the curve.
Figure 9: Exemplary data points assigned to partitions along the Hilbert curve (Image by author)
When a query is applied to the data (e.g. SELECT * FROM table WHERE x IN (1, 2) AND y IN (2, 3)), all points in this range ((1, 2), (1, 3), (2, 2), (2, 3)) are converted to Hilbert indices, and the system can directly retrieve all matching entries. The key advantage is that this mapping enables fast and flexible data retrieval. Unlike traditional indexing, the Hilbert-based partitioning adapts naturally to updates or growth in the dataset, without requiring the entire index to be recomputed.
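Reusing the coords_to_index sketch from Section 4, such a lookup could be expressed as follows (a simplified illustration; real engines prune whole partitions by Hilbert-key ranges rather than converting individual points):

```python
# Query: SELECT * FROM table WHERE x IN (1, 2) AND y IN (2, 3)
query_points = [(1, 2), (1, 3), (2, 2), (2, 3)]
keys = sorted(coords_to_index(p, nbits=6) for p in query_points)
# Scan only the files/partitions whose Hilbert-key ranges overlap `keys`
```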
5.2 Data Indexing: Hilbert Curve vs. Z-Curve
To highlight the practical advantages of the Hilbert curve, I compared its performance with the Z-curve on a set of synthetic range queries.
For the experiment, I generated 100 random range queries of fixed size. For each query, I computed the Hilbert and Z-curve indices and counted the number of clusters, where a cluster is a set of consecutive indices. For example, if the query returned the indices [1, 2, 3, 5, 6, 8, 9], this would form three clusters: [1, 2, 3], [5, 6], and [8, 9].
If the data is stored in index order, clusters correspond to sequential reads, whereas gaps between clusters imply costly jumps to new storage addresses.
(Image by author)
To quantify performance, I used two metrics:
- Cluster count: Fewer clusters imply less fragmentation and fewer storage jumps.
- Intra-cluster spread: The average number of indices per cluster.
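Both metrics can be computed from a query's sorted indices with a small helper (my own sketch, not the exact experiment code):

```python
def cluster_stats(indices):
    """Count runs of consecutive indices and their average length."""
    idx = sorted(indices)
    if not idx:
        return 0, 0.0
    clusters = 1
    for prev, cur in zip(idx, idx[1:]):
        if cur != prev + 1:          # a gap starts a new cluster
            clusters += 1
    return clusters, len(idx) / clusters

# cluster_stats([1, 2, 3, 5, 6, 8, 9]) -> (3, 2.33...)
```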
The worst-case scenario would be extreme fragmentation: every point forming a cluster of its own. Figure 11 compares the performance of the Z-curve and the Hilbert curve for two, three, and four dimensions, a query size of 7 (7×7 in 2D, 7×7×7 in 3D, etc.), and 6 bits per axis (i.e. 64 values per axis).
The results clearly show that the Hilbert curve preserves locality much better than the Z-curve. Across all tested dimensions, queries result in fewer clusters and thus higher intra-cluster density with Hilbert indices. In practice, this translates into more efficient data retrieval and reduced I/O costs, particularly for multidimensional range queries.
6 Beyond Space-Filling Curves
The goal of this article was to illustrate the elegance of SFCs and to give a glimpse into their applications in data indexing. However, the latest research in this field goes beyond classical SFCs.
The main limitation of all space-filling curves is their fixed mechanism. Once defined, their structure offers little room for adaptation to different datasets or workload patterns. In practice, this rigidity can limit performance.
To overcome this, researchers such as Chen et al. (University of Electronic Science and Technology of China & Huawei) have proposed AdaCurve, a machine learning–based approach. Instead of relying on a predetermined mapping, AdaCurve trains a model to generate a one-dimensional index directly from high-dimensional data points, optimized according to both the dataset and the query workload. [3]
This idea is highly promising: while Hilbert and other SFCs offer elegant but rigid mappings, AdaCurve adapts dynamically, producing an indexing system that is tailored to the data and queries at hand. Such adaptability could pave the way for significantly more efficient indexing in large-scale data platforms in the future.
References
[1] H. Sagan, Space-Filling Curves. Springer-Verlag, 1994.
[2] M. Bader, Space-Filling Curves – An Introduction with Applications in Scientific Computing. Springer-Verlag, 2013.
[3] X. Chen et al., "Optimizing Block Skipping for High-Dimensional Data with Learned Adaptive Curve," SIGMOD, vol. 3, 2025. [Online]. Available: https://zheng-kai.com/paper/2025_sigmod_chen.pdf