...

Eulerian Melodies: Graph Algorithms for Music Composition


composers are known to reuse motifs (i.e., characteristic note progressions or melodic fragments) across their works. For example, famous Hollywood composers such as John Williams (Superman, Star Wars, Harry Potter) and Hans Zimmer (Inception, Interstellar, The Dark Knight) deftly recycle motifs to create instantly recognizable, signature soundtracks.

In this article, we show how to do something similar using data science. Specifically, we will compose music by drawing on the graph-theoretic concept of Eulerian paths to assemble stochastically generated musical motifs into acoustically pleasing melodies. After providing an overview of theoretical concepts and a canonical use case to ground our understanding of the fundamentals, we will walk through an end-to-end Python implementation of the algorithmic music composition procedure.

Note: All figures in the following sections have been created by the author of this article.

A Primer on Eulerian Paths

Suppose we have a graph consisting of nodes and edges. The degree of a node in an undirected graph refers to the number of edges connected to that node. The in-degree and out-degree of a node in a directed graph refer to the number of incoming and outgoing edges for that node, respectively. A Eulerian path is defined as a walk along the nodes and edges of a graph that starts at some node and ends at some node, and visits each edge exactly once; if we start and end at the same node, this is called a Eulerian circuit.

In an undirected graph, a Eulerian path exists if and only if zero or two nodes have an odd degree, and all nodes with nonzero degree are part of a single connected component in the graph. Meanwhile, in a directed graph, a Eulerian path exists if and only if at most one node (the starting node) has one more outgoing edge than incoming edge, at most one node (the ending node) has one more incoming edge than outgoing edge, all other nodes have equal incoming and outgoing edges, and all nodes with nonzero in-degree or out-degree are part of a single connected component. The constraints related to being part of a single connected component ensure that all edges in the graph are reachable.

Figures 1 and 2 below show graphical representations of the Seven Bridges of Königsberg and the House of Nikolaus, respectively. These are two famous puzzles that involve finding a Eulerian path.

Figure 1: The Königsberg Problem

In Figure 1, two islands (Kneiphof and Lomse) are connected to each other and the two mainland parts (Altstadt and Vorstadt) of the city of Königsberg in Prussia by a total of seven bridges. The question is whether there is any way to visit all four parts of the city using each bridge exactly once; in other words, we want to know whether a Eulerian path exists for the undirected graph shown in Figure 1. In 1736, the famous mathematician Leonhard Euler — after whom Eulerian paths and circuits get their name — showed that such a path cannot exist for this particular problem. We can see why using the definitions outlined previously: all four parts (nodes) of the city of Königsberg have an odd number of bridges (edges), i.e., it is not the case that zero or two nodes have an odd degree.

Figure 2: The House of Nikolaus Puzzle

In Figure 2, the objective is to draw the House of Nikolaus starting at any of the five corners (nodes marked 1-5) and tracing each of the lines (edges) exactly once. Here, we see that two nodes have a degree of four, two nodes have a degree of three, and one node has a degree of two, so a Eulerian path must exist. In fact, as the following animation shows, it is apparently possible to construct 44 distinct Eulerian paths for this puzzle:

Source: Wikipedia (CC0 1.0 Universal)

Eulerian paths can be derived programmatically using Hierholzer’s algorithm as explained in the video below:

Hierholzer’s algorithm uses a search technique called backtracking, which this article covers in more detail.

Eulerian Paths for Fragment Assembly

Given a set of nodes that represent fragments of information, we can use the concept of Eulerian paths to piece the fragments together in a meaningful way.

To see how this could work, let us start by considering a problem that does not require much domain know-how: given a list of positive two-digit integers, is it possible to arrange these integers in a sequence x1, x2, …, xn such that the tens digit of integer xi matches the units digit of integer xi+1? Suppose we have the following list: [22, 23, 25, 34, 42, 55, 56, 57, 67, 75, 78, 85]. By inspection, we note that, for example, if xi = 22 (with units digit 2), then xi+1 can be 23 or 25 (tens digit 2), whereas if xi = 78, then xi+1 can only be 85. Now, if we translate the list of integers into a directed graph, where each digit is a node, and each two-digit integer is modeled as a directed edge from its tens digit to its units digit, then finding a Eulerian path in this directed graph will give us one possible solution to our problem as required. A Python implementation of this approach is shown below:

from collections import defaultdict

def find_eulerian_path(numbers):
    # Initialize graph
    graph = defaultdict(list)
    indeg = defaultdict(int)
    outdeg = defaultdict(int)
    
    for num in numbers:
        a, b = divmod(num, 10)  # a = tens digit, b = units digit
        graph[a].append(b)
        outdeg[a] += 1
        indeg[b] += 1
    
    # Find start node
    start = None
    start_nodes = end_nodes = 0
    for v in set(indeg) | set(outdeg):
        outd = outdeg[v]
        ind = indeg[v]
        if outd - ind == 1:
            start_nodes += 1
            start = v
        elif ind - outd == 1:
            end_nodes += 1
        elif ind == outd:
            continue
        else:
            return None  # No Eulerian path possible
    
    if not start:
        start = numbers[0] // 10  # Arbitrary start if Eulerian circuit
    
    if not ( (start_nodes == 1 and end_nodes == 1) or (start_nodes == 0 and end_nodes == 0) ):
        return None  # No Eulerian path
    
    # Use Hierholzer's algorithm
    path = []
    stack = [start]
    local_graph = {u: list(vs) for u, vs in graph.items()}
    
    while stack:
        u = stack[-1]
        if local_graph.get(u):
            v = local_graph[u].pop()
            stack.append(v)
        else:
            path.append(stack.pop())
    
    path.reverse()  # We get the path in reverse order due to backtracking
    
    # Convert the path to a solution sequence with the original numbers
    result = []
    for i in range(len(path) - 1):
        result.append(path[i] * 10 + path[i+1])
    
    return result if len(result) == len(numbers) else None


given_integer_list = [22, 23, 25, 34, 42, 55, 56, 57, 67, 75, 78, 85]
solution_sequence = find_eulerian_path(given_integer_list)
print(solution_sequence)

Result:

[23, 34, 42, 22, 25, 57, 78, 85, 56, 67, 75, 55]

DNA fragment assembly is a canonical use case of the above procedure in the area of bioinformatics. Essentially, during DNA sequencing, scientists obtain several short DNA fragments that must be stitched together to derive viable candidates for the full DNA sequence, and this can potentially be done relatively efficiently using the concept of a Eulerian path (see this paper for more details). Each DNA fragment, known as a k-mer, consists of k letters drawn from the set { A, C, G, T } denoting the nucleotide bases that can make up a DNA molecule; e.g., ACT and CTG would be 3-mers. A so-called de Bruijn graph can now be constructed with nodes representing (k-1)-mer prefixes (e.g., AC for ACT and CT for CTG), and directed edges denoting an overlap between the source and destination nodes (e.g., there would be an edge going from AC to CT due to the overlapping letter C). Deriving a viable candidate for the full DNA sequence amounts to finding a Eulerian path in the de Bruijn graph. The video below shows a worked example:

An Algorithm for Generating Melodies

If we have a set of fragments that represent musical motifs, we can use the approach outlined in the previous section to arrange the motifs in a sensible sequence by translating them to a de Bruijn graph and identifying a Eulerian path. In the following, we will walk through an end-to-end implementation of this in Python. The code has been tested on macOS Sequoia 15.6.1.

Part 1: Installation and Project Setup

First, we need to install FFmpeg and FluidSynth, two tools that are useful for processing audio data. Here is how to install both using Homebrew on a Mac:

brew install ffmpeg
brew install fluid-synth

We will also be using uv for Python project management. Installation instructions can be found here.

Now we will create a project folder called eulerian-melody-generator, a main.py file to hold the melody-generation logic, and a virtual environment based on Python 3.12:

mkdir eulerian-melody-generator
cd eulerian-melody-generator
uv init --bare
touch main.py
uv venv --python 3.12
source .venv/bin/activate

Next, we need to create a requirements.txt file with the following dependencies, and place the file in the eulerian-melody-generator directory:

matplotlib==3.10.5
midi2audio==0.1.1
midiutil==1.2.1
networkx==3.5

The packages midi2audio and midiutil are needed for audio processing, while matplotlib and networkx will be used to visualize the de Bruijn graph. We can now install these packages in our virtual environment:

uv add -r requirements.txt

Execute uv pip list to verify that the packages have been installed.

Finally, we will need a SoundFont file to render the audio output in response to MIDI data. For the purposes of this article, we will use the file TimGM6mb.sf2, which can be found on this MuseScore site or downloaded directly from here. We will place the file next to main.py in the eulerian-melody-generator directory.

Part 2: Melody Generation Logic

Now, we will implement the melody generation logic in main.py. Let us start by adding the relevant import statements and defining some useful lookup variables:

import os
import random
import subprocess
from collections import defaultdict
from midiutil import MIDIFile
from midi2audio import FluidSynth
import networkx as nx
import matplotlib.pyplot as plt

# Resolve the SoundFont path (assume this is same as working directory)
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
SOUNDFONT_PATH = os.path.abspath(os.path.join(BASE_DIR, ".", "TimGM6mb.sf2"))

# 12‑note chromatic reference
NOTE_TO_OFFSET = {
    "C": 0, "C#":1, "D":2, "D#":3, "E":4,
    "F":5, "F#":6, "G":7, "G#":8, "A":9,
    "A#":10, "B":11
}

# Popular pop‑friendly interval patterns (in semitones from root)
MAJOR          = [0, 2, 4, 5, 7, 9, 11]
NAT_MINOR      = [0, 2, 3, 5, 7, 8, 10]
MAJOR_PENTA    = [0, 2, 4, 7, 9]
MINOR_PENTA    = [0, 3, 5, 7, 10]
MIXOLYDIAN     = [0, 2, 4, 5, 7, 9, 10]
DORIAN         = [0, 2, 3, 5, 7, 9, 10]

We will also define a couple of helper functions to create a dictionary of scales in all twelve keys:

def generate_scales_all_keys(scale_name, intervals):
    """
    Build a given scale in all 12 keys.
    """
    scales = {}
    chromatic = [*NOTE_TO_OFFSET]  # Get dict keys
    for i, root in enumerate(chromatic):
        notes = [chromatic[(i + step) % 12] for step in intervals]
        key_name = f"{root}-{scale_name}"
        scales[key_name] = notes
    return scales


def generate_scale_dict():
    """
    Build a master dictionary of all keys.
    """
    scale_dict = {}
    scale_dict.update(generate_scales_all_keys("Major", MAJOR))
    scale_dict.update(generate_scales_all_keys("Natural-Minor", NAT_MINOR))
    scale_dict.update(generate_scales_all_keys("Major-Pentatonic", MAJOR_PENTA))
    scale_dict.update(generate_scales_all_keys("Minor-Pentatonic", MINOR_PENTA))
    scale_dict.update(generate_scales_all_keys("Mixolydian", MIXOLYDIAN))
    scale_dict.update(generate_scales_all_keys("Dorian", DORIAN))
    return scale_dict

Next, we will implement functions to generate k-mers and their corresponding de Bruijn graph. Note that the k-mer generation is constrained to guarantee a Eulerian path in the de Bruijn graph. We also use a random seed during k-mer generation to ensure reproducibility:

def generate_eulerian_kmers(k, count, scale_notes, seed=42):
    """
    Generate k-mers over the given scale that form a connected De Bruijn graph with a guaranteed Eulerian path.
    """
    random.seed(seed)
    if count  0]
    end_candidates   = [n for n in nodes if in_deg[n] - out_deg[n] > 0]
    if len(start_candidates) > 1 or len(end_candidates) > 1:
        # For simplicity: regenerate until condition met
        return generate_eulerian_kmers(k, count, scale_notes, seed+1)

    return edges


def build_debruijn_graph(kmers):
    """
    Build a De Bruijn-style graph.
    """
    adj = defaultdict(list)
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for kmer in kmers:
        prefix = tuple(kmer[:-1])
        suffix = tuple(kmer[1:])
        adj[prefix].append(suffix)
        out_deg[prefix] += 1
        in_deg[suffix]   += 1
    return adj, in_deg, out_deg

We will implement a function to visualize and save the de Bruijn graph for later use:

def generate_and_save_graph(graph_dict, output_file="debruijn_graph.png", seed=100, k=1):
    """
    Visualize graph and save it as a PNG.
    """
    # Create a directed graph
    G = nx.DiGraph()

    # Add edges from adjacency dict
    for prefix, suffixes in graph_dict.items():
        for suffix in suffixes:
            G.add_edge(prefix, suffix)

    # Layout for nodes (larger k means more spacing between nodes)
    pos = nx.spring_layout(G, seed=seed, k=k)

    # Draw nodes and edges
    plt.figure(figsize=(10, 8))
    nx.draw_networkx_nodes(G, pos, node_size=1600, node_color="skyblue", edgecolors="black")
    nx.draw_networkx_edges(
        G, pos, 
        arrowstyle="-|>", 
        arrowsize=20, 
        edge_color="black",
        connectionstyle="arc3,rad=0.1",
        min_source_margin=20,
        min_target_margin=20
    )
    nx.draw_networkx_labels(G, pos, labels={node: " ".join(node) for node in G.nodes()}, font_size=10)

    # Edge labels
    edge_labels = { (u,v): "" for u,v in G.edges() }
    nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels, font_color="red", font_size=8)

    plt.axis("off")
    plt.tight_layout()
    plt.savefig(output_file, format="PNG", dpi=300)
    plt.close()
    print(f"Graph saved to {output_file}")

Next, we will implement functions to derive a Eulerian path in the de Bruijn graph, and flatten the path into a sequence of notes. In a departure from the DNA fragment assembly approach discussed earlier, we will not deduplicate the overlapping portions of the k-mers during the flattening process to allow for a more aesthetically pleasing melody:

def find_eulerian_path(adj, in_deg, out_deg):
    """
    Find an Eulerian path in the De Bruijn graph.
    """
    start = None
    for node in set(list(adj) + list(in_deg)):
        if out_deg[node] - in_deg[node] == 1:
            start = node
            break
    if start is None:
        start = next(n for n in adj if adj[n])
    stack = [start]
    path  = []
    local_adj = {u: vs[:] for u, vs in adj.items()}
    while stack:
        v = stack[-1]
        if local_adj.get(v):
            u = local_adj[v].pop()
            stack.append(u)
        else:
            path.append(stack.pop())
    return path[::-1]


def flatten_path(path_nodes):
    """
    Flatten a list of note tuples into a single list.
    """
    flattened = []
    for kmer in path_nodes:
        flattened.extend(kmer)
    return flattened

Now, we will write some functions to compose and export the melody as an MP3 file. The key function is compose_and_export, which adds variation to the rendering of the notes that make up the Eulerian path (e.g., different note lengths and octaves) to ensure that the resulting melody does not sound too monotonous. We also suppress/redirect verbose output from FFmpeg and FluidSynth:

def note_with_octave_to_midi(note, octave):
    """
    Helper function for converting a musical pitch like "C#" 
    in some octave into its numeric MIDI note number.
    """
    return 12 * (octave + 1) + NOTE_TO_OFFSET[note]


@contextlib.contextmanager
def suppress_fd_output():
    """
    Redirects stdout and stderr at the OS file descriptor level.
    This catches output from C libraries like FluidSynth.
    """
    with open(os.devnull, 'w') as devnull:
        # Duplicate original file descriptors
        old_stdout_fd = os.dup(1)
        old_stderr_fd = os.dup(2)
        try:
            # Redirect to /dev/null
            os.dup2(devnull.fileno(), 1)
            os.dup2(devnull.fileno(), 2)
            yield
        finally:
            # Restore original file descriptors
            os.dup2(old_stdout_fd, 1)
            os.dup2(old_stderr_fd, 2)
            os.close(old_stdout_fd)
            os.close(old_stderr_fd)


def compose_and_export(final_notes,
                       bpm=120,
                       midi_file="output.mid",
                       wav_file="temp.wav",
                       mp3_file="output.mp3",
                       soundfont_path=SOUNDFONT_PATH):

    # Classical-style rhythmic motifs
    rhythmic_patterns = [
        [1.0, 1.0, 2.0],           # quarter, quarter, half
        [0.5, 0.5, 1.0, 2.0],      # eighth, eighth, quarter, half
        [1.5, 0.5, 1.0, 1.0],      # dotted quarter, eighth, quarter, quarter
        [0.5, 0.5, 0.5, 0.5, 2.0]  # run of eighths, then half
    ]

    # Build an octave contour: ascend then descend
    base_octave = 4
    peak_octave = 5
    contour = []
    half_len = len(final_notes) // 2
    for i in range(len(final_notes)):
        if i = len(final_notes):
                break
            octave = contour[note_index]
            events.append((final_notes[note_index], octave, dur))
            note_index += 1

    # Write MIDI
    mf = MIDIFile(1)
    track = 0
    mf.addTempo(track, 0, bpm)
    time = 0
    for note, octv, dur in events:
        pitch = note_with_octave_to_midi(note, octv)
        mf.addNote(track, channel=0, pitch=pitch,
                   time=time, duration=dur, volume=100)
        time += dur
    with open(midi_file, "wb") as out_f:
        mf.writeFile(out_f)

    # Render to WAV
    with suppress_fd_output():
        fs = FluidSynth(sound_font=soundfont_path)
        fs.midi_to_audio(midi_file, wav_file)

    # Convert to MP3
    subprocess.run(
        [
            "ffmpeg", "-y", "-hide_banner", "-loglevel", "quiet", "-i", 
            wav_file, mp3_file
        ],
        check=True
    )

    print(f"Generated {mp3_file}")

Finally, we will demonstrate how the melody generator can be used in the if name == "main" section of the main.py. Several parameters — the scale, tempo, k-mer length, number of k-mers, number of repetitions (or loops) of the Eulerian path, and the random seed — can be varied to produce different melodies:

if __name__ == "__main__":
    
    SCALE = "C-Major-Pentatonic" # Set "key-scale" e.g. "C-Mixolydian"
    BPM = 200  # Beats per minute (musical tempo)
    KMER_LENGTH = 4  # Length of each k-mer
    NUM_KMERS = 8  # How many k-mers to generate
    NUM_REPEATS = 8  # How often final note sequence should repeat
    RANDOM_SEED = 2  # Seed value to reproduce results

    scale_dict = generate_scale_dict()
    chosen_scale = scale_dict[SCALE]
    print("Chosen scale:", chosen_scale)

    kmers = generate_eulerian_kmers(k=KMER_LENGTH, count=NUM_KMERS, scale_notes=chosen_scale, seed=RANDOM_SEED)
    adj, in_deg, out_deg = build_debruijn_graph(kmers)
    generate_and_save_graph(graph_dict=adj, output_file="debruijn_graph.png", seed=20, k=2)
    path_nodes = find_eulerian_path(adj, in_deg, out_deg)
    print("Eulerian path:", path_nodes)

    final_notes = flatten_path(path_nodes) * NUM_REPEATS  # Several loops of the Eulerian path
    mp3_file = f"{SCALE}_v{RANDOM_SEED}.mp3"  # Construct a searchable filename
    compose_and_export(final_notes=final_notes, bpm=BPM, mp3_file=mp3_file)

Executing uv run main.py produces the following output:

Chosen scale: ['C', 'D', 'E', 'G', 'A']
Graph saved to debruijn_graph.png
Eulerian path: [('C', 'C', 'C'), ('C', 'C', 'E'), ('C', 'E', 'D'), ('E', 'D', 'E'), ('D', 'E', 'E'), ('E', 'E', 'A'), ('E', 'A', 'D'), ('A', 'D', 'A'), ('D', 'A', 'C')]
Generated C-Major-Pentatonic_v2.mp3

As a simpler alternative to following the steps above, the author of this article has created a Python library called emg to achieve the same result, assuming FFmpeg and FluidSynth have already been installed (see details here). Install the library with pip install emg or uv add emg and use it as shown below:

from emg.generator import EulerianMelodyGenerator

# Path to your SoundFont file
sf2_path = "TimGM6mb.sf2"

# Create a generator instance
generator = EulerianMelodyGenerator(
    soundfont_path=sf2_path,
    scale="C-Major-Pentatonic",
    bpm=200,
    kmer_length=4,
    num_kmers=8,
    num_repeats=8,
    random_seed=2
)

# Run the full pipeline
generator.run_generation_pipeline(
    graph_png_path="debruijn_graph.png",
    mp3_output_path="C-Major-Pentatonic_v2.mp3"
)

(Optional) Part 3: Converting MP3 to MP4

We can use FFmpeg to convert the MP3 file to an MP4 file (taking the PNG export of the de Bruijn graph as cover art), which can be uploaded to platforms such as YouTube. The option -loop 1 repeats the PNG image for the whole audio length, -tune stillimage optimizes the encoding for static images, -shortest makes sure that the video stops roughly when the audio ends, and -pix_fmt yuv420p ensures that the output pixel format is compatible with most players:

ffmpeg -loop 1 -i debruijn_graph.png -i C-Major-Pentatonic_v2.mp3 \
  -c:v libx264 -tune stillimage -c:a aac -b:a 192k \
  -pix_fmt yuv420p -shortest C-Major-Pentatonic_v2.mp4

Here is the end result uploaded to YouTube:

The Wrap

In this article, we have seen how an abstract subject like graph theory can have a practical application in the seemingly unrelated area of algorithmic music composition. Interestingly, our use of stochastically generated musical fragments to construct the Eulerian path, and the random variations in note length and octave, echo the practice of aleatoric music composition (alea being the Latin word for “dice”), in which some aspects of the composition and its performance are left to chance.

Beyond music, the concepts discussed in the above sections have practical data science applications in a number of other areas, such as bioinformatics (e.g., DNA fragment assembly), archeology (e.g., reconstructing ancient artifacts from scattered fragments at excavation sites), and logistics (e.g., optimal scheduling of parcel delivery). As technology continues to evolve and the world becomes increasingly digitalized, Eulerian paths and related graph‑theoretic concepts will likely find many more innovative applications across diverse domains.

Source link

#Eulerian #Melodies #Graph #Algorithms #Music #Composition