
From Python to JavaScript: A Playbook for Data Analytics in n8n with Code Node Examples


When I built my first n8n workflow as a data scientist, it felt like I was cheating.

I could connect to APIs without reading 30-page docs, trigger workflows from Gmail or Sheets, and deploy something useful in minutes.

However, there is a significant drawback: n8n is not natively optimised to run a Python environment on the cloud instances used by our customers.

Like many data scientists, my daily toolbox for data analytics is built on NumPy and Pandas.

To stay in my comfort zone, I often outsourced calculations to external APIs instead of using n8n JavaScript code nodes.

Production Planning n8n workflow with API function calling – (Image by Samir Saci)

For instance, this is what I did with a Production Planning Optimisation tool: the workflow includes an Agent node that calls a FastAPI microservice.

This approach worked, but some clients requested complete visibility of the data analytics tasks in their n8n user interface.

I realised that I needed to learn just enough JavaScript to perform data processing with n8n's native Code nodes.

Example of JavaScript node grouping sales by ITEM – (Image by Samir Saci)

In this article, we will experiment with small JavaScript snippets inside n8n Code nodes to perform everyday data analytics tasks.

For this exercise, I will use a dataset of sales transactions and walk it through an ABC and Pareto analysis, two methods widely used in Supply Chain Management.

ABC XYZ & Pareto Charts widely used in Supply Chain Management – (Image by Samir Saci)

I will provide side-by-side examples of Pandas vs. JavaScript in n8n Code nodes, allowing us to translate our familiar Python data analysis steps directly into automated n8n workflows.

Example of JavaScript vs. Pandas – (Image by Samir Saci)

The idea is to implement these solutions for small datasets or quick prototyping within the capabilities of a cloud enterprise n8n instance (i.e. without community nodes).

The experimental workflow we will build together – (Image by Samir Saci)

I will end the experiment with a quick comparative study of the performance versus a FastAPI call.

You can follow along and replicate the entire workflow using the Google Sheet and workflow template shared in the article.

Let’s start!

Building a Data Analytics Workflow using JavaScript in n8n

Before starting to build nodes, I will introduce the context of this analysis.

ABC & Pareto Charts for Supply Chain Management

For this tutorial, I propose that you build a simple workflow that takes sales transactions from Google Sheets and transforms them into comprehensive ABC and Pareto charts.

This will replicate the ABC and Pareto Analysis module of the LogiGreen Apps developed by my startup, LogiGreen.

ABC Analysis Module of the LogiGreen Apps – (Image by Samir Saci)

The goal is to generate a set of visuals for the inventory teams of a supermarket chain to help them understand the distribution of sales across their stores.

We will focus on generating two visuals.

The first chart shows an ABC-XYZ analysis of sales items:

ABC XYZ Chart – (Image by Samir Saci)
  • X-axis (Percentage of Turnover %): the contribution of each item to total revenue.
  • Y-axis (Coefficient of Variation): demand variability of each item.
  • Vertical red lines split items into A, B, and C classes based on turnover share.
  • The horizontal blue line marks the boundary between stable and variable demand (CV = 1).

Together, these highlight which items are high-value and stable (class A, low CV) versus low-value or highly variable, guiding prioritisation in inventory management.
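As a quick reminder, the coefficient of variation on the Y-axis is simply CV = (standard deviation of daily sales) / (average daily sales). With purely illustrative numbers (not taken from the dataset): an item selling 10 units/day on average with a standard deviation of 4 units has CV = 0.4 and sits below the blue line (stable demand), while an item averaging 5 units/day with a standard deviation of 8 units has CV = 1.6 and sits above it (highly variable demand).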

The second visual is a Pareto analysis of sales turnover:

Pareto Chart generated by the Logigreen App – Image by Samir Saci
  • X-axis: percentage of SKUs (ranked by sales).
  • Y-axis: cumulative percentage of annual turnover.
  • The curve illustrates how a small fraction of items contributes to the majority of revenue.

In short, this confirms (or not) the classic Pareto rule, which states that roughly 80% of sales come from 20% of the SKUs.

How did I generate these two visuals? I simply used Python.

On my YouTube channel, I shared a complete tutorial on how to do it using Pandas and Matplotlib.

The objective of this tutorial is to prepare sales transactions and generate these visuals in a Google Sheet using only n8n’s native JavaScript nodes.

Building a Data Analytics Workflow in n8n

I propose to build a workflow that is manually triggered to facilitate debugging during development.

Final Workflow manually triggered to collect data from Google Sheets to generate visuals – (Image by Samir Saci)

To follow this tutorial, you first need to duplicate the Google Sheet and import the workflow template shared in the article.

You can now connect your duplicated sheet using the second node, which will extract the dataset from the worksheet: Input Data.

Connect the second node to your copy of the Google Sheet to collect input data – (Image by Samir Saci)

This dataset includes retail sales transactions at the daily granularity:

  • ITEM: an item that can be sold in multiple stores
  • SKU: an item sold in a specific store (ITEM-STORE combination)
  • FAMILY: a group of items
  • CATEGORY: a product category can include multiple families
  • STORE: a code representing a sales location
  • DAY: the date of the transaction
  • QTY: sales quantity in units
  • TO: sales turnover in euros

The output is the table’s content in JSON format, ready to be ingested by other nodes.
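For reference, once the Google Sheets node has run, each row reaches the next nodes as an n8n item with a json property. A single item looks roughly like this (the values below are illustrative, not taken from the actual dataset):

[
  {
    "json": {
      "ITEM": "ITEM-001",
      "SKU": "ITEM-001-STORE-12",
      "FAMILY": "FAMILY-A",
      "CATEGORY": "CATEGORY-1",
      "STORE": "STORE-12",
      "DAY": "2023-01-15",
      "QTY": 6,
      "TO": 24.5
    }
  }
]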

Python Code

import pandas as pd
df = pd.read_csv("sales.csv") 

We can now begin processing the dataset to build our two visualisations.

Step 1: Filter out transactions without sales

Let us begin with the simple action of filtering out transactions with sales QTY equal to zero.

Filter out transactions without sales using the filter node – (Image by Samir Saci)

We do not need JavaScript; a simple Filter node can do the job.

Python Code

df = df[df["QTY"] != 0]
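If you ever need this logic inside a Code node instead, for example to chain it with other JavaScript steps, a minimal equivalent (assuming the same item structure as above) would be:

// Keep only the rows where the sales quantity is non-zero
return items.filter(i => Number(i.json.QTY) !== 0);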

Step 2: Prepare data for Pareto Analysis

We first need to aggregate the sales per ITEM and rank products by turnover.

Python Code

sku_agg = (df.groupby("ITEM", as_index=False)
             .agg(TO=("TO","sum"), QTY=("QTY","sum"))
             .sort_values("TO", ascending=False))

In our workflow, this step will be done in the JavaScript node TO, QTY GroupBY ITEM:

const agg = {};
// Loop over all incoming n8n items and accumulate TO and QTY per ITEM
for (const {json} of items) {
  const ITEM = json.ITEM;
  const TO = Number(json.TO);
  const QTY = Number(json.QTY);
  if (!agg[ITEM]) agg[ITEM] = { ITEM, TO: 0, QTY: 0 }; // initialise running totals
  agg[ITEM].TO += TO;
  agg[ITEM].QTY += QTY;
}
// Sort by turnover (descending) and return the rows as n8n items
const rows = Object.values(agg).sort((a,b)=> b.TO - a.TO);
return rows.map(r => ({ json: r }));

This node returns a ranked table of sales per ITEM in quantity (QTY) and turnover (TO): 

  1. We initiate agg as a dictionary keyed by ITEM
  2. We loop over the n8n rows in items:
  • Converting TO and QTY to numbers
  • Adding the QTY and TO values to the running totals of each ITEM
  3. We finally transform the dictionary into an array sorted by TO descending and return the items

Output data of the aggregation of sales by ITEM – (Image by Samir Saci)

We now have the data ready to perform a Pareto Analysis on sales quantity (QTY) or turnover (TO).

For that, we need to calculate cumulative sales and rank SKUs from the highest to the lowest contributor.

Python Code

abc = sku_agg.copy()  # from Step 2, already sorted by TO desc
total = abc["TO"].sum() or 1.0
abc["cum_turnover"] = abc["TO"].cumsum()
abc["cum_share"]    = abc["cum_turnover"] / total             
abc["sku_rank"]     = range(1, len(abc) + 1)
abc["cum_skus"]     = abc["sku_rank"] / len(abc)               
abc["cum_skus_pct"] = abc["cum_skus"] * 100                    

This step will be done in the code node Pareto Analysis:

const rows = items
  .map(i => ({
    ...i.json,
    TO: Number(i.json.TO || 0),   
    QTY: Number(i.json.QTY || 0),
  }))
  .sort((a, b) => b.TO - a.TO);

const n = rows.length; // number of ITEM
const totalTO = rows.reduce((s, r) => s + r.TO, 0) || 1;

We collect the dataset items from the previous node:

  1. For each row, we clean up the fields TO and QTY (in case we have missing values)
  2. We sort all SKUs by turnover in descending order.
  3. We store in variables the number of items and the total turnover

let cumTO = 0;
rows.forEach((r, idx) => {
  cumTO += r.TO;
  r.cum_turnover = cumTO;                     
  r.cum_share = +(cumTO / totalTO).toFixed(6); 
  r.sku_rank = idx + 1;
  r.cum_skus = +((idx + 1) / n).toFixed(6);   
  r.cum_skus_pct = +(r.cum_skus * 100).toFixed(2);
});

return rows.map(r => ({ json: r }));

Then we loop over all items in sorted order.

  1. Use the variable cumTO to compute the cumulative contribution
  2. Add several Pareto metrics to each row:
  • cum_turnover: cumulative turnover up to this item
  • cum_share: cumulative share of turnover
  • sku_rank: ranking position of the item
  • cum_skus: cumulative number of SKUs as a fraction of total SKUs
  • cum_skus_pct: same as cum_skus, but in %.

We are then done with the data preparation for the Pareto chart.

Final results – (Image by Samir Saci)

This dataset will be stored in the worksheet Pareto by the node Update Pareto Sheet.

And with a bit of magic, we can generate this graph in the first worksheet:

Pareto Chart generated using data processed by the n8n workflow – (Image by Samir Saci)

We can now continue with the ABC XYZ chart.

Step 3: Calculate the demand variability and sales contribution

We could reuse the output of the Pareto chart for the sales contribution, but we will treat each chart as independent.

I will split the code of the node Demand Variability x Sales % into multiple segments for clarity.

Block 1: define functions for mean and standard deviation

function mean(a){ return a.reduce((s,x)=>s + x, 0) / (a.length || 1); }
function stdev_samp(a){
  if (a.length < 2) return 0;
  const m = mean(a);
  const v = a.reduce((s,x) => s + (x - m) ** 2, 0) / (a.length - 1);
  return Math.sqrt(v);
}

These two functions will be used to compute the coefficient of variation (CV):

  • mean(a): computes the average of an array.
  • stdev_samp(a): computes the sample standard deviation

They take as input the daily sales distribution of each ITEM, which we build in the second block.

Block 2: Create the daily sales distribution of each ITEM

const series = {};  // ITEM -> { day -> qty_sum }
let totalQty = 0;

for (const { json } of items) {
  const item = String(json.ITEM);
  const day  = String(json.DAY);
  const qty  = Number(json.QTY || 0);

  if (!series[item]) series[item] = {};
  series[item][day] = (series[item][day] || 0) + qty;
  totalQty += qty;
}

Python Code

import pandas as pd
import numpy as np
df['QTY'] = pd.to_numeric(df['QTY'], errors='coerce').fillna(0)
daily_series = df.groupby(['ITEM', 'DAY'])['QTY'].sum().reset_index()

Now we can compute the metrics applied to the daily sales distributions.

const out = [];
for (const [item, dayMap] of Object.entries(series)) {
  const daily = Object.values(dayMap); // daily sales quantities
  const qty_total = daily.reduce((s,x)=>s+x, 0);
  const m = mean(daily);               // average daily sales
  const sd = stdev_samp(daily);        // variability of sales
  const cv = m ? sd / m : null;        // coefficient of variation
  const share_qty_pct = totalQty ? (qty_total / totalQty) * 100 : 0;

  out.push({
    ITEM: item,
    qty_total,
    share_qty_pct: Number(share_qty_pct.toFixed(2)),
    mean_qty: Number(m.toFixed(3)),
    std_qty: Number(sd.toFixed(3)),
    cv_qty: cv == null ? null : Number(cv.toFixed(3)),
  });
}

For each ITEM, we calculate:

  • qty_total: total sales
  • mean_qty: average daily sales.
  • std_qty: standard deviation of daily sales.
  • cv_qty: coefficient of variation (variability measure for XYZ classification)
  • share_qty_pct: % contribution to total sales (used for ABC classification)

Here is the Python version in case you were lost:

summary = daily_series.groupby('ITEM').agg(
    qty_total=('QTY', 'sum'),
    mean_qty=('QTY', 'mean'),
    std_qty=('QTY', 'std')
).reset_index()

summary['std_qty'] = summary['std_qty'].fillna(0)

total_qty = summary['qty_total'].sum()
summary['cv_qty'] = summary['std_qty'] / summary['mean_qty'].replace(0, np.nan)
summary['share_qty_pct'] = 100 * summary['qty_total'] / total_qty

We are nearly done.

We just need to sort by descending contribution to prepare for the ABC class mapping:

out.sort((a,b) => b.share_qty_pct - a.share_qty_pct);
return out.map(r => ({ json: r }));

We now have for each ITEM, the key metrics needed to create the scatter plot.

Output of the node Demand Variability x Sales % – (Image by Samir Saci)

Only the ABC classes are missing at this step.

Step 4: Add ABC classes

We take the output of the previous node as input.

let rows = items.map(i => i.json);
rows.sort((a, b) => b.share_qty_pct - a.share_qty_pct);

Just in case, we sort the items by descending sales share (%), so the most important SKUs come first.

(This step can be omitted as it is normally already completed at the end of the previous code node.)

Then we can apply the classes based on hardcoded thresholds, storing the result in a new field (named ABC in the snippet below):

  • A: SKUs that together represent the first 5% of sales
  • B: SKUs that together represent the next 15% of sales
  • C: Everything after 20%.

let cum = 0;
for (let r of rows) {
  cum += r.share_qty_pct;

  // Assign class based on cumulative %
  if (cum <= 5) {
    r.ABC = "A";
  } else if (cum <= 20) {
    r.ABC = "B";
  } else {
    r.ABC = "C";
  }
}

return rows.map(r => ({ json: r }));

Here is how this can be done in Python:

df = df.sort_values('share_qty_pct', ascending=False).reset_index(drop=True)
df['cum_share'] = df['share_qty_pct'].cumsum()
def classify(cum):
    if cum <= 5: return 'A'
    if cum <= 20: return 'B'
    return 'C'

df['ABC'] = df['cum_share'].apply(classify)

The results can now be used to generate this chart, which can be found in the first sheet of the Google Sheet:

ABC XYZ Chart generated with the data processed by the workflow using JavaScript – (Image by Samir Saci)

I struggled (probably due to my limited knowledge of Google Sheets) to find a “manual” solution to create this scatter plot with the correct colour mapping.

Therefore, I used a Google Apps Script available in the Google Sheet to create it.

Script included in the Google Sheet to generate the visual – (Image by Samir Saci)
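If you want to adapt it to your own sheet, here is a minimal sketch of what such an Apps Script can look like. The sheet name, data range, and chart options are assumptions, not the exact script shipped with the template, and the per-class colour mapping still requires building one series per ABC class:

function buildAbcXyzChart() {
  // Assumed worksheet and data range: adjust to your own layout
  const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("ABC XYZ");
  const range = sheet.getRange("A1:C100"); // e.g. turnover share, CV, class

  const chart = sheet.newChart()
    .setChartType(Charts.ChartType.SCATTER)
    .addRange(range)
    .setOption("title", "ABC XYZ Analysis")
    .setPosition(2, 5, 0, 0) // anchor the chart at row 2, column 5
    .build();

  sheet.insertChart(chart);
}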

As a bonus, I added more nodes to the n8n template that perform the same type of GroupBy to calculate sales by store or by ITEM-STORE pair.

The experimental workflow we built together – (Image by Samir Saci)

They can be used to create visuals like this one:

Total Daily Sales Quantity per Store – (Image by Samir Saci)
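As an illustration, a minimal sketch of such a GroupBy, here on the ITEM-STORE pair and following the same pattern as the node TO, QTY GroupBY ITEM, could look like this:

// Aggregate sales quantity and turnover by (ITEM, STORE) pair
const agg = {};
for (const { json } of items) {
  const key = `${json.ITEM}|${json.STORE}`;
  if (!agg[key]) agg[key] = { ITEM: json.ITEM, STORE: json.STORE, TO: 0, QTY: 0 };
  agg[key].TO += Number(json.TO || 0);
  agg[key].QTY += Number(json.QTY || 0);
}
return Object.values(agg)
  .sort((a, b) => b.TO - a.TO)
  .map(r => ({ json: r }));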

To conclude this tutorial, we can confidently declare that the job is done.

For a live demo of the workflow, you can have a look at this short tutorial.

Our customers, who run this workflow on their n8n cloud instance, can now gain visibility into each step of the data processing.

But at what cost? Are we losing performance?

This is what we will discover in the next section.

Comparative Study of Performance: n8n JavaScript Node vs. Python in FastAPI

To answer this question, I prepared a straightforward experiment.

The same dataset and transformations were processed using two different approaches inside n8n:

  1. All in JavaScript: the functions run directly inside n8n Code nodes.
  2. Outsourcing to FastAPI microservices: the JavaScript logic is replaced with HTTP requests to Python endpoints.

Simple Workflow using FastAPI microservice – (Image by Samir Saci)

These two endpoints are connected to functions that will load the data directly from the VPS instance where I hosted the microservice.

@router.post("/launch_pareto")
async def launch_speedtest(request: Request):
    try:
        session_id = request.headers.get('session_id', 'session')

        folder_in = f'data/session/speed_test/input'
        if not path.exists(folder_in):
            makedirs(folder_in)

        file_path = folder_in + '/sales.csv'
        logger.info(f"[SpeedTest]: Loading data from session file: {file_path}")
        df = pd.read_csv(file_path, sep=";")
        logger.info(f"[SpeedTest]: Data loaded successfully: {df.head()}")

        speed_tester = SpeedAnalysis(df)
        output = await speed_tester.process_pareto()
        
        result = output.to_dict(orient="records")
        result = speed_tester.convert_numpy(result)
        
        logger.info(f"[SpeedTest]: /launch_pareto completed successfully for {session_id}")
        return result
    except Exception as e:
        logger.error(f"[SpeedTest]: Error /launch_pareto: {str(e)}\n{traceback.format_exc()}")
        raise HTTPException(status_code=500, detail=f"Failed to process Speed Test Analysis: {str(e)}")
    
@router.post("/launch_abc_xyz")
async def launch_abc_xyz(request: Request):
    try:
        session_id = request.headers.get('session_id', 'session')

        folder_in = f'data/session/speed_test/input'
        if not path.exists(folder_in):
            makedirs(folder_in)

        file_path = folder_in + '/sales.csv'
        logger.info(f"[SpeedTest]: Loading data from session file: {file_path}")
        df = pd.read_csv(file_path, sep=";")
        logger.info(f"[SpeedTest]: Data loaded successfully: {df.head()}")

        speed_tester = SpeedAnalysis(df)
        output = await speed_tester.process_abcxyz()
        
        result = output.to_dict(orient="records")
        result = speed_tester.convert_numpy(result)
        
        logger.info(f"[SpeedTest]: /launch_abc_xyz completed successfully for {session_id}")
        return result
    except Exception as e:
        logger.error(f"[SpeedTest]: Error /launch_abc_xyz: {str(e)}\n{traceback.format_exc()}")
        raise HTTPException(status_code=500, detail=f"Failed to process Speed Test Analysis: {str(e)}")
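For reference, such an endpoint can be tested with a few lines of JavaScript before wiring it into the n8n HTTP Request node. Only the route and the session_id header come from the code above; the host is a placeholder for wherever the microservice is deployed:

// Hypothetical test call to the Pareto endpoint (Node.js 18+ built-in fetch)
const response = await fetch("https://your-fastapi-host/launch_pareto", {
  method: "POST",
  headers: { session_id: "session" },
});
const paretoRows = await response.json();
console.log(paretoRows.slice(0, 5)); // first five aggregated ITEM rows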

I want to focus this test only on the data processing performance.

The SpeedAnalysis class includes all the data processing steps listed in the previous section:

  • Grouping sales by ITEM
  • Sorting ITEMs in descending order and calculating cumulative sales
  • Calculating standard deviations and means of the sales distribution of each ITEM

class SpeedAnalysis:
    def __init__(self, df: pd.DataFrame):
        config = load_config()
        
        self.df = df
        
    def processing(self):
        try:
            sales = self.df.copy()
            sales = sales[sales['QTY']>0].copy()
            self.sales = sales

        except Exception as e:
            logger.error(f'[SpeedTest] Error for processing : {e}\n{traceback.format_exc()}')
            
    def prepare_pareto(self):
        try:
            sku_agg = self.sales.copy()
            sku_agg = (sku_agg.groupby("ITEM", as_index=False)
             .agg(TO=("TO","sum"), QTY=("QTY","sum"))
             .sort_values("TO", ascending=False))
            
            pareto = sku_agg.copy()  
            total = pareto["TO"].sum() or 1.0
            pareto["cum_turnover"] = pareto["TO"].cumsum()
            pareto["cum_share"]    = pareto["cum_turnover"] / total              
            pareto["sku_rank"]     = range(1, len(pareto) + 1)
            pareto["cum_skus"]     = pareto["sku_rank"] / len(pareto)               
            pareto["cum_skus_pct"] = pareto["cum_skus"] * 100
            return pareto                    
        except Exception as e:
            logger.error(f'[SpeedTest]Error for prepare_pareto: {e}\n{traceback.format_exc()}')
            
    def abc_xyz(self):
            daily = self.sales.groupby(["ITEM", "DAY"], as_index=False)["QTY"].sum()
            stats = (
                daily.groupby("ITEM")["QTY"]
                .agg(
                    qty_total="sum",
                    mean_qty="mean",
                    std_qty="std"
                )
                .reset_index()
            )
            stats["cv_qty"] = stats["std_qty"] / stats["mean_qty"].replace(0, np.nan)
            total_qty = stats["qty_total"].sum()
            stats["share_qty_pct"] = (stats["qty_total"] / total_qty * 100).round(2)
            stats = stats.sort_values("share_qty_pct", ascending=False).reset_index(drop=True)
            stats["cum_share"] = stats["share_qty_pct"].cumsum().round(2)
            def classify(cum):
                if cum <= 5:
                    return "A"
                elif cum <= 20:
                    return "B"
                return "C"

            stats["ABC"] = stats["cum_share"].apply(classify)
            return stats

Now that we have these endpoints ready, we can begin testing.

Results of the experimentation (Top: Processing using native code nodes / Bottom: FastAPI Microservice) – (Image by Samir Saci)

The results are shown above:

  • JavaScript-only workflow: the whole process completed in roughly 11.7 seconds.
    Most of the time was spent updating sheets and performing iterative calculations within n8n nodes.
  • FastAPI-backed workflow: the equivalent “outsourced” process completed in ~11.0 seconds.
    Heavy computations were offloaded to Python microservices, which handled them faster than the native JavaScript nodes.

In other words, outsourcing complex computations to Python improves performance, even on this small dataset.

The reason is that the FastAPI endpoints execute optimised, vectorised Pandas operations, while the JavaScript nodes inside n8n must iterate over the rows with explicit loops.

For larger datasets, I would expect the gap to be far from negligible.

This demonstrates that you can do simple data processing within n8n using small JavaScript snippets.

However, our Supply Chain Analytics products can require more advanced processing involving optimisation and advanced statistical libraries.

AI Workflow for Production Planning Optimisation (Image by Samir Saci)

For that, customers can accept a “black box” approach, as in the Production Planning workflow presented in this Towards Data Science article.

But for light processing tasks, we can integrate them into the workflow to provide visibility to non-code users.

For another project, I use n8n to connect Supply Chain IT systems for the transfer of Purchase Orders using Electronic Data Interchange (EDI).

Example of Electronic Data Interchange (EDI) Parsing Workflow – (Image by Samir Saci)

This workflow, deployed for a small logistics company, entirely parses EDI messages using JavaScript nodes.

Example of Electronic Data Interchange Message – (Image by Samir Saci)

As you can discover in this tutorial, we have performed 100% of the Electronic Data Interchange message parsing using JavaScript nodes.
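To give an idea of what this involves, here is a minimal sketch, not the production code, that splits an EDIFACT-style message into segments and elements, assuming the standard apostrophe and plus delimiters (the edi_message field name is hypothetical):

// Minimal EDIFACT-style parsing sketch: segments end with "'", elements are separated by "+"
const message = items[0].json.edi_message; // hypothetical field holding the raw EDI text
const segments = message
  .split("'")
  .map(s => s.trim())
  .filter(Boolean)
  .map(segment => {
    const elements = segment.split("+");
    return { tag: elements[0], elements: elements.slice(1) };
  });
return segments.map(s => ({ json: s }));

A production parser also has to handle the release character (?) and composite elements separated by colons, but the overall structure remains the same.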

This helped us to improve the robustness of the solution and reduce our workload by handing over the maintenance to the customer.

What is the best approach?

For me, n8n should be used as an orchestration and integration tool connected to our core analytics products.

These analytics products require specific input formats that may not align with our customers’ data.

Therefore, I would advise using JavaScript code nodes to perform this preprocessing.

Workflow for Distribution Planning Optimisation Algorithm – (Image by Samir Saci)

For example, the workflow above connects a Google Sheet (containing input data) to a FastAPI microservice that runs an algorithm for Distribution Planning Optimisation.

The idea is to plug our optimisation algorithm into a Google Sheet used by Distribution Planners to organise store deliveries.

Worksheet used by Planning Teams – (Image by Samir Saci)

The JavaScript code node is used to transform the data collected from the Google Sheet into the input format required by our algorithm.
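In practice, that transformation is often just a reshape of the sheet rows into the JSON payload expected by the API. Here is a minimal sketch, with hypothetical field names rather than the actual schema of our microservice:

// Reshape the Google Sheet rows into the payload expected by the optimisation API
// (field names below are hypothetical examples)
const payload = {
  orders: items.map(({ json }) => ({
    store: String(json.STORE),
    item: String(json.ITEM),
    quantity: Number(json.QTY || 0),
    delivery_date: String(json.DAY),
  })),
};
return [{ json: payload }];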

By doing this job inside the workflow, the preprocessing remains under the control of the customer, who runs the workflow in their own instance.

And we can keep the optimisation part in a microservice hosted on our instance.

To better understand the setup, feel free to have a look at this short presentation.

I hope this tutorial and the examples above have given you enough insight to understand what can be done with n8n in terms of data analytics.

Feel free to share your comments on the approach, and your thoughts on how the workflow's performance could be improved.

About Me

Let’s connect on Linkedin and Twitter. I am a Supply Chain Engineer who uses data analytics to improve logistics operations and reduce costs.

For consulting or advice on analytics and sustainable supply chain transformation, feel free to contact me via Logigreen Consulting.

Find your complete guide for Supply Chain Analytics: Analytics Cheat Sheet.

If you are interested in Data Analytics and Supply Chain, look at my website.

Samir Saci | Data Science & Productivity


