Sparse AutoEncoder: from Superposition to interpretable features | by Shuyang Xiang | Feb, 2025

Disentangling features in complex neural networks with superposition

Shuyang Xiang

Towards Data Science

Complex neural networks, such as Large Language Models (LLMs), often suffer from interpretability challenges. One of the most important reasons for this difficulty is superposition: a phenomenon in which the neural network has fewer dimensions than the number of features it has to represent. For example, a toy LLM with 2 neurons has to represent 6 different language features. As a result, we often observe that a single neuron needs to activate for multiple features. For a more detailed explanation and definition of superposition, please refer to my previous blog post: “Superposition: What Makes it Difficult to Explain Neural Network”.
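To make this crowding concrete, here is a small sketch (illustrative numbers only, not part of the original experiments): it packs six feature directions into a two-dimensional activation space and shows that they inevitably overlap.

import math
import torch

# Six "feature" directions squeezed into a 2-dimensional neuron space.
angles = torch.arange(6) * math.pi / 6
features = torch.stack([torch.cos(angles), torch.sin(angles)], dim=1)  # shape (6, 2)

# With only 2 dimensions, the 6 directions cannot all be orthogonal:
overlaps = features @ features.T  # pairwise dot products between feature directions
print(overlaps)                   # many off-diagonal entries are far from zero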

In this blog post, we take one step further: let’s try to disentangle some superposed features. I will introduce a methodology called the Sparse Autoencoder to decompose a complex neural network, especially an LLM, into interpretable features, using a toy example of language features.

A Sparse Autoencoder is, by definition, an autoencoder with sparsity introduced on purpose in the activations of its hidden layers. With a rather simple structure and a light training process, it aims to decompose a complex neural network and uncover its features in a way that is more interpretable and more understandable to humans.

Let us imagine that you have a trained neural network. The autoencoder is not part of the training process of the model itself but is instead a post-hoc analysis tool. The original model has its own activations, and these activations are collected afterwards and then used as input data for the sparse autoencoder.

For example, suppose that your original model is a neural network with one hidden layer of 5 neurons, and that you have a training dataset of 5000 samples. You collect the values of the 5-dimensional activation of the hidden layer for all 5000 training samples; these activations are now the input to your sparse autoencoder.
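To make the collection step concrete, here is a minimal sketch using a PyTorch forward hook; the stand-in model, layer sizes and random inputs are placeholders for the real trained network and its dataset.

import torch
import torch.nn as nn

# A stand-in model with a 5-neuron hidden layer; the architecture and the
# random inputs below are illustrative placeholders only.
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(10, 5)   # the 5-neuron hidden layer we want to inspect
        self.out = nn.Linear(5, 3)

    def forward(self, x):
        return self.out(torch.relu(self.hidden(x)))

original_model = ToyModel()
activations = []

# A forward hook records the hidden layer's output on every forward pass.
handle = original_model.hidden.register_forward_hook(
    lambda module, inp, out: activations.append(out.detach())
)

with torch.no_grad():
    original_model(torch.randn(5000, 10))   # 5000 samples, as in the example above

handle.remove()
activation_data = torch.cat(activations, dim=0)  # shape (5000, 5): input for the sparse autoencoder
print(activation_data.shape)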

Image by author: Autoencoder to analyse an LLM

The autoencoder then learns a new, sparse representation from these activations. The encoder maps the original MLP activations into a new vector space with a higher representation dimension. Looking back at my simple 5-neuron example, we might consider mapping it into a vector space with 20 features. Hopefully, we obtain a sparse autoencoder that effectively decomposes the original MLP activations into a representation that is easier to interpret and analyze.

Sparsity is important in the autoencoder because it is what allows the autoencoder to “disentangle” features, with more “freedom” than in a dense, overlapping space. Without sparsity, the autoencoder would probably just learn a trivial compression in which no meaningful features are formed.
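A quick way to see what the L1 penalty used in the training loop later rewards is to compare a dense code with a sparse one (made-up values): the penalty is simply the mean absolute activation, so a code with few active units is penalised less.

import torch

# Made-up codes of the same length; values are purely illustrative.
dense_code = torch.tensor([0.6, 0.5, 0.4, 0.7, 0.5, 0.6])
sparse_code = torch.tensor([0.0, 0.0, 1.9, 0.0, 0.0, 0.0])

print(torch.mean(torch.abs(dense_code)))   # tensor(0.5500)
print(torch.mean(torch.abs(sparse_code)))  # tensor(0.3167)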

Language model

Let us now build our toy model. Please note that this model is not realistic, and even a bit silly in practice, but it is sufficient to showcase how we build a sparse autoencoder and capture some features.

Suppose now that we have built a language model with one particular hidden layer whose activation has four dimensions. Let us also suppose that the training dataset contains the following tokens: “cat,” “happy cat,” “dog,” “energetic dog,” “not cat,” “not dog,” “robot,” and “AI assistant,” and that they have the following activation values.

import torch

data = torch.tensor([
    # Cat categories
    [0.8, 0.3, 0.1, 0.05],    # "cat"
    [0.82, 0.32, 0.12, 0.06], # "happy cat" (similar to "cat")

    # Dog categories
    [0.7, 0.2, 0.05, 0.2],    # "dog"
    [0.75, 0.3, 0.1, 0.25],   # "energetic dog" (similar to "dog")

    # "Not animal" categories
    [0.05, 0.9, 0.4, 0.4],    # "not cat"
    [0.15, 0.85, 0.35, 0.5],  # "not dog"

    # Robot and AI assistant (more distinct in 4D space)
    [0.0, 0.7, 0.9, 0.8],     # "robot"
    [0.1, 0.6, 0.85, 0.75]    # "AI assistant"
], dtype=torch.float32)

Construction of autoencoder

We now build the autoencoder with the following code:

import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, input_dim, hidden_dim):
        super(SparseAutoencoder, self).__init__()
        # Encoder: one linear layer followed by ReLU, mapping into the larger feature space
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU()
        )
        # Decoder: one linear layer reconstructing the original activations
        self.decoder = nn.Sequential(
            nn.Linear(hidden_dim, input_dim)
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return encoded, decoded

According to the code above, the encoder has only one fully connected linear layer, mapping the input to a hidden representation of size hidden_dim, followed by a ReLU activation. The decoder uses just one linear layer to reconstruct the input. Note that the absence of a ReLU activation in the decoder is intentional for our specific reconstruction case, because the reconstruction might contain real-valued and potentially negative data; a ReLU would force the output to stay non-negative, which is not desirable for our reconstruction.

We train the model using the code below. Here, the loss function has two parts: the reconstruction loss, measuring how accurately the autoencoder reconstructs the input data, and a sparsity loss (with a weight), which encourages sparse activations in the encoder.
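For completeness, here is one possible setup for the loop below; the hidden size, learning rate, sparsity weight and number of epochs are illustrative choices of mine rather than tuned values.

import torch.nn as nn
import torch.optim as optim

# Illustrative hyperparameters, not values from the original notebook.
input_dim = 4                 # dimension of the toy activations defined above
hidden_dim = 8                # over-complete code: more units than input dimensions
num_epochs = 2000
sparsity_weight = 0.1

model = SparseAutoencoder(input_dim, hidden_dim)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)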

# Training loop
for epoch in range(num_epochs):
    optimizer.zero_grad()

    # Forward pass
    encoded, decoded = model(data)

    # Reconstruction loss
    reconstruction_loss = criterion(decoded, data)

    # Sparsity penalty (L1 regularization on the encoded features)
    sparsity_loss = torch.mean(torch.abs(encoded))

    # Total loss
    loss = reconstruction_loss + sparsity_weight * sparsity_loss

    # Backward pass and optimization
    loss.backward()
    optimizer.step()

Now we can have a look at the result. We have plotted the encoder’s output values for each activation of the original model. Recall that the input tokens are “cat,” “happy cat,” “dog,” “energetic dog,” “not cat,” “not dog,” “robot,” and “AI assistant”.
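If you prefer numbers to a plot, a quick optional check is to print, for each token, the hidden features that fire above a small threshold; the threshold and formatting here are arbitrary choices for inspection only.

tokens = ["cat", "happy cat", "dog", "energetic dog",
          "not cat", "not dog", "robot", "AI assistant"]

with torch.no_grad():
    encoded, _ = model(data)

for token, code in zip(tokens, encoded):
    # List the encoder units whose activation exceeds an arbitrary threshold of 0.1
    active = [f"f{i}={v:.2f}" for i, v in enumerate(code.tolist()) if v > 0.1]
    print(f"{token:15s} -> {', '.join(active) if active else '(no strong feature)'}")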

Image by author: features learned by encoder

Even though the original model was designed with a very simple architecture and without any deep consideration, the autoencoder has still captured meaningful features of this trivial model. According to the plot above, we can observe at least four features that the encoder appears to have learned.

Consider Feature 1 first. This feature has large activation values on the following 4 tokens: “cat”, “happy cat”, “dog”, and “energetic dog”. The result suggests that Feature 1 relates to something like “animals” or “pets”. Feature 2 is also an interesting example, activating on the two tokens “robot” and “AI assistant”. We guess, therefore, that this feature has something to do with “artificial agents and robotics”, indicating the model’s understanding of technological contexts. Feature 3 activates on 4 tokens: “not cat”, “not dog”, “robot”, and “AI assistant”, and is possibly a “not an animal” feature.

Unfortunately, the original model is not a real model trained on real-world text, but one designed artificially under the assumption that similar tokens have some similarity in the activation vector space. However, the results still provide interesting insights: the sparse autoencoder succeeded in surfacing meaningful, human-friendly features corresponding to real-world concepts.

The simple result in this blog post suggests that a sparse autoencoder can effectively help to extract high-level, interpretable features from complex neural networks such as LLMs.

For readers interested in a real-world implementation of sparse autoencoders, I recommend this article, where an autoencoder was trained to interpret a real large language model with 512 neurons. This study provides a real application of sparse autoencoders in the context of LLM interpretability.

Finally, I provide this Google Colab notebook with the detailed implementation mentioned in this article.

