What’s Inside a Neural Network?. Plotting surface of error in 3D using… | by Aleksei Rozanov

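First and foremost, we need synthetic data to work with. The data should exhibit some non-linear dependency. Let’s define it like this:

$$
y = \begin{cases}
2x + 5, & x < -2 \\
7.3\sin(x), & -2 \le x < 2 \\
-0.03x^3 + 2, & x \ge 2
\end{cases} \; + \; \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, 1)
$$

In Python, it looks like this: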

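import numpy as np

np.random.seed(42)
X = np.random.normal(1, 4.5, 10000)  # 10,000 inputs with mean 1 and std 4.5
# Piecewise target plus Gaussian noise with std 1
y = np.piecewise(X, [X < -2, (X >= -2) & (X < 2), X >= 2], [lambda x: 2*x + 5, lambda x: 7.3*np.sin(x), lambda x: -0.03*x**3 + 2]) + np.random.normal(0, 1, X.shape)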

After visualization:
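The shape is also easy to probe numerically. Below is a small sketch of the noise-free target in plain Python (`target` is a name introduced here for illustration, not part of the original code). Evaluating it on both sides of x = -2 and x = 2 shows that the function jumps at both breakpoints.

```python
import math


def target(x: float) -> float:
    """Noise-free version of the piecewise target (hypothetical helper)."""
    if x < -2:
        return 2 * x + 5
    if x < 2:
        return 7.3 * math.sin(x)
    return -0.03 * x**3 + 2


# Evaluate just left of each breakpoint and exactly at it to expose the jumps.
eps = 1e-9
for edge in (-2.0, 2.0):
    left, right = target(edge - eps), target(edge)
    print(f"x = {edge:+.0f}: left = {left:.3f}, right = {right:.3f}")
```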

Since the plot is three-dimensional (two weight axes plus the loss), our neural network can have only 2 weights. This means the ANN will consist of a single hidden neuron with one input and one output. Implementing this in PyTorch is quite intuitive:

import torch
from torch import nn

class ANN(nn.Module):
    def __init__(self, input_size, N, output_size):
        super().__init__()
        # No biases: with input_size = N = output_size = 1 the network has exactly two weights
        self.net = nn.Sequential()
        self.net.add_module(name='Layer_1', module=nn.Linear(input_size, N, bias=False))
        self.net.add_module(name='Tanh', module=nn.Tanh())
        self.net.add_module(name='Layer_2', module=nn.Linear(N, output_size, bias=False))

    def forward(self, x):
        return self.net(x)

Important! Don’t forget to turn off the biases in your layers, otherwise you’ll end up with twice as many parameters (two weights plus two biases), and the error surface will no longer fit into a 3D plot.
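The parameter count is simple arithmetic, sketched below in plain Python. The helper names `linear_params` and `ann_params` are hypothetical and exist only for this illustration. On a real PyTorch model, `sum(p.numel() for p in model.parameters())` should report the same numbers.

```python
def linear_params(in_features: int, out_features: int, bias: bool) -> int:
    """Number of trainable values in one fully connected layer."""
    weights = in_features * out_features
    return weights + (out_features if bias else 0)


def ann_params(input_size: int, n_hidden: int, output_size: int, bias: bool) -> int:
    """Two linear layers; the Tanh in between has no parameters."""
    return (linear_params(input_size, n_hidden, bias)
            + linear_params(n_hidden, output_size, bias))


print(ann_params(1, 1, 1, bias=False))  # 2 weights
print(ann_params(1, 1, 1, bias=True))   # 2 weights + 2 biases = 4
```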

To build the error surface, we first need to create a grid of possible values for W1 and W2. Then, for each weight combination, we will update the parameters of the network and calculate the error:

# `loss` (e.g. nn.MSELoss()) and `test_loader` are assumed to be defined beforehand
model = ANN(1, 1, 1)
W1, W2 = np.arange(-2, 2, 0.05), np.arange(-2, 2, 0.05)
LOSS = np.zeros((len(W1), len(W2)))  # LOSS[i, j] is the error at (W1[i], W2[j])
model.eval()
for i, w1 in enumerate(W1):
    model.net._modules['Layer_1'].weight.data = torch.tensor([[w1]], dtype=torch.float32)
    for j, w2 in enumerate(W2):
        model.net._modules['Layer_2'].weight.data = torch.tensor([[w2]], dtype=torch.float32)

        total_loss = 0
        with torch.no_grad():
            for xb, yb in test_loader:
                preds = model(xb.reshape(-1, 1))
                # Reshape the targets to match the (batch, 1) shape of preds
                total_loss += loss(preds, yb.reshape(-1, 1)).item()

        LOSS[i, j] = total_loss / len(test_loader)

It may take some time, because every weight combination requires a full pass over the test set. If you make the grid too coarse (i.e., the step between neighboring weight values is too large), you might miss local minima and maxima. Remember how the learning rate is often scheduled to decrease over time? When we do this, the absolute change in weight values can be as small as 1e-3 or less. A grid with a 0.5 step simply won’t capture these fine details of the error surface!
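Since the network is simply y_hat = w2 * tanh(w1 * x), the same kind of surface can also be evaluated with NumPy directly, which makes a finer grid affordable. The sketch below is not the PyTorch loop: it assumes the loss is the mean squared error (the `loss` object is never defined in the snippets shown here) and averages over all samples at once rather than per batch. `mse_surface` is a hypothetical helper name, and the data are stand-ins.

```python
import numpy as np


def mse_surface(x, y, w1_grid, w2_grid):
    """MSE of y_hat = w2 * tanh(w1 * x) for every (w1, w2) pair.

    Returns an array of shape (len(w1_grid), len(w2_grid)).
    """
    hidden = np.tanh(np.outer(w1_grid, x))   # shape (n1, n_samples)
    a = (hidden ** 2).mean(axis=1)           # mean(h^2) for each w1
    b = (hidden * y).mean(axis=1)            # mean(y * h) for each w1
    c = (y ** 2).mean()                      # mean(y^2), independent of the weights
    # mean((y - w2*h)^2) = c - 2*w2*b + w2^2*a
    return c - 2 * np.outer(b, w2_grid) + np.outer(a, w2_grid ** 2)


# Stand-in data; any 1D arrays of equal length work.
rng = np.random.default_rng(0)
x = rng.normal(1, 4.5, 500)
y = 7.3 * np.sin(x)

grid = np.arange(-2, 2, 0.05)
surface = mse_surface(x, y, grid, grid)
print(surface.shape)
```

Expanding the square means the sample dimension is reduced once per w1 value, so memory stays proportional to the grid size rather than to grid size times sample count.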

At this point, we don’t care at all about the quality of the trained model. However, we do want to pay attention to the learning rate, so let’s keep it between 1e-2 and 1e-1. Starting from w1 = w2 = -1, we’ll simply record the weight values and the test error before training and after every epoch, and store them in separate lists:

from torch import optim
from tqdm import tqdm

model = ANN(1, 1, 1)
epochs = 25
lr = 1e-2
optimizer = optim.SGD(model.parameters(), lr=lr)

# Fixed starting point on the surface: w1 = w2 = -1
model.net._modules['Layer_1'].weight.data = torch.tensor([[-1]], dtype=torch.float32)
model.net._modules['Layer_2'].weight.data = torch.tensor([[-1]], dtype=torch.float32)

errors, weights_1, weights_2 = [], [], []

# Record the starting point before any training
model.eval()
with torch.no_grad():
    total_loss = 0
    for xb, yb in test_loader:
        preds = model(xb.reshape(-1, 1))
        error = loss(preds, yb.reshape(-1, 1))
        total_loss += error.item()
weights_1.append(model.net._modules['Layer_1'].weight.data.item())
weights_2.append(model.net._modules['Layer_2'].weight.data.item())
errors.append(total_loss / len(test_loader))

for epoch in tqdm(range(epochs)):
    model.train()
    for xb, yb in train_loader:
        pred = model(xb.reshape(-1, 1))
        error = loss(pred, yb.reshape(-1, 1))
        optimizer.zero_grad()
        error.backward()
        optimizer.step()

    # Record the test error and the weights after each epoch
    model.eval()
    with torch.no_grad():
        total_loss = 0
        for xb, yb in test_loader:
            preds = model(xb.reshape(-1, 1))
            error = loss(preds, yb.reshape(-1, 1))
            total_loss += error.item()
    weights_1.append(model.net._modules['Layer_1'].weight.data.item())
    weights_2.append(model.net._modules['Layer_2'].weight.data.item())
    errors.append(total_loss / len(test_loader))
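To see what `optimizer.step()` does to the two weights, the update can be written out by hand. This is a minimal sketch rather than the training loop above: it assumes an MSE loss, uses full-batch gradient descent instead of mini-batch SGD, runs on stand-in data, and the helper names `loss_and_grads` and `descend` are made up for illustration.

```python
import numpy as np


def loss_and_grads(x, y, w1, w2):
    """MSE of y_hat = w2 * tanh(w1 * x) and its gradients w.r.t. w1 and w2."""
    h = np.tanh(w1 * x)
    residual = w2 * h - y
    loss = np.mean(residual ** 2)
    grad_w2 = np.mean(2 * residual * h)
    # Chain rule through tanh: d tanh(u)/du = 1 - tanh(u)^2
    grad_w1 = np.mean(2 * residual * w2 * (1 - h ** 2) * x)
    return loss, grad_w1, grad_w2


def descend(x, y, w1=-1.0, w2=-1.0, lr=1e-2, steps=25):
    """Full-batch gradient descent; records (w1, w2, loss) before each step."""
    path = []
    for _ in range(steps):
        loss, g1, g2 = loss_and_grads(x, y, w1, w2)
        path.append((w1, w2, loss))
        w1, w2 = w1 - lr * g1, w2 - lr * g2
    return path


# Stand-in data for the sketch
x = np.linspace(-3, 3, 200)
y = 7.3 * np.sin(x)
path = descend(x, y)
print(path[0])
print(path[-1])
```

Each tuple in `path` is one point of the trajectory, in the same (w1, w2, loss) form that the lists `weights_1`, `weights_2` and `errors` hold.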

Finally, we can visualize the data we have collected using plotly. The plot will have two traces: the error surface and the SGD trajectory. One way to do the first part is to create a figure with a plotly Surface trace and then style it a little by updating the layout.

The second part is just as simple: use the Scatter3d trace and specify all three axes.

import plotly.graph_objects as go
import plotly.io as pio

plotly_template = pio.templates["plotly_dark"]

# Surface reads z[i][j] as the height at (x[j], y[i]); LOSS is indexed [w1, w2],
# so transpose it to put w1 on the x-axis and w2 on the y-axis.
fig = go.Figure(data=[go.Surface(z=LOSS.T, x=W1, y=W2)])

fig.update_layout(
    title='Loss Surface',
    template=plotly_template,
    scene=dict(
        aspectmode='manual',
        aspectratio=dict(x=1, y=1, z=0.5),
        xaxis=dict(title='w1', showgrid=False),
        yaxis=dict(title='w2', showgrid=False),
        zaxis=dict(title='Loss', showgrid=False),
    ),
    width=800,
    height=800,
)

# SGD trajectory on the same axes: w1 on x, w2 on y
fig.add_trace(go.Scatter3d(x=weights_1, y=weights_2, z=errors,
                           mode='lines+markers',
                           line=dict(color='red', width=2),
                           marker=dict(size=4, color='yellow')))
fig.show()
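Plotly’s Surface trace treats the rows of z as y and the columns as x, i.e. z[i][j] is drawn at (x[j], y[i]). A grid filled as LOSS[i, j] with i running over w1 therefore has to be transposed if w1 is meant to sit on the x-axis. The sketch below checks the orientation with NumPy only; `f` is a hypothetical stand-in for the loss, chosen to be asymmetric so that a swapped axis would be visible, and the small grids are for illustration.

```python
import numpy as np


def f(w1, w2):
    """Hypothetical stand-in for the loss; deliberately not symmetric in w1 and w2."""
    return w1 ** 2 + 3 * w2


W1 = np.arange(-2, 2, 0.5)
W2 = np.arange(-1, 1, 0.5)

# Filled like the nested loops: first index runs over w1, second over w2.
LOSS = np.array([[f(w1, w2) for w2 in W2] for w1 in W1])

# np.meshgrid's default 'xy' indexing follows the same rows-are-y convention.
X_grid, Y_grid = np.meshgrid(W1, W2)   # both have shape (len(W2), len(W1))
Z = f(X_grid, Y_grid)                  # Z[i, j] = f(W1[j], W2[i])

print(LOSS.shape, Z.shape)
print(np.allclose(Z, LOSS.T))
```

The final check confirms that the rows-are-y layout is the transpose of the loop-filled grid.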

Running it in Google Colab or locally in a Jupyter Notebook will allow you to investigate the error surface more closely. Honestly, I spent a bunch of time just looking at this figure :)

I’d love to see your surfaces, so please feel free to share them in the comments. I strongly believe that the more imperfect a surface is, the more interesting it is to investigate!

===========================================

All my publications on Medium are free and open-access, so I’d really appreciate it if you followed me here!

P.S. I’m extremely passionate about (Geo)Data Science, ML/AI and Climate Change. If you want to work together on a project, please contact me on LinkedIn and check out my website!

What’s Inside a Neural Network?. Plotting surface of error in 3D using… | by Aleksei Rozanov | Sep, 2024

Posted by AI Global Tech
