Wednesday, March 11, 2026

Weights & Biases: KDnuggets Crash Course


Photo by the author

If you train models outside of a single notebook, you probably suffer from the same headaches: you tweak five knobs, retrain, and by Friday you can’t remember which run produced the “good” ROC curve or which slice of data it used. Weights & Biases (W&B) provides a paper trail – metrics, configs, charts, datasets, and models – so you can answer “what changed?” with evidence, not guesswork.

Below is a hands-on tour: lightweight, practical, and aimed at teams that want a tidy history of experiments without building their own platform. Call it a no-fuss guide.

# Why W&B at all?

Notebooks become experiments. Experiments multiply. Soon you’re asking: Which run used this slice of data? Why is today’s ROC curve higher? Can I recreate last week’s baseline?

W&B gives you one place to:

  • Log metrics, configs, charts, and system stats
  • Version datasets and models with Artifacts
  • Run hyperparameter sweeps
  • Share dashboards without screenshots

You can start small and layer in the rest as you need it.

# Setup in 60 seconds

Start by installing the library and logging in with your API key. If you don’t have one yet, you can find it in your W&B account settings.

pip install wandb
wandb login # paste your API key once

Photo by the author

// Minimal sanity checks

import wandb, random, time

wandb.init(project="kdn-crashcourse", name="hello-run", config={"lr": 0.001, "epochs": 5})
for epoch in range(wandb.config.epochs):
    loss = 1.0 / (epoch + 1) + random.random() * 0.05
    wandb.log({"epoch": epoch, "loss": loss})
    time.sleep(0.1)
wandb.finish()

Now you should see something like this:

Photo by the author

Now let’s get to the useful bits.

# Track experiments properly

// Log hyperparameters and metrics

Treat wandb.config as the single source of truth for your experiment’s knobs. Give your metrics clear names so your charts group automatically.

cfg = dict(arch="resnet18", lr=3e-4, batch=64, seed=42)
run = wandb.init(project="kdn-mlops", config=cfg, tags=["baseline"])

# training loop ...
for step, (x, y) in enumerate(loader):
    # ... compute loss, acc
    wandb.log({"train/loss": loss.item(), "train/acc": acc, "step": step})

# log a final summary
run.summary["best_val_auc"] = best_auc

Some tips:

  • Use namespaces like train/loss or val/auc for automatic chart grouping
  • Add tags, e.g. "lr-finder" or "fp16", so you can filter runs later
  • Use run.summary[...] for one-off results you want on your run card

// Log images, confusion matrices and custom plots

wandb.log({
    "val/confusion": wandb.plot.confusion_matrix(
        preds=preds, y_true=y_true, class_names=classes)
})

You can also save any Matplotlib plot:

import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot(history)
wandb.log({"training/curve": fig})

// Version datasets and models with Artifacts

Artifacts answer questions like: “Which exact files did this run use?” and “What did we train on?” No more final_final_v3.parquet archaeology.

import wandb

run = wandb.init(project="kdn-mlops")

# Create a dataset artifact (run once per version)
raw = wandb.Artifact("imdb_reviews", type="dataset", description="raw dump v1")
raw.add_dir("data/raw") # or add_file("path")
run.log_artifact(raw)

# Later, consume the latest version
artifact = run.use_artifact("imdb_reviews:latest")
data_dir = artifact.download() # folder path pinned to a hash
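Note that :latest moves every time a new version lands. For reproducibility, pin the exact version string and keep it in your config so it is logged with the run. A tiny sketch of that habit (the helper name and version are my own, not W&B API):

```python
def pinned_ref(name: str, version: str) -> str:
    # "v0", "v1", ... are the versions W&B assigns; refuse accidental ":latest" pins
    assert version != "latest", "pin an explicit version, not :latest"
    return f"{name}:{version}"

# usage inside a run, so the version lands in wandb.config:
# run = wandb.init(config={"dataset": pinned_ref("imdb_reviews", "v3")})
# data_dir = run.use_artifact(run.config["dataset"]).download()
```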

Log your model in the same way:

import torch
import wandb

run = wandb.init(project="kdn-mlops")

model_path = "models/resnet18.pt"
torch.save(model.state_dict(), model_path)

model_art = wandb.Artifact("sentiment-resnet18", type="model")
model_art.add_file(model_path)
run.log_artifact(model_art)

Now provenance is obvious: this model came from this data, at this code commit.
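Consuming the model later mirrors the dataset flow. A sketch, assuming the artifact and file names from above:

```python
import os

def load_latest_weights(project: str = "kdn-mlops", artifact: str = "sentiment-resnet18"):
    # lazy imports so this helper can live in shared code without hard dependencies
    import torch
    import wandb
    run = wandb.init(project=project)
    art = run.use_artifact(f"{artifact}:latest")
    weights = os.path.join(art.download(), "resnet18.pt")
    # returns a state_dict, ready for model.load_state_dict(...)
    return torch.load(weights, map_location="cpu")
```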

// Evaluation tables and error analysis

wandb.Table is a lightweight data frame for results, predictions, and slices.

table = wandb.Table(columns=["id", "text", "pred", "true", "prob"])
for r in batch_results:
    table.add_data(r.id, r.text, r.pred, r.true, r.prob)
wandb.log({"eval/preds": table})

Filter the table in the UI to find failure patterns (e.g. short reviews, rare classes).

// Sweep hyperparameters

Define the search space in YAML, launch agents and let W&B coordinate.

# sweep.yaml
program: train.py
method: bayes
metric: {name: val/auc, goal: maximize}
parameters:
  lr: {min: 0.00001, max: 0.01}
  batch: {values: [32, 64, 128]}
  dropout: {min: 0.0, max: 0.5}

Start the sweep:

wandb sweep sweep.yaml # returns a SWEEP_ID
wandb agent <SWEEP_ID> # run 1+ agents

Your training script should read lr, batch, etc. from wandb.config. The sweep panel shows the best runs, parallel coordinates, and the winning configuration.
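A sweep-ready train.py can be as small as the sketch below. The objective is a stand-in for real training, and offline mode is set only so you can smoke-test it without an API key:

```python
import os

os.environ.setdefault("WANDB_MODE", "offline")  # assumption: lets you dry-run without logging in

defaults = dict(lr=3e-4, batch=64, dropout=0.1)

def objective(lr, batch, dropout):
    # stand-in for real training; returns a fake validation AUC
    return 1.0 - 1.0 / (1.0 + lr * batch) - dropout * 0.01

def main():
    import wandb
    run = wandb.init(project="kdn-mlops", config=defaults)
    cfg = run.config  # each agent trial overrides these defaults
    run.log({"val/auc": objective(cfg.lr, cfg.batch, cfg.dropout)})
    run.finish()

# in train.py, finish with the usual guard so `wandb agent` can launch it:
# if __name__ == "__main__":
#     main()
```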

# Drop-in integrations

Pick the one you use and move on.

// PyTorch Lightning

import pytorch_lightning as pl
from pytorch_lightning.loggers import WandbLogger

logger = WandbLogger(project="kdn-mlops")
trainer = pl.Trainer(logger=logger, max_epochs=10)

// Keras

import wandb
from wandb.keras import WandbCallback

wandb.init(project="kdn-mlops", config={"epochs": 10})
model.fit(X, y, epochs=wandb.config.epochs, callbacks=[WandbCallback()])

// Scikit-learn

from sklearn.metrics import roc_auc_score
wandb.init(project="kdn-mlops", config={"C": 1.0})
# ... fit model
wandb.log({"val/auc": roc_auc_score(y_true, y_prob)})

# Model registration and staging

Think of the registry as a named shelf for your best models. You push an artifact once, then manage aliases like staging or production so downstream code can pull the right one without guessing file paths.

run = wandb.init(project="kdn-mlops")
run.link_model(
    path="models/resnet18.pt",
    registered_model_name="sentiment-classifier",
    aliases=["staging"],
)

Promote a new build by moving the alias. Consumers always read sentiment-classifier:production.

# Reproducibility checklist

  • Config: Store every hyperparameter in wandb.config
  • Code and commit: Use wandb.init(settings=wandb.Settings(code_dir=".")) to snapshot your code, or rely on CI to record the git SHA
  • Environment: Log requirements.txt or a Docker tag and attach it as an artifact
  • Seeds: Set them and log them

A minimal seed helper:

def set_seeds(s=42):
    import random, numpy as np, torch
    random.seed(s)
    np.random.seed(s)
    torch.manual_seed(s)
    torch.cuda.manual_seed_all(s)

# Collaborate and share without screenshots

Add notes and tags so teammates can search your runs. Use Reports to combine charts, tables, and commentary into a link you can drop into Slack or a PR. Stakeholders can follow along without opening a notebook.

# CI and automation tips

  • Run wandb agent on training nodes to execute sweeps from CI
  • Log a dataset artifact after each ETL job; training jobs then depend on that exact version
  • Promote model aliases (staging → production) as a small final step after evaluation
  • Pass WANDB_API_KEY as a secret and group related runs with WANDB_RUN_GROUP

# Privacy and reliability tips

  • Use private projects for your team by default
  • Use offline mode for air-gapped runs, then wandb sync later:
export WANDB_MODE=offline
  • Don’t log raw personal data. If needed, hash IDs before logging.
  • For large files, store them as artifacts rather than attaching them to wandb.log.
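The hashing advice can be one small helper. A sketch using a salted SHA-256; the salt value here is a placeholder, in practice load it from a secret store:

```python
import hashlib

SALT = "replace-with-a-secret-salt"  # assumption: injected from a secret store, never committed

def pseudonymize(user_id: str) -> str:
    # deterministic, salted hash: stable across runs, safe to send to W&B
    return hashlib.sha256((SALT + user_id).encode("utf-8")).hexdigest()[:16]

# usage: log pseudonymize(row.id) in your wandb.Table instead of row.id
```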

# Common Problems (and Quick Fixes)

  • “My run didn’t log anything.” The script may have crashed before wandb.finish() was called. Also check that WANDB_DISABLED=true isn’t set in your environment.
  • Logging feels slow. Log scalars every step, but save large assets like images or tables at the end of each epoch. You can also pass commit=False to wandb.log() to batch several calls into one step.
  • Seeing duplicate runs in the UI? If you restart from a checkpoint, set id and resume="allow" in wandb.init() to continue the same run.
  • Mysterious data drift? Put each dataset snapshot into an artifact and pin your runs to explicit versions.
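For the resume tip above, a common pattern is deriving a stable run id from the experiment name, so a restarted job lands in the same run instead of a duplicate. A sketch (the project name and helper are my own):

```python
import hashlib

def stable_run_id(experiment: str) -> str:
    # same experiment name -> same id -> W&B resumes rather than duplicating
    return hashlib.sha1(experiment.encode("utf-8")).hexdigest()[:8]

def init_resumable(experiment: str):
    import wandb  # lazy import: the id helper works even without wandb installed
    return wandb.init(project="kdn-mlops", id=stable_run_id(experiment), resume="allow")
```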

# Pocket cheat sheet

// 1. Start a run

wandb.init(project="proj", config=cfg, tags=["baseline"])

// 2. Log metrics, images or tables

wandb.log({"train/loss": loss, "img": [wandb.Image(img)]})

// 3. Version the dataset or model

art = wandb.Artifact("name", type="dataset")
art.add_dir("path")
run.log_artifact(art)

// 4. Consume the artifact

path = run.use_artifact("name:latest").download()

// 5. Run a sweep

wandb sweep sweep.yaml && wandb agent <SWEEP_ID>

# Summary

Start small: initialize a run, log a few metrics, and push a model file as an artifact. When that feels natural, add a sweep and a short report. You’ll get repeatable experiments, traceable data and models, and a dashboard that explains your work without a slide deck.

Josep Ferrer is an analytics engineer from Barcelona. He graduated in physics engineering and currently works in data science applied to human mobility. He is a part-time content creator focused on data science and technology. Josep writes on all things AI, covering the application of the ongoing explosion in the field.
