Core Modules#

The core module provides the base classes for building forecasting models. These classes form the foundation of the DeepTimeSeries framework and define the interface for all forecasting modules.

All forecasting models in DeepTimeSeries inherit from ForecastingModule, which extends PyTorch Lightning’s LightningModule. This provides automatic training, validation, and testing loops, metric tracking, and loss calculation.

ForecastingModule#

The base class for all forecasting models. It provides automatic training/validation/test step implementations, metric tracking, and loss calculation.

Key Features:

  • PyTorch Lightning Integration: Inherits from pl.LightningModule, providing automatic training loops

  • Automatic Loss Calculation: Aggregates losses from all heads with their respective weights

  • Metric Tracking: Automatically updates and computes metrics for each head during training, validation, and testing

  • Encoding-Decoding Architecture: Standardizes the encode-decode pattern for time series forecasting

  • Multi-head Support: Can handle multiple output heads with different loss weights

Architecture:

The module follows an encoder-decoder pattern:

  1. Encoding: The encode() method processes the input encoding window

  2. Decoding: The decode() method generates predictions (can differ between training and evaluation)

  3. Forward: Automatically combines encoding and decoding

Required Methods to Implement:

  • ``encode(inputs)``: Process the encoding window and return encoder outputs. This method receives a dictionary with keys like 'encoding.targets' and 'encoding.nontargets' containing tensors of shape (batch_size, encoding_length, n_features). Should return a dictionary of encoder outputs that will be passed to the decoder (see the sketch after this list).

  • ``decode_eval(inputs)``: Generate predictions during evaluation/inference. Receives a dictionary combining the original inputs and encoder outputs. Should return a dictionary with head tags as keys (e.g., 'head.targets') and prediction tensors as values.

  • ``decode_train(inputs)``: (Optional) Generate predictions during training. Defaults to decode_eval() if not overridden. Can be used to implement teacher forcing or other training-specific behaviors.

  • ``make_chunk_specs()``: Generate chunk specifications based on model parameters. Should return a list of BaseChunkSpec instances that define the input/output structure. This is used by TimeSeriesDataset to extract the correct data windows.
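
The sketch below shows what a minimal subclass might look like. The GRU encoder, the attribute names, and the constant-forecast decoder are illustrative assumptions, not part of the library; make_chunk_specs(), head registration, and the encoding/decoding length wiring are omitted for brevity.

from deep_time_series.core import ForecastingModule
import torch.nn as nn

class NaiveGRUForecaster(ForecastingModule):
    """Sketch only: encode with a GRU, decode by repeating a projection
    of the last hidden state over the decoding window."""

    def __init__(self, n_features, hidden_size, n_decoding_steps):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden_size, batch_first=True)
        self.projection = nn.Linear(hidden_size, n_features)
        self.n_decoding_steps = n_decoding_steps

    def encode(self, inputs):
        # inputs['encoding.targets']: (batch_size, encoding_length, n_features).
        _, h = self.rnn(inputs['encoding.targets'])
        # Everything returned here is merged into the decoder's inputs.
        return {'memory': h}

    def decode_eval(self, inputs):
        # inputs holds the original batch plus the encoder outputs.
        y = self.projection(inputs['memory'][-1]).unsqueeze(1)
        # Predictions are keyed by head tag, e.g. 'head.targets'.
        return {'head.targets': y.repeat(1, self.n_decoding_steps, 1)}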

Key Methods:

  • ``forward(inputs)``: Main forward pass that combines encoding and decoding. Automatically merges encoder outputs with inputs for the decoder. Returns the final output dictionary.

  • ``calculate_loss(outputs, batch)``: Computes the total loss by aggregating the losses from all heads, each weighted by its loss_weight (see the sketch after this list). Note that, per the class reference below, the result is returned in a dictionary of loss values rather than as a bare tensor.

  • ``training_step(batch, batch_idx)``: PyTorch Lightning training step. Computes forward pass, loss, and metrics. Automatically called during training.

  • ``validation_step(batch, batch_idx)``: PyTorch Lightning validation step. Computes forward pass and loss, updates metrics. Metrics are computed at epoch end.

  • ``test_step(batch, batch_idx)``: PyTorch Lightning test step. Similar to validation step but for testing.

  • ``forward_metrics(outputs, batch, stage)``: Computes metrics for the current batch. Returns a dictionary of metric values.

  • ``update_metrics(outputs, batch, stage)``: Updates metric states without computing values. Used during validation/test.

  • ``compute_metrics(stage)``: Computes final metric values from accumulated states. Called at epoch end.

  • ``reset_metrics(stage)``: Resets metric states. Called at the start of each epoch.
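
Conceptually, the aggregation performed by calculate_loss() reduces to a weighted sum over heads. The snippet below sketches the idea only; it is not the library's literal implementation.

def weighted_total_loss(heads, outputs, batch):
    # Each head scores its own predictions; loss_weight balances the heads.
    return sum(
        head.loss_weight * head.calculate_loss(outputs, batch)
        for head in heads
    )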

Properties:

  • ``encoding_length`` (int): Length of the input window for encoding. Must be a positive integer.

  • ``decoding_length`` (int): Length of the prediction window. Must be a positive integer.

  • ``head`` (BaseHead): Single output head (convenience property when using a single head). Raises an error if multiple heads are defined.

  • ``heads`` (list[BaseHead]): List of output heads. Supports multi-head models where different heads can predict different targets or use different loss functions (see the example after this list).
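
For instance, a multi-head model might pair a primary head with a down-weighted auxiliary head. This is a hypothetical configuration; the feature sizes are illustrative, and how a model registers its heads is model-specific.

import torch.nn as nn
from deep_time_series.core import Head

heads = [
    Head(tag='targets', output_module=nn.Linear(64, 2),
         loss_fn=nn.MSELoss(), loss_weight=1.0),
    Head(tag='aux', output_module=nn.Linear(64, 1),
         loss_fn=nn.L1Loss(), loss_weight=0.1),
]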

class ForecastingModule[source]#

Bases: LightningModule

Base class of all forecasting modules.

calculate_loss(outputs, batch)[source]#
Parameters:
  • outputs (dict[str, Any]) –

  • batch (dict[str, Any]) –

Return type:

dict[str, Any]

compute_metrics(stage)[source]#
Parameters:

stage (str) –

Return type:

dict[str, Any]

decode(inputs)[source]#
decode_eval(inputs)[source]#
Parameters:

inputs (dict[str, Any]) –

Return type:

dict[str, Any]

decode_train(inputs)[source]#
Parameters:

inputs (dict[str, Any]) –

Return type:

dict[str, Any]

property decoding_length: int#
encode(inputs)[source]#
Parameters:

inputs (dict[str, Any]) –

Return type:

dict[str, Any]

property encoding_length: int#

Encoding length.

forward(inputs)[source]#
Parameters:

inputs (dict[str, Any]) –

Return type:

dict[str, Any]

forward_metrics(outputs, batch, stage)[source]#
Parameters:
  • outputs (dict[str, Any]) –

  • batch (dict[str, Any]) –

  • stage (str) –

Return type:

dict[str, Any]

property head: BaseHead#
property heads: list[deep_time_series.core.BaseHead]#
make_chunk_specs()[source]#
on_test_epoch_end()[source]#
Return type:

None

on_train_epoch_end()[source]#
Return type:

None

on_validation_epoch_end()[source]#
Return type:

None

reset_metrics(stage)[source]#
Parameters:

stage (str) –

Return type:

None

test_step(batch, batch_idx, dataloader_idx=0)[source]#
Parameters:
  • batch (dict[str, Any]) –

  • batch_idx (int) –

  • dataloader_idx (int) –

training_step(batch, batch_idx)[source]#
Parameters:
  • batch (dict[str, Any]) –

  • batch_idx (int) –

Return type:

dict[str, Any]

update_metrics(outputs, batch, stage)[source]#
Parameters:
  • outputs (dict[str, Any]) –

  • batch (dict[str, Any]) –

  • stage (str) –

Return type:

None

validation_step(batch, batch_idx, dataloader_idx=0)[source]#
Parameters:
  • batch (dict[str, Any]) –

  • batch_idx (int) –

  • dataloader_idx (int) –

Head#

A deterministic head: it wraps an arbitrary output module and a loss function to produce point predictions, with optional metric tracking.

class Head(tag, output_module, loss_fn, loss_weight=1.0, metrics=None)[source]#

Bases: BaseHead

Parameters:
  • tag (str) –

  • output_module (Module) –

  • loss_fn (Callable[[Tensor, Tensor], Tensor]) –

  • loss_weight (float) –

  • metrics (Metric | list[torchmetrics.metric.Metric] | dict[str, torchmetrics.metric.Metric]) –

calculate_loss(outputs, batch)[source]#
Parameters:
  • outputs (dict[str, Any]) –

  • batch (dict[str, Any]) –

Return type:

Tensor

forward(inputs)[source]#
Parameters:

inputs (Any) –

Return type:

Tensor

get_outputs()[source]#
reset()[source]#


BaseHead#

Base class for all head modules. Heads are responsible for producing model outputs and calculating losses.

Purpose:

Heads serve as the output layer of forecasting models. They:

  • Transform encoder/decoder outputs into predictions

  • Calculate the loss between predictions and targets

  • Track metrics (e.g., MAE, MSE, RMSE)

  • Support both deterministic (point predictions) and probabilistic (distribution-based) forecasting


Required Methods to Implement:

  • ``forward(inputs)``: Process inputs and produce predictions. This method is called during autoregressive decoding for each time step. Should return a tensor representing the prediction for the current step. The head accumulates these predictions internally.

  • ``get_outputs()``: Return all accumulated outputs as a dictionary. After autoregressive decoding completes, this method concatenates all predictions from forward() calls. Returns a dictionary with the head tag as key and a tensor of shape (batch_size, decoding_length, n_features) as value.

  • ``reset()``: Reset internal state (called at the start of each forward pass). Clears any accumulated predictions or internal state. Must be called before starting a new autoregressive decoding sequence.

  • ``calculate_loss(outputs, batch)``: Calculate loss between outputs and labels. Receives the output dictionary from get_outputs() and the batch dictionary containing labels. Should return a scalar tensor representing the loss.

Key Properties:

  • ``tag`` (str): Unique identifier for the head. Automatically prefixed with 'head.' (e.g., 'head.targets').

  • ``loss_weight`` (float): Weight for this head’s loss in the total loss calculation. Default is 1.0. Used when multiple heads are present to balance their contributions.

  • ``metrics`` (MetricModule | None): Optional metrics to track. If set, metrics are automatically updated during training/validation/test steps.

  • ``label_tag`` (str): Corresponding label tag. Automatically derived from the head tag (e.g., 'head.targets' → 'label.targets').

  • ``has_metrics`` (bool): Whether metrics are configured for this head.

Creating Custom Heads:

To create a custom head, inherit from BaseHead and implement the required methods:

from deep_time_series.core import BaseHead
import torch
import torch.nn as nn

class CustomHead(BaseHead):
    def __init__(self, tag, output_module, loss_fn):
        super().__init__()
        self.tag = tag  # the 'head.' prefix is added automatically
        self.output_module = output_module
        self.loss_fn = loss_fn
        self._outputs = []

    def forward(self, inputs):
        # Called once per decoding step; accumulate the step's prediction.
        output = self.output_module(inputs)
        self._outputs.append(output)
        return output

    def get_outputs(self):
        # Concatenate the accumulated steps along the time dimension.
        return {self.tag: torch.cat(self._outputs, dim=1)}

    def reset(self):
        # Clear accumulated predictions before a new decoding pass.
        self._outputs = []

    def calculate_loss(self, outputs, batch):
        return self.loss_fn(outputs[self.tag], batch[self.label_tag])
class BaseHead[source]#

Bases: Module

Base class of all Head classes.

calculate_loss(outputs, batch)[source]#
Parameters:
  • outputs (dict[str, Any]) –

  • batch (dict[str, Any]) –

Return type:

Tensor

forward(inputs)[source]#
Parameters:

inputs (Any) –

Return type:

Tensor

get_outputs()[source]#
Return type:

dict[str, Any]

property has_metrics#
property label_tag: str#

Tag of target label. If the tag of head is “head.my_tag” then label_tag is “label.my_tag”.

property loss_weight: float#

Loss weight for loss calculations.

property metrics: MetricModule#
reset()[source]#
property tag: str#

Tag for a head. Prefix ‘head.’ is added automatically.

DistributionHead#

Probabilistic head for producing distribution-based predictions.

Purpose:

Use DistributionHead when you want to model uncertainty in predictions. Instead of producing a single value, it produces a probability distribution from which you can sample or compute statistics.

Key Features:

  • Supports any PyTorch distribution (e.g., torch.distributions.Normal, torch.distributions.StudentT)

  • Automatically creates linear layers for distribution parameters

  • Applies appropriate transformations to ensure valid parameter values (e.g., softplus for scale parameters)

  • Uses negative log-likelihood as the loss function

Initialization Parameters:

  • tag (str): Unique identifier for the head (e.g., 'targets'). Will be prefixed with 'head.' automatically.

  • distribution (torch.distributions.Distribution): The distribution class to use (not an instance). Common choices include dist.Normal, dist.StudentT, dist.Gamma.

  • in_features (int): Number of input features (hidden size from the model).

  • out_features (int): Number of output features (number of targets to predict).

  • loss_weight (float): Weight for this head’s loss. Default is 1.0.

  • metrics (Metric | list[Metric] | dict[str, Metric] | None): Optional metrics to track.

How It Works:

  1. For each distribution parameter (e.g., loc, scale for Normal), a linear layer is created

  2. During forward pass, inputs are passed through these linear layers

  3. Transformations are applied to ensure valid parameter values (e.g., transform_to(constraint); see the snippet after this list)

  4. A distribution instance is created with these parameters

  5. A sample is drawn from the distribution as the prediction

  6. Both the sample and the parameters are stored in outputs

  7. Loss is computed as negative log-likelihood: -log_prob(targets)
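
Step 3 uses PyTorch's constraint registry. The standalone snippet below illustrates the mechanism in isolation; it is not the head's internal code.

import torch
import torch.distributions as dist
from torch.distributions import transform_to

raw = torch.randn(4)  # unconstrained output of a linear layer
# Normal's 'scale' must be positive; transform_to maps the reals
# onto the constrained set.
scale = transform_to(dist.Normal.arg_constraints['scale'])(raw)
assert (scale > 0).all()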

Example:

from deep_time_series.core import DistributionHead
import torch.distributions as dist

hidden_size, n_outputs = 64, 2  # example sizes

head = DistributionHead(
    tag='targets',
    distribution=dist.Normal,
    in_features=hidden_size,
    out_features=n_outputs,
    loss_weight=1.0,
)

Supported Distributions:

Any PyTorch distribution can be used. Common choices include:

  • dist.Normal: For normally distributed targets

  • dist.StudentT: For heavy-tailed distributions

  • dist.Gamma: For positive-valued targets

Output Format:

The head produces multiple outputs:

  • head.{tag}: Sampled values from the distribution

  • head.{tag}.{param}: Distribution parameters (e.g., head.targets.loc and head.targets.scale for Normal)
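
For example, assuming the Normal head from the example above has run at least one forward pass, the accumulated outputs could be read like this (key names follow the scheme above):

outputs = head.get_outputs()

samples = outputs['head.targets']      # sampled predictions
loc = outputs['head.targets.loc']      # predicted means
scale = outputs['head.targets.scale']  # predicted standard deviations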

class DistributionHead(tag, distribution, in_features, out_features, loss_weight=1.0, metrics=None)[source]#

Bases: BaseHead

Parameters:
  • tag (str) –

  • distribution (Distribution) –

  • in_features (int) –

  • out_features (int) –

  • loss_weight (float) –

  • metrics (Metric | list[torchmetrics.metric.Metric] | dict[str, torchmetrics.metric.Metric]) –

calculate_loss(outputs, batch)[source]#
Return type:

Tensor

forward(x)[source]#
get_outputs()[source]#
reset()[source]#

MetricModule#

Module for tracking and computing metrics during training, validation, and testing.

Purpose:

MetricModule wraps TorchMetrics to provide stage-aware metric tracking. It automatically creates separate metric instances for training, validation, and testing phases, ensuring metrics are properly isolated and logged.

Key Features:

  • Automatic stage separation (train/val/test) with separate metric instances

  • Prefix management for logging (e.g., train/targets.mae, val/targets.mae)

  • Tag-based organization (automatically links head tags to label tags)

Initialization Parameters:

  • tag (str): The head tag (e.g., 'targets' or 'head.targets'). Used to generate logging prefixes.

  • metrics (Metric | list[Metric] | dict[str, Metric]): The metrics to track. Can be a single metric, list, or dictionary. Examples: MeanAbsoluteError(), [MeanAbsoluteError(), MeanSquaredError()].

Methods (a manual usage sketch follows this list):

  • ``forward(outputs, batch, stage)``: Computes metrics for the current batch. Returns a dictionary of metric values with stage prefix.

  • ``update(outputs, batch, stage)``: Updates metric states without computing values. Used during validation/test to accumulate statistics.

  • ``compute(stage)``: Computes final metric values from accumulated states. Called at epoch end.

  • ``reset(stage)``: Resets metric states for the given stage. Called at the start of each epoch.
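
For illustration, these methods can be driven manually; in practice ForecastingModule makes the calls for you. The tensor shapes and the logged key name below are assumptions based on the tagging scheme described under Metric Tagging.

from deep_time_series.core import MetricModule
from torchmetrics import MeanAbsoluteError
import torch

mm = MetricModule(tag='targets', metrics=MeanAbsoluteError())

outputs = {'head.targets': torch.zeros(8, 12, 2)}  # predictions
batch = {'label.targets': torch.ones(8, 12, 2)}    # ground truth

mm.update(outputs, batch, 'val')  # accumulate batch statistics
print(mm.compute('val'))          # e.g. {'val/targets.mae': tensor(1.)}
mm.reset('val')                   # clear state for the next epoch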

Internal Properties:

  • head_tag (str): The full head tag (e.g., 'head.targets')

  • label_tag (str): The corresponding label tag (e.g., 'label.targets')

Usage:

Typically, you don’t instantiate MetricModule directly. Instead, assign metrics to a head:

from deep_time_series.core import Head
from torchmetrics import MeanAbsoluteError, MeanSquaredError
import torch.nn as nn

hidden_size, n_outputs = 64, 2  # example sizes

head = Head(
    tag='targets',
    output_module=nn.Linear(hidden_size, n_outputs),
    loss_fn=nn.MSELoss(),
    metrics=[MeanAbsoluteError(), MeanSquaredError()],
)

The ForecastingModule automatically calls the metrics during training/validation/test steps.

Metric Tagging:

  • Head tag: head.{tag} (e.g., head.targets)

  • Label tag: label.{tag} (e.g., label.targets)

  • Logged as: {stage}/{tag}.{metric_name} (e.g., train/targets.mae)

class MetricModule(tag, metrics)[source]#

Bases: Module

Parameters:
  • tag (str) –

  • metrics (Metric | list[torchmetrics.metric.Metric] | dict[str, torchmetrics.metric.Metric]) –

compute(stage)[source]#
forward(outputs, batch, stage)[source]#
reset(stage)[source]#
update(outputs, batch, stage)[source]#