torchkit.layers

torchkit.layers.conv2d(*args, **kwargs)[source]

2D convolution with "same" padding, i.e. the output has the same spatial shape as the input.

Parameters

in_planes – The number of input feature maps.
out_planes – The number of output feature maps.
kernel_size – The filter size.
stride – The filter stride.
dilation – The filter dilation factor.
bias – Whether to add a bias.

Return type

Conv2d
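The padding arithmetic behind a "same" convolution can be sketched as below. This is a minimal illustration and not necessarily how torchkit computes it: the helper names `same_padding` and `conv_output_size` are hypothetical, and the sketch assumes symmetric zero padding, which requires an odd effective kernel size. The spatial size is preserved exactly only at stride 1.

```python
def same_padding(kernel_size: int, dilation: int = 1) -> int:
    """Per-side zero padding that preserves the size at stride 1 (hypothetical helper)."""
    effective = dilation * (kernel_size - 1) + 1  # span covered by the dilated kernel
    if effective % 2 == 0:
        raise ValueError("symmetric 'same' padding needs an odd effective kernel size")
    return (effective - 1) // 2


def conv_output_size(size: int, kernel_size: int, stride: int = 1,
                     dilation: int = 1, padding: int = 0) -> int:
    """Standard output-size formula for a convolution along one axis."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# A 3-wide kernel on a 28-pixel axis keeps the size at 28 with one pixel of padding.
pad = same_padding(kernel_size=3)
out = conv_output_size(28, kernel_size=3, padding=pad)
```

With a stride greater than 1 the same formula yields a smaller output, so "same" here is best read as a statement about the stride-1 case.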

torchkit.layers.conv3d(*args, **kwargs)[source]

3D convolution with "same" padding, i.e. the output has the same spatial shape as the input.

Parameters

in_planes – The number of input feature maps.
out_planes – The number of output feature maps.
kernel_size – The filter size.
stride – The filter stride.
dilation – The filter dilation factor.
bias – Whether to add a bias.

Return type

Conv3d

class torchkit.layers.Flatten[source]

Flattens convolutional feature maps for fully-connected layers.

This is a convenience module meant to be plugged into a torch.nn.Sequential model.

Example usage:

import torch.nn as nn
from torchkit import layers

# Assume an input of shape (3, 28, 28).
net = nn.Sequential(
    layers.conv2d(3, 8, kernel_size=3),
    nn.ReLU(),
    layers.conv2d(8, 16, kernel_size=3),
    nn.ReLU(),
    layers.Flatten(),
    nn.Linear(28*28*16, 256),
    nn.ReLU(),
    nn.Linear(256, 2),
)
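The `28*28*16` passed to the first `nn.Linear` above is the number of values left once the feature maps are flattened: the convolutions keep the 28 x 28 spatial size, and the last one produces 16 feature maps. A small sketch of that arithmetic, where `flattened_size` is a hypothetical helper and not part of torchkit:

```python
from math import prod


def flattened_size(channels: int, *spatial: int) -> int:
    """Number of features per sample after flattening (channels x spatial dims)."""
    return channels * prod(spatial)


# 16 feature maps of 28 x 28 pixels each.
in_features = flattened_size(16, 28, 28)
```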

__init__()[source]: Initializes internal Module state, shared by both nn.Module and ScriptModule.

training: bool

class torchkit.layers.SpatialSoftArgmax(normalize=False)[source]

Spatial softmax as defined in [1].

Concretely, the spatial softmax of each feature map is used to compute a weighted mean of the pixel locations, effectively performing a soft arg-max over the spatial locations of each feature map.

Parameters: normalize (bool) – Whether to use normalized image coordinates, i.e. coordinates in the range [-1, 1].

__init__(normalize=False)[source]

Constructor.

Parameters: normalize (bool) – Whether to use normalized image coordinates, i.e. coordinates in the range [-1, 1].
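To make the operation concrete, here is a pure-Python sketch of a soft arg-max over a single feature map. It is an illustration under stated assumptions, not torchkit's implementation: the function name, the `(x, y)` return order, and the linear mapping of pixel indices onto [-1, 1] are all choices made for this example.

```python
import math
from typing import List, Tuple


def spatial_soft_argmax(fmap: List[List[float]],
                        normalize: bool = False) -> Tuple[float, float]:
    """Return the softmax-weighted mean (x, y) location of one 2D feature map."""
    h, w = len(fmap), len(fmap[0])
    peak = max(max(row) for row in fmap)  # subtract the max for numerical stability
    weights = [[math.exp(v - peak) for v in row] for row in fmap]
    total = sum(sum(row) for row in weights)

    def coord(i: int, n: int) -> float:
        # Raw pixel index, or the same index mapped linearly onto [-1, 1].
        if not normalize:
            return float(i)
        return 2.0 * i / (n - 1) - 1.0 if n > 1 else 0.0

    x = sum(wt * coord(c, w) for row in weights for c, wt in enumerate(row)) / total
    y = sum(wt * coord(r, h) for r, row in enumerate(weights) for wt in row) / total
    return x, y


# A single strong activation at row 1, column 2 of a 3 x 4 map.
fmap = [[0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 50.0, 0.0],
        [0.0, 0.0, 0.0, 0.0]]
x, y = spatial_soft_argmax(fmap)
```

When one activation dominates, the result approaches that activation's pixel location. A flat feature map yields the center of the image.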

training: bool

class torchkit.layers.GlobalMaxPool1d[source]

Global max pooling operation for temporal or 1D data.

__init__()[source]: Initializes internal Module state, shared by both nn.Module and ScriptModule.

training: bool

class torchkit.layers.GlobalMaxPool2d[source]

Global max pooling operation for spatial or 2D data.

__init__()[source]: Initializes internal Module state, shared by both nn.Module and ScriptModule.

training: bool

class torchkit.layers.GlobalMaxPool3d[source]

Global max pooling operation for 3D data.

__init__()[source]: Initializes internal Module state, shared by both nn.Module and ScriptModule.

training: bool

class torchkit.layers.GlobalAvgPool1d[source]

Global average pooling operation for temporal or 1D data.

__init__()[source]: Initializes internal Module state, shared by both nn.Module and ScriptModule.

training: bool

class torchkit.layers.GlobalAvgPool2d[source]

Global average pooling operation for spatial or 2D data.

__init__()[source]: Initializes internal Module state, shared by both nn.Module and ScriptModule.

training: bool

class torchkit.layers.GlobalAvgPool3d[source]

Global average pooling operation for 3D data.

__init__()[source]: Initializes internal Module state, shared by both nn.Module and ScriptModule.

training: bool
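The global pooling modules above each reduce every channel to a single value, taken over all of that channel's positions. A pure-Python sketch of the 2D case follows. The function names are hypothetical, and the sketch returns a flat list per sample, which says nothing about whether the torchkit modules keep singleton spatial dimensions in their output.

```python
from typing import List

FeatureMap = List[List[float]]  # one channel, stored as rows of values


def global_max_pool2d(channels: List[FeatureMap]) -> List[float]:
    """Largest value of each channel, taken over every spatial position."""
    return [max(v for row in fmap for v in row) for fmap in channels]


def global_avg_pool2d(channels: List[FeatureMap]) -> List[float]:
    """Mean value of each channel, taken over every spatial position."""
    return [
        sum(v for row in fmap for v in row) / sum(len(row) for row in fmap)
        for fmap in channels
    ]


# Two channels of size 2 x 2.
sample = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
maxima = global_max_pool2d(sample)   # [4.0, 8.0]
means = global_avg_pool2d(sample)    # [2.5, 2.0]
```

The 1D and 3D variants follow the same pattern with one fewer or one more level of nesting.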

class torchkit.layers.CausalConv1d(in_channels, out_channels, kernel_size, stride=1, dilation=1, bias=True)[source]

A causal (a.k.a. masked) 1D convolution, in which the output at each time step depends only on the current and earlier inputs.

Parameters

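in_channels (int) – The number of input channels.
out_channels (int) – The number of output channels.
kernel_size (int) – The filter size.
stride (int) – The filter stride.
dilation (int) – The filter dilation factor.
bias (bool) – Whether to add the bias term or not.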

bias: Optional[torch.Tensor]

out_channels: int

kernel_size: Tuple[int, ...]

stride: Tuple[int, ...]

padding: Union[str, Tuple[int, ...]]

dilation: Tuple[int, ...]

transposed: bool

output_padding: Tuple[int, ...]

groups: int

padding_mode: str

weight: torch.Tensor

__init__(in_channels, out_channels, kernel_size, stride=1, dilation=1, bias=True)[source]

Constructor.

Parameters

in_channels (int) – The number of input channels.
out_channels (int) – The number of output channels.
kernel_size (int) – The filter size.
stride (int) – The filter stride.
dilation (int) – The filter dilation factor.
bias (bool) – Whether to add the bias term or not.
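One common way to make a 1D convolution causal is to pad only the left side of the sequence with `(kernel_size - 1) * dilation` zeros, so that no output can see future inputs. The single-channel, stride-1 sketch below illustrates that idea in pure Python. It is an assumption about the general technique, not a description of how `CausalConv1d` is implemented, and the function name is hypothetical.

```python
from typing import List


def causal_conv1d(x: List[float], weight: List[float],
                  dilation: int = 1) -> List[float]:
    """Single-channel causal convolution (cross-correlation) with stride 1."""
    k = len(weight)
    pad = (k - 1) * dilation          # zeros are added on the left only
    padded = [0.0] * pad + list(x)
    # Output t combines x[t], x[t - dilation], ..., x[t - (k - 1) * dilation].
    return [
        sum(weight[j] * padded[t + j * dilation] for j in range(k))
        for t in range(len(x))
    ]


signal = [1.0, 2.0, 3.0, 4.0]
out = causal_conv1d(signal, weight=[1.0, 1.0])                      # [1.0, 3.0, 5.0, 7.0]
out_dilated = causal_conv1d(signal, weight=[1.0, 1.0], dilation=2)  # [1.0, 2.0, 4.0, 6.0]
```

The output has the same length as the input, and changing a later input value leaves all earlier outputs unchanged, which is the defining property of a causal convolution.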