.. _features:
.. role:: raw-html(raw)
:format: html
Transformations
===============
Audio signal transformations in Sigment are represented by classes that can be used to apply
a specific type of transformation to the audio data.
Some example transformation classes include ``GaussianWhiteNoise`` and ``TimeStretch``. A full
list of available transformations and their details and parameters can be found :ref:`below`.
Each of these transformation classes are a subclass of the more generic ``Transform`` base class,
which provides a basic interface that can also be used to `write custom transformations `_.
Sigment offers a familiar interface for transformations, taking inspiration from popular augmentation libraries
such as `imgaug `_, `nlpaug `_,
`albumentations `_ and `audiomentations `_.
:raw-html:`
`
.. contents:: Section contents
:local:
.. _transformations:
Available transformations
-------------------------
Identity
^^^^^^^^
.. autoclass:: sigment.transforms.Identity
:members:
Additive Gaussian White Noise
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. autoclass:: sigment.transforms.GaussianWhiteNoise
:members:
Time Stretch
^^^^^^^^^^^^
.. autoclass:: sigment.transforms.TimeStretch
:members:
Pitch Shift
^^^^^^^^^^^
.. autoclass:: sigment.transforms.PitchShift
:members:
Edge Crop
^^^^^^^^^
.. autoclass:: sigment.transforms.EdgeCrop
:members:
Random Crop
^^^^^^^^^^^
.. autoclass:: sigment.transforms.RandomCrop
:members:
Linear Fade In/Out
^^^^^^^^^^^^^^^^^^
.. autoclass:: sigment.transforms.LinearFade
:members:
Normalize
^^^^^^^^^
.. autoclass:: sigment.transforms.Normalize
:members:
Pre-Emphasize
^^^^^^^^^^^^^
.. autoclass:: sigment.transforms.PreEmphasize
:members:
Extract Loudest Section
^^^^^^^^^^^^^^^^^^^^^^^
.. autoclass:: sigment.transforms.ExtractLoudestSection
:members:
Median Filter
^^^^^^^^^^^^^
.. autoclass:: sigment.transforms.MedianFilter
:members:
Using transformations
---------------------
Each transformation class comes with a number of methods that can be used to apply the transformation to either a ``numpy.ndarray`` or WAV file.
All transformations accept the `p` and `random_state` parameters, inherited from the ``Transform`` base class described below.
.. autoclass:: sigment.transforms.Transform
:members:
:special-members: __call__
:inherited-members:
Quantifiers
===========
Quantifiers are used to specify rules for how a sequence of transformations
or quantifiers should be applied.
Each quantifier class is a subclass of the more generic ``Quantifier`` base class,
which provides a basic interface that can also be used to write custom quantifiers,
though there is rarely a need for this.
As with transformations, Sigment offers a familiar interface for quantifiers, taking inspiration from popular augmentation libraries
such as `imgaug `_ and `nlpaug `_.
:raw-html:`
`
.. contents:: Section contents
:local:
.. _Quantifiers:
Available quantifiers
---------------------
In the below table, the `steps` argument is of type ``List[Transform, Quantifier]``, specifying a sequence of transformations or quantifiers to be applied.
+-------------------------------------------------+----------------------------------------------------+
| Quantifier | Summary |
+=================================================+====================================================+
| | :raw-html:`Quantifier (Base)
` | A base class representing a single quantifier. |
| | ``Quantifier(steps, **kwargs)`` | |
| +----------------------------------------------------+
| | | **Main parameters** |
| | | • None |
| +----------------------------------------------------+
| | **Notes**: As this is a base class, |
| | it should **not** be initialized. |
+-------------------------------------------------+----------------------------------------------------+
| |
+-------------------------------------------------+----------------------------------------------------+
| | :raw-html:`Pipeline
` | Sequentially executes each transformation or |
| | ``Pipeline(steps, **kwargs)`` | quantifier step. |
| +----------------------------------------------------+
| | | **Main parameters** |
| | | • None |
| +----------------------------------------------------+
| | **Notes**: None |
+-------------------------------------------------+----------------------------------------------------+
| |
+-------------------------------------------------+----------------------------------------------------+
| | :raw-html:`Sometimes
` | Probabilistically applies the provided |
| | ``Sometimes(steps, p, **kwargs)`` | transformation or quantifier steps. |
| +----------------------------------------------------+
| | | **Main parameters** |
| | | • `p`: ``0 <= float <= 1`` |
| | | The probability of executing the |
| | transformation or quantifier steps. |
| +----------------------------------------------------+
| | **Notes**: None |
+-------------------------------------------------+----------------------------------------------------+
| |
+-------------------------------------------------+----------------------------------------------------+
| | :raw-html:`Some
` | Randomly applies a number of the provided |
| | ``SomeOf(steps, n, **kwargs)`` | transformation or quantifier steps. |
| +----------------------------------------------------+
| | | **Main parameters** |
| | | • `n`: ``tuple`` or ``int > 0`` |
| | | The number of transformation or quantifier |
| | steps to apply. |
| +----------------------------------------------------+
| | **Notes**: The chosen steps will still be applied |
| | in the same order they were defined by default. |
+-------------------------------------------------+----------------------------------------------------+
| |
+-------------------------------------------------+----------------------------------------------------+
| | :raw-html:`One
` | Randomly applies a single step from the provided |
| | ``OneOf(steps, **kwargs)`` | transformation or quantifier steps. |
| +----------------------------------------------------+
| | | **Main parameters** |
| | | • None |
| +----------------------------------------------------+
| | **Notes**: This is a special case of the |
| | ``SomeOf`` quantifier, with :math:`n=1`. |
+-------------------------------------------------+----------------------------------------------------+
Using quantifiers
-----------------
.. py:class:: sigment.quantifiers.Quantifier(steps, [main params], random_order=False, random_state=None)
Base class representing a single quantifier.
.. note::
As ``Quantifier`` is a base class, it should **not** be directly instantiated – use one of the quantifier classes listed :ref:`above`.
:param steps: A collection of transformation or quantifier steps to apply.
:type steps: ``List[Transform, Quantifier]``
:param random_order: Whether or not to randomize the order of execution of `steps`.
:type random_order: ``bool``
:param random_state: A random state object or seed for reproducible randomness.
:type random_state: ``numpy.RandomState``, ``int`` or ``None``
.. py:function:: __call__(self, X, sr=None)
Runs the quantifier steps on a provided input signal.
:param X: The input signal to transform.
:type X: ``numpy.ndarray`` :math:`(T,)` or :math:`(1\times T)` for mono, :math:`(2\times T)` for stereo
:param sr: Sample rate. :raw-html:`
` If the steps of the quantifier do not depend on a sample rate, this should be ``None`` (which is the default). See the :ref:`transformations table` to determine whether you need a sample rate or not.
:type sr: ``int`` :math:`> 0` or ``None``
:return: The transformed signal.
:rtype: ``numpy.ndarray`` :math:`(T,)` for mono, :math:`(2\times T)` for stereo
**Example**:
.. code-block:: python
:linenos:
import numpy as np
from sigment.quantifiers import SomeOf
from sigment.transforms import GaussianWhiteNoise, PitchShift, EdgeCrop
# Create an example stereo signal.
X = np.array([
[0.325, 0.53 , 1.393, 1.211],
[1.21 , 0.834, 1.022, 0.38 ]
])
# Use the SomeOf quantifier to run only 1 to 2 of the transformations.
transform = SomeOf([
GaussianWhiteNoise(scale=(0.05, 0.15)),
PitchShift(n_steps=(-1., 1.)),
EdgeCrop(side='start', crop_size=(0.02, 0.05))
], n=(1, 2))
# Run the __call__ method on the quantifier object to transform X.
# NOTE: Pitch shifting requires a sample rate when called, therefore
# we must call the quantifier with a specified sample rate parameter.
X_transform = transform(X, sr=10)
.. py:function:: generate(self, X, n, sr=None)
Runs the quantifier steps on a provided input signal, producing multiple augmented copies of the input signal.
:param X: The input signal to transform.
:type X: ``numpy.ndarray`` :math:`(T,)` or :math:`(1\times T)` for mono, :math:`(2\times T)` for stereo
:param n: Number of augmented versions of `X` to generate.
:type n: ``int`` :math:`> 0`
:param sr: Sample rate. :raw-html:`
` If the steps of the quantifier do not depend on a sample rate, this should be ``None`` (which is the default). See the :ref:`transformations table` to determine whether you need a sample rate or not.
:type sr: ``int`` :math:`> 0` or ``None``
:return: The augmented versions (or version if `n=1`) of the signal `X`.
:rtype: ``List[numpy.ndarray]`` or ``numpy.ndarray``
**Example**:
.. code-block:: python
:linenos:
import numpy as np
from sigment.quantifiers import Sometimes, OneOf
from sigment.transforms import LinearFade, GaussianWhiteNoise
# Create an example stereo signal.
X = np.array([
[0.325, 0.53 , 1.393, 1.211],
[1.21 , 0.834, 1.022, 0.38 ]
])
# Use the Sometimes and OneOf quantifiers to sometimes (with probability 0.65)
# apply a Gaussian white noise transformation and either a fade-in or fade-out.
transform = Sometimes([
GaussianWhiteNoise(scale=(0.05, 0.15)),
OneOf([
LinearFade(direction='in', fade_size=(0.05, 0.1)),
LinearFade(direction='out', fade_size=(0.05, 0.1))
])
], p=0.65)
# Generate 5 augmented versions of X, using the quantifier object.
Xs_transform = transform.generate(X, n=5)
.. py:function:: apply_to_wav(self, source, out=None)
Runs the quantifier steps on a provided input WAV file and writes the resulting signal back to a WAV file.
.. warning:: If `out` is set to ``None`` (which is the default) or the same as `source`, the input WAV file **will** be overwritten!
:param source: Path to the input WAV file.
:type source: ``str``, ``Path`` or *path-like*
:param out: Output WAV path for the augmented signal.
:type out: ``str``, ``Path`` or *path-like*
**Example**:
.. code-block:: python
:linenos:
from librosa import load
from sigment import *
# Load the stereo WAV audio file
X, sr = load('audio.wav', mono=False)
# Create a complex augmentation pipeline
transform = Pipeline([
GaussianWhiteNoise(scale=(0.001, 0.0075), p=0.65),
ExtractLoudestSection(duration=(0.85, 0.95)),
OneOf([
RandomCrop(crop_size=(0.01, 0.04), n_crops=(2, 5)),
SomeOf([
EdgeCrop('start', crop_size=(0.05, 0.1)),
EdgeCrop('end', crop_size=(0.05, 0.1))
], n=(1, 2))
]),
Sometimes([
SomeOf([
LinearFade('in', fade_size=(0.1, 0.2)),
LinearFade('out', fade_size=(0.1, 0.2))
], n=(1, 2))
], p=0.5),
TimeStretch(rate=(0.8, 1.2)),
PitchShift(n_steps=(-0.25, 0.25)),
MedianFilter(window_size=(5, 10), p=0.5)
], random_state=seed)
# Apply the pipeline steps to the input WAV file and write it to the output file.
transform.apply_to_wav('in.wav', 'out.wav')
.. py:function:: generate_from_wav(self, source, n=1)
Runs the quantifier steps on a provided input WAV file and returns a ``numpy.ndarray``.
:param source: Path to the input WAV file.
:type source: ``str``, ``Path`` or *path-like*
:param n: Number of augmented versions of the `source` signal to generate.
:type n: ``int`` :math:`> 0`
:return: The augmented versions (or version if `n=1`) of the `source` signal.
:rtype: ``List[numpy.ndarray]`` or ``numpy.ndarray``
**Example**:
.. code-block:: python
:linenos:
from sigment import *
# Create a pipeline of multiple OneOf quantifiers.
transform = Pipeline([
OneOf([
EdgeCrop(side='start', crop_size=(0.04, 0.08)),
EdgeCrop(side='end', crop_size=(0.04, 0.08))
]),
OneOf([
LinearFade(direction='in', fade_size=(0.02, 0.05)),
LinearFade(direction='out', fade_size=(0.02, 0.05))
])
])
# Generate 5 augmented versions of the signal data from 'signal.wav' as numpy.ndarrays.
Xs_transform = transform.generate_from_wav('signal.wav', n=5)