.. _features:

.. role:: raw-html(raw)
   :format: html

Transformations
===============

Audio signal transformations in Sigment are represented by classes, each of which applies a specific type of transformation to the audio data.

Some example transformation classes include ``GaussianWhiteNoise`` and ``TimeStretch``.
A full list of available transformations, with their details and parameters, can be found :ref:`below <transformations>`.

Each of these transformation classes is a subclass of the more generic ``Transform`` base class,
which provides a basic interface that can also be used to `write custom transformations `_.

Sigment offers a familiar interface for transformations, taking inspiration from popular augmentation
libraries such as `imgaug `_, `nlpaug `_, `albumentations `_ and `audiomentations `_.

:raw-html:`
`

.. contents:: Section contents
   :local:

.. _transformations:

Available transformations
-------------------------

Identity
^^^^^^^^

.. autoclass:: sigment.transforms.Identity
   :members:

Additive Gaussian White Noise
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. autoclass:: sigment.transforms.GaussianWhiteNoise
   :members:

Time Stretch
^^^^^^^^^^^^

.. autoclass:: sigment.transforms.TimeStretch
   :members:

Pitch Shift
^^^^^^^^^^^

.. autoclass:: sigment.transforms.PitchShift
   :members:

Edge Crop
^^^^^^^^^

.. autoclass:: sigment.transforms.EdgeCrop
   :members:

Random Crop
^^^^^^^^^^^

.. autoclass:: sigment.transforms.RandomCrop
   :members:

Linear Fade In/Out
^^^^^^^^^^^^^^^^^^

.. autoclass:: sigment.transforms.LinearFade
   :members:

Normalize
^^^^^^^^^

.. autoclass:: sigment.transforms.Normalize
   :members:

Pre-Emphasize
^^^^^^^^^^^^^

.. autoclass:: sigment.transforms.PreEmphasize
   :members:

Extract Loudest Section
^^^^^^^^^^^^^^^^^^^^^^^

.. autoclass:: sigment.transforms.ExtractLoudestSection
   :members:

Median Filter
^^^^^^^^^^^^^

.. autoclass:: sigment.transforms.MedianFilter
   :members:

Using transformations
---------------------

Each transformation class comes with a number of methods that can be used to apply the transformation to either a ``numpy.ndarray`` or WAV file.

All transformations accept the `p` and `random_state` parameters, inherited from the ``Transform`` base class described below.

.. autoclass:: sigment.transforms.Transform
   :members:
   :special-members: __call__
   :inherited-members:

Quantifiers
===========

Quantifiers are used to specify rules for how a sequence of transformations or quantifiers should be applied.

Each quantifier class is a subclass of the more generic ``Quantifier`` base class, which provides a basic interface
that can also be used to write custom quantifiers, though there is rarely a need for this.

As with transformations, Sigment offers a familiar interface for quantifiers, taking inspiration from
popular augmentation libraries such as `imgaug `_ and `nlpaug `_.

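To make the idea of "rules for how a sequence of steps should be applied" concrete, here is a minimal sketch in plain NumPy. It does not use Sigment and is not how the library is implemented; ``pipeline``, ``sometimes``, ``halve`` and ``reverse`` are hypothetical names chosen for illustration only.

```python
import numpy as np

def pipeline(steps):
    """Return a callable that applies every step, in the order given."""
    def run(x):
        for step in steps:
            x = step(x)
        return x
    return run

def sometimes(steps, p, rng):
    """Return a callable that applies all steps with probability p.

    With probability 1 - p the input is returned unchanged.
    """
    def run(x):
        if rng.random() < p:  # rng.random() is uniform on [0, 1)
            for step in steps:
                x = step(x)
        return x
    return run

# Two toy "transformations" standing in for real audio transforms.
def halve(x):
    return x * 0.5

def reverse(x):
    return x[..., ::-1]

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0, 4.0])

always = pipeline([halve, reverse])           # both steps, in order
never = sometimes([halve], p=0.0, rng=rng)    # p=0: steps are skipped
# Because each rule is itself a callable, rules can be nested like steps.
nested = pipeline([sometimes([halve], p=1.0, rng=rng), reverse])

print(always(x))
print(never(x))
print(nested(x))
```

Since a rule built this way has the same call signature as a step, it can be placed anywhere a step is expected, which mirrors how the quantifiers described in this section accept both transformations and other quantifiers.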
:raw-html:`
`

.. contents:: Section contents
   :local:

.. _Quantifiers:

Available quantifiers
---------------------

In the table below, the `steps` argument is of type ``List[Transform, Quantifier]``, specifying a sequence of transformations or quantifiers to be applied.

+----------------------------------------+----------------------------------------------------+
| Quantifier                             | Summary                                            |
+========================================+====================================================+
| | :raw-html:`Quantifier (Base)`        | A base class representing a single quantifier.     |
| | ``Quantifier(steps, **kwargs)``      |                                                    |
|                                        +----------------------------------------------------+
|                                        | | **Main parameters**                              |
|                                        | | • None                                           |
|                                        +----------------------------------------------------+
|                                        | **Notes**: As this is a base class,                |
|                                        | it should **not** be instantiated.                 |
+----------------------------------------+----------------------------------------------------+
| | :raw-html:`Pipeline`                 | Sequentially executes each transformation or       |
| | ``Pipeline(steps, **kwargs)``        | quantifier step.                                   |
|                                        +----------------------------------------------------+
|                                        | | **Main parameters**                              |
|                                        | | • None                                           |
|                                        +----------------------------------------------------+
|                                        | **Notes**: None                                    |
+----------------------------------------+----------------------------------------------------+
| | :raw-html:`Sometimes`                | Probabilistically applies the provided             |
| | ``Sometimes(steps, p, **kwargs)``    | transformation or quantifier steps.                |
|                                        +----------------------------------------------------+
|                                        | | **Main parameters**                              |
|                                        | | • `p`: ``0 <= float <= 1``                       |
|                                        | | The probability of executing the                 |
|                                        |   transformation or quantifier steps.              |
|                                        +----------------------------------------------------+
|                                        | **Notes**: None                                    |
+----------------------------------------+----------------------------------------------------+
| | :raw-html:`SomeOf`                   | Randomly applies a number of the provided          |
| | ``SomeOf(steps, n, **kwargs)``       | transformation or quantifier steps.                |
|                                        +----------------------------------------------------+
|                                        | | **Main parameters**                              |
|                                        | | • `n`: ``tuple`` or ``int > 0``                  |
|                                        | | The number of transformation or quantifier       |
|                                        |   steps to apply.                                  |
|                                        +----------------------------------------------------+
|                                        | **Notes**: By default, the chosen steps are still  |
|                                        | applied in the order in which they were defined.   |
+----------------------------------------+----------------------------------------------------+
| | :raw-html:`OneOf`                    | Randomly applies a single step from the provided   |
| | ``OneOf(steps, **kwargs)``           | transformation or quantifier steps.                |
|                                        +----------------------------------------------------+
|                                        | | **Main parameters**                              |
|                                        | | • None                                           |
|                                        +----------------------------------------------------+
|                                        | **Notes**: This is a special case of the           |
|                                        | ``SomeOf`` quantifier, with :math:`n=1`.           |
+----------------------------------------+----------------------------------------------------+

Using quantifiers
-----------------

.. py:class:: sigment.quantifiers.Quantifier(steps, [main params], random_order=False, random_state=None)

    Base class representing a single quantifier.

    .. note::
        As ``Quantifier`` is a base class, it should **not** be directly instantiated;
        use one of the quantifier classes listed :ref:`above <Quantifiers>`.

    :param steps: A collection of transformation or quantifier steps to apply.
    :type steps: ``List[Transform, Quantifier]``
    :param random_order: Whether or not to randomize the order of execution of `steps`.
    :type random_order: ``bool``
    :param random_state: A random state object or seed for reproducible randomness.
    :type random_state: ``numpy.random.RandomState``, ``int`` or ``None``

    .. py:function:: __call__(self, X, sr=None)

        Runs the quantifier steps on a provided input signal.

        :param X: The input signal to transform.
        :type X: ``numpy.ndarray`` :math:`(T,)` or :math:`(1\times T)` for mono, :math:`(2\times T)` for stereo
        :param sr: Sample rate. :raw-html:`
            ` If the steps of the quantifier do not depend on a sample rate, this should be ``None`` (which is the default).
            See the :ref:`transformations table <transformations>` to determine whether you need a sample rate or not.
        :type sr: ``int`` :math:`> 0` or ``None``
        :return: The transformed signal.
        :rtype: ``numpy.ndarray`` :math:`(T,)` for mono, :math:`(2\times T)` for stereo

        **Example**:

        .. code-block:: python
            :linenos:

            import numpy as np
            from sigment.quantifiers import SomeOf
            from sigment.transforms import GaussianWhiteNoise, PitchShift, EdgeCrop

            # Create an example stereo signal.
            X = np.array([
                [0.325, 0.53 , 1.393, 1.211],
                [1.21 , 0.834, 1.022, 0.38 ]
            ])

            # Use the SomeOf quantifier to run only 1 to 2 of the transformations.
            transform = SomeOf([
                GaussianWhiteNoise(scale=(0.05, 0.15)),
                PitchShift(n_steps=(-1., 1.)),
                EdgeCrop(side='start', crop_size=(0.02, 0.05))
            ], n=(1, 2))

            # Run the __call__ method on the quantifier object to transform X.
            # NOTE: Pitch shifting requires a sample rate when called, therefore
            # we must call the quantifier with a specified sample rate parameter.
            X_transform = transform(X, sr=10)

    .. py:function:: generate(self, X, n, sr=None)

        Runs the quantifier steps on a provided input signal, producing multiple augmented copies of the input signal.

        :param X: The input signal to transform.
        :type X: ``numpy.ndarray`` :math:`(T,)` or :math:`(1\times T)` for mono, :math:`(2\times T)` for stereo
        :param n: Number of augmented versions of `X` to generate.
        :type n: ``int`` :math:`> 0`
        :param sr: Sample rate. :raw-html:`
            ` If the steps of the quantifier do not depend on a sample rate, this should be ``None`` (which is the default).
            See the :ref:`transformations table <transformations>` to determine whether you need a sample rate or not.
        :type sr: ``int`` :math:`> 0` or ``None``
        :return: The augmented versions (or version if `n=1`) of the signal `X`.
        :rtype: ``List[numpy.ndarray]`` or ``numpy.ndarray``

        **Example**:

        .. code-block:: python
            :linenos:

            import numpy as np
            from sigment.quantifiers import Sometimes, OneOf
            from sigment.transforms import LinearFade, GaussianWhiteNoise

            # Create an example stereo signal.
            X = np.array([
                [0.325, 0.53 , 1.393, 1.211],
                [1.21 , 0.834, 1.022, 0.38 ]
            ])

            # Use the Sometimes and OneOf quantifiers to sometimes (with probability 0.65)
            # apply a Gaussian white noise transformation and either a fade-in or fade-out.
            transform = Sometimes([
                GaussianWhiteNoise(scale=(0.05, 0.15)),
                OneOf([
                    LinearFade(direction='in', fade_size=(0.05, 0.1)),
                    LinearFade(direction='out', fade_size=(0.05, 0.1))
                ])
            ], p=0.65)

            # Generate 5 augmented versions of X, using the quantifier object.
            Xs_transform = transform.generate(X, n=5)

    .. py:function:: apply_to_wav(self, source, out=None)

        Runs the quantifier steps on a provided input WAV file and writes the resulting signal back to a WAV file.

        .. warning::
            If `out` is set to ``None`` (which is the default) or the same as `source`,
            the input WAV file **will** be overwritten!

        :param source: Path to the input WAV file.
        :type source: ``str``, ``Path`` or *path-like*
        :param out: Output WAV path for the augmented signal.
        :type out: ``str``, ``Path`` or *path-like*

        **Example**:

        .. code-block:: python
            :linenos:

            from sigment import *

            # Seed for reproducible randomness (any integer works).
            seed = 0

            # Create a complex augmentation pipeline
            transform = Pipeline([
                GaussianWhiteNoise(scale=(0.001, 0.0075), p=0.65),
                ExtractLoudestSection(duration=(0.85, 0.95)),
                OneOf([
                    RandomCrop(crop_size=(0.01, 0.04), n_crops=(2, 5)),
                    SomeOf([
                        EdgeCrop('start', crop_size=(0.05, 0.1)),
                        EdgeCrop('end', crop_size=(0.05, 0.1))
                    ], n=(1, 2))
                ]),
                Sometimes([
                    SomeOf([
                        LinearFade('in', fade_size=(0.1, 0.2)),
                        LinearFade('out', fade_size=(0.1, 0.2))
                    ], n=(1, 2))
                ], p=0.5),
                TimeStretch(rate=(0.8, 1.2)),
                PitchShift(n_steps=(-0.25, 0.25)),
                MedianFilter(window_size=(5, 10), p=0.5)
            ], random_state=seed)

            # Apply the pipeline steps to the input WAV file and write it to the output file.
            transform.apply_to_wav('in.wav', 'out.wav')

    .. py:function:: generate_from_wav(self, source, n=1)

        Runs the quantifier steps on a provided input WAV file and returns a ``numpy.ndarray``.

        :param source: Path to the input WAV file.
        :type source: ``str``, ``Path`` or *path-like*
        :param n: Number of augmented versions of the `source` signal to generate.
        :type n: ``int`` :math:`> 0`
        :return: The augmented versions (or version if `n=1`) of the `source` signal.
        :rtype: ``List[numpy.ndarray]`` or ``numpy.ndarray``

        **Example**:

        .. code-block:: python
            :linenos:

            from sigment import *

            # Create a pipeline of multiple OneOf quantifiers.
            transform = Pipeline([
                OneOf([
                    EdgeCrop(side='start', crop_size=(0.04, 0.08)),
                    EdgeCrop(side='end', crop_size=(0.04, 0.08))
                ]),
                OneOf([
                    LinearFade(direction='in', fade_size=(0.02, 0.05)),
                    LinearFade(direction='out', fade_size=(0.02, 0.05))
                ])
            ])

            # Generate 5 augmented versions of the signal data from 'signal.wav' as numpy.ndarrays.
            Xs_transform = transform.generate_from_wav('signal.wav', n=5)
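Two behaviours documented above are easy to misread: ``SomeOf`` picks a random number of steps but still runs them in their defined order, and the ``generate`` methods return a bare array when `n=1` and a list otherwise. The following is a minimal sketch of both in plain NumPy. It is not Sigment's implementation; ``some_of`` and ``generate_copies`` are hypothetical helpers, and reading a tuple `n` as an inclusive ``(low, high)`` range is an assumption.

```python
import numpy as np

def some_of(steps, n, rng):
    """Build a callable that applies a random subset of `steps` in defined order.

    ASSUMPTION: an int `n` means "exactly n steps" and a tuple means an
    inclusive (low, high) range; the real library may interpret it differently.
    """
    low, high = (n, n) if isinstance(n, int) else n

    def run(x):
        k = int(rng.integers(low, high + 1))  # number of steps to apply
        # Sample k distinct step indices, then sort them so that the
        # selected steps run in the order in which they were defined.
        chosen = np.sort(rng.choice(len(steps), size=k, replace=False))
        for i in chosen:
            x = steps[i](x)
        return x

    return run

def generate_copies(transform, x, n):
    """Hypothetical helper: n independent augmentations of x.

    Returns a single array when n == 1 and a list of arrays otherwise,
    matching the return types documented for generate().
    """
    copies = [transform(x.copy()) for _ in range(n)]
    return copies[0] if n == 1 else copies

rng = np.random.default_rng(0)
# Toy steps standing in for real audio transformations.
steps = [lambda x: x + 1.0, lambda x: x * 2.0, lambda x: x - 3.0]
x = np.array([1.0, 2.0])

# n=3 selects every step, so the result is ((x + 1) * 2) - 3.
all_steps = some_of(steps, n=3, rng=rng)(x)
single = generate_copies(some_of(steps, n=(1, 2), rng=rng), x, n=1)
several = generate_copies(some_of(steps, n=(1, 2), rng=rng), x, n=4)

print(all_steps)
print(type(single).__name__, type(several).__name__, len(several))
```

Each copy is produced from a fresh copy of the input, so the original signal is left untouched and the augmented versions vary independently.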