PD Stefan Bosse
University of Siegen - Dept. Maschinenbau
University of Bremen - Dept. Mathematics and Computer Science
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models -
Adapting dynamic parameters of a functional network is an iterative optimization problem
Commonly the solution space is infinite, i.e., there is no single unique solution to the optimization problem.
Basic training is demonstrated for an Artificial Neural Network
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - A simple Artificial Neuron
A simple neuron (perceptron) is a mapping function f (a model) that maps an n-dimensional input vector x onto a scalar output value y:
f(\vec{x}, \vec{w}, b) = g\left( \sum_{i=1}^{n} w_i x_i + b \right)
Here w is the weight vector and b an offset or bias (the dynamic parameters). The function g is called the transfer or activation function and is normally not parametrized.
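As a minimal sketch, such a neuron can be written in R; the sigmoid transfer function g and the example parameter values below are assumptions for illustration only:

# Minimal sketch of a single neuron with n inputs, assuming g = sigmoid.
sigmoid = function(x) 1/(1+exp(-x))
neuron  = function(x, w, b) sigmoid(sum(w*x) + b)   # g(sum_i w_i*x_i + b)

# usage: two inputs, arbitrary example parameters
neuron(c(0.5, 1.0), w=c(0.2, -0.4), b=0.1)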
A single neuron with a single input p and an output o. w is a weighting factor (a weight for incoming p) and b is a bias (offset)
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - A Multi-input Artificial Neuron
A single neuron with an input vector p and a scalar output o. w is a weighting factor vector (a weight for incoming p) and b is a bias (offset)
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Artificial Neural Network
An ANN is a function graph consisting of interconnected neurons. It is a graph G(V,E) with a set of vertices V (the neurons) and edges E connecting them.
Commonly, neurons are arranged and grouped in layers, but this is not mandatory. There is always one input and one output layer. Hidden layers lie between the input and output layers.
The input layer (commonly) consists of n neurons for n input variables (attributes).
The output layer (commonly) consists of m neurons for m output variables (regression) or m target classes (classification)
Commonly (but not mandatorily), each neuron of a layer i is connected to the outputs of all neurons of the previous layer i-1.
Neural network with neurons arranged in one layer
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Loss and Error Functions
Assume there is a set of data samples D; each sample contains an input feature vector x and an output target feature vector y.
The goal of model training is to find a model function that maps x onto y with minimal error over all instances (at least on average).
The loss or error function defines the mismatch between a training or test sample and the output of the function f (here for one scalar output y):
y = f(\vec{x})
\mathrm{MAE}(y, y_0) = |y_0 - y|
\mathrm{MBE}(y, y_0) = y_0 - y
\mathrm{MSE}(y, y_0) = (y_0 - y)^2
\vec{y} = f(\vec{x})
\mathrm{MAE}(\vec{y}, \vec{y}_0) = \frac{1}{n}\sum_{i=1}^{n} |y_i - y_{0,i}|
\mathrm{MBE}(\vec{y}, \vec{y}_0) = \frac{1}{n}\sum_{i=1}^{n} (y_i - y_{0,i})
\mathrm{MSE}(\vec{y}, \vec{y}_0) = \frac{1}{n}\sum_{i=1}^{n} (y_i - y_{0,i})^2
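A small R sketch of these error measures, assuming a prediction vector y and a target vector y0 of equal length n:

# Error measures for prediction vector y and target vector y0 (same length).
MAE = function(y, y0) mean(abs(y - y0))    # mean absolute error
MBE = function(y, y0) mean(y - y0)         # mean bias error (signed)
MSE = function(y, y0) mean((y - y0)^2)     # mean squared error

y  = c(0.1, 0.4, 0.9); y0 = c(0.0, 0.5, 1.0)
c(MAE(y, y0), MBE(y, y0), MSE(y, y0))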
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Training by Error Backpropagation
Most CNN layers involve parameters which need to be tuned appropriately for a given computer vision task (e.g., image classification or object detection).
Assume again a single perceptron neuron with only two inputs x1 and x2.
Then we can change the respective weight parameter w_i simply by computing the "forward" application error and subtracting the error, multiplied by the current input value, from the weight w_i (a rough approximation!):
w'_i = w_i - \alpha (y - y_0) x_i
data = list(
  list(x1=0, x2=0, y=0),
  list(x1=1, x2=0, y=0.3),
  list(x1=0, x2=1, y=0.5),
  list(x1=1, x2=1, y=1)
)
sigmoid = function(x) { 1/(1+exp(-x)) }
neuron  = function(x1, x2, w, b) {
  accu = x1*w[1] + x2*w[2]   # weighted sum of the inputs
  sigmoid(accu + b)          # activation with bias offset
}
w = c(0,0); b = 0
samples = 1:4
rate = 0.01                       # learning rate alpha
for (run in 1:1000) {
  set = sample(samples, 1)        # randomly pick one training sample
  row = data[[set]]
  y   = neuron(row$x1, row$x2, w, b)
  err = y - row$y                 # forward application error
  w[1] = w[1] - rate*err*row$x1   # weight update scaled by input and rate
  w[2] = w[2] - rate*err*row$x2
  b    = b    - rate*err
}
print(w); print(b)
Training with randomly selected sample instances
for (index in 1:4) {
  row = data[[index]]
  y   = neuron(row$x1, row$x2, w, b)
  print(paste('Index', index, 'Predicted', y, 'Error', y - row$y))
}
Test with sample instances
w'_i = w_i - \alpha \frac{\partial (y - y_0)}{\partial w_i}
The learning rate α determines the steps to be taken along the slope to achieve the goal. Too large steps could result in jumping over or missing the point of the global minimum (also known as overshooting), and too small steps result in a very slow process of achieving the goal. This is a hyperparameter that needs to be tuned. In practice, people often start with 0.01 and either decrease or increase it accordingly. (Aminah Mardiyyah Rufai)
But: We have a lot of different training samples, and if we change the parameter only based on the error from the current sample we will not converge to an average!
Therefore, only a small fraction given by the learning rate parameter α is used!
Up to here we considered only one functional node (one neuron).
If parameters of functions of previous nodes/layers must be adapted, the process is a little more complicated, although the same principle applies: in general, the derivative of the error function with respect to the respective weight/parameter to be adjusted must be computed:
\frac{\partial E}{\partial w_i}
Matt Mazur, https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/ Example network with two input nodes, two inner (hidden) nodes, and two output nodes
\frac{\partial E}{\partial w_i} = \frac{\partial E}{\partial out_i} \cdot \frac{\partial out_i}{\partial net_i} \cdot \frac{\partial net_i}{\partial w_i}
In the hidden (inner) layer, we start with the same formula, but slightly modified to account for the fact that the output of each hidden-layer neuron contributes to the output (and therefore the error) of multiple output neurons.
We know that out_h1 affects both out_o1 and out_o2; therefore, the gradient needs to take into account its effect on both output neurons:
\frac{\partial E}{\partial w_1} = \frac{\partial E}{\partial out_{h1}} \cdot \frac{\partial out_{h1}}{\partial net_{h1}} \cdot \frac{\partial net_{h1}}{\partial w_1}
\frac{\partial E}{\partial out_{h1}} = \frac{\partial E_{o1}}{\partial out_{h1}} + \frac{\partial E_{o2}}{\partial out_{h1}}
Error backpropagation from output to inner layer nodes must consider error accumulation by multiple nodes
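The following R sketch illustrates this accumulation for a small 2-2-2 network with sigmoid neurons (the structure of the Mazur example). All numerical values below are arbitrary illustration values, a squared-error loss E = 1/2 Σ(t − out)² and a single bias per layer are assumptions:

sigmoid  = function(x) 1/(1+exp(-x))
dsigmoid = function(o) o*(1-o)            # derivative written via the output o

x  = c(1.0, 0.5)                          # two inputs (illustrative values)
Wh = matrix(c(0.1, 0.3, -0.2, 0.4), 2, 2, byrow=TRUE)   # hidden-layer weights
Wo = matrix(c(0.2, -0.5, 0.7, 0.1), 2, 2, byrow=TRUE)   # output-layer weights
bh = 0.05; bo = -0.1                      # one bias per layer (assumption)
t  = c(0, 1)                              # target outputs

# forward pass
out_h = sigmoid(Wh %*% x + bh)
out_o = sigmoid(Wo %*% out_h + bo)

# backward pass: output-layer delta terms for E = 0.5*sum((t-out_o)^2)
delta_o = (out_o - t) * dsigmoid(out_o)

# dE/dout_h1 accumulates the contribution of BOTH output neurons
dE_douth1 = sum(delta_o * Wo[, 1])

# chain rule down to w1 (the weight connecting input x1 to hidden neuron h1)
dE_dw1 = dE_douth1 * dsigmoid(out_h[1]) * x[1]
print(dE_dw1)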
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Weight Initialization
A correct weight initialization is the key to stably train very deep networks. An ill-suited initialization can lead to the vanishing or exploding gradient problem during error back-propagation.
A common approach to weight initialization in CNNs is the Gaussian random initialization technique. This approach initializes the convolutional and the fully connected layers using random matrices whose elements are sampled from a Gaussian distribution with zero mean and a small standard deviation (e.g., 0.1 and 0.01).
The uniform random initialization approach initializes the convolutional and the fully connected layers using random matrices whose elements are sampled from a uniform distribution (instead of a normal distribution as in the earlier case) with a zero mean and a small standard deviation (e.g., 0.1 and 0.01).
A random initialization of a neuron makes the variance of its output directly proportional to the number of its incoming connections (a neuron's fan-in measure). The Xavier initialization compensates for this by scaling the weight variance with the fan-in and fan-out:
\mathrm{Var}(w) = \frac{2}{n_{fin} + n_{fout}}
where w are network weights. Note that the fan-out measure is used in the variance above to balance the back-propagated signal as well. Xavier initialization works quite well in practice and leads to better convergence rates.
Neurons (or filters with transfer functions) with a ReLU non-linearity do not follow the assumptions made for the Xavier initialization; for them, the variance is scaled by the fan-in only (He initialization):
\mathrm{Var}(w) = \frac{2}{n_{fin}}
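As a minimal R sketch, both variance-scaled initializations for a fully connected layer with n_in inputs and n_out outputs could look like this (Gaussian sampling is assumed here; uniform sampling with the same variance works as well):

# Variance-scaled weight initialization for an n_in x n_out layer.
xavier_init = function(n_in, n_out)
  matrix(rnorm(n_in*n_out, mean=0, sd=sqrt(2/(n_in + n_out))), n_in, n_out)

he_init = function(n_in, n_out)            # for ReLU non-linearities
  matrix(rnorm(n_in*n_out, mean=0, sd=sqrt(2/n_in)), n_in, n_out)

W1 = xavier_init(64, 32)
W2 = he_init(64, 32)
c(var(as.vector(W1)), var(as.vector(W2)))  # approx. 2/96 and 2/64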
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Pre-training
One approach to avoid the gradient diminishing or exploding problem is to use layer-wise pre-training in an unsupervised fashion.
The unsupervised pre-training can be followed by a supervised fine-tuning stage to make use of any available annotations.
However, due to the new hyper-parameters, the considerable amount of effort involved in such an approach, and the availability of better initialization techniques, layer-wise pre-training is seldom used now to enable the training of very deep CNN-based networks.
(not a good idea)
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Supervised Pre-Training
In practical scenarios, it is desirable to train very deep networks, but we do not have a large amount of annotated data available for many problem settings.
A very successful practice in such cases is to first train the neural network on a related but different problem, where a large amount of training data is already available.
Afterward, the learned model can be “adapted” to the new task by initializing with weights pre-trained on the larger dataset.
This process is called “fine-tuning” and is a simple, yet effective, way to transfer learning from one task to another.
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Training and Validation (Test)
The set of data samples is commonly split into two sub-sets: a training set and a validation (test) set.
For gradient error back-propagation, commonly linear error functions are used. For the validation, higher-order functions (like MSE) can be used.
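A minimal R sketch of such a random split; the 80/20 ratio and the generic data frame df below are assumptions for illustration:

# Random 80/20 split of a data frame df into training and test sub-sets.
split_data = function(df, ratio=0.8) {
  idx = sample(nrow(df), size=floor(ratio*nrow(df)))
  list(train=df[idx, ], test=df[-idx, ])
}

df   = data.frame(x1=runif(100), x2=runif(100), y=runif(100))
sets = split_data(df)
c(nrow(sets$train), nrow(sets$test))     # 80 and 20 rows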
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Regularization
Since deep convolutional and other neural networks have a large number of parameters, they tend to over-fit the training data during the learning process.
Regularization approaches aim to avoid this problem using several intuitive ideas.
We can categorize common regularization approaches into the following classes, based on their central idea: data augmentation; dropout; ensemble model averaging; early stopping; and parameter norm penalties and constraints (ℓ1 norm, ℓ2 norm, max-norm, and elastic net constraints). The first four are discussed in the following slides.
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Data Augmentation
Data augmentation is the easiest, and often a very effective way of enhancing the generalization power of CNN models. Especially for cases where the number of training examples is relatively low, data augmentation can enlarge the dataset (by factors of 16x, 32x, 64x, or even more) to allow a more robust training of large-scale models.
Data augmentation is performed by making several copies of a single image using straightforward operations such as rotations, cropping, flipping, scaling, translations, and shearing. These operations can be performed separately or combined, e.g., to form copies which are both flipped and cropped.
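As a toy R sketch of two of these operations on an image stored as a grey-value matrix (real augmentation pipelines typically use dedicated image libraries):

# Image stored as a grey-value matrix; two simple augmentation operations.
flip_horizontal = function(img) img[, ncol(img):1]      # mirror left/right
rotate90        = function(img) t(img[nrow(img):1, ])   # rotate 90 deg clockwise

img = matrix(1:12, nrow=3, byrow=TRUE)
flip_horizontal(img)
rotate90(img)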
Khan, 2018 Examples of data augmentation using image cropping, flipping, and rotation
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Drop Out
One of the most popular approaches for neural network regularization is the dropout technique.
During network training, each neuron is activated with a fixed probability (usually 0.5 or set using a validation set).
This random sampling of a sub-network within the full-scale network introduces an ensemble effect during the testing phase, where the full network is used to perform prediction.
Activation dropout works really well for regularization purposes and gives a significant boost in performance on unseen data in the test phase.
A random dropout layer generates a mask m ∈ B^m, where each element m_i is independently sampled from a Bernoulli distribution with probability p of being on (and 1−p of being off).
\vec{a}^{\,l} = \vec{m} \odot f\left(\hat{W} \cdot \vec{a}^{\,l-1} + \vec{b}^{\,l}\right)
Here, a ∈ ℝ^n and b ∈ ℝ^m denote the activations and biases, respectively, W ∈ ℝ^{m×n} is the weight matrix, and f is the transfer function.
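A minimal R sketch of this masking for one layer; the sigmoid transfer function, the layer sizes, and the keep probability p below are illustrative assumptions:

# Dropout on one layer: a_l = m * f(W a_{l-1} + b_l), m ~ Bernoulli(p)
sigmoid = function(x) 1/(1+exp(-x))

dropout_layer = function(a_prev, W, b, p=0.5) {
  m = rbinom(nrow(W), size=1, prob=p)      # Bernoulli mask, one bit per neuron
  m * sigmoid(W %*% a_prev + b)            # element-wise masking of activations
}

W = matrix(rnorm(4*3), nrow=4); b = rnorm(4); a_prev = runif(3)
dropout_layer(a_prev, W, b)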
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Ensemble Model Averaging
The ensemble averaging approach is another simple, but effective, technique where a number of models are learned instead of just a single model.
Each model has different parameters due to different random initializations, different hyper-parameter choices (e.g., architecture, learning rate) and/or different sets of training inputs.
The output from these multiple models is then combined to generate a final prediction score.
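A hedged sketch of the combination step; train_model() and predict_model() are hypothetical stand-ins for any of the training and prediction routines shown earlier in this module:

# Ensemble averaging over k independently trained models.
# train_model() and predict_model() are hypothetical stand-ins.
ensemble_predict = function(x, k=5, train_model, predict_model) {
  models = lapply(1:k, function(i) { set.seed(i); train_model() })  # different random init per model
  preds  = sapply(models, function(m) predict_model(m, x))
  mean(preds)                        # combined (averaged) prediction score
}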
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Early Stopping
The overfitting problem occurs when a model performs very well on the training set but behaves poorly on unseen data.
Early stopping is applied to avoid overfitting in the iterative gradient-based algorithms.
This is achieved by evaluating the performance on a held-out validation set at different iterations during the training process.
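A hedged sketch of the stopping rule; train_one_epoch() and validation_error() are hypothetical stand-ins for the actual training and evaluation routines, and the patience value is an assumption:

# Early stopping: stop when the validation error has not improved for
# 'patience' consecutive epochs.
early_stopping = function(train_one_epoch, validation_error,
                          max_epochs=1000, patience=10) {
  best = Inf; wait = 0
  for (epoch in 1:max_epochs) {
    train_one_epoch()
    err = validation_error()
    if (err < best) { best = err; wait = 0 } else { wait = wait + 1 }
    if (wait >= patience) break      # validation error stopped improving
  }
  epoch                              # number of epochs actually trained
}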
Khan, 2018: An illustration of the early stopping approach during network training, using the validation error for decision making instead of a pre-defined training error threshold.
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Gradient-based CNN Learning
The CNN learning process tunes the parameters of the network such that the input space is correctly mapped to the output space.
Each iteration which updates the parameters using the complete training set is called a “training epoch".
Each training iteration at time t modifies the parameters using the following update equation (the same for linear filter mask weights as well as for non-linear neuronal functions):
\theta_t = \theta_{t-1} - \alpha \delta_t, \quad \delta_t = \nabla_{\theta} F(\theta_t)
But in contrast to a neuron, which sees fixed input data for a given data sample, the filter mask of a convolution operation moves as a window over the entire input matrix!
Let's say we have a 3×3 image I and a 2×2 filter W. Sliding this filter over the image will produce a 2×2 output (no padding).
O_{11} = I_{11}W_{11} + I_{12}W_{12} + I_{21}W_{21} + I_{22}W_{22}
O_{12} = I_{12}W_{11} + I_{13}W_{12} + I_{22}W_{21} + I_{23}W_{22}
O_{21} = I_{21}W_{11} + I_{22}W_{12} + I_{31}W_{21} + I_{32}W_{22}
O_{22} = I_{22}W_{11} + I_{23}W_{12} + I_{32}W_{21} + I_{33}W_{22}
Averaging the four outputs gives a scalar o (used here, for simplicity, directly as the loss L):
o = \frac{O_{11} + O_{12} + O_{21} + O_{22}}{4}
The gradient of the loss with respect to the filter weights is then the matrix
\frac{\partial L}{\partial W} = \begin{bmatrix} \frac{\partial L}{\partial W_{11}} & \frac{\partial L}{\partial W_{12}} \\ \frac{\partial L}{\partial W_{21}} & \frac{\partial L}{\partial W_{22}} \end{bmatrix}
The error must be computed and accumulated for all pixels of the input image!
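A small R sketch of this accumulation for the 3×3/2×2 example above, assuming (for illustration only) that the averaged output o is used directly as the loss L:

# Gradient of L = o = mean(O) w.r.t. the 2x2 filter W, accumulated over all
# window positions of the 3x3 input I.
I = matrix(1:9, nrow=3, byrow=TRUE)            # toy 3x3 input image
W = matrix(c(0.1, 0.2, 0.3, 0.4), nrow=2, byrow=TRUE)

O  = matrix(0, 2, 2)
dW = matrix(0, 2, 2)                           # accumulated dL/dW
for (i in 1:2) for (j in 1:2) {
  patch  = I[i:(i+1), j:(j+1)]
  O[i,j] = sum(patch * W)                      # window output O_ij
  dW     = dW + patch/4                        # dL/dW contribution of O_ij
}
print(dW)      # every filter weight collects input values from ALL positions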
Gradient descent algorithms work by computing the gradient of the objective function with respect to the network parameters, followed by a parameter update in the direction of the steepest descent.
The basic version of the gradient descent, termed “batch gradient descent,” computes this gradient on the entire training set.
However, the training sets can be very large in computer vision problems, and therefore learning via the batch gradient descent can be prohibitively slow because for each parameter update, it needs to compute the gradient on the complete training set.
Stochastic Gradient Descent (SGD) performs a parameter update for each set of input and output that are present in the training set.
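For contrast, a hedged sketch of one epoch of batch gradient descent for the single-neuron example trained earlier (same rough update rule as in the lecture code, but the gradient is accumulated over all samples before a single update):

# One epoch of batch gradient descent for the neuron defined earlier:
# the gradient is accumulated over ALL samples, then applied once.
grad_w = c(0,0); grad_b = 0
for (index in 1:4) {
  row = data[[index]]
  err = neuron(row$x1, row$x2, w, b) - row$y
  grad_w = grad_w + err * c(row$x1, row$x2)
  grad_b = grad_b + err
}
w = w - rate * grad_w / 4        # averaged gradient over the batch
b = b - rate * grad_b / 4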
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Gradient Computation
\nabla = \frac{\partial u}{\partial v} \approx \frac{\Delta u}{\Delta v} = \frac{u_i - u_{i-1}}{v_i - v_{i-1}}
But such a difference formula tends to be very inaccurate for large gradients (which are not known in advance and are dynamic). Analytical differentiation (of a node function) is therefore preferred where possible.
On the other hand, analytically deriving the derivatives of complex expressions is time-consuming and laborious. Furthermore, it is necessary to model the layer operation as a closed-form mathematical expression. However, it provides an accurate value for the derivative at each point.
Gradients of functions f can be computed by:
Numerical differentiation (finite differences): \frac{\Delta f}{\Delta x} = \frac{f(x+h) - f(x)}{h}
Analytical differentiation (for simple functions)
Symbolic differentiation (for complex functions)
Programmed (automatic) differentiation
Every computer program is implemented using a programming language, which only supports a set of basic functions (e.g., addition, multiplication, exponentiation, logarithm and trigonometric functions). Automatic differentiation uses this modular nature of computer programs to break them into simpler elementary functions. The derivatives of these simple functions are computed symbolically and the chain rule is then applied repeatedly to compute any order of derivatives of complex programs.
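A small R illustration of the numerical, analytical, and symbolic routes for the sigmoid function; base R's D() is used here for the symbolic derivative and is only a simple stand-in for a full automatic-differentiation framework:

# Comparing differentiation routes on f(x) = 1/(1+exp(-x)).
f  = function(x) 1/(1+exp(-x))
x0 = 0.5; h = 1e-6

num  = (f(x0+h) - f(x0)) / h               # numerical (finite difference)
ana  = f(x0)*(1-f(x0))                     # analytical: f'(x) = f(x)(1-f(x))
dsym = D(expression(1/(1+exp(-x))), "x")   # symbolic derivative via base R's D()
sym  = eval(dsym, list(x=x0))

print(c(numerical=num, analytical=ana, symbolic=sym))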
Khan, 2018 Relationships between different differentiation methods
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Summary
Error backpropagation requires a previous forward computation to get the error and to compute the error gradients (Bazaga et al., 2019).
PD Stefan Bosse - AFEML - Module F: Training and Validation of data-driven Models - Understanding CNN by Visualization
The visualization can be categorized into three types, depending on the network signal that is used to obtain it, i.e., weights, activations, and gradients. We summarize some of these three types of visualization methods below.
Visualization of regions which are important for the correct prediction from a deep network.
This is an iterative method to get either a heatmap of regions showing their contribution to a classification result, or to mask out irrelevant regions.
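A hedged sketch of the occlusion idea; predict_prob() is a hypothetical stand-in for the network's probability of the correct class, and the patch size and grey value are assumptions:

# Occlusion-based importance map: slide a grey patch over the image and
# record how the class probability changes at each position.
occlusion_heatmap = function(img, predict_prob, patch=8) {
  H = nrow(img); W = ncol(img)
  heat = matrix(NA, H - patch + 1, W - patch + 1)
  for (i in 1:(H - patch + 1)) for (j in 1:(W - patch + 1)) {
    occluded = img
    occluded[i:(i+patch-1), j:(j+patch-1)] = 0.5   # grey occluder patch
    heat[i, j] = predict_prob(occluded)            # drop => important region
  }
  heat
}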
(a) The grey regions in the input images are sequentially occluded and the output probability of the correct class is plotted as a heat map (blue regions indicate high importance for correct classification). (b) Segmented regions in an image are occluded until only the minimal image details that are required for correct scene class prediction are left.