Opto Engineering® - 20 years

Camera calibration

The coordinates on the sensor are expressed in a vector basis called (u,v). The distortion of a camera-lens system can be expressed by an operator that transforms the coordinates with distortion (u,v) into coordinates without distortion (u',v'), or vice versa. As a general rule we can therefore write:

`((u'),(v'))=A((u),(v))`

where A is a generic operator.

In the classic model, A takes the form of a radial scaling function:

`((u'),(v'))=1/(1+kr^2)((u),(v))`

With

`r^2=u^2+v^2`

The factor can be expanded as a power series to separate the contributions of the different powers of r.
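As an illustration, the factor is a geometric series in `kr^2`: `1/(1+kr^2) = 1 - kr^2 + k^2r^4 - ...`, valid when `|kr^2| < 1`. The sketch below compares the exact factor with truncated series. The function names and the values of `k` and `r` are hypothetical, chosen only for the example.

```python
def division_factor(k, r):
    """Exact radial factor 1 / (1 + k r^2) from the classic model."""
    return 1.0 / (1.0 + k * r * r)


def series_factor(k, r, terms):
    """Truncated geometric series: sum of (-k r^2)^n for n = 0 .. terms-1."""
    x = -k * r * r
    return sum(x ** n for n in range(terms))


# Hypothetical values, chosen so that |k r^2| < 1 and the series converges.
k, r = 0.1, 1.0
exact = division_factor(k, r)

for terms in (1, 2, 3, 6):
    approx = series_factor(k, r, terms)
    print(terms, approx, abs(approx - exact))
```

Each extra term adds the contribution of a higher even power of r, and the truncation error shrinks as long as `|kr^2|` stays below 1.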

To also take spurious effects into account (such as the tangential distortion caused by lens decentring), the equation can be redefined as:

`((u'),(v'))=((u(1+k_1r^2+k_2r^4+...)+P_1(r^2+2u^2)+2P_2uv+...),(v(1+k_1r^2+k_2r^4+...)+P_2(r^2+2v^2)+2P_1uv+...))`

The effects of the main coefficients are graphically shown as follows:

Determining the transformation values makes it possible to calculate and correct the distortion.
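A minimal sketch of the extended model, truncated after `k_2`, `P_1` and `P_2`, could look as follows. It assumes that (u,v) are measured from the distortion centre in normalised units. The function name `apply_model` and the coefficient values are hypothetical.

```python
def apply_model(u, v, k1=0.0, k2=0.0, p1=0.0, p2=0.0):
    """Map (u, v) to (u', v') with the radial (k1, k2) and P1/P2 terms
    of the extended equation, truncated after the terms shown there."""
    r2 = u * u + v * v
    radial = 1.0 + k1 * r2 + k2 * r2 * r2
    u_out = u * radial + p1 * (r2 + 2.0 * u * u) + 2.0 * p2 * u * v
    v_out = v * radial + p2 * (r2 + 2.0 * v * v) + 2.0 * p1 * u * v
    return u_out, v_out


# Hypothetical coefficients: with all of them at zero the mapping is the identity.
print(apply_model(0.5, 0.0))
print(apply_model(0.5, 0.0, k1=0.1))
print(apply_model(0.5, 0.0, p1=0.01))
```

In practice the coefficients are estimated during calibration; the values here only show how each term displaces a point.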

Final remarks:

  • The model is applied to the camera and lens system. Any change in these components makes it necessary to recalculate the coefficients.
  • The parameters are intrinsic, therefore they do not depend on what is observed.

AI Neural Network Based Systems

Tech insights

In the traditional image processing approach, a machine vision algorithm provides the computer with instructions to acquire the image, sequentially analyze its characteristics, and extract the information needed to solve the vision application. Such algorithms are programmed by the user or extracted from an existing machine vision library.

Image analysis based on neural networks approaches the image interpretation problem from a different perspective: instead of giving the computer all the instructions (i.e., the algorithm), the user trains it by showing a dataset of images and corresponding labels that provide feedback on what should be considered a correct result. This approach is closer to how the human brain learns, which is why these systems are said to rely on Artificial Intelligence (AI).

AI algorithms are trained on large datasets of labelled images, allowing them to learn the characteristics and features that define different objects, scenes, or structures within images. The algorithm can then identify patterns within the images of each sub-dataset that allow it to reliably determine the output results of a machine vision system.

AI software with learning capabilities is built on weighted combinations of data vectors; this algorithm architecture is commonly called a “neural network”.

Network Depth and deep learning

An artificial neural network is a computing model consisting of logical elements (artificial “neurons”) based on a simplified biological neural network model.

Neural networks can model complex transformations of an image with the goal of extracting relevant features and using them to solve a particular problem.

The number of network layers, often called Network Depth, describes the complexity of the network and its capability to create interconnections and relationships between inputs and outputs. A deeper network is therefore generally able to solve more complex problems.

Deep learning is a subset of machine learning methods based on artificial neural networks. The adjective "deep" refers to the use of multiple layers in the network.

Neural network

The neurons of an artificial neural network can be considered as network nodes and divided into the following groups:

  • Input neurons, in a one-to-one relationship with the features of the sample (green nodes)
  • Hidden neurons, which may or may not be present; the number of hidden layers depends on how capable the system should be of creating interconnections, and therefore relationships, between inputs (red nodes)
  • Output neurons, which present the final outcomes of the calculation.
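The three groups of neurons can be sketched as a small fully connected network. This is only an illustration: the sigmoid activation, the layer sizes and all names are assumptions, not something specified in the text.

```python
import math


def sigmoid(x):
    """Squash a weighted sum into the range (0, 1); an assumed activation."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(inputs, weights, biases):
    """Each neuron in the layer computes a weighted sum of all inputs."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the input values through every layer in turn."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = layer_output(activations, weights, biases)
    return activations


# 3 input neurons -> 2 hidden neurons -> 1 output neuron.
# All connections carry the same weight, as in a network that has not learned yet.
untrained = [
    ([[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]], [0.0, 0.0]),  # hidden layer
    ([[0.5, 0.5]], [0.0]),                             # output layer
]
output = forward([0.2, 0.7, 0.1], untrained)
print(output)
```

Each input value feeds one input neuron, and the list returned by `forward` holds one value per output neuron.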

A learning process (either supervised or unsupervised) is necessary to make a neural network operational. The connections are moulded during learning (in the previous image they all carried the same weight): some connections disappear, some become weaker and others become stronger.

Once trained, the system is capable of solving the problems it was trained for.

Example:

Nicola and Paolo want to book a restaurant online. The system shows each of them about a dozen restaurants and asks them to rate their preferences (which might simply mean selecting the restaurants they find interesting). Based on inputs such as price, customer reviews, vicinity, etc., a popularity ranking is then created for each user (with a single output whose value lies between 0 and 10). Each user has their own neural network, specifically trained to make Nicola's or Paolo's decision easier, although obviously Nicola's rankings might not agree with what Paolo has in mind.

This example shows the advantage of neural networks: generic software is written once and adapts itself to a specific purpose only after learning.
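The restaurant example can be sketched with the simplest possible network: a single output neuron trained by gradient descent. Everything here is hypothetical, including the yes/no features (cheap, well reviewed, nearby) and the ratings, which are generated from invented preference rules so that the example stays small.

```python
def predict(weights, bias, features):
    """Single output neuron: weighted sum of the inputs, clamped to 0-10."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return max(0.0, min(10.0, score))


def train(samples, epochs=2000, lr=0.05):
    """Fit one user's ratings with plain gradient descent on squared error."""
    weights, bias = [0.0] * len(samples[0][0]), 0.0
    for _ in range(epochs):
        for features, rating in samples:
            raw = sum(w * x for w, x in zip(weights, features)) + bias
            error = raw - rating
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


# Hypothetical restaurants described as (cheap, well_reviewed, nearby), each 0 or 1.
restaurants = [(c, r, n) for c in (0, 1) for r in (0, 1) for n in (0, 1)]

# Invented ratings: Nicola cares mostly about price, Paolo about reviews.
nicola_ratings = [(f, 2 + 6 * f[0] + 2 * f[2]) for f in restaurants]
paolo_ratings = [(f, 1 + 8 * f[1] + 1 * f[2]) for f in restaurants]

# The same generic code yields a different model for each user.
nicola_model = train(nicola_ratings)
paolo_model = train(paolo_ratings)

candidate = (1, 0, 1)  # cheap, poorly reviewed, nearby
print(predict(*nicola_model, candidate), predict(*paolo_model, candidate))
```

The `train` function is identical for both users; only the ratings differ, and so do the learned weights and the resulting scores.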

Neural network learning process and main tasks

During the learning process, the software uses the training image dataset to set the values of the coefficients and the relative weights of the vector combinations that form the neural network.

This learning process allows the software to create the network model, and it is usually an iterative process of updating neural network weights based on the training dataset. Using machine learning software is typically divided into two main stages:

First is the training stage, wherein the software generates a model based on features learned from training samples.

Then comes the inference stage, where the model is applied to new images to perform the actual machine vision task.

Since the number of coefficients and weights to be assigned depends on the Network Depth, the training complexity and the required computational performance of the hardware increase as the Network Depth increases.
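To give a feel for this growth, the sketch below counts the weights and biases of a fully connected network as hidden layers are added. The fully connected architecture and the layer sizes (64 inputs, 32 neurons per hidden layer, 1 output) are assumptions made for the example.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network with these layer sizes."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical network: 64 inputs, hidden layers of 32 neurons, 1 output.
for hidden_layers in (1, 2, 4, 8):
    sizes = [64] + [32] * hidden_layers + [1]
    print(hidden_layers, count_parameters(sizes))
```

With these sizes, each additional hidden layer adds 32 × 32 weights and 32 biases that must be learned during training.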

Common tasks that can be solved by neural networks are:

  • feature detection
  • object classification
  • instance segmentation
  • object location
  • reading characters

Machine learning

Machine learning includes the following main types:

  • supervised
  • unsupervised

The main difference between the two types is that supervised learning is performed using a ground truth: the operator has prior knowledge of what the output values of the samples should be (e.g. right or wrong). The goal of supervised learning is therefore to learn a function which, given a sample of data and the desired outputs, best approximates the relationship between inputs and outputs observed in the data.

Unsupervised learning has no labeled outputs. Therefore its goal is to deduce the structure within a group of data.

Supervised Learning

Supervised learning problems can be further grouped into regression and classification problems.

Classification: a classification problem is one where the output variable is a category, like “red” and “blue” or “right” and “wrong”.

Regression: a regression problem is one where the output variable is a real value, like “dollars” for the estimated price of a house, given input parameters such as “size”, “nearby schools”, etc.
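The difference between the two outputs can be illustrated with two very small models: a least-squares line for regression and a nearest-centroid rule for classification. Both techniques and all data values are chosen here only for illustration; they are not prescribed by the text.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (real-valued output)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def fit_centroids(xs, labels):
    """Mean feature value of each category (categorical output)."""
    groups = {}
    for x, label in zip(xs, labels):
        groups.setdefault(label, []).append(x)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}


def classify(centroids, x):
    """Return the category whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


# Regression: hypothetical house sizes (m^2) and prices (dollars).
slope, intercept = fit_line([50, 80, 100, 120], [100000, 160000, 200000, 240000])
estimated_price = slope * 90 + intercept  # a real value

# Classification: hypothetical hue values labelled "red" or "blue".
centroids = fit_centroids(
    [0.02, 0.05, 0.03, 0.60, 0.65, 0.62],
    ["red", "red", "red", "blue", "blue", "blue"],
)
colour = classify(centroids, 0.58)  # a category
print(estimated_price, colour)
```

The regression model returns a number on a continuous scale, while the classifier can only return one of the categories seen during training.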

The complexity of the model refers to the complexity of the function one is trying to learn.

Unsupervised Learning

Unsupervised learning is when the operator enters the inputs (such as images) with no matching output variable. The goal of unsupervised learning is to model the structure or the distribution underlying the data. Unlike supervised learning, there are no correct answers because there is no teacher. The algorithms discover the structures in the data and the laws governing them on their own.

Unsupervised learning problems are essentially clustering problems, since one wishes to discover the inherent groupings in the data, for example by clustering objects according to their area. Dimensionality reduction is an aspect that cannot be overlooked, because significant features must be chosen (for example, grouping objects by area alone might not be enough).
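Clustering objects by area can be sketched with k-means, one common clustering algorithm (used here as an example, not as the method the text prescribes). The area values and the choice of two clusters are hypothetical.

```python
def kmeans_1d(values, centres, iterations=20):
    """Alternate between assigning values to the nearest centre
    and moving each centre to the mean of its assigned values."""
    centres = list(centres)
    clusters = [[] for _ in centres]
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for v in values:
            nearest = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centre.
        centres = [
            sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)
        ]
    return centres, clusters


# Hypothetical object areas in pixels; no labels are provided.
areas = [10, 12, 11, 50, 52, 49, 13, 51]

# Start from the smallest and largest value as initial centres.
centres, clusters = kmeans_1d(areas, [min(areas), max(areas)])
print(centres, clusters)
```

The algorithm finds the "small" and "large" groups without being told they exist. If two kinds of object had similar areas, a further feature would be needed to separate them.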

Machine learning advantages and disadvantages

Neural networks offer several advantages over traditional machine vision algorithms. They can adapt to changing input data due to their learning capabilities, making them more robust and reliable under varying conditions such as lighting, scale, and object orientation. Neural networks are also more accurate in detecting complex or irregular patterns, like reflections, defects, or textures.

Unlike traditional methods that require expert programming, neural networks simplify the process for users by allowing model training through labeled datasets, reducing the need for coding expertise.

However, neural networks come with disadvantages: they require powerful and expensive hardware, significant amounts of high-quality labeled data, and careful tuning of training parameters. Poor data quality or biased datasets can lead to incorrect results. Additionally, software licenses for neural networks tend to be more expensive than those for traditional vision systems.

Supervised, unsupervised and semi-supervised learning

Machine learning methods are categorized into three main types: supervised, unsupervised, and semi-supervised learning.

  • Supervised learning uses labeled data, where both input and desired output are known. The model learns to map inputs to outputs based on these labeled examples.
  • Unsupervised learning works only with input data, discovering patterns and structures without predefined output labels. It is commonly used for clustering tasks.
  • Semi-supervised learning combines both approaches, using a small set of labeled data along with a larger set of unlabeled data. This method helps reduce the effort required for data preparation, especially in deep learning applications that demand large datasets.
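One common way to combine labelled and unlabelled data is self-training (pseudo-labelling): a model fitted on the few labelled samples labels the unlabelled samples it is confident about, then is refitted. The sketch below uses a nearest-centroid model on a single feature. The technique, the `margin` threshold and the data are assumptions made for illustration, and the code expects at least two categories.

```python
def fit_centroids(samples):
    """Mean feature value of each label in a list of (value, label) pairs."""
    groups = {}
    for x, label in samples:
        groups.setdefault(label, []).append(x)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}


def nearest_label(centroids, x):
    """Label of the centroid closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def self_train(labelled, unlabelled, margin=0.2, rounds=5):
    """Repeatedly pseudo-label the samples the current model is confident about."""
    labelled, remaining = list(labelled), list(unlabelled)
    for _ in range(rounds):
        centroids = fit_centroids(labelled)
        confident, uncertain = [], []
        for x in remaining:
            distances = sorted(abs(c - x) for c in centroids.values())
            # Confident when the nearest centroid is clearly closer than the runner-up.
            if distances[1] - distances[0] >= margin:
                confident.append((x, nearest_label(centroids, x)))
            else:
                uncertain.append(x)
        if not confident:
            break
        labelled += confident
        remaining = uncertain
    return fit_centroids(labelled), remaining


# Hypothetical defect scores: two labelled samples, seven unlabelled ones.
labelled = [(0.1, "ok"), (0.9, "defect")]
unlabelled = [0.15, 0.2, 0.85, 0.8, 0.45, 0.3, 0.7]

model, leftover = self_train(labelled, unlabelled)
print(model, leftover)
```

Samples that stay ambiguous are left unlabelled rather than forced into a category, which limits the risk of reinforcing wrong labels.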

The main challenge across all types is the effort needed to create and label high-quality datasets, particularly for deeper networks.