How can I create my own dataset for AIC desktop?
Brief introduction to AIC data formats
Data loading and formats
"Raw" data
AIC can load “raw” (unlabeled or partially labeled) data in several formats, including csv, edf+, wav, binary and others.

This data can be loaded in the raw data analysis module (feature extraction module, Fig. 1) and in the neural network designer (Fig. 2).

Because every research area has its own specifics, a set of “raw” data must first be collected with third-party programs and saved in one of the supported formats.

Data in specialized formats can also be saved to a universal format (*.shv), which was created specifically for loading any data quickly (Fig. 1, 2).

Processed data (features)
Descriptions of images in the form of feature vectors can be stored in one of the universal formats. Loading is done from the main module (statistical processing module, Fig. 3).

Four formats are supported:

  1. xml (full). The format is designed specifically for AIC and allows you to create complex descriptions of image classes in any feature space, regardless of the subject area. Both loading and saving are supported. A description of the format and an example are given in the listing.
  2. xml (shortened). Identical to the previous format, except that the names of XML tags are written in abbreviated form (see the table). Both loading and saving are supported.
  3. csv.
  4. txt. Simplified format.

XML format (full)
This format can be used to compile your own data set, regardless of the subject area.

Listing
File structure

<?xml version="1.1" encoding="UTF-8" ?>

<!-- make sure the encoding is correct -->


<Classes lang="en">

<!-- lang: localization; use "ru" if the decimal separator is a comma and "en" if it is a period -->

<!-- Specification - text description of characteristics obtained under various conditions or from physically different factors -->

<Specification description="The experiment was carried out under normal conditions">

<!-- the Feature tag has an optional attribute unused, which can be equal to any value; if it is present, then the feature is not used (turned off)-->

<Feature id="1" description="Distance between eyes" />

<Feature id="2" description="Eye color" />

<Feature id="3" description="Relationship between head and forehead sizes" />

................................

</Specification >

<Features>

<!-- Class - user description -->

<Class name="User 1">

<!-- the Realization tag also has an optional attribute timeId - the time the implementation was received or its serial number (in any case, it is an unsigned long integer) -->

<Realization>

<Feature id="1" value="0,101" />

<Feature id="2" value="1" />

<Feature id="3" value="333" />

................................

</Realization>

<Realization>

<Feature id="1" value="0,254" />

<Feature id="2" value="1" />

<Feature id="3" value="342" />

................................

</Realization>

</Class>

<Class name="User 2">

<Realization>...................................</Realization>

<Realization>...................................</Realization>

...................................

</Class>

<Class .........................</Class>

...................................

</Features>

</Classes>

The simplest example

<?xml version="1.0" encoding="windows-1251"?>

<Classes lang="en">

<Specification description="The experiment on generating handwritten passwords was carried out under normal conditions">

<Feature id="1" description="Rxy" />

<Feature id="2" description="Ryp" />

<Feature id="3" description="Rxp" />

<Feature id="4" description="Rxx" />

<Feature id="5" description="Ryy" />

<Feature id="6" description="Rpp" />

<Feature id="7" description="Ap1" />

<Feature id="8" description="Ap2" />

<Feature id="9" description="Ap3" />

<Feature id="10" description="Ap4" />

</Specification >

<Features>

<Class name="User 1">

<Realization>

<Feature id="1" value="-0,417" />

<Feature id="2" value="0,09" />

<Feature id="3" value="0,284" />

<Feature id="4" value="0,675" />

<Feature id="5" value="0,309" />

<Feature id="6" value="0,493" />

<Feature id="7" value="0,00137828" />

<Feature id="8" value="9,798E-5" />

<Feature id="9" value="5,498E-5" />

<Feature id="10" value="4,214E-5" />

</Realization>

</Class>

<Class name="User 2">

<Realization>

<Feature id="1" value="-0,534" />

<Feature id="2" value="0,07" />

<Feature id="3" value="0,312" />

<Feature id="4" value="0,234" />

<Feature id="5" value="0,754" />

<Feature id="6" value="0,134" />

<Feature id="7" value="0,00754334" />

<Feature id="8" value="7,798E-5" />

<Feature id="9" value="9,498E-5" />

<Feature id="10" value="3,214E-5" />

</Realization>

</Class>

</Features>

</Classes>
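
A file in the full format can also be generated programmatically. The following is a minimal sketch in Python using the standard xml.etree.ElementTree module; it is not part of AIC. The function names (format_value, build_dataset, save_dataset) and the layout of the input dictionaries are hypothetical and chosen only for illustration. The decimal separator is chosen according to the lang comment in the file structure listing (comma for "ru", period for "en").

```python
import xml.etree.ElementTree as ET


def format_value(value, lang="en"):
    """Render a number as text; assumption: "ru" means a decimal comma."""
    text = str(value)
    return text.replace(".", ",") if lang == "ru" else text


def build_dataset(feature_names, classes, description, lang="en"):
    """Build a full-format tree.

    feature_names: {feature id: description}           (hypothetical layout)
    classes: {class name: [{feature id: value}, ...]}  (one dict per Realization)
    """
    root = ET.Element("Classes", {"lang": lang})

    # Feature descriptions are written once, inside <Specification>.
    spec = ET.SubElement(root, "Specification", {"description": description})
    for fid, name in feature_names.items():
        ET.SubElement(spec, "Feature", {"id": str(fid), "description": name})

    features = ET.SubElement(root, "Features")
    for class_name, realizations in classes.items():
        if not realizations:
            raise ValueError(f"class {class_name!r} has no realizations")
        cls = ET.SubElement(features, "Class", {"name": class_name})
        for realization in realizations:
            node = ET.SubElement(cls, "Realization")
            for fid, value in realization.items():
                ET.SubElement(
                    node, "Feature",
                    {"id": str(fid), "value": format_value(value, lang)},
                )
    return ET.ElementTree(root)


def save_dataset(tree, path):
    # Writing the declaration together with the data keeps the declared
    # encoding consistent with the bytes that are actually written.
    tree.write(path, encoding="UTF-8", xml_declaration=True)


# Small demo with the first two features of the example above.
demo = build_dataset(
    {1: "Rxy", 2: "Ryp"},
    {"User 1": [{1: -0.417, 2: 0.09}], "User 2": [{1: -0.534, 2: 0.07}]},
    description="Handwritten passwords, normal conditions",
    lang="ru",
)
print(ET.tostring(demo.getroot(), encoding="unicode"))
```

Calling save_dataset(demo, "dataset.xml") would write the tree to disk; the file name is a placeholder.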

Principles of file construction
1. Each implementation (<Realization>) is one measurement of the values of all (or some) features (for example, one mouse movement between two elements, one signature, a photo of a face, etc.), i.e., one data sample.
2. Each class (<Class>) is a classified object (user, person, subject, phenomenon, etc.).
3. Each class must have one or more implementations.
4. Implementations can have a different number of features and a different number of values for the same feature (i.e., several features with the same id); features can appear in any order, and the order may differ from one implementation to another.
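
The principles above can be illustrated with a short reader. This is a minimal sketch in Python, not AIC's own loader; the names parse_value and load_dataset and the returned structure are assumptions. It keeps a list of values per feature id (principle 4), rejects a class without implementations (principle 3), and accepts both decimal separators.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def parse_value(text):
    """Convert "0,101" or "0.101" (also "9,798E-5") to a float."""
    return float(text.replace(",", "."))


def load_dataset(root):
    """Return {class name: [{feature id: [values, ...]}, ...]}."""
    dataset = {}
    for cls in root.iter("Class"):
        realizations = []
        for realization in cls.findall("Realization"):
            # The same feature id may occur several times in one
            # implementation, so every id maps to a list of values.
            values = defaultdict(list)
            for feature in realization.findall("Feature"):
                values[feature.get("id")].append(parse_value(feature.get("value")))
            realizations.append(dict(values))
        if not realizations:
            raise ValueError(f"class {cls.get('name')!r} has no implementations")
        dataset[cls.get("name")] = realizations
    return dataset


SAMPLE = """
<Classes lang="en">
  <Features>
    <Class name="User 1">
      <Realization>
        <Feature id="2" value="0,09" />
        <Feature id="1" value="-0,417" />
      </Realization>
    </Class>
  </Features>
</Classes>
"""

print(load_dataset(ET.fromstring(SAMPLE)))
```

For a file on disk, ET.parse(path).getroot() can be passed to load_dataset instead; ET.parse reads the encoding from the XML declaration, which is why the declaration has to match the real encoding of the file.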
Explanations

<Feature id="1" description="Rxy" />: feature descriptions are specified once per file, in the <Specification> block.

<Class name="User 1">: the name of the image class. An image class consists of implementations (<Realization>), each of which contains feature values such as <Feature id="1" value="-0,417" />.

Make sure the encoding in the <?xml ?> tag is correct!

Handwritten signatures dataset №2

XML format (short)
The shortened format takes up less disk space. To convert a file, replace the tag and attribute names listed in the table with their short counterparts throughout the file.

Table
Full name      Short name
Realization    R
Class          C
Feature        F
value          v
timeId         t
generated      g
description    d
name           n
Patterns       P
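
The replacement described by the table can be automated. Below is a minimal sketch in Python that renames tags and attributes of a full-format tree according to the table. It rests on two assumptions that the table does not confirm: the mapping is applied to both tag names and attribute names, and any name that is not in the table (for example Classes, Specification, Features, id, lang) is left unchanged. The names SHORT_NAMES and shorten are hypothetical.

```python
import xml.etree.ElementTree as ET

# Full name -> short name, copied from the table.
SHORT_NAMES = {
    "Realization": "R",
    "Class": "C",
    "Feature": "F",
    "value": "v",
    "timeId": "t",
    "generated": "g",
    "description": "d",
    "name": "n",
    "Patterns": "P",
}


def shorten(root):
    """Rename tags and attributes in place; unknown names are kept as-is."""
    for elem in list(root.iter()):
        elem.tag = SHORT_NAMES.get(elem.tag, elem.tag)
        renamed = {SHORT_NAMES.get(key, key): val for key, val in elem.attrib.items()}
        elem.attrib.clear()
        elem.attrib.update(renamed)
    return root


FULL = """
<Classes lang="en">
  <Specification description="Demo">
    <Feature id="1" description="Rxy" />
  </Specification>
  <Features>
    <Class name="User 1">
      <Realization>
        <Feature id="1" value="-0,417" />
      </Realization>
    </Class>
  </Features>
</Classes>
"""

print(ET.tostring(shorten(ET.fromstring(FULL)), encoding="unicode"))
```

Reversing the dictionary ({short: full for full, short in SHORT_NAMES.items()}) gives the opposite conversion under the same assumptions.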
Handwritten signatures dataset №2
