Ranked modelling on feature vectors with missing values




Abstract: Ranked models can reflect regularities in a given set of feature vectors enriched by a priori knowledge in the form of ranked relations between selected objects or events represented by these vectors. Ranked regression models are linear transformations of multivariate feature vectors onto a line which preserve a given set of ranked relations in the best possible way. We pay particular attention to situations in which individual objects or events are represented by feature vectors of different dimensionality. Different dimensionality of feature vectors may appear when values are missing or when the feature space changes successively. The linear ranked transformations can be designed on the basis of feature vectors of different dimensionality via minimization of convex and piecewise linear (CPL) criterion functions.

Key words: feature vectors with missing values, ranked relations, ranked linear transformations, convex and piecewise linear criterion functions

1. Introduction

Exploratory data analysis and pattern recognition methods can be aimed at discovering regularities or trends in multivariate data sets [1], [2]. In a standard data representation, objects or events are represented as feature vectors with the same number of numerical components (features), or as points in a feature space of fixed dimensionality. The assumption of equal dimensionality of feature vectors may be too restrictive in many practical tasks. For example, missing data often undermine the assumption of fixed dimensionality. As another example, consider the causal sequence of liver diseases formulated by medical doctors [3]. It is natural to assume that the more serious cases in this sequence should be examined in a more comprehensive manner than patients with mild diseases. As a result, the dimensionality of the feature vectors can increase successively in accordance with the causal sequence of diseases.

A prominent role in exploratory data analysis is played by procedures originating from regression analysis. Ranked regression models are linear transformations of multivariate feature vectors onto a line which preserve a given set of ranked relations between selected objects or events in the best possible way [4]. We consider the design of ranked regression models on the basis of a given set of ranked relations between selected objects or events represented by feature vectors of different dimensionality.

The ranked regression model can be induced from a given set of feature vectors, enriched by a set of ranked relations between some of these vectors, through minimization of convex and piecewise linear (CPL) criterion functions defined on differential vectors [5]. Theoretical properties of the CPL approach for feature vectors of varied dimensionality are analyzed in the present paper.

2. Ranked relations

Let us consider a family of m objects (events, patients) Oj (j = 1,…, m). We assume that each object Oj may be entirely (completely) represented by the n-dimensional feature vector xj[n] = [xj1,…, xjn]T. In this case, the vectors xj[n] belong to the n-dimensional feature space F[n] (xj[n] ∈ F[n]), and the indices i of the features xi belong to the set I0 = {1,…, n}. The component (feature) xji of the vector xj[n] is the numerical result of the i-th examination (i = 1,…, n) of a given object Oj. The feature vectors xj[n] can be of a mixed type and can represent different types of measurements of a given object Oj (for example xi ∈ {0, 1} or xi ∈ R).

Let us assume that for some reason, for example due to missing values, particular objects Oj are not fully represented. This means that the object Oj is represented not by the n-dimensional feature vector xj[n] but by the nj-dimensional reduced vector xj[nj] (nj ≤ n):

(∀ j ∈ {1,…, m}) xj[nj] = [xj,i(1),…, xj,i(nj)]T (1)

where i(k) ∈ Ij and Ij ⊆ I0.

In accordance with the above relation, each feature vector xj[nj] is characterized by its own set Ij of feature indices i(k). The reduced vector xj[nj] can be obtained from the n-dimensional feature vector xj[n] by neglecting the n − nj features xj,i with indices i not belonging to the set Ij. Geometrically, this means that the vector xj[nj] (xj[nj] ∈ Fj[nj]) results from a projection of the vector xj[n] onto the feature subspace Fj[nj].
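The projection just described can be sketched in a few lines. In this minimal example (hypothetical data; `reduce_vector` is an illustrative helper, not from the paper), the index set Ij selects which components of the complete vector survive:

```python
def reduce_vector(x_full, index_set):
    """Project the complete vector x_j[n] onto the subspace F_j[n_j]
    spanned by the features whose 1-based indices lie in index_set (I_j)."""
    return [x_full[i - 1] for i in sorted(index_set)]

x_full = [5.1, 3.5, 1.4, 0.2]       # x_j[n] with n = 4
I_j = {1, 3, 4}                     # indices of the observed features
print(reduce_vector(x_full, I_j))   # -> [5.1, 1.4, 0.2]
```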

In some cases, the family of objects Oj (j = 1,…, m) can be characterised not only by feature vectors xj[nj] but also by ranked relations 'Oj ≺ Ok' between some of these objects. The ranked relation 'Oj ≺ Ok' ('Ok follows Oj') is fulfilled within pairs of objects with indices (j, k) from some set Jp:

(∀ (j, k) ∈ Jp) Oj ≺ Ok ⇔ Ok follows Oj (2)

The family (2) of ranked relations 'Oj ≺ Ok', where (j, k) ∈ Jp, represents additional knowledge about some objects Oj. Let us assume that the ranked relation is transitive, i.e. the below implication is fulfilled:

if Oj ≺ Ok and Ok ≺ Ol, then Oj ≺ Ol (3)

The ranked relation Oj ≺ Ok can be represented by the transitive ranked relation 'xj ≺ xk' between the feature vectors xj[nj] and xk[nk]:

(∀ (j, k) ∈ Jp) xj ≺ xk ⇔ xk follows xj (4)

For example, additional knowledge about m = 5 objects Oj may be represented by a family of three ranked relations between feature vectors. We allow a situation where some of the feature vectors xj[nj] (1) appearing in the ranked relations have missing values.
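The transitivity assumption (3) can be made concrete with a small sketch (hypothetical object indices; `transitive_closure` is an illustrative helper): closing the given set Jp of pairs under the rule 'if Oj ≺ Ok and Ok ≺ Ol then Oj ≺ Ol' yields every relation implied by the stated ones.

```python
def transitive_closure(pairs):
    """Close a set of ranked pairs (j, k) under the transitivity rule (3)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (j, k) in list(closure):
            for (k2, l) in list(closure):
                if k == k2 and (j, l) not in closure:
                    closure.add((j, l))
                    changed = True
    return closure

Jp = {(1, 2), (2, 3), (4, 5)}             # three given ranked relations
print(sorted(transitive_closure(Jp)))     # (1, 3) is implied by transitivity
```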

3. Ranked linear transformations

Let us consider a linear transformation of the feature vectors xj[n] (xj[n] ∈ F[n]) onto a line:

(∀ j ∈ {1,…, m}) yj = w[n]Txj[n] (5)

where w[n] = [w1,…, wn]T is the parameter (weight) vector.

We consider the problem of how to design such transformations (the ranked line y = w[n]Tx[n] (5)) which preserve the relations xj ≺ xk (4) in the best way.

Definition 1: The family of ranked relations 'xj ≺ xk' (4) defines the sequential pattern P of the vectors xj in the feature space F[n] (xj ∈ F[n]) if and only if there exists such an n-dimensional weight vector w′ that the below implication takes place:

(∀ (j, k) ∈ Jp) xj ≺ xk ⇒ (w′)Txj < (w′)Txk (6)

The procedure of discovering the sequential pattern P and designing the ranked line y = (w′)Tx can be based on the concept of linear separability of the set R of the differential vectors rjk = (xk − xj), where (j, k) ∈ Jp:

R = {rjk = (xk − xj): (j, k) ∈ Jp} (7)

Definition 2: The set R (7) is linearly separable in the n-dimensional feature space F[n] if and only if there exists such a weight vector w′ that the below inequalities hold:

(∃ w′) (∀ rjk ∈ R) (w′)Trjk > 0 (8)

The weight vector w′ defines the hyperplane H(w′) in the feature space:

H(w′) = {x: (w′)Tx = 0} (9)

The hyperplane H(w′) passes through the point 0 of the feature space. If the inequalities (8) hold, then the hyperplane H(w′) separates the set R (7). It means that all the elements rjk of the set R (7) are located on the positive side of the hyperplane H(w′).
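Checking the separability inequalities (8) for a candidate weight vector is straightforward; the sketch below (toy vectors, hypothetical helper `separates`) builds the differential vectors rjk = xk − xj and verifies that a given w′ places them all on the positive side of H(w′):

```python
def separates(w, R):
    """Check the inequalities (8): (w)^T r > 0 for every differential vector r."""
    return all(sum(wi * ri for wi, ri in zip(w, r)) > 0.0 for r in R)

x = {1: [0.0, 0.0], 2: [1.0, 0.5], 3: [2.0, 2.0]}   # vectors with x1 < x2 < x3
Jp = [(1, 2), (2, 3)]
R = [[xk - xj for xj, xk in zip(x[j], x[k])] for (j, k) in Jp]  # r_jk = x_k - x_j
print(separates([1.0, 0.0], R))    # -> True: this w' preserves both relations
print(separates([-1.0, 0.0], R))   # -> False: reversed weights break them
```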

Lemma 1: The family of ranked relations (4) defines the sequential pattern P (6) in the n-dimensional feature space F[n] (Def. 1) if and only if the set R (7) is linearly separable (8) in this space.

Proof: If the hyperplane H(w′) (9) separates (8) the set R (7), then the below inequalities hold (4):

(∀ rjk ∈ R) (w′)Txk > (w′)Txj (10)

As a result, the implication (6) is true. On the other hand, if the implication (6) takes place, then (8):

((∀ (j, k) ∈ Jp) (w′)Txj < (w′)Txk) ⇒ ((∀ rjk ∈ R) (w′)Trjk > 0) (11)

4. Dimension equalization in pairs of feature vectors with missing values

In accordance with Lemma 1, the linear separability (8) of the set R (7) in the complete feature space F[n] allows for designing an entirely ranked (6) transformation y = (w′)Tx. Now we will explore the possibility of designing ranked transformations on the basis of relations 'xj[nj] ≺ xk[nk]' (4) between feature vectors xj[nj] (xj[nj] ∈ Fj[nj]) and xk[nk] (xk[nk] ∈ Fk[nk]) from different feature spaces Fj[nj] and Fk[nk].

For this purpose, it is useful to define the set R (7) differently. The differences rjk = (xk − xj) of the feature vectors xj[nj] (xj[nj] ∈ Fj[nj]) and xk[nk] (xk[nk] ∈ Fk[nk]) can only be defined for vectors belonging to the same feature space. We will distinguish two types E− and E+ of feature space equalization:

Type E−: The equalized feature space Fj,k−[nj,k] for the relation 'xj[nj] ≺ xk[nk]' is defined as the intersection of the spaces Fj[nj] and Fk[nk]:

Fj,k−[nj,k] = Fj[nj] ∩ Fk[nk] (12)

If the space Fj[nj] is equal to Fk[nk], then the equalized feature space Fj,k−[nj,k] is also equal to Fk[nk]. If the space Fj[nj] is disjoint from Fk[nk], then the equalized space Fj,k−[nj,k] is empty.

Type E+: The equalized feature space Fj,k+[nj,k] is defined as the sum of the spaces Fj[nj] and Fk[nk]:

Fj,k+[nj,k] = Fj[nj] ∪ Fk[nk] (13)

In this case it is necessary to define the missing values xj,i or xk,i for those features xi which belong to only one of the feature spaces Fj[nj] or Fk[nk]:

if (i ∈ Ij) and (i ∉ Ik), then xk,i = ck,i

if (i ∉ Ij) and (i ∈ Ik), then xj,i = cj,i (14)

where Ij is the set of feature indices i of the vector xj[nj] (1), and ck,i is the value assigned to the missing value xk,i. In the simplest case, all missing values can be set to zero:

if (i ∈ Ij) and (i ∉ Ik), then xk,i = 0

if (i ∉ Ij) and (i ∈ Ik), then xj,i = 0 (15)

The rules (14) or (15) allow one to equalize the feature spaces Fj[nj] and Fk[nk] related to each ranked relation 'xj[nj] ≺ xk[nk]' (4), where xj[nj] ∈ Fj[nj] and xk[nk] ∈ Fk[nk]. Equalization of the feature spaces makes it possible to compute the differential vectors rj,k[nj,k] = (xk~[nj,k] − xj~[nj,k]) (7), where xj~[nj,k], xk~[nj,k] ∈ Fj,k−[nj,k] (12) or xj~[nj,k], xk~[nj,k] ∈ Fj,k+[nj,k] (13).

Let us remark that the equalization of the Type E− (12) related to the relation 'xj[nj] ≺ xk[nk]' means reducing some features xi, but without introducing artificial values for any feature. During the equalization of the Type E+ (13), no value of any feature xi is lost, but artificial values may be introduced for some features. In general, introducing artificial values can be a source of bias in the ranked models.
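Both equalization types can be sketched together. In the toy code below (hypothetical representation: each vector is a dict mapping feature index to value, so the index set Ij is simply the dict's key set), Type E− keeps the shared features and Type E+ takes the union, filling missing values with a constant as in rule (15):

```python
def equalize(xj, xk, kind="intersection", fill=0.0):
    """Equalize two vectors with different index sets.

    kind="intersection" implements Type E- (12): shared features only.
    kind="union" implements Type E+ (13) with rule (15): missing values -> fill.
    """
    if kind == "intersection":
        idx = sorted(xj.keys() & xk.keys())
        return [xj[i] for i in idx], [xk[i] for i in idx]
    idx = sorted(xj.keys() | xk.keys())
    return ([xj.get(i, fill) for i in idx],
            [xk.get(i, fill) for i in idx])

xj = {1: 0.5, 2: 1.0}                     # I_j = {1, 2}
xk = {1: 0.8, 2: 1.5, 3: 2.0}             # I_k = {1, 2, 3}
print(equalize(xj, xk, "intersection"))   # -> ([0.5, 1.0], [0.8, 1.5])
print(equalize(xj, xk, "union"))          # -> ([0.5, 1.0, 0.0], [0.8, 1.5, 2.0])
```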

5. Ranked transformations of equalized and enlarged feature vectors

The difference rj,k[nj,k] = (xk~[nj,k] − xj~[nj,k]) of the equalized feature vectors xk~[nj,k] and xj~[nj,k] can be defined for each ranked relation Oj ≺ Ok (2) by using the rule (12) or (13):

(∀ (j, k) ∈ Jp) Oj ≺ Ok ⇒ rj,k[nj,k] = xk~[nj,k] − xj~[nj,k] (16)

The differential vector rj,k[nj,k] (16) belongs to the feature space Fj,k−[nj,k] (12) or to the space Fj,k+[nj,k] (13). The dimension nj,k of the vector rj,k[nj,k] (16) depends on the type of equalization of the feature spaces Fj[nj] and Fk[nk].

In order to design the ranked transformation (6), each differential vector rj,k[nj,k] (16) is enlarged to the full n-dimensional vector rj,k^[n], where rj,k^[n] ∈ F[n] and Fj,k−[nj,k] ⊆ Fj,k+[nj,k] ⊆ F[n]. The enlargement of the vector rj,k[nj,k] (16) to rj,k^[n] is done by putting zero values for all those components of the vector rj,k^[n] which are not represented in the vector rj,k[nj,k] (zero-enlargement).
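The zero-enlargement step can be sketched as follows (hypothetical data; `zero_enlarge` is an illustrative helper): the reduced differential vector is written into a length-n vector of zeros at the positions given by its index set.

```python
def zero_enlarge(r_reduced, index_set, n):
    """Lift r_jk[n_jk] into F[n] by placing zeros at unrepresented features
    (1-based indices, as in the paper's notation)."""
    r_full = [0.0] * n
    for value, i in zip(r_reduced, sorted(index_set)):
        r_full[i - 1] = value
    return r_full

r_jk = [0.3, -0.2]                        # differential vector, features {1, 3}
print(zero_enlarge(r_jk, {1, 3}, n=4))    # -> [0.3, 0.0, -0.2, 0.0]
```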

The set R^ (7) is now defined on the enlarged vectors rj,k^[n]:

R^[n] = {rj,k^[n]: (j, k) ∈ Jp} (17)

where Jp is the set of those pairs of indices (j, k) for which the ranked relation Oj ≺ Ok (2) holds for the objects Oj and Ok.

We will examine the possibility of representing (6) on the line (5) the ranked relations 'xj^[n] ≺ xk^[n]' between the enlarged vectors xj^[n] and xk^[n]:

(∃ w′[n]) (∀ (j, k) ∈ Jp)

xj^[n] ≺ xk^[n] ⇒ (w′[n])Txj^[n] < (w′[n])Txk^[n] (18)

In accordance with Lemma 1, linear separability (8) of the set R^[n] (17) is the necessary and sufficient condition for the implication (18).

Lemma 2: If the set R^[n] (17) of the enlarged vectors is linearly separable (8) in the feature space F[n], then the implication (18) also holds for the vectors with equalized dimension nj,k:

(∃ w′[n]) (∀ (j, k) ∈ Jp)

xj^[n] ≺ xk^[n] ⇒ (wj,k′[nj,k])Txj~[nj,k] < (wj,k′[nj,k])Txk~[nj,k] (19)

where xj~[nj,k] ∈ Fj,k−[nj,k] (12) or xj~[nj,k] ∈ Fj,k+[nj,k] (13), and the parameter vector wj,k′[nj,k] is obtained from the vector w′[n] = [w1,…, wn]T (18) by reducing those components wi which are not represented by features xi in the equalized vector xj~[nj,k].

Proof: For each ranked relation 'Oj ≺ Ok' (2), the equalized feature vectors xj~[nj,k] and xk~[nj,k] are enlarged to the vectors xj^[n] and xk^[n] by including components equal to zero. As a result, the below equalities hold:

(∀ (j, k) ∈ Jp)

(w′[n])Txj^[n] = (wj,k′[nj,k])Txj~[nj,k]

(w′[n])Txk^[n] = (wj,k′[nj,k])Txk~[nj,k] (20)

The implication (19) results from these equalities.

In accordance with Lemma 2, if the ranked relations 'xj^[n] ≺ xk^[n]' ((j, k) ∈ Jp) (4) form the sequential pattern P (6) of the enlarged vectors xj^[n], then the relations 'xj~[nj,k] ≺ xk~[nj,k]' form the pattern P of the equalized vectors xj~[nj,k].

Let us remark that a given feature vector xj[nj] (xj[nj] ∈ Fj[nj]) can be equalized and enlarged in different manners, depending on the ranked relation Oj ≺ Ok (2) considered and on the type of equalization (Type E− (12) or Type E+ (13)).

The implication (18) in the thesis of Lemma 2 is fulfilled both for the Type E− (12) and for the Type E+ (13) equalization. The Type E− (12) equalization of the vectors xj[nj] and xk[nk] (Oj ≺ Ok) does not introduce bias into the ranked model, but a significant amount of information can be lost as a result of reducing features xi. The Type E+ (13) dimension equalization preserves the information contained in all measured features xi, but bias is introduced as a result of the artificial values ck,i (14).

6. Convex and piecewise linear (CPL) criterion function defined on feature vectors with varied dimensionality

Let us define the penalty function φjk(w[n]) for each element (j, k) of the set Jp (2):

(∀ (j, k) ∈ Jp)

φjk(w[n]) = 1 − (wj,k[nj,k])Trj,k[nj,k] if (wj,k[nj,k])Trj,k[nj,k] ≤ 1

φjk(w[n]) = 0 if (wj,k[nj,k])Trj,k[nj,k] > 1 (21)

where rj,k[nj,k] = (xk~[nj,k] − xj~[nj,k]) is the difference of the equalized vectors xk~[nj,k] and xj~[nj,k], and wj,k[nj,k] is the parameter vector obtained from the vector w[n] = [w1,…, wn]T by reducing those components wi which are not represented by features xi in the vector rj,k[nj,k].
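The penalty function (21) for a single ranked pair translates directly into code (toy vectors; the reduction of w[n] to wj,k[nj,k] is assumed to have been done already):

```python
def penalty(w_reduced, r_reduced):
    """Penalty function (21) for one ranked pair: a hinge on the margin
    (w_jk)^T r_jk with threshold 1."""
    margin = sum(wi * ri for wi, ri in zip(w_reduced, r_reduced))
    return 1.0 - margin if margin <= 1.0 else 0.0

print(penalty([1.0, 0.5], [2.0, 1.0]))   # margin 2.5 > 1   ->  0.0
print(penalty([0.2, 0.1], [1.0, 1.0]))   # margin 0.3 <= 1  ->  0.7
```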

The criterion function Φ(w[n]) is the weighted sum of the penalty functions φjk(w[n]):

Φ(w[n]) = Σ(j,k)∈Jp γjk φjk(w[n]) (22)

where γjk (γjk > 0) is a positive parameter (price) related to the ranked relation 'Oj ≺ Ok'.

The function Φ(w[n]) (22) has a structure similar to the perceptron criterion function used in the theory of neural networks and pattern recognition [2], [5]. The criterion function Φ(w[n]) (22) is convex and piecewise linear (CPL) as a sum of penalty functions φjk(w[n]) (21) of this type. The basis exchange algorithms, similar to linear programming, allow one to find the minimum of such a function efficiently, even in the case of large multidimensional data sets [6]:

Φ* = Φ(w*[n]) = minw Φ(w[n]) ≥ 0 (23)
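The paper's basis exchange algorithms [6] are specialized; as a self-contained substitute, the sketch below minimizes the same CPL criterion (22) (with all prices γjk = 1) by plain subgradient descent on toy differential vectors. This is only an illustration of the criterion, not the authors' optimizer:

```python
def cpl_value(w, R):
    """CPL criterion (22) with unit prices: sum of hinge penalties (21)."""
    return sum(max(0.0, 1.0 - sum(wi * ri for wi, ri in zip(w, r))) for r in R)

def cpl_subgradient_descent(R, n, steps=2000, lr=0.05):
    """Minimize the CPL criterion by subgradient descent (illustrative only;
    the paper uses basis exchange algorithms [6])."""
    w = [0.0] * n
    for _ in range(steps):
        grad = [0.0] * n
        for r in R:
            if sum(wi * ri for wi, ri in zip(w, r)) <= 1.0:
                for i, ri in enumerate(r):
                    grad[i] -= ri          # subgradient of an active penalty
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

R = [[1.0, 0.5], [0.5, 1.0], [1.0, 1.0]]   # zero-enlarged differential vectors
w_star = cpl_subgradient_descent(R, n=2)
print(cpl_value(w_star, R))                 # close to 0: relations preserved
```

Because this toy set R is linearly separable, the criterion reaches its minimal value zero, in agreement with Lemma 3 below.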

Lemma 3: The minimal value Φ(w*[n]) (23) of the criterion function Φ(w[n]) is equal to zero if and only if the linear transformation y = (w*[n])Tx[n] preserves (19) all the ranked relations 'xj~[nj,k] ≺ xk~[nj,k]' between the equalized vectors xj~[nj,k] and xk~[nj,k].

Proof: In accordance with Lemma 2, the ranked relations 'xj~[nj,k] ≺ xk~[nj,k]' between equalized vectors are equivalent (20) to the ranked relations 'xj^[n] ≺ xk^[n]' between the enlarged vectors xj^[n] and xk^[n]. We infer on the basis of Lemma 1 that the ranked relations xj^[n] ≺ xk^[n] ((j, k) ∈ Jp (2)) (4) form the sequential pattern P (6) in the n-dimensional feature space F[n] (Def. 1) if and only if the set R^[n] (17) is linearly separable (8). In this case there exists such a weight vector w′[n] that the hyperplane H(w′[n]) (9) separates (8) the set R^[n] (17), and all the ranked relations 'xj^[n] ≺ xk^[n]' ((j, k) ∈ Jp (2)) are preserved (18) by the model yj = (w′[n])Txj^[n]. By taking an adequately large constant c (c > 0), we can assure that all the inequalities (cwj,k′[nj,k])Trj,k[nj,k] > 1 (21) are fulfilled. As a result, all the penalty functions φjk(cw′[n]) (21) are equal to zero at the point cw′[n], and the value of the criterion function Φ(cw′[n]) is also equal to zero.

On the other hand, if the minimal value Φ(w*[n]) (23) is greater than zero (Φ(w*[n]) > 0), then there exists at least one penalty function (21) with a value φjk(w*[n]) greater than zero at the optimal point w*[n] (23). This means that the set R^[n] (17) is not linearly separable (8), and not all ranked relations xj^[n] ≺ xk^[n] are preserved by the model yj = (w*[n])Txj^[n].

In accordance with Lemma 3, if there is no possibility of preserving all the ranked relations 'xj~[nj,k] ≺ xk~[nj,k]' (19) by any linear transformation y = w[n]Tx[n], then Φ* > 0 (23). The linear transformation y = (w*[n])Tx[n] defined by the optimal vector w*[n] (23) is called the ranked model.

7. Example: gradual enlargement of the feature space related to the causal sequence of liver diseases

A causal sequence of events can also provide the basis for ranked modeling. An example of such events is given by modeling the causal sequence of chronic liver diseases ωk (k = 1,…, K):

ω1 → ω2 → … → ωK (24)

The symbol 'ωi → ωi+1' in the above sequence means that the disease ωi+1 of a given patient Oj resulted from his earlier disease ωi, or that ωi+1 is a consequence of the disease ωi (i = 1,…, K − 1). The sequence (24) should be formed in accordance with medical knowledge [3].

The ranked model of the causal sequence (24) was built with the use of the Hepar system database [3]. About 800 feature vectors xj(k) describing particular patients Oj(k) related to one of seven (K = 7) chronic liver diseases ωk have been extracted from this database: ω1 - non-hepatitis patients; ω2 - hepatitis acuta; ω3 - hepatitis persistens; ω4 - hepatitis chronica activa; ω5 - cirrhosis hepatis compensata; ω6 - cirrhosis decompensata; ω7 - carcinoma hepatis. The sets of feature vectors (examples, prototypes) xj(k) related to particular diseases ωk formed the so-called learning sets Ck:

(∀ k ∈ {1,…, K}) Ck = {xj(k): j ∈ Jk} (25)

where Jk is the set of mk indices j of those feature vectors xj(k) which are related to the disease ωk: m1 = 16; m2 = 8; m3 = 44; m4 = 95; m5 = 38; m6 = 60; m7 = 11.

The feature vectors xj(k) in the database of the Hepar system are of a mixed, qualitative-quantitative type. They contain both symptoms and signs (xi ∈ {0, 1}) as well as the numerical results of laboratory tests (xi ∈ R). About 200 different features xi describe one patient's case in this system. For the purpose of preliminary computations, each patient has been described by the feature vector xj(k) composed of about 40 features chosen as a standard by medical doctors.

The causal sequence (24) also allows one to determine the ranked relations 'xj(k) ≺ xj′(k′)' between the feature vectors xj(k) (xj(k) ∈ Ck) representing patients Oj(k) assigned to particular diseases ωk:

(∀ k, k′ ∈ {1,…, K}) (∀ xj(k) ∈ Ck) and (∀ xj′(k′) ∈ Ck′)

if k < k′, then xj(k) ≺ xj′(k′) (26)

Let us remark that, in accordance with the above rule, there are no ranked relations 'xj(k) ≺ xj′(k)' between patients Oj(k) and Oj′(k) assigned to the same disease ωk.

The causal sequence (24) represents the process of liver disease development and transformation from the mildest to the most serious state. It is natural to assume that patients with more serious diseases should be examined more extensively than patients with mild diseases. Based on this, the following scheme of gradual enlargement of the feature spaces Fk[nk], consistent with the sequence (24), is assumed here:

F1[n1] ⊆ F2[n2] ⊆ … ⊆ FK[nK] = F[n] (27)

where Fk[nk] is the nk-dimensional feature space appropriate (standard) for the k-th disease ωk.

Let us remark that in all ranked relations 'xj(k) ≺ xj′(k′)' consistent with the rules (26) and (27), the vector xj′(k′) is represented by all the features xi of the vector xj(k) and possibly also by some other features:

xj(k) ≺ xj′(k′) ⇒ Fk[nk] ⊆ Fk′[nk′] (28)

The Type E− (12) equalization of the vectors xj(k) and xj′(k′) is recommended for the case (28). The Type E− (12) dimensionality equalization allows one to define the differential vectors rj,j′[nk] for all the relations 'xj(k) ≺ xj′(k′)' (26) in the below manner:

if xj(k) ≺ xj′(k′), then rj,j′[nk] = xj′~[nk] − xj~[nk] (29)

where xj~[nk] = xj(k), and xj′~[nk] is obtained from the vector xj′(k′) by neglecting those features xi which are not represented in the vector xj(k) (i ∉ Ik). In the case (29) the equalized feature space is equal to Fk[nk] (28), with dimension nk.

The rule (29) allows one to define the penalty function φj,j′(w[n]) (21) for each ranked relation xj(k) ≺ xj′(k′) (26). The penalty functions φj,j′(w[n]) (21) are defined through the enlargement of the vector rj,j′[nk] (29) to the vector rj,j′^[n] in the n-dimensional space F[n]. This enlargement is done by putting zero values for all those components of the vector rj,j′^[n] which are not represented in the vector rj,j′[nk]. As a consequence, the ranked linear model can be defined by the optimal vector w*[n] (23) constituting the minimum of the criterion function Φ(w[n]) (22) in the n-dimensional feature space F[n]:

yj(k) = (w*[n])Txj^(k) (30)

where the enlarged vector xj^(k) (xj^(k) ∈ F[n]) is obtained from the nk-dimensional feature vector xj(k) (xj(k) ∈ Fk[nk]) by putting zero values (xi = 0) for all those components xi of the vector xj^(k) which are not represented in the vector xj(k).

The ranked model (30) can be used, among others, for prognosis or classification purposes. Let us consider the n0-dimensional vector x0 (x0 ∈ F0[n0]) representing a patient O0 with an unknown disease ωk (k = 1,…, K). The ranked model (30) allows one to assign the point y0 to the vector x0:

y0 = (w*[n])Tx0^ (31)

where x0^ is the enlarged vector obtained from the n0-dimensional vector x0.

The K-nearest neighbors (K-NN) decision rule can be used on the ranked model (30) for assigning some disease ωk(0) to the patient O0 represented by the vector x0. For this purpose we select the disease ωk(0) which is most strongly represented among the K points yj(k) (30) nearest to y0 (31). The points yj(k) (30) nearest to y0 (31) are characterized by the smallest absolute differences |yj(k) − y0|.
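The K-NN rule on the ranked line reduces to nearest-neighbour search among scalars. A minimal sketch (toy projected points yj(k); in practice they come from the ranked model (30)):

```python
from collections import Counter

def knn_on_line(y0, labeled_points, K=3):
    """Assign to y0 the label most frequent among the K points y_j(k)
    with the smallest |y_j(k) - y0| (the paper's K-NN rule)."""
    nearest = sorted(labeled_points, key=lambda p: abs(p[0] - y0))[:K]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

points = [(0.1, 1), (0.2, 1), (0.9, 2), (1.1, 2), (2.0, 3)]  # (y_j(k), disease k)
print(knn_on_line(0.95, points))   # -> 2
```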

The K-nearest neighbors (K-NN) decision rule based on the points yj(k) (30) and y0 (31) can generally be applied to feature vectors with missing values xj[nj] (1), and not only to the vectors xj(k) (28) from the successively enlarged feature spaces Fk[nk] (27).

8. Concluding remarks

The ranked linear models (30) can be designed on the basis of additional knowledge in the form of ranked relations Oj ≺ Ok (2), which are determined within selected pairs of objects (patients) Oj. The objects Oj can be represented by feature vectors xj[nj] (1) taken from varied feature spaces Fj[nj] (xj[nj] ∈ Fj[nj]). Variability of the feature spaces Fj[nj] may result first of all from missing values in the feature vectors xj[nj]. The ranked relations Oj ≺ Ok then hold between objects Oj and Ok represented by the feature vectors xj[nj] and xk[nk]. In some cases the relations 'xj[nj] ≺ xk[nk]' can be well preserved (6) by the ranked model in the form of the linear transformation y = w[n]Tx[n] (5) from the n-dimensional feature space F[n] (x[n] ∈ F[n]) onto a line (y ∈ R). In accordance with Lemma 1, the linear transformation y = (w′[n])Tx[n] (5) preserves (6) all the ranked relations 'xj[n] ≺ xk[n]' if and only if the set R (7) of the differential vectors rjk[n] = xk[n] − xj[n] is separated (8) by the hyperplane H(w′[n]) (9) in the n-dimensional feature space F[n]. The design of the ranked linear model (5) can be performed through minimization of the CPL criterion function Φ(w[n]) (22).

The ranked relations Oj ≺ Ok (2) can in some cases be represented by the relations xj[nj] ≺ xk[nk] (4) between feature vectors from different feature spaces Fj[nj] (xj[nj] ∈ Fj[nj]) and Fk[nk] (xk[nk] ∈ Fk[nk]). A two-stage procedure has been proposed in the paper for the purpose of designing ranked models in such a case. During the first stage, the equalization of the feature spaces Fj[nj] and Fk[nk] is carried out separately for each relation Oj ≺ Ok ((j, k) ∈ Jp (2)). The common feature space Fj,k−[nj,k] (12) or Fj,k+[nj,k] (13) with the equalized feature vectors xj~[nj,k] results from this stage. The common feature space allows one to define the differential vectors rj,k[nj,k] = xk~[nj,k] − xj~[nj,k] (16). Each differential vector rj,k[nj,k] (16) is then enlarged to the full n-dimensional vector rj,k^[n]. The enlargement of the vector rj,k[nj,k] (16) to rj,k^[n] is done by putting zero values for all those components of the vector rj,k^[n] which are not represented in the vector rj,k[nj,k] (zero-enlargement). Preservation (6) of the relations xj~[nj,k] ≺ xk~[nj,k] between the equalized feature vectors xj~[nj,k] and xk~[nj,k] by the ranked model yj(k) = (w*[n])Txj^(k) (30) has been linked to the linear separability (8) of the set R^[n] (17) composed of the enlarged vectors rj,k^[n].

The ranked models (30) designed on the basis of feature vectors xj[nj] with varied dimensionality nj can be used, among others, for the purpose of decision (diagnosis) support. The K-NN decision rule based on the point y0 (31) of unknown origin and the nearest points yj(k) (30) assigned to particular learning sets Ck (25) can be used for this purpose. Statistical properties of the ranked models (30) based on the ranked relations 'xj[nj] ≺ xk[nk]' (4) between feature vectors xj[nj] with varied dimensionality nj need further study.

Acknowledgement: The author would like to thank Professor Jan van Bemmel from Erasmus University Rotterdam for his intriguing question.

Bibliography

1. Johnson R. A., Wichern D. W.: Applied Multivariate Statistical Analysis, Prentice-Hall, Englewood Cliffs, 1991.

2. Duda R. O., Hart P. E., Stork D. G.: Pattern Classification, J. Wiley, New York, 2001.

3. Bobrowski L., Łukaszuk T., Wasyluk H.: Ranked modeling of liver diseases sequence, European Journal of Biomedical Informatics.

4. Bobrowski L.: Ranked modelling with feature selection based on the CPL criterion functions, in: Machine Learning and Data Mining in Pattern Recognition, eds. P. Perner et al., Lecture Notes in Computer Science vol. 3587, Springer Verlag, Berlin, 2005.

5. Bobrowski L.: Eksploracja danych oparta na wypukłych i odcinkowo-liniowych funkcjach kryterialnych (Data mining based on convex and piecewise linear (CPL) criterion functions) (in Polish), Białystok Technical University, 2005.

6. Bobrowski L.: Design of piecewise linear classifiers from formal neurons by some basis exchange technique, Pattern Recognition, 24(9), pp. 863-870, 1991.



This work is a part of the Polish - Romanian agreement on Scientific Cooperation between Romanian Academy and Polish Academy of Sciences. The work was partially financed by the KBN grant 3T11F01130, by the grant 16/St/2008 from the Institute of Biocybernetics and Biomedical Engineering PAS, and by the grant W/II/1/2008 from the Białystok University of Technology.


