Research

PhD Results

The topic of my PhD dissertation was "Application of Discrete Predicting Structures in an Early Warning Expert System for Financial Distress".
The PhD defence took place in December 2004.

Brief description:
The main idea of the dissertation was to use Kalman filtering to estimate the parameters of dynamical classifiers. The problem of financial distress
forecasting was used to test the developed classifier. Previous research in this field was limited to ML (machine learning) methods
that assume a static crisis process. Efficiency tests were performed to compare this approach with the existing ones.

Analysis of the data revealed strong noise, so a noise-aware method - the UKF (Unscented Kalman Filter) combined with discrete dynamic
systems - was chosen as the machine learning method. The classification accuracy achieved with this new approach was compared with
the results of other methods well known in this domain, i.e. neural networks, discriminant analysis, nearest neighbour and nearest mean.
The results of experiments conducted in different space dimensions, i.e. with different numbers of attributes, showed that taking
the dynamics into account allows one to build more efficient classifiers.

An ancillary problem undertaken in the PhD research was the reduction of the attribute space. A suitable feature selection algorithm was
proposed, in which selection is based not only on feature significance but also on the dependencies between features.
This feature selection algorithm is available as MATLAB code.

Introduction to the proposed method:
Discrete dynamical systems are widely known and used in technical problems, but their application to financial
problems is rare. Using a state-space model, the process can be modelled as:

x_{k+1} = f(x_k, u_k, v_k)
y_k = h(x_k, n_k)

where u_k are the inputs, x_k the state variables, y_k the output, and v_k and n_k are the process and observation noise respectively. This is a generic
representation of a dynamic system. The first equation is known as the process equation, the second as the observation equation.
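
As a rough illustration (not the code used in the thesis), a simple simulation of such a model with assumed linear f and h and additive Gaussian noise could look like this in MATLAB:

N = 50;                          % number of time steps
A = 0.9;  B = 0.5;  C = 1.0;     % assumed scalar system coefficients (illustrative f and h)
q = 0.01; r = 0.1;               % process / observation noise variances
u = randn(N,1);                  % arbitrary input sequence
x = zeros(N,1);  y = zeros(N,1);
for k = 1:N
    y(k) = C*x(k) + sqrt(r)*randn;                   % observation equation
    if k < N
        x(k+1) = A*x(k) + B*u(k) + sqrt(q)*randn;    % process equation
    end
end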

A fundamental problem in applying the above model is expressing both equations in an explicit functional
form. In the problem chosen as the testing environment, these functions cannot be derived from any known dependencies, so
a system identification procedure was required.
Identification (parameter estimation) was performed using the UKF (Unscented Kalman Filter). This is a nonlinear version
of the Kalman filter, based on the assumption that transforming a set of chosen points through the nonlinear system allows the
statistical moments of a random variable to be calculated more precisely than e.g. the linearization used in the EKF (Extended Kalman
Filter). This transformation of the random variable x is known as the Unscented Transformation. More information on the UKF and the UT
can be found here.
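
The unscented transformation itself can be sketched in a few lines of MATLAB. The example below propagates a 2-dimensional Gaussian through an arbitrary nonlinearity; the sigma-point weights use the common (alpha, beta, kappa) parameterization, and all numerical values are illustrative rather than the settings used in the thesis:

mx = [1; 0.5];                               % prior mean
Px = [0.04 0.01; 0.01 0.09];                 % prior covariance
g  = @(x) [x(1)^2 + x(2); sin(x(1))];        % example nonlinearity
n  = numel(mx);  alpha = 1; beta = 2; kappa = 0;   % assumed UT parameters
lambda = alpha^2*(n + kappa) - n;
S  = chol((n + lambda)*Px, 'lower');         % matrix square root
M  = repmat(mx, 1, n);
X  = [mx, M + S, M - S];                     % 2n+1 sigma points
Wm = [lambda/(n+lambda), repmat(1/(2*(n+lambda)), 1, 2*n)];
Wc = Wm;  Wc(1) = Wc(1) + (1 - alpha^2 + beta);
Y  = zeros(n, 2*n + 1);
for i = 1:2*n + 1
    Y(:,i) = g(X(:,i));                      % propagate each sigma point
end
my = Y * Wm';                                % transformed mean
Py = zeros(n);
for i = 1:2*n + 1
    d  = Y(:,i) - my;
    Py = Py + Wc(i) * (d * d');              % transformed covariance
end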

A generic form of the proposed dynamic classifier is shown in the diagram below:

The developed method was called DDS – Discrete Dynamical System, while the classifier with an open feedback loop was denoted DSS –
Discrete Static System. During the learning phase the feedback loop should be open (feedback gain equal to 0) and the parameters ought
to be estimated so as to minimize the mean squared error (an MSE classifier). The basic idea of MSE classification is shown in the picture
below (the solid line is the classification function; the constant B was set to 1 in this case):
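
As an illustration of the MSE classification idea (a simple least-squares stand-in, not the exact classifier from the thesis), one can fit a linear classification function to targets +/-B with B = 1 and classify by the sign of its output:

R = randn(100, 3);                                  % hypothetical 3 financial ratios
t = sign(R(:,1) + 0.5*R(:,2) + 0.2*randn(100,1));   % targets +/-1 (B = 1)
X = [R, ones(100,1)];                               % add a bias column
w = X \ t;                                          % least-squares weights minimize the MSE
yhat = sign(X * w);                                 % classification function output
acc  = mean(yhat == t);                             % training accuracy
fprintf('MSE classifier training accuracy: %.2f\n', acc);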

After this first learning phase, the feedback gain should be chosen to maximize the classification accuracy on the training
set, as shown in the chart below (the charts come from the experiments described later on this page):
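
A rough sketch of this second phase: the previous output is fed back as an extra term, and the feedback gain is picked by a simple grid search over training accuracy. Both the toy classifier and the search grid below are assumptions made only for illustration:

X = randn(120, 3);                                     % hypothetical input sequence
t = sign(X(:,1) + 0.4*X(:,2) + 0.2*randn(120,1));      % targets +/-1
w = [X, ones(120,1)] \ t;                              % weights from the first (open-loop) phase
gains = 0:0.05:1;                                      % candidate feedback gains
acc   = zeros(size(gains));
for g = 1:numel(gains)
    yprev = 0;  yhat = zeros(120,1);
    for k = 1:120
        s = [X(k,:), 1] * w + gains(g) * yprev;        % feed the previous output back
        yhat(k) = sign(s);
        yprev   = yhat(k);
    end
    acc(g) = mean(yhat == t);
end
[bestAcc, idx] = max(acc);
fprintf('Best feedback gain %.2f (training accuracy %.2f)\n', gains(idx), bestAcc);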

Experiments:
The problem of financial crisis forecasting was chosen to test the developed dynamical classifier. Many
classification methods had been tested on this problem in previous research, but all of them assumed a static
crisis process.

The dataset used in the experiments contained 240 financial statements (112 from bankrupt firms and 128 from existing ones), and a set of 30 financial ratios
was built as possible crisis indicators. The classification task was to decide to which group a firm belongs: bankrupt
or existing. More information about this dataset can be found here.

An ancillary problem undertaken in the PhD research was the selection of the attribute space. This selection should take into account
not only the significance of the attributes but also their inter-dependencies (desirable for most problems). This requirement
was particularly important in the presented case of the financial ratios dataset, where many of the ratios are very similar
to others due to the way they are constructed.

A suitable feature selection algorithm was proposed, in which selection is based not only on feature significance but
also on the dependencies between features. Experiments were performed to check whether reducing the space dimensionality leads to an increase in
classification accuracy. The results are shown in the chart below, where 'Significant' means features selected only from the
significance point of view, while 'With reduced dependencies' denotes the set of significant variables with reduced dependencies.

An increase in accuracy can easily be noticed in both cases, which leads to the conclusion that the proposed algorithm is effective.
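
The selection criterion can be sketched as a greedy correlation filter (a stand-in for the actual algorithm, which is available as MATLAB code): rank the features by a significance score and skip any candidate that is too strongly correlated with an already selected one. The threshold and the data below are illustrative:

X = randn(240, 30);                                   % hypothetical 30 financial ratios
X(:,5) = X(:,1) + 0.05*randn(240,1);                  % make ratio 5 nearly duplicate ratio 1
t = sign(X(:,1) - 0.5*X(:,2) + 0.3*randn(240,1));     % class labels +/-1
nF = size(X, 2);
score = zeros(nF, 1);
for j = 1:nF
    c = corrcoef(X(:,j), t);  score(j) = abs(c(1,2)); % significance score
end
[ignore, order] = sort(score, 'descend');
selected = [];
maxDep = 0.8;                                         % assumed dependency threshold
for j = order'
    dep = 0;
    for s = selected
        c = corrcoef(X(:,j), X(:,s));  dep = max(dep, abs(c(1,2)));
    end
    if dep < maxDep
        selected(end+1) = j;                          % keep the feature
    end
    if numel(selected) == 5, break; end               % stop after e.g. 5 ratios
end
disp(selected)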

The classification accuracy of the proposed DDS method was compared with the results achieved by the methods previously
applied to this problem: nearest neighbour, nearest mean, artificial neural network and discriminant analysis.
The parameters were estimated with the UKF in 200 epochs (160 cases in each epoch), while simulated annealing of the noise
covariance matrix was used to ensure that the filter converges quickly at the beginning and that the parameters do not change rapidly
later on. The chart below shows an example of parameter estimation for one of the experiments (3 inputs + 1 free parameter).
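
The annealing of the noise covariance can be sketched as a simple decay schedule across epochs; the exponential decay and the numerical values below are assumptions for illustration only:

nEpochs = 200;
Q0   = 1e-2 * eye(4);            % initial process noise covariance (3 inputs + 1 free parameter)
Qmin = 1e-6 * eye(4);            % floor so that the filter never freezes completely
decay = 0.97;                    % assumed decay rate per epoch
for epoch = 1:nEpochs
    Q = max(Q0 * decay^(epoch - 1), Qmin);   % annealed covariance: large early, small later
    % ... run the UKF over the 160 training cases with process noise Q ...
end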

Classification accuracy was checked on a test set using 3-fold cross-validation. The summary of the results shows that
DDS was the most accurate method in all cases.
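
A minimal sketch of the 3-fold cross-validation procedure (with a simple least-squares classifier standing in for DDS):

X = randn(240, 5);  t = sign(X(:,1) + 0.3*randn(240,1));   % hypothetical data and labels
fold = mod(randperm(240), 3) + 1;             % random assignment to folds 1..3
acc  = zeros(3, 1);
for f = 1:3
    tr = (fold ~= f);  te = (fold == f);      % train on two folds, test on the third
    w  = [X(tr,:), ones(sum(tr),1)] \ t(tr);  % fit the stand-in classifier
    yhat   = sign([X(te,:), ones(sum(te),1)] * w);
    acc(f) = mean(yhat == t(te));
end
fprintf('Mean cross-validation accuracy: %.3f\n', mean(acc));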

The overall results allow one to state that dynamic classification can increase accuracy. The chart below compares the
average classification accuracy of all methods.

For any questions and comments, please send me an email.
The dataset used in the experiments is available.