Fully-Connected Neural Networks

This page is an updated mirror of http://www.gorodnichy.ca/archives/phd/PINN/.

Candidate of Science (Ph.D.) Dissertation

Speciality 01.05.03 - Mathematical and software development for computer systems. Defended at the Institute of Mathematical Machines and Systems, Glushkov Cybernetics Center of the National Academy of Sciences of Ukraine, Kiev, 10 September 1997.

"Исследование и разработка высокопроизводительных полных нейросетей" , Манускрипт, 130 стр., Original dissertation. 130 pages. In Russian .

"Дослiдження та розробка високопродуктивних нейромереж" Автореферат, 10 стр., Extended Summary. 10 pages, In Ukrainian.

"Investigation and Design of High Performance Fully-Connected Neural Networks", Abstract and related publications. In English

In the manuscript we consider the problem of designing high-capacity neural networks with enhanced associative capability. Fully-connected networks of binary neurons are considered, and the pseudo-inverse learning rule is shown to be the most efficient one with respect to the memory capacity of these networks. We show that the attraction radius of the network is a function of its synaptic weight matrix. We investigate the nature of the dynamic attractors of the network and derive the factors that affect their occurrence, and we propose an approach based on the flood-fill neuroprocessing technique which efficiently detects these attractors. We introduce a modification of the pseudo-inverse rule based on partial reduction of the self-connection weights. This modification, termed desaturation, is shown both theoretically and by simulations to practically double the attraction radius of the network. In particular, we show that desaturation increases the capacity of the auto-associative memory of the network up to 80% of the number of neurons, which is two to four times better than that of other known networks of the considered type.
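
For readers unfamiliar with the rule, here is a minimal sketch (in Python/NumPy, not the dissertation's original pi_ff.c) of the desaturated pseudo-inverse network: the weight matrix is the projector onto the span of the prototypes, the self-connections are then partially reduced by a coefficient D, and retrieval iterates the binary neurons. The function names, the convention that D multiplies the diagonal, and all sizes are illustrative assumptions.

```python
import numpy as np

def pseudo_inverse_weights(prototypes):
    """Pseudo-inverse (projection) rule: W = C C^+, i.e. the orthogonal
    projector onto the subspace spanned by the prototype vectors."""
    C = np.column_stack(prototypes).astype(float)      # N x M matrix of prototypes
    return C @ np.linalg.pinv(C)

def desaturate(W, D):
    """Partial reduction of self-connections: keep only a fraction D of each
    diagonal weight (D = 1 leaves W unchanged, D = 0 removes self-connections)."""
    W = W.copy()
    np.fill_diagonal(W, D * np.diag(W))
    return W

def recall(W, state, max_iter=100):
    """Synchronous retrieval with binary (+1/-1) neurons."""
    s = state.copy()
    for _ in range(max_iter):
        s_new = np.where(W @ s >= 0, 1, -1)
        if np.array_equal(s_new, s):                    # static attractor reached
            break
        s = s_new
    return s

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, M = 100, 60                                      # 60 prototypes, 100 neurons
    prototypes = [rng.choice([-1, 1], size=N) for _ in range(M)]
    W = desaturate(pseudo_inverse_weights(prototypes), D=0.1)
    noisy = prototypes[0].copy()
    noisy[:5] *= -1                                     # corrupt 5 of 100 bits
    print("prototype recovered:", np.array_equal(recall(W, noisy), prototypes[0]))
```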

Table of Contents

  • Contents and Chapter 1. "Introduction"
  • Chapter 2. "Fully-connected Neural Networks": overview, theorem about cycles, flood-fill neuroprocessing technique
  • Chapter 3. "Pseudoinverse Learning Rule": properties, obtaining the formula for attraction radius
  • Chapter 4. "Desaturated Pseudoinverse Rule": introducing the desaturating coefficient, theory of desaturation (attraction radius, cycles, energy) and simulations
  • Bibliography
  • Appendices: code and data obtained by Monte-Carlo simulations
  • Sources:
    • Code of the program which simulates the Desaturated Pseudo-Inverse Neural Network: pi_ff.c
    • Bibliography: bibdisser.tex

Another person who defended a PhD dissertation on the dynamics of pseudo-inverse neural networks is Rolf Henkel from the Institute for Neurophysics in Bremen. His dissertation is also not in English; it is in German and can be found here.

This research received the "Best Presentation" award at the International Joint Conference on Neural Networks in 1999 (IJCNN'99).
It was also showcased at the USA-NIS Neurocomputing Opportunities Workshop (NOW'99), which was organized jointly with the conference. For the first time, the research of leading neuroscientists from the former USSR was presented to an international audience in all its diversity. I was one of the workshop organizers; here are some pictures and names from this workshop.
See also:
- Adaptive Learning Neural Networks and Reinforcement Learning for Autonomous Robot Navigation (done at UofA with my advisor William W. Armstrong)
- Applying Fully-connected Neural Networks to Face Recognition in Video (done at NRC)

Publications

"The Optimal Value of Self-connection", (Dmitry O. Gorodnichy), Proc. of IJCNN'99, Washington, July 12-17, 1999. "Best presentation" award (at IJCNN'99).

Abstract: The fact that reducing self-connections improves the performance of autoassociative networks built with the pseudo-inverse learning rule has been known for quite a while, but has not yet been studied completely. In particular, it is known that decreasing the self-connection increases the direct attraction radius of the network, but it is also known that it increases the number of spurious dynamic attractors. Thus, it has been concluded that the optimal value of the coefficient of self-connection reduction D lies somewhere in the range (0; 0.5). This paper gives an explicit answer to the question of what the optimal value of the self-connection reduction is. It shows how the indirect attraction radius increases as D decreases. A summary of the results pertaining to the phenomenon is presented.

https://www.researchgate.net/publication/2395474_The_Optimal_Value_of_Self-connection

See also: one-pager, slides.
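
As a rough illustration of the kind of experiment behind such conclusions, the sketch below sweeps the reduction coefficient and counts how often a stored prototype is recovered from a fixed amount of noise. The sizes, the noise level, and the convention that D multiplies the diagonal of the projection matrix are assumptions made here for illustration; this is not the paper's experimental protocol.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, FLIPS, TRIALS = 64, 32, 8, 20            # illustrative sizes only

def recovered(W, prototype, flips):
    """Corrupt `flips` bits and run synchronous updates until a fixed point
    or a 2-cycle (dynamic attractor) is reached."""
    s = prototype.copy()
    s[rng.choice(N, size=flips, replace=False)] *= -1
    prev = None
    for _ in range(100):
        s_new = np.where(W @ s >= 0, 1, -1)
        if np.array_equal(s_new, s) or (prev is not None and np.array_equal(s_new, prev)):
            break
        prev, s = s, s_new
    return np.array_equal(s, prototype)

for D in (1.0, 0.5, 0.25, 0.1, 0.0):           # D = 1 is the unmodified pseudo-inverse rule
    hits = 0
    for _ in range(TRIALS):
        P = [rng.choice([-1, 1], size=N) for _ in range(M)]
        C = np.column_stack(P).astype(float)
        W = C @ np.linalg.pinv(C)
        np.fill_diagonal(W, D * np.diag(W))    # reduce self-connections
        hits += recovered(W, P[0], FLIPS)
    print(f"D = {D:4.2f}: {hits}/{TRIALS} successful retrievals")
```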

"Designing High-Capacity Neural Networks for Storing, Retrieving and Forgetting Patterns Real-Time" (D.O. Gorodnichy, A.M. Reznik), Presentnation at IJCNN'99 NOW workshop

Abstract: In designing neural networks for pattern recognition, the most challenging problems are the following.

        • 1) How to train a network so that a) it can retrieve as many patterns as possible, and b) it can retrieve them from as much noise as possible;
        • 2) How to make learning fast, so that patterns can be stored on-line;
        • 3) How to make retrieval fast;
        • 4) How to get rid of useless data, i.e. how to continuously update the memory as new data arrive.

The solutions to these problems were found at the Institute of Mathematical Machines and Systems of the Ukrainian National Academy of Sciences, where a neurocomputer capable of storing and retrieving data in real time was designed. The neurocomputer uses a non-iterative learning technique based on the Desaturated Pseudo-Inverse rule. This technique allows one to store, in real time, up to 0.8N patterns (as attractors with non-zero attraction basins), where N is the size of the neural network. When the number of patterns exceeds the capacity of the network, the Dynamic Desaturation rule is applied. This rule allows the neurocomputer to store patterns partially and also to remove obsolete data from memory. In retrieval, the Update Flow neuroprocessing technique is used. This technique is known to be very efficient for neural networks which evolve in time. It also automatically detects spurious dynamic attractors.

In the talk, we will describe in detail each technique contributing to the success of the project. The emphasis will be on the non-iterative learning techniques, which provide a valid alternative to the conventional time-consuming iterative learning methods.

Slides for the talk in PowerPoint and in PostScript.

See also:"Non-iterative learning rules for neural networks" (By A.M.Reznik) at the same conference.

"Static and Dynamic Attractors of Autoassociative Neural Networks" (D.O. Gorodnichy, A.M. Reznik) Lecture Notes in Computer Science, Vol 1311 (Proc. of 9th Intern. Conf. on Image Analysis and Processing (ICIAP'97), Florence, Italy, Sept. 1997, Vol. II), pp. 238-245, Springer

Abstract: In this paper we study the problem of the occurrence of cycles in autoassociative neural networks. We call these cycles dynamic attractors, show when and why they occur, and show how they can be identified. Of particular interest is the pseudo-inverse network with reduced self-connection. We prove that it has dynamic attractors, which occur with a probability proportional to the number of prototypes and the degree of weight reduction. We show how to predict and avoid them.

Poster ("Neural Networks for Pattern Recognition and Computer Vision"): attractors_slides.ps

https://www.researchgate.net/publication/221356634_Static_and_Dynamic_Attractors_of_Auto-associative_Neural_Networks
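
For retrieval code, the practical consequence is simple: with a symmetric weight matrix, synchronous updates end either in a fixed point (static attractor) or in a cycle of length two (dynamic attractor), so a cycle can be detected by comparing against the two previous states. The sketch below is such a detector, under the assumptions of bipolar neurons and synchronous updates; it is not the prediction method of the paper.

```python
import numpy as np

def run(W, s0, max_iter=100):
    """Synchronous retrieval; returns (state, "fixed" | "cycle" | "running").

    For symmetric W, parallel dynamics settle into a fixed point or a
    2-cycle, so keeping one previous state is enough to detect the cycle."""
    prev, s = None, s0.copy()
    for _ in range(max_iter):
        s_new = np.where(W @ s >= 0, 1, -1)
        if np.array_equal(s_new, s):
            return s_new, "fixed"
        if prev is not None and np.array_equal(s_new, prev):
            return s_new, "cycle"              # state oscillates between s and prev
        prev, s = s, s_new
    return s, "running"
```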

"Increasing Attraction of Pseudo-Inverse Autoassociative Networks" (D.O. Gorodnichy, A.M. Reznik), Neural Processing Letters, volume 5, issue 2, pp. 123-127, 1997, Kluwer Academic Publishers

Abstract: We show how partial reduction of the self-connections of a network designed with the pseudo-inverse learning rule increases the direct attraction radius of the network. A theoretical formula is obtained, and data obtained by simulation are presented.

https://www.researchgate.net/publication/220578247_Increasing_Attraction_of_Pseudo-Inverse_Autoassociative_Networks
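
A rough Monte-Carlo estimate of the direct attraction radius, in the spirit of the simulation data mentioned above: the largest number of corrupted bits from which a stored prototype is recovered in every trial. The network size, the loading, the value D = 0.1 and the trial count are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, D, TRIALS = 64, 32, 0.1, 30                   # illustrative parameters

P = [rng.choice([-1, 1], size=N) for _ in range(M)]
C = np.column_stack(P).astype(float)
W = C @ np.linalg.pinv(C)
np.fill_diagonal(W, D * np.diag(W))                 # desaturated pseudo-inverse matrix

def recovers(p, flips):
    """Corrupt `flips` bits of prototype p and check whether retrieval restores it."""
    s = p.copy()
    s[rng.choice(N, size=flips, replace=False)] *= -1
    prev = None
    for _ in range(100):
        s_new = np.where(W @ s >= 0, 1, -1)
        if np.array_equal(s_new, s) or (prev is not None and np.array_equal(s_new, prev)):
            break
        prev, s = s, s_new
    return np.array_equal(s, p)

radius = 0
for flips in range(1, N // 2):                      # increase the noise until retrieval first fails
    if all(recovers(P[t % M], flips) for t in range(TRIALS)):
        radius = flips
    else:
        break
print(f"estimated direct attraction radius: {radius} of {N} bits")
```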

"Desaturating Coefficient for Projection Learning Rule" (Dmitry O. Gorodnichy), Lecture Notes in Computer Science, Vol. 1112 (Proc. of Intern. Conf. on Artificial Neural Networks (ICANN'96), Bochum, Germany, July 1996), pp.469-476, Springer.

Abstract: A Hopfield-like neural network designed with the projection learning rule is considered. The relationship between the weight values and the number of prototypes is obtained. A coefficient of self-connection reduction, termed the desaturating coefficient, is introduced, and a technique which allows the network to exhibit complete error correction for learning ratios up to 75% is suggested. The paper presents experimental data and provides theoretical background explaining the results.

https://www.researchgate.net/publication/221079730_Desaturating_Coefficient_for_Projection_Learning_Rule

"The Influence of Self-Connection on the Performance of Pseudo-Inverse Autoassociative Networks" (Dmitry Gorodnichy), published in the Informatika journal, July 1998, Kiev, Ukraine

Abstract: Within the last decade there has been a considerable amount of interest in pseudo-inverse autoassociative neural networks (PINNs), which are networks designed with the pseudo-inverse learning rule. This interest is attributed to their high capacity and retrieval capability: the limit of 0.5N for the associative capacity (N being the number of neurons), obtained by Personnaz et al. and Kanter and Sompolinsky [1], [2], is probably the most frequently cited result concerning these networks. Recently, though, it has been shown that the network capacity is higher than 0.5N when the reduction of self-connections is taken into account. The fact that self-connections affect the performance of the networks has been observed by many researchers, but no rigorous investigation of this phenomenon seems to have been done. In the paper we summarize the results obtained on the phenomenon and show that by partially reducing self-connections the capacity of the PINN can be increased up to 0.8N, with the attract...

https://www.researchgate.net/publication/2605454_The_Influence_of_Self-Connection_on_the_Performance_of_Pseudo-Inverse_Autoassociative_Networks

"A Way to Improve Error Correction Capability of Hopfield Associative Memory in the Case Of Saturation" (Dmitry O. Gorodnichy), HELNET International Workshop on Neural Neworks Proceedings (HELNET 94-95), Vol. I/II, pp.198-212, VU University Press, Amsterdam

Abstract: A fully connected neural network of binary neurons with self-connections is considered. The properties of such networks trained by the projection learning rule are investigated. The question of prototype attractivity is studied, especially for the case of network saturation (when $M > N/2$, where $M$ is the number of prototypes and $N$ the number of neurons). The formula for the attraction radius is derived, and the relationship between the weight coefficients and the attraction radius is obtained. It is shown that it is possible to increase the attraction radius and improve the retrieval capability of the network by introducing a coefficient of self-connection reduction, which is termed the desaturating coefficient. For example, it is demonstrated that even for the case of $M = 0.75N$ it is possible to achieve error correction by appropriately choosing the desaturating coefficient. The paper presents theoretical results and provides data obtained by simulation of the model.

PostScript file (140Kb): HELNET'95.ps

"NEUTRAM - A Transputer Based Neural Network Simulator" (Dmitry O. Gorodnichy, Alexander M. Reznik), Proc. of Second Intern. Conf. on Software for Multiprocessors and Supercomputers Theory, Practice, Experience (SMS TPE'94), Sept. 1994, pp.136-142, Moscow, Russia

Abstract: Researchers in Neural Networks (NN) often require a simulation of the model under development. Conventional computer systems often do not yield the required performance for Parallel Distributed Processing even with 100 units. The problems are that the available memory is not sufficient and the processing iteration rate is far from what is desired; it is not uncommon for such simulations to take many hours to process a simple network. NEUTRAM is a tool designed for the simulation of NNs of various sizes and configurations on a net of transputers. The currently working version can process a fully-connected NN with up to 600 neurons (i.e. up to 360,000 connections) on a net of four transputers. With its aid, a considerable amount of research into NNs (such as exploring how the behavior of a NN depends on its size, configuration and learning rule) is being carried out at the Cybernetics Center of the Ukrainian Academy of Sciences in Kiev. NEUTRAM uniformly allocates the neurons, and all data upon which the neurons depend, onto the available processors, and then processes them using the strategy of flood data processing (also called the update flow technique): only the changes at the current iteration are kept, and only those data that depend on these changes are processed. It is this strategy that allows NEUTRAM to achieve a high iteration rate. In the paper we describe it in detail. We also describe the work of NEUTRAM itself: how it organizes the interaction of the distributed units (neurons) and the data exchange among them. Our observations concerning the above are presented.
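
A toy, single-process sketch of the update-flow (flood data processing) idea described above: keep the local fields of all neurons and, when a neuron changes state, correct only the fields that depend on that change instead of recomputing every input from scratch. The names and the sequential update order are illustrative assumptions; the real NEUTRAM distributes this work over a net of transputers.

```python
import numpy as np

def recall_update_flow(W, s0, max_iter=100):
    """Retrieval that propagates only changes.

    h holds the local fields h_i = sum_j W[i, j] * s[j]. When neuron j flips,
    every field is corrected by W[:, j] * (new - old) rather than recomputed."""
    s = s0.astype(int).copy()
    h = W @ s                                   # full computation done only once
    for _ in range(max_iter):
        changed = False
        for j in range(len(s)):                 # sequential (asynchronous) sweep
            new = 1 if h[j] >= 0 else -1
            if new != s[j]:
                h += W[:, j] * (new - s[j])     # push only the change to dependent fields
                s[j] = new
                changed = True
        if not changed:                         # no neuron changed: fixed point reached
            break
    return s
```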