Outline For Neural Networking Using Artificial Intelligence

  1. Introduction to Neural Networking
    1. Real World Examples
    2. Layering of Neural Networks
  2. Diagramming of Neural Networks and Comparison to Human Intelligence
  3. WSU Research on Neural Networking and Back Propagation
    1. Training and Testing with Industrial Robots
    2. Breast Cancer Recurrence Prediction
    3. Standard and Poor's 500 Stock Market Index Prediction
  4. Assignments on Neural Networks
    1. Input/Output Layers Involving Identifiable Parameters
    2. Normalizing Data
  5. Project Involving Software - BrainMaker
    1. Tutorial - Real Estate Appraisals
    2. Self-taught Tutorial - Medical Forecasting
    3. Football Game Predictions

Introduction to Artificial Intelligence and Neural Networks

I. Artificial Intelligence (AI) is the ability of an artificial mechanism to exhibit intelligent behavior. AI programs have developed from the primitive stage to the point where they include computer programs that perform medical diagnoses, mineral prospecting, speech understanding, and vision interpretation. The term Artificial Intelligence was coined in 1956 when a group of interested scientists met for an initial summer workshop.

Early work in Artificial Intelligence consisted of attempts to simulate the neural networks of the brain with numerically modeled nerve cells. Success was very limited due to the great complexity of the problem and the primitive state of computers. Interest was revived in the 1980's and has continued into the 1990's because of advances in computer technology. Early and current systems manipulate numbers and symbols. For example, if an AI system is told "If x is a bird, then x can fly," and it then determines that a robin is a bird, it can conclude that a robin can fly.

The first knowledge-based expert program, called Dendral, was written in 1967. It could predict the structures of unknown chemical compounds based on routine analysis. More sophisticated rule-based expert systems were subsequently developed, notably the Mycin program, which uses rules derived from the medical domain to reason backwards (deduce) from a list of symptoms to a particular disease.

Some recent uses of Artificial Intelligence using neural networks include the following:

  1. Ford Motor Company is developing a neural network that reads sensory data from automobile engines and determines probable cause of existing problems.
  2. The U.S. Air Force is using a neural network to train new pilots in a simulator with examples illustrating expert pilot performance.
  3. General Dynamics Space Systems Division is using a neural network to monitor the opening and closing of valves on the Atlas Rocket. The network does this by watching fluctuations on the power bus, which is less expensive and more reliable than having sensors on 150 valves.
  4. A bomb detector uses a neural network at the TWA terminal at New York's JFK Airport.

Artificial Intelligence systems make decisions by using a "neural network." A neural network is a computer with an internal structure that imitates the human brain's interconnected system of neurons. In a neural network, transistor circuits (gates on a computer chip) are the electronic analog of neurons. Neural networks do not follow rigidly programmed rules as more conventional digital computers do. Rather, they build a knowledge base through a trial-and-error method. A programmer, for instance, will digitally input a photographic image for a neural network to identify, and the network will "guess" which circuits to "fire" (activate). Eventually it will identify the photograph and output the correct answer.

Pathways between individual circuits are "strengthened" (resistance turned down) when a task is performed correctly and "weakened" (resistance turned up) if performed incorrectly. In this way a neural network "learns" from its mistakes and gives a more accurate output with each repetition of a task. At a fundamental level, all networks learn by association. For example, a neural network can learn to identify a pumpkin by associating the inputs "large, round, orange and vegetable" with the output "pumpkin."

The neurons in a neural network are usually organized in three layers: input, hidden, and output. Sometimes more than one hidden layer is used. In Diagram 1, each circle represents a neuron. Each column of neurons is a layer and every neuron in one layer is connected to every neuron in the next layer.

The network shown in the diagram is trained to recognize vegetables from their descriptions. Information flows from the input layer through the hidden layer to the output layer. The hidden layer makes the associations between the inputs and outputs. It is called a hidden layer because it has no direct connection to the outside world. You present information to the input layer and the network gives you an answer in the output layer.

There are many different ways a network can learn. The most popular learning method is by example and repetition, also called Back Propagation. The vegetable neural network is trained by this method. Many example pairs of inputs and outputs are collected and presented to the network. Each time any input ("orange, round, large and vegetable") is presented to the network, it guesses what the output is supposed to be. When a network is brand new and has not learned anything yet, it will probably make a wrong guess.

Suppose our untrained network initially decides that a large, round, orange vegetable is a zucchini. The training example, which has the correct output, indicates the vegetable is really a pumpkin. The network compares its output to the training example's output and makes changes to its internal connections so that the next time it sees the same inputs it will be more likely to produce the correct answer. The connections adjust so that the inputs are associated more strongly with the pumpkin output and less strongly with the zucchini output. This training is repeated for a set of examples until the network learns the correct answers. Once the network is trained using pre-selected inputs and outputs, we can run it on new input information (without any supplied outputs) and have it recognize, generalize or predict the answer for us.
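The training by example and repetition described above can be sketched in a few lines of code. This is a minimal illustration, not the actual vegetable network: the 4-input encoding ("large, round, orange, vegetable"), the 3-neuron hidden layer, the learning rate and the number of repetitions are all assumptions made for the example.

```python
# Minimal back propagation sketch: a 4-input, 3-hidden, 2-output network
# learns to associate feature inputs with "pumpkin" or "zucchini" outputs.
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

N_IN, N_HID, N_OUT = 4, 3, 2
w1 = [[random.uniform(-1, 1) for _ in range(N_IN)] for _ in range(N_HID)]
w2 = [[random.uniform(-1, 1) for _ in range(N_HID)] for _ in range(N_OUT)]

def forward(x):
    # information flows input -> hidden -> output
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    o = [sigmoid(sum(w * hi for w, hi in zip(row, h))) for row in w2]
    return h, o

def train(x, target, rate=0.5):
    h, o = forward(x)
    # the error signal is fed back through the network, altering weights
    d_out = [(t - oi) * oi * (1 - oi) for t, oi in zip(target, o)]
    d_hid = [hi * (1 - hi) * sum(d_out[k] * w2[k][j] for k in range(N_OUT))
             for j, hi in enumerate(h)]
    for k in range(N_OUT):
        for j in range(N_HID):
            w2[k][j] += rate * d_out[k] * h[j]
    for j in range(N_HID):
        for i in range(N_IN):
            w1[j][i] += rate * d_hid[j] * x[i]

# example pairs: (large, round, orange, vegetable) -> pumpkin
#                (large, not round, not orange, vegetable) -> zucchini
examples = [([1, 1, 1, 1], [1, 0]),
            ([1, 0, 0, 1], [0, 1])]
for epoch in range(5000):
    for x, t in examples:
        train(x, t)

_, out = forward([1, 1, 1, 1])
print("pumpkin score %.2f, zucchini score %.2f" % (out[0], out[1]))
```

After training, presenting "large, round, orange, vegetable" produces a high pumpkin score and a low zucchini score, just as the repeated corrections described above would predict.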

Diagramming of Neural Networks and Comparison to Human Intelligence

II. The human brain is a complex biological network of billions of special cells called neurons. These neurons send information back and forth to each other through connections; the result is an intelligent being capable of learning, analysis, prediction and recognition. Artificial neural networks are formed from hundreds of thousands of simulated neurons that are connected in much the same way as the brain's neurons and are thus able to learn in a similar manner to people. Diagram 2 illustrates biological and artificial neurons.

Some early neural network systems used individual electronic devices. Now we can use a neural network simulator to test neural network theories or to make useful applications. A neural network simulator is a program (a set of computer instructions) that creates a model of neurons and the connections between them and then trains this model.

There are many types of neural networks, but all have three things in common. A neural network can be described in terms of its individual neurons, the connections between them (topology) and its learning rule. While neural networks can do some impressive things, they cannot replicate all aspects of a human brain. The brain is far too massive and complex for even a super-computer to fully simulate. There are two important types of simulation a neural network can do: modeling brain processes and modeling brain capabilities.

A brain process model tests theories about brain function. For example, the human brain is able to recognize speech, but only within a certain temperature range. If the brain temperature goes below 80 degrees or above 110 degrees, the brain is unable to recognize speech at all. Thus, a neural network that models brain processes might well include a temperature factor. The purpose of the brain capability model is to perform some of the brain's functions, though not necessarily in the same way that the brain does. Thus, a neural network that is used to model speech recognition capability would probably be designed without a temperature factor, and the inter-connections between neurons and the learning method might be simplified. Most neural networks attempt to model only brain capabilities.

The human brain is a complex network of billions of highly inter-connected cells, or neurons; each of these cells receives information from as many as 10,000 other cells. A neuron in the brain has four basic parts: the body, the incoming channel, the outgoing channel and the connecting points between the neurons, which are called synapses. These are shown in Diagram 3.

The synapses attach "weights" to incoming signals so that each of the signals will have a different effect on the neuron. A synapse can cause a signal to "turn on" (excite) or "turn off" (inhibit) the neuron. A highly excited neuron sends out an output signal, an inhibited one does not. The job of the neuron body is to add up all the incoming signals and decide if the total is enough to send out a signal. Each neuron detects and sends out only one simple thing. It is the job of the inter-connected neurons to determine such things as judging the speed of an oncoming car. Such a group of inter-connected neurons is called a neural network. Learning occurs in the brain in the form of changes to the synapses.
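The neuron body's add-up-and-decide behavior can be sketched as follows. The particular weights and threshold are illustrative assumptions, not measurements of any real neuron.

```python
# Sketch of a single neuron: synaptic weights scale each incoming signal
# (positive = excite, negative = inhibit), the body sums them, and an
# output fires only if the total reaches a threshold.
def neuron_fires(inputs, weights, threshold):
    """Return 1 if the weighted sum of inputs reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# two exciting synapses (+0.8, +0.6) and one inhibiting synapse (-0.9)
weights = [0.8, 0.6, -0.9]

print(neuron_fires([1, 1, 0], weights, threshold=1.0))  # 1.4 >= 1.0 -> fires
print(neuron_fires([1, 1, 1], weights, threshold=1.0))  # 0.5 <  1.0 -> silent
```

Note how the inhibiting synapse alone is enough to silence an otherwise highly excited neuron.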

In an artificial neural network, each neuron also receives the output signals from many other neurons. A neuron calculates its own output by finding the weighted sum of its inputs. The point where two neurons communicate is called a connection (analogous to a synapse). The strength of the connection between two neurons is called a weight. As described previously, the neurons are usually connected in three layers: input, hidden and output. An artificial neural network built with today's technology has very few connections compared to the number in the brain. The human brain has about one hundred billion neurons and ten million billion connections. NETtalk, which converts printed text into speech, has about 325 neurons and 20,000 connections. A neural network learns by the system of back propagation, in which an error signal is fed back through the network, altering weights as it goes, to prevent the same error from happening again. The network is trained by presenting it with input and output pairs. The weights are changed so that the network will eventually produce the matching output pattern when given the corresponding input pattern of the pair.

Neural networks are best known for their pattern recognition ability. If you need to recognize or classify something, in some instances a neural network can do it faster and more accurately than a person. A neural network can look at something and identify it, sometimes even with missing or invalid data. Neural networks can recognize cancer from image analysis, aircraft from radar returns and the sex of insects from wing-beat frequencies. They are not known for precision. If you ask a neural network for the sum of 2.01 and 2.02, it will probably give an answer of 4. If we wonder how smart neural networks can get, Diagram 4 shows the level of technology today. The number of neurons (as a log) is on the vertical axis and the compute speed (as a log) in connections per second is on the horizontal axis. The latest chip technology (Intel's 80170NX) has about the compute speed of a cockroach. If we project that neural network performance will double every three years, as have the performances of memory and microprocessors, we are about 120 years away from electronic devices with the same performance potential as a person. Although neural networks cannot "think" as fast, they can organize data and results into meaningful categories to an extent of which no human is capable.


WSU Research on Neural Networking and Back Propagation

III. Much of our summer research consisted of training and testing sessions on neural networks under the direction of our mentor, Zoran Obradovic, and graduate assistants Tim Chenoweth and Radu Drossu. These neural networks were prepared by following five basic steps.

  1. Define the problem. Decide what information to use and what the network will do.
  2. Decide how to represent the information and gather it.
  3. Define the network. Select network inputs and specify the outputs. Once you complete the first two steps, this step is nearly automatic.
  4. Train the network.
  5. Test the trained network. This involves presenting new inputs to the network and comparing the network's results to reality.

A. Our first task was to become familiar with the UNIX operating system and practice on a routine network with data describing the features of industrial robots. There are three standard benchmark problems for performance comparison of different learning algorithms. The input data was very simple, consisting of 16 indicators, all 0's and 1's, which identified the features of the robots. For example: large-1, small-0; lifting ability above 5kg-1, not-0; spot weld capability-1, not-0. The outputs were also 0's and 1's, which is true of many neural networks. They simply mean acceptable for a company's use or not acceptable.

We trained the network on the known data (124, 169 and 122 robots for problems 1, 2 and 3 respectively) and then tested it on new data with expected high success rates. Again, the back propagation program uses a weighted sum which it keeps changing by trial and error until it can separate data by multi-layered planes in order to categorize robots with different features into acceptable and not acceptable groups.
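The 0/1 feature encoding described above can be sketched as follows. The feature names and the example robot are illustrative assumptions; the actual benchmark's 16 indicators are not reproduced here.

```python
# Sketch of encoding a robot's features as a 0/1 input vector for the network.
FEATURES = ["large", "lifts_over_5kg", "spot_weld"]  # real benchmark has 16

def encode(robot):
    """Turn a robot's set of features into a 0/1 input vector."""
    return [1 if f in robot else 0 for f in FEATURES]

# a large robot with spot weld capability but no heavy lifting ability
robot = {"large", "spot_weld"}
print(encode(robot))  # [1, 0, 1]
```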

B. For the breast cancer recurrence prediction, our mentor, Zoran, had a file of "real live data" which we trained and tested on neural network simulator Ver 1.0, which was written by one of our instructors, Radu Drossu. The data consisted of vital statistics from 286 breast cancer patients from University Medical Center, Ljubljana, Slovenia. We were supplied with inputs such as pulse, blood pressure, blood cell counts, etc. There were nine indicators and an output for each patient. The output was 0 or 1, indicating whether cancer recurred within some interval of time.

The task was to use this known data to train the program to predict whether cancer would recur in a former patient, knowing their nine vital statistics. With the 286 cases we split the data, using 4/5 of it to train the program and the other 1/5 to test it. We also used five different splits. The results are listed in an included table. Using Radu's program is a tedious trial-and-error process. The program has essentially four main settings which the operator can alter to obtain better prediction results. The first is the number of iterations, or "epochs," which usually number in the thousands. The other three settings are learning rate, momentum, and tolerance.
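The 4/5-1/5 splitting with five different splits can be sketched as below. The cases are shown only as index numbers, since the patient records themselves are not reproduced here.

```python
# Sketch of the evaluation scheme: split the 286 cases into 4/5 training
# and 1/5 testing, five different ways, so every case is tested once.
def five_splits(cases):
    """Yield five (train, test) pairs covering the whole case list."""
    fold = len(cases) // 5
    for k in range(5):
        # the last split absorbs any leftover cases
        test = cases[k * fold:(k + 1) * fold] if k < 4 else cases[4 * fold:]
        train = [c for c in cases if c not in test]
        yield train, test

cases = list(range(286))
for train, test in five_splits(cases):
    print(len(train), len(test))
```

Each split trains on roughly 229 cases and tests on roughly 57, and across all five splits every case is tested exactly once.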

Radu, who wrote the program, uses the analogy of a ball rolling along an undulating path which the program must learn. If the learning rate is set too high, the ball takes big jumps and misses parts of the curve. If it comes to a hill after a plateau, it must have increased momentum to climb it. However, just setting these two parameters low and high, respectively, does not always work. These parameters are described in Diagram 5.

Our first attempts yielded dismal results. Then we realized that the data was not "normalized." This means that the inputs were not in the range (0,1). Many neural networks only accept normalized data as inputs and then use this and a weighted sum in the decision function. Input data can be readily normalized with a linear function and an exercise on this is included in this module.
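The linear normalization described above can be written as a one-line function; the blood-pressure range in the usage example is an assumed one, chosen only for illustration.

```python
# Sketch of linear normalization: map a raw input from its known range
# [lo, hi] onto [0, 1] before feeding it to the network.
def normalize(x, lo, hi):
    """Linearly map x from [lo, hi] to [0, 1]."""
    return (x - lo) / (hi - lo)

# e.g. a blood-pressure reading of 140 in an assumed range of 80 to 200
print(normalize(140, 80, 200))  # 0.5
```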

By adjusting the parameters, using many iterations and repeating this process many times, we were able to obtain an average correct prediction rate of over 70%. Keep in mind that we used known results for testing, so this means that in predicting cancer recurrence in new patients the program would be correct about 70% of the time. This result is comparable to previously reported generalizations obtained using different data modeling techniques and is obtained in a relatively short time period.

C. We also worked on a program developed at WSU, the feature selection for predictive models of the stock market, which, given suitable inputs, predicts future movements in the Standard & Poor's Composite Index, an average of the value of 500 selected stocks. As stated by the School of Engineering and Science:

"It is well known that the stock market does a very good job of reflecting the actual value of the underlying stock. However, as recently indicated, it is still possible that there are nonlinear relationships between market information and the value of the stock that so far have not been identified and therefore, not reflected in stock prices. Our aim is to explore if these nonlinear relationships can be captured using a machine learning approach of problem tailored artificial neural networks."

The input data for this program were 32 financial indicators, such as the consumer price index, the US treasury T-bill rate, etc. Many of the inputs were the same indicator taken several time intervals back. The writers of the program felt the data contained "noise," i.e., it had too many inputs which did not appreciably affect the predicted outcome. A complex algorithm was devised which would sequentially remove the indicator which least changed the outcome. By using this process we decreased the number of inputs to the eight "best," which was the desired amount. Working with this program gave us a better understanding of neural networks and an appreciation for the complexity of UNIX.
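The sequential-removal idea can be sketched as a greedy backward elimination. The indicator names and the scoring function below are stand-ins for illustration only, not the actual WSU algorithm or its data.

```python
# Sketch of "noise" removal: repeatedly drop the indicator whose removal
# least changes the model's error, until the desired number remain.
def backward_elimination(columns, score, keep=8):
    """Greedily remove the column whose removal hurts the score least."""
    cols = list(columns)
    while len(cols) > keep:
        # try dropping each remaining column; keep the best resulting set
        best = min((score([c for c in cols if c != drop]), drop)
                   for drop in cols)
        cols.remove(best[1])
    return cols

# toy scorer: pretend each indicator has a fixed usefulness, and error
# falls as more useful indicators are kept (purely illustrative)
useful = {"cpi": 9, "tbill": 8, "sp_lag1": 7, "sp_lag2": 3, "noise": 0}
score = lambda cols: -sum(useful[c] for c in cols)  # lower = better
print(backward_elimination(useful, score, keep=2))
```

With this toy scorer, the useless "noise" indicator is removed first and only the two most useful indicators survive.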

This system is not PC based so it will be left to "BrainMaker" to teach neural networking to our students.

IV. Pre-Programming Neural Network Assignments

  1. Normalizing Data (included)
  2. Network Inputs and Outputs (included)

V. Project Involving Software-BrainMaker

A. In this example real estate problem, before we can predict house selling prices there are certain steps to follow to get your neural network into operating condition. You will create input and output files, make BrainMaker files, train the network, evaluate the results and, finally, change the data to predict the selling price of any house. This will give you an estimated price for a house if it were to go on the market tomorrow.


The training file has 217 examples of houses whose individual data consists of the following:

Name Description Range
SALEPRIC actual sale price of home $103,000-250,000
DWLUN number of dwelling units 1-3
RDOS reverse date of sale (months since sale) 0-23
YRBLT year built 1850-1986
TOTFIXT number of plumbing fixtures 5-17
HEATING heating system type coded as 2 or 3
WBFPSTKS wood burning fireplace stacks 0-1
BMNTGAR basement/garage 0-2
ATTFRGAR attached frame garage area
TOTLIVAR total living area 714-4185
DECK/OFP deck/open porch area 0-738
ENCLPOR enclosed porch area 0-452
NBHDGRP neighborhood group coded as 1 or 2
RECROOM recreation room area 0-672
FINBSMT finished basement area 0-810
GRADE% grade factors 0.85-1.08
CDU condition/desirability/usefulness 3-5
TOTOBY total other value (building and yard) 0-16400


Open NetMaker (click its icon; the data files are made in this program, while the training work is done in BrainMaker)

  1. Click on Read in data file, specify RE.Dat.
  2. Click on Manipulate Data
  3. Click SalePrice column, LABEL, Mark Column as Pattern (this is what the computer is to learn from all other inputs)
  4. Go to LABEL again, All Unmarked Columns->Inputs. (The computer will read these as data then try to fit it to the SalePrice column of known data)
  5. FILE, Save Netmaker File. Call it RE2.Dat. (This saves all your data to use in BrainMaker)
  6. FILE, Preferences, Network Display to Number (As an extra precaution, this will change all data to numbers)
  7. FILE, Create BrainMaker Files, press enter, OK to overwrite.
  8. FILE, exit

    Now we need to let BrainMaker make use of the data in NetMaker.

  9. Click BrainMaker icon, or click Read Data Files, RE2.Def.
  10. CONNECTIONS, Change Network Size (We need to let the program know how many inputs, or characteristics of the house, to use)
  11. PARAMETERS, Training Control Flow, click .2 on training and testing tolerance (This lets you choose the rate at which you want it to learn the data and the tolerance, or the % range within which the answers must be correct. You can choose between .1 and .5, but a good number for this particular problem is .2)
  12. PARAMETERS, Training Control Flow, click at Test Every N Runs and type 1 (This turns on testing file)
  13. Now click at Save Every N Runs and change to 4, click OK
  14. FILE, Training Statistics, type RE2.Sts, OK to overwrite
  15. FILE, Testing Statistics, click OK, also OK to overwrite

    The last four steps have made it possible to watch each step as the program learns the data; the statistics tell you which run tested best.

  16. FILE, Save Network, RE2.Net, this is OK (saves network)
  17. OPERATE, Get Next Fact (This will train our data files and the data should be shown on the screen)
  18. OPERATE, Train Network (The data should change on the screen before your eyes. It is done when it reads Bad: 0, Good: 195. To make the training go faster, choose DISPLAY, Enable Display.)
  19. When it is done learning FILE, Save Network, specify RE2.Net, OK to overwrite.
  20. FILE, exit.

    Now we have to evaluate training (the statistics)

  21. Click NetMaker icon, Read Data File, type Re2.Sta, OK.
  22. Manipulate Data. Now look at the AvgError column. Find the runs with the lowest average error. There may be a couple, for example runs 16 and 25.
  23. FILE, Read in Data File, type RE2.Sts. Look at the AvgError column for the ones that tested well (16 and 25) and pick the better of the two. Suppose it is 25; write it down to remember.
  24. FILE, exit.

    Now we want to retrieve the network that was saved just before the best run. If it was 25, the last saved would be 24.

  25. Click BrainMaker icon.
  26. FILE, Read Network, type Run00024.Net.
  27. PARAMETERS, Training Control Flow. In the Stop When box, check Run Number box and change number of runs to the best run number. For example, 25, click OK.
  28. OPERATE, Train Network.
  29. When training stops, FILE, Save Network as RE3.Net, OK to overwrite.
  30. FILE, exit.

    Now, here is where we can enter new data, or a description of a house and our training file will tell us the price of the new home.

  31. Click NetMaker icon, Read in Data File, REIN.Dat.
  32. Manipulate Data
  33. LABEL, All Unmarked Columns->Inputs.
  34. FILE, Create Running Fact File, REIN.Dat is fine, OK to overwrite
  35. FILE, exit
  36. Click BrainMaker icon, Read in Data File, RE3.Net
  37. FILE, Select Fact Files, type or select REIN.Dat
  38. FILE, Write Facts to File, RE3.Out. Make sure Running is checked in the Write Facts to File During box. All other defaults are fine. This writes inputs and outputs from running the network into a file you can read.


  1. As presented on the screen, what is the current selling price of the house?
  2. Click on the piece of data RECROOM. Change the current number to 25. What is the new selling price of the house?
  3. Change YRBLT to 1900, ENCLPOR to 400, and ATTFRGAR to 150. What is the new selling price of the house?
  4. Identify 2 pieces of data and change the values according to their range. What is the new selling price of the house?
  5. Which piece of data seems to make the biggest difference on the selling price of a house: TOTLIVAR, TOTOBY, or TOTFIXT?
  6. By changing the values of the pieces of data, what is the most expensive house you can predict and what is the most affordable house you can predict? (Show which values you changed.)



B. This neural network will predict the length of stay for hospital patients. This type of neural network is one that is used in a quality improvement and cost reduction medical system at Anderson Memorial Hospital in South Carolina.

Since knowing the length of stay at a hospital is another way of stating the severity of an illness, a treatment program can be planned with this in mind and new patients can be more easily compared to past similar cases. This network will try to predict, from the current data of a patient, just how long they will stay at the hospital, which in turn helps the hospital figure out the "costs" that that patient will bring them. It is important for hospitals to try to predict what costs certain patients will bring them.

  1. The input data for the columns consist of several factors such as primary diagnosis, the number of diagnosed conditions, admittance category, rehabilitation or disability, hypertension, smoker, family support, age, sex, and inherited tendencies toward major illness.
  2. From the BrainMaker tutorial, follow the step by step instructions on Medical Forecasting starting with step 1.
  3. This is a lengthy process so take each step slowly. This is where having a partner is vital!! Help each other through the steps!!
  4. Upon completion of the tutorial, answer the following questions.


  1. For patients 1-5, what are the predicted lengths of stay at the hospital?
  2. Click on patient 1. Change the status of age to 70 and write down the new predicted length of stay.
  3. Click on patient 3. Change the smoker indicator, age to 35 and family support to 1, and give the new length of stay.
  4. Which indicator will change the predicted length of stay the most: age, sex or inherited tendencies toward major illness?
  5. For patient 4, which indicator affects his/her length of stay in the hospital the most?



C. We might want to predict the winner of the annual Apple Cup game based on the UW's and WSU's performance in the Pac 10 leading up to the final game. Actually, we would use the league results for each team, considering both home and away games, as inputs to train the network. This does involve time-consuming data file entry.

Typical statistical values for each game played are listed in the table below. Note the ranges, which must be normalized for the network to train on. As outputs, we could have the point spread (0-10), and win (1), loss (0) or tie (.5). For each contest played we would have 22 input neurons. When the network has trained on past statistics and outcomes, we would present the Cougars' and Huskies' pre-Apple Cup statistics. Note: the first game of the season would be trained on last year's statistics.

The pattern 1 input table shows typical and normalized values which would be presented to the network. The ranges are taken from the statistical value table for normalization. Notice that in the example, team B's 80 average yards per game allowed (sounds like the Cougs) registered as a 0 because it is outside the specified 100 to 500 range. Presented with the training facts and then testing on a small percentage of the inputs, we could refine the system and have it predict the outcome.


Statistical Values for Each Game Played
Inputs (one set for each team):
Avg. yards gained per game 100 to 500
Avg. yards allowed per game 100 to 500
Avg. points scored per game 0 to 50
Avg. points allowed per game 0 to 50
Percentage wins at home 0 to 100
Percentage wins away 0 to 100
Net turnovers -30 to 30
Avg. penalties per game 2 to 15
Avg. penalty yards per game 10 to 150
Avg. point spread per game 0 to 10
Home/visit team 0 or 1
Outputs:
Point spread this game 0 to 10
Team A win/loss 0(loss), .5(tie), 1(win)
Team B win/loss 0(loss), .5(tie), 1(win)
Pattern 1 Input
Neuron Inputs Actual Normalized
1 Team A avg. yards gained per game 250 0.375
2 Team A avg. yards allowed per game 200 0.250
3 Team A avg. points scored per game 29 0.580
4 Team A avg. points allowed per game 15 0.300
5 Team A % wins at home 73 0.730
6 Team A % wins away 52 0.520
7 Team A net turnovers 2 0.533
8 Team A avg. penalties per game 5 0.231
9 Team A avg. penalty yards per game 20 0.071
10 Team A avg. point spread per game 4 0.400
11 Team A home/visit team 1 1.000
12 Team B avg. yards gained per game 220 0.300
13 Team B avg. yards allowed per game 80 (0)
14 Team B avg. points scored per game 23 0.460
15 Team B avg. points allowed per game 10 0.200
16 Team B % wins at home 65 0.650
17 Team B % wins away 61 0.610
18 Team B net turnovers -3 0.450
19 Team B avg. penalties per game 8 0.462
20 Team B avg. penalty yards per game 80 0.500
21 Team B avg. point spread per game 2 0.200
22 Team B home/visit team 0 0.000
Pattern 1 Output
Neurons Outputs Actual Normalized
1 Point Spread 7 0.700
2 Team A win/loss 0 0.000
3 Team B win/loss 1 1.000
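The normalization in the Pattern 1 tables can be checked with a short function that uses the ranges from the statistical value table, with out-of-range values registering as 0 (as with team B's 80 yards allowed).

```python
# Sketch reproducing the Pattern 1 normalization: linear mapping onto
# [0, 1] within a range, with out-of-range inputs registering as 0.
def normalize(x, lo, hi):
    if x < lo or x > hi:
        return 0.0  # outside the specified range registers as 0
    return (x - lo) / (hi - lo)

print(round(normalize(250, 100, 500), 3))  # neuron 1:  0.375
print(round(normalize(2, -30, 30), 3))     # neuron 7:  0.533
print(round(normalize(20, 10, 150), 3))    # neuron 9:  0.071
print(normalize(80, 100, 500))             # neuron 13: 0.0 (out of range)
```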



Diagram 1: Simple Food Recognition Network



Diagram 2: Neural Drawings



Diagram 3: A Single Synapse



Diagram 4: Number of Neurons vs. Connections Per Second



Diagram 5: Radu Drossu's Back Propagation Rolling Ball Analogy to Learn a Prediction Path



Cancer Recurrence Prediction Results


Data Normalization Method

Data Normalization Assignment

NAME: ___________________________


Assuming the interval range for the data is [4,64], normalize the following input values onto [0, 1], with f(4) = 0 and f(64) = 1.

  1. X = 12

  2. X = 20

  3. X = 8

  4. X = 44

  5. X = 56

Food Recognition Network Assignment 1

NAME: ____________________________

  1. Give 7 descriptive inputs that would identify the fruit outputs.

  2. Give 4 vegetable outputs that could be identified with these inputs. Draw in the neural connections. How many are there?


Network Assignment 2

NAME: ____________________________

  1. For output units, choose 4 sports balls, e.g. football, golfball, ... Then pick 7 input characteristics, e.g. smooth, dimpled, sewn, ..., in which 3 or more would identify these outputs.
    How many connections are in this neural network?

  2. On the back of this sheet (or another sheet), create a neural network as follows:
    Use 5 automobiles as the outputs. As the input units, use 8 characteristics which would identify these autos, e.g., pre-1950, 2-door, American made, Classic might yield "Little Deuce Coupe."