Creating a Neural Network, Then Training and Querying It

2019-08-03


Abstract: Following the content of the neural network programming book and imitating its code, this post works through a first encounter with neural networks, building one step by step and using it to recognise handwritten digits.

The Three Main Steps of a Neural Network

  1. Initialisation -- set the number of input, hidden, and output nodes.

  2. Training -- refine the weights after working through the examples of the training set.

  3. Querying -- given an input, return an answer from the output nodes.
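
These three steps map directly onto the three methods of the class built below; a minimal skeleton (the names match the code that follows) looks like this:

class neuralNetwork:
    # step 1: initialisation -- record layer sizes, weights and learning rate
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
        pass

    # step 2: training -- feed forward, then propagate errors back and update weights
    def train(self, inputs_list, targets_list):
        pass

    # step 3: querying -- feed the input forward and return the output signals
    def query(self, inputs_list):
        pass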

Building the Network -- Initialisation

def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
    # set number of nodes in each input, hidden, output layer
    self.inodes = inputnodes
    self.hnodes = hiddennodes
    self.onodes = outputnodes

    # link weight matrices, wih and who
    # note: in wih, w means weight, i means input, h means hidden; who is named the same way
    # weights inside the arrays are w_i_j, where the link is from node i to node j in the next layer
    # w11 w21
    # w12 w22 etc
    self.wih = (numpy.random.rand(self.hnodes, self.inodes) - 0.5)
    self.who = (numpy.random.rand(self.onodes, self.hnodes) - 0.5)

    # learning rate
    self.lr = learningrate

    # activation function is the sigmoid function
    # scipy.special must be imported for expit() to be available
    # lambda creates an anonymous function that takes x and returns scipy.special.expit(x), the sigmoid
    self.activation_function = lambda x: scipy.special.expit(x)

    pass

Points worth noting:

  • The method name __init__ has two underscores on each side; with only a single underscore you get TypeError: object() takes no parameters.

Reference: https://blog.csdn.net/qq_26489165/article/details/80595864

  • Link weight matrices: in wih, w means weight, i means input, h means hidden; who is named the same way (hidden to output).
self.wih = (numpy.random.rand(self.hnodes, self.inodes) - 0.5)
self.who = (numpy.random.rand(self.onodes, self.hnodes) - 0.5)
  • lambda creates an anonymous function that takes x and returns scipy.special.expit(x), which is the sigmoid (S-shaped) function.
self.activation_function = lambda x: scipy.special.expit(x)
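
As a quick check of what expit does: it computes the sigmoid 1 / (1 + e^-x) elementwise, squashing any real input into the range (0, 1).

import numpy
import scipy.special

x = numpy.array([-6.0, 0.0, 6.0])
print(scipy.special.expit(x))   # approximately [0.0025 0.5 0.9975]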

Building the Network -- Query

# query the neural network
def query(self, inputs_list):
    # convert inputs list to 2d array
    inputs = numpy.array(inputs_list, ndmin=2).T

    # calculate signals into hidden layer
    hidden_inputs = numpy.dot(self.wih, inputs)
    # calculate the signals emerging from hidden layer
    hidden_outputs = self.activation_function(hidden_inputs)
    # calculate signals into final output layer
    final_inputs = numpy.dot(self.who, hidden_outputs)
    # calculate the signals emerging from final output layer
    final_outputs = self.activation_function(final_inputs)
    return final_outputs
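
Once the full class is defined, querying even an untrained network is a useful sanity check; the tiny 3-3-3 network and dummy input below are purely illustrative:

# illustrative only: a 3-3-3 network with learning rate 0.3
n = neuralNetwork(3, 3, 3, 0.3)
print(n.query([1.0, 0.5, -1.5]))   # a (3, 1) column of values in (0, 1)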

Building the Network -- Training

# train the neural network
def train(self, inputs_list, targets_list):
    # the feedforward pass below is almost identical to query(); the extra
    # targets_list parameter carries the desired outputs for the training examples
    # convert inputs list to 2d array
    inputs = numpy.array(inputs_list, ndmin=2).T
    targets = numpy.array(targets_list, ndmin=2).T

    # calculate signals into hidden layer
    hidden_inputs = numpy.dot(self.wih, inputs)
    # calculate the signals emerging from hidden layer
    hidden_outputs = self.activation_function(hidden_inputs)
    # calculate signals into final output layer
    final_inputs = numpy.dot(self.who, hidden_outputs)
    # calculate the signals emerging from final output layer
    final_outputs = self.activation_function(final_inputs)

    # error is the (target - actual), i.e. the error to propagate back
    output_errors = targets - final_outputs
    # hidden layer error is the output_errors, split by weights, recombined at hidden nodes
    hidden_errors = numpy.dot(self.who.T, output_errors)
    # update the weights for the links between the hidden and output layers
    # self.lr is the learning rate; numpy.dot performs the matrix multiplication
    self.who += self.lr * numpy.dot((output_errors * final_outputs * (1.0 - final_outputs)), numpy.transpose(hidden_outputs))
    # update the weights for the links between the input and hidden layers
    self.wih += self.lr * numpy.dot((hidden_errors * hidden_outputs * (1.0 - hidden_outputs)), numpy.transpose(inputs))
    pass

Points worth noting:

  1. The code above is almost identical to query(), because the feedforward pass from the input layer to the final output layer is exactly the same; the extra targets parameter carries the desired outputs for the training examples.
  2. The error is (target - actual); this is the error that gets propagated back through the network.
  3. self.lr is the learning rate, and numpy.dot performs the matrix multiplications.
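
To see why the matrix shapes line up, here is a small standalone check with made-up layer sizes: who.T spreads the output error back across the hidden nodes, and the update term has exactly the shape of the weight matrix it adjusts.

import numpy

# made-up layer sizes for illustration: 4 hidden and 2 output nodes
h, o = 4, 2
who = numpy.random.rand(o, h) - 0.5          # (output x hidden) weight matrix

hidden_outputs = numpy.random.rand(h, 1)     # stand-in for the hidden layer's outputs
final_outputs = numpy.random.rand(o, 1)      # stand-in for the network's outputs
targets = numpy.random.rand(o, 1)

output_errors = targets - final_outputs                  # shape (o, 1)
hidden_errors = numpy.dot(who.T, output_errors)          # shape (h, 1): output error shared back across hidden nodes
delta_who = numpy.dot(output_errors * final_outputs * (1.0 - final_outputs),
                      hidden_outputs.T)                  # shape (o, h), matching who
print(delta_who.shape == who.shape)                      # True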

Creating the Neural Network Object

#number of input, hidden and output nodes
input_nodes = 784
hidden_nodes = 100
output_nodes = 10

#learning rate is 0.3
learning_rate = 0.3

#create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

Why choose 784 input nodes? Remember that 784 = 28 * 28, the number of pixels that make up each handwritten digit image.

There is no fixed rule behind choosing 100 hidden nodes. The book's reasoning is that a neural network should find features or patterns in the input that can be expressed more compactly than the input itself, so we do not pick a number larger than 784. Choosing fewer nodes than inputs forces the network to try to summarise the key features of the input; choosing too few hidden nodes, however, limits the network's capacity. The 10 output nodes correspond to the 10 possible labels.

One point worth stressing: there is no single best method for choosing the number of hidden nodes for a problem. The best approach is to experiment until you find a number that suits the problem you are solving, as sketched below.
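
A sketch of such an experiment (the hidden layer sizes here are arbitrary); each candidate network would then be trained and scored, keeping the size that performs best:

# sketch: compare networks with different hidden layer sizes (values are arbitrary)
for hidden_nodes in (50, 100, 200):
    n = neuralNetwork(784, hidden_nodes, 10, 0.3)
    print(hidden_nodes, n.wih.shape, n.who.shape)
    # ... train and score each candidate here, then keep the best one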

Training the Network

  • Opening the file
#load the mnist training data CSV file into a list
training_data_file = open("TestandTrain/mnist_train_100.csv",'r')
training_data_list = training_data_file.readlines()
training_data_file.close()

Keep the file under the path above, relative to the code, and it can be opened directly.
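
Each line of the MNIST CSV is one record: the label first, then 784 comma-separated pixel values (785 values in total). A quick check on the file just loaded:

# inspect one record of the training data loaded above
all_values = training_data_list[0].split(',')
print(len(all_values))   # 785 = 1 label + 784 pixel values
print(all_values[0])     # the label of the first record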

  • Training the network
# train the neural network
# go through all records in the training data set 
for record in training_data_list:
    # split the record by the ',' commas
    all_values = record.split(',')
    # scale and shift the inputs
    inputs = (numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01
    targets = numpy.zeros(output_nodes) + 0.01
    # all_values[0] is the target label for this record
    targets[int(all_values[0])] = 0.99
    n.train(inputs, targets)
    pass
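
The loop above makes a single pass over the training data. A common refinement, not part of the code above, is to repeat that pass for several epochs:

# optional extension: train for several epochs instead of a single pass
epochs = 5
for e in range(epochs):
    for record in training_data_list:
        all_values = record.split(',')
        inputs = (numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01
        targets = numpy.zeros(output_nodes) + 0.01
        targets[int(all_values[0])] = 0.99
        n.train(inputs, targets)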

Querying the Network

  • Opening the file
#load the mnist test data CSV file into a list
test_data_file = open("TestandTrain/mnist_test_10.csv",'r')
test_data_list = test_data_file.readlines()
test_data_file.close()
  • Printing the label

Then use matplotlib to display the digit image and inspect the network's output probabilities.

# pick one test record (index 2, i.e. the third record)
all_values = test_data_list[2].split(',')
# print the label
print('label:', all_values[0])

image_array = numpy.asfarray(all_values[1:]).reshape((28,28))
matplotlib.pyplot.imshow(image_array, cmap='Greys', interpolation='none')


n.query((numpy.asfarray(all_values[1:])/ 255.0 * 0.99) + 0.01)
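
query returns a (10, 1) column with one value per digit; the index of the largest value is the network's answer, so numpy.argmax reads it off directly:

outputs = n.query((numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01)
print('network says:', numpy.argmax(outputs))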

Complete Code

import numpy
#scipy.special for the sigmoid function expit()
import scipy.special
# plotting library
import matplotlib.pyplot
%matplotlib inline

# neural network class definition
class neuralNetwork:
    # initialise the neural network
    # note! ----> __init__ needs double underscores on both sides, otherwise you get
    # TypeError: object() takes no parameters
    # reference: https://blog.csdn.net/qq_26489165/article/details/80595864
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
        # set number of nodes in each input, hidden, output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes

        # link weight matrices, wih and who
        # note: in wih, w means weight, i means input, h means hidden; who is named the same way
        # weights inside the arrays are w_i_j, where the link is from node i to node j in the next layer
        # w11 w21
        # w12 w22 etc
        self.wih = (numpy.random.rand(self.hnodes, self.inodes) - 0.5)
        self.who = (numpy.random.rand(self.onodes, self.hnodes) - 0.5)

        # learning rate
        self.lr = learningrate

        # activation function is the sigmoid function
        # scipy.special must be imported for expit() to be available
        self.activation_function = lambda x: scipy.special.expit(x)

        pass
    
    # train the neural network
    def train(self, inputs_list, targets_list):
        # the feedforward pass below is almost identical to query(); the extra
        # targets_list parameter carries the desired outputs for the training examples
        # convert inputs list to 2d array
        inputs = numpy.array(inputs_list, ndmin=2).T
        targets = numpy.array(targets_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)

        # error is the (target - actual), i.e. the error to propagate back
        output_errors = targets - final_outputs
        # hidden layer error is the output_errors, split by weights, recombined at hidden nodes
        hidden_errors = numpy.dot(self.who.T, output_errors)
        # update the weights for the links between the hidden and output layers
        # self.lr is the learning rate; numpy.dot performs the matrix multiplication
        self.who += self.lr * numpy.dot((output_errors * final_outputs * (1.0 - final_outputs)), numpy.transpose(hidden_outputs))
        # update the weights for the links between the input and hidden layers
        self.wih += self.lr * numpy.dot((hidden_errors * hidden_outputs * (1.0 - hidden_outputs)), numpy.transpose(inputs))
        pass
    
    # query the neural network
    def query(self, inputs_list):
        # convert inputs list to 2d array
        inputs = numpy.array(inputs_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)
        return final_outputs
    
#number of input, hidden and output nodes
input_nodes = 784
hidden_nodes = 100
output_nodes = 10

# learning rate is 0.3
learning_rate = 0.3

#create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)
#load the mnist training data CSV file into a list
training_data_file = open("TestandTrain/mnist_train_100.csv",'r')
training_data_list = training_data_file.readlines()
training_data_file.close()
# train the neural network
# go through all records in the training data set 
for record in training_data_list:
    # split the record by the ',' commas
    all_values = record.split(',')
    # scale and shift the inputs
    inputs = (numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01
    targets = numpy.zeros(output_nodes) + 0.01
    # all_values[0] is the target label for this record
    targets[int(all_values[0])] = 0.99
    n.train(inputs, targets)
    pass
# load the mnist test data CSV file into a list
test_data_file = open("TestandTrain/mnist_test_10.csv",'r')
test_data_list = test_data_file.readlines()
test_data_file.close()
# pick one test record (index 2, i.e. the third record)
all_values = test_data_list[2].split(',')
# print the label
print('label:', all_values[0])
image_array = numpy.asfarray(all_values[1:]).reshape((28,28))
matplotlib.pyplot.imshow(image_array, cmap='Greys', interpolation='none')
n.query((numpy.asfarray(all_values[1:])/ 255.0 * 0.99) + 0.01)
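
A natural next step is to score the network over the whole test set rather than a single record; a sketch following the same conventions as the code above:

# sketch: score the network on every test record and report the accuracy
scorecard = []
for record in test_data_list:
    all_values = record.split(',')
    correct_label = int(all_values[0])
    inputs = (numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01
    outputs = n.query(inputs)
    if numpy.argmax(outputs) == correct_label:
        scorecard.append(1)
    else:
        scorecard.append(0)

scorecard_array = numpy.asarray(scorecard)
print('performance =', scorecard_array.sum() / scorecard_array.size)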


Written by hongCYu, licensed under the Creative Commons Attribution 4.0 International licence.
Original link: https://hongcyu.cn/posts/neural-network.html
Last updated: 2020-12-03 10:11:12
