{"id":989,"date":"2019-01-22T10:55:17","date_gmt":"2019-01-22T02:55:17","guid":{"rendered":"http:\/\/www.max-shu.com\/blog\/?p=989"},"modified":"2019-01-22T10:57:00","modified_gmt":"2019-01-22T02:57:00","slug":"%e3%80%8a%e7%a5%9e%e7%bb%8f%e7%bd%91%e7%bb%9c%e4%b8%8e%e6%b7%b1%e5%ba%a6%e5%ad%a6%e4%b9%a0%e3%80%8b%e4%b8%ad%e7%9a%84python-3-x-%e4%bb%a3%e7%a0%81network3-py","status":"publish","type":"post","link":"http:\/\/www.max-shu.com\/blog\/?p=989","title":{"rendered":"\u300a\u795e\u7ecf\u7f51\u7edc\u4e0e\u6df1\u5ea6\u5b66\u4e60\u300b\u4e2d\u7684Python 3.x \u4ee3\u7801network3.py"},"content":{"rendered":"<header class=\"entry-header\">\n<div class=\"entry-meta\" style=\"padding-left: 30px;\">\u300a\u795e\u7ecf\u7f51\u7edc\u4e0e\u6df1\u5ea6\u5b66\u4e60\u300b\uff08<span class=\"fontstyle2\">Michael Nielsen<\/span><span class=\"fontstyle0\">\u8457<\/span>\uff09\u4e2d\u7684\u4ee3\u7801\u662f\u57fa\u4e8epython2.7\u7684\uff0c\u4e0b\u9762\u4e3a\u79fb\u690d\u5230python3\u4e0b\u7684\u4ee3\u7801\uff1a<\/div>\n<\/header>\n<div class=\"entry-content\">\n<p><span style=\"color: #ff0000;\"><strong>expand_mnist.py\u4ee3\u7801\uff1a<\/strong><\/span><\/p>\n<p><span style=\"color: #ff0000;\"><strong>\u70b9\u51fb\u4e0b\u8f7d\uff1a<a href=\"http:\/\/www.max-shu.com\/blog\/wp-content\/uploads\/2019\/01\/expand_mnist.zip\">expand_mnist<\/a><\/strong><\/span><\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<div>\n<div>&#8220;&#8221;&#8221;expand_mnist.py<\/div>\n<div>~~~~~~~~~~~~~~~~~~<\/div>\n<div>Take the 50,000 MNIST training images, and create an expanded set of<\/div>\n<div>250,000 images, by displacing each training image up, down, left and<\/div>\n<div>right, by one pixel. Save the resulting file to<\/div>\n<div>..\/data\/mnist_expanded.pkl.gz.<\/div>\n<div>Note that this program is memory intensive, and may not run on small<\/div>\n<div>systems.<\/div>\n<div>&#8220;&#8221;&#8221;<\/div>\n<div>from __future__ import print_function<\/div>\n<div>#### Libraries<\/div>\n<div># Standard library<\/div>\n<div>import _pickle as cPickle<\/div>\n<div># import cPickle<\/div>\n<div>import gzip<\/div>\n<div>import os.path<\/div>\n<div>import random<\/div>\n<div># Third-party libraries<\/div>\n<div>import numpy as np<\/div>\n<div>print(&#8220;Expanding the MNIST training set&#8221;)<\/div>\n<div>if os.path.exists(&#8220;.\/mnist_expanded.pkl.gz&#8221;):<\/div>\n<div>print(&#8220;The expanded training set already exists. Exiting.&#8221;)<\/div>\n<div>else:<\/div>\n<div>f = gzip.open(&#8220;.\/mnist.pkl.gz&#8221;, &#8216;rb&#8217;)<\/div>\n<div>training_data, validation_data, test_data = cPickle.load(f, encoding=&#8217;iso-8859-1&#8242;)<\/div>\n<div>f.close()<\/div>\n<div>expanded_training_pairs = []<\/div>\n<div>j = 0 # counter<\/div>\n<div>for x, y inzip(training_data[0], training_data[1]):<\/div>\n<div>expanded_training_pairs.append((x, y))<\/div>\n<div>image = np.reshape(x, (-1, 28))<\/div>\n<div>j += 1<\/div>\n<div>if j %1000==0: print(&#8220;Expanding image number&#8221;, j)<\/div>\n<div># iterate over data telling us the details of how to<\/div>\n<div># do the displacement<\/div>\n<div>for d, axis, index_position, index in [<\/div>\n<div>(1, 0, &#8220;first&#8221;, 0),<\/div>\n<div>(-1, 0, &#8220;first&#8221;, 27),<\/div>\n<div>(1, 1, &#8220;last&#8221;, 0),<\/div>\n<div>(-1, 1, &#8220;last&#8221;, 27)]:<\/div>\n<div>new_img = np.roll(image, d, axis)<\/div>\n<div>if index_position ==&#8221;first&#8221;:<\/div>\n<div>new_img[index, :] = np.zeros(28)<\/div>\n<div>else:<\/div>\n<div>new_img[:, index] = np.zeros(28)<\/div>\n<div>expanded_training_pairs.append((np.reshape(new_img, 784), y))<\/div>\n<div>random.shuffle(expanded_training_pairs)<\/div>\n<div>expanded_training_data = [list(d) for d in zip(*expanded_training_pairs)]<\/div>\n<div>print(&#8220;Saving expanded data. This may take a few minutes.&#8221;)<\/div>\n<div>f = gzip.open(&#8220;.\/mnist_expanded.pkl.gz&#8221;, &#8220;w&#8221;)<\/div>\n<div>cPickle.dump((expanded_training_data, validation_data, test_data), f)<\/div>\n<div>f.close()<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<div class=\"entry-content\">\n<p><span style=\"color: #ff0000;\"><strong>network3.py\u4ee3\u7801\uff1a<\/strong><\/span><\/p>\n<p><span style=\"color: #ff0000;\"><strong>\u70b9\u51fb\u4e0b\u8f7d\uff1a<a href=\"http:\/\/www.max-shu.com\/blog\/wp-content\/uploads\/2019\/01\/network3.zip\">network3<\/a><\/strong><\/span><\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<div>\n<div>&#8220;&#8221;&#8221;network3.py<\/div>\n<div>~~~~~~~~~~~~~~<\/div>\n<div>A Theano-based program for training and running simple neural<\/div>\n<div>networks.<\/div>\n<div>Supports several layer types (fully connected, convolutional, max<\/div>\n<div>pooling, softmax), and activation functions (sigmoid, tanh, and<\/div>\n<div>rectified linear units, with more easily added).<\/div>\n<div>When run on a CPU, this program is much faster than network.py and<\/div>\n<div>network2.py. However, unlike network.py and network2.py it can also<\/div>\n<div>be run on a GPU, which makes it faster still.<\/div>\n<div>Because the code is based on Theano, the code is different in many<\/div>\n<div>ways from network.py and network2.py. However, where possible I have<\/div>\n<div>tried to maintain consistency with the earlier programs. In<\/div>\n<div>particular, the API is similar to network2.py. Note that I have<\/div>\n<div>focused on making the code simple, easily readable, and easily<\/div>\n<div>modifiable. It is not optimized, and omits many desirable features.<\/div>\n<div>This program incorporates ideas from the Theano documentation on<\/div>\n<div>convolutional neural nets (notably,<\/div>\n<div>http:\/\/deeplearning.net\/tutorial\/lenet.html ), from Misha Denil&#8217;s<\/div>\n<div>implementation of dropout (https:\/\/github.com\/mdenil\/dropout ), and<\/div>\n<div>from Chris Olah (http:\/\/colah.github.io ).<\/div>\n<div>Written for Theano 0.6 and 0.7, needs some changes for more recent<\/div>\n<div>versions of Theano.<\/div>\n<div>&#8220;&#8221;&#8221;<\/div>\n<div>#### Libraries<\/div>\n<div># Standard library<\/div>\n<div>import _pickle as cPickle<\/div>\n<div># import cPickle<\/div>\n<div>import gzip<\/div>\n<div># Third-party libraries<\/div>\n<div>import numpy as np<\/div>\n<div>import theano<\/div>\n<div>import theano.tensor as T<\/div>\n<div>from theano.tensor.nnet import conv<\/div>\n<div>from theano.tensor.nnet import softmax<\/div>\n<div>from theano.tensor import shared_randomstreams<\/div>\n<div>from theano.tensor.signal.pool import pool_2d<\/div>\n<div># from theano.tensor.signal import downsample<\/div>\n<div># Activation functions for neurons<\/div>\n<div>def linear(z): return z<\/div>\n<div>def ReLU(z): return T.maximum(0.0, z)<\/div>\n<div>from theano.tensor.nnet import sigmoid<\/div>\n<div>from theano.tensor import tanh<\/div>\n<div>#### Constants<\/div>\n<div>GPU = True<\/div>\n<div>if GPU:<\/div>\n<div>print(&#8220;Trying to run under a GPU. If this is not desired, then modify &#8220;+\\<\/div>\n<div>&#8220;network3.py\\nto set the GPU flag to False.&#8221;)<\/div>\n<div>try: theano.config.device =&#8217;gpu&#8217;<\/div>\n<div>except: pass# it&#8217;s already set<\/div>\n<div>theano.config.floatX = &#8216;float32&#8217;<\/div>\n<div>else:<\/div>\n<div>print(&#8220;Running with a CPU. If this is not desired, then the modify &#8220;+\\<\/div>\n<div>&#8220;network3.py to set\\nthe GPU flag to True.&#8221;)<\/div>\n<div>#### Load the MNIST data<\/div>\n<div>def load_data_shared(filename=&#8221;.\/mnist.pkl.gz&#8221;):<\/div>\n<div>f = gzip.open(filename, &#8216;rb&#8217;)<\/div>\n<div>training_data, validation_data, test_data = cPickle.load(f, encoding=&#8217;iso-8859-1&#8242;)<\/div>\n<div>f.close()<\/div>\n<div># \u51cf\u5c11\u6570\u636e\u91cf\uff0c\u4ee5\u4fbf\u6d4b\u8bd5\u65f6\u4e34\u65f6\u5feb\u901f\u89c2\u5bdf\u6d41\u7a0b\u3002<\/div>\n<div>RealCount = 20<\/div>\n<div>temp_data = []<\/div>\n<div>count = 0<\/div>\n<div>for x, y inzip(training_data[0], training_data[1]):<\/div>\n<div>temp_data.append((x, y))<\/div>\n<div>count += 1<\/div>\n<div>if count &gt;= RealCount :<\/div>\n<div>break<\/div>\n<div>training_data = [list(d) for d in zip(*temp_data)]<\/div>\n<div>temp_data = []<\/div>\n<div>count = 0<\/div>\n<div>for x, y inzip(validation_data[0], validation_data[1]):<\/div>\n<div>temp_data.append((x, y))<\/div>\n<div>count += 1<\/div>\n<div>if count &gt;= RealCount :<\/div>\n<div>break<\/div>\n<div>validation_data = [list(d) for d in zip(*temp_data)]<\/div>\n<div>temp_data = []<\/div>\n<div>count = 0<\/div>\n<div>for x, y inzip(test_data[0], test_data[1]):<\/div>\n<div>temp_data.append((x, y))<\/div>\n<div>count += 1<\/div>\n<div>if count &gt;= RealCount :<\/div>\n<div>break<\/div>\n<div>test_data = [list(d) for d in zip(*temp_data)]<\/div>\n<div>defshared(data):<\/div>\n<div>&#8220;&#8221;&#8221;Place the data into shared variables. This allows Theano to copy<\/div>\n<div>the data to the GPU, if one is available.<\/div>\n<div>&#8220;&#8221;&#8221;<\/div>\n<div>shared_x = theano.shared(<\/div>\n<div>np.asarray(data[0], dtype=theano.config.floatX), borrow=True)<\/div>\n<div>shared_y = theano.shared(<\/div>\n<div>np.asarray(data[1], dtype=theano.config.floatX), borrow=True)<\/div>\n<div>return shared_x, T.cast(shared_y, &#8220;int32&#8221;)<\/div>\n<div>return [shared(training_data), shared(validation_data), shared(test_data)]<\/div>\n<div>#### Main class used to construct and train networks<\/div>\n<div>class Network(object):<\/div>\n<div>def__init__(self, layers, mini_batch_size):<\/div>\n<div>&#8220;&#8221;&#8221;Takes a list of `layers`, describing the network architecture, and<\/div>\n<div>a value for the `mini_batch_size` to be used during training<\/div>\n<div>by stochastic gradient descent.<\/div>\n<div>&#8220;&#8221;&#8221;<\/div>\n<div>self.layers = layers<\/div>\n<div>self.mini_batch_size = mini_batch_size<\/div>\n<div>self.params = [param for layer inself.layers for param in layer.params]<\/div>\n<div>self.x = T.matrix(&#8220;x&#8221;)<\/div>\n<div>self.y = T.ivector(&#8220;y&#8221;)<\/div>\n<div>init_layer = self.layers[0]<\/div>\n<div>init_layer.set_inpt(self.x, self.x, self.mini_batch_size)<\/div>\n<div>for j inrange(1, len(self.layers)):<\/div>\n<div>prev_layer, layer = self.layers[j-1], self.layers[j]<\/div>\n<div>layer.set_inpt(<\/div>\n<div>prev_layer.output, prev_layer.output_dropout, self.mini_batch_size)<\/div>\n<div>self.output =self.layers[-1].output<\/div>\n<div>self.output_dropout =self.layers[-1].output_dropout<\/div>\n<div>defSGD(self, training_data, epochs, mini_batch_size, eta,<\/div>\n<div>validation_data, test_data, lmbda=0.0):<\/div>\n<div>&#8220;&#8221;&#8221;Train the network using mini-batch stochastic gradient descent.&#8221;&#8221;&#8221;<\/div>\n<div>training_x, training_y = training_data<\/div>\n<div>validation_x, validation_y = validation_data<\/div>\n<div>test_x, test_y = test_data<\/div>\n<div># compute number of minibatches for training, validation and testing<\/div>\n<div>num_training_batches = int(size(training_data)\/mini_batch_size)<\/div>\n<div>num_validation_batches = int(size(validation_data)\/mini_batch_size)<\/div>\n<div>num_test_batches = int(size(test_data)\/mini_batch_size)<\/div>\n<div># define the (regularized) cost function, symbolic gradients, and updates<\/div>\n<div>l2_norm_squared = sum([(layer.w**2).sum() for layer in self.layers])<\/div>\n<div>cost = self.layers[-1].cost(self)+\\<\/div>\n<div>0.5*lmbda*l2_norm_squared\/num_training_batches<\/div>\n<div>grads = T.grad(cost, self.params)<\/div>\n<div>updates = [(param, param-eta*grad)<\/div>\n<div>for param, grad inzip(self.params, grads)]<\/div>\n<div># define functions to train a mini-batch, and to compute the<\/div>\n<div># accuracy in validation and test mini-batches.<\/div>\n<div>i = T.lscalar() # mini-batch index<\/div>\n<div>train_mb = theano.function(<\/div>\n<div>[i], cost, updates=updates,<\/div>\n<div>givens={<\/div>\n<div>self.x:<\/div>\n<div>training_x[i*self.mini_batch_size: (i+1)*self.mini_batch_size],<\/div>\n<div>self.y:<\/div>\n<div>training_y[i*self.mini_batch_size: (i+1)*self.mini_batch_size]<\/div>\n<div>})<\/div>\n<div>validate_mb_accuracy = theano.function(<\/div>\n<div>[i], self.layers[-1].accuracy(self.y),<\/div>\n<div>givens={<\/div>\n<div>self.x:<\/div>\n<div>validation_x[i*self.mini_batch_size: (i+1)*self.mini_batch_size],<\/div>\n<div>self.y:<\/div>\n<div>validation_y[i*self.mini_batch_size: (i+1)*self.mini_batch_size]<\/div>\n<div>})<\/div>\n<div>test_mb_accuracy = theano.function(<\/div>\n<div>[i], self.layers[-1].accuracy(self.y),<\/div>\n<div>givens={<\/div>\n<div>self.x:<\/div>\n<div>test_x[i*self.mini_batch_size: (i+1)*self.mini_batch_size],<\/div>\n<div>self.y:<\/div>\n<div>test_y[i*self.mini_batch_size: (i+1)*self.mini_batch_size]<\/div>\n<div>})<\/div>\n<div>self.test_mb_predictions = theano.function(<\/div>\n<div>[i], self.layers[-1].y_out,<\/div>\n<div>givens={<\/div>\n<div>self.x:<\/div>\n<div>test_x[i*self.mini_batch_size: (i+1)*self.mini_batch_size]<\/div>\n<div>})<\/div>\n<div># Do the actual training<\/div>\n<div>best_validation_accuracy = 0.0<\/div>\n<div>for epoch inrange(epochs):<\/div>\n<div>for minibatch_index inrange(num_training_batches):<\/div>\n<div>iteration = num_training_batches*epoch+minibatch_index<\/div>\n<div># if iteration % 1000 == 0:<\/div>\n<div>if iteration %10==0:<\/div>\n<div>print(&#8220;Training mini-batch number {0}&#8221;.format(iteration))<\/div>\n<div>cost_ij = train_mb(minibatch_index)<\/div>\n<div>if (iteration+1) % num_training_batches ==0:<\/div>\n<div>validation_accuracy = np.mean(<\/div>\n<div>[validate_mb_accuracy(j) for j in range(num_validation_batches)])<\/div>\n<div>print(&#8220;Epoch {0}: validation accuracy {1:.2%}&#8221;.format(<\/div>\n<div>epoch, validation_accuracy))<\/div>\n<div>if validation_accuracy &gt;= best_validation_accuracy:<\/div>\n<div>print(&#8220;This is the best validation accuracy to date.&#8221;)<\/div>\n<div>best_validation_accuracy = validation_accuracy<\/div>\n<div>best_iteration = iteration<\/div>\n<div>if test_data:<\/div>\n<div>test_accuracy = np.mean(<\/div>\n<div>[test_mb_accuracy(j) for j in range(num_test_batches)])<\/div>\n<div>print(&#8216;The corresponding test accuracy is {0:.2%}&#8217;.format(<\/div>\n<div>test_accuracy))<\/div>\n<div>print(&#8220;Finished training network.&#8221;)<\/div>\n<div>print(&#8220;Best validation accuracy of {0:.2%} obtained at iteration {1}&#8221;.format(<\/div>\n<div>best_validation_accuracy, best_iteration))<\/div>\n<div>print(&#8220;Corresponding test accuracy of {0:.2%}&#8221;.format(test_accuracy))<\/div>\n<div>#### Define layer types<\/div>\n<div>class ConvPoolLayer(object):<\/div>\n<div>&#8220;&#8221;&#8221;Used to create a combination of a convolutional and a max-pooling<\/div>\n<div>layer. A more sophisticated implementation would separate the<\/div>\n<div>two, but for our purposes we&#8217;ll always use them together, and it<\/div>\n<div>simplifies the code, so it makes sense to combine them.<\/div>\n<div>&#8220;&#8221;&#8221;<\/div>\n<div>def__init__(self, filter_shape, image_shape, poolsize=(2, 2),<\/div>\n<div>activation_fn=sigmoid):<\/div>\n<div>&#8220;&#8221;&#8221;`filter_shape` is a tuple of length 4, whose entries are the number<\/div>\n<div>of filters, the number of input feature maps, the filter height, and the<\/div>\n<div>filter width.<\/div>\n<div>`image_shape` is a tuple of length 4, whose entries are the<\/div>\n<div>mini-batch size, the number of input feature maps, the image<\/div>\n<div>height, and the image width.<\/div>\n<div>`poolsize` is a tuple of length 2, whose entries are the y and<\/div>\n<div>x pooling sizes.<\/div>\n<div>&#8220;&#8221;&#8221;<\/div>\n<div>self.filter_shape = filter_shape<\/div>\n<div>self.image_shape = image_shape<\/div>\n<div>self.poolsize = poolsize<\/div>\n<div>self.activation_fn=activation_fn<\/div>\n<div># initialize weights and biases<\/div>\n<div>n_out = (filter_shape[0]*np.prod(filter_shape[2:])\/np.prod(poolsize))<\/div>\n<div>self.w = theano.shared(<\/div>\n<div>np.asarray(<\/div>\n<div>np.random.normal(loc=0, scale=np.sqrt(1.0\/n_out), size=filter_shape),<\/div>\n<div>dtype=theano.config.floatX),<\/div>\n<div>borrow=True)<\/div>\n<div>self.b = theano.shared(<\/div>\n<div>np.asarray(<\/div>\n<div>np.random.normal(loc=0, scale=1.0, size=(filter_shape[0],)),<\/div>\n<div>dtype=theano.config.floatX),<\/div>\n<div>borrow=True)<\/div>\n<div>self.params = [self.w, self.b]<\/div>\n<div>defset_inpt(self, inpt, inpt_dropout, mini_batch_size):<\/div>\n<div>self.inpt = inpt.reshape(self.image_shape)<\/div>\n<div>conv_out = conv.conv2d(<\/div>\n<div>input=self.inpt, filters=self.w, filter_shape=self.filter_shape,<\/div>\n<div>image_shape=self.image_shape)<\/div>\n<div># pooled_out = downsample.max_pool_2d(<\/div>\n<div># input=conv_out, ds=self.poolsize, ignore_border=True)<\/div>\n<div>pooled_out = pool_2d(<\/div>\n<div>input=conv_out, ds=self.poolsize, ignore_border=True)<\/div>\n<div>self.output =self.activation_fn(<\/div>\n<div>pooled_out + self.b.dimshuffle(&#8216;x&#8217;, 0, &#8216;x&#8217;, &#8216;x&#8217;))<\/div>\n<div>self.output_dropout =self.output # no dropout in the convolutional layers<\/div>\n<div>class FullyConnectedLayer(object):<\/div>\n<div>def__init__(self, n_in, n_out, activation_fn=sigmoid, p_dropout=0.0):<\/div>\n<div>self.n_in = n_in<\/div>\n<div>self.n_out = n_out<\/div>\n<div>self.activation_fn = activation_fn<\/div>\n<div>self.p_dropout = p_dropout<\/div>\n<div># Initialize weights and biases<\/div>\n<div>self.w = theano.shared(<\/div>\n<div>np.asarray(<\/div>\n<div>np.random.normal(<\/div>\n<div>loc=0.0, scale=np.sqrt(1.0\/n_out), size=(n_in, n_out)),<\/div>\n<div>dtype=theano.config.floatX),<\/div>\n<div>name=&#8217;w&#8217;, borrow=True)<\/div>\n<div>self.b = theano.shared(<\/div>\n<div>np.asarray(np.random.normal(loc=0.0, scale=1.0, size=(n_out,)),<\/div>\n<div>dtype=theano.config.floatX),<\/div>\n<div>name=&#8217;b&#8217;, borrow=True)<\/div>\n<div>self.params = [self.w, self.b]<\/div>\n<div>defset_inpt(self, inpt, inpt_dropout, mini_batch_size):<\/div>\n<div>self.inpt = inpt.reshape((mini_batch_size, self.n_in))<\/div>\n<div>self.output =self.activation_fn(<\/div>\n<div>(1-self.p_dropout)*T.dot(self.inpt, self.w) + self.b)<\/div>\n<div>self.y_out = T.argmax(self.output, axis=1)<\/div>\n<div>self.inpt_dropout = dropout_layer(<\/div>\n<div>inpt_dropout.reshape((mini_batch_size, self.n_in)), self.p_dropout)<\/div>\n<div>self.output_dropout =self.activation_fn(<\/div>\n<div>T.dot(self.inpt_dropout, self.w) + self.b)<\/div>\n<div>defaccuracy(self, y):<\/div>\n<div>&#8220;Return the accuracy for the mini-batch.&#8221;<\/div>\n<div>return T.mean(T.eq(y, self.y_out))<\/div>\n<div>class SoftmaxLayer(object):<\/div>\n<div>def__init__(self, n_in, n_out, p_dropout=0.0):<\/div>\n<div>self.n_in = n_in<\/div>\n<div>self.n_out = n_out<\/div>\n<div>self.p_dropout = p_dropout<\/div>\n<div># Initialize weights and biases<\/div>\n<div>self.w = theano.shared(<\/div>\n<div>np.zeros((n_in, n_out), dtype=theano.config.floatX),<\/div>\n<div>name=&#8217;w&#8217;, borrow=True)<\/div>\n<div>self.b = theano.shared(<\/div>\n<div>np.zeros((n_out,), dtype=theano.config.floatX),<\/div>\n<div>name=&#8217;b&#8217;, borrow=True)<\/div>\n<div>self.params = [self.w, self.b]<\/div>\n<div>defset_inpt(self, inpt, inpt_dropout, mini_batch_size):<\/div>\n<div>self.inpt = inpt.reshape((mini_batch_size, self.n_in))<\/div>\n<div>self.output = softmax((1-self.p_dropout)*T.dot(self.inpt, self.w) +self.b)<\/div>\n<div>self.y_out = T.argmax(self.output, axis=1)<\/div>\n<div>self.inpt_dropout = dropout_layer(<\/div>\n<div>inpt_dropout.reshape((mini_batch_size, self.n_in)), self.p_dropout)<\/div>\n<div>self.output_dropout = softmax(T.dot(self.inpt_dropout, self.w) +self.b)<\/div>\n<div>defcost(self, net):<\/div>\n<div>&#8220;Return the log-likelihood cost.&#8221;<\/div>\n<div>return-T.mean(T.log(self.output_dropout)[T.arange(net.y.shape[0]), net.y])<\/div>\n<div>defaccuracy(self, y):<\/div>\n<div>&#8220;Return the accuracy for the mini-batch.&#8221;<\/div>\n<div>return T.mean(T.eq(y, self.y_out))<\/div>\n<div>#### Miscellanea<\/div>\n<div>def size(data):<\/div>\n<div>&#8220;Return the size of the dataset `data`.&#8221;<\/div>\n<div>return data[0].get_value(borrow=True).shape[0]<\/div>\n<div>def dropout_layer(layer, p_dropout):<\/div>\n<div>srng = shared_randomstreams.RandomStreams(<\/div>\n<div>np.random.RandomState(0).randint(999999))<\/div>\n<div>mask = srng.binomial(n=1, p=1-p_dropout, size=layer.shape)<\/div>\n<div>return layer*T.cast(mask, theano.config.floatX)<\/div>\n<div>def test():<\/div>\n<div># expanded_training_data, _, _ = load_data_shared(&#8220;.\/mnist_expanded.pkl.gz&#8221;)<\/div>\n<div>training_data, validation_data, test_data = load_data_shared()<\/div>\n<div>mini_batch_size = 10<\/div>\n<div>net = Network( [<\/div>\n<div>ConvPoolLayer(image_shape=(mini_batch_size, 1, 28, 28), filter_shape=(20, 1, 5, 5), poolsize=(2, 2), activation_fn=ReLU),<\/div>\n<div>ConvPoolLayer(image_shape=(mini_batch_size, 20, 12, 12), filter_shape=(40, 20, 5, 5), poolsize=(2, 2), activation_fn=ReLU),<\/div>\n<div>FullyConnectedLayer(n_in=40*4*4, n_out=1000, activation_fn=ReLU, p_dropout=0.5),<\/div>\n<div>FullyConnectedLayer(n_in=1000, n_out=1000, activation_fn=ReLU, p_dropout=0.5),<\/div>\n<div>SoftmaxLayer(n_in=1000, n_out=10, p_dropout=0.5)<\/div>\n<div>], mini_batch_size)<\/div>\n<div># net.SGD(expanded_training_data, 40, mini_batch_size, 0.03, validation_data, test_data)<\/div>\n<div># net.SGD(expanded_training_data, 3, mini_batch_size, 0.03, validation_data, test_data)<\/div>\n<div>net.SGD(training_data, 3, mini_batch_size, 0.03, validation_data, test_data)<\/div>\n<div>if __name__ == &#8216;__main__&#8217;:<\/div>\n<div>test()<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>\u300a\u795e\u7ecf\u7f51\u7edc\u4e0e\u6df1\u5ea6\u5b66\u4e60\u300b\uff08Michael Nielsen\u8457\uff09\u4e2d\u7684\u4ee3\u7801\u662f\u57fa\u4e8epython2.7\u7684\uff0c\u4e0b\u9762\u4e3a\u79fb\u690d\u5230py &hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[443],"tags":[728,108,446,447],"class_list":["post-989","post","type-post","status-publish","format-standard","hentry","category-443","tag-network3","tag-python","tag-446","tag-447"],"views":1936,"_links":{"self":[{"href":"http:\/\/www.max-shu.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/989","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.max-shu.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.max-shu.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.max-shu.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.max-shu.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=989"}],"version-history":[{"count":3,"href":"http:\/\/www.max-shu.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/989\/revisions"}],"predecessor-version":[{"id":994,"href":"http:\/\/www.max-shu.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/989\/revisions\/994"}],"wp:attachment":[{"href":"http:\/\/www.max-shu.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=989"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.max-shu.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=989"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.max-shu.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=989"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}