
Commit c45ddf6

Mark Linderman authored and committed
ex4 part 1 solution
1 parent c351093 commit c45ddf6

File tree

2 files changed: +30 −10 lines changed

machine-learning-ex3/ex3/predict.m

+7-2
@@ -36,8 +36,9 @@
 
 % had to ponder this for quite some time and then I
 % had to write it down to visualize what I wanted:
-% basically, you're creating a new X with
-% 25 rows and 401 columns (from 5000 rows and 401 columns)
+% basically, you're creating a new X (a2) with
+% 5000 rows and 25 columns (from 5000 rows and 401 columns)
+% and then a new X (a3) with 5000 rows and 10 cols (the outputs)
 
 % Theta1:
 % 43, 54, 65, 67, 86 ........... 401
@@ -55,6 +56,10 @@
 % .
 % 5000
 
+% so, to line up Theta1 and X for multiplication,
+% transpose X so that the 1st Theta1 row will match up with
+% the X columns
+
 % seems the desired result will reduce
 % the inputs from 401 features down to 25 (the number of inputs).
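The dimension bookkeeping in these comments is easy to check outside Octave. A minimal NumPy sketch of the same forward pass (not the course code; the shapes come from the comments above, and the random weights are a stand-in for the trained Theta matrices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

m = 5000                               # number of training examples
X = np.random.rand(m, 400)             # 400 raw input features
Theta1 = np.random.rand(25, 401)       # hidden-layer weights (25 units)
Theta2 = np.random.rand(10, 26)        # output-layer weights (10 labels)

a1 = np.hstack([np.ones((m, 1)), X])   # add bias column -> 5000 x 401
a2 = sigmoid(a1 @ Theta1.T)            # 5000 x 25, as the comment says
a2 = np.hstack([np.ones((m, 1)), a2])  # add bias column -> 5000 x 26
a3 = sigmoid(a2 @ Theta2.T)            # 5000 x 10 (the outputs)

print(a1.shape, a2.shape, a3.shape)
```

Writing `a1 @ Theta1.T` is the NumPy equivalent of `a1*transpose(Theta1)` in the Octave code: it lines each Theta1 row (401 weights) up against each example's 401 features.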

machine-learning-ex4/ex4/nnCostFunction.m

+23-8
@@ -23,6 +23,7 @@
 num_labels, (hidden_layer_size + 1));
 size(Theta1)
 size(Theta2)
+size(X)
 
 % Setup some useful variables
 m = size(X, 1);
@@ -64,25 +65,39 @@
 % and Theta2_grad from Part 2.
 %
 
-% add bias column of 1's to Theta1 and Theta2
-a1 = [ones(m,1), X];
-a2 = [ones(m,1), sigmoid(a1*transpose(Theta1))];
-a3 = sigmoid(a2*transpose(Theta2));
+% add bias column of 1's to X and call it a1
+% a1 = 5000 x 401
+% Theta1 = 25 x 401
+% Theta2 = 10 x 26
+
+% Note: multiplication of the Theta matrices with a(n) matrices
+% can be done two ways, the one implemented below requiring
+% fewer transpositions
 
-% compute the cost by looping over a3 values and accumulating the costs for each output node (10 in this case), for each y
+%a1 = [ones(m,1), X];
+%a2 = [ones(m,1), transpose(sigmoid(Theta1*transpose(a1)))];
+%a3 = transpose(sigmoid(Theta2*transpose(a2)));
+
+a1 = [ones(m,1), X];
+a2 = [ones(m,1), sigmoid(a1 * transpose(Theta1))];
+a3 = sigmoid(a2 * transpose(Theta2));
+size(a2)
+size(a3)
+size(y)
+
+% compute the cost by looping over a3 values and accumulating the
+% costs for each output node (10 in this case), for each y
 for i = 1:m
 % map y values to a row vector of num_labels length so that you can compare to output nodes while computing cost
 yvector = zeros(1,num_labels);
 yvector(1, y(i)) = 1;
 for j = 1:num_labels
-% log(a3(j))
 % yvector and a3 should both be a row vector of size num_labels
-J += -yvector(j)*log(a3(j)) - (1 - yvector(j))*log(1 - a3(j));
+J += -yvector(j)*log(a3(i,j)) - (1 - yvector(j))*log(1 - a3(i,j));
 end
 end
 
 J = (1/m) * J;
-
 