Understanding Logistic Regression and Implementing It in Code
This post implements in code the estimator obtained by maximum likelihood estimation under a binomial (per-sample Bernoulli) distribution.
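For reference, here is a short sketch of the derivation the code below follows (the notation is mine, not from the original post): with labels t_i in {0, 1} and p_i = sigmoid(x_i b), maximizing the likelihood is the same as minimizing the averaged binary cross-entropy that Last_loss computes, and the gradient on the last line is what the Sigmoid and Affine backward passes chain together.

\[
\begin{aligned}
L(b) &= \prod_{i=1}^{N} p_i^{\,t_i}\,(1-p_i)^{\,1-t_i},
\qquad p_i = \sigma(x_i b) = \frac{1}{1 + e^{-x_i b}},\\
\ell(b) &= -\frac{1}{N}\log L(b)
       = -\frac{1}{N}\sum_{i=1}^{N}\bigl[\,t_i \log p_i + (1-t_i)\log(1-p_i)\,\bigr],\\
\nabla_b\,\ell(b) &= \frac{1}{N}\sum_{i=1}^{N}(p_i - t_i)\,x_i^{\top}
                 = \frac{1}{N}\,X^{\top}(p - t).
\end{aligned}
\]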
On this dataset its accuracy matches the LogisticRegression provided by sklearn (a quick cross-check is sketched after the implementation below).
Open question: after estimation, hypotheses are tested with a Z-test, but the probability estimated under the binomial distribution is p = sigmoid(xb). Is it reasonable to assume that the weight b follows a standard normal distribution?
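For reference (this is the standard large-sample argument, not something stated in the original post): the Z-test used here is usually the Wald test, which does not assume that b itself is standard normal. It relies on the asymptotic normality of the maximum likelihood estimator, with the standard error taken from the inverse Fisher information:

\[
\hat{b} \;\approx\; \mathcal{N}\!\bigl(b,\;(X^{\top} W X)^{-1}\bigr),
\qquad W = \operatorname{diag}\bigl(\hat{p}_i(1-\hat{p}_i)\bigr),
\qquad z_j = \frac{\hat{b}_j}{\widehat{\operatorname{SE}}(\hat{b}_j)} \;\sim\; \mathcal{N}(0,1)\ \text{under } H_0:\ b_j = 0 .
\]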
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from collections import OrderedDict

# Use only the first two features and the first 100 samples (classes 0 and 1).
data = load_iris()
X = data['data'][:, :2]
Y = data['target']
X = np.array(X[:100])
Y = np.reshape(np.array(Y[:100]), (-1, 1))

"""
clf = LogisticRegression()
clf.fit(X, Y)
print(clf.score(X, Y))
print("predict", clf.predict(X))
print("label  ", Y)
"""

def sigmoid(x):
    return 1. / (1. + np.exp(-x))

class Sigmoid:
    def __init__(self):
        self.out = None

    def forward(self, x):
        out = sigmoid(x)
        self.out = out
        return out

    def backward(self, dout):
        dx = dout * self.out * (1.0 - self.out)
        return dx

class Affine:
    def __init__(self, W, b):
        self.W = W
        self.b = b
        self.original_x_shape = None
        self.x = None
        self.dW = None
        self.db = None

    def forward(self, x):
        self.original_x_shape = x.shape
        x = x.reshape(x.shape[0], -1)
        self.x = x
        out = np.dot(self.x, self.W) + self.b
        return out

    def backward(self, dout):
        dx = np.dot(dout, self.W.T)
        self.dW = np.dot(self.x.T, dout)
        # np.sum (not np.mean) is correct here: the bias is shared across the
        # batch, and Last_loss.backward already divides by the batch size, so
        # averaging again would divide by N twice.
        self.db = np.sum(dout, axis=0)
        dx = dx.reshape(*self.original_x_shape)
        return dx

class Last_loss:
    def __init__(self):
        self.x = None
        self.t = None
        self.loss = None

    def forward(self, x, t):
        self.x = np.reshape(x, (-1, 1))
        t = np.reshape(t, (-1, 1))
        self.t = t
        # averaged binary cross-entropy
        self.loss = (-np.sum(t * np.log(self.x))
                     - np.sum((1. - t) * np.log(1. - self.x))) / self.t.shape[0]
        return self.loss

    def backward(self, dout=1):
        size = self.t.shape[0]
        dx = (((self.t - 1) / (self.x - 1)) - (self.t / self.x)) / size
        return dx

class logistic_regression:
    def __init__(self):
        self.params = OrderedDict()
        # initialize weights
        self.__init_weight()
        # build the layers
        self.layers = OrderedDict()
        self.layers['Affine1'] = Affine(self.params['W1'], self.params['b1'])
        self.layers['Sigmoid1'] = Sigmoid()
        self.last_layer = Last_loss()

    def __init_weight(self):
        self.params['W1'] = np.random.randn(2, 1)
        self.params['b1'] = np.zeros(1)

    def predict(self, x):
        for layer in self.layers.values():
            x = layer.forward(x)
        return x

    def loss(self, x, t):
        y = self.predict(x)
        return self.last_layer.forward(y, t)

    def accuracy(self, x, t):
        t = np.array(np.reshape(t, (-1, 1)), dtype=np.float32)
        y = self.predict(x)
        y_copy = np.zeros_like(y)
        y_copy[np.where(y > 0.5)] = 1.
        accuracy = np.sum(y_copy == t) / float(len(x))
        return accuracy

    def gradient(self, x, t):
        # forward
        self.loss(x, t)

        # backward
        dout = 1
        dout = self.last_layer.backward(dout)

        layers = list(self.layers.values())  # convert to a list so the order can be reversed
        layers.reverse()
        for layer in layers:
            dout = layer.backward(dout)

        # collect the gradients
        grads = OrderedDict()
        grads['W1'] = self.layers['Affine1'].dW
        grads['b1'] = self.layers['Affine1'].db

        return grads

lr = logistic_regression()
key_list = ['W1', 'b1']
print("loss:", lr.loss(X, Y))

mod_num = 100
iters_num = 1000
for i in range(iters_num):
    grad = lr.gradient(X, Y)

    # gradient descent update (learning rate 0.1)
    for key in key_list:
        lr.params[key] -= 0.1 * grad[key]

    if i % mod_num == 0:
        loss = lr.loss(X, Y)
        print(i, "loss:", loss)

P_x = np.zeros_like(Y)
P_x1 = (lr.predict(X) > 0.5)
P_x[P_x1] = 1
print("predict", np.reshape(P_x, (-1,)))
print("label", np.reshape(Y, (-1,)))
print("acc", lr.accuracy(X, Y))
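To back up the sklearn comparison mentioned earlier, here is a minimal cross-check sketch. It reuses X, Y, and the trained lr from the code above; LogisticRegression(C=1e10) is my assumption, used to effectively switch off sklearn's default L2 regularization so the fit is closer to the plain MLE implemented here.

from sklearn.linear_model import LogisticRegression

# C=1e10 is an assumption: sklearn regularizes by default (C=1.0),
# while the hand-written model above is unregularized.
clf = LogisticRegression(C=1e10)
clf.fit(X, Y.ravel())            # sklearn expects 1-D labels

print("sklearn acc :", clf.score(X, Y.ravel()))
print("ours    acc :", lr.accuracy(X, Y))
print("sklearn coef:", clf.coef_.ravel(), clf.intercept_)
print("ours    coef:", lr.params['W1'].ravel(), lr.params['b1'])

On this two-class subset of iris, both models should reach the same training accuracy; the coefficients themselves need not match exactly, since the classes are close to linearly separable and the two optimizers stop at different points.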
Results
