keras로 간단하게 neural network 만들기

6 분 소요

keras가 뭔가요?

“Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano.” 라고 합니다.
보통 machine-learning을 한다고 하면, tensorflow를 많이 씁니다. 그런데 저처럼 약간 기존에 있는 것들을 활용하는 일이 훨씬 많다고 하면 꼭 tensorflow로 할 필요는 없는 것 같아요. tensorflow가 low-level이라면 keras는 high-level interface라서 기존에 있는 함수들을 그대로 이용할 수 있다는 장점? 정도가 있는 것 같아요.
물론 tensorflow에서도 이러한 인터페이스를 편하게 제공한다고 하던데, 저는 keras가 더 편하더라고요.

neural network를 만듭시다.

뉴럴넷을 만들어줄 때 고려해야 하는 것은 개별 component들을 어떻게 쌓을지, 그리고 몇 층이나 쌓을지, 정도라고 말할 수 있습니다. 정리하면 대략 다음과 같다고 말할수 있네요.
- 몇 층의 layer 를 쌓을 것인가? 각 층 별로 node의 수는 몇 개인가?
- activation function은 무엇을 쓸 것인가?
- optimizer는 무엇을 쓸 것인가?
- dropout은 할 것인가? 하면 어느 정도의 rate를 할 것인가?
분석할 데이터는 모두가 한번쯤은 해본 mnist를 사용할 예정입니다.
일단 대충 만들겁니다. 아래 사양을 갖춘 nn은 아래 코드를 활용해서 간단하게 만들어졌습니다.
- layer size: [784, 32, 10]
- activation function: relu
- optimizer: sgd
- dropout: nope

# import data 
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
x_train = mnist.train.images
y_train = mnist.train.labels

x_test = mnist.test.images
y_test = mnist.test.labels

# make neural network
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
from keras import metrics

model = Sequential([
    Dense(32, input_shape=(784,)),
    Activation('relu'),
    Dense(10),
    Activation('softmax'),
])

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=[metrics.categorical_accuracy])

model.fit(x_train, y_train, epochs=10, batch_size=500)

keras의 경우 fitting 과정에서 결과를 보여주는데, 지금 [784,32,10]의 뉴럴넷만으로 10 epoch만 했는데도 90%이상의 accuracy를 보이는 것을 알 수 있습니다. 물론 이것만으로는 overfitting 상황인지 아닌지 알 수 없지만, 아무튼 기분은 좋네요.
팁이지만, 많은 수업 들에서 overfitting을 방지해야 한다고 말하고 있습니다. 그런데, 일단은 train-data에 대해서 overfitting을 시켜보는 것이 중요합니다. 트레인데이터에 대해서도 맞추지 못하는데 test-data에 맞을 가능성은 없습니다.

Epoch 1/10
55000/55000 [==============================] - 1s - loss: 1.0616 - categorical_accuracy: 0.6935     
Epoch 2/10
55000/55000 [==============================] - 0s - loss: 0.4483 - categorical_accuracy: 0.8753     
Epoch 3/10
55000/55000 [==============================] - 0s - loss: 0.3720 - categorical_accuracy: 0.8956     
Epoch 4/10
55000/55000 [==============================] - 0s - loss: 0.3368 - categorical_accuracy: 0.9044     
Epoch 5/10
55000/55000 [==============================] - 0s - loss: 0.3139 - categorical_accuracy: 0.9106     
Epoch 6/10
55000/55000 [==============================] - 0s - loss: 0.2963 - categorical_accuracy: 0.9156     
Epoch 7/10
55000/55000 [==============================] - 0s - loss: 0.2818 - categorical_accuracy: 0.9193     
Epoch 8/10
55000/55000 [==============================] - 0s - loss: 0.2693 - categorical_accuracy: 0.9233     
Epoch 9/10
55000/55000 [==============================] - 0s - loss: 0.2582 - categorical_accuracy: 0.9268     
Epoch 10/10
55000/55000 [==============================] - 0s - loss: 0.2483 - categorical_accuracy: 0.9294     
53312/55000 [============================>.] - ETA: 0s0.931072727273
 8864/10000 [=========================>....] - ETA: 0s0.9344

아마 내부적으로 빠르게 accuracy를 측정해주는 방법이 있을텐데 나는 그냥 직접 코딩했다.
- 아래 verbose의 경우는 위처럼 progress bar 같은 프린트가 나타나는 것을 없애주는 것. 0을 넘기면 조용히 피팅하고 결과를 보여준다.
아무튼 이렇게 뉴럴넷을 만들어도 꽤 높은 정확도를 보여주는 것을 알 수 있다. 그렇다면 더 복잡하게 만들어보면 더 좋을까!

y_predict = model.predict_classes(x_train, verbose=0)
y_true = [ np.argmax(y) for y in y_train]
accuracy = np.sum([y_comp[0]==y_comp[1] for y_comp in zip(y_predict, y_true)])
print("train accuracy: {}".format(accuracy/len(y_predict)))

y_predict = model.predict_classes(x_test, verbose=0)
y_true = [ np.argmax(y) for y in y_test]
accuracy = np.sum([y_comp[0]==y_comp[1] for y_comp in zip(y_predict, y_true)])
print("test accuracy: {}".format(accuracy/len(y_predict)))

train accuracy: 0.9332181818181818
test accuracy: 0.9323

neural network를 복잡하게 만듭시다.

아래 뉴럴넷은 훨씬 복잡하게 만들었지만, 오히려 정확도가 70퍼센트 중반에 머무르는 것을 알 수 있다. 매 epoch마다 더 빠르게 올라가는 것이 아니라 오름폭이 감소하는데, 레이어 사이즈들을 바꿔주거나,

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
from keras import metrics
import numpy as np

model = Sequential([
    Dense(250, input_shape=(784,)),
    Activation('relu'),
    Dense(125),
    Activation('relu'),
    Dense(50),
    Activation('relu'),
    Dense(10),
    Activation('relu'),
    Activation('softmax'),
])

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=[metrics.categorical_accuracy])

model.fit(x_train, y_train, epochs=10, batch_size=500)


y_predict = model.predict_classes(x_train, verbose=0)
y_true = [ np.argmax(y) for y in y_train]
accuracy = np.sum([y_comp[0]==y_comp[1] for y_comp in zip(y_predict, y_true)])
print("train accuracy: {}".format(accuracy/len(y_predict)))

y_predict = model.predict_classes(x_test, verbose=0)
y_true = [ np.argmax(y) for y in y_test]
accuracy = np.sum([y_comp[0]==y_comp[1] for y_comp in zip(y_predict, y_true)])
print("test accuracy: {}".format(accuracy/len(y_predict)))

Epoch 1/10
55000/55000 [==============================] - 3s - loss: 1.4708 - categorical_accuracy: 0.5451     
Epoch 2/10
55000/55000 [==============================] - 2s - loss: 0.7717 - categorical_accuracy: 0.7314     
Epoch 3/10
55000/55000 [==============================] - 2s - loss: 0.6963 - categorical_accuracy: 0.7456     
Epoch 4/10
55000/55000 [==============================] - 2s - loss: 0.6606 - categorical_accuracy: 0.7538     
Epoch 5/10
55000/55000 [==============================] - 2s - loss: 0.6372 - categorical_accuracy: 0.7600     
Epoch 6/10
55000/55000 [==============================] - 2s - loss: 0.6190 - categorical_accuracy: 0.7643     
Epoch 7/10
55000/55000 [==============================] - 2s - loss: 0.6039 - categorical_accuracy: 0.7677     
Epoch 8/10
55000/55000 [==============================] - 2s - loss: 0.5913 - categorical_accuracy: 0.7705     
Epoch 9/10
55000/55000 [==============================] - 2s - loss: 0.5811 - categorical_accuracy: 0.7732     
Epoch 10/10
55000/55000 [==============================] - 3s - loss: 0.5719 - categorical_accuracy: 0.7753     
train accuracy: 0.7769090909090909
test accuracy: 0.7744

하지만, 재밌는 것은 위의 코드를 다시 돌리면 아래처럼, 정확도가 훨씬 높게 나온다.
다양한 이유가 있지만, 이를 방지하기 위해서는 dropout을 추가하거나, weight init을 하는 방식들이 있을 수 있다(그냥 좋은 컴퓨터만 있으면 여러 모델을 여러 번 돌려서 그중에 제일 나은 놈을 찾으면 된다 소근소근)
“뉴럴넷은 늘 이거 되기는 되는데 왜 되는지 모르게어요” 같은 느낌이 있다.

Epoch 1/10
55000/55000 [==============================] - 3s - loss: 1.2598 - categorical_accuracy: 0.6334     
Epoch 2/10
55000/55000 [==============================] - 2s - loss: 0.5708 - categorical_accuracy: 0.8192     
Epoch 3/10
55000/55000 [==============================] - 2s - loss: 0.3857 - categorical_accuracy: 0.8819     
Epoch 4/10
55000/55000 [==============================] - 2s - loss: 0.2486 - categorical_accuracy: 0.9283     
Epoch 5/10
55000/55000 [==============================] - 2s - loss: 0.2127 - categorical_accuracy: 0.9386     
Epoch 6/10
55000/55000 [==============================] - 3s - loss: 0.1863 - categorical_accuracy: 0.9466     
Epoch 7/10
55000/55000 [==============================] - 3s - loss: 0.1652 - categorical_accuracy: 0.9522     
Epoch 8/10
55000/55000 [==============================] - 3s - loss: 0.1480 - categorical_accuracy: 0.9573     
Epoch 9/10
55000/55000 [==============================] - 3s - loss: 0.1341 - categorical_accuracy: 0.9612     
Epoch 10/10
55000/55000 [==============================] - 3s - loss: 0.1220 - categorical_accuracy: 0.9646     
train accuracy: 0.9677636363636364
test accuracy: 0.9621

convolution neural network

mnist는 이미지이며, 이미지 분류를 위해서는 cnn을 쓰는 것이 좋다.
그냥 nn에서는 28 by 28 매트릭스를 flattern하여 784 사이즈의 어레이로 고려하여 진행했으나, cnn에서는 이미지를 filter로 찍어야 하므로 이를 다시 reshape 하는 것이 필요하다.

from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
x_train = mnist.train.images
y_train = mnist.train.labels

x_test = mnist.test.images
y_test = mnist.test.labels

x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Conv2D, Flatten, MaxPooling2D
from keras.optimizers import SGD
from keras import metrics
import numpy as np

model = Sequential()
model.add(Conv2D(32, kernel_size=(5, 5), strides=(1, 1), activation='relu', input_shape=(28,28,1)))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Conv2D(64, (5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(10, activation='softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=[metrics.categorical_accuracy])
model.fit(x_train, y_train, epochs=10, batch_size=500)


y_predict = model.predict_classes(x_train, verbose=0)
y_true = [ np.argmax(y) for y in y_train]
accuracy = np.sum([y_comp[0]==y_comp[1] for y_comp in zip(y_predict, y_true)])
print("train accuracy: {}".format(accuracy/len(y_predict)))

y_predict = model.predict_classes(x_test, verbose=0)
y_true = [ np.argmax(y) for y in y_test]
accuracy = np.sum([y_comp[0]==y_comp[1] for y_comp in zip(y_predict, y_true)])
print("test accuracy: {}".format(accuracy/len(y_predict)))

게산 시간은 상당히 오래 걸리지만(10 epoch를 돌리는데, 10분 이상), 그래도 nn에 비해서 월등히 정확도가 높아진 것을 알 수 있다. 물론 그냥 nn의 경우도 오래 돌렸다면 꽤 높아질 수는 있었을 거긴 함….

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
Epoch 1/10
55000/55000 [==============================] - 77s - loss: 0.9359 - categorical_accuracy: 0.7145    
Epoch 2/10
55000/55000 [==============================] - 79s - loss: 0.1680 - categorical_accuracy: 0.9502    
Epoch 3/10
55000/55000 [==============================] - 74s - loss: 0.1129 - categorical_accuracy: 0.9659    
Epoch 4/10
55000/55000 [==============================] - 72s - loss: 0.0886 - categorical_accuracy: 0.9738    
Epoch 5/10
55000/55000 [==============================] - 78s - loss: 0.0742 - categorical_accuracy: 0.9779    
Epoch 6/10
55000/55000 [==============================] - 74s - loss: 0.0651 - categorical_accuracy: 0.9803    
Epoch 7/10
55000/55000 [==============================] - 80s - loss: 0.0580 - categorical_accuracy: 0.9821    
Epoch 8/10
55000/55000 [==============================] - 72s - loss: 0.0515 - categorical_accuracy: 0.9842    
Epoch 9/10
55000/55000 [==============================] - 73s - loss: 0.0471 - categorical_accuracy: 0.9856    
Epoch 10/10
55000/55000 [==============================] - 79s - loss: 0.0439 - categorical_accuracy: 0.9867    
train accuracy: 0.9882363636363637
test accuracy: 0.9874

reference

http://adventuresinmachinelearning.com/keras-tutorial-cnn-11-lines/

Twitter Facebook LinkedIn

frhyme

keras로 간단하게 neural network 만들기

keras가 뭔가요?

neural network를 만듭시다.

neural network를 복잡하게 만듭시다.

convolution neural network

reference

공유하기

댓글남기기