
From building a DNN model to saving and using it

LeeInGyu 2022. 3. 2. 04:55
'''
Formation classifier based on FIFA21 player stats
File loading step
'''

import numpy as np
import pandas as pd

csv_file = pd.read_csv('FIFA21_official_data.csv')
csv_file.head(5)

      ID             Name  Age  Nationality  Overall  Potential               Club  ...  Best Position  Best Overall Rating  Release Clause  DefensiveAwareness
0  176580        L. Suárez   33      Uruguay       87         87    Atlético Madrid  ...             ST                 87.0          €64.6M                57.0
1  192985     K. De Bruyne   29      Belgium       91         91    Manchester City  ...            CAM                 91.0           €161M                68.0
2  212198  Bruno Fernandes   25     Portugal       87         90  Manchester United  ...            CAM                 88.0         €124.4M                72.0
3  194765     A. Griezmann   29       France       87         87       FC Barcelona  ...             ST                 87.0         €103.5M                59.0
4  224334         M. Acuña   28    Argentina       83         83         Sevilla FC  ...             LB                 83.0          €46.2M                79.0

5 rows × 65 columns (image URL columns and most per-skill stat columns truncated with "...")
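
Before dropping columns and rows, it can help to see how much is actually missing, so the effect of the dropna step below is predictable. A quick check (a sketch, not part of the original notebook):

# count missing values per column and show the worst offenders
missing = csv_file.isna().sum()
missing[missing > 0].sort_values(ascending=False).head(10)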

'''
Drop unneeded columns and remove rows with missing values
'''

data = csv_file.drop([
    'ID','Name','Age','Photo','Nationality','Flag','Overall','Potential','Club','Club Logo',
    'Value','Wage','Special','Preferred Foot','International Reputation','Work Rate','Body Type',
    'Real Face','Position','Jersey Number','Loaned From','Contract Valid Until','Height','Weight',
    'Weak Foot','Skill Moves','Joined',
    'Best Overall Rating','Release Clause','DefensiveAwareness',
    'Marking'
], axis=1)
data = data.dropna(axis=0)
data.shape
(16821, 34)
'''
Set the X (features) and y (target) values
'''

X = data.drop(['Best Position'], axis=1)
category_y = data['Best Position']

X = X.astype('float')
category_y = category_y.astype('category')

X.shape, category_y.shape
((16821, 33), (16821,))
'''
Convert the category labels to integers for one-hot encoding
'''

# map each position label to an integer index
category = dict((c, i) for i, c in enumerate(set(category_y)))

y = [category[i] for i in category_y]
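
One caveat with the mapping above: iterating over a Python set has no guaranteed order, so the label-to-integer assignment can change between runs. A reproducible variant (a sketch, not what the original notebook uses) sorts the labels first:

# deterministic alternative: sort the position labels before numbering them
category = {c: i for i, c in enumerate(sorted(set(category_y)))}
y = [category[c] for c in category_y]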
'''
Data preprocessing step
X: MinMax scaling
y: one-hot encoding
'''

import sklearn.preprocessing
import tensorflow

ScaleType = sklearn.preprocessing.MinMaxScaler()
num_classes = len(set(y))

X_standardized = ScaleType.fit_transform(X)
y_onehotencoding = tensorflow.keras.utils.to_categorical(y, num_classes)

X_standardized.shape, y_onehotencoding.shape
((16821, 33), (16821, 15))
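
MinMaxScaler rescales each feature to the [0, 1] range via (x - min) / (max - min), and to_categorical turns each integer label into a one-hot row. A quick sanity check (sketch):

# every scaled feature should lie in [0, 1]; every one-hot row should sum to 1
X_standardized.min(), X_standardized.max(), y_onehotencoding.sum(axis=1)[:5]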
'''
Data split step
Training data: 64%
Test data: 16%
Validation data: 20%
'''

from sklearn.model_selection import train_test_split

X_train, X_val, y_train, y_val = train_test_split(X_standardized, y_onehotencoding, stratify=y_onehotencoding, test_size=0.2)
X_train, X_test, y_train, y_test = train_test_split(X_train, y_train, stratify=y_train, test_size=0.2)
X_train.shape, X_test.shape, X_val.shape, y_train.shape, y_test.shape, y_val.shape
((10764, 33), (2692, 33), (3365, 33), (10764, 15), (2692, 15), (3365, 15))
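
The two chained splits give 0.8 × 0.8 = 64% training, 0.8 × 0.2 = 16% test, and 20% validation data, matching the shares listed above. Because both calls pass stratify, the position distribution should be similar across the three splits, which can be checked roughly (sketch):

# per-class proportions in each split; with stratify these should be close
np.round(y_train.mean(axis=0), 3), np.round(y_test.mean(axis=0), 3), np.round(y_val.mean(axis=0), 3)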
'''
Build the DNN model
Hyperparameters: number of nodes, number of hidden layers, learning rate, dropout, activation function
'''

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras import optimizers

def model_build(node, layers, lr, dropout, activation):
    model = Sequential()
    # first layer: one unit per feature (33 stat columns)
    model.add(Dense(X.shape[1]))
    # hidden layers, each optionally followed by dropout
    for _ in range(layers):
        model.add(Dense(node, activation=activation))
        if dropout:
            model.add(Dropout(dropout))
    # softmax output over the 15 position classes
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(
        loss='categorical_crossentropy',
        optimizer=optimizers.Adam(lr),
        metrics=['accuracy'])
    return model
'''
Wrap the model-building function in a KerasClassifier object
'''

from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

model = KerasClassifier(build_fn=model_build, verbose=0)
'''
Set the parameter grid for the grid search
'''

params_grid = {
    'node' : [32,64,128],
    'layers' : [1,2],
    'lr' : [0.005,0.001,0.005],
    'dropout' : [0],
    'epochs' : [100,200],
    'batch_size' : [0],
    'activation' : ['relu','sigmoid']
}
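
This grid has 3 × 2 × 3 × 1 × 2 × 1 × 2 = 72 combinations, and GridSearchCV uses 5-fold cross-validation by default, which is why the log below reports 360 fits. Note that 0.005 appears twice in the lr list, so only two distinct learning rates are actually explored even though three are counted. A quick sanity check (sketch):

# number of parameter combinations and total fits (5 folds by default)
n_candidates = np.prod([len(v) for v in params_grid.values()])
n_candidates, n_candidates * 5   # -> (72, 360)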
'''
Run the grid search
Use the warnings library to suppress warning output
'''

from sklearn.model_selection import GridSearchCV
import warnings

warnings.filterwarnings(action='ignore')
# warnings.filterwarnings(action='default')

grid_search = GridSearchCV(
    estimator=model,
    param_grid=params_grid,
    verbose=1,
    n_jobs=16,
)

grid_results = grid_search.fit(X_train, y_train)
Fitting 5 folds for each of 72 candidates, totalling 360 fits


[Parallel(n_jobs=16)]: Using backend LokyBackend with 16 concurrent workers.
[Parallel(n_jobs=16)]: Done  18 tasks      | elapsed:  6.7min
[Parallel(n_jobs=16)]: Done 168 tasks      | elapsed: 55.1min
[Parallel(n_jobs=16)]: Done 360 out of 360 | elapsed: 116.5min finished
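
Beyond best_params_, the full cross-validation results are available in cv_results_; one way to look at the top-ranked candidates (a sketch):

# inspect the best-scoring parameter combinations
cv_df = pd.DataFrame(grid_results.cv_results_)
cv_df.sort_values('rank_test_score')[['params', 'mean_test_score', 'std_test_score']].head()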
'''
Extract the best parameters
'''

best_node = grid_results.best_params_['node']
best_layers = grid_results.best_params_['layers']
best_lr = grid_results.best_params_['lr']
best_dropout = grid_results.best_params_['dropout']
best_epochs = grid_results.best_params_['epochs']
best_activation = grid_results.best_params_['activation']
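
Since model_build's argument names match the grid keys, the same call can also be written with dictionary unpacking (an equivalent sketch of the explicit call below):

# equivalent: feed the tuned parameters straight into model_build
build_params = {k: grid_results.best_params_[k]
                for k in ('node', 'layers', 'lr', 'dropout', 'activation')}
model_build(**build_params)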
'''
Create best_model with the best parameters
Train it on the validation data
'''

best_model = model_build(best_node, best_layers, best_lr, best_dropout, best_activation)
result = best_model.fit(X_val, y_val, epochs=best_epochs)
Epoch 1/100
106/106 [==============================] - 0s 1ms/step - loss: 2.3050 - accuracy: 0.2695
Epoch 2/100
106/106 [==============================] - 0s 1ms/step - loss: 1.8991 - accuracy: 0.4556
Epoch 3/100
106/106 [==============================] - 0s 1ms/step - loss: 1.5970 - accuracy: 0.5040
Epoch 4/100
106/106 [==============================] - 0s 1ms/step - loss: 1.3935 - accuracy: 0.5468
Epoch 5/100
106/106 [==============================] - 0s 1ms/step - loss: 1.2721 - accuracy: 0.5774
Epoch 6/100
106/106 [==============================] - 0s 1ms/step - loss: 1.1929 - accuracy: 0.5947
Epoch 7/100
106/106 [==============================] - 0s 1ms/step - loss: 1.1414 - accuracy: 0.6006
Epoch 8/100
106/106 [==============================] - 0s 1ms/step - loss: 1.1003 - accuracy: 0.6196
Epoch 9/100
106/106 [==============================] - 0s 1ms/step - loss: 1.0683 - accuracy: 0.6241
Epoch 10/100
106/106 [==============================] - 0s 1ms/step - loss: 1.0383 - accuracy: 0.6440
Epoch 11/100
106/106 [==============================] - 0s 1ms/step - loss: 1.0129 - accuracy: 0.6443
Epoch 12/100
106/106 [==============================] - 0s 1ms/step - loss: 0.9855 - accuracy: 0.6505
Epoch 13/100
106/106 [==============================] - 0s 1ms/step - loss: 0.9634 - accuracy: 0.6588
Epoch 14/100
106/106 [==============================] - 0s 1ms/step - loss: 0.9381 - accuracy: 0.6633
Epoch 15/100
106/106 [==============================] - 0s 1ms/step - loss: 0.9164 - accuracy: 0.6722
Epoch 16/100
106/106 [==============================] - 0s 1ms/step - loss: 0.8993 - accuracy: 0.6764
Epoch 17/100
106/106 [==============================] - 0s 1ms/step - loss: 0.8786 - accuracy: 0.6802
Epoch 18/100
106/106 [==============================] - 0s 1ms/step - loss: 0.8667 - accuracy: 0.6826
Epoch 19/100
106/106 [==============================] - 0s 1ms/step - loss: 0.8523 - accuracy: 0.6900
Epoch 20/100
106/106 [==============================] - 0s 1ms/step - loss: 0.8423 - accuracy: 0.6939
Epoch 21/100
106/106 [==============================] - 0s 1ms/step - loss: 0.8240 - accuracy: 0.6978
Epoch 22/100
106/106 [==============================] - 0s 1ms/step - loss: 0.8123 - accuracy: 0.7034
Epoch 23/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7989 - accuracy: 0.7079
Epoch 24/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7906 - accuracy: 0.7025
Epoch 25/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7804 - accuracy: 0.7031
Epoch 26/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7711 - accuracy: 0.7135
Epoch 27/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7658 - accuracy: 0.7040
Epoch 28/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7582 - accuracy: 0.7126
Epoch 29/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7522 - accuracy: 0.7168
Epoch 30/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7431 - accuracy: 0.7207
Epoch 31/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7389 - accuracy: 0.7198
Epoch 32/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7378 - accuracy: 0.7156
Epoch 33/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7262 - accuracy: 0.7227
Epoch 34/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7238 - accuracy: 0.7239
Epoch 35/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7173 - accuracy: 0.7218
Epoch 36/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7149 - accuracy: 0.7239
Epoch 37/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7110 - accuracy: 0.7272
Epoch 38/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7098 - accuracy: 0.7212
Epoch 39/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7036 - accuracy: 0.7272
Epoch 40/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7033 - accuracy: 0.7260
Epoch 41/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6995 - accuracy: 0.7290
Epoch 42/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6968 - accuracy: 0.7287
Epoch 43/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6939 - accuracy: 0.7284
Epoch 44/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6963 - accuracy: 0.7239
Epoch 45/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6848 - accuracy: 0.7367
Epoch 46/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6888 - accuracy: 0.7325
Epoch 47/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6822 - accuracy: 0.7314
Epoch 48/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6815 - accuracy: 0.7340
Epoch 49/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6769 - accuracy: 0.7311
Epoch 50/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6747 - accuracy: 0.7367
Epoch 51/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6776 - accuracy: 0.7340
Epoch 52/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6694 - accuracy: 0.7364
Epoch 53/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6712 - accuracy: 0.7412
Epoch 54/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6698 - accuracy: 0.7379
Epoch 55/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6685 - accuracy: 0.7376
Epoch 56/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6684 - accuracy: 0.7376
Epoch 57/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6707 - accuracy: 0.7406
Epoch 58/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6618 - accuracy: 0.7388
Epoch 59/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6586 - accuracy: 0.7438
Epoch 60/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6574 - accuracy: 0.7453
Epoch 61/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6636 - accuracy: 0.7364
Epoch 62/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6543 - accuracy: 0.7471
Epoch 63/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6576 - accuracy: 0.7447
Epoch 64/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6529 - accuracy: 0.7423
Epoch 65/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6496 - accuracy: 0.7453
Epoch 66/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6479 - accuracy: 0.7465
Epoch 67/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6451 - accuracy: 0.7426
Epoch 68/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6445 - accuracy: 0.7474
Epoch 69/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6430 - accuracy: 0.7453
Epoch 70/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6367 - accuracy: 0.7471
Epoch 71/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6386 - accuracy: 0.7498
Epoch 72/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6418 - accuracy: 0.7462
Epoch 73/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6393 - accuracy: 0.7453
Epoch 74/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6376 - accuracy: 0.7530
Epoch 75/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6360 - accuracy: 0.7501
Epoch 76/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6318 - accuracy: 0.7483
Epoch 77/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6346 - accuracy: 0.7465
Epoch 78/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6352 - accuracy: 0.7516
Epoch 79/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6363 - accuracy: 0.7504
Epoch 80/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6283 - accuracy: 0.7477
Epoch 81/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6367 - accuracy: 0.7486
Epoch 82/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6242 - accuracy: 0.7530
Epoch 83/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6216 - accuracy: 0.7554
Epoch 84/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6288 - accuracy: 0.7519
Epoch 85/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6340 - accuracy: 0.7483
Epoch 86/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6267 - accuracy: 0.7510
Epoch 87/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6196 - accuracy: 0.7519
Epoch 88/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6165 - accuracy: 0.7614
Epoch 89/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6272 - accuracy: 0.7578
Epoch 90/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6182 - accuracy: 0.7554
Epoch 91/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6200 - accuracy: 0.7525
Epoch 92/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6135 - accuracy: 0.7584
Epoch 93/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6100 - accuracy: 0.7599
Epoch 94/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6157 - accuracy: 0.7608
Epoch 95/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6212 - accuracy: 0.7548
Epoch 96/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6082 - accuracy: 0.7620
Epoch 97/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6127 - accuracy: 0.7587
Epoch 98/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6132 - accuracy: 0.7590
Epoch 99/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6094 - accuracy: 0.7605
Epoch 100/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6068 - accuracy: 0.7581
'''
Summary of the best model
'''

best_model.summary()
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_9 (Dense)              (None, 33)                1122      
_________________________________________________________________
dense_10 (Dense)             (None, 64)                2176      
_________________________________________________________________
dense_11 (Dense)             (None, 15)                975       
=================================================================
Total params: 4,273
Trainable params: 4,273
Non-trainable params: 0
_________________________________________________________________
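
Each Dense layer's parameter count is inputs × units + units for the biases: 33 × 33 + 33 = 1,122 for dense_9, 33 × 64 + 64 = 2,176 for dense_10, and 64 × 15 + 15 = 975 for the softmax output, which adds up to the 4,273 total shown above.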
'''
Print a summary of all the results
'''

from sklearn import metrics

# accuracy of the grid-search best estimator on the training split
train_pred = grid_results.predict(X_train)
train_pred = tensorflow.keras.utils.to_categorical(train_pred, num_classes)
rst_train = metrics.accuracy_score(y_train, train_pred)
rst_train = round(rst_train*100, 2)

# accuracy of the grid-search best estimator on the test split
test_pred = grid_results.predict(X_test)
test_pred = tensorflow.keras.utils.to_categorical(test_pred, num_classes)
rst_test = metrics.accuracy_score(y_test, test_pred)
rst_test = round(rst_test*100, 2)

# final-epoch accuracy of best_model on the validation split
rst_val = result.history['accuracy'][-1]
rst_val = round(rst_val*100, 2)

print(f'Training data accuracy: {rst_train}%')
print(f'Test data accuracy: {rst_test}%')
print(f'Validation data accuracy: {rst_val}%')
print(f'Best parameters: {grid_results.best_params_}')
Training data accuracy: 77.51%
Test data accuracy: 76.49%
Validation data accuracy: 75.81%
Best parameters: {'activation': 'sigmoid', 'batch_size': 0, 'dropout': 0, 'epochs': 100, 'layers': 1, 'lr': 0.001, 'node': 64}
'''
Save the best model
'''

from tensorflow.keras.models import load_model

model_name = f'{rst_train}-{rst_test}-{rst_val}'

best_model.save(f'{model_name}.h5')
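
The .h5 file only stores the network itself. To classify raw FIFA rows later, the fitted MinMax scaler and the position-to-integer mapping also have to be saved; one way to persist them (a sketch using joblib, with hypothetical file names, not part of the original post):

import joblib

# keep the preprocessing objects next to the model file
joblib.dump(ScaleType, f'{model_name}_scaler.pkl')
joblib.dump(category, f'{model_name}_labels.pkl')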
'''
Load the saved model whenever it is needed
'''

save_model = load_model(f'{model_name}.h5')
save_model.predict_classes(X_val)
array([ 7, 13, 12, ...,  7, 11, 10])
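
predict_classes only exists on Sequential models in older TensorFlow releases and has since been removed; on current versions an equivalent (sketch) is to take the argmax of predict:

# equivalent to predict_classes on newer TensorFlow versions
np.argmax(save_model.predict(X_val), axis=1)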