From Creating a DNN Model to Saving and Using It
LeeInGyu
2022. 3. 2. 04:55
'''
Formation (position) classifier using FIFA21 player stats
File loading step
'''
import numpy as np
import pandas as pd
csv_file = pd.read_csv('FIFA21_official_data.csv')
csv_file.head(5)
 | ID | Name | Age | Photo | Nationality | Flag | Overall | Potential | Club | Club Logo | ... | SlidingTackle | GKDiving | GKHandling | GKKicking | GKPositioning | GKReflexes | Best Position | Best Overall Rating | Release Clause | DefensiveAwareness |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 176580 | L. Suárez | 33 | https://cdn.sofifa.com/players/176/580/20_60.png | Uruguay | https://cdn.sofifa.com/flags/uy.png | 87 | 87 | Atlético Madrid | https://cdn.sofifa.com/teams/240/light_30.png | ... | 38.0 | 27.0 | 25.0 | 31.0 | 33.0 | 37.0 | ST | 87.0 | €64.6M | 57.0 |
1 | 192985 | K. De Bruyne | 29 | https://cdn.sofifa.com/players/192/985/20_60.png | Belgium | https://cdn.sofifa.com/flags/be.png | 91 | 91 | Manchester City | https://cdn.sofifa.com/teams/10/light_30.png | ... | 53.0 | 15.0 | 13.0 | 5.0 | 10.0 | 13.0 | CAM | 91.0 | €161M | 68.0 |
2 | 212198 | Bruno Fernandes | 25 | https://cdn.sofifa.com/players/212/198/20_60.png | Portugal | https://cdn.sofifa.com/flags/pt.png | 87 | 90 | Manchester United | https://cdn.sofifa.com/teams/11/light_30.png | ... | 55.0 | 12.0 | 14.0 | 15.0 | 8.0 | 14.0 | CAM | 88.0 | €124.4M | 72.0 |
3 | 194765 | A. Griezmann | 29 | https://cdn.sofifa.com/players/194/765/20_60.png | France | https://cdn.sofifa.com/flags/fr.png | 87 | 87 | FC Barcelona | https://cdn.sofifa.com/teams/241/light_30.png | ... | 49.0 | 14.0 | 8.0 | 14.0 | 13.0 | 14.0 | ST | 87.0 | €103.5M | 59.0 |
4 | 224334 | M. Acuña | 28 | https://cdn.sofifa.com/players/224/334/20_60.png | Argentina | https://cdn.sofifa.com/flags/ar.png | 83 | 83 | Sevilla FC | https://cdn.sofifa.com/teams/481/light_30.png | ... | 79.0 | 8.0 | 14.0 | 13.0 | 13.0 | 14.0 | LB | 83.0 | €46.2M | 79.0 |
5 rows × 65 columns
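Before deciding which columns to drop, it can help to look at the overall shape and column types. A minimal inspection sketch (not part of the original walkthrough), reusing the csv_file loaded above:

print(csv_file.shape)                  # rows x 65 columns
print(csv_file.dtypes.value_counts())  # how many numeric vs. object (string) columns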
'''
Drop columns we don't need and remove rows with missing values
'''
data = csv_file.drop([
    'ID','Name','Age','Photo','Nationality','Flag','Overall','Potential','Club','Club Logo',
    'Value','Wage','Special','Preferred Foot','International Reputation','Work Rate','Body Type',
    'Real Face','Position','Jersey Number','Loaned From','Contract Valid Until','Height','Weight',
    'Weak Foot','Skill Moves','Joined',
    'Best Overall Rating','Release Clause','DefensiveAwareness',
    'Marking'
], axis=1)
data = data.dropna(axis=0)
data.shape
(16821, 34)
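If you want to see what dropna() is actually removing, a quick per-column missing-value count on the raw file works; this is just an inspection aid, not part of the original pipeline:

# Columns with the most missing values in the raw CSV
print(csv_file.isna().sum().sort_values(ascending=False).head(10))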
'''
Set up X (features) and y (target)
'''
X = data.drop(['Best Position'], axis=1)
category_y = data['Best Position']
X = X.astype('float')
category_y = category_y.astype('category')
X.shape, category_y.shape
((16821, 33), (16821,))
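The target is spread unevenly across the position classes, which is worth knowing before training. A quick look at the class distribution (assuming the category_y Series defined above):

print(category_y.value_counts())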
'''
Convert the category type to integer codes for one-hot encoding
'''
category = dict((c, i) for i, c in enumerate(set(category_y)))
y = [category[i] for i in category_y]
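Note that iterating over set(category_y) gives an arbitrary, run-dependent class order, so the integer codes can change between runs. If you want a reproducible mapping, sklearn's LabelEncoder does the same job with a fixed, sorted class order; a sketch of the equivalent encoding:

from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
y_alt = encoder.fit_transform(category_y)   # integer codes, classes sorted alphabetically
print(dict(zip(encoder.classes_, range(len(encoder.classes_)))))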
'''
Preprocessing step
X: Min-Max scaling
y: one-hot encoding
'''
import sklearn.preprocessing
import tensorflow
ScaleType = sklearn.preprocessing.MinMaxScaler()
Classifier = len(set(y))  # number of position classes
X_standardized = ScaleType.fit_transform(X)
y_onehotencoding = tensorflow.keras.utils.to_categorical(y, Classifier)
X_standardized.shape, y_onehotencoding.shape
((16821, 33), (16821, 15))
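As a sanity check, the one-hot rows can be mapped back to the integer codes with argmax; this should reproduce y exactly:

import numpy as np

assert (np.argmax(y_onehotencoding, axis=1) == np.array(y)).all()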
'''
Data split step
Training data: 64%
Test data: 16%
Validation data: 20%
'''
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(X_standardized, y_onehotencoding, stratify=y_onehotencoding, test_size=0.2)
X_train, X_test, y_train, y_test = train_test_split(X_train, y_train, stratify=y_train, test_size=0.2)
X_train.shape, X_test.shape, X_val.shape, y_train.shape, y_test.shape, y_val.shape
((10764, 33), (2692, 33), (3365, 33), (10764, 15), (2692, 15), (3365, 15))
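Because both splits pass stratify, the class proportions should be nearly identical across the training, test, and validation sets. A quick way to verify (a sketch using numpy):

import numpy as np

for name, labels in [('train', y_train), ('test', y_test), ('val', y_val)]:
    counts = np.bincount(labels.argmax(axis=1), minlength=labels.shape[1])
    print(name, np.round(counts / counts.sum(), 3))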
'''
Build the DNN model to use
Hyperparameters: number of nodes, number of hidden layers, learning rate, dropout, activation function
'''
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras import optimizers
from sklearn.model_selection import GridSearchCV
def model_build(node, layers, lr, dropout, activation):
    model = Sequential()
    model.add(Dense(X.shape[1]))                        # first layer sized to the number of input features
    for _ in range(layers):
        model.add(Dense(node, activation=activation))   # hidden layers
        if dropout:
            model.add(Dropout(dropout))
    model.add(Dense(Classifier, activation='softmax'))  # one output per position class
    model.compile(
        loss='categorical_crossentropy',
        optimizer=optimizers.Adam(lr),
        metrics=['accuracy'])
    return model
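A quick sanity check of the builder is to construct one arbitrary configuration and print its summary; the hyperparameter values below are placeholders, not the tuned ones:

sample_model = model_build(node=64, layers=1, lr=0.001, dropout=0, activation='relu')
sample_model.build(input_shape=(None, X.shape[1]))  # needed because the builder does not declare an input shape
sample_model.summary()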
'''
Wrap the model builder in a KerasClassifier object for scikit-learn
'''
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
model = KerasClassifier(build_fn=model_build, verbose=0)
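tensorflow.keras.wrappers.scikit_learn was removed in later TensorFlow releases. On a newer setup, the separate scikeras package offers a comparable wrapper; the sketch below is an assumption about that environment (install scikeras first and check its documentation for the exact parameter routing):

# pip install scikeras
from scikeras.wrappers import KerasClassifier

# scikeras routes build-function arguments via the model__ prefix
model = KerasClassifier(
    model=model_build,
    model__node=64, model__layers=1, model__lr=0.001,
    model__dropout=0, model__activation='relu',
    verbose=0,
)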
'''
Define the hyperparameter grid for the grid search
'''
params_grid = {
    'node'       : [32, 64, 128],
    'layers'     : [1, 2],
    'lr'         : [0.005, 0.001, 0.005],  # note: 0.005 is listed twice, so only two distinct learning rates are searched
    'dropout'    : [0],
    'epochs'     : [100, 200],
    'batch_size' : [0],
    'activation' : ['relu', 'sigmoid']
}
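This grid yields 3 × 2 × 3 × 1 × 2 × 1 × 2 = 72 parameter combinations, which with the default 5-fold cross-validation gives the 360 fits reported in the log below. You can compute that directly:

import math

n_candidates = math.prod(len(values) for values in params_grid.values())
print(n_candidates, n_candidates * 5)   # 72 candidates, 360 fits with 5-fold CV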
'''
Run the grid search
Use the warnings library to suppress warning output
'''
from sklearn.model_selection import GridSearchCV
import warnings
warnings.filterwarnings(action='ignore')
# warnings.filterwarnings(action='default')
grid_search = GridSearchCV(
    estimator=model,
    param_grid=params_grid,
    verbose=1,
    n_jobs=16,
)
grid_results = grid_search.fit(X_train, y_train)
Fitting 5 folds for each of 72 candidates, totalling 360 fits
[Parallel(n_jobs=16)]: Using backend LokyBackend with 16 concurrent workers.
[Parallel(n_jobs=16)]: Done 18 tasks | elapsed: 6.7min
[Parallel(n_jobs=16)]: Done 168 tasks | elapsed: 55.1min
[Parallel(n_jobs=16)]: Done 360 out of 360 | elapsed: 116.5min finished
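Once the search finishes, the cross-validated score of the winning configuration and the full results table are available on the fitted search object; for example (the actual numbers depend on your run):

print(grid_results.best_score_)                     # mean CV accuracy of the best candidate
cv_table = pd.DataFrame(grid_results.cv_results_)   # one row per candidate
print(cv_table[['params', 'mean_test_score', 'rank_test_score']].head())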
'''
Store the best hyperparameters
'''
best_node = grid_results.best_params_['node']
best_layers = grid_results.best_params_['layers']
best_lr = grid_results.best_params_['lr']
best_dropout = grid_results.best_params_['dropout']
best_epochs = grid_results.best_params_['epochs']
best_activation = grid_results.best_params_['activation']
'''
Build best_model with the best hyperparameters
Train it on the validation data
'''
best_model = model_build(best_node, best_layers, best_lr, best_dropout, best_activation)
result = best_model.fit(X_val, y_val, epochs=best_epochs)
Epoch 1/100
106/106 [==============================] - 0s 1ms/step - loss: 2.3050 - accuracy: 0.2695
Epoch 2/100
106/106 [==============================] - 0s 1ms/step - loss: 1.8991 - accuracy: 0.4556
Epoch 3/100
106/106 [==============================] - 0s 1ms/step - loss: 1.5970 - accuracy: 0.5040
Epoch 4/100
106/106 [==============================] - 0s 1ms/step - loss: 1.3935 - accuracy: 0.5468
Epoch 5/100
106/106 [==============================] - 0s 1ms/step - loss: 1.2721 - accuracy: 0.5774
Epoch 6/100
106/106 [==============================] - 0s 1ms/step - loss: 1.1929 - accuracy: 0.5947
Epoch 7/100
106/106 [==============================] - 0s 1ms/step - loss: 1.1414 - accuracy: 0.6006
Epoch 8/100
106/106 [==============================] - 0s 1ms/step - loss: 1.1003 - accuracy: 0.6196
Epoch 9/100
106/106 [==============================] - 0s 1ms/step - loss: 1.0683 - accuracy: 0.6241
Epoch 10/100
106/106 [==============================] - 0s 1ms/step - loss: 1.0383 - accuracy: 0.6440
Epoch 11/100
106/106 [==============================] - 0s 1ms/step - loss: 1.0129 - accuracy: 0.6443
Epoch 12/100
106/106 [==============================] - 0s 1ms/step - loss: 0.9855 - accuracy: 0.6505
Epoch 13/100
106/106 [==============================] - 0s 1ms/step - loss: 0.9634 - accuracy: 0.6588
Epoch 14/100
106/106 [==============================] - 0s 1ms/step - loss: 0.9381 - accuracy: 0.6633
Epoch 15/100
106/106 [==============================] - 0s 1ms/step - loss: 0.9164 - accuracy: 0.6722
Epoch 16/100
106/106 [==============================] - 0s 1ms/step - loss: 0.8993 - accuracy: 0.6764
Epoch 17/100
106/106 [==============================] - 0s 1ms/step - loss: 0.8786 - accuracy: 0.6802
Epoch 18/100
106/106 [==============================] - 0s 1ms/step - loss: 0.8667 - accuracy: 0.6826
Epoch 19/100
106/106 [==============================] - 0s 1ms/step - loss: 0.8523 - accuracy: 0.6900
Epoch 20/100
106/106 [==============================] - 0s 1ms/step - loss: 0.8423 - accuracy: 0.6939
Epoch 21/100
106/106 [==============================] - 0s 1ms/step - loss: 0.8240 - accuracy: 0.6978
Epoch 22/100
106/106 [==============================] - 0s 1ms/step - loss: 0.8123 - accuracy: 0.7034
Epoch 23/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7989 - accuracy: 0.7079
Epoch 24/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7906 - accuracy: 0.7025
Epoch 25/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7804 - accuracy: 0.7031
Epoch 26/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7711 - accuracy: 0.7135
Epoch 27/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7658 - accuracy: 0.7040
Epoch 28/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7582 - accuracy: 0.7126
Epoch 29/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7522 - accuracy: 0.7168
Epoch 30/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7431 - accuracy: 0.7207
Epoch 31/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7389 - accuracy: 0.7198
Epoch 32/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7378 - accuracy: 0.7156
Epoch 33/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7262 - accuracy: 0.7227
Epoch 34/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7238 - accuracy: 0.7239
Epoch 35/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7173 - accuracy: 0.7218
Epoch 36/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7149 - accuracy: 0.7239
Epoch 37/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7110 - accuracy: 0.7272
Epoch 38/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7098 - accuracy: 0.7212
Epoch 39/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7036 - accuracy: 0.7272
Epoch 40/100
106/106 [==============================] - 0s 1ms/step - loss: 0.7033 - accuracy: 0.7260
Epoch 41/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6995 - accuracy: 0.7290
Epoch 42/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6968 - accuracy: 0.7287
Epoch 43/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6939 - accuracy: 0.7284
Epoch 44/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6963 - accuracy: 0.7239
Epoch 45/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6848 - accuracy: 0.7367
Epoch 46/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6888 - accuracy: 0.7325
Epoch 47/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6822 - accuracy: 0.7314
Epoch 48/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6815 - accuracy: 0.7340
Epoch 49/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6769 - accuracy: 0.7311
Epoch 50/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6747 - accuracy: 0.7367
Epoch 51/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6776 - accuracy: 0.7340
Epoch 52/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6694 - accuracy: 0.7364
Epoch 53/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6712 - accuracy: 0.7412
Epoch 54/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6698 - accuracy: 0.7379
Epoch 55/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6685 - accuracy: 0.7376
Epoch 56/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6684 - accuracy: 0.7376
Epoch 57/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6707 - accuracy: 0.7406
Epoch 58/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6618 - accuracy: 0.7388
Epoch 59/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6586 - accuracy: 0.7438
Epoch 60/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6574 - accuracy: 0.7453
Epoch 61/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6636 - accuracy: 0.7364
Epoch 62/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6543 - accuracy: 0.7471
Epoch 63/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6576 - accuracy: 0.7447
Epoch 64/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6529 - accuracy: 0.7423
Epoch 65/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6496 - accuracy: 0.7453
Epoch 66/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6479 - accuracy: 0.7465
Epoch 67/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6451 - accuracy: 0.7426
Epoch 68/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6445 - accuracy: 0.7474
Epoch 69/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6430 - accuracy: 0.7453
Epoch 70/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6367 - accuracy: 0.7471
Epoch 71/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6386 - accuracy: 0.7498
Epoch 72/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6418 - accuracy: 0.7462
Epoch 73/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6393 - accuracy: 0.7453
Epoch 74/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6376 - accuracy: 0.7530
Epoch 75/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6360 - accuracy: 0.7501
Epoch 76/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6318 - accuracy: 0.7483
Epoch 77/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6346 - accuracy: 0.7465
Epoch 78/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6352 - accuracy: 0.7516
Epoch 79/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6363 - accuracy: 0.7504
Epoch 80/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6283 - accuracy: 0.7477
Epoch 81/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6367 - accuracy: 0.7486
Epoch 82/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6242 - accuracy: 0.7530
Epoch 83/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6216 - accuracy: 0.7554
Epoch 84/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6288 - accuracy: 0.7519
Epoch 85/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6340 - accuracy: 0.7483
Epoch 86/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6267 - accuracy: 0.7510
Epoch 87/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6196 - accuracy: 0.7519
Epoch 88/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6165 - accuracy: 0.7614
Epoch 89/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6272 - accuracy: 0.7578
Epoch 90/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6182 - accuracy: 0.7554
Epoch 91/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6200 - accuracy: 0.7525
Epoch 92/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6135 - accuracy: 0.7584
Epoch 93/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6100 - accuracy: 0.7599
Epoch 94/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6157 - accuracy: 0.7608
Epoch 95/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6212 - accuracy: 0.7548
Epoch 96/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6082 - accuracy: 0.7620
Epoch 97/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6127 - accuracy: 0.7587
Epoch 98/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6132 - accuracy: 0.7590
Epoch 99/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6094 - accuracy: 0.7605
Epoch 100/100
106/106 [==============================] - 0s 1ms/step - loss: 0.6068 - accuracy: 0.7581
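The per-epoch loss and accuracy above are also stored in result.history, so the training curve can be plotted; a minimal sketch assuming matplotlib is installed:

import matplotlib.pyplot as plt

plt.plot(result.history['loss'], label='loss')
plt.plot(result.history['accuracy'], label='accuracy')
plt.xlabel('epoch')
plt.legend()
plt.show()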
'''
Summary of the best model
'''
best_model.summary()
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_9 (Dense)              (None, 33)                1122
_________________________________________________________________
dense_10 (Dense)             (None, 64)                2176
_________________________________________________________________
dense_11 (Dense)             (None, 15)                975
=================================================================
Total params: 4,273
Trainable params: 4,273
Non-trainable params: 0
_________________________________________________________________
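The parameter counts follow directly from inputs × units + units (weights plus biases) for each Dense layer:

print(33 * 33 + 33)   # dense_9:  33 inputs -> 33 units = 1,122 parameters
print(33 * 64 + 64)   # dense_10: 33 inputs -> 64 units = 2,176 parameters
print(64 * 15 + 15)   # dense_11: 64 inputs -> 15 units = 975 parameters (total 4,273)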
'''
Print a summary of all the results
'''
from sklearn import metrics
train_pred = grid_results.predict(X_train)
train_pred = tensorflow.keras.utils.to_categorical(train_pred, Classifier)
rst_train = metrics.accuracy_score(y_train, train_pred)
rst_train = round(rst_train*100, 2)
test_pred = grid_results.predict(X_test)
test_pred = tensorflow.keras.utils.to_categorical(test_pred, Classifier)
rst_test = metrics.accuracy_score(y_test, test_pred)
rst_test = round(rst_test*100, 2)
rst_val = result.history['accuracy'][-1]
rst_val = round(rst_val*100, 2)
print(f'Training data accuracy: {rst_train}%')
print(f'Test data accuracy: {rst_test}%')
print(f'Validation data accuracy: {rst_val}%')
print(f'Best hyperparameters: {grid_results.best_params_}')
Training data accuracy: 77.51%
Test data accuracy: 76.49%
Validation data accuracy: 75.81%
Best hyperparameters: {'activation': 'sigmoid', 'batch_size': 0, 'dropout': 0, 'epochs': 100, 'layers': 1, 'lr': 0.001, 'node': 64}
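Overall accuracy hides how well individual positions are predicted. A per-class breakdown is straightforward with scikit-learn; this sketch reuses the category mapping built earlier to turn indices back into position names:

from sklearn.metrics import classification_report

index_to_position = {i: c for c, i in category.items()}
true_positions = [index_to_position[i] for i in y_test.argmax(axis=1)]
pred_positions = [index_to_position[i] for i in grid_results.predict(X_test)]
print(classification_report(true_positions, pred_positions))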
'''
Save the best model
'''
from tensorflow.keras.models import load_model
model_name = f'{rst_train}-{rst_test}-{rst_val}'
best_model.save(f'{model_name}.h5')
'''
Load the saved model whenever you want to use it
'''
save_model = load_model(f'{model_name}.h5')
save_model.predict_classes(X_val)
array([ 7, 13, 12, ..., 7, 11, 10])
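predict_classes was removed in newer Keras versions; the version-independent equivalent is to take the argmax of predict, and the category dict can map the indices back to position names (a sketch under that assumption):

import numpy as np

pred_index = np.argmax(save_model.predict(X_val), axis=1)
index_to_position = {i: c for c, i in category.items()}
print([index_to_position[i] for i in pred_index[:5]])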