< TensorFlow >
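※ A Python library that expresses computations as a dataflow graph
▷ Components
- Tensor : a multidimensional array (the data)
- Node : a numerical operation (+, -, *, /) or a data input/output
- Edge : a line connecting one node to another; Tensors flow along it
▷ To get a value out, you must build a Graph and then run it.
- Step 1: build the Graph
- Step 2: run the Graph → Session
▷ To feed in data at run time, use a placeholder, a node whose job is to accept input values.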
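The build-then-run idea above can be sketched in plain Python. This is only a toy model: `Node`, `constant`, `Placeholder`, and `Session` below are hypothetical classes written for illustration and are not TensorFlow's API. The point is that `+` only records an edge between nodes, and nothing is computed until `run` is called.

```python
# Toy illustration only: hypothetical classes, NOT TensorFlow's API.

class Node:
    """A graph node: an operation plus the input nodes (edges) feeding it."""

    def __init__(self, op, inputs=()):
        self.op = op
        self.inputs = tuple(inputs)

    def __add__(self, other):
        # Graph construction: record the addition, do not compute it yet.
        return Node(lambda a, b: a + b, (self, other))


def constant(value):
    """A node with no inputs that always yields the same value."""
    return Node(lambda: value)


class Placeholder(Node):
    """A node whose value is supplied at run time through feed_dict."""

    def __init__(self):
        super().__init__(op=None)


class Session:
    """Runs a graph by evaluating a node's inputs first, then the node."""

    def run(self, node, feed_dict=None):
        feed_dict = feed_dict or {}
        if isinstance(node, Placeholder):
            return feed_dict[node]
        args = [self.run(n, feed_dict) for n in node.inputs]
        return node.op(*args)


sess = Session()

# Step 1: build the graph. Step 2: run it.
node3 = constant(10.0) + constant(20.0)
print(sess.run(node3))  # 30.0

# Placeholders get their values only when the graph is run.
a, b = Placeholder(), Placeholder()
print(sess.run(a + b, feed_dict={a: 20, b: 40}))  # 60
```

Keeping construction and execution separate is what lets the same graph be run many times with different fed values.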
< TensorFlow - Linear Regression >
- TensorFlow is a platform for ML, used here through its Python API.
- This example uses version 1.15
- pip install tensorflow==1.15
- Create nodes and connect them to build a graph
- A Session must be created in order to run the graph
※ Linear Regression Model Graph

1. Node practice
import tensorflow as tf

# check the TensorFlow version
print(tf.__version__)  # 1.15.0

# create a node
node = tf.constant('Hello World')

# a Session is required to run the graph
sess = tf.Session()

# now that the Session (the runner) exists, use it to run the node
print(sess.run(node).decode())  # Hello World

# create two nodes and connect them
node1 = tf.constant(10, dtype=tf.float32)
node2 = tf.constant(20, dtype=tf.float32)
node3 = node1 + node2

sess = tf.Session()

# several nodes can be run in a single call
print(sess.run([node3, node2]))  # [30.0, 20.0]
2. Placeholder practice
※ Create placeholders first, then use a feed dictionary to map values (Tensors) onto them
import tensorflow as tf

# placeholders that can each accept a single scalar float value
node1 = tf.placeholder(dtype=tf.float32)
node2 = tf.placeholder(dtype=tf.float32)
node3 = node1 + node2

sess = tf.Session()
print(sess.run(node3, feed_dict={node1: 20, node2: 40}))  # 60.0
3. Simple ML practice
import tensorflow as tf

# 1. Training Data Set
x_data = [2, 4, 5, 7, 10]
t_data = [7, 11, 13, 17, 23]

# 2. Weight & Bias
W = tf.Variable(tf.random.normal([1]), name='weight')
b = tf.Variable(tf.random.normal([1]), name='bias')

# 3. Hypothesis, Simple Linear Regression Model
H = W * x_data + b

# 4. Loss function (mean squared error)
loss = tf.reduce_mean(tf.square(t_data - H))

# 5. Create the train node
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
train = optimizer.minimize(loss)

# 6. Prepare for execution and initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())  # initialization: required before any tf.Variable is used

# 7. Train repeatedly
for step in range(30000):
    _, W_val, b_val, loss_val = sess.run([train, W, b, loss])
    if step % 3000 == 0:
        print('W : {}, b : {}, loss : {}'.format(W_val, b_val, loss_val))

print(sess.run(H))  # [ 6.9997516 10.999857 12.999908 17.000013 23.000172 ]

### Results ###
# W : [1.6663913], b : [-1.3141297], loss : 45.45983123779297
# W : [2.196295], b : [1.6465404], loss : 0.3515620231628418
# W : [2.063593], b : [2.5615284], loss : 0.03689724951982498
# W : [2.020603], b : [2.8579428], loss : 0.0038729305379092693
# W : [2.0066757], b : [2.9539702], loss : 0.00040662504034116864
# W : [2.0021634], b : [2.9850838], loss : 4.27004270022735e-05
# W : [2.0007029], b : [2.9951618], loss : 4.4914136196894106e-06
# W : [2.0002275], b : [2.998434], loss : 4.706189429271035e-07
# W : [2.0000815], b : [2.9994454], loss : 5.922065682284483e-08
# W : [2.0000525], b : [2.9996467], loss : 2.4012525301486676e-08
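To make the train node less of a black box, here is a minimal plain-Python sketch of the update that gradient descent performs on the mean squared error loss: W ← W − lr · ∂loss/∂W and b ← b − lr · ∂loss/∂b. It assumes the same data, learning rate, and step count as the example above, but starts from W = 0, b = 0 instead of a random normal value, so the intermediate numbers will not match the log above. The function name `gradient_descent` is made up for this illustration.

```python
x_data = [2, 4, 5, 7, 10]
t_data = [7, 11, 13, 17, 23]


def gradient_descent(x, t, learning_rate=0.001, steps=30000):
    """Fit H = W * x + b by minimizing mean((H - t)^2) with plain gradient descent."""
    W, b = 0.0, 0.0  # assumption: fixed start instead of tf.random.normal
    n = len(x)
    for _ in range(steps):
        # prediction error H - t for every training point
        errors = [(W * xi + b) - ti for xi, ti in zip(x, t)]
        # partial derivatives of the mean squared error
        grad_W = sum(2 * e * xi for e, xi in zip(errors, x)) / n
        grad_b = sum(2 * e for e in errors) / n
        # step against the gradient
        W -= learning_rate * grad_W
        b -= learning_rate * grad_b
    loss = sum(((W * xi + b) - ti) ** 2 for xi, ti in zip(x, t)) / n
    return W, b, loss


W, b, loss = gradient_descent(x_data, t_data)
print('W : {}, b : {}, loss : {}'.format(W, b, loss))
```

Because the training data satisfies t = 2x + 3 exactly, W and b should approach 2 and 3, in line with the TensorFlow results.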
4. Simple ML training and prediction practice
import tensorflow as tf

# 1. Training Data Set
x_data = [1, 2, 3, 4, 5]
t_data = [2, 4, 6, 8, 10]

# 2. placeholder
X = tf.placeholder(dtype=tf.float32)
T = tf.placeholder(dtype=tf.float32)

# 3. Weight & Bias / shape => (1,)
W = tf.Variable(tf.random.normal([1]), name='weight')
b = tf.Variable(tf.random.normal([1]), name='bias')

# 4. Hypothesis : Simple Linear Regression Model
H = W * X + b

# 5. Loss function
loss = tf.reduce_mean(tf.square(H - T))

# 6. Create the train node
train = tf.train.GradientDescentOptimizer(learning_rate=0.001).minimize(loss)

# 7. Create the Session & initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# 8. Train (run the Graph)
for step in range(30000):
    _, W_val, b_val, loss_val = sess.run([train, W, b, loss], feed_dict={X: x_data, T: t_data})
    if step % 3000 == 0:
        print('W : {}, b : {}, cost : {}'.format(W_val, b_val, loss_val))

# 9. predict
print(sess.run(H, feed_dict={X: [6]}))  # [11.999949]

### Results ###
# W : [0.60376024], b : [0.01065517], cost : 22.40163803100586
# W : [1.9629316], b : [0.13382721], cost : 0.003262100275605917
# W : [1.9865568], b : [0.0485307], cost : 0.0004289901698939502
# W : [1.9951215], b : [0.01760776], cost : 5.6472275900887325e-05
# W : [1.9982265], b : [0.00639496], cost : 7.449575605278369e-06
# W : [1.999356], b : [0.00232831], cost : 9.873516546576866e-07
# W : [1.9997668], b : [0.00084448], cost : 1.2977157837212872e-07
# W : [1.9999119], b : [0.00031314], cost : 1.792932380340062e-08
# W : [1.9999607], b : [0.00013433], cost : 3.358274991427379e-09
# W : [1.9999771], b : [7.416416e-05], cost : 1.075579847409358e-09
5. Training and prediction practice with ozone data
import tensorflow as tf
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.preprocessing import MinMaxScaler

# Raw Data Loading
df = pd.read_csv('./data/ozone.csv')

# Data Preprocessing
# remove missing values
df = df.dropna(how='any')
df = df[['Ozone', 'Solar.R', 'Wind', 'Temp']]

# outlier handling
zscore_threshold = 2.0

# find and remove the outliers in each column
for col in df.columns:
    outliers = df[col][np.abs(stats.zscore(df[col])) > zscore_threshold]
    df = df.loc[~df[col].isin(outliers)]

# Normalization
# create separate scaler objects for the independent and dependent variables
scaler_x = MinMaxScaler()
scaler_t = MinMaxScaler()

training_data_x = df.iloc[:, 1:]
training_data_t = df['Ozone'].values.reshape(-1, 1)

scaler_x.fit(training_data_x)
scaler_t.fit(training_data_t)

training_data_x = scaler_x.transform(training_data_x)
training_data_t = scaler_t.transform(training_data_t)

# Training Data Set
x_data = training_data_x
t_data = training_data_t

# placeholder
X = tf.placeholder(shape=[None, 3], dtype=tf.float32)
T = tf.placeholder(shape=[None, 1], dtype=tf.float32)

# Weight & bias
W = tf.Variable(tf.random.normal([3, 1]), name='weight')
b = tf.Variable(tf.random.normal([1]), name='bias')

# Hypothesis : Multiple Linear Regression Model
H = tf.matmul(X, W) + b

# loss function
loss = tf.reduce_mean(tf.square(H - T))

# create the train node
train = tf.train.GradientDescentOptimizer(learning_rate=0.001).minimize(loss)

# create the Session & initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# train (run the Graph)
for step in range(30000):
    _, W_val, b_val, loss_val = sess.run([train, W, b, loss], feed_dict={X: x_data, T: t_data})
    if step % 3000 == 0:
        print('W : {}, b : {}, loss : {}'.format(W_val, b_val, loss_val))

### Results ###
# only the last printed line is shown
# W : [[ 0.174692 ]
# [-0.36312494]
# [ 0.5371174 ]], b : [0.1163004], loss : 0.020663734525442123

# predict
predict_data_x = np.array([[150, 8.0, 85]])
predict_data_x = scaler_x.transform(predict_data_x)
result = sess.run(H, feed_dict={X: predict_data_x})
result = scaler_t.inverse_transform(result)
print(result)  # [[53.870525]]
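The two preprocessing steps used above, z-score outlier removal and min-max scaling, can be sketched with only the standard library. This is a simplified illustration on a single column. The helper names (`zscore_filter`, `minmax_fit`, and so on) and the sample values are hypothetical and are not taken from ozone.csv. It relies on `scipy.stats.zscore` using the population standard deviation by default and on `MinMaxScaler` mapping each value to (x − min) / (max − min) with its default settings.

```python
from statistics import mean, pstdev


def zscore_filter(values, threshold=2.0):
    """Keep only values whose |z-score| is within the threshold."""
    mu, sigma = mean(values), pstdev(values)  # population std, like scipy's default
    return [v for v in values if abs((v - mu) / sigma) <= threshold]


def minmax_fit(values):
    """Learn the scaling range, analogous to calling fit on a scaler."""
    return min(values), max(values)


def minmax_transform(values, lo, hi):
    """Map values into [0, 1] using the fitted range."""
    return [(v - lo) / (hi - lo) for v in values]


def minmax_inverse(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]


# Hypothetical sample column with one obvious outlier (90.0)
sample = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 10.0, 13.0, 12.0, 90.0]

kept = zscore_filter(sample)            # the outlier is dropped
lo, hi = minmax_fit(kept)               # range learned from the cleaned data
scaled = minmax_transform(kept, lo, hi)
restored = minmax_inverse(scaled, lo, hi)

print(kept)
print(scaled)
print(restored)
```

The inverse step is why the example keeps a separate scaler for the target: the model predicts in the scaled range, and the prediction has to be mapped back to ozone units with the range learned from the target column.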