[Python][AI] Exploratory Data Analysis (EDA) - Wine Quality Dataset - 3

5hr1rnp 2025. 2. 4. 20:57
๋ฐ˜์‘ํ˜•

2025.01.23 - [Development Code/Artificial Intelligence A.I.] - [Python][AI] Exploratory Data Analysis (EDA) - Wine Quality Dataset - 1

2025.01.24 - [Development Code/Artificial Intelligence A.I.] - [Python][AI] Exploratory Data Analysis (EDA) - Wine Quality Dataset - 2

๋ฐ์ดํ„ฐ ์‹œ๊ฐํ™”


๋ณ€์ˆ˜ ๊ฐ„์˜ ๊ด€๊ณ„๋ฅผ ์‹œ๊ฐํ™”ํ•˜์—ฌ ํŒจํ„ด์„ ํŒŒ์•…ํ•˜๋„๋ก ํ•œ๋‹ค.

 

 

ํ’ˆ์งˆ ๋ถ„ํฌ ๋ฐ ๋ณ€์ˆ˜ ๊ฐ„ ์ƒ๊ด€ ๊ด€๊ณ„(Correlation Heatmap)

# Library Version
# pandas    : 2.2.3
# numpy     : 1.23.5
# matplotlib: 3.9.2
# seaborn   : 0.13.2

import matplotlib.pyplot as plt
import seaborn as sns

# Visualize the wine quality distribution
plt.figure(figsize=(10, 5))
# quality is an integer score, so discrete=True draws one bar per score
sns.histplot(red_wine['quality'], discrete=True, kde=True, color='red', label='Red Wine')
sns.histplot(white_wine['quality'], discrete=True, kde=True, color='blue', label='White Wine')
plt.legend()
plt.title("Wine Quality Distribution (Red & White)")
plt.xlabel("Quality")
plt.ylabel("Count")
plt.grid(True)
plt.show()

# Correlation heatmap between variables (Red Wine)
plt.figure(figsize=(12, 8))
sns.heatmap(red_wine.corr(), annot=True, cmap='coolwarm', fmt=".2f", linewidths=0.5)
plt.title("Correlation Heatmap (Red Wine)")
plt.show()

# Correlation heatmap between variables (White Wine)
plt.figure(figsize=(12, 8))
sns.heatmap(white_wine.corr(), annot=True, cmap='coolwarm', fmt=".2f", linewidths=0.5)
plt.title("Correlation Heatmap (White Wine)")
plt.show()

 

  • Quality distribution
    • The quality distributions of red and white wine differ somewhat
    • Mid-range quality (5-6 points) is the most common, and extremely high quality (8 or higher) is rare
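
The histogram observations above can also be checked numerically by tabulating the share of each quality score per wine type. The following is a minimal sketch, not the post's own code: the helper name quality_share and the toy DataFrames are hypothetical, and the toy values are not taken from the real dataset.

```python
import pandas as pd


def quality_share(red: pd.DataFrame, white: pd.DataFrame) -> pd.DataFrame:
    """Return the proportion of each quality score per wine type."""
    shares = pd.concat(
        {
            "red": red["quality"].value_counts(normalize=True),
            "white": white["quality"].value_counts(normalize=True),
        },
        axis=1,
    )
    # Scores that are missing for one wine type become 0 instead of NaN
    return shares.fillna(0.0).sort_index()


# Toy data for illustration only (hypothetical, NOT the real dataset)
toy_red = pd.DataFrame({"quality": [5, 5, 6, 6, 6, 7, 4, 5]})
toy_white = pd.DataFrame({"quality": [6, 6, 6, 5, 7, 8, 6, 5]})

print(quality_share(toy_red, toy_white))
```

With the real data, passing red_wine and white_wine in place of the toy frames should make the concentration around 5-6 points directly visible as proportions.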

 

์™€์ธ ๋ณ„ ํ’ˆ์งˆ(Quality) ๋ถ„ํฌ๋„

 

  • Correlation between variables (Correlation Heatmap)
    • For red wine, the correlation between alcohol and quality is relatively high
    • White wine also shows a strong relationship between alcohol and quality, and volatile acidity is negatively correlated with quality
    • Overall, density and quality are negatively correlated
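
A heatmap is easy to scan but harder to rank by eye. As a rough sketch, the quality column of the correlation matrix can be pulled out and sorted by absolute value. The helper name quality_correlations and the toy data are hypothetical and used only for illustration; the numbers are not from the real dataset.

```python
import pandas as pd


def quality_correlations(df: pd.DataFrame, target: str = "quality") -> pd.Series:
    """Return each feature's Pearson correlation with the target, strongest first."""
    corr = df.corr(numeric_only=True)[target].drop(target)
    # Sort by magnitude so strong negative correlations are not buried
    return corr.sort_values(key=abs, ascending=False)


# Toy data for illustration only (hypothetical, NOT the real dataset)
toy_wine = pd.DataFrame(
    {
        "alcohol": [9.0, 9.5, 10.0, 11.0, 12.0, 12.5],
        "volatile acidity": [0.9, 0.7, 0.6, 0.5, 0.4, 0.3],
        "quality": [4, 5, 5, 6, 7, 8],
    }
)

print(quality_correlations(toy_wine))
```

Calling the same helper on red_wine and white_wine separately would show whether alcohol, volatile acidity, and density rank as the heatmaps suggest.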

 

Correlation between variables (red wine)
Correlation between variables (white wine)

 


๋ฐ”์ด์˜ฌ๋ฆฐ ํ”Œ๋กฏ ๋ถ„์„

 

# Select the key variables
features = ['alcohol', 'volatile acidity', 'density', 'sulphates', 'citric acid']

# Red wine violin plots
plt.figure(figsize=(15, 10))
for i, feature in enumerate(features, 1):
    plt.subplot(2, 3, i)
    # seaborn 0.13 deprecates palette without hue, so hue is set to the x variable
    sns.violinplot(data=red_wine, x='quality', y=feature, hue='quality', palette="Reds", legend=False)
    plt.title(f"{feature} vs Quality (Red Wine)")
    plt.xlabel("Quality")
    plt.ylabel(feature)
    plt.grid(True)

plt.tight_layout()
plt.show()

# White wine violin plots
plt.figure(figsize=(15, 10))
for i, feature in enumerate(features, 1):
    plt.subplot(2, 3, i)
    # seaborn 0.13 deprecates palette without hue, so hue is set to the x variable
    sns.violinplot(data=white_wine, x='quality', y=feature, hue='quality', palette="Blues", legend=False)
    plt.title(f"{feature} vs Quality (White Wine)")
    plt.xlabel("Quality")
    plt.ylabel(feature)
    plt.grid(True)

plt.tight_layout()
plt.show()
  • Alcohol
    • The average alcohol content tends to rise as quality increases
    • At high quality in particular (7-8 points), alcohol content is concentrated at relatively high values
  • Volatile Acidity
    • Lower-quality wines (especially 4-5 points) show higher volatile acidity values
    • In other words, high volatile acidity makes lower quality more likely
  • Density
    • Lower-quality wines tend to have higher density
    • This tendency is more pronounced in white wine
  • Sulphates
    • Higher-quality wines (7 points or more) tend to have slightly higher sulphate concentrations
  • Citric Acid
    • Citric acid concentration tends to rise with quality, although some quality levels show little difference
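
The trends read off the violin plots can be cross-checked with a per-quality summary table. The sketch below groups by quality and takes the median of each feature. The helper name median_by_quality and the toy values are hypothetical and serve only as an illustration, not as results from the real dataset.

```python
import pandas as pd


def median_by_quality(df: pd.DataFrame, features: list[str]) -> pd.DataFrame:
    """Return the median of each feature for every quality score."""
    return df.groupby("quality")[features].median()


# Toy data for illustration only (hypothetical, NOT the real dataset)
toy_wine = pd.DataFrame(
    {
        "quality": [5, 5, 6, 6, 7, 7],
        "alcohol": [9.4, 9.8, 10.2, 10.6, 11.4, 12.0],
        "volatile acidity": [0.70, 0.66, 0.55, 0.50, 0.40, 0.36],
    }
)

summary = median_by_quality(toy_wine, ["alcohol", "volatile acidity"])
print(summary)
```

On the real data, median_by_quality(red_wine, features) and median_by_quality(white_wine, features) would give one row per quality score, which makes it easier to see where a trend is monotonic and where adjacent scores barely differ.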

 

Distribution of key features by quality (red wine)

 

ํ™”์ดํŠธ ์™€์ธ ํ’ˆ์งˆ ๋ณ„ ์ฃผ์š” ํŠน์„ฑ๋“ค์˜ ๋ถ„ํฌ ํ™•์ธ

 

๋ฐ˜์‘ํ˜•