Data Science with Python
Master data manipulation, analysis, and visualization in Python. Learn Pandas, NumPy, Matplotlib, and Seaborn, and follow real-world data science workflows from data collection to insights.
Course Modules
- Python for Data Science: NumPy, Pandas basics
- Data Manipulation: Cleaning, transformation, merging
- Data Visualization: Matplotlib, Seaborn, Plotly
- Statistical Analysis: Descriptive & inferential stats
- Real-world Projects: EDA on real datasets
- Feature Engineering: Creating meaningful features
- Data Wrangling: Handling missing data, outliers
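As a rough illustration of the data manipulation and data wrangling modules listed above, the sketch below fills a missing value, flags an outlier, and merges two tables. The `readings` and `sensors` tables and all their values are hypothetical, invented only for this example; median imputation and the 1.5 × IQR rule are one common choice, not necessarily the approach the course teaches.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, invented for illustration only.
readings = pd.DataFrame({
    "sensor_id": [1, 2, 3, 4, 5, 6],
    "temperature": [20.5, 21.0, np.nan, 20.2, 85.0, 20.9],
})
sensors = pd.DataFrame({
    "sensor_id": [1, 2, 3, 4, 5, 6],
    "room": ["A", "A", "B", "B", "C", "C"],
})

# Missing data: fill gaps with the column median, which is less
# sensitive to extreme values than the mean.
median_temp = readings["temperature"].median()
readings["temperature"] = readings["temperature"].fillna(median_temp)

# Outliers: flag values outside 1.5 * IQR of the quartiles.
q1 = readings["temperature"].quantile(0.25)
q3 = readings["temperature"].quantile(0.75)
iqr = q3 - q1
in_range = readings["temperature"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
readings["is_outlier"] = ~in_range

# Merging: attach the room label to each reading via the shared key.
merged = readings.merge(sensors, on="sensor_id", how="left")
print(merged)
```

Flagging outliers in a separate column, rather than dropping the rows, keeps the decision about how to treat them reversible.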
Python Implementation Example
# Complete EDA Example with Python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_iris
# Load data
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['species'] = [iris.target_names[t] for t in iris.target]
# Data Overview
print("Dataset Shape:", df.shape)
print("\nData Info:")
df.info()  # info() prints its report directly and returns None
# Statistical Summary
print("\nDescriptive Statistics:")
print(df.describe())
# Check for missing values
print("\nMissing Values:")
print(df.isnull().sum())
# Visualization
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
# 1. Distribution plots
sns.histplot(data=df, x='sepal length (cm)', hue='species', kde=True, ax=axes[0,0])
axes[0,0].set_title('Sepal Length Distribution')
# 2. Correlation heatmap
numeric_df = df.select_dtypes(include=[np.number])
sns.heatmap(numeric_df.corr(), annot=True, cmap='coolwarm', ax=axes[0,1])
axes[0,1].set_title('Feature Correlation Matrix')
# 3. Box plots
sns.boxplot(data=df, x='species', y='petal length (cm)', ax=axes[1,0])
axes[1,0].set_title('Petal Length by Species')
# 4. Scatter plot
sns.scatterplot(data=df, x='sepal length (cm)', y='petal length (cm)',
                hue='species', style='species', s=100, ax=axes[1,1])
axes[1,1].set_title('Sepal vs Petal Length')
plt.tight_layout()
plt.show()
# Feature Engineering
df['sepal_petal_ratio'] = df['sepal length (cm)'] / df['petal length (cm)']
print("\nNew feature created: sepal_petal_ratio")
Learning Outcomes
- Data Manipulation: Master Pandas and NumPy for data manipulation
- Visualization: Create clear, publication-quality plots with Matplotlib and Seaborn
- Statistics: Apply statistical analysis to real datasets
- Real Projects: Work on actual data science projects
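To give a feel for the statistics outcome, here is a minimal sketch that moves from descriptive statistics to a simple inferential quantity. The two groups and their values are hypothetical, made up for this example, and the choice of Welch's t statistic (computed by hand from group means and variances) is an assumption about what such an analysis might look like, not something the course description specifies.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements for two groups, invented for illustration.
data = pd.DataFrame({
    "group": ["a"] * 5 + ["b"] * 5,
    "value": [5.1, 4.9, 5.0, 5.2, 4.8, 6.0, 6.2, 5.9, 6.1, 5.8],
})

# Descriptive statistics: per-group mean, sample variance, and size.
summary = data.groupby("group")["value"].agg(["mean", "var", "count"])
print(summary)

# Inferential step: Welch's t statistic, which does not assume
# equal variances across the two groups.
mean_diff = summary.loc["b", "mean"] - summary.loc["a", "mean"]
std_err = np.sqrt((summary["var"] / summary["count"]).sum())
t_stat = mean_diff / std_err
print(f"Welch t statistic: {t_stat:.2f}")
```

Turning the statistic into a p-value requires the t distribution; if SciPy is available, `scipy.stats.ttest_ind` with `equal_var=False` performs the whole test in one call.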