Assignment 5
🚀 Advanced Python
OOP, NumPy & Pandas
📝 4 Complex Questions
⏱️ 60 min
🎯 Advanced
📋 Questions & Solutions
1
Object-Oriented Programming: Circle Class
🎯 OOP
a) Create a Circle class with radius attribute
Initialize with default radius of 1.
b) Add method to calculate area
Area = π × r²
c) Add method to calculate circumference
Circumference = 2 × π × r
d) Add __repr__ method
Return a string representation of the circle.
import math

class Circle:
    """A class to represent a circle"""

    def __init__(self, radius=1):
        """Initialize circle with radius (default 1)"""
        self.radius = radius

    def area(self):
        """Calculate and return area of circle"""
        return math.pi * self.radius ** 2

    def circumference(self):
        """Calculate and return circumference"""
        return 2 * math.pi * self.radius

    def __repr__(self):
        """String representation of Circle"""
        return f"Circle(radius={self.radius})"

# Test the Circle class
c1 = Circle()           # Default radius
c2 = Circle(5)          # Radius of 5
c3 = Circle(radius=10)  # Named argument

print("Circle 1:", c1)
print(f"  Area: {c1.area():.2f}")
print(f"  Circumference: {c1.circumference():.2f}")

print("\nCircle 2:", c2)
print(f"  Area: {c2.area():.2f}")
print(f"  Circumference: {c2.circumference():.2f}")

print("\nCircle 3:", c3)
print(f"  Area: {c3.area():.2f}")
print(f"  Circumference: {c3.circumference():.2f}")
Output
Circle 1: Circle(radius=1)
  Area: 3.14
  Circumference: 6.28

Circle 2: Circle(radius=5)
  Area: 78.54
  Circumference: 31.42

Circle 3: Circle(radius=10)
  Area: 314.16
  Circumference: 62.83
🎓 Explanation
__init__ - Constructor that runs when an object is created
self - Reference to the current instance
self.radius - Instance attribute (each object has its own)
__repr__ - Special method for string representation
math.pi - Built-in constant for π (3.14159...)
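As an optional extension of the solution above (not part of the assignment), inheritance lets a new class reuse Circle's methods. The Cylinder class below is a hypothetical example: it inherits the radius attribute and builds its volume calculation on top of the inherited area() method.

```python
import math

class Circle:
    """A class to represent a circle (trimmed copy of the solution above)."""
    def __init__(self, radius=1):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2

    def __repr__(self):
        return f"Circle(radius={self.radius})"

class Cylinder(Circle):
    """Hypothetical subclass: a cylinder whose base is a Circle."""
    def __init__(self, radius=1, height=1):
        super().__init__(radius)  # Initialize the inherited radius attribute
        self.height = height

    def volume(self):
        # Volume = base area x height, reusing the inherited area() method
        return self.area() * self.height

    def __repr__(self):
        return f"Cylinder(radius={self.radius}, height={self.height})"

cyl = Cylinder(radius=2, height=5)
print(cyl)                    # Cylinder(radius=2, height=5)
print(f"{cyl.volume():.2f}")  # pi * 2**2 * 5 ≈ 62.83
```

Note how super().__init__(radius) runs the parent constructor so the subclass does not have to repeat that setup.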
2
NumPy Operations
🔢 NumPy
a) Generate 10 random numbers between 1 and 100
b) Create two 2x2 matrices, add and multiply them
c) Calculate mean, median, and standard deviation
import numpy as np
# Set seed for reproducibility
np.random.seed(42)
# a) Generate 10 random numbers between 1 and 100
random_nums = np.random.randint(1, 101, size=10)
print("a) Random numbers (1-100):")
print(random_nums)
# b) Create and operate on 2x2 matrices
print("\nb) Matrix Operations:")
matrix1 = np.array([[1, 2], [3, 4]])
matrix2 = np.array([[5, 6], [7, 8]])
print("Matrix 1:")
print(matrix1)
print("\nMatrix 2:")
print(matrix2)
# Addition
print("\nMatrix Addition (A + B):")
print(matrix1 + matrix2)
# Element-wise multiplication
print("\nElement-wise Multiplication:")
print(matrix1 * matrix2)
# Matrix multiplication (dot product)
print("\nMatrix Multiplication (A @ B):")
print(matrix1 @ matrix2)
# c) Statistical operations
print("\nc) Statistics of random numbers:")
print(f"Mean: {np.mean(random_nums):.2f}")
print(f"Median: {np.median(random_nums):.2f}")
print(f"Std Dev: {np.std(random_nums):.2f}")
print(f"Min: {np.min(random_nums)}")
print(f"Max: {np.max(random_nums)}")
print(f"Sum: {np.sum(random_nums)}")
Output
a) Random numbers (1-100):
[52 93 15 72 61 21 83 87 75 75]

b) Matrix Operations:
Matrix 1:
[[1 2]
 [3 4]]

Matrix 2:
[[5 6]
 [7 8]]

Matrix Addition (A + B):
[[ 6  8]
 [10 12]]

Element-wise Multiplication:
[[ 5 12]
 [21 32]]

Matrix Multiplication (A @ B):
[[19 22]
 [43 50]]

c) Statistics of random numbers:
Mean: 63.40
Median: 73.50
Std Dev: 24.65
Min: 15
Max: 93
Sum: 634
🎓 Explanation
np.random.randint(low, high, size) - Random integers (high is exclusive)
+ adds matrices element-by-element
* multiplies element-by-element (NOT matrix multiplication)
@ or np.dot() does true matrix multiplication
np.mean(), np.median(), np.std() - Statistical functions
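The element-wise vs. matrix multiplication distinction above is easy to verify by hand. This short sketch uses the same matrices as the solution and spot-checks one entry of the matrix product:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Element-wise: each entry multiplied with its counterpart
elementwise = A * B  # [[5, 12], [21, 32]]

# True matrix multiplication: rows of A dotted with columns of B
matmul = A @ B       # [[19, 22], [43, 50]]

# Spot-check one entry by hand: (A @ B)[0, 0] = 1*5 + 2*7 = 19
assert matmul[0, 0] == A[0, 0] * B[0, 0] + A[0, 1] * B[1, 0]

# For 2-D arrays, np.dot() is equivalent to @
assert np.array_equal(matmul, np.dot(A, B))
```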
3
Pandas DataFrame Operations
🐼 Pandas
a) Read a CSV file and display first 5 rows
b) Find maximum, minimum, and average price
c) Create a new column using lambda function
d) Sort by price descending
import pandas as pd
# Create sample data (simulating CSV read)
data = {
    'Product': ['Laptop', 'Phone', 'Tablet', 'Watch', 'Headphones'],
    'Price': [999.99, 699.99, 449.99, 299.99, 149.99],
    'Quantity': [10, 25, 15, 30, 50]
}
df = pd.DataFrame(data)
# a) Display first 5 rows
print("a) First 5 rows (head):")
print(df.head())
# b) Price statistics
print("\nb) Price Statistics:")
print(f"Maximum Price: ${df['Price'].max():.2f}")
print(f"Minimum Price: ${df['Price'].min():.2f}")
print(f"Average Price: ${df['Price'].mean():.2f}")
# c) Create new column with lambda
print("\nc) Add 'Total Value' column:")
df['Total_Value'] = df.apply(lambda row: row['Price'] * row['Quantity'], axis=1)
print(df)
# Alternative: apply on a single column
# (note: .apply is not truly vectorized; df['Price'] * 0.9 would be faster)
df['Discounted'] = df['Price'].apply(lambda x: x * 0.9)
print("\nWith 10% discount column:")
print(df)
# d) Sort by price descending
print("\nd) Sorted by Price (descending):")
sorted_df = df.sort_values('Price', ascending=False)
print(sorted_df)
Output
a) First 5 rows (head):
Product Price Quantity
0 Laptop 999.99 10
1 Phone 699.99 25
2 Tablet 449.99 15
3 Watch 299.99 30
4 Headphones 149.99 50
b) Price Statistics:
Maximum Price: $999.99
Minimum Price: $149.99
Average Price: $519.99
c) Add 'Total Value' column:
Product Price Quantity Total_Value
0 Laptop 999.99 10 9999.90
1 Phone 699.99 25 17499.75
2 Tablet 449.99 15 6749.85
3 Watch 299.99 30 8999.70
4 Headphones 149.99 50 7499.50
With 10% discount column:
Product Price Quantity Total_Value Discounted
0 Laptop 999.99 10 9999.90 899.99
1 Phone 699.99 25 17499.75 629.99
2 Tablet 449.99 15 6749.85 404.99
3 Watch 299.99 30 8999.70 269.99
4 Headphones 149.99 50 7499.50 134.99
d) Sorted by Price (descending):
Product Price Quantity Total_Value Discounted
0 Laptop 999.99 10 9999.90 899.99
1 Phone 699.99 25 17499.75 629.99
2 Tablet 449.99 15 6749.85 404.99
3 Watch 299.99 30 8999.70 269.99
4 Headphones 149.99 50 7499.50 134.99
🎓 Explanation
pd.read_csv('file.csv') - Read a CSV file
df.head(n) - First n rows (default 5)
df['column'].max(), .min(), .mean() - Statistics
df.apply(lambda, axis=1) - Apply a function row-wise
df['col'].apply(lambda) - Apply to a single column
df.sort_values('col', ascending=False) - Sort descending
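A side note on df.apply(lambda, axis=1) above: for pure arithmetic, vectorized column operations compute the same values and are generally much faster on large frames. A minimal sketch with made-up data:

```python
import pandas as pd

df = pd.DataFrame({
    'Price': [999.99, 699.99, 449.99],
    'Quantity': [10, 25, 15],
})

# Row-wise apply: flexible, but runs a Python function per row (slow)
total_apply = df.apply(lambda row: row['Price'] * row['Quantity'], axis=1)

# Vectorized column arithmetic: one operation over whole columns (fast)
total_vec = df['Price'] * df['Quantity']

# Both approaches produce identical results
assert total_apply.equals(total_vec)
```

Prefer the vectorized form whenever the computation can be written as column arithmetic; reserve apply for logic that genuinely needs a Python function per row.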
4
Advanced Pandas Analysis
🐼 Pandas Advanced
a) Filter data based on conditions
b) Group by category and calculate aggregations
c) Handle missing values
d) Merge two DataFrames
import pandas as pd
import numpy as np
# Create sample sales data
sales = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Tablet', 'Laptop', 'Phone', 'Watch'],
    'Category': ['Electronics', 'Electronics', 'Electronics', 'Electronics', 'Electronics', 'Accessories'],
    'Price': [999, 699, 449, 1099, 599, np.nan],
    'Units': [5, 10, 8, 3, 15, 20]
})
# a) Filter: Products with Price > 500
print("a) Products with Price > 500:")
filtered = sales[sales['Price'] > 500]
print(filtered)
# Multiple conditions
print("\nProducts: Price > 500 AND Units > 5:")
multi_filter = sales[(sales['Price'] > 500) & (sales['Units'] > 5)]
print(multi_filter)
# b) Group by and aggregate
print("\nb) Group by Product - Statistics:")
grouped = sales.groupby('Product').agg({
    'Price': ['mean', 'min', 'max'],
    'Units': 'sum'
})
print(grouped)
# Simpler groupby
print("\nTotal units per category:")
print(sales.groupby('Category')['Units'].sum())
# c) Handle missing values
print("\nc) Handling Missing Values:")
print("Missing values per column:")
print(sales.isnull().sum())
# Fill missing values with the mean price
# (only Price has NaN here; in general, target the column directly:
#  sales['Price'] = sales['Price'].fillna(sales['Price'].mean()))
sales_filled = sales.fillna(sales['Price'].mean())
print("\nAfter filling with mean:")
print(sales_filled)
# Alternative: Drop rows with NaN
sales_dropped = sales.dropna()
print(f"\nRows after dropping NaN: {len(sales_dropped)}")
# d) Merge DataFrames
products = pd.DataFrame({
    'Product': ['Laptop', 'Phone', 'Tablet', 'Watch'],
    'Manufacturer': ['Dell', 'Apple', 'Samsung', 'Fitbit']
})
print("\nd) Merge with manufacturer info:")
merged = sales.merge(products, on='Product', how='left')
print(merged)
Output
a) Products with Price > 500:
Product Category Price Units
0 Laptop Electronics 999.0 5
1 Phone Electronics 699.0 10
3 Laptop Electronics 1099.0 3
4 Phone Electronics 599.0 15
Products: Price > 500 AND Units > 5:
Product Category Price Units
1 Phone Electronics 699.0 10
4 Phone Electronics 599.0 15
b) Group by Product - Statistics:
Price Units
mean min max sum
Product
Laptop 1049.0 999.0 1099.0 8
Phone 649.0 599.0 699.0 25
Tablet 449.0 449.0 449.0 8
Watch NaN NaN NaN 20
Total units per category:
Category
Accessories 20
Electronics 41
Name: Units, dtype: int64
c) Handling Missing Values:
Missing values per column:
Product 0
Category 0
Price 1
Units 0
dtype: int64
After filling with mean:
Product Category Price Units
0 Laptop Electronics 999.000 5
1 Phone Electronics 699.000 10
2 Tablet Electronics 449.000 8
3 Laptop Electronics 1099.000 3
4 Phone Electronics 599.000 15
5 Watch Accessories 769.000 20
Rows after dropping NaN: 5
d) Merge with manufacturer info:
Product Category Price Units Manufacturer
0 Laptop Electronics 999.0 5 Dell
1 Phone Electronics 699.0 10 Apple
2 Tablet Electronics 449.0 8 Samsung
3 Laptop Electronics 1099.0 3 Dell
4 Phone Electronics 599.0 15 Apple
5 Watch Accessories NaN 20 Fitbit
🎓 Explanation
df[condition] - Boolean indexing for filtering
& for AND, | for OR (use parentheses!)
groupby().agg() - Multiple aggregations at once
isnull().sum() - Count missing values
fillna(value) - Replace NaN with a value
dropna() - Remove rows with NaN
merge(df, on='col', how='left/right/inner/outer') - SQL-like joins
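The how= options listed above decide which keys survive a merge. This sketch, with made-up frames (the 'Camera' and 'Watch' rows exist only on one side), shows how each join type handles unmatched keys:

```python
import pandas as pd

left = pd.DataFrame({'Product': ['Laptop', 'Phone', 'Camera'],
                     'Units': [5, 10, 2]})
right = pd.DataFrame({'Product': ['Laptop', 'Phone', 'Watch'],
                      'Maker': ['Dell', 'Apple', 'Fitbit']})

# inner: only keys present in BOTH frames (Laptop, Phone)
inner = left.merge(right, on='Product', how='inner')

# left: every key from left; Camera gets NaN for Maker
left_join = left.merge(right, on='Product', how='left')

# outer: union of keys from both sides (Laptop, Phone, Camera, Watch)
outer = left.merge(right, on='Product', how='outer')

print(len(inner), len(left_join), len(outer))  # 2 3 4
```

Unmatched rows keep their own columns and receive NaN in the columns contributed by the other frame, which is why how='left' preserved Camera with a missing Maker.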
🎁 Bonus: Quick Reference
📚 OOP Cheat Sheet
class MyClass:
    class_var = "shared"          # Class variable

    def __init__(self, value):
        self.value = value        # Instance variable

    def method(self):             # Instance method
        return self.value

    @classmethod                  # Class method
    def class_method(cls):
        return cls.class_var

    @staticmethod                 # Static method
    def static_method():
        return "No self needed"

🔢 NumPy Essentials
import numpy as np
# Create arrays
a = np.array([1, 2, 3])
zeros = np.zeros((3, 3))
ones = np.ones((2, 2))
range_arr = np.arange(0, 10, 2) # [0, 2, 4, 6, 8]
# Shape operations
a.reshape(3, 1) # Change shape
a.flatten() # To 1D
# Math
np.sum(a), np.mean(a), np.std(a)
np.min(a), np.max(a), np.argmax(a) # Index of max

🐼 Pandas Essentials
import pandas as pd
# Read/Write
df = pd.read_csv('file.csv')
df.to_csv('output.csv', index=False)
# Explore
df.head(), df.tail(), df.info(), df.describe()
df.shape, df.columns, df.dtypes
# Select
df['col'], df[['col1', 'col2']]
df.loc[rows, cols], df.iloc[row_idx, col_idx]
# Transform
df['new'] = df['col'].apply(lambda x: x * 2)
df.groupby('col').agg({'val': 'sum'})
df.sort_values('col', ascending=False)
df.merge(df2, on='key')
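One gotcha worth flagging for the df.loc / df.iloc line above: .loc selects by label and its slices include the end label, while .iloc selects by integer position and excludes the end, like regular Python slicing. A quick sketch with a made-up frame:

```python
import pandas as pd

df = pd.DataFrame({'a': [10, 20, 30], 'b': [1, 2, 3]},
                  index=['x', 'y', 'z'])

# .loc: by label (row 'y', column 'a')
assert df.loc['y', 'a'] == 20

# .iloc: by integer position (second row, first column)
assert df.iloc[1, 0] == 20

# Slicing: .loc includes the end label, .iloc excludes the end position
assert df.loc['x':'y', 'a'].tolist() == [10, 20]  # 'y' included
assert df.iloc[0:1, 0].tolist() == [10]           # position 1 excluded
```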