Gen AI Developer Week 2 — Day 2

Sai Chinmay Tripurari

Intro to Linear Regression Model

Linear regression is a simple yet powerful method in machine learning and statistics. By modeling the relationship between a dependent variable and one or more independent variables, it underpins many data-driven predictions and insights. As part of my journey through Generative AI Developer Week, this piece covers the fundamental ideas, real-world applications, and mathematical underpinnings of linear regression.

Let’s recap: what is Linear Regression?
Linear Regression predicts a target y as a linear combination of the input features x1, x2, …, xn, that is, a weighted sum of the features plus an intercept.
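
As a minimal sketch of what "linear combination" means here, the prediction is an intercept plus a weighted sum of the feature values. The function name `predict_linear` and all numbers below are made up for illustration; they are not taken from the Diabetes dataset.

```python
def predict_linear(features, weights, intercept):
    """Return intercept + w1*x1 + w2*x2 + ... + wn*xn."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return intercept + sum(w * x for w, x in zip(weights, features))


# Hypothetical values chosen only to make the arithmetic easy to follow
weights = [2.0, -1.0, 0.5]
intercept = 10.0
x = [3.0, 4.0, 2.0]

y_hat = predict_linear(x, weights, intercept)
print(y_hat)  # 10 + 2*3 - 1*4 + 0.5*2 = 13.0
```

Training a model amounts to choosing the weights and the intercept so that these predictions land as close as possible to the known targets.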

We’re going to train a Linear Regression model on the Diabetes dataset that ships with scikit-learn.

Load Dataset & Create a DataFrame

from sklearn.datasets import load_diabetes # Get Diabetes Dataset
import pandas as pd

data = load_diabetes()

# Create a dataframe with pandas
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target

print(df.head())
[Image: DataFrame output for the Diabetes dataset]

Split the Dataset into Training and Testing Data using Train-Test Split

# Train Test Data Split
from sklearn.model_selection import train_test_split

X = df.drop('target', axis=1)
y = df['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
[Image: output of the train-test split]

Train a Linear Regression Model

# Preparing a Model
from sklearn.linear_model import LinearRegression

model = LinearRegression()

model.fit(X_train, y_train)

print("Coeff: ", model.coef_)
print("Intercept: ", model.intercept_)
[Image: output of the coefficients and intercept]

Coefficient: The slope for one feature, that is, how much the predicted target changes for every one-unit change in that feature while the other features are held fixed. The model learns one coefficient per feature.
Intercept: The predicted value of the target when every feature is zero, which is the point where the fitted line crosses the y-axis.
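
To make slope and intercept concrete, here is a minimal sketch of least-squares fitting for a single feature, written in plain Python. The helper `fit_simple_line` and the toy data are hypothetical and only for illustration; scikit-learn's `LinearRegression` handles the multi-feature case for us.

```python
def fit_simple_line(xs, ys):
    """Least-squares slope and intercept for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data that lies exactly on the line y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_simple_line(xs, ys)
print("slope:", slope)          # y rises by 2 for each unit of x
print("intercept:", intercept)  # value of y when x is 0
```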

Make Predictions

# Predict on the test data
y_pred = model.predict(X_test)

print("Predictions:", y_pred[:5])
print("Actual values:", y_test.values[:5])
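
For a fitted `LinearRegression`, `predict` is the intercept plus the dot product of the features with the coefficients. The sketch below checks this on the same dataset and split settings used above; it assumes NumPy is installed alongside scikit-learn, and the name `manual_pred` is only for illustration.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Same data and split settings as in the walkthrough
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)

# Reproduce predict() by hand: weighted sum of features plus intercept
manual_pred = X_test @ model.coef_ + model.intercept_

print(np.allclose(manual_pred, model.predict(X_test)))
```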

Evaluate the Model

from sklearn.metrics import mean_squared_error, r2_score

# Calculate metrics
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print("Mean Squared Error:", mse)
print("R^2 Score:", r2)

Mean Squared Error (MSE):
The average of the squared differences between the predicted and actual values. A lower MSE indicates a better fit.
R-squared:
The proportion of the dependent variable’s variance that the model accounts for. A higher R-squared indicates a better fit, with 1 being a perfect fit. On test data it can drop below 0 when the model does worse than always predicting the mean.
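
To show what these two metrics compute, here is a minimal sketch in plain Python using their standard definitions. The function names and the toy values are hypothetical; the second set of predictions is deliberately poor to show that R-squared can fall below zero.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical toy values, not taken from the Diabetes dataset
y_true = [1.0, 2.0, 3.0, 4.0]
good_pred = [1.0, 2.0, 3.0, 5.0]  # only the last prediction is off
bad_pred = [4.0, 3.0, 2.0, 1.0]   # reversed, worse than predicting the mean

print("good:", mse(y_true, good_pred), r2(y_true, good_pred))
print("bad: ", mse(y_true, bad_pred), r2(y_true, bad_pred))
```

The good predictions give an MSE of 0.25 and an R-squared of about 0.8, while the reversed ones give an MSE of 5.0 and an R-squared of -3.0.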

Task — Plot the predicted vs. actual values for the test set.

import matplotlib.pyplot as plt

# Plot predicted vs actual values
plt.scatter(y_test, y_pred, alpha=0.7)
plt.xlabel("Actual Values")
plt.ylabel("Predicted Values")
plt.title("Predicted vs. Actual")
plt.show()

Deliverables for the day:
Linear Regression model trained on the Diabetes dataset.
MSE and R² score of the model.
A scatter plot of predicted vs. actual values.

Happy Learning! 😊 For any questions or support, feel free to message me on LinkedIn.
