Simple Web Scraping using BeautifulSoup and Python in Google Colab

Extracting data from a website and using it for analytics

Photo by MadMax Chef on Unsplash

So why web scraping?

But why would you do that?

Importing the necessary libraries

from bs4 import BeautifulSoup
import requests
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

Getting the website and extracting the data

webpage = requests.get("insert-your-url-here")
soup = BeautifulSoup(webpage.content, "html.parser")
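The request above assumes the download always succeeds. As a minimal sketch, the same two steps can be wrapped in a helper that fails loudly when the server returns an error. The name fetch_soup and the 10-second timeout are assumptions made for this illustration, not part of the original walkthrough.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url, timeout=10):
    """Download a page and return it parsed, raising on HTTP errors."""
    # timeout keeps the notebook cell from hanging on an unresponsive server
    response = requests.get(url, timeout=timeout)
    # raises requests.HTTPError if the status code is 4xx or 5xx
    response.raise_for_status()
    return BeautifulSoup(response.content, "html.parser")


# Usage (replace the placeholder with a real URL before uncommenting):
# soup = fetch_soup("insert-your-url-here")
```

Checking the status before parsing means a broken link shows up as an exception instead of as an empty list later on.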

Understanding the structure of the website

Codecademy website about Cocoa Ratings
HTML Structure of Cocoa Ratings Website
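# Collect every element that carries the given class
example_column = soup.find_all(attrs={"class": "class-name"})
example_list = []
# Skip the first match, which is the column header in this example
for x in example_column[1:]:
    example_list.append(float(x.get_text()))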
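To see the pattern above run end to end without a live website, here is a small sketch that parses an inline HTML snippet. The HTML, the class names "Rating" and "CocoaPercent", and the helper column_as_floats are all assumptions made up for this illustration; the real page may use different class names, so check them with Inspect Element first.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a scraped page; the class names
# "Rating" and "CocoaPercent" are assumptions for this sketch.
html = """
<table>
  <tr><td class="Rating">Rating</td><td class="CocoaPercent">Cocoa Percent</td></tr>
  <tr><td class="Rating">3.75</td><td class="CocoaPercent">63%</td></tr>
  <tr><td class="Rating">2.75</td><td class="CocoaPercent">70%</td></tr>
  <tr><td class="Rating">3.5</td><td class="CocoaPercent">72.5%</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")


def column_as_floats(soup, class_name, suffix=""):
    """Return the text of every element with class_name as floats,
    skipping the first match (the header) and stripping an optional suffix."""
    cells = soup.find_all(attrs={"class": class_name})
    values = []
    for cell in cells[1:]:  # the first match is the column header
        text = cell.get_text().strip()
        if suffix and text.endswith(suffix):
            text = text[: -len(suffix)]  # drop a trailing "%" or similar
        values.append(float(text))
    return values


ratings = column_as_floats(soup, "Rating")
cocoa_percents = column_as_floats(soup, "CocoaPercent", suffix="%")
print(ratings)
print(cocoa_percents)
```

The suffix argument is there because a value such as "63%" cannot be passed to float() directly; the percent sign has to be removed first.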

Analyzing the data

data = {"Rating": ratings, "CocoaPercentage": cocoa_percents}
df = pd.DataFrame.from_dict(data)
z = np.polyfit(df.CocoaPercentage, df.Rating, 1)
line_function = np.poly1d(z)
plt.scatter(df.CocoaPercentage, df.Rating)
plt.title('Cocoa Percentage & Ratings Correlation')
plt.xlabel('Cocoa Percentage (%)')
plt.ylabel('Ratings')
plt.plot(df.CocoaPercentage, line_function(df.CocoaPercentage), "r--")
plt.show()
Linear Regression Plot using Matplotlib on Python, Data Science
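The dashed line in the plot comes from np.polyfit with degree 1, which returns the slope and intercept of a least-squares straight line. The sketch below shows how to read those two numbers, along with the Pearson correlation coefficient from np.corrcoef. The data values are made up for illustration only and are not taken from the Cocoa Ratings website.

```python
import numpy as np

# Made-up values for illustration only; not taken from the scraped site.
cocoa_percents = np.array([60.0, 65.0, 70.0, 75.0, 80.0])
ratings = np.array([3.5, 3.25, 3.25, 3.0, 2.75])

# Degree-1 fit: coefficients come back highest power first (slope, intercept)
slope, intercept = np.polyfit(cocoa_percents, ratings, 1)

# poly1d turns the coefficients into a callable line: y = slope * x + intercept
line_function = np.poly1d([slope, intercept])

# Pearson correlation coefficient between the two columns
r = np.corrcoef(cocoa_percents, ratings)[0, 1]

print(f"slope={slope:.4f}, intercept={intercept:.4f}, r={r:.4f}")
print("predicted rating at 70% cocoa:", line_function(70.0))
```

A negative slope with r close to -1 would suggest that ratings drop as cocoa percentage rises, while r near 0 would suggest no clear linear relationship.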

The end

Hello there! I’m Nathanael Victorious, a 3rd year Computer Science student at Tarumanagara University who likes topics about Data Science, Cloud Computing & IoT.