This code will help you search for articles related to your 5 favorite keywords and save them as TXT files.
Importing Libraries
import requests
from bs4 import BeautifulSoup
import datetime
import os
Defining a Function for Saving Articles as TXT Files
def save_as_txt(title, content):
    now = datetime.datetime.now()
    time = now.strftime("%Y-%m-%d %H:%M:%S")  # note: colons are invalid in Windows filenames; use %H-%M-%S there
    filename = f"{title} - {time}.txt"
    os.makedirs("articles", exist_ok=True)  # create the output directory if it does not exist
    filepath = os.path.join("articles", filename)
    with open(filepath, "w", encoding="utf-8") as f:
        f.write(f"{title}\n\n{content}")
Searching and Saving Articles from Websites
keywords = ["{keyword1}", "{keyword2}", "{keyword3}", "{keyword4}", "{keyword5}"]
websites = ["{website1}", "{website2}", "{website3}", "{website4}", "{website5}"]

for keyword in keywords:
    for website in websites:
        url = f"{website}/search?q={keyword}"
        response = requests.get(url)
        soup = BeautifulSoup(response.content, "html.parser")
        articles = soup.find_all("article")
        for article in articles:
            title_tag = article.find("h2")
            content_tag = article.find("p")
            if title_tag is None or content_tag is None:
                continue  # skip articles missing a heading or paragraph
            save_as_txt(title_tag.text.strip(), content_tag.text.strip())
In this code, we first import the necessary libraries – requests, BeautifulSoup, datetime, and os.
Next, we create a function called save_as_txt that takes the title and content of an article and saves them to a TXT file whose name combines the title with the current timestamp.
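For reference, strftime produces the timestamp portion of the filename. A quick standalone example with a fixed date (the date itself is arbitrary, chosen only for illustration):

```python
import datetime

# Format a fixed datetime the same way save_as_txt formats the current time.
now = datetime.datetime(2024, 1, 15, 9, 30, 0)
print(now.strftime("%Y-%m-%d %H:%M:%S"))  # → 2024-01-15 09:30:00
```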
We then define the keywords we want to search for and the websites we want to search on.
Using a nested for loop, we search each website for articles related to each keyword. We extract the title and content of each article with BeautifulSoup and save them as a TXT file using the save_as_txt function.
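One caveat the code glosses over: article titles can contain characters that are not valid in filenames (/ on Linux; \ : * ? " < > | on Windows). A minimal sanitizing helper you could run the title through before building the filename — sanitize_filename is a hypothetical name, not part of the original code:

```python
import re

def sanitize_filename(name, max_length=100):
    # Replace characters that are invalid in Windows/Linux filenames
    # with underscores, then trim overly long titles.
    name = re.sub(r'[\\/:*?"<>|]', "_", name)
    return name.strip()[:max_length]

print(sanitize_filename('Breaking: "AI" news / update'))  # → Breaking_ _AI_ news _ update
```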
Note: Make sure to replace the {keyword} and {website} placeholders in the code with your desired keywords and websites.
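Bear in mind that not every site exposes a /search?q= endpoint, and requests can fail on timeouts, connection errors, or HTTP error statuses. A defensive wrapper you might use in place of the bare requests.get call — the timeout value and error handling here are suggestions, not part of the original code:

```python
import requests

def fetch(url, params=None):
    # Return the page body, or None if the request fails or the
    # server responds with an error status.
    try:
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()
    except requests.RequestException:
        return None
    return response.content
```

Passing the keyword via params (e.g. fetch(f"{website}/search", params={"q": keyword})) also lets requests URL-encode it for you, which matters for keywords containing spaces or non-ASCII characters.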