A little Python Script…


I am writing this up as a post because it is not quite worthy of a git repo, but it is an example of how a small script can solve a “problem”.

In spring 2022, “Hamilton” was coming to town, and I really wanted to get tickets!!! The show was announced several months in advance, and I was constantly ready to click that purchase button. After I had checked for about two weeks, the tickets were still not on sale.

The landing page dedicated to ticket purchases had “Error” in the H1 tag. I did not expect to be able to purchase tickets the moment “Error” was removed, but I assumed the page would at least have more relevant information by then. That assumption turned out to be correct.

My initial thought was to request the page periodically from Python and check the H1 tag for “Error” with Beautiful Soup (BS). Unfortunately, this was not possible. I could speculate forever as to why the page could not be scraped directly, but I chose not to go down that rabbit hole.

After some light searching, I decided to use a curl command. It pulled down the HTML of the web page and saved it to a .txt file. That saved file could be parsed, and BS was back in the mix!

With BS, the script checked the H1 tag, and when it no longer said “Error”, it used Twilio to send me a text. I could then check the website for the new information, or see if I needed to adjust what I was looking for. After about 2 months, my script paid off. I now had the date the tickets would go on sale!
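The H1 check itself can be sketched in a few lines. This is a rough illustration only: it uses Python's built-in html.parser as a stand-in for Beautiful Soup so that it runs without extra packages, and the names FirstH1Parser, first_h1_text, and should_notify are hypothetical, not part of the script shown at the end of this post. It also assumes that a page with no H1 at all (for example, an empty file from a failed download) should not trigger a text, which is a judgment call.

```python
from html.parser import HTMLParser


class FirstH1Parser(HTMLParser):
    """Collects the text inside the first <h1> element it encounters."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False   # True while inside the first <h1>
        self.done = False    # True once the first <h1> has been closed
        self.parts = []      # text fragments found inside that <h1>

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self.done:
            self.in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1" and self.in_h1:
            self.in_h1 = False
            self.done = True

    def handle_data(self, data):
        if self.in_h1:
            self.parts.append(data)


def first_h1_text(html):
    """Return the stripped text of the first complete <h1>, or None."""
    parser = FirstH1Parser()
    parser.feed(html)
    parser.close()
    if not parser.done:
        return None
    return "".join(parser.parts).strip()


def should_notify(html):
    """True when an H1 exists and it no longer mentions 'Error'."""
    text = first_h1_text(html)
    if text is None:
        # Assumption: no H1 means a bad download, so stay quiet.
        return False
    return "error" not in text.lower()
```

Comparing in lowercase and stripping whitespace makes the check a little more forgiving than an exact match on “Error”, at the cost of staying silent if the new heading happens to contain the word “error”.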

To fill in some of the gaps, here is a little more explanation of the pieces involved…

I used a server I had set up on DigitalOcean, but the script could run anywhere, even on my laptop. The laptop gets turned off sometimes, though, and I wanted something with round-the-clock availability.

Using a cron job and a curl command, the landing page was downloaded twice an hour and saved to a text file. A second cron job ran the Python script a minute later. The script used BS to check the H1 tag, and if it did not contain “Error”, it sent me a notification via my Twilio account.
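This setup relies on the download finishing before the script runs a minute later. As a rough sketch of one alternative, assuming curl is installed on the server, the Python script could run curl itself and only look at the file when the download succeeded. The function names, the timeout value, and the argument names below are hypothetical placeholders, not part of the original script.

```python
import subprocess


def build_curl_command(url, output_path):
    """Return the curl invocation as an argument list (no shell involved)."""
    return [
        "curl",
        "--silent",       # no progress meter cluttering cron output
        "--show-error",   # still print a message if the transfer fails
        "--output", output_path,
        url,
    ]


def fetch_page(url, output_path, timeout_seconds=60):
    """Run curl and report whether it exited successfully."""
    try:
        result = subprocess.run(
            build_curl_command(url, output_path),
            timeout=timeout_seconds,
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # curl took too long, or curl is not installed on this machine
        return False
    return result.returncode == 0
```

With something like this, a single cron entry could call the script, which would run fetch_page first and skip the H1 check whenever it returns False.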

It did not change the world, but it freed me from constantly checking back and worrying that I would miss out on a great musical!

In case you are wondering, I was actually able to get tickets, and absolutely enjoyed it!

This is the Python script, minus important account information.

from bs4 import BeautifulSoup
from twilio.rest import Client

# url to check; the download itself was moved to a cron job,
# but the url is still included in the text message
url = "<webpage to check>"

# would be used to scrape directly if allowed
# import requests
# request = requests.get(url)

# location of HTML data from curl command
curl_file_location = 'hamilton.txt'

with open(curl_file_location) as file:
    hamilton_data = file.read()

soup = BeautifulSoup(hamilton_data, "html.parser")

# first H1 on the page, or None if the page has no H1
h1 = soup.find("h1")

# only notify when an H1 exists and it no longer says "Error"
if h1 is not None and h1.get_text(strip=True) != "Error":

    send_list = ["<phone number to notify>", "<phone number to notify>"]
    account_sid = "<account sid>"
    auth_token = "<auth token>"

    client = Client(account_sid, auth_token)

    # send one text per phone number
    for phone_number in send_list:
        client.messages.create(
            to=phone_number,
            from_="<twilio account phone number>",
            body=f"If you are getting this, then the Hamilton page no longer says 'Error'. Please check for updates. {url}")

These are the cron job commands.

# runs every hour at minutes specified in first column (5 & 35 after)
5,35 * * * * curl <website to pull HTML> --output <HTML from curl>.html

# run python script after curl command has chance to save
6,36 * * * * <path to python execution> <path to file to run>