Creating a Chrome Extension

“Programs often stem from a developer’s itch”

unknown

Introduction

Google’s Chrome browser is an amazing feat of cross-platform software engineering that I (along with millions of other people!) use on a daily basis.

Like other web browsers, Chrome has the ability to support extensions to extend its functionality.

In this post, I am going to document my experience writing my first Google Chrome extension. Along the way, I will illustrate the way that I went about doing it, and some of my philosophy for making something like this happen.

Great Courses

I am a big proponent of life-long learning: as a matter of fact, I would even argue that all learning is life-long in nature, in as much as there is really never any “end” to it.

I remember hearing a story one time about a lady who thought that if you knew how to play the piano, you could “play anything” written for the instrument. I wish that were true, because if so, I could plop down on my piano bench and whip out a few of Godowsky’s Etudes and perhaps one of Liszt’s concertos.

But alas it is not so! Instead, playing the piano, much like practicing yoga — or a million other examples — is just something that you can plug away at for as long as you would ever care to, without really doing more than even scratching the surface of what is possible.

Anyway, one of my favorite things to do is to listen to the audio lectures from the Great Courses. Over the last few years, Great Courses offerings such as Masterpieces of Short Fiction and The Great Works of Sacred Music have greatly enhanced my daily commutes, and my life in general.

But one trend I have noticed in the digital content community is a move towards a streaming model, where for a flat monthly payment, you can consume as much content as you would like! (With the ubiquity of Netflix and Spotify, it is hard to imagine that, not too long ago, paying 20 dollars to buy a single CD or DVD was commonplace.)

Well, the same thing eventually happened with the Great Courses, and wouldn’t you know it, I soon started getting advertisements for The Great Courses Plus, the streaming version of the Great Courses, which offers most of their oeuvre for a flat monthly fee!

So how does all this relate to a Chrome extension? I’m glad you asked!

As it happens, the courses streamed through The Great Courses Plus are hosted on a different website than their piecemeal counterparts, which are still sold individually on the original Great Courses website.

Additionally, the web page for the streamed course has very little information on it, and mostly just consists of a list of the audio lectures. The corresponding web page from the original Great Courses website, in contrast, has quite a bit of information explaining in detail what the course is all about.

The following two screen shots illustrate this.

The first one shows lots of information:

https://www.thegreatcourses.com/courses/the-great-works-of-sacred-music.html

While the Great Courses Plus page for the same course basically only lists the lectures available:

https://www.thegreatcoursesplus.com/show/the_great_works_of_sacred_music

Therefore, the point of my Great Courses Chrome extension will be to create a small button on the Great Courses Plus page, which when clicked, will link to the original page on the Great Courses website.

The whole thing can be summarized in the following micro-Haiku:

Itch / Scratched!

Here is a picture of my finished product:

Getting Started

I began this blog post with the quote that “programs often stem from a developer’s itch.” (I seem to remember seeing that quote somewhere at one time in my life, but it is possible that I just made it up.)

But one way or another, the quote is good to include here because it illustrates a truism: in order for something to exist, somebody had to take the time to create it.

But in order for them to do so, another requirement must also be met, and that is for somebody to have what I call the “creator mindset.” A creator mindset is one that allows a person to conceive of themselves as a creator of things, instead of just a consumer of them.

Learning to program is an excellent way to make this shift, because it allows us to look at the activities we do on computers as things to be extended, customized, and automated.

So with all this being said, the little annoyance in my life of having to manually seek out the corresponding Great Courses page when evaluating an offering from the Great Courses Plus, would soon be solved with — you guessed it! — a handy Chrome extension.

The Document Object Model

One beautiful feature of most web pages is that they are written using plain text markup, which as everybody knows, is usually in the HTML format.

A web browser, then, takes this markup, and creates an internal representation of it, which is ultimately used to render the HTML’s content onto the screen. This internal representation is called the DOM, or document object model.


File:DOM-model.svg. (2012, January 15). Wikimedia Commons, the free media repository. Retrieved 03:23, March 23, 2019 from https://commons.wikimedia.org/w/index.php?title=File:DOM-model.svg&oldid=65410253.

I mention the DOM because its existence allows us to do something cool with web pages: modify them after they’ve loaded! This is possible because, as I mentioned above, the actual content we see on our screens is rendered from the DOM, and only indirectly from the original HTML. Additionally, any changes to the DOM, even after the page is completely loaded, are instantly reflected in the resultant web page.
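The same idea can be tried outside the browser. Here is a minimal sketch using Python's built-in xml.dom.minidom module, which implements a small subset of the DOM API. The page markup and the add_link_button helper are made up for illustration, and a real browser uses a far more forgiving HTML parser, but the principle carries over: parse the markup into a tree, change the tree, and the change shows up in what gets rendered.

```python
from xml.dom import minidom

# A tiny, well-formed stand-in for a web page. The markup is invented for
# this example; only the "button-container" class name comes from the post.
PAGE = (
    '<html><body><div class="button-container">'
    "<button>Play</button></div></body></html>"
)


def add_link_button(markup, label, href):
    """Parse markup into a DOM tree, append a button, and serialize it back."""
    doc = minidom.parseString(markup)
    for div in doc.getElementsByTagName("div"):
        if div.getAttribute("class") == "button-container":
            # Build <form><button formaction="...">label</button></form>
            form = doc.createElement("form")
            button = doc.createElement("button")
            button.setAttribute("formaction", href)
            button.appendChild(doc.createTextNode(label))
            form.appendChild(button)
            div.appendChild(form)
    return doc.documentElement.toxml()


print(add_link_button(PAGE, "Great Courses", "https://www.thegreatcourses.com/"))
```

The printed markup contains the new button even though the original string never did, which is the same effect we are after in the browser.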

These introductory remarks about the DOM serve to illustrate the way I went about creating the Chrome extension.

I have heard before that a good motto in software development is to “release early, and release often.” (I think this is sort of a “software-centric” way of saying “well begun is half done.”) Either way, this type of “jump right in” mentality serves us well, because I have found that while coming up with ideas is easy, actually grounding our vision into material existence is hard!

As it pertains to this example, I decided to “jump right in”, and start to prototype a way to create the button on the Great Courses Plus page right away: even before starting development on a Chrome extension.

I did this by just firing up Chrome, loading up a course I was interested in, and taking a quick look at the source for the web page. After a few minutes of poking around, I found the code responsible for the location on the original web page where I wanted to insert the new button:

And using the JavaScript console available via Chrome’s amazing developer tools, I concocted some simple JavaScript:

// grab the container that holds the page's existing buttons...
var x = document.getElementsByClassName("button-container");
// ...and append a form-wrapped button that points at the Great Courses page
x[0].innerHTML = x[0].innerHTML +
    "<p> <form><button formaction='https://www.thegreatcourses.com/courses/the-great-works-of-sacred-music.html'>Great Courses</button></form>";

As you can see, the JavaScript above uses the formaction attribute (which was introduced in HTML5) to have a button that links to another URL. (That is why the <button> element we concatenate on above is wrapped in a <form> element.)

(Note: I have no idea if using the formaction attribute is necessarily the optimal way to achieve our aims, but it seems to work… so in this case, we will just stick to the trusty motto of KISS: Keep It Simple Stupid!)

So with the first part of our work done (how to actually embed a new button onto a Great Courses Plus web page), we can now turn our attention to the next part of our requirement: how can we determine the corresponding URLs between the Great Courses and Great Courses Plus websites?

URL Mapping

Consider the following two URLs:

https://www.thegreatcourses.com/courses/the-great-works-of-sacred-music.html

and

https://www.thegreatcoursesplus.com/show/the_great_works_of_sacred_music

As can be seen, it would appear trivial to translate between them: swap the domain, replace /show/ with /courses/, turn the underscores into hyphens, and append .html. Alas, however, the pattern doesn’t always hold, and there is no easy way to map one URL onto another.

Consider for example, the following:

https://www.thegreatcourses.com/courses/latin-101-learning-a-classical-language.html

and

https://www.thegreatcoursesplus.com/latin-101

Since no consistent pattern exists between the URLs used by the two sites, I decided I would create a lookup table to convert one onto the other.
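To make the problem concrete, here is a small Python sketch of the naive rewrite that the first pair of URLs suggests, run against both pairs shown above. The naive_original_url function is hypothetical and is not part of the extension; it exists only to show where the pattern breaks down.

```python
from urllib.parse import urlparse


def naive_original_url(plus_url):
    """Guess the Great Courses URL from a Great Courses Plus URL.

    Assumes the course slug is the last path segment and that underscores
    simply become hyphens, which is what the first example suggests.
    """
    slug = urlparse(plus_url).path.rstrip("/").split("/")[-1]
    return (
        "https://www.thegreatcourses.com/courses/"
        + slug.replace("_", "-")
        + ".html"
    )


# (Great Courses Plus URL, actual Great Courses URL) pairs from the post.
EXAMPLES = [
    (
        "https://www.thegreatcoursesplus.com/show/the_great_works_of_sacred_music",
        "https://www.thegreatcourses.com/courses/the-great-works-of-sacred-music.html",
    ),
    (
        "https://www.thegreatcoursesplus.com/latin-101",
        "https://www.thegreatcourses.com/courses/latin-101-learning-a-classical-language.html",
    ),
]

for plus_url, expected in EXAMPLES:
    guess = naive_original_url(plus_url)
    print("match   " if guess == expected else "MISMATCH", guess)
```

The first pair matches, but the Latin 101 guess comes out as latin-101.html, which is not the real page. Hence the lookup table.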

This seems to make sense, but raises the question: how the heck do we create the lookup table??

Obviously one approach would be to try to create it by hand, but that type of approach is antithetical to the programmer’s mindset. (That mindset is captured in an adage that I think comes from one of the fathers of computing, though I can’t vouch for the attribution: “if something is worth doing once, it is worth automating.”)

What I ended up doing was to use the page listed on The Great Courses Plus that shows the entire catalog:

Which upon further inspection, conveniently provides all the data in JSON format:

The above JSON provides us with the Great Courses Plus description of each course offering.

Enter Python

Python is an extremely useful, very high-level scripting language. I actually resisted the language at first, especially the fact that it uses indentation for code blocks, but over time I’ve come to really like Python, and try to use it anytime I have a one-off type program to write.

The other nice thing about Python is just how high level it is. When you are writing Python code, you don’t have to worry about a lot of low-level details. Instead, you stand on the shoulders of giants, and make use of lots of pre-existing functionality which exists in the form of Python libraries.

The next task for my Chrome Extension was to use the JSON that I mentioned above to somehow map to the corresponding URL on the Great Courses website.

I did this by using programmatic access to Google’s Search algorithm, which is available as part of the Google Cloud Platform.

The GCP is really amazing, and provides developers with application programming interfaces (APIs) for interacting with Google’s servers programmatically.

In this case, I made use of Google’s Custom Search JSON API. What I did was to write some Python code to iterate through the JSON above, and in essence ask Google to search the domain “www.thegreatcourses.com” for the best match for each course name in turn.

(Using Google’s amazing API explorer is a good way to get a handle on how to interact with the Custom Search JSON API.)

Once I got a JSON response back from Google, I just created a new entry in the original JSON data.
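As a rough sketch of that round trip, the snippet below builds a request URL and pulls the top link out of a response, without touching the network. The endpoint, the q, cx, and key parameters, and the items/link fields are the same ones my script uses; the helper names, the placeholder credentials, and the canned response are made up for illustration.

```python
import json
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(title, api_key, cx):
    """Build the request URL; urlencode escapes each value for us."""
    return API_ENDPOINT + "?" + urlencode({"q": title, "cx": cx, "key": api_key})


def first_link(response_text):
    """Return the top result's link, or None if the response has no items."""
    items = json.loads(response_text).get("items", [])
    return items[0]["link"] if items else None


# Hypothetical canned response, trimmed down to the one field used here.
SAMPLE_RESPONSE = json.dumps(
    {
        "items": [
            {
                "link": "https://www.thegreatcourses.com/courses/"
                "the-great-works-of-sacred-music.html"
            }
        ]
    }
)

course = {"COURSE_TITLE": "The Great Works of Sacred Music"}
print(build_search_url(course["COURSE_TITLE"], "YOUR_API_KEY", "YOUR_CX"))

link = first_link(SAMPLE_RESPONSE)
if link is not None:
    course["GREATCOURSEURL"] = link  # the new entry added to each course
print(course)
```

Checking for a missing items list up front is a design choice in this sketch: a search that returns no results then leaves the course untouched instead of raising an exception.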

So in essence, I turned this:

With a little help from Google, into this:

Here is what the Python script looked like:

# 
# GenerateGreatCourseURLs.py
#
# load in JSON file description of the great courses, extracted from:
#  https://www.thegreatcoursesplus.com/allcourses
# 
# Then use google custom search engine API request to search for each course on Google, 
# to locate the corresponding Great Courses URL for the course.
#
# Update the JSON to include the URL, then dump the results back out to disk
#
# This is used as part of the Great Courses Plus Google Chrome Plugin that I am developing, and is 
# sort of just a one-off type script, so not much real error checking is done...
# 
# Benjamin Pritchard / Kundalini Software
#
# 23-March-2018     Version 1.0     Initial Release
#


import json
import urllib.parse
import urllib.request

def QueryGoogle(course):
    
        print("Querying Google for %s" % course["COURSE_TITLE"])
    
        # create the query string
        SearchString = urllib.parse.quote(course["COURSE_TITLE"])
        
        # grab my google credentials from a file, to make sure I don't accidentally upload them
        # to github (raw string, so the backslashes are not treated as escape sequences)
        with open(r'c:\kundalini\keys\google_cs_credentials.txt') as fp:  
            fp.readline()                         # first line is comments    
            GoogleAPIKey = fp.readline().rstrip()
            CX = fp.readline().rstrip()

        URL = 'https://www.googleapis.com/customsearch/v1?q=' + SearchString + '&cx=' + CX + '&key=' + GoogleAPIKey
        
        # make the request
        contents = urllib.request.urlopen(URL).read()

        # parse the returned JSON
        JSON = json.loads(contents)
        course["GREATCOURSEURL"] = (JSON["items"][0]["link"])

print("Reading and parsing course JSON...")
# grab the course data from disk...
with open('Course.JSON') as json_file:  
    data = json.load(json_file)

# and ask google for each URL in turn...
counter = 0
for course in data:
    if "GREATCOURSEURL" in course:
        print("URL already present; skipping course %s" % course["COURSE_TITLE"])
    else:
        try:
            QueryGoogle(course)
            counter += 1
        except Exception as err:
            # e.g. a network failure, or a response with no search results
            print("error occurred for course %s: %s" % (course["COURSE_TITLE"], err))

print("%d total courses processed! " % counter)
    
# dump the output (which will include the Great Course URL!!) back out to disk...
print("dump .JSON back to disk...")
with open('courseOutput.JSON', 'w') as output_json_file:  
    json.dump(data, output_json_file, indent=4)

Finally, now that I had the URL that I needed for each course, I just needed a way to get at that information conveniently from my extension itself, which would ultimately be implemented in JavaScript, not Python.

So to do this, I created one more Python script, whose job was to generate the JavaScript code that I could incorporate into my extension:

# 
# GenerateGreatCourseGenerateJS.py
#
# load in JSON file created by GenerateGreatCourseURLs.py, and generate a simple .JS file
# for use with our Chrome plugin 
# 
# Benjamin Pritchard / Kundalini Software
#
# 23-March-2018     Version 1.0     Initial Release
#


import json

code_header = '''// this lookup table is auto-generated by GenerateGreatCourseGenerateJS.py
// Any modifications to this file will be lost if you run the script again!!
var lookup = new Map([ 
'''

code_footer = ''']);
// find out name of current course...
courseTitle = document.getElementsByClassName("course-info")[0].firstElementChild.innerText;
console.log("course title = " + courseTitle);

// lookup its URL...
URL = lookup.get(courseTitle);
console.log("URL = " + URL);

// and add a new button to it...
if (URL != undefined) {
ButtonContainer=document.getElementsByClassName("button-container");
NewHTML = "<p> <form target='_blank'><button formaction='" + URL + "'>Great Courses</button></form>";
ButtonContainer[0].innerHTML = ButtonContainer[0].innerHTML + NewHTML; 
}
'''

print("Reading and parsing course JSON...")
# grab the course data from disk...
with open('courseOutput.JSON') as json_file:  
    data = json.load(json_file)

print("Creating Javascript...")
with open('greatcourses.js', 'w') as output_js_file:
    output_js_file.write(code_header)

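    # Emit one [title, url] pair per course. Courses whose Google query failed
    # have no URL and are skipped; json.dumps escapes any quotes in a title so
    # that the generated JavaScript stays valid.
    entries = [
        json.dumps([course["COURSE_TITLE"], course["GREATCOURSEURL"]])
        for course in data
        if "GREATCOURSEURL" in course
    ]

    # join with commas, so there is no comma after the last line!!
    output_js_file.write(",\n".join(entries) + "\n")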

    output_js_file.write(code_footer)

print("Done")

Once that was done, I just created a manifest file, found some freely available icon files, and packaged the whole thing into the .ZIP format that Google wants for Chrome extensions.

The manifest is simple enough, and just uses a declarative content_scripts block to inject the JavaScript code I generated above into each web page whose URL matches the match pattern https://www.thegreatcoursesplus.com/* (a match pattern is Chrome’s wildcard URL syntax, not a regular expression).

{
"name": "Great Courses Link for Great Courses Plus",
"description": "Create a link on thegreatcoursesplus pages to the corresponding page on thegreatcourses, which has more information",
"version": "1.0",
"content_scripts": [
{
"matches": ["https://www.thegreatcoursesplus.com/*"],
"js": ["greatcourses.js"]
}
],
"permissions": [
"activeTab"
],
"icons":
{"32": "flower-icon-32x32.png",
"48": "flower-icon-48x48.png",
"64": "flower-icon-64x64.png",
"72": "flower-icon-72x72.png",
"96": "flower-icon-96x96.png",
"128":"flower-icon-128x128.png" },
"manifest_version": 2
}
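Since the rest of the tooling for this project is in Python, the packaging step can be scripted too. What follows is only a sketch under a few assumptions: the package_extension helper and the folder layout are hypothetical, and as far as I know the one thing that matters is that manifest.json sits at the top level of the archive, which is why the files are stored with paths relative to the extension folder.

```python
import tempfile
import zipfile
from pathlib import Path


def package_extension(source_dir, zip_path):
    """Zip every file under source_dir, using paths relative to that folder
    so that manifest.json ends up at the top level of the archive."""
    source = Path(source_dir)
    out = Path(zip_path).resolve()
    names = []
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            # skip directories, and the archive itself if it lives in source
            if path.is_file() and path.resolve() != out:
                name = path.relative_to(source).as_posix()
                archive.write(path, name)
                names.append(name)
    return names


# Demo on a throwaway folder holding stand-ins for the real files.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "extension"
    src.mkdir()
    (src / "manifest.json").write_text("{}")
    (src / "greatcourses.js").write_text("// generated lookup table")
    print(package_extension(src, Path(tmp) / "extension.zip"))
```

For the real thing, the call would point at the folder containing manifest.json, greatcourses.js, and the icon files listed in the manifest.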

For clarity’s sake, I will show an excerpt of the generated JavaScript here:

// find out name of current course...
courseTitle = document.getElementsByClassName("course-info")[0].firstElementChild.innerText;
console.log("course title = " + courseTitle);

// lookup its URL...
URL = lookup.get(courseTitle);
console.log("URL = " + URL);

// and add a new button to it...
if (URL != undefined) {
ButtonContainer=document.getElementsByClassName("button-container");
NewHTML = "<p> <form target='_blank'><button formaction='" + URL + "'>Great Courses</button></form>";
ButtonContainer[0].innerHTML = ButtonContainer[0].innerHTML + NewHTML; 
}

Conclusion

Google Chrome is an amazing piece of software that can be customized using extensions. Writing our own is a fun way to understand a little bit more how Chrome works, and to create a useful piece of software that just happens to provide a scratch to our itch.

Getting the extension

This extension can be downloaded from the Chrome Web Store.

Source code for this project is available on GitHub.