A Refactoring Project in Python: Music CD List, Part 1 of 3
Published: January 25, 2021Last Updated: January 25, 2021
Sometimes, as a software developer, there’s a little bit of code which could use some fixing up to make it just right. Other times, there’s a lot of code that could use a ton of rewriting to make it half-way decent. And sometimes, it’s not until months later when this realization occurs.
What follows is a contrived example to demonstrate building, and in the process refactoring, a small Python application. The only “requirement” at the start: store a list of CDs using a linked list. And yes, inputting the CDs in the code is just fine (this will be fixed after unit tests are added, and not addressed in the post directly).
This is the first entry in a series of entries that are going to explore this program and a series of refactorings to improve it. The plan is for three entries, each building upon the previous to achieve a Pythonic and well factored linked list used in a simple test program: The “Music CD List.”
The code will start as a rushed example which “works.” The first thing I’ll do after showing this is add unit tests and make it fit more closely to PEP 8 code formatting. Following that I’ll aim to make it coded with an eye towards extensibility and try to make it Pythonic, or as close to that ideal as I can.
From the perspective of someone who can write Python but perhaps not Pythonically, it will be an endeavor to become aware of the ways of the Pythonic, and imitate them to write better code.
The example application will start as a simple “Music CD list application,” and use a hand coded linked list for its data storage. During this and the following entries I’ll improve it to be well factored and efficient Python code. Hopefully.
The code that follows is a first take with my “rusty” Python skills which I’ll refactor and learn some updated Python best practices with.
This code is available at Github in the “CD List Refactoring Demo” repo, and this version is tagged “linked_list_snippet1a.” It’s even got some messy comments in the repo on this version. Just waiting to be cleaned up!
Check it out here.
Without further delay, here is the initial version (all in one file linked in Github, the copy posted here removed some comments).
Initial “Music CD List”:
###CodeSnippet-linked_list_snippet1a###
class MusicCD:
def __init__(self, artist, title, year, cdid):
self.artist = artist
self.title = title
self.year = year
self.cdid = cdid
def __str__(self):
return f"__str__:{self.title} by {self.artist} from {self.year}"
def comparator(self, other):
if self.title.lower() == other.title.lower():
return 0
elif self.title.lower() > other.title.lower():
return 1
else:
return -1
def display(self):
return f"{self.title} by {self.artist} from {self.year}"
class Node:
def __init__(self, item):
self.item = item
self.next = None
def compare(self, other):
return self.item.comparator(other)
class LinkedList:
def __init__(self, node):
self.head = node
def __iter__(self):
self._ptr = self.head
return self
def __next__(self):
n = self._ptr
if self._ptr == None:
raise StopIteration
else:
self._ptr = self._ptr.next
return n
def get_last_node(self):
ln = self.head
while ln != None and ln.next != None:
ln = ln.next
return ln
def add_to_list(self, item):
n = Node(item)
ln = self.get_last_node()
if ln == None:
self.head = n
else:
ln.next = n
def sort(self):
# bubble sort it
start = True
while start:
prev = None
start = False
for n in self:
a = n
b = n.next
if b is not None and a.compare(b.item) == 1:
self.__swap(prev, a, b)
start = True
prev = a
return
def reverse(self):
p = None
current = self.head
while current != None:
next = current.next
if next != None:
self.head = next
current.next = p
p = current
current = next
return
def reverse_recur(self, p=None, current=None):
if p == None:
current = self.head
if current != None:
next = current.next
if next != None:
self.head = next
current.next = p
self.reverse_recur(current, next)
return
def delete(self, item):
n, prev = self.find_node(item)
if prev is None:
self.head = n.next
else:
prev.next = n.next
return
def insert(self, item_to_insert, item_after):
#insert before item
n, prev = self.find_node(item_after)
if n is None:
raise ValueError("Error finding item to insert before...")
# if prev is null, then item_after is head, insert before
if prev is None:
node = Node(item_to_insert)
node.next = self.head
self.head = node
#else, insert between prev and n
else:
node = Node(item_to_insert)
prev.next = node
node.next = n
def __len__(self):
return self.count()
def __swap(self, prev, a , b):
self.__swap_adjacent(prev, a, b)
def __swap_adjacent(self, prev, a, b):
# swap nodes a and b, assuming previous is provided
if a is None or b is None:
raise ValueError("No value is provided for either node!")
a.next = b.next
b.next = a
if prev is not None:
prev.next = b
else:
self.head = b
def count(self):
i = 0
p = self.head
while p != None:
i += 1
p = p.next
return i
def find_node(self, item):
prev = None
for n in self:
if n.compare(item) == 0:
return n, prev
prev = n
return None
def merge(self, other):
# merge the two lists
new_list = LinkedList(None)
a = self.head
b = other.head
while a is not None or b is not None:
if a is not None and b is not None:
if a.compare(b.item) <= 0:
new_list.add_to_list(a.item)
a = a.next
else:
new_list.add_to_list(b.item)
b = b.next
elif a is not None:
while a is not None:
new_list.add_to_list(a.item)
a = a .next
else:
while b is not None:
new_list.add_to_list(b.item)
b = b.next
self.head = new_list.head
def print_music_list(music_list):
for cd in music_list:
print(f"CD: {cd.item}")
if __name__ == "__main__":
music_list = LinkedList(None)
kid_a = MusicCD("Radiohead", "Kid A", 2000, 1)
the_bends = MusicCD("Radiohead", "The Bends", 1995, 2)
the_king_of_limbs = MusicCD("Radiohead", "The King of Limbs - Live from the Basement", 2011, 3)
crash = MusicCD("Dave Matthews Band", "Crash", 1996, 4)
american_idiot = MusicCD("Green Day", "American Idiot", 2004, 5)
vsq_strung_out_vol9 = MusicCD("Vitamin String Quartet", "Strung Out, Vol. 9: VSQ Performs Music's Biggest Hits", 2008, 6)
music_list.add_to_list(kid_a)
music_list.add_to_list(the_bends)
music_list.add_to_list(the_king_of_limbs)
music_list.add_to_list(crash)
music_list.add_to_list(american_idiot)
music_list.add_to_list(vsq_strung_out_vol9)
print(f"Count of list: {len(music_list)}")
print_music_list(music_list)
music_list.sort()
print(f"\n\nPost Sort:\n")
print_music_list(music_list)
music_list.delete(the_bends)
print(f"\nDelete the_bends")
print_music_list(music_list)
music_list.delete(american_idiot)
print(f"\nDelete american_idiot")
print_music_list(music_list)
music_list.insert(american_idiot, crash)
print(f"\nInserted america_idiot before crash...")
print_music_list(music_list)
print(f"\nInserted the_bends before vsq_strung_out...")
music_list.insert(the_bends, vsq_strung_out_vol9)
print_music_list(music_list)
music_list.sort()
print(f"\nSort it again....")
print_music_list(music_list)
# Test out Merge capability
print(f"\n\nTest out merge capability (by title)")
tmp = LinkedList(None)
tmp.add_to_list(MusicCD("A", "A", 1, 1))
tmp.add_to_list(MusicCD("C", "C", 2, 2))
tmp.add_to_list(MusicCD("E", "H", 1, 1))
tmp.add_to_list(MusicCD("E", "J", 2, 2))
print("\ntmp 1 list:")
print_music_list(tmp)
tmp2 = LinkedList(None)
tmp2.add_to_list(MusicCD("B", "B", 1, 1))
tmp2.add_to_list(MusicCD("D", "D", 2, 2))
tmp2.add_to_list(MusicCD("G", "G", 2, 2))
print("\ntmp 2 list:")
print_music_list(tmp2)
print("\nAfter merging tmp1 and tmp2 (by title)")
tmp.merge(tmp2)
print_music_list(tmp)
###CodeSnippetEnd-linked_list_snippet1a###
Obviously this program is screaming profitable startup business. I always like to type my CDs into a Python file to add them to a list that I can’t even save to disk. Just imagine the sales. Investors are probably looking for my email address after reading through it.
Realistically, it’s screaming: TEST ME!!!!
Or, at least test it better than it’s being tested so far anyway.
It is also screaming for some type of system which allows the person to input their CDs without having to open a Python file to type them in (that will be in the repo at the end, and not covered in the post directly).
On Route to the Pythonic
So a long time ago, before the sample program was pasted into this blog post and to the point you would have to scroll to review what I wrote, I said we were going to make this have unit tests, and endeavor to make the code “Pythonic.”
Well, unit tests are not that hard. But what in the world is “Pythonic?” Is there an official “Organization Pythonic” which declares what is and what is not Pythonic? Is it people who know Python on Twitter tweeting yay or nay?
Hmmmm, this just got complicated all of a sudden. I have stumbled into the world of Pythonic subjectivity and I’m not even an expert in all things Python. It is time to consult the scrolls of other bloggers posted elsewhere to determine the proper definition.
Here are some links to smarter folks who have written it down to be shared:
- How to be Pythonic and why you should care, Featuring: Pythonic: a Bionic Python?
- The Hitchhiker’s Guide to Python: Code Style
- Pythonic Code: Best Practices to Make Your Python More Readable
- Pythonic | Definition of Pythonic at Dictionary.com
- PEP 20 – The Zen of Python
The previous list shows how the Internet can lead you astray. The Dictionary.com definition of ‘Pythonic’ is potentially right, at least one of the definitions! If all Python code was Pythonic and the definition of Pythonic is “Python like…” I’ll stop now. The rest of the links are fine.
My personal definition is that you should follow the best practices of software development as applied to the Python language. Which to me, at this point, means following PEP 8 and respecting PEP 20, or, “The Zen of Python.” PEP 8 is more style guidelines for Python code, while PEP 20, is more a spiritual framework for writing good Python code, with pearls like:
- Beautiful is better than ugly.
- Explicit is better than implicit.
- Sparse is better than dense.
- Now is better than never.
- Although never is often better that right now.
- …
In essence, follow good practices and use the modern idioms of Python as best as possible. Don’t write C style code in Python. That wouldn’t be Pythonic.
It’s time to clean up this code, add some unit tests, and make it’s second iteration on the whole better than the first. The goal will be to test it well, and try to follow the idioms of Python as well as possible for now. If I miss some, I’ll pick them up later.
Refactor Take 1: Add Unit Tests, Basic Cleanup
The first thing was to figure out how to unit test the Python code. Turns out this is extremely easy in Python, just use the “unittest” module which is built-in. There are other ways, but to get started, and have some potential automation, this is “good enough.” It will let us test the software.
To get started cleaning the code up I’ve cleaned up the
linked_list.py
of the previous snippet, removing all of the test
code which was of very little use there anyway.
Then I created test_linked_list.py
which is where the unit tests are
stored. The tests are simple to run from the command line:
python -m unittest
This scans the directory for all files that start with test*
and
runs the tests and reports the results.
It’s also possible to run them through an integrated test runner, such as is available and easily setup in a few clicks with Visual Studio Code. In fact this is what I have been doing in general.
The basics of the unittest
module are extremely easy to get started
with. Import the module, and create a class which derives from
unittest.TestCase
. The methods in that class which begin with
test_
will be executed by the runner. Success or failure is handled
by the assert
statements which are used to check the results of the
code being tested with the expected results.
You can see the code for with these changes here. It’s tagged “linked_list_snippet1b.”
Here is the unit testing code:
###CodeSnippet-test_linked_list_snippet1a###
import unittest
from linked_list import LinkedList, Node
class Item:
""" Default Item for Linked List Test """
def __init__(self, title):
self.title = title
def comparator(self, other):
""" Comparator for item """
if self.title.lower() == other.title.lower():
return 0
if self.title.lower() > other.title.lower():
return 1
return -1
class TestLinkedList(unittest.TestCase):
""" Linked List Unit Test Class """
@staticmethod
def _validate_list(linked_list, items = None):
for idx, val in enumerate(linked_list):
assert items[idx].title == val.item.title
@staticmethod
def _gen_tmp_list():
tmp = LinkedList(None)
item_1 = Item("a")
item_2 = Item("b")
item_3 = Item("c")
tmp.add_to_list(item_1)
tmp.add_to_list(item_2)
tmp.add_to_list(item_3)
return tmp, [item_1, item_2, item_3]
@staticmethod
def test_iterator():
""" Test the iterator """
a, b, c = Item("a"), Item("b"), Item("c")
tmp = LinkedList(Node(a))
tmp.head.next = Node(b)
tmp.head.next.next = Node(c)
TestLinkedList._validate_list(tmp, [a, b, c])
assert len(tmp) == 3, "Expected 3"
@staticmethod
def test_get_last_node():
""" Test getting the last node """
a,b,c = Item("a"), Item("b"), Item("c")
tmp = LinkedList(None)
tmp.add_to_list(a)
assert tmp.get_last_node().item is a, "Expected last to be Item a"
tmp.add_to_list(b)
tmp.add_to_list(c)
assert tmp.get_last_node().item is c, "Expected last to be Item c"
@staticmethod
def test_sort():
""" Test sorting the linked list """
tmp, [i1, i2, i3] = TestLinkedList._gen_tmp_list()
tmp.sort()
TestLinkedList._validate_list(tmp, [i1, i2, i3])
i4 = Item("d")
i5 = Item("e")
i6 = Item("f")
tmp.add_to_list(i5)
tmp.add_to_list(i6)
tmp.add_to_list(i4)
tmp.sort()
TestLinkedList._validate_list(tmp, [i1, i2, i3, i4, i5, i6])
tmp2 = LinkedList(None)
tmp2.add_to_list(i1)
tmp2.sort()
TestLinkedList._validate_list(tmp2, [i1])
@staticmethod
def test_reverse():
""" Test reversing the linked list"""
item_d = Item("d")
tmp = LinkedList(None)
tmp.add_to_list(item_d)
tmp.reverse()
TestLinkedList._validate_list(tmp, [item_d])
tmp, [i1, i2, i3] = TestLinkedList._gen_tmp_list()
tmp.add_to_list(item_d)
tmp.reverse()
TestLinkedList._validate_list(tmp, [item_d, i3, i2, i1])
@staticmethod
def test_reverse_recur():
""" Test recursively reversing the linked list """
item_d = Item("d")
tmp = LinkedList(None)
tmp.add_to_list(item_d)
tmp.reverse_recur()
TestLinkedList._validate_list(tmp, [item_d])
tmp, [i1, i2, i3] = TestLinkedList._gen_tmp_list()
tmp.add_to_list(item_d)
tmp.reverse_recur()
TestLinkedList._validate_list(tmp, [item_d, i3, i2, i1])
@staticmethod
def test_insert():
""" Test inserting an item into the linked list """
tmp, [i1, i2, i3] = TestLinkedList._gen_tmp_list()
i4 = Item("d")
tmp.insert(i4, i2)
TestLinkedList._validate_list(tmp, [i1, i4, i2, i3])
i5 = Item("e")
tmp.insert(i5, i1)
TestLinkedList._validate_list(tmp, [i5, i1, i4, i2, i3])
@staticmethod
def test_count():
""" Test getting a count of items """
tmp, items = TestLinkedList._gen_tmp_list()
assert tmp.count() == len(items), "Expected equal lengths"
assert len(tmp) == len(items), "Expected equal lengths"
@staticmethod
def test_merge():
""" Test merging the linked list """
letters = ["a", "c", "e", "g"]
letters2 = ["b", "d", "f", "h"]
items = []
l1, l2 = LinkedList(None), LinkedList(None)
for _, (i, i2) in enumerate(zip(letters, letters2)):
l1.add_to_list(Item(i))
items.append(Item(i))
l2.add_to_list(Item(i2))
items.append(Item(i2))
l1.merge(l2)
TestLinkedList._validate_list(l1, items)
@staticmethod
def test_add():
""" Test adding an item to the linked list """
tmp = LinkedList(None)
tmp.add_to_list(Item("a"))
assert tmp.count() == 1
tmp.add_to_list(Item("b"))
tmp.add_to_list(Item("c"))
assert tmp.count() == 3
titles_correct = ["a", "b", "c"]
for idx, val in enumerate(tmp):
assert titles_correct[idx] == val.item.title
@staticmethod
def test_delete():
""" Test deleting an item from the linked list """
tmp, items = TestLinkedList._gen_tmp_list()
tmp.delete(items[2])
items.remove(items[2])
TestLinkedList._validate_list(tmp, items)
tmp, [i1, i2, i3] = TestLinkedList._gen_tmp_list()
tmp.delete(i2)
TestLinkedList._validate_list(tmp, [i1, i3])
tmp, [i1, i2, i3] = TestLinkedList._gen_tmp_list()
tmp.delete(i1)
TestLinkedList._validate_list(tmp, [i2, i3])
if __name__ == "__main__":
unittest.main()
###CodeSnippetEnd-test_linked_list_snippet1b###
So overall, very easy to add unit tests here, and easy to see that it
offers a consolidated way to automate testing of the LinkedList
class, and any other classes in a similar fashion. It begs to work
with Test Driven Development (TDD), or something similar in the
future. It makes it easier to modify the class later as well, because
as soon as you change it, you can run the tests to see if any
functionality has broken.
Now it’s possible that this code will be changed fairly drastically in the next post. The LinkedList class has some modularity coming to it, so that pieces of its functionality can be implemented by different providers. Nobody wants to use Bubble Sort all the time to sort their list like we are here. Part of arguably making this more “Pythonic” will involve even further refactoring.
Despite this, cleaning up the code and adding some tests is a good first step, and by the sole act of adding tests I can be confident that the list is working. If I find a bug, I have a place to come and add a test to track it down or prevent it before shipping it next time.
Is it Pythonic?
Um, maybe, slightly more? Although definitely not if you follow along
with a book like “Effective Python” by Brett Slatkin. The LinkedList
class definitely doesn’t inherit from collections.abc
or implement
all that it should with regards to that. I’m sure there are idioms
that were missed as well. Overall, I’d say it’s more Pythonic than
the first iteration, but it has a reasonable distance to go. With the
addition of running pylint
and some configuration I have managed to
get it to follow the PEP 8 guidelines for formatting the code.
I’ll continue to fix it up in the next two blog posts.
I also improved it by moving the MusicCD
class out of
LinkedList.py
since it’s independent of it.
Wrapping Up Part 1:
The Linked List has a number of features, including sort, reverse, add and delete. After the first refactoring it has a nice suite of tests which can be easily automated.
This little program was (and I would argue still is) fairly useless because there was no dynamic input, but that has been resolved.
To add input to the Music CD List, what I’ve done after the first refactoring, is to add a basic user interface in the form of a SPA using React. It allows the user to upload a CSV file with CDs which are then stored in a Linked List on the server, or add a CD through a form on the website. It also also has a download feature.
It’s not “feature complete,” but at least with something to add features to I’ll have something of an idea generator to push progress along. It was also fun to fire up a Flask project and tie a React front end to it. I’m also not a React expert, but it was pretty easy going to get it up and running in VS Code. By the end of the third post I’m planning on having an Ansible and Terraform based deployment system for it.
That being said, so far the web site isn’t really a core portion of this series and while it will drive some of the features, I don’t plan on focusing on it’s development or updates as much as the Linked List.
The Music CD List program has a ways to go, but it’s off to a decent start.
Let’s take a quick look at some of the features we may add in the second and third post (not an exhaustive list):
- Sort on any key
- Change sorting algorithms
- Store Video CDs (or other types of items)
- Ability to page through the list
- Put the LinkedList class into a repo all of its own for reuse.
- Put the Program that uses it in another repository and call this code.
- Inherit from
collections.abc
- Be able to swap out the Linked List storage with some other data structure for performance.
- Many other things I’ll think of later.
In the next post we’re going to start with a few of these. Starting with refactoring the Linked List class so that we can accomplish a few of these items, and of course, also – be more Pythonic. I’ll also look at how it’s connected to the web application and how to improve this.
The tagged repo with the web interface support is available here.
To run it you’ll need to know how to launch the Flask application and also the SPA using npm and the create-react-app. This information is available all over the Internet so I won’t go into it here.
Code Links (also in post):
Stay up to date on new posts and projects by subscribing to my newsletter. I won't send spam or sell your information.
© Copyright 2021, Tyler Rhodes