As we learned in a previous reading, an algorithm is a step-by-step procedure for solving a problem.
These problems vary in scope from simple one-off tasks to complicated, generalized tasks that form the core of large, complex systems.
For example, consider the problem of going through a web page and finding the links it contains.
It turns out that a web page is plain text in a format known as hypertext markup language (HTML), so we can search the web page source file for occurrences of the text <a href="...">...</a>
which correspond to links.
For example, the beginning of this paragraph is rendered with the following HTML:
<p>As we learned in <a href="/csc151/readings/algorithm-building-blocks.html">a previous reading</a>, an algorithm is a step-by-step procedure for solving a problem.
These problems vary in scope from simple one-off tasks to complicated, generalized tasks that form the core of large, complex systems.
For example, consider the problem of going through a web page and finding the links it contains.
It turns out that a web page is plain text in a format known as hypertext markup language (HTML), so we can search the web page source for for occurrences of the text <code class="language-drracket highlighter-rouge"><span class="nv"><a</span> <span class="nv">href=</span><span class="s">"..."</span><span class="nv">></a></span></code> which correspond to links.
For example, the beginning of this paragraph is rendered with the following HTML:</p>
The paragraph contains one link corresponding to the text yesterday's reading
.
We will eventually learn how to do operations like this in Scheme, but even though we can’t write a program to do this yet, we can imagine that with proper library support that this is a simple task.
In contrast, the task of scraping web pages for links forms the basis of the algorithms that search engines use to rank webpages. For example, Google’s famous PageRank algorithm ranks the relevance of a webpage by the number of webpages that link to it. In order to do this, Google’s servers comb the Internet for webpages and scrapes their links, recording where they point to in a database.
PageRank itself is a complicated algorithm with many parts. However, what we should take away from this example is the fact that this complicated algorithm boils down to a simple task: scraping webpages for links. The process by which we take a complicated problem like ranking all webpages and distill it into a collection of smaller, easier-to-solve problems is called algorithmic decomposition.
Algorithmic decomposition is the lifeblood of a computer scientist. It is the primary skill that we employ to manage complexity in the problems that we solve. As such, we introduce this concept in this first week of the course to start getting you thinking with this mindset:
The problem that I am trying to solve can be decomposed into these smaller problems…
While we haven’t seen much of the Scheme programming language yet, we know enough to introduce the basics of algorithmic decomposition with the image
library introduced in yesterday’s reading.
As a reminder, I encourage you not to read this passively; instead, enter the code interactively as you cover it in your reading.
This will not only help you get used to typing out Scheme code but also encourage you to play around and experiment with the language.
From last class period’s class, recall that we must include a import
statement in our program so that Scheme knows we’re using the image
library.
(import image)
Our initial reading on the Scheme language introduce us to functions for drawing circles and rectangles:
(import image) (circle 50 "outline" "blue") (rectangle 75 50 "solid" "red")
As well as functions that allow us to place images above and beside each other.
(import image) (above (circle 35 "outline" "blue") (circle 35 "outline" "red")) (beside (rectangle 50 50 "solid" "blue") (rectangle 50 50 "solid" "red"))
Let’s consider the problem of drawing the following more complex figure:
(import image) (define top-row (beside (circle 50 "outline" "red") (circle 75 "solid" "blue"))) (define bottom-row (beside (circle 75 "outline" "blue") (circle 50 "solid" "red"))) (define circles (above top-row bottom-row)) circles
How might we approach this problem? We might start trying to cobble together random combinations of the functions we’ve seen so far, and we might stumble on a program that works. But such an approach—haphazardly trying stuff out until success is achieved—will not scale well when our problems get more complex!
Algorithmic decomposition is our primary problem solving strategy for systematically tackling complex problems. Rather than blundering into a solution, we proceed by:
For example, we might decompose the problem of computing a student’s average assignment grade for a course as follows:
We perform this process naturally in many situations, almost without thinking about it. However, when put into a novel situation, e.g., computer programming, you might neglect to go through these steps. It is therefore instructive to be explicit about your decomposition—luckily, this coincides with excellent programming practices, so your diligence is well-rewarded in this context.
Coming back to the proposed picture, let’s tackle the problem of drawing the image by breaking it down into smaller pieces. One way to do this is to analyze the image by row:
Now, when we go to write the code to produce this figure, we can translate this decomposition into code using the image
library function and the define
command to explicitly name each of the parts.
We can approach decomposition in either a bottom-up or top-down style of design. In a bottom-up style, we first implement the individual of pieces of the program and then we combine them to form the complete program. In a top-down style of design, we first partially implement the complete program and then implement the individual pieces. We’ll illustrate both styles of design below.
Let’s begin with the top row.
We’ll define top-row
to be the top row of circles using the circle
and beside
functions.
Note that this define
command should go into your program below your import
statement rather than in the explorations window:
(import image) (define top-row (beside (circle 50 "outline" "red") (circle 75 "solid" "blue")))
We can now go to the explorations window and test our code.
The practical effect of the define
command is to make top-row
an alias for the image, so we can simply type in top-row
as an additional statement in the explorations window to check our work:
(import image) (define top-row (beside (circle 50 "outline" "red") (circle 75 "solid" "blue"))) top-row
Next, we’ll define bottom-row
to be the bottom row of circles.
(import image) (define bottom-row (beside (circle 75 "outline" "blue") (circle 50 "solid" "red")))
And we’ll check our work in the explorations window:
(import image) (define bottom-row (beside (circle 75 "outline" "blue") (circle 50 "solid" "red"))) bottom-row
Finally, let’s put it all together.
As we discussed, the overall picture is obtained by stacking the top-row
with the bottom-row
which we can do with the above
function.
We can then check that circles
is the image that we wanted in explorations window.
(Make sure that you re-run the explorations window to update it with updates to your program!)
(import image) (define top-row (beside (circle 50 "outline" "red") (circle 75 "solid" "blue"))) (define bottom-row (beside (circle 75 "outline" "blue") (circle 50 "solid" "red"))) (define circles (above top-row bottom-row)) circles
With top-down design, rather than starting with top-row
and bottom-row
, we start with the overall program circles
.
We have identified that circles
is a stack of two rows of images, so we know that the definition of circles
will involve above
.
However, we have not defined top-row
and bottom-row
yet—what do we fill in for the arguments to above
?
Scamper defines a special value {??}
which represents an undefined value in the program.
{??}
acts as a syntactically valid placeholder, reminding us that we should replace {??}
with an eventual implementation.
Let’s do that for top-row
first.
(define top-row {??})
Note that we can use our hole value as a placeholder to be able to write out the syntax of a define correctly and ensure that we get it right.
Of course, when we run this code, we the Hole encountered! Fill me in!
error, but we at least know that we have all the keywords and parentheses in the right place.
Once we define top-row
as before:
(import image) (define top-row (beside (circle 50 "outline" "red") (circle 75 "solid" "blue")))
We can now fill in the corresponding hole in circles
:
(import image) (define top-row (beside (circle 50 "outline" "red") (circle 75 "solid" "blue"))) (define circles (above top-row {??}))
Finally, we can define bottom-row
just like before and then complete the definition of circles
:
(import image) (define top-row (beside (circle 50 "outline" "red") (circle 75 "solid" "blue"))) (define bottom-row (beside (circle 75 "outline" "blue") (circle 50 "solid" "red"))) (define circles (above top-row bottom-row)) circles
You might wonder which sort of design—top-down or bottom-up—to use when writing your programs.
The short answer is that it depends on the kind of problem you are tackling and your own personal preference.
Sometimes you might see how to immediately implement the smaller pieces of a program in which you can start with those pieces and then build up to the overall program.
In other cases, you might not see the pieces and want to essentially outline how the program ought to behave in code, similar to the bulleted list we identified for the design of shapes
.
In this case, you can use top-down design to write this outline and then fill it in incrementally.
Either strategy is valid—be willing to experiment early on with both styles to discover your preferences and be flexible in how you design your code!
Finally, let’s look at the big picture. Take a look at the complete program that we wrote in the definitions pane:
(import image) (define top-row (beside (circle 50 "outline" "red") (circle 75 "solid" "blue"))) (define bottom-row (beside (circle 75 "outline" "blue") (circle 50 "solid" "red"))) (define circles (above top-row bottom-row))
Note how our decomposition strategy has been enshrined in the code. That is, our approach to solving the problem of drawing the grid of circles is evident in the code:
circles
is defined to be a stack of atop-row
andbottom-row
. Each of the rows contain two circles of varying color, shape, and sizes.
By employing algorithmic decomposition in our problem solving and programming, we not only gain the able to make tangible progress in solving the problem; our code is much more readable as a result! As we move forward in the course, always approach problems with decomposition in mind even if they are easy to solve at first. Honing this skill early on in your programming journey will prepare you well for the complex problems will encounter later in the semester!
Here is an alternative version of the code to produce the image of this reading.
(import image)
(define circles
(above (beside (circle 50 "outline" "red")
(circle 75 "solid" "blue"))
(beside (circle 75 "outline" "blue")
(circle 50 "solid" "red"))))
Paste this code into a fresh .scm
source file and verify that circles
produces the same image as before.
Compare and contrast the final version of the code the reading with this version. Answer each of the following questions in a few sentences each.
There are many ways to decompose a problem, many of which are equivalent, but many produce subtlety different solutions. The decomposition we chose in the reading was one where we recognized the image was two rows stacked on top of each other. Try writing the code corresponding to a decomposition where we think of the image as two columns placed side-by-side. Does this result in an identical image? Why or why not?