map
, filter
, and
reduce
.
We’ve started to see some significant power in using two “higher order”
list operations, map
and apply
. These are called “higher order”
procedures because they take procedures as inputs.
map
is particularly useful for a variety of reasons. First, it lets
us do a form of repetition: We use map
to repeatedly apply a procedure
to different values (the elements of the list or lists we map over).
But it’s more than that. Experience suggests that using map
leads
to a different way of thinking about repetition than some of the
other mechanisms, such as “for loops”. (Don’t worry if you’ve never
heard of for loops. You’ll learn about them in another CS class.)
As importantly, map
permits some cool implementations. Since map
does not specify the order in which the elements are processed, one
can implement map so that all (or at least many) of the applications
can be done at the same time. (Computer scientists say “in parallel”.)
As we start to reach the limits on the speed of one processing unit,
the way we make large computations faster is to parallelize them,
to make them run on multiple processing units. While there are explicit
ways to break up a program, by relying on procedures like map
, we
can trust the underlying system to parallelize sensibly, without worrying
about particular details. In fact, map
is so important that it’s a
core part of the parallelization technologies that both Google and
Amazon use.
As you might expect, map
alone does not suffice. As you’ve already
seen, it helps to be able to combine the elements of a list into a
single value. We’ve used apply
for that, but, as we shall soon see,
it does not suffice for all cases. In addition, we may find that not
all of the values in a list are worth considering. So let’s consider
some other useful procedures.
When we began our exploration of numbers, we used a variety of unary (one parameter) procedures, such as those above. But we also used some binary (two parameter) operations, such as addition or multiplication. Can we also use those with lists? It seems like we’d want to. For example, if we wanted to compute mean value in a collection of numbers, we want to add up all of the elements in the collection and then divide by the length of the collection.
We’ve seen one way to do so. We can use apply
. For example, we
can sum the elements in a list with (apply + lst)
. But what
happens if we don’t have an “n-ary” procedure (one that takes
arbitrarily many inputs). Consider, for example, the problem
of reversing each word in a string consisting of a sequence of words
separated by spaces.
The first two steps seem relatively straightforward. We use string-split
to break the string into words. We use string-reverse
(or our anonymous
version of string-reverse
) to reverse each word.
(map (lambda (s) (list->string (reverse (string->list s)))) (string-split "this is a sample set of words" " "))
How do we get them back together? We could try string-append
.
(define reversed-words (map (lambda (s) (list->string (reverse (string->list s)))) (string-split "this is a sample set of words" " "))) (apply string-append reversed-words)
Whoops! We’ve lost the spaces. What to do? What to do?
Fortunately, there’s a standard approach that involves a different kind of decomposition. Rather than thinking about putting everything together all at once, we write a procedure that combines neighboring elements and then repeatedly apply it to neighboring pairs until we’re down to a single element. There are two common terms for that work, “folding” and “reducing”. We’ll use the latter.
First, let’s build our helper procedure. While it will eventually be an anonymous procedure, we’ll start by naming it for convenience.
(define combine-with-space (lambda (s1 s2) (string-append s1 " " s2))) (combine-with-space "a" "b") (combine-with-space "Hello" "Goodbye")
Now, we can use the reduce
procedure with combine-with-space
.
(reduce FUN LST)
, converts LST
to a single value by repeatedly
applying FUN
to neighboring pairs of values, replacing the pair
with the result of the function.
(define combine-with-space (lambda (s1 s2) (string-append s1 " " s2))) (reduce combine-with-space (list "a" "b" "c" "d"))
How does that work? Well, we said that reduce
repeatedly
applies the function to neighboring pairs of values. Let’s
consider what happens.
"a"
, "b"
, "c"
, and "d"
."a"
and "b"
with a space, yielding "a b"
. We now
have three values to combine:
"a b"
, "c"
, and "d"
."a b"
and the "c"
with a space in between, yielding
"a b c"
. We now have two values to combine.
"a b c"
and "d"
."a b c"
and the "d"
with a space in between,
yielding "a b c d"
. We are now down to one value.Of course, we could also combine the values in other ways.
"a"
, "b"
, "c"
, and "d"
."c"
and "d"
with a space, yielding "c d"
. We now
have three values to combine:
"a"
, "b"
, and "c d"
."a"
and "b"
with a space, yielding "a b"
. We now
have two values to combine:
"a b"
and "c d"
."a b c d"
.
We are now down to one value.Fortunately, we end up with the same value either way.
So we can now go back to our original problem: Creating a new string with reversed versions of all the original words.
(define combine-with-space (lambda (s1 s2) (string-append s1 " " s2))) (define reversed-words (map (lambda (s) (list->string (reverse (string->list s)))) (string-split "this is a sample set of words" " "))) (reduce combine-with-space reversed-words)
Note that we expect these auxiliary function definitions, combine-with-space
and reversed-words
, to be local to this computation and not particularly useful outside of this computation.
We, therefore, could inline our definitions to create one large expression to avoid binding excessive names.
(reduce (lambda (s1 s2) (string-append s1 " " s2)) (map (lambda (s) (list->string (reverse (string->list s)))) (string-split "this is a sample set of words" " ")))
Like map
, reduce
provides some advantages to the computer
scientist, programmer, or software engineer. First, it encourages
you to think in terms of decomposition. Rather than dealing with
the whole list at once, you simply think of what to do with neighboring
pairs and then rely on reduce
to do the heavy lifting. And, once
again, we can gain some efficiencies. If it doesn’t matter which
order we do the operations, we can do some of them simultaneously
and otherwise find orderings that are a bit more efficient. (Yes,
it turns out that there are orderings that are more efficient.)
And string-append
(or, in our case, combine-with-space
) is not
the only operation in which the order of operations doesn’t matter.
The same holds (more or less) for addition and multiplication.
For those of you with a mathematical mindset, reduce
works well
with any associative binary operation.
We can, of course, use reduce
in many other ways. To find the largest
value in the list, we reduce with max
.
(reduce max (list 3 1 5 10 3 2))
(reduce min (list 3 1 5 10 3 2))
There’s one more “big” higher-order list-processing functional procedure, (filter pred? lst)
.
filter
takes two parameters, a unary (one-parameter) predicate and a list of values, and selects all the values for which the predicate holds.
(define stuff (list -5 10 18 23 14.0 87 (/ 1 2) 0.5 -12.2)) (filter negative? stuff) (filter integer? stuff) (filter (lambda (n) (and (<= 0 n) (<= n 10))) stuff)
That seems pretty powerful, doesn’t it? Believe it or not, but by the end of this course, you’ll be able to write filter
yourself. (You’ll also be able
to write map
and reduce
, as well as other higher-order list procedures
you design yourself.)
And, as in other cases we’ve seen, combining filter
with other
procedures can let us do new, clever things. For example, suppose
we want to add up the digits in a string containing both digits
and non-digits.
We know that we can convert a string to a list of characters.
(string->list "a 1 and a 2 and a 3")
We can convert a digit character to the digit by getting its collating sequence number and subtracting the collating sequence number of zero.
(map (lambda (c) (- (char->integer c) (char->integer #\0))) (list #\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9))
We can extract all the digits from the list of characters with filter
.
(filter char-numeric? (string->list "a 1 and a 2 and a 3")) (map (lambda (c) (- (char->integer c) (char->integer #\0))) (filter char-numeric? (string->list "a 1 and a 2 and a 3")))
And then we add them all up.
(reduce + (map (lambda (c) (- (char->integer c) (char->integer #\0))) (filter char-numeric? (string->list "a 1 and a 2 and a 3"))))
You’ll see this combination of “the big three” fairly frequently. We filter, we map, then we reduce. Together, they bring great power (and the accompanying great responsibility).
map
with multiple listsWe’ve seen one way to use binary procedures with lists: We can reduce
a list of values to a single value by repeatedly combining pairs of
values with a function. But there’s another. Just as we can use map
to create a new list of values by applying a unary procedure to each
element of a list, we can also use a more generalized version of map
that grabs values from multiple lists and combines them into values
in a new list. In particular, map
can also build a new list by applying
the procedure to the corresponding elements of all the lists. For example,
(define increment (lambda (n) (+ n 1))) (map * (list 1 2 3) (list 4 5 6)) (map + (list 1 2) (list 3 4) (list 5 6)) (map list (range 10) (map increment (range 10)) (map square (range 10))) (define first-names (list "Addison" "Bailey" "Casey" "Devon" "Emerson")) (define last-names (list "Smith" "Jones" "Smyth" "Johnson" "Doe")) (map (lambda (first last) (string-append first " " last)) first-names last-names) (map (lambda (last first) (string-append last ", " first)) last-names first-names)
You may be starting to see some interesting possibilities. If you are not, stay tuned.
In the last example, we used map
to map two lists of names into a single list of names of the form "last, first"
.
(define first-names (list "Addison" "Bailey" "Casey" "Devon" "Emerson")) (define last-names (list "Smith" "Jones" "Smyth" "Johnson" "Doe")) (map (lambda (last first) (string-length (string-append last ", " first))) last-names first-names)
Extend this expression with filter
and reduce
so that:
a. The list only contains "last, first"
formatted names that are at least 13 characters in length (including the ,
and space).
b. We obtain the sum of the lengths of all the names in "last, first"
format.