I spent a
little over an hour jotting
down some notes from the problem
statement, wrote some pseudocode
and then jumped in. I was disappointed when I finished almost eight
coding hours later after making some late-night debugging and logic
blunders. I figured if I hadn't stayed up so late trying to finish it,
I might have made it in six hours. Most of my mistakes came from not
understanding the problem thoroughly.
However, I wasn't complaining too
much. I was still
under the median time (~11 hours) for the rest of the C++ programmers
in the study. My LOC for the program was below the median for the other
C++ coders as well: 220 non-comment non-blank lines for my code vs.
~275 lines for the others. I believe one of the main reasons my
program was shorter than average
was that I took advantage of the C++ Standard Template Library (STL),
which for some curious reason was avoided by all the other coders in
the study.
However, what still bothered me was that Norvig's Lisp version
was so much shorter:
45
lines
vs. my 220 lines.
It made me so curious that I had to figure out why. I hadn't looked at
Norvig's code at all before since I didn't want to gain any ideas from
it, but now I studied it in earnest. I dug out my old copy of Paul
Graham'sANSI Common Lisp
and began to decipher Norvig's program. Once I understood the logic, I
decided to code it in C++ myself to prove that there was
nothing special about Lisp. I believed that an equivalent C++ program
could
be written in the same number of lines, if I simply took Norvig's
approach.
After 3 hours, I finished my
Lisp
to C++ port of Norvig's code
using Visual Studio .NET. (I used VS.NET
because it provided STL hash table functionality, and Norvig was making
use
of CL's built-in hash table implementation in his program.) Converting
the
code took longer than I expected – I was simply unable to translate the
program line-for-line. I had cut my program's size in half (142 lines),
but
even with my changes, I still wasn't able to get close to 45 lines.
I wasn't
sure of all the reasons
I couldn't match the conciseness of Lisp in C++, but I could
intuitively see some differences in the coding styles. One major
difference seemed to be that Lisp could combine more on one line than
my C++ code.
Consider the following statements from my code:
words.reverse();
cout
<< num << ":";
for
(list<string>::const_iterator i = words.begin(); i !=
words.end(); ++i)
cout << *i;
cout
<< "\n";
These lines are used to print out the alphanumeric encoding for
a specific phone number. The encoding is stored as individual
words/digits in a list variable called 'words'. In this code, the
'words' list is reversed (since it was originally stored in reverse
order) and the original phone number is printed, followed by the
space-separated 'words' list.
To do the same thing in Norvig’s code took only one line:
(format
t "~a:~ { ~a}~%" num (reverse words))
This particular example showcases one of Lisp's strengths: functional
programming. Almost every
operation performed in Lisp returns values instead of modifying
variables. [1] Keeping with our above example of reversing a
list, calling (reverse
words)
in Lisp would return a new
reversed list while leaving the original list untouched. So in the Lisp
code, reverse
immediately
hands off its new list to format,
which then prints it.
In contrast, C++ modifies the existing list in-place
when words.reverse()
is called and returns void.
This means that the cout
statement which prints the list to the console has to be on a separate
line, because there is no return value from the function that is
printable. Characteristically then, C++ normally consists of one or two
distinct operations per line because many functions or algorithms directly
modify the variables that are
passed to them.
Also, notice that I had to print out the list explicitly -- cout
doesn't know how to print an STL list by default, a rather egregious
fault in my book. (However, we can solve this with C++ operator
overloading, a
fairly advanced skill.)
I also
found other stylistic differences
between Lisp and C++. Here is another sample line from
Norvig's
program ...
(loop
for word in (gethash n *dict*) do
... and a particularly ugly line from my code ...
for
(HashMap::iterator i = gDict.equal_range(n).begin(); i !=
gDict.equal_range(n).end(); i++)
(notice that hash_multimap
has been typedef'd
to make things less wordy!)
Both the Lisp and the C++ versions perform the same operations
(looking up 0 or more values from a hash map by key), but the Lisp
version
is obviously less wordy and, I believe, more understandable.
The Lisp code is more concise here because the loop is operating on
words contained in the hash map. Instead of words, the C++ code is
returning STL iterators,
which are object-oriented versions of
pointers. The iterators need dereferenced to obtain the words, which
isn't
a significant problem. However, look at the loop structure itself and
how many keywords are devoted to dealing with iterators (iterator,
begin,
end ) which have nothing to do with the actual task -- looping over
words
in the hash map.
I believe most of the complexity of the STL arises from its explicit
use of iterators. 99% of the time you want to iterate over the entire
container in a loop, and in searches you're interested in the value
itself,
not the pointer to the value. Iterators simply provide no benefit in
these circumstances. My conclusion from extensive use of the STL is
that iterators are mostly useless and should be abstracted away
everywhere possible.
Also, there’s an additional side effect that results from iterator use.
I prefer the style of the C++ loop as above, but as it stands, it is
not optimized. The end
iterator really should be calculated outside of the for
loop so that it is not recalculated on each
pass. In addition, the iterator should be incremented using prefix
notation
(++i)
instead of postfix (i++)
to avoid inefficiences resulting from temporary objects as well.[2]
These kinds of considerations are typical of the low-level idioms that
a C++
developer must keep in mind when writing their code. It is also these
kinds
of aspects that a Lisp developer is shielded from with Lisp's
higher-level
constructs. [3]
One other
lesson that can be learned
from Lisp is that it is well suited to bottom-up programming. That is,
in the words of Paul Graham,
Instead of just writing a program in Lisp, you can write your program on Lisp, and write your program in that.[3]
Graham goes on to say that he
believes that Lisp is the language most naturally suited for bottom-up
programming, even though other languages can be written in this style
as well.
I will definitely say that after six years of C++ programming,
I have tended to stay with what the language and library have given me.
C++ tends to keep you focused on small details, rather than think in
terms
of building up the language to suit the problem. Freedom from this
mindset is one of the most revolutionary things Lisp has to offer. With
the poweful alternative Lisp was providing, the real question for me
was whether to leave C++ and embrace Lisp, or to try to bring the idea
of bottom-up programming to my own language.
I wish I could say that Lisp was clearly the right choice for me,
because the powerful constructs of the language and the unified
high-level syntax have intellectual appeal. The difficulty for me is
writing Lisp in practice. Dealing with parentheses placement and prefix
notation simply get in the way of my thought process.
I probably would still struggle through writing Lisp code if it
wasn't for the fact that the STL is so powerful. Lisp's lists are
flexible
and can be used to represent everything from arrays to linked lists to
queues, but all those containers are already built into STL.
I decided
to stick with C++ and try
to make it better. But how? My first thought was to get rid of
iterators.
I decided to rewrite all of the STL's algorithms to accept containers
instead of iterators. But more importantly, I decided to also make the
algorithms (where possible) return copies of containers instead of
iterators
as well. This would enable the functional programming that makes Lisp
code
so succinct.
Secondly, I would attempt to build another layer of utility functions
on top of the STL to perform common tasks such as tokenizing strings,
looking up values in containers, reading from files ... I decided to
take
a page from both Lisp and Python in the types of high-level functions
to
include.
It soon became clear I was writing my own library, and one that
I hoped would be useful outside of a small coding contest. However, my
criteria for success would be whether I could make a C++ program with
my
library as small as Norvig's code ... and just as readable.
The results are located on my software
page. The library I named EasySTL;
my new
revised program is located here.
The new line count is 79 lines. However, to be on par with Norvig's
Lisp code, the #include
and using
statements
for the STL headers and the separate lines for curly braces should not
be
counted (since this is simply a style issue between C++ and Lisp). With
these
provisions on the counting, the revised C++ code comes in at 48 lines.
I was
able to match the functionality
of Lisp in C++ for a small sample problem. Not very scientific, I
suppose,
but it tells me C++ isn't all that bad. Maybe the STL is slightly
low-level,
but with a little effort, it can be brought up to the level of a
language
like Lisp in important areas.
Of course, Lisp has a lot of other things going for it, like automatic
garbage collection (which makes functional programming much more
efficient), macros (like C++ templates on steroids) and lambda
functions. But then C++ has its own strengths: extensive library
support, solid tools, and
of course, its premier status as a programming language for Windows.
However,
this little contest does prove
one thing to me: C++ needs to be simpler out of the box. The recent
popularity of Java and C# have proven this. Fewer and fewer people will
continue
to trade programmer productivity for processor cycles.
The C++
standards committee has begun
meetings to decide the next C++0x standard. It's time to decide: what
kind of language does the community need? One that is simpler and more
high-level, or one that is increasingly complex and verbose?
I think
it's clear that Lisp has lessons
in productivity and succinctness that are worth pursuing.
[1]ANSI
Common Lisp by Paul
Graham, pp. 22-23.
[2] Exceptional C++ by
Herb Sutter: Item 6, "Temporary Objects"
[3] For
more on this topic, see Todd Proebsting's
comments on "Language Design: C vs. Lisp" in his presentationDisruptive
Programming Language Technologies.
[4]On
Lisp by
Paul Graham,
vi.