Aug 22

This post concerns something that has bothered me for years: the tendency of people to write their code using as few keystrokes as possible. While there’s often nothing to gain from this, there’s a lot to lose, the two major casualties being readability and debuggability. While reading this post may be akin to drinking poison to some of you (and you know who you are!), your fellow developers will thank you when they are wading through your code.

Depending on the environment you’re using, run-time debug information may be available to you in several ways. Since I have the most experience with the Eclipse/Carbide IDE, the debuggability part of this post will be primarily applicable to that environment, but it’ll likely apply just as well to most other IDEs. The readability part of course applies no matter what environment you’re using.

Consider the following fragment of code:

int i = 5;

When you’re running this in a debugger, you’re able to see that the value of a variable named ‘i’ is 5. (Compilers often optimize out unused variables even in debug builds, so you may need to add some otherwise superfluous statement, such as logging, that uses the variable before it shows up in the debugger. But I digress.) This is all well and good, but consider the following example:

Base* b = ptrGadget->DoStuff(item).Factor()[index];

Here, some IDEs are able to tell you what kind of an object the DoStuff() and Factor() functions return. You’d still have to hover your mouse over the function names (or step into the code), which is basically extra work for you every single time you forget the return types. Considering the amount of information you have to juggle in your mind while debugging, this soon becomes a real frustration. Additionally, from a readability standpoint, let’s consider how a developer new to the project (or you after a 5-week vacation) sees this:

“We’re apparently trying to get a pointer to a class named Base (might be a virtual base class, mind you) by calling a method named DoStuff() on the ptrGadget pointer. The function takes as parameter a variable called item and returns some object (by value or reference?), on which we then call a function named Factor(). Finally, we take an element from an array(?) returned by the Factor() function and store it to the pointer b of the class named Base.”

The code works, but the problem for the person debugging the code is that a lot of details are hidden beneath the surface. For example, we cannot see just by looking at the code what kind of object the DoStuff() function returns. Neither do we see what kind of an array (or any type for which the subscript operator is defined) the Factor() function returns. Sure, you could step into the code using the debugger and have a look, but doing this every time after you forget the types in question gets old fast. Fortunately, the solution is simple and just as efficient.

For the solution, we just have to look up the types in the code being called once and write them out as local (automatic) variables in our own code. For the sake of this example, let’s say you find out that the DoStuff() function returns a const reference to an object of class Charger (just picking words out of my immediate surroundings here). The Charger class has the Factor() function, which returns a std::vector that contains Base class pointers.

const Charger& charger = ptrGadget->DoStuff(item);
std::vector<Base*> vecBase = charger.Factor();
Base* b = vecBase[index];

Now we have split our “neat” one-line piece of code into three lines. And to what gain? Well, we are now able to clearly see what types we are working with (Charger and std::vector<Base*>). These were previously hidden from view. And although the code is on three lines and there’s more text, it is much more readable because each line now does a single thing instead of three. The debugger is also able to display information about the automatic variables we have created, namely charger and vecBase.
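To make the comparison easy to try out, here is a minimal, compilable sketch of both styles. The class bodies are assumptions made up for illustration: only the names Base, Charger, DoStuff() and Factor() come from the example, while Gadget, PickTerse() and PickVerbose() are hypothetical. The sketch also assumes that Factor() returns a const reference, and it binds the result to a const reference rather than copying the vector, which is one way to keep the three-line version as cheap as the one-liner.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the classes in the example. Only the names
// Base, Charger, DoStuff() and Factor() come from the text.
struct Base {
    int id;
};

class Charger {
public:
    explicit Charger(const std::vector<Base*>& bases) : bases_(bases) {}

    // Assumption: returns a reference, so callers need not copy the vector.
    const std::vector<Base*>& Factor() const { return bases_; }

private:
    std::vector<Base*> bases_;
};

class Gadget {
public:
    explicit Gadget(const Charger& charger) : charger_(charger) {}

    // The item parameter is ignored; it only mirrors the call in the text.
    const Charger& DoStuff(int /*item*/) const { return charger_; }

private:
    Charger charger_;
};

// The terse one-liner: the intermediate types are invisible at the call site.
Base* PickTerse(const Gadget* ptrGadget, int item, std::size_t index) {
    return ptrGadget->DoStuff(item).Factor()[index];
}

// The same lookup with every intermediate result named and typed, so each
// one can be inspected as a local variable in the debugger.
Base* PickVerbose(const Gadget* ptrGadget, int item, std::size_t index) {
    const Charger& charger = ptrGadget->DoStuff(item);
    const std::vector<Base*>& vecBase = charger.Factor();
    Base* b = vecBase[index];
    return b;
}
```

Both functions return the same pointer for the same arguments. The difference is only in what a reader, or a debugger's variables view, gets to see along the way.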

Naturally, you could replace the types used in this example with anything you like, and the point would still stand. The types I used were off the top of my head, and at some point I’ll probably replace them with others that better suit the example.

Jul 02

It’s easy to neglect writing code comments. Let’s face it, we as coders are not going for the Pulitzer Prize, so it’s not about lacking the ability to express oneself; it’s just plain laziness.

To be honest, leaving code without descriptive comments where they’re needed is much worse than that. Firstly, it indirectly shows the programmer’s lack of respect towards his peers. Everyone knows how frustrating it is to wade through dozens of classes and even more functions with cryptic, non-self-descriptive code, especially when the original coder is on vacation, has left the company or is otherwise unavailable. A programmer is an artist, with fellow programmers as the audience. And like all artists, we should learn to respect our audience.

Secondly, leaving out comments shows the programmer’s overconfident attitude towards himself. Sure enough, when writing code, we usually have a good idea of what we’re doing. But programming is often about juggling several things in mind at once, and the next time you look at your code, it may be quite difficult to get back to the mindset you were in when you originally wrote it. In these quite common situations, it pays to have descriptive comments in place.

Naturally, I’m not suggesting that you comment everything. In fact, one should avoid redundant comments when the code is clear by itself, i.e. when the code is self-descriptive. For example, consider the following:

// Declare category id for products
const int prodCategoryId = 1024;

// Create an iterator over products
vector<Product>::iterator iter = products_.begin();

// Iterate through all products
for ( ; iter != products_.end(); ++iter )
    // Assign category id to each product
    iter->AssignCategoryId( prodCategoryId );

This is an obvious example of excessive commenting. So what to leave out? Consider commenting general ideas, not individual lines of code. We can refactor the comments above so that we both reduce comment clutter and make our general reasoning obvious:

// Assign category id to each product
const int prodCategoryId = 1024;
vector<Product>::iterator iter = products_.begin();

for ( ; iter != products_.end(); ++iter )
    iter->AssignCategoryId( prodCategoryId );

The intent of the code is clear from the single comment alone. What’s more, variable and function names are descriptive enough not to leave too much for the imagination.
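The fragment above needs some surrounding code before it will compile. Here is a minimal sketch that supplies it, under stated assumptions: the Product and Catalog classes, the CategoryId() and Products() accessors and the AssignDefaultCategory() function are all hypothetical, and only AssignCategoryId(), products_ and prodCategoryId come from the fragment itself. The loop keeps the single intent-level comment.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical Product type; only AssignCategoryId() appears in the text.
class Product {
public:
    Product() : categoryId_(0) {}

    void AssignCategoryId(int categoryId) { categoryId_ = categoryId; }
    int CategoryId() const { return categoryId_; }

private:
    int categoryId_;
};

// Hypothetical owner of the products_ member used in the fragment.
class Catalog {
public:
    // Creates productCount products, each with category id 0.
    explicit Catalog(std::size_t productCount) : products_(productCount) {}

    void AssignDefaultCategory() {
        // Assign category id to each product
        const int prodCategoryId = 1024;
        std::vector<Product>::iterator iter = products_.begin();

        for ( ; iter != products_.end(); ++iter )
            iter->AssignCategoryId( prodCategoryId );
    }

    const std::vector<Product>& Products() const { return products_; }

private:
    std::vector<Product> products_;
};
```

With the loop wrapped in a descriptively named function, the call site reads almost like the comment itself, which is arguably the self-descriptive end of the same spectrum.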

Also remember that the more comments you have, the more of a maintenance burden it will be to keep them all up to date. So strive to write concise and descriptive comments only when they are needed.

Jul 01

In general, the name of a class should reflect the responsibility of the class in question. Note that I say responsibility in the singular. This is because every class should provide features as a small and concise package, covering only a single reasonably sized area of responsibility. A class should not take on multiple responsibilities because doing so would muddle its intent, make it less modular, hinder reusability and, in the spirit of this blog entry, make the class more difficult to name.

As developers, we often see class names listed in alphabetical order, for example in a file explorer or an IDE. If we are unsure exactly which class we are looking for, the class name usually gives us a good idea of what the class’ responsibility is.

As a real-world example, consider Symbian’s RWriteStream class, which provides a base class for writing to different destinations (for example, a descriptor or a file; a descriptor being basically a string) in stream form. That is, you open a write stream to a descriptor or a file, write to it using one or more WriteL() calls, and close it when you’re done.

There are classes that derive from RWriteStream that are used for writing to a descriptor or a file, amongst other types of destination. Consider what happens when you don’t know exactly which class you are looking for, but you know, or at least suspect, that there’s an RWriteStream-derived class that is able to write to a descriptor. The Symbian documentation doesn’t tell you what classes are derived from RWriteStream, so no luck there. Next you have a look at the table of contents of the documentation, which will show you multiple classes, ordered alphabetically as follows (RWriteStream itself is the last entry):

  • RBufReadStream
  • RBufWriteStream
  • RDesReadStream
  • RDesWriteStream
  • RDictionaryReadStream
  • RDictionaryWriteStream
  • RFileReadStream
  • RFileWriteStream
  • RMemReadStream
  • RMemWriteStream
  • RReadStream
  • RShareBufReadStream
  • RShareBufWriteStream
  • RStoreReadStream
  • RStoreWriteStream
  • RWriteStream

In this simple example, after looking for a moment you can spot the class you probably need, i.e. RDesWriteStream. But consider how much easier it would be to find the class if the parts of each class name were ordered so that the most generic part came first, followed by the more specific parts one after another. For example, RDesWriteStream would become RStreamWriterDes (notice the added ‘r’ for better English), RFileWriteStream would become RStreamWriterFile and so on. These class names would convey that the classes are based on streams, that they write to the streams (instead of reading), and what their destination is (i.e. a descriptor or a file).

Following the from-generic-to-specific naming convention described above (I don’t know of any existing name for it!), the alphabetical list would look like this:

  • RStreamReader
  • RStreamReaderBuf
  • RStreamReaderDes
  • RStreamReaderDictionary
  • RStreamReaderFile
  • RStreamReaderMem
  • RStreamReaderShareBuf
  • RStreamReaderStore
  • RStreamWriter
  • RStreamWriterBuf
  • RStreamWriterDes
  • RStreamWriterDictionary
  • RStreamWriterFile
  • RStreamWriterMem
  • RStreamWriterShareBuf
  • RStreamWriterStore

In the example above, the stream sister classes can be categorized into reader and writer classes, and further by their source or destination. I bet you can spot the RStreamWriterDes class much more easily in this more orderly list.
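The grouping effect can also be checked mechanically. The following is a small sketch, not part of any Symbian API: the helper names OriginalNames(), ReorderedNames(), SortedIndex() and FormsOneRun() are all hypothetical. It sorts each list the way a documentation index would and tests whether all the writer classes end up next to each other.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// The class names as they appear in the Symbian documentation index.
std::vector<std::string> OriginalNames() {
    return {"RBufReadStream", "RBufWriteStream", "RDesReadStream",
            "RDesWriteStream", "RDictionaryReadStream",
            "RDictionaryWriteStream", "RFileReadStream", "RFileWriteStream",
            "RMemReadStream", "RMemWriteStream", "RReadStream",
            "RShareBufReadStream", "RShareBufWriteStream", "RStoreReadStream",
            "RStoreWriteStream", "RWriteStream"};
}

// The same classes renamed from generic to specific, as proposed in the text.
std::vector<std::string> ReorderedNames() {
    return {"RStreamReader", "RStreamReaderBuf", "RStreamReaderDes",
            "RStreamReaderDictionary", "RStreamReaderFile", "RStreamReaderMem",
            "RStreamReaderShareBuf", "RStreamReaderStore", "RStreamWriter",
            "RStreamWriterBuf", "RStreamWriterDes", "RStreamWriterDictionary",
            "RStreamWriterFile", "RStreamWriterMem", "RStreamWriterShareBuf",
            "RStreamWriterStore"};
}

// Returns the names in alphabetical order, as an index would list them.
std::vector<std::string> SortedIndex(std::vector<std::string> names) {
    std::sort(names.begin(), names.end());
    return names;
}

// Returns true if at least one name contains `part` and all names that
// contain it sit in a single unbroken run of the list.
bool FormsOneRun(const std::vector<std::string>& names,
                 const std::string& part) {
    bool seen = false;   // at least one match so far
    bool ended = false;  // a non-match has followed the first run of matches
    for (std::size_t i = 0; i < names.size(); ++i) {
        const bool matches = names[i].find(part) != std::string::npos;
        if (matches && ended) return false;
        if (matches) seen = true;
        else if (seen) ended = true;
    }
    return seen;
}
```

Calling FormsOneRun(SortedIndex(OriginalNames()), "WriteStream") returns false, because the writers are interleaved with the readers, while FormsOneRun(SortedIndex(ReorderedNames()), "StreamWriter") returns true.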

This is but a single example. The fact of the matter is that our job is hard and frustrating enough as it is. There’s no need to name classes that are similar in purpose in a form that just confuses their users. Strive to name your classes so that the correct ones are easily found by your customers.

Finally there is, of course, the question of when to go for the from-generic-to-specific class naming idiom. Essentially, the more classes you have that provide similar functionality and derive from a common base, the more reason you have to go for it. The obvious minimum (in my experience) would be three derived classes, preferably four. Also, keep in mind classes that may be written later on. If you are sure there are going to be, say, three or more “sister” classes to the one you are writing, by all means go for the naming idiom. You may well earn the right to pat yourself on the back later on.
