Monday, July 16, 2012

Dissection of a sub-optimal function

Untitled Page

I came across the following function the other day and it cried out to me, “please refactor me, I’m so lonely.”  Alas the code isn’t mine to fix however a dissection of the cadaver should be interesting.

Here’s the code unmodified:

CString ConvertToBase38(CString csDec, int iLen)
{
    csDec.TrimLeft(); csDec.TrimRight();
    const int iMaxDigitsIn = 16;
       const int iMaxDigitsOut = 11;

       CString csResult=_T("");
    int i;
    int iLoops;
    char caBase38Chars[38] = {'0','1','2','3','4','5','6','7','8','9',
       'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O',
       'P','Q','R','S','T','U','V','W','X','Y','Z','-','.'};

    iLoops = csDec.GetLength();
    if(iMaxDigitsIn < iLoops)
    {
              csResult=_T("");
    } else
       {
              __int64 iValue = atoi64(csDec);

              BOOL bConvStarted=FALSE;
              for(i=iMaxDigitsOut-1; i>=0; i--)
              {
                     __int64 iPlaces = 1;
                     for(int j=0;j<i;j++)
                           iPlaces*=((__int64)(38));
                     if(iValue>=iPlaces)
                     {
                           bConvStarted=TRUE;
                           int iChar= (int)(iValue / iPlaces);
                           if(iChar>=0&&iChar<38)
                                  csResult += caBase38Chars[iChar];
                           iValue -= iChar * iPlaces;
                     } else if(bConvStarted)
                     {
                           csResult+=caBase38Chars[0];
                     }
              }
       }

       if(iLen!=0)
       {
              if(csResult.GetLength()>iLen)
                     csResult=_T("");
              while(csResult.GetLength()<iLen)
                     csResult=_T("0")+csResult;
       }
    return csResult;
}

If you're new to C++ (or lucky enough to work exclusively in U/Linux) you might not notice that the code above uses MFC CStrings which limits its portability, but I'll ignore portability since that was exactly why I encountered the code. 
  1. On the first line CString ConvertToBase38(CString csDec, int iLen) we see that a CString object has been passed to the function by value.  This will require a call to the CString copy constructor. Even though performance suffers due to the call to the constructor, MFC CStrings will do a lazy copy and the CString isn’t actually duplicated unless it is modified.  Unfortunately that modification will happen later on.  To avoid copying the CString with its call to the copy constructor we should use a constant reference. CString ConvertToBase38(CString const &csDec, int iLen)   Using a constant reference gives us the benefits of passing a value with a pointer or reference (fast without duplication) but keeps us from inadvertently modifying the variable.
  2. Next, we have csDec.TrimLeft(); csDec.TrimRight(); This modifies the CString forcing the string to be duplicated before it can be modified, thereby losing the benefit of the lazy copy.  Additionally, newer versions of MFC have a CString::Trim function that will remove leading and trailing white space.  But wait, why remove the whitespace at all? The subsequent call to atoi ignores leading whitespace we don’t need to modify the string, which allows us to use the more efficient constant reference.
  3. From there on to char caBase38Chars[38] = {'0','1','2'... Unless the compiler is really smart this array will be created on the stack and initialized every time the function is called, which is unnecessary since the array of base characters never changes.
  4. Before I forget the “iLen” parameter isn’t the length of the incoming string it’s the padding length, the name was confusing, but it became clear during the dissection, it is used in the code to ensure the result is left padded with zeros.
  5. Next the code checks for an input that is too long, this is an extra check that could be skipped, atoi will return 0 if there is a problem, once again like the whitespace, letting atoi handle the unexpected seems a preferable.
  6. After this the author goes to great pains to calculate the base conversion from left to right. Including calculating the value of 38 raised to power “i” with a for loop, the math function “pow” would be a faster since it would use the floating point processor. And faster still, since the values for 38 raised to x don’t change, the values could be stored / retrieved from a constant array. But…I think there is an even better way.

The last time I did an arbitrary base conversion function was in school, it was an assembly language class so I had the benefit of a stack and the full return from a divide operation (result and modulo).  I remember calculating from right to left using the stack to reorder the digits. Needless to say the assembly version was really fast and efficient. 
I think the same can be accomplished with some well written C++/C code.  Below is what I came up with.

#define MAX38DIGITS 11
const char BASE38CHARS[38] = {'0','1','2','3','4','5','6','7','8','9',
       'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O',
       'P','Q','R','S','T','U','V','W','X','Y','Z','-','.'};


char* ConvertBase10to38(const char *inVal, char *outVal, int maxLen, int padLen)
{
       long long value = atoi64(inVal);
       char *ptr = outVal;

       // build the base 38 number (reversed)
       while (value != 0 && (ptr - outVal) < maxLen)
       {
              int mod = value % 38;
              *ptr = BASE38CHARS[mod];
              value /= 38;
              ptr++;
       }
       // pad the string if needed
       while (ptr-outVal < padLen)
       {
              *ptr = BASE38CHARS[0]; ptr++;
       }

       // null terminate
       *ptr = 0;

       // reverse the string
       reverse(outVal, ptr);

       return outVal;
}

The profiler reports that my code as 5.7x faster with the slowest part being the “std::reverse” which improved to 6.7x faster with a custom reverse function.  Which is cool.

No comments:

chris' shared items

Twitter Updates

Official blog of Chris Lee Runyan

Fastest C++ in the west.