Book Review – Force.com Enterprise Architecture

As I headed out from Dreamforce, one of my last stops was the developer library where I saw Andrew Fawcett signing his new book, “Force.com Enterprise Architecture”. It took me a while to get around to reading it, and I thought I’d share a few comments since he was kind enough to give me a copy.

As I discuss in my Pluralsight course “Learning Technology in the Information Age”, I feel that books provide a unique value proposition – along with taking a course, they are the best way to gain domain knowledge that is curated and organized in a way that is easy to learn. So that’s how I measure the value of a book beyond the obvious standards of clarity and accuracy – by choice of content and organization.

The first thing you should know is that this is not a book for beginners. This is not the book for an admin to read who wants to learn Apex. It’s also not the book to read if your goal is to obtain one of the innumerable certifications that Salesforce offers. This book is intended for intermediate to expert level Force.com developers.

The title, “Force.com Enterprise Architecture”, is rather generic. It is accurate enough but, as you will see, tends to obscure the real value of the book. This is a good book for any Force.com developer who wants to learn how to architect solutions on the platform. The exact approaches in the book aren’t necessarily applicable or necessary for every solution, but they demonstrate the right way to think about architecture on the platform.

That said, if you are a developer who is thinking about creating a managed package or application to distribute on the AppExchange, this book isn’t just good – it’s indispensable. It is a “drop everything you are doing and buy a copy for every member of your team before you do anything else” kind of book.

There are many books on Salesforce and Force.com, including many books published by Salesforce itself, but what almost all of them have in common is that they are written by in-house developers and consultants. As far as I know there are just two books in existence written by developers who have shipped major managed packages on the AppExchange and this is one of them (mine is the other). Andrew Fawcett is CTO at FinancialForce, and he may know more than anyone in the world on what it takes to ship a Force.com application (myself included) – so if you’re even thinking about doing that, you’d be a fool not to buy this book and study it carefully. It’s full of the kinds of hints, tricks and suggestions that you won’t find anywhere else (including the books published by Salesforce – most of their authors haven’t shipped managed packages either).

And it’s a great complement for Advanced Apex Programming – you’ll find there is little overlap between them.

Summer 14 Describe Patterns

On each Force.com release, developers eagerly look through the release notes for exciting new features. I’ve found that the things that excite me most often aren’t the same things that thrill others. I often get most excited about small changes – sometimes they can have a huge impact on software design patterns.

This summer, the biggest feature for me is the elimination of Describe limits. This eliminates a huge Catch-22 when developing on the platform. On one hand, good Apex code is supposed to respect field level and object security. But the previous Describe statement limit made it difficult and sometimes impossible to do so on larger applications and systems, where the number of fields processed in an execution context could easily exceed the available limits.

The elimination of Describe limits does, however, raise an interesting question. How does this change impact design patterns and could other limits come into play? Or put another way – how costly are Describe calls in terms of CPU time?

Prior to now, the best design pattern for using Describe statements involved caching each Describe call so that you could at least ensure that you don’t call getDescribe on a field more than once in an execution context. The design pattern looked something like the getDescribeInfo function below, where the parameters are the field name and SObjectField token for the desired field:

private static Map<String, Schema.DescribeFieldResult> fieldDescribeCache = new Map<String, Schema.DescribeFieldResult>();

private static Schema.DescribeFieldResult getDescribeInfo(String name, SObjectField token) {
    // Describe each field at most once per execution context
    if(!fieldDescribeCache.containsKey(name)) fieldDescribeCache.put(name, token.getDescribe());
    return fieldDescribeCache.get(name);
}

Does it still make sense to use this kind of pattern? Or should you just call getDescribe() whenever you need describe information?

To find out, I did some benchmarking using the techniques described in chapter 3 of the second edition of Advanced Apex Programming.

I found that the approximate cost of a Describe statement is about 3 microseconds. This looked pretty fast to me. Can the earlier design pattern, with its cost of an additional function call and map lookup, be any faster?

The answer, as it turns out, is no. The overhead of caching and looking up data exceeded any benefits that might have come from avoiding the extra Describe calls.
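A rough sketch of how such a comparison might be set up in anonymous Apex follows. It is not the benchmark harness from the book. The iteration count, the choice of Account.Name, and the variable names are assumptions for illustration. Limits.getCpuTime() reports whole milliseconds, so a large loop is needed before any difference shows up, and the numbers will vary from run to run.

```apex
// Anonymous Apex sketch: inline describes versus a cached lookup.
// The iteration count and the field used are arbitrary choices.
Integer iterations = 20000;
Schema.SObjectField token = Account.Name;
Schema.DescribeFieldResult d;

// Inline: call getDescribe() every time it is needed.
Integer startCpu = Limits.getCpuTime();
for (Integer i = 0; i < iterations; i++) {
    d = token.getDescribe();
}
Integer inlineMs = Limits.getCpuTime() - startCpu;

// Cached: pay for a map lookup instead of a describe call.
Map<String, Schema.DescribeFieldResult> cache =
    new Map<String, Schema.DescribeFieldResult>();
startCpu = Limits.getCpuTime();
for (Integer i = 0; i < iterations; i++) {
    if (!cache.containsKey('Name')) cache.put('Name', token.getDescribe());
    d = cache.get('Name');
}
Integer cachedMs = Limits.getCpuTime() - startCpu;

System.debug('inline: ' + inlineMs + ' ms, cached: ' + cachedMs + ' ms');
```

Both loops include the cost of the loop itself, so only the difference between the two figures is meaningful.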

Further testing showed the same results with SObject describes as field describes.

I don’t know if Describe calls on Summer 14 are fast because work went into optimizing them, or because the platform is now caching describe data internally for you, but it doesn’t really matter. It seems clear that going forward, the optimal design pattern for describe statements is to use them inline as needed.
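As an illustration of what inline use might look like, here is a minimal sketch of field level security checks that call getDescribe() at the point of use, with no cache. The class and method names are hypothetical; getDescribe(), isUpdateable() and isAccessible() are the standard describe calls.

```apex
public class LeadFieldSecurity {
    // True when the running user may update Lead.Company.
    public static Boolean canUpdateCompany() {
        return Lead.Company.getDescribe().isUpdateable();
    }

    // Keeps only the fields the running user is allowed to read.
    public static List<Schema.SObjectField> readableFields(
            List<Schema.SObjectField> candidates) {
        List<Schema.SObjectField> result = new List<Schema.SObjectField>();
        for (Schema.SObjectField f : candidates) {
            // One describe call per field, made inline with no caching layer.
            if (f.getDescribe().isAccessible()) result.add(f);
        }
        return result;
    }
}
```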

New Course Teaches Data Visualization Using the Salesforce Platform

I’m pleased to announce my latest Pluralsight course “Data Visualization for Developers”. This is not a course on Force.com – but in some ways it’s even better. It teaches the principles and practice of data visualization using Force.com as an underlying technology.

The course is published on Pluralsight.com. Free trials are available if you are not already a subscriber.

Advanced Apex 2nd Edition Available!

I’m pleased to announce the immediate availability of the second edition of Advanced Apex Programming for Salesforce.com and Force.com.

A few months ago, when SFDC announced the elimination of script limits, I knew that it had finally happened – a change that really impacted some of the content of the book. That led to some major changes in chapter three. And I figured, as long as I’m working on the book anyway, why not add a few more changes?

Chapter 6 extends the discussion on triggers to clarify some points based on questions I’ve received over the past year.

Chapter 7 has significant new content on batch Apex and scheduled Apex asynchronous patterns.

Chapter 8 is a new chapter on concurrency issues (the later chapters have been renumbered).

Plus, there are numerous other smaller changes and additions scattered throughout the book.

All told, the book has grown by about 50 pages.

It also has a snazzy new cover – making it easy to determine going forward which edition you’re looking at.

Also, unlike last year, I’m pleased to announce that the Kindle and Nook editions are also available for those of you who prefer the eBook format.

The book is available now on several Amazon.com country sites, and I’ll be linking the others as they go live. The links on the left will take you to the new edition – it will take a few weeks before all of the channel databases are updated.

Dreamforce 2013 Sessions

I’ve been so busy for the past month that I haven’t had much time to post, but I’m pleased to say that I’ll be presenting three sessions at Dreamforce this year.

Monday at 11:15am, Moscone West – 2009, High Reliability DML and Concurrency Design Patterns for Apex

It’s remarkable when you think about it, that even though Force.com is a highly scalable multi-user and multithreaded system, there is hardly any documentation on how to deal with Apex concurrency issues. I’m looking forward to shining some more light on this topic and sharing some of my own adventures (and misadventures).

Monday at 1:30pm, Hilton San Francisco Union Square – Community Success Zone Theater, Apex Design Patterns for Managed Packages

This one is for the ISVs in the community, particularly the developers. Those of us who create managed packages are a growing minority – it’s nice to see us getting some more attention this year!

Tuesday at 5:15pm, Moscone West – 2024, Design Patterns for Asynchronous Apex

At first I was thinking – 5:15pm before the gala? Talk about bad timing. But then again, talking about timing (good and bad) is a large part of asynchronous apex, and if the late hour gives you a syncing feeling, so much the better 🙂

I hope to see many of you there. Also, be sure to attend the developer keynote on Wednesday 10:30 at Moscone South – Gateway.

And, I encourage you to visit this site sometime this weekend for another post that you may find of interest.


Goodbye Script Limits, Hello what?

Perhaps the most surprising change for Winter ’14 is the elimination of script limits, to be replaced with a single CPU time limit for each transaction.

This is an extraordinary change, and it’s worth taking a few minutes to explore the consequences, both long term and short term, of this decision. Keep in mind that what follows are my preliminary thoughts – I’m still somewhat in a state of shock 🙂

In the immediate future, I don’t expect this change to have any impact. I believe SFDC when they say they’ve analyzed the situation and that no current code will exceed the CPU time limits.

To understand the long term impacts, let’s consider the real meaning of this change.

  • First, managed packages will no longer have their own set of script limits, or their own CPU time – CPU time will be shared among all managed packages interacting with a transaction and code native to the organization.
  • Second, my understanding is that time spent within Salesforce code counts as CPU time. Up until now, script limits only impacted your code – a long running built-in operation such as a sort or RegEx would only count as a single script line.
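To get a feel for the second point, a built-in operation can be timed on its own. The anonymous Apex below is a minimal sketch, assuming that time spent inside the platform's sort is charged to the transaction as described above. The list size and the way the test data is generated are arbitrary choices.

```apex
// Build a list of pseudo-shuffled integers (values stay within Integer range).
List<Integer> values = new List<Integer>();
for (Integer i = 0; i < 50000; i++) {
    values.add(Math.mod(i * 7919, 10007));
}

// Under script limits this sort was a single script line.
// Under a CPU time limit, whatever it costs is charged to the transaction.
Integer before = Limits.getCpuTime();
values.sort();
Integer sortMs = Limits.getCpuTime() - before;

System.debug('sort() used ' + sortMs + ' ms of the '
    + Limits.getLimitCpuTime() + ' ms allowed in this transaction');
```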

This will obviously have an immediate impact on how one might code for efficiency. Your code can be more verbose – there will be less need to build complex conditional statements that are barely readable in order to cram everything into one line of code. Not having to trade-off readability for efficiency will be very nice.

For the first time Apex developers will need to care about the efficiency of the built-in Apex class code. This will be a whole new topic for discussion, as the community gradually discovers which classes and methods are good, and which should be avoided, and when.

The real question comes down to what happens going forward – say, six to twelve months from now. Without the script limits, the pressure to optimize code will be reduced, and I’m sure we’ll see code appear on orgs that would never have survived in the current system.

As an ISV partner, this brings up an interesting question. What happens when some of that bad code, either on an org or in another package, uses up most of the CPU time, and when it becomes time for my package to run, limits are exceeded? Running a debug log with profile information should presumably allow identification of the greedy piece of code, but how many sys admins will take the time or trouble to actually figure this out? It’s so much easier to blame a package – possibly the one unfortunate enough to have tipped the CPU limit.  As this occurs more and more often, one can envision a case where customers gradually lose trust in applications in general, never knowing if one can be safely run. Ultimately this could impact trust in the platform overall.
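One defensive measure a package could take is to check how much of the shared CPU budget is left before it starts expensive work. The helper below is only a sketch of that idea: the class name, method names and the threshold in the usage comment are hypothetical, and it does nothing to identify which code consumed the time.

```apex
public class CpuBudget {
    // Milliseconds of CPU time still available to this transaction.
    public static Integer remainingMs() {
        return Limits.getLimitCpuTime() - Limits.getCpuTime();
    }

    // True when at least requiredMs of CPU time is still available.
    public static Boolean hasHeadroom(Integer requiredMs) {
        return remainingMs() >= requiredMs;
    }
}

// Possible use inside package code (the threshold is a guess, not a recommendation):
// if (!CpuBudget.hasHeadroom(500)) {
//     // Log the shortfall and defer the work to an asynchronous context.
// }
```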

Arguments that the proposed CPU time limits are generous (and they are) don’t (so far) address the well known fact that software inevitably expands to use available CPU time (often because it’s expensive to optimize code and therefore often not done unless it’s necessary).

There seem to me three possibilities going forward.

  1. There is a real commitment within SFDC to build infrastructure to support inefficient code, so the performance will increase faster than the spread of inefficient code. (And don’t try to convince me that people won’t write inefficient code.)
  2. The amount of headroom in the current CPU limits really is so great that it pretty much takes an infinite loop to exceed it. (I’m sure I won’t be the only one experimenting with this in days and weeks to come).
  3. The engineers who made this choice are deluding themselves that all Apex developers will continue to write efficient code even when they don’t have to.

As an ISV partner who ships a very large application, I confess that the relaxed script limits are definitely going to make life easier. At the same time, I really hope that when CPU time limits are exceeded, they don’t just post an error blaming the application that tripped the limit, but rather more detailed information that explains to users where the CPU time went – so that it is easy for clients and vendors alike to quickly focus on the code or package that deserves the blame.

A Most Interesting Apex Trigger Framework

In my book Advanced Apex Programming, I spend quite a bit of time discussing trigger design patterns. But I’m going to let you in on a little secret – what you find in the book isn’t really a “design pattern”, so much as a design concept.

And despite the chapter name “One trigger to rule them all”, I didn’t originate the idea that it was a good idea to control execution sequence by using just one trigger – experienced Apex developers already knew this. What I think I brought to the table was the idea that we could take advantage of the Apex language object oriented features to implement that concept in some really good, supportable and reliable ways.

Here’s a secret – the examples I used in the book do not, in fact, accurately reflect the framework I used in our own products. The framework we use is considerably more sophisticated. But the examples do reflect the concepts that our framework uses.

I did this because I do not believe there is any one “right” trigger design pattern or framework for everyone and every situation. So my goal in the book was to demonstrate the concepts involved, in the hope that others would build on it – come up with variations of different design patterns and frameworks based on those concepts.

I was thrilled to see the other day a blog post by Hari Krishnan called “An architecture framework to handle triggers in the Force.com platform”. It’s a beautiful piece of work (and I do appreciate the shout out). As with our own framework, I don’t think it’s a solution for every scenario, but it does present a very elegant object oriented implementation to the problem. What really struck me was the innovative use of dynamic typing to instantiate objects based on the object type and name. Our own framework doesn’t use that approach, for the obvious reason that it was built before Apex supported dynamic object creation by type, but it’s definitely worth considering for any design going forward.
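To make the idea of dispatching by name concrete, here is a minimal sketch. It is neither Hari’s framework nor ours. The TriggerDispatcher class, its Handler interface and the "<ObjectName>TriggerHandler" naming convention are all invented for illustration; Type.forName and newInstance are the standard Apex calls for creating an object from a class name.

```apex
public class TriggerDispatcher {
    // Hypothetical contract that each per-object handler class implements.
    public interface Handler {
        void handle(List<SObject> newRecords, List<SObject> oldRecords);
    }

    public static void dispatch(Schema.SObjectType objType,
                                List<SObject> newRecords,
                                List<SObject> oldRecords) {
        // Assumed convention: Account -> AccountTriggerHandler.
        // Custom objects (names ending in __c) would need a different mapping.
        String handlerName = objType.getDescribe().getName() + 'TriggerHandler';

        // Type.forName returns null when no class with that name exists.
        Type handlerType = Type.forName(handlerName);
        if (handlerType == null) return;

        Handler h = (Handler) handlerType.newInstance();
        h.handle(newRecords, oldRecords);
    }
}

// The single trigger on each object then only delegates, for example:
// trigger AccountTrigger on Account (before insert, before update) {
//     TriggerDispatcher.dispatch(Account.SObjectType, Trigger.new, Trigger.old);
// }
```

Because the handler is found by name at run time, adding support for a new object means adding a class rather than editing the dispatcher.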

I don’t know if Hari has worked on the .NET platform (he does mention Java and C#), but the idea of dispatching by name is one we’ve seen in a number of Microsoft frameworks and languages. One can’t help but wonder if, now that we have a real tooling API, someone might come up with a client tool to generate and manage trigger handlers based on a framework like this….

Not only might this automate some of the “plumbing”, but conceivably bring us to that state of Nirvana where, with judicious use of some global interfaces, we might be able to control order of trigger execution across cooperating packages and between packages and Apex code on an organization instance.

Ah well, one can dream. Meanwhile, kudos to Hari for a fine piece of work. Definitely worth a read.

Code Coverage and Functional Testing for Optional Salesforce Features

A couple of days ago Matt Lacey posted an excellent article on developing for optional Salesforce features. He ended it with a question – how do you ensure code coverage for those orgs that have those features disabled?

For example – let’s say you have code that only runs when multi-currency is enabled on an org:

if(Schema.SObjectType.Opportunity.fields.GetMap().Get('CurrencyIsoCode') != null) {
    // Do this on multi-currency orgs
    obj.Put('CurrencyIsoCode', o.Get('CurrencyIsoCode'));
}

How do you get code coverage for this section?

One way to do this is as follows:

First, we refactor out the currency test into its own function as follows:

private static Boolean m_IsMultiCurrency = null;

public static Boolean IsMultiCurrencyOrg() {
    // Reuse the answer once it has been computed in this execution context
    if(m_IsMultiCurrency!=null) return m_IsMultiCurrency;
    m_IsMultiCurrency = Schema.SObjectType.Opportunity.fields.GetMap().Get('CurrencyIsoCode') != null;
    return m_IsMultiCurrency;
}

Though not necessary for this example, in any real application where you have lots of tests for whether it’s a multi-currency org, you may be calling this test fairly often, and each call to Schema.SObjectType.Opportunity.fields.GetMap().Get('CurrencyIsoCode') counts against your limit of 100 Describe calls. This function (which is written to minimize script lines even if called frequently) is a good tradeoff of script lines to reduce Describe calls for most applications.

Next, add a static variable to your application’s class called TestMode

public static Boolean TestMode = false;

Now the code block that runs on multicurrency orgs can look like this:

if(TestMode || IsMultiCurrencyOrg()) {
   // Do this on multi-currency orgs, or on any org when TestMode is set
   String ISOField = (TestMode && !IsMultiCurrencyOrg())?
                     'FakeIsoCode' : 'CurrencyIsoCode';
   obj.Put(ISOField, o.Get(ISOField));
}

What we’ve effectively done here is allow that block of code to also run when a special TestMode static variable is set. And instead of using the CurrencyIsoCode field which would fail on non-multicurrency orgs, we substitute in any dummy Boolean field. This can be another field on the object that you define, or you can just reuse some existing field that isn’t important for the test. There may be other changes you need to avoid errors in the code, but liberal use of the TestMode variable can help you maximize the code that runs during the test.

Why use a TestMode variable instead of Test.IsRunningTest()? Because the goal here is to get at least one pass through the code, probably in one specialized unit test. You probably won’t want this code to run in every unit test.

With this approach you can achieve both code coverage and, with clever choice of fields and field initialization, functional test results, even on orgs where a feature is disabled.
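Put together, the pieces might look like the sketch below. The class names, the copyCurrency method and the use of the standard Description field as the stand-in are assumptions made for illustration, and the two classes would live in separate files.

```apex
// File 1: CurrencySupport.cls (class and method names are hypothetical)
public class CurrencySupport {
    public static Boolean TestMode = false;
    private static Boolean m_IsMultiCurrency = null;

    public static Boolean IsMultiCurrencyOrg() {
        if (m_IsMultiCurrency != null) return m_IsMultiCurrency;
        m_IsMultiCurrency =
            Schema.SObjectType.Opportunity.fields.getMap().get('CurrencyIsoCode') != null;
        return m_IsMultiCurrency;
    }

    public static void copyCurrency(SObject source, SObject target) {
        if (TestMode || IsMultiCurrencyOrg()) {
            // When the feature is disabled, fall back to a stand-in field.
            String isoField = (TestMode && !IsMultiCurrencyOrg())
                ? 'Description' : 'CurrencyIsoCode';
            target.put(isoField, source.get(isoField));
        }
    }
}

// File 2: CurrencySupportTest.cls
@isTest
private class CurrencySupportTest {
    @isTest
    static void copiesStandInFieldWhenFeatureDisabled() {
        CurrencySupport.TestMode = true;   // set only in this specialized test
        Opportunity source = new Opportunity(Description = 'USD');
        Opportunity target = new Opportunity();

        CurrencySupport.copyCurrency(source, target);

        // The stand-in field is only used when multi-currency is disabled.
        if (!CurrencySupport.IsMultiCurrencyOrg()) {
            System.assertEquals('USD', target.Description);
        }
    }
}
```

The test covers the block on any org, and on orgs without multi-currency it also verifies that the value was copied.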


Death of a Platform Bug

In my previous post, I walked through the process of discovering, diagnosing and reporting a legitimate platform bug. As I mentioned previously, on any platform as large and complex as Force.com, bugs are inevitable. Every OS has them. Every framework has them.

One of the biggest considerations when evaluating a platform bug is when it appears. For example: if a bug appears on a new API version, and the platform is versioned – you can avoid the bug by either working around it, or by staying with the old API version until it is fixed.

If a bug is just there – and has been there for a while, you can either come up with a workaround, or just not use that particular feature – because the bug has always existed, there’s little or no risk the bug will impact code that you ship.

But, if a bug appears on the platform and breaks existing code – that’s a big problem. That’s why Salesforce puts in such a huge effort to test new releases, running every unit test (including customer unit tests and package tests) on the new version to detect any possible breaking change. Unfortunately, the DataDotComEntitySetting bug was this type of bug.

As it turns out, the problem related to a security setting on that particular object – one that I presume is used by Data.com Clean when enabled. It’s also not a common problem – it impacted our application and that of one other ISV (who started seeing sudden errors appearing with customers who enabled Data.com clean).

The good news is, that once we were able to reach the right people at Data.com to convey the impact that the problem was causing, they were phenomenal. They provided us with access to a sandbox with data.com to verify both the error and confirm the fix, they kept us updated as to the progress, and today – confirmed that a patch has been pushed out to production.

So – it’s a happy ending.

But, happy ending notwithstanding, it did point out one area that I hope Salesforce will work to improve. You see, the application I’m working on is large and complex – and makes use of many platform features. So I’ve probably run into (and helped discover) more than my share of platform issues. Over the past few years I’ve noticed a dramatic improvement in the ability of the Salesforce frontline support to confirm, prioritize and address platform bugs. I’ve noticed a marked improvement in the Known Issues site – and the quick identification of workarounds where possible (and remember, for a developer, a workaround is usually almost as good as a fix). I’ve seen rapid and accurate responses on StackExchange.

I don’t know how Salesforce is organized internally, but from where I sit, the Data.com support group hasn’t quite gotten the message yet. Yes, they were great at confirming that a platform bug existed, but after that – things got… difficult. I won’t go into details, but it took some pretty extraordinary efforts on our part to finally reach the right people where we were able to have a good discussion and get real feedback that we could work with and convey to our customers. Anyway, I’m confident that they’ve learned as much from the experience as we have, and I am thrilled to see this particular platform bug dead and buried.


Anatomy of a Platform Bug

Update 5/20/13 – See “Death of a Platform Bug”

Platforms and frameworks have bugs.

Nobody really likes to discuss it – especially platform and framework vendors. But it’s like Murphy’s law of computer programming: Every non-trivial program has at least one bug. In fact, one of the signs that you have become an “expert” on a platform or framework is that a high percentage of the problems that you run into and can’t solve are, in fact, platform bugs rather than your own code.

I’ve found bugs in Windows, MFC, ATL and the .NET Framework. Nowadays I find them in Force.com. The experience is pretty similar on all of the platforms. First you have to be very sure that it’s really not your bug – this can be harder than you might think. There’s a lot of detective work involved – unlike your own code, you can’t necessarily know what is going on with the platform – I once found a VB bug where I actually had to disassemble a part of the VB control interface code in order to demonstrate to the developers where their mistake was. Which brings us to one of the biggest challenges – getting past the first-line support team to someone who can actually solve the problem (or convince them that you really know what you’re talking about and that they should forward the information).

I thought it might be interesting to walk through what the process is like with an example that I am currently dealing with. This is a story in-progress – I will add more information as it becomes available.

It began with our latest release – where on some systems we started seeing many of our unit tests fail with the following error:

FATAL_ERROR|System.DmlException: Insert failed.
First exception on row 0; first error:
sObject type 'DataDotComEntitySetting' is not supported.: []

This was perplexing. After all, we don’t access an object called DataDotComEntitySetting. In fact, we don’t reference anything related to Data.com.

As a software vendor, you really don’t want to see most of your unit tests start failing. So this became a top priority issue.

Our first concern was whether we could install the software at all. The answer is, of course – yes. If you’ve read “Advanced Apex Programming”, you’ve seen unit test design patterns that allow you to dynamically enable or disable individual unit tests before or after deployment – so we’re not dead in the water. However, not being able to run unit tests means we can’t validate the operation of the application on those systems – which is definitely not good.

Because we could disable tests for installation and then reenable them after the software was installed, we were able to eliminate one theory – that the problem was purely related to software installation – perhaps some security issue related to the user context used during unit tests on installation.

Another early step was, of course, to search for other instances of this problem. Unfortunately, this was one of those cases where we clearly were innovators. There was only one reference to a similar problem, and our scenario did not match the one described.

This left us with a number of questions.

Was this really related to Data.com?

Yes, the error message referenced an object called ‘DataDotComEntitySetting’, but I’ve seen cases where an error message has nothing even remotely to do with the real source of the error. This is especially true in a complex framework, where internal error handling attempts to internally recover from a problem and only after a cascade of errors do you finally see an unrecoverable error – that has nothing to do with the original problem. In this case, there are a number of factors that suggested it really related to Data.com aside from the object name. First, both systems on which we saw the problem did have Data.com enabled – too small a sample for a firm conclusion, but an indicator nonetheless. Second, the StackExchange issue was seemingly related to a Jigsaw package that later seems to have been integrated into Data.com. Later in this article you’ll see how we obtained further proof.

What changed?

Our new software release had dozens of unit test errors – most of them on code that had not changed from the previous version (as a reasonably agile organization, we have frequent releases). But there was one change that impacted the entire codebase – we upgraded from API 25 to API 27, mostly in order to take advantage of the new string library and some other new Apex features. When code breaks from one API version to another, that can be an indicator of a platform bug as compared to a bug in your own code.

Looking for a Workaround

At this point we had already submitted an initial case. But when dealing with potential platform bugs, you can’t just sit around and wait for support. You need information – the more the better. Fortunately, we have some great customers who are ok with us using the license management system to log in to their sandboxes – when you do so, you can see detailed debug logs for your managed packages. The push upgrades system also provides better information than a regular package install. This allowed us to see where the failure was occurring.

The code, in a nutshell, was like this.

// Code that creates some test lead
// objects but doesn't insert them
List<Lead> newleads = initTestLeads();
// Insert them through the shared helper
InsertTestObjects(newleads);

The InsertTestObjects function is a public method that we use to insert test objects and perform some additional tasks. In this case, it sets a static variable so that our trigger framework will know to ignore these test objects.

public static void InsertTestObjects(List<SObject> objs) {
   // Tell the trigger framework to ignore these test objects
   DisableExternalUpdates = true;
   insert objs;
   DisableExternalUpdates = false;
}
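For readers who haven’t seen this kind of static flag before, the sketch below shows how a trigger handler might consult it. This is not our actual framework. The TestSupport class and the handleLeads method are hypothetical, and the try/finally is an addition that makes sure the flag is reset even when the insert throws.

```apex
public class TestSupport {
    // Static variables last for a single execution context, so the flag
    // cannot leak into other transactions.
    public static Boolean DisableExternalUpdates = false;

    public static void InsertTestObjects(List<SObject> objs) {
        DisableExternalUpdates = true;
        try {
            insert objs;
        } finally {
            // Reset the flag even if the insert fails.
            DisableExternalUpdates = false;
        }
    }

    // Called from a trigger; returns true when the records were processed.
    public static Boolean handleLeads(List<Lead> leads) {
        if (DisableExternalUpdates) return false;   // ignore test objects
        // ... normal trigger processing would go here ...
        return true;
    }
}
```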

The error was occurring during the insert. We saw it occur on Leads, Contacts and Accounts – a fact that again pointed towards Data.com as the culprit, as it uses those objects.

One thing we found in the debug logs was that when the problem occurred, no object triggers were being called (at least in our application, or in user code). This provided additional evidence that the problem was not in our code or other user code, though it theoretically could have been in a different managed package.

This code is extremely simple. So we looked for ways to reproduce the problem.

  • We built some unit test classes in the sandbox that contained similar code. They worked perfectly.
  • We created another test package that contained similar code and tried to install it. It worked perfectly.

Things are so much easier when you can reproduce a problem. When you can’t….

What this did tell us however, is that whatever it took to cause this problem, it was not obvious. We had some test functions that failed, and others with almost identical code that succeeded. The problem was not intermittent – tests that failed did so consistently, those that passed also did so consistently. But there was no clear pattern.

So our next step was to create some patch versions of the application and see if we could change things to get the test to pass.

And we found something. If instead of calling the InsertTestObjects function we called a new strongly typed InsertTestLeads function, most of the tests passed.

public static void InsertTestLeads(List<Lead> objs) {
   // Same as InsertTestObjects, but typed as List<Lead>
   DisableExternalUpdates = true;
   insert objs;
   DisableExternalUpdates = false;
}

This would suggest that it was perhaps a language issue, except for one problem: there were other places in the code where a direct strongly typed insertion would fail. For example:

Account act = new Account(…..);
insert act;

This would fail with the same error. Not everywhere, just in some test functions.
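When chasing a failure like this, it helps to capture everything the exception offers rather than just the first message. The helper below is a sketch with hypothetical names. The DmlException methods it calls (getNumDml, getDmlIndex, getDmlType and getDmlMessage) are standard.

```apex
public class DmlDiagnostics {
    // Builds one line per failed row: the row index, status code and message.
    public static List<String> describeFailure(DmlException e) {
        List<String> lines = new List<String>();
        for (Integer i = 0; i < e.getNumDml(); i++) {
            lines.add('row ' + e.getDmlIndex(i) + ': '
                + e.getDmlType(i) + ' - ' + e.getDmlMessage(i));
        }
        return lines;
    }
}

// Possible use in a failing test:
// try {
//     insert act;
// } catch (DmlException e) {
//     for (String line : DmlDiagnostics.describeFailure(e)) System.debug(line);
//     throw e;   // still let the test fail
// }
```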

Presenting the Case

We were very fortunate to be assigned a really good support person, but we’d also done our homework. While the original case was filed as an “application won’t install” problem, by the time we were on a GotoMeeting with support we could demonstrate failing tests, had log files showing the problem, and could demonstrate code changes that could in some cases resolve the problem. In short, we had overwhelming evidence that we were dealing with a platform bug.

The support person, who was familiar with data.com, then walked us through some experiments. One of them involved turning off the “Clean” feature in data.com. That did it – the tests stopped failing.

So now we were in as ideal a situation as one could ask for under the circumstances. Salesforce support agreed that it was a platform bug, and we knew for sure that it related to data.com.

You may think I’m glad it’s a platform issue, and while in some sense there is relief that it’s not our code, the truth is that it would be much better if it were our code – we can fix our code. Now we have to hope that Salesforce will commit the resources to resolve the issue, and be able to figure it out – the inconsistent nature of the problem suggests that it may be hard to track down.

This is the “dark side” of modern software development – where we build applications based on packages, platforms, frameworks and services, many of which are outside of our control. It’s certainly not unique to Force.com. The best thing you can do is to be proactive – work with the platform and framework vendors to resolve issues, but be prepared to work with them on solving the issues, and where possible, develop workarounds.

I’ll add updates to this post as new information becomes available.

Meanwhile, if you have any insight to share, feel free to leave a comment (note that comments are moderated to limit spam, so you won’t see them immediately).