Some bugs are hard.
Last week I had one of the hardest. It only happened occasionally, after a row lock error, in very specific scenarios, on a customer production org. It was, of course, impossible to reproduce. And given that it only occurred now and then for random users, capturing a debug log was out of the question.
So what do you do? You go old-school. Search the code for any execution path that could possibly lead to the results we saw in the data. And after many hours of research, I found nothing. There was no scenario that could lead to the results we were seeing. And there were no workflows, processes or flows that could do it either. We started wondering if maybe some outside integration was involved, but that seemed unlikely.
Well, there’s that old saying “When you’ve eliminated the impossible, whatever remains, however improbable, must be the truth”. There was one “impossible” code path that could theoretically lead to what I was seeing, but it could only happen in one case – if you could somehow read a field from an SObject that was not included in a query, having it return null instead of throwing an exception.
You’ve all seen this exception. Imagine a custom object Soql_Query_Test__c that has two fields, Test_Field_1__c and Test_Field_2__c and you execute the following code
Soql_Query_Test__c sqt = [Select ID, Test_Field_1__c from Soql_Query_Test__c Limit 1]; String s = sqt.Test_Field_2__c;
The result is the notorious SObjectException “SObject row was retrieved via SOQL without querying the requested field: Soql_Query_Test__c.Test_field_2__c”
I’ve seen that exception many times. It’s invaluable during development and testing when it comes to making sure that all of the fields we use are in a query. But the only way our bug was possible was if I could read an unqueried field without raising that exception.
I tried everything I could think of – converting the object into a generic SObject, passing it to functions and accessing the field there. The exception always appeared. Was I on the wrong track? Was this actually happening? What could our code be doing?
Fortunately, we have unit tests – good unit tests. We even have unit tests that simulate row locking exceptions, so I was able to run that code path, though not for the exact scenario that would reproduce the bug. Still, I could set some fields in a source record, add some debug statements and see exactly what happened.
And sure enough, the improbable was true. I had a record. It had a field that had a value in the database but was not included in the query. I confirmed it was not included in the query using the wonderful SOBject.GetPopulatedFieldsAsMap function. But when my code accessed the field, the value was null. No exception. Null. I was floored.
I started trying other things in the org where I was experimenting – different field types, dynamic vs. static DML, dynamic vs. static queries, and finally had a breakthrough. I set the other field to a random value, and the exception vanished.
Soql_Query_Test__c sqt = [Select ID, Test_Field_1__c from Soql_Query_Test__c Limit 1]; sqt.Test_Field_1__c = 'Changed value'; String s = sqt.Test_Field_2__c;
This results in no exception, and the string is set to null.
If you set any field in a record, reading any unqueried field on the record will return null instead of raising an exception.
I had my answer, and was able to implement a solution so we could patch the bug. But I’ll be honest, at this point my biggest question was – how could I not have known about this?
Is it a Bug, or a Feature?
The next day I reached out to Don Robins who is an expert trainer. He knew about this, and his view, and that of another trainer he spoke to was that this was a known and expected behavior. The reasoning: that once you set any field in a retrieved record, further missing field SObjectExceptions are disabled under the assumption that you (the developer) know what you are doing at this point.
Robert Watson, a co-worker and expert Apex developer hadn’t seen this, but found the following StackExchange post: https://salesforce.stackexchange.com/questions/160112/sobjectexception-no-more-intentional-change/163429#163429
This post suggests that it was a bug that was introduced late 2017. But I knew our code dated to mid 2016. Fortunately, it’s possible to set the API version for an Apex class, so I set the class I was experimenting with to API 24 – which is about 6 years ago – and saw the same behavior. This leads me to conclude that either this behavior has always existed, possibly by design, or that it was an unversioned change.
You may wonder how could an unversioned change this significant occur and not be detected? What about the infamous Hammer test?
Well, think about what would have actually happened when this change was introduced. Existing code would continue to work. The lack of an exception would only break a test that was checking to verify that a missing field exception occurs – and what would be the point of such a test? In truth – this is not going to be a breaking change, and while it might have been caught by an internal Salesforce validation test, it’s highly unlikely any customer orgs, functionality or unit tests would be impacted.
A friend of mine at Salesforce brought the following known issue to my attention: https://success.salesforce.com/issues_view?id=a1p30000000jfXtAAI suggesting that it is a bug. And yes, I did miss this when searching for existing issues before doing my own research. Oops – lesson learned (again).
So this brings us to the big question: is this a feature? Is it an unversioned change? Or is it a bug? And ultimately, should this behavior be changed?
It’s not an easy question to answer.
Does it make sense to ignore unqueried fields once you’ve set any field value? I can see the logic in that argument, but let’s rephrase it.
When updating a record, do you ever read fields on that record? Of course you do. And is there any scenario where, on reading an existing field, you would intentionally leave it out of the query string in order to return null instead of the existing value? Probably not.
Yes, you can make the argument that the developers should know what they’re doing and make sure to query all fields, but we developers make mistakes. And the earlier we find a mistake, the better. Which scenario is more likely to help discover a missing query term earlier – an exception, or returning null? Obviously, the exception. The only way you’d detect the incorrect null field value is if you looked for it, or saw the consequences later in the data – as I did. So while it makes sense to me to allow writing fields that were not queried, I think it would be better for developers to have the exception always occur when accessing unqueried fields that have not been explicitly set.
So I’m leaning towards the “it’s a bug” camp… but is this a bug worth fixing?
The nature of this “bug” is the lack of an exception. How much code exists out there where someone queries a record, writes a field, and then inadvertently reads an unqueried field? Especially considering that this behavior may have existed from the earliest days of Apex? I’m afraid to even ask – the number could be enormous.
Sure, they would version this fix. But then you’ll have a new version of Apex where an exception might be thrown that wasn’t thrown before. Everyone will have to test their code. Unit tests will help, but only for those who have good unit tests, and even then, there can easily exist code paths where the bug was missed – which could lead to the sudden appearance of intermittent and occasional exceptions in code that is currently working for anyone who wants to upgrade their code to a new API version. For some orgs this could present a costly and risky obstacle to upgrading to a new API version – at exactly the time where the new Apex compiler promises to bring new enhancements to the language.
So yes, it may be a bug, but this may be a bug where the cure costs more than it’s worth. In which case, there’s only one thing left to do – turn it into a feature and document it.
Whichever approach they choose, this has been a fascinating case – I hope you found it as interesting as I did. And please spread the word – this behavior is something every Apex developer should know about and consider both at design time and when creating unit tests and QA plans.