Sorry to be somewhat off-topic as far as economics goes. This is another one of my biases about systems development–not that I am a qualified professional at it. But on an earlier post, a commenter wrote,
code sent messages to customers by e-mail, phone, text, etc., for flight notifications (on time, late, cancel, gate change, etc.). Just for frequent flyers, the decision making on whether to send a message that a flight was delayed ran past 1000 lines of code. It depended on whether messages had been sent before already, plus results from multiple databases with customer contact details, timezones, whether the flight was a connection or originating, time-of-day, saved preferences, plus legal issues such as whether the customer had triggered telco opt-out for text messages.
My claim is that none of that should be written into computer code. Instead, think of a list of conditions that might trigger a message. Put those into a database and write code that constantly checks against those conditions. Then think of a list of conditions for sending a message by phone, a list of conditions for sending a message by email, etc. Put those into a database, and when “might trigger a message” is true, check these conditions and if they are satisfied, send a message.
My point is that business rules should reside as much as possible in data, not in code. That way, you know where they are, and you do not have the maintenance problems that come with large amounts of code.
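To make the idea concrete, here is a minimal sketch of rules held as rows of data and checked by a small, generic piece of code. The table layout, the field names, and the thresholds are all made up for illustration; nothing here comes from the airline system the commenter described.

```python
import operator

# Hypothetical rule rows, as they might come back from a database table.
# Each row is (channel, field, comparison, value); every row for a channel
# must pass before that channel is used.
RULES = [
    ("email", "delay_minutes", ">=", 15),
    ("email", "has_email", "==", True),
    ("text", "delay_minutes", ">=", 30),
    ("text", "sms_opt_out", "==", False),
]

# The only comparisons this sketch understands.
OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq, "!=": operator.ne}


def channels_to_notify(facts, rules=RULES):
    """Return the channels whose every rule row is satisfied by `facts`."""
    verdict = {}
    for channel, field, op, value in rules:
        ok = OPS[op](facts[field], value)
        verdict[channel] = verdict.get(channel, True) and ok
    return sorted(ch for ch, passed in verdict.items() if passed)


facts = {"delay_minutes": 20, "has_email": True, "sms_opt_out": False}
print(channels_to_notify(facts))  # ['email']
```

Changing the text-message threshold then means editing one row rather than hunting through branches, although any rule that does not fit the (field, comparison, value) shape would still need new code.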
One of the advantages I see in rewriting software is that you take the business rules that have crept into the code out of the code and into data.
Deciding what’s better in a database is hard; code can be a lot better at describing rules than data is: http://thedailywtf.com/articles/Soft_Coding
The article is certainly on point and raises issues with my thinking. I think it is usually true that if you need to add or change one business rule, it is easier to do it in code. But if you keep doing that, one change at a time, the cumulative effect is pretty devastating. So I think that it’s an argument you always have to have with yourself.
The problem does not magically become easier if you try to push your business rules out into a database. That’s just a different kind of coding, and probably one that becomes a mess much faster than the traditional kind.
There *are* languages as varied as Make, Mathematica and Prolog which are actually designed as rule databases. Switching to such a language, perhaps a domain-specific one, might be a good idea. But this is just a special case of “good software architecture is better than bad software architecture”, and I already knew that maxim.
Perhaps we could reformulate your idea as an engineering goal: The business logic of your application should be split out into a location and syntax where the suits can easily find it, review it, understand it and propose changes to it.
Arnold:
You said, “One of the advantages I see in rewriting software is that you take the business rules that have crept into the code out of the code and into data.”
The major flaw in your logic is that business rules change – dynamically and constantly. The reason software is called software is that it allows dynamic changes (changing business rules) to be assimilated quickly. In adapting information systems to business rule changes, the last thing one wants to do, and the most expensive, is to alter either the database structure or its content. At the very least, changes to database structures and content should be kept to an absolute minimum.
Again, that seems right if you are just going to do one change. But when the changes accumulate, it gets ugly.
It’s a good thought experiment. Let’s tease out a couple of issues.
First, if what you are trying to put into the software is a collection of rules, then a programming language should be your go-to solution. That’s the problem that programming languages solve: encoding of arbitrary rules into a machine-processable form. Don’t discount it.
Second, in the given example scenario, database entries are being used to drive some decision. What is the nature of the software that processes this database? Well, it’s going to have to have support for talking to the database (or really, configuration file), and it’s going to have to have specialized code for each *kind* of rule that the system supports. It ends up being basically a domain-specific language (DSL) solution. Is this a win? Sometimes. It has to be a fairly complicated problem before a DSL solution wins, though, because the DSL implementation is often several thousand lines of code by itself. Plus, it’s tricky code. Plus, you end up railroading your potential solutions into what the DSL can express, which is a cost in flexibility that’s hard to quantify.
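A rough sketch of what that looks like in practice, assuming two invented rule kinds (`min_delay` and `quiet_hours`): the rule records are data, but each kind of rule still needs its own handler in code, and an unanticipated kind means a code change.

```python
from datetime import time

# Hypothetical rule records, as they might be loaded from a table or config file.
RULES = [
    {"kind": "min_delay", "minutes": 15},
    {"kind": "quiet_hours", "start": time(22, 0), "end": time(7, 0)},
]


def check_min_delay(rule, event):
    return event["delay_minutes"] >= rule["minutes"]


def check_quiet_hours(rule, event):
    # Passes only when the customer's local time is outside the quiet window.
    # Assumes the window wraps past midnight (start is later than end).
    now = event["local_time"]
    in_quiet = now >= rule["start"] or now < rule["end"]
    return not in_quiet


# One handler per kind of rule: this table is where the small "DSL" lives.
HANDLERS = {"min_delay": check_min_delay, "quiet_hours": check_quiet_hours}


def should_send(event, rules=RULES):
    """Return True only if every rule accepts the event."""
    for rule in rules:
        handler = HANDLERS.get(rule["kind"])
        if handler is None:
            # A rule kind the code does not know about cannot be expressed in data alone.
            raise ValueError(f"unsupported rule kind: {rule['kind']}")
        if not handler(rule, event):
            return False
    return True
```

Even at this toy size the interpreter is longer than the two `if` statements it replaces, which is the trade-off in question.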
“My claim is that none of that should be written into computer code. Instead, think of a list of conditions that might trigger a message. Put those into a database and write code that constantly checks against those conditions.”
That’s kind of a slippery distinction, since in any reasonably sophisticated system, what’s stored in the DB is probably going to be some kind of mini rule language, with at least a lexical analyzer and parser/expression evaluator. And what has to go in the ‘code-code’ are the characteristics of the rule language. Need to trigger a decision on some new, unanticipated attribute or data source? You may be back to code changes anyway. And a long, complex set of rules is not necessarily easier to understand and tweak without breaking something than code. That’s not to say this is a bad approach; I’ve implemented it myself. But it’s certainly not magic.
I believe Arnold is correct. I work with a “programmer” to develop rules to check building 3D models before construction. We make the “rules” in the database because the rules change from building to building and with each new building code. At the beginning of every project we review the rules from the last project and add, delete or adjust as necessary. Previously this was in the code, and it was impossible to find every rule, except when it generated mistakes. Since the database (in reality a simple Excel spreadsheet) is easy to look at for a non-programmer (me), it is easy to check and easy to update.
If your rules get complicated enough, are you not just writing code inside your database?
At least if the rules were in code you could use type systems to catch some flaws, or write tests to check your rules.
I think the code/database distinction is close, but not cutting the problem quite the right way. Putting the rules behind a single API call might be just as good a way of abstracting it as putting it into a DB. I think the general point is to separate the business logic from everything else. A DB implementation is one way, but a separate service has advantages as well. (Unit tests!)
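As a small illustration of the single-entry-point idea, here is a sketch with the decision isolated in one function and a few unit tests around it. The function name, its parameters, and the 15-minute threshold are invented for the example.

```python
import unittest


def should_notify(delay_minutes, already_notified, sms_opt_out):
    """Hypothetical single entry point for the 'send a delay text?' decision."""
    if sms_opt_out or already_notified:
        return False
    return delay_minutes >= 15


class ShouldNotifyTest(unittest.TestCase):
    def test_opt_out_blocks_message(self):
        self.assertFalse(should_notify(60, already_notified=False, sms_opt_out=True))

    def test_no_repeat_message(self):
        self.assertFalse(should_notify(60, already_notified=True, sms_opt_out=False))

    def test_short_delay_is_ignored(self):
        self.assertFalse(should_notify(5, already_notified=False, sms_opt_out=False))

    def test_long_delay_sends(self):
        self.assertTrue(should_notify(20, already_notified=False, sms_opt_out=False))
```

Saved as a module, the tests can be run with `python -m unittest`. Callers never see how the decision is made, so the body could later read its thresholds from a database without any caller changing.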
I’d also guess that, at most places, the challenge is less in defining best practices than in educating and enforcing them across an organization where both the developers and who owns what part of the code base are constantly changing.
Actually, from a “theory of computing” perspective, whether you encode business behavior in “data” or in “code” is a false distinction. Both are encodings of rules/behavior.
It may, or may not, be easier to change structures in a database than to change code. Remember that every behavior you wish to express through some database entry must be backed by a code stack which can present that behavior to the outside world.
In the real world any competitive business will be under change pressure, and the encoding of its rules and behaviors will have to change, be it in “data” or “code”.
Yes, data and instructions are isomorphic.
I began this discussion by suggesting that software applications should be rewritten frequently. My view is that in the short run, you fix applications by adding instructions. But over time, the accumulation of new lines of instructions makes it hard to maintain software. You get to a point where a simple change to a business rule threatens to cause all sorts of unanticipated consequences. The business people expect a change next week, because it seems like such a simple change to them, and the systems people come back and say that it will take 10 staff-years to implement.
So to keep an application reasonably clean, fairly frequently you need to rewrite it, to take a lot of the accumulated ad hoc instructions and structure them more like data. After the major rewrite, then you will start to make ad hoc changes again, and so in a few years you will need another rewrite.
“Instead, think of a list of conditions that might trigger a message. Put those into a database and write code that constantly checks against those conditions. Then think of a list of conditions for sending a message by phone, a list of conditions for sending a message by email, etc. Put those into a database, and when “might trigger a message” is true, check these conditions and if they are satisfied, send a message. ”
I think “put those into a database” is begging the question. Put them in how? For things like “when was the last message sent” it will almost certainly be written to the database by a piece of code, not by a human sitting there and doing it.
Certainly the answer to the question “how long between messages” does not have to be hard coded. The program can read it from configuration, be that from a database, a config file, something like Zookeeper, or any of a dozen ways, but that is the standard way to do things.
What he was talking about was gathering information from various sources, evaluating that information, and then making a decision about what to do.
Whether you do that some fancy way in the database or in a piece of code, the logic is basically the same because what has to be done is basically the same, and wherever you put it, that logic is what is called code.
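A minimal sketch of that split, with the "how long between messages" value read from configuration while the decision stays in code. The JSON key and the 30-minute value are assumptions made up for the example; the same value could just as well come from a database row or a config service.

```python
import json
from datetime import datetime, timedelta

# Hypothetical configuration text; in practice it might be read from a file.
CONFIG_TEXT = '{"min_minutes_between_messages": 30}'


def load_min_gap(config_text):
    """Parse the minimum gap between messages out of the configuration."""
    cfg = json.loads(config_text)
    return timedelta(minutes=cfg["min_minutes_between_messages"])


def may_send(last_sent, now, min_gap):
    """The decision itself is still code; only the threshold is data."""
    return last_sent is None or now - last_sent >= min_gap


min_gap = load_min_gap(CONFIG_TEXT)
now = datetime(2024, 1, 1, 12, 0)
print(may_send(datetime(2024, 1, 1, 11, 45), now, min_gap))  # False
print(may_send(datetime(2024, 1, 1, 11, 0), now, min_gap))   # True
```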
It is advantageous to express your business rules/logic in the most concise and expressive way possible. Frequently, this means using some DSL or simpler rule language rather than a general purpose programming language. At my current place of employment we have such a custom DSL with business logic functionality.
I somehow get the feeling that computer science has regressed.
I wanted to reply to this post the day I saw that you had quoted me. I was on my phone the past four days as we’ve been moving to a new apartment. I hope I’m not too late to respond.
Some thoughts:
1) Real software is developed under serious staff constraints. The programmer that wrote the 1000 lines for frequent flyer delay messages was reliable, but not smart enough to encode rules into data or a better language. My more advanced (much smarter!) programmer that dealt with much more complex rules used a very sophisticated rules-based language called Jess from Sandia Labs. While you still encoded most of what you “knew” as a script, it resembled more a set of interacting assertions rather than procedural code. This is probably the closest I’ve come to encoding all business rules as data, but only one of my developers was smart enough to use such a tool. I had to use the human resources available, which usually means a lot of IF…THEN code in some easy language (Java, .NET).
2) Rewrites are much less costly if a product is highly modularized, and may not even be called rewrites. Putting business rules into a database can destroy this compartmentalization. Most databases are designed to link many tables together efficiently, right? The danger of cruft in business logic code is greatly reduced if you are very strict about putting it into small stand-alone components. If you review a one-page function that makes one small business decision, and change it, comment it, unit test it, and commit it to Git, is this a “rewrite”? Is the Git repository a database? Unit tests can also be more easily created and run against business rules that are in distinct packages or modules, especially if they respond to messages only. I am not sure how you would isolate one subset of business rules in a database for unit testing at all, actually. If you have a data access layer (stored procedures, triggers, etc.) then that is still code, not data, even if they’re “stored” in the database.
3) Rewrites did happen often. The architecture I created fed messages to a single method that was triggered only for that specific customer for that specific event, and triggered in real-time as the flight schedule evolved. The trigger contained all the required data from all sources (we used XML for human legibility), and the method returned a payload (message) to go to another component for actual transmission to a phone. If a set of business rules in a simple language began to overwhelm a less capable programmer, I could have it rewritten (and often did) by assigning that one component to a smarter developer, usually moving it to Jess. The distinct units of work are not only for rewrites. They also allow for domains of responsibility and maintenance as well.
4) Accountability (CYA) in a big political enterprise is often paramount. If you have business rules in a database, how do you manage the blame game when something goes wrong? (Also, don’t underestimate the power of comments in code. How do you comment rules in a DB?) The trouble with complicated solutions that are “better” is that your management may not be able to understand it, so when they are trying to pin blame on someone they need to see code, with comments, with people’s names next to changes, with DIFFs in a source code repository with user names attached. You can’t do this with a database (or at least not easily, and certainly not easily in 2004 when I started the design on this project). I was lucky management was looking the other way when I did most of my design in the first place. They never did really grasp message-based design. At least when the CEO called my boss (which happened several times) we could print the relevant code for the CTO to prove to him that we had done what we were supposed to do with that customer’s message. The CEO couldn’t read code, but the CTO could, provided the code was in some simple language.
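For readers trying to picture the design in point 3, here is a rough sketch of one self-contained handler that takes an XML trigger carrying all the data it needs and returns a message payload. The element names, attributes, and the 15-minute threshold are invented for illustration and are not the actual schema or rules from that project.

```python
import xml.etree.ElementTree as ET

# Hypothetical trigger document; element and attribute names are illustrative.
TRIGGER = """
<trigger event="delay">
  <customer id="C123" sms_opt_out="false" phone="+15555550100"/>
  <flight number="XX100" delay_minutes="45"/>
</trigger>
""".strip()


def handle_trigger(xml_text):
    """Return a message payload dict, or None if no message should go out."""
    root = ET.fromstring(xml_text)
    customer = root.find("customer")
    flight = root.find("flight")

    # Legal rule: never text a customer who has opted out.
    if customer.get("sms_opt_out") == "true":
        return None

    # Business rule: short delays are not worth a message.
    delay = int(flight.get("delay_minutes"))
    if delay < 15:
        return None

    return {
        "to": customer.get("phone"),
        "body": f"Flight {flight.get('number')} is delayed {delay} minutes.",
    }


print(handle_trigger(TRIGGER))
```

Because the handler depends only on its input document, it can be reviewed, diffed, tested, or handed to another developer as a single unit.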
I do think that extensive experience with real world constraints in crufty organizations and with huge systems causes one’s view to change in terms of what is possible and what is ideal. I think some of the newest approaches (messaging, parallelism, micro-modularization, etc.) have obviated the need for the rewrite in many ways, or at least isolated it to a much smaller subset of the whole application. I also believe this has seriously blurred the line between code and data, especially when much business logic is now specialized script stored as text in databases.
I would be very curious if anyone can point me to an example of business rules successfully stored as data instead of code. I confess that beyond the Jess-style encoding of logic as assertions with an associated list of dynamically changing facts, I am struggling to envision how even basic logic like I had to deal with could be stored as data at all. I’d love to see an example of this to study it.
Thanks for your comment. I have to admit that it has been many years since I was involved with systems, and even longer since I interacted with developers of large systems. So my biases could readily be out of date. I remain skeptical that a code-rich system can be gracefully maintained, which is the basis of my bias in favor of rewrites. But I could be wrong.