Switching out the dies
The Japanese car companies began to run circles around Detroit in the 1970s. They didn’t outspend us or apply better technology - they focused on continual improvement.
They worried about more than improved quality. The Japanese engineers thought about how to make it easier to improve quality.
One typical example was a focus on switching out dies - massive hunks of metal that bash sheet steel into doors, hoods, and fenders. Want to fix a problem caused by a bad die? That will cost you three days in assembly line down time. Of course, dies were rarely switched out.
Toyota concentrated on switching out dies quickly - an activity that has no direct benefit for the consumer. They tried many ideas and eventually brought change times down from days to minutes (the technique is now called Single-Minute Exchange of Die, where “single minute” means a single-digit number of minutes).
Then the customer started getting benefits. Fixing problems was no longer painful, and quality shot up.
It’s the same in software or in banking. Process matters. Make it easy to fix problems and quality moves up.
We’ve trimmed our process back by removing the fat. A few days ago, we went from identifying a problem to making an urgent fix in less than one hour. And we followed procedure. The underlying issue was pretty simple (a customer had posted deposit rates with larger minimum balances than we thought possible). But the point is clear: procedure should not bog down fixes.
- Note the problem - Randa clearly identified the problem in our online bug tracking system. “Here’s what I did, here’s what I expected, here’s what happened”. Takes a minute and the issue is tracked to completion. Far different from noting the problem in email, voice mail, or post-it.
- Notify the developer - the system automatically emails the developer when the case is assigned.
- Fix the problem - since our code is written clearly, we found the underlying issue in minutes.
- Unit test - run the automated tests to confirm we didn’t accidentally break something else when fixing this issue. To be honest, we don’t have as many tests as we need yet.
- Commit in source control - one click uploads all the new code to our source control server, where the rest of the team can retrieve it without stepping on their own work.
- Documentation - The developer edits the online wiki and the new user documentation is complete.
- One-click build - The developer clicks a button in our internal software. This changes the version number, compiles the source code into the final program, and builds the installation file (a program that places our software on the user’s computer).
- Resolve the bug case - The developer marks the case fixed. The person who opened the case is automatically emailed.
- Close the case - Randa confirmed the code works properly and closed the case. It is now out of mind forever, but there is a complete paper trail in case we ever need to revisit the issue.
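For the programmers reading, here is a minimal sketch of the case lifecycle described in the steps above. It is not our actual bug tracker: the `Case` class, its field names, and the `outbox` list standing in for automatic emails are all made up for illustration. The point it shows is that each step is only allowed from the step before it, and every step leaves a line in the paper trail.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    OPEN = "open"
    ASSIGNED = "assigned"
    RESOLVED = "resolved"
    CLOSED = "closed"


@dataclass
class Case:
    """Hypothetical bug case; not the schema of any real tracker."""
    opener: str
    what_i_did: str
    what_i_expected: str
    what_happened: str
    status: Status = Status.OPEN
    developer: Optional[str] = None
    history: List[str] = field(default_factory=list)  # the paper trail
    outbox: List[str] = field(default_factory=list)   # stand-in for automatic emails

    def _move(self, required: Status, new: Status, note: str) -> None:
        # Refuse out-of-order steps so every case follows the same path.
        if self.status is not required:
            raise ValueError(f"cannot go to {new.value} from {self.status.value}")
        self.status = new
        self.history.append(f"{new.value}: {note}")

    def assign(self, developer: str) -> None:
        self._move(Status.OPEN, Status.ASSIGNED, f"assigned to {developer}")
        self.developer = developer
        self.outbox.append(f"To {developer}: a case was assigned to you")

    def resolve(self, note: str) -> None:
        self._move(Status.ASSIGNED, Status.RESOLVED, note)
        self.outbox.append(f"To {self.opener}: your case was marked fixed")

    def close(self, confirmed_by: str) -> None:
        self._move(Status.RESOLVED, Status.CLOSED, f"confirmed by {confirmed_by}")


case = Case("Randa", "what I did", "what I expected", "what happened")
case.assign("developer")
case.resolve("fix committed, docs updated, build made")
case.close("Randa")
print(case.status.value)
print(len(case.history), "history entries,", len(case.outbox), "emails queued")
```

Trying to close a case that was never resolved raises an error, which is the software version of "and we followed procedure".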
There are tradeoffs. We would prefer that the customer get a physical manual to hold in the hand. But printed manuals cost too much in change speed - and therefore in software quality. It can take two months to notify the technical writers (working in specialized software), create a new manual, send the order to the printer, check the print job, and mail out revised manuals.
Mostly, though, better organization comes at little cost. Spend a little on software, push the team to work in slightly different ways, and give up a bit of short-term speed to take care of long-term issues.
We even made the fix process itself easy to modify. The complex steps are documented in our wiki too. Anyone can make an improvement without worry - each change is logged and it’s simple to revert to older documentation.
How quickly can your bank identify and fix a systematic problem? A little agility can work wonders.
Update: Bankervision remarks on Agility’s effect on bank profits.

I think that’s an excellent bug-fix procedure except that you left out step 2.33: Duplicate the bug, and step 2.67: Understand the bug. The only reason I make a big deal out of this is that too many developers leap right in and start “fixing” bugs before they have a solid and deep understanding of why the code did the wrong thing. That kind of “shotgun debugging” just wastes time and leads to more bugs.
Nice blog - I just discovered it today, although I read about your company in Bob Walsh’s book a while ago.
Good points. I’m lucky enough to work with top-notch developers. They instinctively look carefully at the case to make sure they understand it well.
If the case was vaguely written, the developer asks questions in the case and passes back to Randa. Sometimes, there’s a whole conversation in the case.
But we use a little Getting Things Done in this step (for less urgent cases) that’s worth detailing:
- If the case is clear and can be done quickly (half hour), it’s done on the spot.
- Otherwise, an estimate is added to the case.
The estimate not only helps us schedule work, but signifies “I read the case and think I understand without further questions”.
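Written out as code, the rule looks roughly like this. It is only a sketch: the `triage` function, the `Triage` record, and the action names are invented for illustration, and the 30-minute limit is just the “half hour” mentioned above.

```python
from dataclasses import dataclass
from typing import Optional

QUICK_FIX_LIMIT_MINUTES = 30  # the "half hour" rule of thumb; an assumed cutoff


@dataclass
class Triage:
    action: str                      # "ask", "do_now", or "schedule"
    estimate_minutes: Optional[int]  # None until the developer understands the case


def triage(is_clear: bool, estimate_minutes: int) -> Triage:
    """Decide what happens to a less urgent case (hypothetical helper)."""
    if not is_clear:
        # Vague case: ask questions in the case and pass it back to the opener.
        return Triage("ask", None)
    if estimate_minutes <= QUICK_FIX_LIMIT_MINUTES:
        # Clear and quick: do it on the spot.
        return Triage("do_now", estimate_minutes)
    # Clear but bigger: record the estimate, which also signals
    # "I read the case and think I understand it".
    return Triage("schedule", estimate_minutes)


print(triage(True, 20).action)
print(triage(True, 240).action)
print(triage(False, 20).action)
```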
I like what you have to say about how paying attention to process and procedures in software development actually saves time and effort. As you suggest, you have to make the investment in some “infrastructure”, but it starts to show up in faster turnaround and quality.
Your steps remind me of the content of a book I am reading right now called Producing Open Source Software by Karl Fogel. In truth, a lot of what he says in the book (e.g. regarding technical infrastructure for a software development project) could be applied to any software development effort, not just open source.