Rules for Programs Writing Programs
I'd rather write programs to write programs than write programs.
Programming Pearls, Communications of the ACM, Sept. 1985
I write a lot of programs that write programs. I’ve found a few rules that make the output of these generators easier to use and maintain.
Make the output “diff friendly”
By “diff friendly”, I mean the output from “diff” or “git diff” should make it clear to a human what changed, if anything. I wrote a whole article on diff friendly, but the TLDR is:
- If you are generating something from a data structure, make sure the output is sorted. In golang you have to, since the iteration order of maps is, by design, non-deterministic, but you should do this in other languages too.
- Format the output using whatever is standard. That’s easy in golang. For other languages you’ll have to pick a format and stick with it. A common format will make diffs simpler. This also makes the generator program simpler, since the author doesn’t need to worry about whitespace and other details; that is handled in a separate step.
- If you are generating a data file like HTML or CSS, and you don’t have a formatter, you’ll want to strategically add line breaks so “view source” and diff output aren’t one gigantic line.
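As a rough illustration of the first two points, here is a minimal Go sketch. The `generate` function and the constant names are made up for this example. It sorts the map keys before emitting anything, writes the code without caring about alignment, and leaves the layout to `format.Source` from the standard `go/format` package.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
	"sort"
)

// generate is a hypothetical generator: it emits Go source declaring one
// constant per map entry. Keys are sorted first so that every run produces
// the same output, regardless of map iteration order.
func generate(pkg string, consts map[string]int) ([]byte, error) {
	keys := make([]string, 0, len(consts))
	for k := range consts {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var buf bytes.Buffer
	fmt.Fprintf(&buf, "package %s\n\nconst (\n", pkg)
	for _, k := range keys {
		// No indentation or alignment here; the formatter handles that.
		fmt.Fprintf(&buf, "%s = %d\n", k, consts[k])
	}
	buf.WriteString(")\n")

	// format.Source formats the bytes in canonical gofmt style.
	return format.Source(buf.Bytes())
}

func main() {
	src, err := generate("colors", map[string]int{"Red": 1, "Green": 2, "Blue": 3})
	if err != nil {
		panic(err)
	}
	fmt.Print(string(src))
}
```

Running this repeatedly should always list Blue, Green, Red in that order, so a diff between two runs shows only real changes.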
Don’t automatically overwrite the existing program if the new program isn’t functionally different.
Even formatters have bugs and change over time, just like everything else. And it’s very hard to make everyone use the same toolset for formatting. This means two people generating the same file might get different results.
The cost of emitting a new version can be high. You might not want to re-release your generated file simply because you changed a code comment or the indentation. If the new code differs only in whitespace or comments, then it should not automatically overwrite the existing version. If you do want a new version, you can force one by explicitly deleting the old copy. Of course, if the actual code changed, then a new version should be emitted.
To aid in this, I wrote client9/codegen, which should give you a good indication of whether your program has changed significantly. Yes, it’s a program to help programs write programs.
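To make the idea concrete, here is a minimal sketch of one way to do such a check for Go output. This is not the client9/codegen API; `tokens`, `sameCode` and `writeIfChanged` are hypothetical names for this example. It assumes that two files are "functionally the same" when they produce the same token stream once comments and whitespace are ignored, and it uses the standard `go/scanner` package to get that stream.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
	"os"
)

// tokens returns the Go tokens of src. Comments are skipped (scanner mode 0)
// and whitespace never produces tokens. Semicolons, whether explicit or
// inserted at a newline, are normalized: one directly before ")" or "}" or
// at end of file is optional in Go and is dropped, and runs collapse to one.
func tokens(src []byte) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, src, nil, 0)

	var out []string
	pendingSemi := false
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			return out
		}
		if tok == token.SEMICOLON {
			pendingSemi = true
			continue
		}
		if pendingSemi && tok != token.RBRACE && tok != token.RPAREN {
			out = append(out, ";")
		}
		pendingSemi = false
		out = append(out, tok.String()+" "+lit)
	}
}

// sameCode reports whether a and b differ only in comments and whitespace.
func sameCode(a, b []byte) bool {
	ta, tb := tokens(a), tokens(b)
	if len(ta) != len(tb) {
		return false
	}
	for i := range ta {
		if ta[i] != tb[i] {
			return false
		}
	}
	return true
}

// writeIfChanged writes fresh to path unless an existing file there already
// contains the same code. Deleting the old file forces a rewrite.
func writeIfChanged(path string, fresh []byte) (written bool, err error) {
	old, err := os.ReadFile(path)
	if err == nil && sameCode(old, fresh) {
		return false, nil
	}
	if err != nil && !os.IsNotExist(err) {
		return false, err
	}
	return true, os.WriteFile(path, fresh, 0o644)
}

func main() {
	old := []byte("package p\n\n// v1\nfunc F() int { return 1 }\n")
	reindented := []byte("// v2\npackage p\n\nfunc F() int {\n\treturn 1\n}\n")
	changed := []byte("package p\n\nfunc F() int { return 2 }\n")

	fmt.Println(sameCode(old, reindented)) // true
	fmt.Println(sameCode(old, changed))    // false
}
```

A token comparison is deliberately crude. It will not notice that two differently written programs behave the same, and it is specific to Go; for other output languages you would need a different tokenizer or a simpler comment-stripping pass.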
And from this, there is another reason to ignore comments.
Add timestamp or version to generated file
Generated files are often used outside the program that created them and have lost their change history. If you don’t add a timestamp, then given two copies of the generated file one can’t tell which one is newer.
If you don’t implement the previous rule, adding a timestamp to the output creates a new, different file every time you run the generator. That means a file with no effective changes still has to be reviewed and committed.
But, since we ignore comments when deciding if we should rewrite a new version, adding a timestamp or version indicator is no problem and provides a lot of benefits. If only the timestamp has changed, then a new version is not generated.
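Here is a small sketch of how the two rules can fit together. The names (`withHeader`, `sameExceptStamp`, the tool name `mygen`) and the exact header layout are assumptions for illustration only. Instead of a full comment-aware comparison, this simplified version drops just the one timestamp line before comparing.

```go
package main

import (
	"bytes"
	"fmt"
	"time"
)

// stampPrefix marks the comment line that carries the generation time.
// The wording is an arbitrary choice for this example.
const stampPrefix = "// Generated at: "

// withHeader prepends a "generated" notice, including the tool version, and
// a timestamp comment to body.
func withHeader(tool, version string, now time.Time, body []byte) []byte {
	var buf bytes.Buffer
	fmt.Fprintf(&buf, "// Code generated by %s %s. DO NOT EDIT.\n", tool, version)
	fmt.Fprintf(&buf, "%s%s\n\n", stampPrefix, now.UTC().Format(time.RFC3339))
	buf.Write(body)
	return buf.Bytes()
}

// stripStamp removes every line that starts with stampPrefix.
func stripStamp(src []byte) []byte {
	var kept [][]byte
	for _, line := range bytes.Split(src, []byte("\n")) {
		if bytes.HasPrefix(line, []byte(stampPrefix)) {
			continue
		}
		kept = append(kept, line)
	}
	return bytes.Join(kept, []byte("\n"))
}

// sameExceptStamp reports whether a and b are identical once the timestamp
// line is ignored. If it returns true, the existing file can be kept.
func sameExceptStamp(a, b []byte) bool {
	return bytes.Equal(stripStamp(a), stripStamp(b))
}

func main() {
	body := []byte("package p\n\nconst Answer = 42\n")
	t1 := time.Date(2020, 1, 1, 0, 0, 0, 0, time.UTC)
	t2 := t1.Add(24 * time.Hour)

	first := withHeader("mygen", "v1.2.0", t1, body)
	second := withHeader("mygen", "v1.2.0", t2, body)

	fmt.Print(string(first))
	fmt.Println(sameExceptStamp(first, second)) // true: only the timestamp differs
}
```

The timestamp sits on its own line so that the first line can follow the usual Go "Code generated ... DO NOT EDIT." convention, and so that the comparison has exactly one line to skip.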