- I gasped when I saw this: “A bit of discussion indicated that the trigger for the CPU spikes both times was our CEO logging in. We re-deployed to get a clean start, permanently banned him from the service, and moved on.”
- This is like finding a live grenade under your bed and putting it under the rug.
- They found a way to reproduce a system-killing bug, and instead of taking the time to understand it, they threw away their test case.
- They contained the impact. Root causing or “understanding” should come after impact mitigation. If needed, find a safe way to reproduce the bug without customer impact.
- We reverted the refactoring, deployed, un-banned the CEO, and set about analysis.
- Yeah, me too, but if you keep reading, they didn’t actually “move on” in the way that it sounds.
 
- Well done. More and more companies are deploying LLM-written code in production environments. Might as well be honest about the results so we can learn what does and doesn’t work. 
- It’s obvious that the LLM didn’t understand the code at all. It chose to refactor the way it did because of a silly comment.
- It’s an inference model. It does not understand code no matter how much context it has. It can however output the most probable solution based on the context it has.
 
- Why are we using tools that can’t parse the comment and code via syntax for refactoring?
- The first problem is they’re letting AI touch their code.
- The second problem is they’re relying on a human to pick up changes in moved code while using git’s built-in diff tools. There’s a whole bunch of studies that show how git’s diff algorithms are terrible, and how swapping to newer diff algos improves things considerably.
- TL;DR on the studies:
- Only supporting add/remove/move operations is really bad.
- Adding syntax awareness to understand if differences in indentation should be brought to a reviewer’s attention improves code and makes code reviews more accurate. (But this is hard because it’s language dependent.)
- Adding extra operations (indent/deindent/move/rename-symbol/comment/un-comment/etc…) makes code review easier, faster and more accurate. (But again, most of this requires syntax awareness.)
 
 - There’s also a bunch of alternative diff algos you can use, but the best ones are paid, and the free ones have fewer features. See: 
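
To make the point about moved code concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `OLD`/`NEW` snippets, the `check` function and the helper names `plain_diff` and `split_moves` are all made up for this example, and Python's standard `difflib` stands in for a plain line-based diff (it is not git's implementation). A diff that only knows "removed" and "added" lines reports the whole moved function as deleted and re-added, so the one line that actually changed is easy to miss. A crude move-aware pass pulls that line out.

```python
import difflib

# Hypothetical before/after of a refactor that moves `check` below `main`
# and quietly flips its return value while moving it.
OLD = [
    "def check(user):",
    "    # admins skip the rate limit",
    "    return user.is_admin",
    "def main():",
    "    run()",
    "    log()",
    "    exit()",
]
NEW = [
    "def main():",
    "    run()",
    "    log()",
    "    exit()",
    "def check(user):",
    "    # admins skip the rate limit",
    "    return not user.is_admin",
]


def plain_diff(old, new):
    """Return (removed, added) lines, as a line-based add/remove diff sees them."""
    removed, added = [], []
    for line in difflib.ndiff(old, new):
        if line.startswith("- "):
            removed.append(line[2:])
        elif line.startswith("+ "):
            added.append(line[2:])
    return removed, added


def split_moves(old, new):
    """Separate lines that merely moved from lines that changed during the move."""
    removed, added = plain_diff(old, new)
    moved = [line for line in added if line in removed]
    changed_in_move = [line for line in added if line not in removed]
    dropped = [line for line in removed if line not in added]
    return moved, changed_in_move, dropped


if __name__ == "__main__":
    removed, added = plain_diff(OLD, NEW)
    print(f"plain diff: {len(removed)} removed, {len(added)} added")
    moved, changed_in_move, dropped = split_moves(OLD, NEW)
    print("moved unchanged:", moved)
    print("changed while moving:", changed_in_move)
    print("dropped:", dropped)
```

This matches lines by exact text only, so it knows nothing about indentation, renames or syntax. That gap is what the syntax-aware tools mentioned above try to close.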
 





