Sanity Checks and the Importance of Peer Review
To be wise is to know you will err, and to build systems to expose your errors.
On Dumpster Fires that are your Fault
I recall in an email delineating some guiding principles for the company culture, Elon Musk wrote something along the lines of “bad news should be escalated immediately and as loudly as possible. Fess up early, because if you don’t get everyone’s eyes on the problem immediately, you and others will end up in big trouble.”
Imagine you are an analyst at a rocket company using a simple hand-calculation analysis tool that you put together in a spreadsheet and never got peer reviewed, and you happen to stumble on a mistake you made: you forgot a 0 when converting millimeters to meters, and so the tool is producing wildly inaccurate (low) outputs. But last week, you used the spreadsheet to calculate the design loads for the designer of an important flight part, and she was thrilled to see how low they were, because it meant she could qualify the part by analysis and avoid a long and costly testing campaign.
You’re going to ruin her day when you break this news, and you’re going to look like an idiot when you do it. She and her team might never collaborate with you again. She’ll be furious to have to communicate a new, drastically more painful, expensive, and time-consuming development and production plan to the teams supporting that part. The word might spread to other designers, and you fear you might lose your job. At your rocket company, it is after all a high-octane culture where the incompetent1 could be fired on the spot.
Could you just let this one slide, and never tell anyone? The chances of her part failing in flight and causing a catastrophe because it is under-designed are really low anyways, because of the number of other safety checks other teams have put in place. Regardless of what you choose, you’d do anything to have caught this issue before getting that designer’s hopes up, and either a peer review or a sanity check could have saved the day for you. But first a brief aside on this dilemma you are now in:
How to do the right thing when the pressure is on? (An Aside on Engineering Ethics)
A purely self-serving risk/incentive/consequence calculation would result in you never fessing up, despite Elon’s exhortation. So to confess your mistake, you will need to appeal to a higher moral ideal than raw Darwinian selfishness can provide. Religious and quasi-religious orientations enter the scene.
“I’m a big believer in moral absolutism, not moral relativism; there's good and bad in the absolute.” - Elon Musk, in a 1/22/24 interview
Many a debate has raged between religious and irreligious parties as to whether there can be an “absolute moral standard” without God as its source. I will not take on this debate here as it’s outside my area of expertise. Musk’s quasi-religious “The universe is the answer” and “I prayed for this one” sensibilities are pretty par for the course for the insipid agnosticism rampant in the circles I’ve run in, and for many like Musk they are enough to occasionally coax you to self-denying actions that abrade against your self-preserving intuitions.2
Another quasi-religious framework that might incentivize you is the belief that the mission of the company is somewhat messianic: necessary to save the world from earthbound extinction.3 In this setup, the company leader and his dictates become divine and infallible. If Elon says to fess up, you do it. Co-workers become family, and the work that supports the mission consumes all life outside of it. It goes without saying that this is also quite common, and not mutually exclusive with the pseudo-religion mentioned above (and is closer to being symbiotic). I don’t think I need to expound on the many potential motives for a Christian engineer to fess up, but I will mention that once I became a Christian during my time at SpaceX I began to see that there were a lot more of us at SpaceX than I had previously assumed (which was zero)4.
I will say that regardless of your framework, having a good boss is essential, and I certainly did. If you have a boss who is going to unconditionally support you5, forgive you, have your back no matter what, and continue to invest in you regardless of your current standing, it becomes substantially easier to venture into the humiliation of admitting your mistakes. I certainly had this good boss at SpaceX in Steve Furger, and it made a world of difference for me at times when my internal moral compass may have otherwise failed me. Finally, to tie this back in with my previous “religious frameworks” points especially for those who don’t have a good boss (or teacher, for students), a massive advantage for a Christian is that God embodies these good characteristics as an ultimate boss in your life, and so even if you don’t get support from a work boss and may even get fired, you can rest in the assurance that you always operate in a “broad place,”6 with a guiding hand in your life that is characterized by forgiveness and unconditional support.
Adventures of A Novice Analyst
In my first year at SpaceX, I opened up an email on a Saturday that shook me to my core. It was from Angela, the highest ranking engineer in my department and one of the first employees that the company ever had. She had been tasked with peer reviewing an analysis I had done for a customer, and discovered I had forgotten a key mathematical factor in my simulation7, making the results of the analysis clearly wonky. For example, a satellite that supposedly weighs 10,000 pounds was being simulated as only weighing 25.9 pounds, and yet I didn’t notice in the analysis results that anything was wrong, or do simple sanity checks that would expose the response of an unrealistically light satellite.
I emailed Angela right away to say I’d be racing into the office to fix the issues and restart the calculations. She was clearly amused by my novice mistake and the juvenile sense of life-or-death vigor with which I rushed to remedy my error. She comforted me that it could wait until Monday, that something like this has happened to every single experienced engineer, and that this is the reason even she still needs every single one of her analyses peer reviewed. I still ended up in the office to correct my mistake and re-initiate the simulations, but now relieved that this wasn’t the end of my still budding engineering career. I never forgot from that point on:
Just how insanely frequent careless errors were, even among the best engineers.
The importance and means of “sanity checking” the outputs of a calculation.
The insistence of the most experienced engineers on always getting a peer review before passing along results of an analysis.8
On AI Peer Reviews (Another Aside)
Will AI tools be the new peer reviewers9? Or will an AI tool produce the analysis and a human peer review it, because of the proclivity of these tools to “hallucinate” or make things up? Or when these tools are sufficiently advanced, will the path to the most reliable analysis be an AI creator and an AI reviewer? Many are hypothesizing and experimenting around the power and possibility of AI agents collaborating to achieve higher performance than a single AI tool alone. For the foreseeable future, I don’t see an AI tool playing a significant role on either end in engineering analysis. All of my attempts to test its capability for engineering analysis (the kind that requires mathematical understanding) have come up empty-handed10.
Translating the Experience to Students
I’ve been meeting over Zoom about once a month with one of my old and most brilliant students from Astra Nova, who had assembled a team from her new school (Stanford Online High School, where she is now a sophomore) to compete in an entrepreneurial competition called the Conrad Challenge. They won the challenge, received 20k to apply for a patent for their innovative solar design, and were given resources to launch their startup. Their design and proposal seemed really advanced and well researched, but I didn’t have the experience with solar and energy storage systems to guide them through this developmental phase of analysis, build, and test.
The Importance of Networks and Generous Engineers
I began to think of people I could connect them with who might be able to look over the design and provide guidance. It so happened that I worked with a guy named Kevin on the Vanderbilt Aerospace Team who now had a PhD from Stanford in the specific advanced solar material the student team was planning to use (perovskite) and had founded a solar startup. I hadn’t talked to him in about a decade, and yet he immediately got back to me within a few hours and was both willing to meet with them, and had already found some major and simple pitfalls in the design they had proposed. That experience highlighted for me how essential it is for these budding engineers to have access to a network of world class engineers, and that this was one major thing I can offer them. I also saw how mightily important it was to these kids when these practicing engineers are generous with their precious time.
Hand Calcs
What Kevin noticed was a simple mismatch between the size of the solar panel and their proposed supercapacitor for short term energy storage. He was doing what is commonly referred to as “First Principles Thinking” using a simple “Hand Calc” or “Back of the Envelope Calc”. These are foundational for rockstar engineers and used constantly, whether at lunch on a napkin to debate an innovative design idea11, or to explain a complex concept to a visitor to your desk. They are a crucial feature of my philosophy of engineering that I synthesized from my experiences and now teach in my “Engineering from First Principles” class.
His calculation is remarkably simple. Take the watt-hours (an energy unit) generated by your panel in a day of sunlight for a given cross sectional area12, which gives an energy per unit area in Wh/m^2. Divide it by the volumetric energy density of the storage device in Wh/m^3 and you have the necessary thickness of the energy storage device in meters. It is a simple 2 variable division calculation. If the thickness output by the calculation is untenable for production or consumer use (over a kilometer for example), the design needs to be changed, and that is exactly what the calculation showed for their design.
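For readers who would rather see the hand calc than read it, here is a minimal sketch in Python. The function name and the input numbers are my own illustrative assumptions, not values from the team's design or from Kevin's spreadsheet: I assume roughly 1,000 Wh of energy collected per square meter of panel per day, and a storage density of 450,000 Wh/m^3.

```python
def required_thickness_m(energy_per_area_wh_m2: float,
                         energy_density_wh_m3: float) -> float:
    """Thickness (m) of a storage layer that sits under the panel and
    shares its footprint: (Wh/m^2) / (Wh/m^3) = m."""
    if energy_density_wh_m3 <= 0:
        raise ValueError("energy density must be positive")
    return energy_per_area_wh_m2 / energy_density_wh_m3


# Assumed, illustrative inputs (not from the original design):
daily_energy_wh_per_m2 = 1_000.0       # energy collected per m^2 per day
storage_density_wh_per_m3 = 450_000.0  # volumetric energy density

thickness_m = required_thickness_m(daily_energy_wh_per_m2,
                                   storage_density_wh_per_m3)
print(f"Required storage thickness: {thickness_m * 1000:.2f} mm")
```

The point of the sketch is the unit bookkeeping: if the number that comes out is measured in millimeters, the design might be buildable; if it comes out in kilometers, either the design or the inputs are wrong.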
I’m a little humiliated that I didn’t have the wherewithal to propose a calculation like this during my initial meetings with them. After all, I had sized a solar panel and battery system for my van that I built out with scrap SpaceX solar panels13. I suspect I was overwhelmed by some of their complex design infographics and the many advanced materials I was unfamiliar with in their proposal slides, and this actually drives home another essential SpaceX lesson learned, which is the diligent avoidance of “ChartJunk”, leaving behind only “the minimum set of visuals necessary to communicate the information understandably.” I also found it noteworthy that a design that failed a 2-variable, pre-algebra complexity hand calculation made it past not just me, but also past the Conrad Challenge judges and other experienced energy systems leaders. It is clear we have a crisis of mastery of the fundamentals in our STEM education and professions, and I am by no means exempt.
Sanity Checks and Peer Review Come Full Circle
I met with the team after their talk with Kevin to try and understand what new direction might be best for them, and we talked about the calculations for other energy storage system designs. They had some results in their spreadsheet that seemed a bit fishy to them, but they had double-checked the calculations and they seemed to match Kevin’s equations as well. They asked if I could look them over too, and as I glanced at them in the Zoom meeting I couldn’t find anything obviously wrong. Converting from watt-hours per liter to watt-hours per meter cubed gave me a bit of pause, since it involved dividing by a factor of 1000 to get from liters to meters cubed. I had to double check that there were 1000 L in a meter cubed, but there are.
Things still seemed fishy, though. The calculation for sizing a lithium ion battery in their design still produced a required thickness of over a kilometer. I didn’t see how something like an electric vehicle could be tenable with storage requirements like this. Something still wasn’t right, but it was after 5 pm on a Friday and I was exhausted. I’d need to come back to it later. But after I ended the call with them, I couldn’t get the problem out of my head. I had to figure out what I was missing in this simple solar-storage calculation puzzle.
I began looking at data on Teslas, which also use lithium ion batteries and say they have about 75 kWh (75,000 Wh) of storage. So if we divide by the 0.45 Wh/m^3 that was in their spreadsheet for lithium ion energy density, we get over 100,000 m^3 for a Tesla... no way...? But if we divide by the 450 Wh/L that was in the spreadsheet before the unit conversion, we get a more reasonable 166.6 Liters, which passes a sanity check for the approximate volume of a Tesla battery. So why more m^3 than Liters? Because we had converted liters to m^3 without remembering that the volume term was in the denominator: we needed to multiply by 1000 instead of divide. I was immediately teleported back to my humbling experience with Angela’s peer review, and was thankful that I could extend to them the same relief and lessons that I had learned back then, and a reminder that mistakes like this haunt even the best engineers.
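The Tesla sanity check can be written out as a short Python sketch. The 75,000 Wh capacity and the 450 Wh/L density are the numbers quoted above; the function and variable names are my own, chosen only for illustration.

```python
LITERS_PER_CUBIC_METER = 1000.0


def wh_per_liter_to_wh_per_m3(density_wh_per_l: float) -> float:
    """Convert volumetric energy density from Wh/L to Wh/m^3.

    The volume unit is in the denominator, so a cubic meter (1000 L)
    holds 1000 times more energy than a liter: multiply, don't divide.
    """
    return density_wh_per_l * LITERS_PER_CUBIC_METER


capacity_wh = 75_000.0    # approximate Tesla pack capacity quoted above
density_wh_per_l = 450.0  # lithium ion density from the spreadsheet

# Correct conversion: 450 Wh/L -> 450,000 Wh/m^3
density_ok = wh_per_liter_to_wh_per_m3(density_wh_per_l)
volume_ok_liters = capacity_wh / density_ok * LITERS_PER_CUBIC_METER

# The spreadsheet's mistake: dividing instead of multiplying -> 0.45 Wh/m^3
density_bad = density_wh_per_l / LITERS_PER_CUBIC_METER
volume_bad_m3 = capacity_wh / density_bad

print(f"Correct conversion:   {volume_ok_liters:.1f} L")
print(f"Mistaken conversion:  {volume_bad_m3:,.0f} m^3")
```

The correct conversion gives a pack of roughly 167 liters, while the mistaken one gives well over 100,000 cubic meters. Comparing against a real object whose size you already know is what makes the error jump out.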
One of Musk’s current obsessions is what he sees as a crisis of competence in careers where high performance is demanded and errors result in catastrophe.
If you suspect I’m getting carried away, these are immediately pressing issues for a practicing engineer facing the dilemma I’ve just painted, and for engineering students tempted to cheat their way through school.
One journalist describes this as TESCREAL: “transhumanism, extropianism, singularitarianism, cosmism, rationalism, effective altruism and longtermism.”
“Underpinning visions of space colonies, immortality and technological apotheosis, TESCREAL is essentially a theological program.”
Disclaimer: I love technology; I am not against technology. I am against technology when it takes the place of religion, and takes on messianic and salvific responsibilities.
In a revealing moment, I “came out” to my best friend at SpaceX as a new Christian, fearing I’d be laughed at. He, though, responded that he too was a practicing Christian and was shy about it at SpaceX.
Even if sometimes this means radical candor and constructive criticism.
Psalm 18:19.
Param WTMASS= 0.00259 for those who want to go down the NASTRAN rabbit hole.
It was always the fresh, young engineers who felt they didn’t need one or that what they were working on was far too urgent to get one.
Through tools like Microsoft’s Copilot, now rolled out for assistance in software like Excel.
I am hoping to write a forthcoming post that delineates the results of my experiments and the future prospects, commenting on things like the Wolfram Alpha plugin in ChatGPT and the mysterious future Q* project at OpenAI.
“Can we build the Starship rocket entirely out of stainless steel?” is a good example, but it might take up more than 1 napkin.
This cross sectional area is matched by the integrated energy storage system (part of their design).
My approach was a little more intuitive and based on other van designs, and I did eventually overpower my solar charge controller because I didn’t have exact power numbers for the scrap SpaceX panels. From this, one would assume correctly that I would not be able to catch this team’s design issue with a hand calc like Kevin had.