Testing works because tests are (essentially) a second, crappy implementation of...

mewpmewp2 · 2026-03-11T08:23:35 1773217415

I think there is a difference between doing TDD and writing tests after the fact to avoid regressions. TDD only works decently if you already know your specs very well, not so much when you still need to figure them out and have to build something concrete to do so.

josephg · 2026-03-11T09:38:35 1773221915

Yes; I think this remains true with coding agents. If you need to do some exploration of the solution space, it makes sense to do that before writing tests. Once you have a clear, workable design, you can get the agent to make a battery of tests to make sure the final product works correctly.

aray07 · 2026-03-11T15:42:36 1773243756

This is great. The tests in this case are the spec. When you give the agent something concrete to fail against, it knows what done looks like.

The problem is if you skip that step and ask Claude to write the tests after.

godelski · 2026-03-11T17:55:00 1773251700

> Tests only pass if both implementations of your software behave the same way.

That's not true.

I even addressed this in my comment, as did Dijkstra.

josephg · 2026-03-12T00:04:15 1773273855

What is untrue about this statement you quoted?

godelski · 2026-03-12T03:48:24 1773287304

You can have software behave differently while passing the same tests.

Idk man, this is pretty easy to demonstrate. Start with a trivial example: test is that input (2,2) -> 4. Function 1 does multiplication, function 2 does exponentiation. Both functions pass the test.
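A minimal Python sketch of that example, for anyone who wants to run it. The names `multiply`, `power`, and `passes_test` are made up for illustration; the only thing taken from the comment is the single test case (2,2) -> 4.

```python
def multiply(a, b):
    # Function 1: multiplication.
    return a * b


def power(a, b):
    # Function 2: exponentiation.
    return a ** b


def passes_test(f):
    # The entire test suite: input (2, 2) must produce 4.
    return f(2, 2) == 4


# Both implementations pass the one test...
both_pass = passes_test(multiply) and passes_test(power)

# ...yet they disagree on an input the test never exercises (6 vs 8).
differ_elsewhere = multiply(2, 3) != power(2, 3)

print(both_pass, differ_elsewhere)
```

Both values come out `True`: the test passes for either function even though they behave differently at (2, 3).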

Sure, simple example but illustrative examples should be simple. But add more complexity and I'll add more examples of functions where the outputs are the same for a given set of inputs. (There's a whole area of mathematics dedicated to this!) It's simple, but you also confidently claimed something that was trivial to disprove.

Your claim is true if and only if your tests have complete coverage of the input space. So, your claim is only true if you've done formal verification of your code. Which is what I said in the beginning and what Dijkstra claimed as well.

josephg · 2026-03-12T07:35:15 1773300915

I mean, yeah, I thought that was obvious. If you want to be a pedant:

> Tests only pass if both implementations of your software behave the same way in the exact area being tested.

As I said in my comment above, tests are a crappy second implementation. The test in your example isn’t even defined outside the input range of (2,2). Tests are a stochastic tool. Tests can prove the presence of bugs, not their absence. Completeness isn’t something tests alone can provide. But in the choice between yolo coding and yolo coding plus tests, you’re obviously going to get fewer bugs with tests.
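A rough Python sketch of the "presence, not absence" point, assuming the spec is multiplication and the buggy implementation is exponentiation. The suites and the `failures` helper are hypothetical, made up for illustration.

```python
def multiply(a, b):
    # Intended behavior (assumed spec): multiplication.
    return a * b


def power(a, b):
    # Buggy implementation: exponentiation instead of multiplication.
    return a ** b


# Each case is (inputs, expected output) under the multiplication spec.
narrow_suite = [((2, 2), 4)]
wider_suite = narrow_suite + [((2, 3), 6), ((0, 5), 0)]


def failures(f, suite):
    # Return (inputs, expected, actual) for every case where f disagrees.
    return [
        (args, expected, f(*args))
        for args, expected in suite
        if f(*args) != expected
    ]


narrow_result = failures(power, narrow_suite)  # bug goes unnoticed
wider_result = failures(power, wider_suite)    # bug shows up at (2, 3)

print(narrow_result)
print(wider_result)
```

The narrow suite reports nothing wrong with `power`, while the wider suite catches it at (2, 3). Note that (0, 5) still passes for both functions, so even the wider suite only samples the input space; it shows a bug exists without showing that no others do.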