> In general, when I talk to software folks about testing, I'm coming from such a different place that they immediately look at me like I'm an alien, so let's talk about how we tested at this hardware company I worked for, Centaur, which informs my biases about how I like to work. Some of the things that we did that were or are unorthodox in the software world are:
> - Hired dedicated QA / test engineers, with testing being a first-class career path on par with being a developer
> - No code review by default
> - Virtually no hand-written tests
> - Constant testing via what programmers sometimes called property based testing, randomized testing, fuzzing, etc., although we just called those tests (hand-written tests were called "hand tests").
> - Large regression test suite (3 months wall clock to execute on compute farm)
> - No unit tests
Has anybody here tried that (or a similar) approach? Especially going all-in on property-based testing and fuzzing with no unit tests.
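For anyone who hasn't seen the style the quote describes, here is a minimal sketch of a randomized test that checks an implementation against a simpler reference model instead of hand-picked cases. Everything in it is hypothetical and only for illustration: `sat_add_u8`, `reference_sat_add_u8`, and `run_random_tests` are made-up names, and this is not a claim about how Centaur's tests actually worked.

```python
import random


def sat_add_u8(a: int, b: int) -> int:
    """Hypothetical unit under test: 8-bit unsigned saturating add."""
    s = (a + b) & 0x1FF  # 9-bit sum; bit 8 is the carry-out
    return 0xFF if s & 0x100 else s


def reference_sat_add_u8(a: int, b: int) -> int:
    """Deliberately simple reference model to compare against."""
    return min(a + b, 255)


def run_random_tests(seed: int, iterations: int = 10_000) -> int:
    """Feed random operands to both versions; fail on the first mismatch."""
    rng = random.Random(seed)  # seeded so any failure can be replayed
    for i in range(iterations):
        a, b = rng.randrange(256), rng.randrange(256)
        got = sat_add_u8(a, b)
        want = reference_sat_add_u8(a, b)
        assert got == want, (
            f"seed={seed} iter={i}: sat_add_u8({a}, {b}) = {got}, expected {want}"
        )
    return iterations


if __name__ == "__main__":
    print(run_random_tests(seed=1234), "random cases passed")
```

The seed is part of the failure message so a mismatch can be reproduced exactly, which matters a lot more once the suite runs unattended on a farm.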
I tried that approach somewhere before. The initial results were promising, but we ran into political issues, so the idea was canned.
That is a massive amount of information even if we are being sloppy with it. You can read The Hobbit and the first Harry Potter book cover-to-cover and still have room to spare. I would deeply struggle to develop a world model this detailed for any business. Anything that needs to get more specific than these narratives can be a SQL query tool into the data warehouse, grep over the codebase, MS graph API lookup, etc.
Giving the business a balanced way to collaborate over this one shared model of the world is a new challenge I am beginning to engage with. I've also noticed that the world model will compound on itself in terms of self-detection of update opportunities. The more constraints there are, the more likely we appear to violate one.
I haven't even begun to try to comprehend how to use fuzz testing to improve the ability to find bugs, but it sounds really interesting. I've found mutation testing very useful for finding gaps in tests, so I can only imagine that fuzzing + LLMs might produce insane results.
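A toy sketch of what "finding gaps in tests" means with mutation testing: flip one operator in the code, rerun the tests, and see whether anything fails. The function `is_adult`, the `SwapGtE` transformer, and both test case lists are hypothetical; real mutation tools apply many kinds of mutations automatically.

```python
import ast

SOURCE = "def is_adult(age):\n    return age >= 18\n"


class SwapGtE(ast.NodeTransformer):
    """Mutation operator: rewrite every `>=` comparison as `>`."""

    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.Gt() if isinstance(op, ast.GtE) else op for op in node.ops]
        return node


def load(tree):
    """Compile a module AST and return the is_adult function it defines."""
    namespace = {}
    exec(compile(tree, "<mutation-demo>", "exec"), namespace)
    return namespace["is_adult"]


def suite_passes(fn, cases):
    """A 'test suite' here is just a list of (input, expected) pairs."""
    return all(fn(arg) == expected for arg, expected in cases)


original = load(ast.parse(SOURCE))
mutant = load(ast.fix_missing_locations(SwapGtE().visit(ast.parse(SOURCE))))

weak_cases = [(5, False), (40, True)]     # never probes the boundary
strong_cases = weak_cases + [(18, True)]  # adds the boundary value

if __name__ == "__main__":
    print("mutant survives weak suite:", suite_passes(mutant, weak_cases))
    print("mutant survives strong suite:", suite_passes(mutant, strong_cases))
```

A mutant that survives, as it does against `weak_cases`, points at a behavior no test pins down; here that is the boundary at 18.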
You should talk to https://www.mechanize.work/ for sponsorship/credits and about environments.
This blog is quite unreadable for 27/32" monitors.
Even with its issues, the latest models are going to disrupt the labor economics.
You're not likely to want to run Fable in a loop any more than you want to take a bunch of dollar bills and light them on fire. Every invocation of Fable has to be intentional, its context carefully managed. I feel like a babysitter.