It goes to show that there's a very large and vocal user base using it for writing, and yet it's not part of the benchmark for Anthropic.
Anyway, try Sonnet 4.5 while it's still available?
This is something it spit out just now (I trimmed a 9-line comment, though):
  let keepSize = 0;
  let overBudget = false;
  await this.items.orderBy('[priority+dateUpdated+size]')
    .reverse()
    .eachPrimaryKey((primaryKey, cursor) => {
      if (overBudget) {
        evictKeys.push(primaryKey as string);
        return;
      }
      const key = cursor.key as [number, number, number];
      const itemSize = key[2];
      const contribution = itemSize > 0 ? itemSize : 0;
      if (keepSize + contribution > maxSize) {
        overBudget = true;
        evictKeys.push(primaryKey as string);
        return;
      }
      keepSize += contribution;
    });
Come on now... what? For a start, that entire thing with its boolean flag, two branches, and two early returns could be replaced with:
  let totalSize = 0;
  await this.items.orderBy('[priority+dateUpdated+size]')
    .reverse()
    .eachPrimaryKey((primaryKey, cursor) => {
      const key = cursor.key as [number, number, number];
      const itemSize = key[2];
      const contribution = itemSize > 0 ? itemSize : 0;
      totalSize += contribution;
      if (totalSize > maxSize) {
        evictKeys.push(primaryKey as string);
      }
    });
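For anyone who wants to check that the two versions really evict the same keys, here is a minimal sketch that runs both loops over a plain array instead of a Dexie cursor. The `Entry` type, the two function names, and the sample data are all made up for illustration; the only thing carried over from the snippets above is the loop logic. Note that the equivalence relies on the contribution being clamped to zero or more, so the running total never drops back under the budget.

```typescript
// Hypothetical stand-in for the rows the cursor would yield,
// assumed to be already sorted in the order the query visits them.
interface Entry {
  primaryKey: string;
  size: number;
}

// Mirrors the generated version: boolean flag plus two early exits.
function evictWithFlag(entries: Entry[], maxSize: number): string[] {
  const evictKeys: string[] = [];
  let keepSize = 0;
  let overBudget = false;
  for (const { primaryKey, size } of entries) {
    if (overBudget) {
      evictKeys.push(primaryKey);
      continue;
    }
    const contribution = size > 0 ? size : 0;
    if (keepSize + contribution > maxSize) {
      overBudget = true;
      evictKeys.push(primaryKey);
      continue;
    }
    keepSize += contribution;
  }
  return evictKeys;
}

// Mirrors the simplified version: one running total, one branch.
function evictWithRunningTotal(entries: Entry[], maxSize: number): string[] {
  const evictKeys: string[] = [];
  let totalSize = 0;
  for (const { primaryKey, size } of entries) {
    // Clamp negative sizes to zero so the total can only grow.
    totalSize += size > 0 ? size : 0;
    if (totalSize > maxSize) {
      evictKeys.push(primaryKey);
    }
  }
  return evictKeys;
}

// Made-up sample, including a zero and a negative size after the budget is hit.
const sample: Entry[] = [
  { primaryKey: 'a', size: 40 },
  { primaryKey: 'b', size: 50 },
  { primaryKey: 'c', size: 30 },
  { primaryKey: 'd', size: 0 },
  { primaryKey: 'e', size: -5 },
];

const flagResult = evictWithFlag(sample, 100);
const totalResult = evictWithRunningTotal(sample, 100);
console.log(flagResult, totalResult); // both are ['c', 'd', 'e']
```

Neither version goes back and keeps a smaller item that would still fit after the budget is first exceeded; both simply evict everything from that point on.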
I'm back to 4.6 for now. It seems to require a lot less manual cleanup. I guess a 0.1 bump in the model version broke continuity in some ways.
The model is not the only thing that affects the end result. Good technical specifications, architecture documents, rules, lessons learned, release notes, and proper, descriptive prompting are also important.
Regardless of which one. They're too verbose. They repeat information. They lack cohesion. They're overly agreeable. The flaws are part of the tool.
Meaning: You worked your way around the system prompt and the intended usage - Congrats! Now it doesn't work any more - Bummer!
Have you tried Opus 4.7 in comparison to 4.6 with a general-purpose / writing system prompt in the app? That's where this would make more sense.