There is a particular kind of anxiety in the current discussion around AI-assisted programming. The fear is not only that code has become cheap. It is that the thing many programmers attached their identity to has become cheap with it. If a model can generate implementation quickly, then the old dignity of the craft can appear to dissolve into prompt and output. What used to feel intimate, difficult, and earned starts to feel industrial, and for some people that feels less like leverage than dispossession.
That anxiety is real. But it mistakes the visible layer of the work for the whole of it.
The part that has remained invariant for me is steering.
For the last six months (since September 2025, when I went all-in on AI-augmented development), the models have improved substantially. They produce better code, make better local decisions, and recover from more mistakes than they did even a few months ago. But the need for human judgment has not receded with that progress. If anything, it has become more important. Faster execution does not reduce the value of engineering judgment. It increases the cost of misalignment. When a system can generate large amounts of plausible work very quickly, the main question is no longer whether it can produce code. The question is whether that code is converging on the right reality.
That distinction is not theoretical. It shows up in ordinary engineering work. Let me give you a couple of examples...
While establishing a baseline benchmark recorder in lockd, the implementation converged on something that looked correct from a narrow point of view. The measurements were stable. The run was coherent. The output appeared useful. By the local standard of benchmark correctness, it was doing fine. The problem was that it took over an hour to run. At that point the bottleneck was no longer the system being measured, but the benchmark itself. The implementation had optimized for local measurement quality while dropping a more important system-level constraint: developer iteration velocity.
That matters because a benchmark is not valuable in the abstract. It is valuable insofar as it can participate in the actual tempo of development. A benchmark that is too heavy to run regularly may be methodologically respectable, but it is poorly aligned with the work it is supposed to support. Once that misalignment was made explicit, the task became clear. The benchmark did not need to be discarded. It needed to be steered toward the right constraint hierarchy. Preserve the invariants that make the numbers meaningful, but reshape the implementation so it fits into real engineering time. The result is better than either extreme. We now have baseline coherence and fast iteration. We can benchmark routinely, not ceremonially, and that means regressions can be caught as part of development rather than as a separate event.
The same pattern appeared in liblockdc, which is the upcoming C client library for lockd, but in a different form. Here the issue was not temporal fit but domain shape.
The first implementation worked, but it worked in the wrong shape. It was too flat, too parameter-heavy, and too close to the transport model to serve well as a long-term SDK (almost as if it had been written for assembly rather than C). That would not have corrected itself through further generation. The agent would simply have continued iterating on the same surface. The refactor happened because a human in the loop recognized that the real issue was not functional correctness but domain model fidelity, and steered the implementation accordingly. The better interface did not emerge automatically. It was authored through judgment.
Instead of making every operation reconstruct state manually, the API became handle-oriented and domain-local. A client acquires a lease. The lease carries the relevant identity and lifecycle context. Operations live on the thing they semantically belong to. The result is clearer locality of behavior, less repetitive field plumbing, and a lifecycle model that is easier to reason about in real integration code. That is the substantive difference between an interface that merely works and one that is likely to remain usable.
This snippet illustrates the pre-refactor API shape. The implementation worked, but it was not especially idiomatic across real integration scenarios. Whereas the Go SDK for lockd was developed with AI assistance back in September 2025, this C library was developed with OpenAI Codex using GPT-5.4 in high reasoning mode, and it still required human steering toward a more idiomatic and durable API surface.
memset(&acquire_request, 0, sizeof(acquire_request));
acquire_request.key = "orders/42";
acquire_request.owner = "payments";
acquire_request.ttl_seconds = 60;

rc = lockdc_client_acquire(client, &acquire_request, &acquire_response, &error);
if (rc != LOCKDC_OK)
{
    lockdc_client_close(client);
    lockdc_error_cleanup(&error);
    return rc;
}

memset(&update_request, 0, sizeof(update_request));
update_request.key = acquire_response.key;
update_request.namespace_name = acquire_response.namespace_name;
update_request.lease_id = acquire_response.lease_id;
update_request.txn_id = acquire_response.txn_id;
update_request.fencing_token = acquire_response.fencing_token;
update_request.body = document;
update_request.body_length = strlen(document);
update_request.content_type = "application/json";

rc = lockdc_client_update(client, &update_request, &update_response, &error);
if (rc != LOCKDC_OK)
{
    lockdc_acquire_response_cleanup(&acquire_response);
    lockdc_client_close(client);
    lockdc_error_cleanup(&error);
    return rc;
}

// And so it continues...
I have been away from C for nearly twenty years, so this was not a case of simply dropping back into idiomatic C from muscle memory. But this is exactly where an AI coding agent can be valuable: it let me rapidly recover a great deal of dormant C knowledge in the process, while turning judgment about the domain and interface into actual implementation. It did not replace the human role. It made it easier to reconnect intent, recalled knowledge, and execution. That is what made it possible to steer the library toward a more domain-local, idiomatic, and long-term viable API surface. Here is the version after the refactor, which illustrates concretely what taste and judgment contribute...
acquire.key = "orders/42";
acquire.owner = "payments";
acquire.ttl_seconds = 60;

rc = client->acquire(client, &acquire, &lease, &error);
if (rc != LC_OK) {
    client->close(client);
    lc_error_cleanup(&error);
    return rc;
}

rc = lc_json_from_string(document, &json, &error);
if (rc != LC_OK) {
    lease->close(lease);
    client->close(client);
    lc_error_cleanup(&error);
    return rc;
}

rc = lease->update(lease, json, NULL, &error);
lc_json_close(json);
if (rc != LC_OK) {
    lease->close(lease);
    client->close(client);
    lc_error_cleanup(&error);
    return rc;
}

rc = lease->describe(lease, &error);
if (rc != LC_OK) {
    lease->close(lease);
    client->close(client);
    lc_error_cleanup(&error);
    return rc;
}

/* lease fields are refreshed in place by describe() and update() */
printf("version=%ld etag=%s expires=%ld\n",
       lease->version,
       lease->state_etag != NULL ? lease->state_etag : "",
       lease->lease_expires_at_unix);

release.rollback = 0;
rc = lease->release(lease, &release, &error);
if (rc != LC_OK) {
    lease->close(lease);
    client->close(client);
    lc_error_cleanup(&error);
    return rc;
}

client->close(client);
lc_error_cleanup(&error);
return 0;
This is the point many people miss when they talk about AI and programming as if the only remaining choices were total delegation or romantic refusal. There is another mode, and it is the one that matters: delegate aggressively, but retain authorship. Let the model compress the mechanical work. Let it accelerate exploration. Let it generate the first pass, the second pass, even the tenth pass. But do not delegate responsibility for the shape of the system, the hierarchy of constraints, the quality of the interface, or the future maintenance burden of what is being built.
That is not a lesser form of craftsmanship. It is craftsmanship relocated to the layer that compounds.
The benchmark example is about preserving the temporal reality of development. The C library example is about preserving the conceptual reality of the domain. In both cases, the missing piece was not syntax, nor basic correctness, nor the ability to emit plausible code. The missing piece was judgment about what the system is for, what constraints matter most, and what form will continue to hold under real use.
This is why I do not think programmers who care about code should become despondent. The dignity did not disappear. It moved. It now lives more visibly in the places where systems either become coherent or quietly accumulate friction: in interfaces, contracts, boundaries, lifecycle models, integration ergonomics, and the ongoing ability to notice when a local optimization has become a system-level mistake.
That is still code-level work. In many cases, it is the most important code-level work there is.
A bad boundary taxes every caller. A bad SDK shape amplifies confusion outward into every integration. A benchmark that ignores development cadence creates blindness by being too expensive to use. These are not managerial concerns floating above implementation. They are implementation concerns seen from the level that actually matters. They determine whether the code participates cleanly in a larger system or whether it merely exists.
The machine can generate implementations. It can often generate them very quickly. What it still does not reliably do is preserve the full hierarchy of intent across time: the interaction between clarity and maintainability, between measurement and iteration, between local explicitness and long-term usability, between what technically works and what will remain feasible when embedded in other systems and other teams. That is where steering remains essential.
And steering is not secondary work. It is not administrative overhead around the real act of programming. It is the act of programming when the system has become large enough, fast enough, and consequential enough that local correctness is no longer sufficient. It is the work of deciding what matters, noticing what drifted, elevating the real bottleneck, and reshaping the implementation accordingly.
If anything deserves professional pride now, it is that. Not that every line was typed by hand, but that the resulting system is more coherent, more usable, and more durable than the machine would have made it on its own.