Matt Asay
Contributing Writer

AI coding tools are your interns, not your replacement

AI can't replace bad developers because it only works for good developers. Recognizing when code from an LLM will fail requires skill and experience.

Credit: thinkhubstudio/Shutterstock

"AI models currently shine at helping so-so coders get more stuff done that works in the time they have," argues engineer David Showalter. But is that right? Showalter was responding to Santiago Valdarrama's contention that large language models (LLMs) are untrustworthy coding assistants. Valdarrama says, "Until LLMs give us the same guarantees [as programming languages, which consistently get computers to respond to commands], they'll be condemned to be eternal 'cool demos,' useless for most serious applications." He is correct that LLMs are decidedly inconsistent in how they respond to prompts. The same prompt will yield different LLM responses. And Showalter is quite possibly incorrect: AI models may "shine" at helping average developers generate more code, but that's not the same as generating usable code.

The trick with AI and software development is to know where the rough edges are. Many developers don't, and they rely too much on an LLM's output. As one HackerNews commentator puts it, "I wonder how much user faith in ChatGPT is based on examples in which the errors are not apparent … to a certain kind of user." To be able to use AI effectively in software development, you need sufficient experience to know when you're getting garbage from the LLM.

No simple solutions

Even as I type this, plenty of developers will disagree. Just read through the many comments on the HackerNews thread referenced above. In general, the counterarguments boil down to "of course you can't put complete trust in LLM output, just as you can't completely trust code you find on Stack Overflow, your IDE, etc."

This is true, so far as it goes. But sometimes it doesn't go quite as far as you'd hope. For example, while it's fair to say developers shouldn't put absolute faith in their IDE, we can safely assume it won't "prang your program." And what about basic things like not screwing up Lisp brackets? ChatGPT may well get those wrong, but your IDE? Not likely.

What about Stack Overflow code? Surely some developers copy and paste unthinkingly, but a savvy developer would first check the votes and comments around the code. An LLM gives no such signals. You take it on faith. Or not. As one developer suggests, it's smart to "treat both [Stack Overflow and LLM output as] probably wrong [and likely written by an] inexperienced developer." But even in error, such code can "at least move me in the right direction."

Again, this requires the developer to be skilled enough to recognize that the Stack Overflow code sample or the LLM code is wrong. Or perhaps she needs to be wise enough to only use it for something like a "200-line chunk of boilerplate for something mundane like a big table in a React page." Here, after all, "you don't need to trust it, just test it after it's done."

In short, as one developer concludes, "Trust it in the same way I trust a junior developer or intern. Give it tasks that I know how to do, can confirm whether it's done right, but I don't want to spend time doing it. That's the sweet spot." The developers who get the most from AI are going to be those who are smart enough to know when it's wrong but still somewhat beneficial.

You're holding it wrong

Back to Datasette founder Simon Willison's early contention that "getting the best results out of [AI] actually takes a whole bunch of knowledge and experience" because "a lot of it comes down to intuition." He advises experienced developers to test the limits of different LLMs to gauge their relative strengths and weaknesses and to assess how to use them effectively even when they don't work.

What about more junior developers? Is there any hope for them to use AI effectively? Doug Seven, director of AI developer experiences at Amazon Web Services, believes so. As he told me, coding assistants such as Amazon Q Developer, formerly CodeWhisperer, can be helpful even for less experienced developers. "They're able to get suggestions that help them figure out where they're going, and they end up having to interrupt other people [e.g., to ask for help] less often."

Perhaps the right answer is, as usual, "It depends."

And, importantly, the right answer to software development is generally not "write more code, faster." Quite the opposite, as I've argued. The best developers spend less time writing code and more time thinking about the problems they're trying to solve and the best way to approach them. LLMs can help here, as Willison has suggested: "ChatGPT (and GitHub Copilot) save me an enormous amount of 'figuring things out' time. For everything from writing a for loop in Bash to remembering how to make a cross-domain CORS request in JavaScript, I don't need to even look things up anymore, I can just prompt it and get the right answer 80% of the time."
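The cross-domain CORS request Willison mentions is a good example of the "figuring things out" category: the snippet is short, but knowing which options matter takes experience. A minimal sketch in plain JavaScript, with a hypothetical endpoint URL:

```javascript
// Build the options for a cross-origin fetch. An LLM will happily draft
// this; checking that `mode` and `credentials` are right for your case
// is still the human's job.
function corsRequestOptions(method = "GET") {
  return {
    method,
    mode: "cors",        // let the browser enforce the server's CORS headers
    credentials: "omit", // don't send cookies cross-origin by default
    headers: { Accept: "application/json" },
  };
}

// Usage (browser, or Node 18+ with global fetch); the URL is illustrative:
// fetch("https://api.example.com/data", corsRequestOptions())
//   .then((res) => res.json());
```

Getting this 80% right is easy; the remaining 20%, such as whether credentials belong on the request at all, is exactly where the experienced eye earns its keep.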

Knowing where to draw the line on that "80% of the time" is, as noted, a skill that comes with experience. But the practice of using LLMs to get a general idea of how to write something in, say, Scala, can be helpful to all. As long as you keep one critical eye on the LLM's output.

Matt Asay

Matt Asay runs developer marketing at Oracle. Previously Asay ran developer relations at MongoDB, and before that he was a Principal at Amazon Web Services and Head of Developer Ecosystem for Adobe. Prior to Adobe, Asay held a range of roles at open source companies: VP of business development, marketing, and community at MongoDB; VP of business development at real-time analytics company Nodeable (acquired by Appcelerator); VP of business development and interim CEO at mobile HTML5 start-up Strobe (acquired by Facebook); COO at Canonical, the Ubuntu Linux company; and head of the Americas at Alfresco, a content management startup. Asay is an emeritus board member of the Open Source Initiative (OSI) and holds a JD from Stanford, where he focused on open source and other IP licensing issues. The views expressed in Matt's posts are Matt's, and don't represent the views of his employer.