Cristina Ferrero Castaño
Artificial intelligence learns to programme - will it replace developers?
OpenAI has unveiled Codex, its new artificial intelligence system that develops programming code based on a simple natural language prompt.
Marta Sanz Romero
El Español
August 23, 2021
In the last two years the world has witnessed an unprecedented explosion in the development of artificial intelligence systems. Natural language processing models are emerging that can write in the same style as renowned authors, translate into multiple languages with ease, and dabble in programming.
The OpenAI organisation is behind the most talked-about developments, although companies such as Google and public research projects follow closely behind with their own creations. The systems developed by OpenAI generate music, compete in video games and create works of art.
Now its researchers have unveiled Codex, an artificial intelligence programme that could revolutionise the world of programming. It is presented as the perfect assistant for code creators, the key to speeding up the creation of thousands of applications and programmes that will end up on the mobile phones of the entire population. But its future is so promising that it seems to threaten thousands of jobs.
Sam Altman, CEO of OpenAI, describes the new neural network-based system: "I think Codex comes close to what most of us really expect from computers: we say what we want and they do it."
Origin of Codex
Codex was born out of two key events in recent years: the creation of GPT-3 and the collaboration between OpenAI and Microsoft. GPT-3, released in 2020, is the third version of a natural language processing (NLP) model. NLP is the field of artificial intelligence that studies communication between people and machines in the language humans use.
Its algorithms are trained to recognise patterns in data and learn from examples. To put it simply, GPT-3 has 175 billion parameters, the internal values that are adjusted during training to 'teach GPT-3 to communicate'. On its release it was the largest language model built up to that point.
To train it, thousands of texts in all formats were collected from the Internet, from pages such as Wikipedia, for GPT-3 to analyse and learn to express itself as humans do. The result was surprising for its versatility, although it is not free of errors that need to be corrected.
Among its many capabilities is that of generating code. "They realised that it was quite skilled at doing many different things, and one of them was developing code, something that caused quite a stir, because it was one of the tasks that was thought to be difficult to automate in the short term," explains Nerea Luis, PhD in Computer Science and co-founder of T3chFest, to OMICRONO.
Among the thousands of texts that this artificial intelligence received, some program code slipped in, opening up a new development opportunity: a more feasible and less controversial use case than the fake news texts it could generate at a user's request. OpenAI made two decisions: to restrict access to GPT-3 to protect it from misuse, and to exploit this new capability.
The organisation thus reached an agreement with Microsoft, whereby the tech giant has exclusive access to the GPT-3 source code in exchange for a $1 billion investment and the ability to use GitHub, the largest code hosting platform, which Microsoft purchased in 2018, for OpenAI research projects.
Codex and GitHub Copilot
Under this collaboration, two new tools have emerged, Codex and GitHub Copilot. Both are designed to provide support for programmers and are powered by GitHub's huge database of software.
GitHub Copilot was released a little earlier than Codex, and some users are already getting access to work with it. One example is the videos on the DotCSV channel run by Carlos Santana, an expert in artificial intelligence, where you can see how Microsoft's artificial intelligence helps when developing code.
Codex, on the other hand, has only been seen in official OpenAI videos. The company has demonstrated the potential of this artificial intelligence on YouTube: the researchers give it natural language prompts and it writes the software for a simple computer game.
It can even translate from one programming language to another. Its main field of work is Python, but it is also proficient in more than a dozen languages, such as JavaScript, Go, Perl, PHP, Ruby, Swift, TypeScript and Shell.
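To make the idea concrete, here is a hand-written illustration of the kind of exchange described above. It is not actual Codex output: the task, the function name `count_words` and the suggested body are all hypothetical, chosen only to show a plain-English request followed by the sort of code an assistant of this type might be expected to propose.

```python
from collections import Counter

# Prompt, written by the developer in natural language:
#   "Count how many times each word appears in a text, ignoring case."


def count_words(text: str) -> dict:
    """Hand-written stand-in for the code an assistant might suggest."""
    # Lower-case the text so "The" and "the" count as the same word,
    # then split it on whitespace.
    words = text.lower().split()
    # Counter tallies the occurrences; convert it to a plain dict.
    return dict(Counter(words))


print(count_words("The cat saw the other cat"))
```

The developer would still need to read the suggestion and decide whether it really does what was asked; for instance, this version does not strip punctuation.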
Its main advantage is the time saving it can mean for professionals, explains Nerea Luis: "Instead of going to the code editor and looking up how to do this and then copying it, pasting it or writing it from scratch if you know how, you skip this step".
Just as technology already helps to finish sentences in emails, developers will in the future be supported by a virtual assistant capable of doing the most routine and tedious tasks for them.
A help rather than a threat
These demonstrations of both Codex and GitHub Copilot immediately raise a question in the minds of most observers: will programmers be out of a job? The answer seems to be clear among those who understand the system best: no.
Nerea Luis sends a reassuring message: artificial intelligence does not fly solo. "It's difficult for this to replace the work of a programmer. I think we are still at a point where not just anyone can use this and suddenly do magic," she explains.
For her, Codex will be a great companion for developers, a tool with which to speed up the initial part of projects, the most repetitive tasks or for the "integration of new services, what we call APIs, here it will be fundamental and will save a lot of time".
Judgement remains the responsibility of the flesh-and-blood professional. On this she clearly agrees with the company's own position. OpenAI speaks of two phases in any programming project: in the first phase, humans continue to identify the needs or problems and design a solution. Then, in the second phase, Codex generates the code to bring that idea to life.
The tip of the iceberg
It should be borne in mind that at no time do these generators understand what they are doing. They do not have the reading comprehension of humans; they only replicate texts based on the millions of examples they have received. "They are non-deterministic, that is, they can give you two different answers to the same question, both valid or perhaps one is incorrect," says Nerea Luis.
"In programming languages that ambiguity disappears," she explains, "but since you are writing in natural language, there is an interpretation issue that could lead to errors."
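The non-determinism Nerea Luis describes can be pictured with a toy sketch. It assumes, purely for illustration, that a model assigns a probability to each candidate answer for the prompt "add two numbers" and then picks one at random according to those weights. The candidates and the numbers below are invented, and real systems work over far larger sets of possibilities.

```python
import random

# Hypothetical probabilities a model might assign to candidate completions
# of the same prompt ("add two numbers"). The values are invented.
CANDIDATES = {
    "return a + b": 0.6,
    "return sum([a, b])": 0.3,
    "return a - b": 0.1,  # plausible-looking, but wrong
}


def sample_completion(rng: random.Random) -> str:
    """Pick one completion at random, weighted by its probability."""
    options = list(CANDIDATES)
    weights = list(CANDIDATES.values())
    return rng.choices(options, weights=weights, k=1)[0]


rng = random.Random()  # unseeded, so results vary from run to run
for attempt in range(3):
    print(attempt, sample_completion(rng))
```

Running the loop several times will usually show different answers for the same request, and occasionally the incorrect one, which is why a human still has to review the result.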
Still, the foundation that these models provide is really promising. "One of the things that OpenAI has done well is to be brave and take it out to test it with real use cases. It's the only way: if you don't do that with an intelligent system, it's impossible for it to learn in the long run," explains Nerea Luis.