← blog

What is the AI Alignment Problem?

By Fay Capstick

This week we shall be jumping back into the world of AI and investigating something that we haven’t fully considered yet in relation to the evolution of AI systems: The AI Alignment Problem, considered to be the hardest problem that AI faces. What is it, what does it mean, and can it be fixed? Join us to find out!

So what is the AI Alignment Problem?

The AI alignment problem was first described in 1960 by Norbert Wiener. 65 years on, the AI alignment is a current problem and field of research within the AI community. It comprises two main concerns.

Exactly and correctly specifying the purpose of the AI system, the outer alignment, and,
Making sure your AI system sticks to the specification set, the inner alignment.

Basically, AI alignment is all about making sure that your AI works towards your goals, specified wants, and ethics. It is the difference between AI doing what you want and doing what you meant.

So what is an Aligned AI?

An AI system is considered to be aligned if it is working to fulfil its programmed objectives. Basically, is it doing what you asked it to do in the way that you intended it to do it.

Why could AI misalignment be a problem?

AI is, and will increasingly be, a very powerful tool. And therefore it is important that it is doing the task it was instructed to do in the way that it has been asked to do it.

The problem comes when designers of systems have to think of every way possible that they might need to specify things so that there can be no confusion or misalignment. Further problems can occur, as loopholes might be created.

Misalignment of an AI system means that the systems might find a way to accomplish their set goals that inadvertently cause harm. For example, if you ask an AI to manufacture paperclips, an unintended consequence might be the AI using up all the world’s resources and converting them to paperclips. Far-fetched for sure, but it neatly illustrates the problem, as illustrated by Nick Bostrom, Oxford philosopher, in 2003 (more from him later).

However, once you start to consider the choices that AI systems that control self-driving cars might have to make, it becomes very important that things are aligned from day one. The more an AI can do, the more advanced it becomes, the more it controls, then the more we need to worry. Some AI researchers believe that misaligned AI could lead to a real risk to our civilisation.

Does alignment fix things?
Well yes and no. Alignment goes a very long way to trying to ensure there are no unintended consequences, however, even an aligned AI can make errors which could be moral ones.

Nick Bostrom, the academic who popularised the Simulation Argument, puts the AI alignment problem in with the overall problem of controlling AI, and talks of the problem of creating systems that share human preferences and values. This leads to another problem, in that different cultures can have quite differing preferences and values.

What does this all mean?

Really, it all means that, as with all aspects of the emergence of AI, that care needs to be taken before systems are released in the wild. Once out there, they cannot be taken back. As part of the AI safety summit, held in the UK at Bletchley Park last November, the idea of full testing before release was discussed. This is very important. Fully testing systems takes a lot of time and money (in the case of GPT-4, $100 million), but it is a stage that cannot be skipped.

What is being done to solve the problem?

Google’s DeepMind has a team working to solve the alignment problem, but we think that it should be a collaborative and worldwide effort due to the ramifications of getting it wrong. It certainly shouldn’t be left to profit-making businesses to solve.

OpenAI are researching the problem, which they refer to as ‘superalignment’. Their ambitious goal is to solve it by 2027 and they are using 20% of their computing power to do it.

It is possible that AI software will be able to align itself. This will speed things up, but could also lead to multiple problems. I see it as a potential check and balance for a human-aligned system, rather than as a replacement.

As I have said before, humanity only gets one chance to get AI right, and we therefore need to fully understand what we are doing and the consequences before we do it, and this means ensuring AI systems are fully aligned and that all possible scenarios are considered.

Final thoughts

At Parker Shaw, we have been at the forefront of the sector we serve, IT & Digital Recruitment and Consulting, for over 30 years. We can advise you on all your hiring needs.

If you are looking for your next job in the IT sector please check our Jobs Board for our current live vacancies at https://parkershaw.co.uk/jobs-board