Alignment
The field of research focused on making AI systems behave in accordance with human values and intentions.
Alignment is the challenge of ensuring AI systems do what humans actually want them to do, not merely what they are literally instructed to do. A well-aligned AI understands nuance, respects human values, admits uncertainty, and avoids harmful actions even when not explicitly told to. RLHF (reinforcement learning from human feedback) and Constitutional AI are two widely used alignment techniques.
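To make RLHF concrete, below is a minimal sketch of its reward-modeling step, assuming PyTorch: a reward model is trained so that responses humans preferred score higher than rejected ones, using the standard Bradley-Terry pairwise loss. The function name and the toy reward values are illustrative, not from any particular library.

import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor,
                    reward_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry pairwise loss: penalize the reward model whenever
    # the human-preferred response does not outscore the rejected one.
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy example: scalar rewards the model assigned to two response pairs.
chosen = torch.tensor([1.2, 0.4])    # rewards for human-preferred responses
rejected = torch.tensor([0.3, 0.9])  # rewards for rejected responses
loss = preference_loss(chosen, rejected)

The trained reward model is then used as the optimization target for the policy (the assistant itself), which is the "reinforcement learning" half of RLHF.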
As AI systems become more capable, alignment becomes more critical: a highly capable but misaligned system can cause more harm than a less capable, well-aligned one. This is why AI labs invest heavily in alignment research. From a user's perspective, alignment is why modern AI assistants can engage in nuanced conversation rather than blindly executing potentially harmful instructions.
Anthropic: Core views on AI safety - https://www.anthropic.com/research
Wikipedia: AI alignment - https://en.wikipedia.org/wiki/AI_alignment