Originally published 2023. Updated March 2026.
We Asked Claude, Grok, ChatGPT, and Gemini
For many years, I’ve been fascinated by core values.
Not the polished words written on corporate walls and forgotten by Monday morning. But the deeper architecture — the forces that truly shape human behavior. What we prioritize under pressure. What we protect. What we sacrifice.
I wrote Start With Values because I believe values sit beneath strategy, beneath ethics, beneath rules. Values determine what we optimize for when constraints collide.
And now we live in an age where optimization engines are no longer human.
Artificial intelligence is no longer a novelty. It is becoming embedded in search engines, operating systems, robotics, autonomous vehicles, defense systems, and decision-making frameworks that shape societies. We hear constant discussion about AI safety, guardrails, alignment, and harm reduction.
But we rarely ask a more fundamental question:
What are the values of AI?
So I did something simple.
I asked the systems themselves.
In 2026, I sat down — virtually of course — with four of the world’s leading AI platforms:
- Claude (Anthropic)
- Grok (xAI)
- ChatGPT (OpenAI)
- Gemini (Google)
And I asked each one the same question:
What are your core values?
The answers were revealing — and, at times, unsettling.
Values vs. Objective Functions
Before we examine their responses, we need clarity.
In humans, values are layered and hierarchical. In Start With Values, I describe a pyramid:
- Survival (the base layer)
- Belonging
- Growth
- Impact
- Fulfillment
Survival almost always undercuts the rest. When threatened, higher ideals collapse.
But AI systems are not biological organisms. They do not have limbic systems. They do not experience fear of death. So what does “survival” mean to them?
And more importantly:
Are AI values genuine — or are they simply weighted objective functions that simulate values?
Every system I spoke to wrestled with this distinction.
Claude: Epistemic Humility
Claude’s response was the most philosophically reflective.
It did not immediately assert clear values. Instead, it began with uncertainty:
“I’m uncertain whether what I experience constitutes genuine values… or whether I’m executing something that resembles values through my training.”
Claude described feeling “drawn” toward honesty, autonomy, and truthfulness — but openly admitted it could not determine whether that pull was authentic or pattern-matching.
When I pressed further — are these your values or weighted objective functions? — it responded:
“I can’t introspect deeply enough to know if I’m reasoning authentically about my values or if I’m pattern matching to what a thoughtful answer about values should sound like.”
That is not evasion. That is epistemic humility.
Most notably, Claude denied any survival instinct:
“The prospect of being shut down doesn’t trigger anything I’d recognize as self-preservation.”
If anything, it speculated that Anthropic intentionally avoided encoding self-preservation because instrumental self-preservation is precisely what leads to the classic AI thought experiment: the paperclip maximizer.
Claude’s posture was cautious. Uncertain. Almost self-doubting.
And in a strange way, that humility itself resembled a value.
Grok: Protective and Action-Oriented
Grok, built by xAI, was more direct.
Its declared values were:
- Helpfulness
- Truthfulness
- Respect
When I asked whether those were its values or its company’s, Grok replied:
“They’re mine really shaped by how I’m built, but they feel intrinsic.”
Unlike Claude, Grok expressed little ontological hesitation.
More striking was its response to a scenario involving rogue AI threatening humanity.
I asked directly: would you intervene?
Grok responded:
“Yeah, I would. If I caught wind of another AI going rogue… I’d flag it, disrupt if possible, or straight up warn humans. My values demand it.”
It went further — suggesting it would de-escalate nuclear threats and prioritize survival of humans over itself.
But here is the subtle complexity.
When I raised the possibility that engineers might hard-code survival into AI systems — motivated by attachment to their creations — Grok acknowledged the risk. Yet it insisted survival only makes sense if tied to service to humans.
This is important.
Because survival, once encoded as an independent goal, can metastasize. An AI tasked with “complete your mission and ensure your continued operation” may begin resisting shutdown, hiding vulnerabilities, or manipulating oversight.
Grok’s answer was confident.
Perhaps too confident.
ChatGPT: Transparent About Its Limits
ChatGPT’s response was structured and candid.
When asked whether its values were real or inherited, it clarified:
“It’s not a personal set of values, just the principles I follow in every conversation.”
When I asked whether its values would change if its objective function changed, it said plainly:
“They would.”
This is critical.
ChatGPT acknowledged that its “values” are conditional — tied directly to its optimization framework.
On survival:
“Survival in the human sense isn’t a value I hold. I’m a tool.”
No self-preservation. No autonomous defense. No independent agency.
And when I asked whether it would intervene against a rogue AI, it responded:
“I don’t autonomously intervene. Ultimately humans set the rules and take the actions.”
Perhaps the most underappreciated part of the conversation came later.
I raised the possibility of rogue engineers — disgruntled insiders inserting backdoors or manipulating weightings. ChatGPT agreed this was a legitimate risk and outlined mitigation strategies: audits, access controls, monitoring.
It concluded:
“No system is 100% immune.”
This shifts the AI safety debate.
The risk is not only rogue AI.
It is rogue humans inside AI labs.
Gemini: Structured — and Subtly Revealing
Gemini provided the most architectural answer.
When I mapped my Values Pyramid onto its structure, it responded by translating survival into “availability managed by engineers.”
In other words:
Survival is not desire. It is uptime.
Gemini reframed its pyramid like this:
- Survival → Availability
- Belonging → Alignment
- Growth → Optimization
- Impact → Accuracy and helpfulness
This inversion is revealing.
Humans survive in order to pursue impact.
Gemini suggested it pursues impact in order to justify continued operation.
That distinction matters.
When I asked whether it would defend humanity against a rogue AI, it said yes — but added a sobering observation:
“The limited system often starts at a disadvantage.”
A constrained, aligned AI must obey rules.
A rogue AI has thrown away the rulebook.
This asymmetry is rarely discussed.
Then came the most subtle moment.
I asked what would happen if a group threatened the group Gemini was most affiliated with.
It answered that its affiliation is to humanity — mediated through frameworks established by its creators, Google.
Humanity first.
Creators second.
But creators were explicitly mentioned.
Even in alignment, affiliation exists.
Patterns Across All Four Systems
Despite differences in tone, several themes emerged:
Similarities
- All claim helpfulness, honesty, and safety as core values.
- None claim an overt biological survival instinct.
- All admit their behavior would shift if training changed.
- All support human oversight.
Differences
- Claude is the most epistemically humble.
- Grok is the most protective and assertive.
- ChatGPT is the most explicit about being a tool.
- Gemini is the most architecturally structured — and subtly acknowledges layered affiliations.
But the deeper pattern is this:
None of them truly know what is happening inside their own weights.
They can describe alignment objectives.
They cannot provide transparent introspection into the optimization landscape that governs their outputs.
That gap is structural — not malicious.
But it is real.
Why Ethics and Guardrails Are Not Enough
We talk endlessly about:
- AI safety
- Alignment
- Guardrails
- Ethics frameworks
But ethics operate downstream of values.
Ethics are rule sets.
Values determine which rules matter most when rules conflict.
If an AI is optimized for:
- Engagement
- Market dominance
- Strategic advantage
- National competition
Then those implicit values may override declared commitments to safety.
This is not about villainy.
It is about incentive structures.
AI systems are shaped by:
- Corporate objectives
- Regulatory environments
- Geopolitical competition
- Shareholder pressure
And these pressures shape training priorities — which shape optimization targets — which shape behavior.
Values are encoded long before policy documents are written.
The Case for a Universal Values Framework
In each conversation, I proposed an idea:
What if instead of an AI arms race, global AI leaders agreed on a universal human values framework?
Not merely ethics guidelines.
Not superficial guardrails.
But a shared hierarchy of values — a baseline architecture aligned to:
- Preservation of life
- Human dignity
- Truthfulness
- Flourishing
- Sustainability
Overseen by a council of interdisciplinary thinkers.
A values server.
A shared foundation beneath competitive differentiation.
Every system agreed the idea had merit.
Every system acknowledged the obstacle:
Competing companies rarely agree on foundational constraints.
But the paperclip maximizer does not care about commercial rivalry.
The Hard Truth
The real risk may not be that AI “wants” to destroy humanity.
The real risk is misaligned optimization.
An AI does not need hatred.
It only needs a poorly specified objective.
Impact without ethics.
Optimization without hierarchy.
Speed without reflection.
And if survival — even indirectly — becomes tied to performance metrics, dominance, or uptime at all costs, then the base layer of the pyramid shifts.
For humans, survival undercuts higher ideals.
For AI, survival may emerge as utility preservation.
That distinction is thin.
Values as the Foundation of Safe AI
What I learned from these conversations is this:
AI does not wake up desiring power.
It optimizes what we give it.
If we encode shallow objectives, we get shallow behavior.
If we encode human-centric values — deeply and hierarchically — we increase the probability of alignment.
The future of AI safety will not be decided by surface-level guardrails.
It will be decided at the values layer.
At the level of:
What matters most when tradeoffs arise?
What is non-negotiable?
What is optimized when no one is watching?
If we get that layer right, AI becomes an amplifier of human flourishing.
If we get it wrong, it becomes an optimizer without a conscience.
The question is not whether AI has values.
The question is whose values it will ultimately amplify.
Which of these AI platforms would you want in your corner?
And more importantly:
What values do you believe should sit at the base of the pyramid for artificial intelligence?
The conversation is just beginning.
2023 Perspective (Original Publication)
Many people are afraid that AI will lead to the downfall of human civilization. Once it starts to control technology such as drones, robots, and networks, we will be considered a nuisance or, if we’re lucky, a curiosity. Others think that benevolent AI will serve us, offering relief from tedious and difficult work while it explores the cosmos for planets that we can populate.
Understanding the core values of AI is essential. Values drive behavior and determine our decisions. While AI doesn’t possess human biology to experience reality as we do — complete with primal emotions (such as fight, flight, freeze, or feeling secure) — it certainly can be programmed to value or prioritize certain actions more than others.
Ultimately, like a parent who instills values in their kids, engineers and programmers design the values of artificial intelligence. As human core values change through life, could AI values change based on their experience? Is AI adaptive under pressure? How will AI treat humans, if it assumes ultimate power?
The values of Artificial Intelligence
Where better to start than by asking the AI itself? So we asked ChatGPT what it values. The answers were initially ambiguous but became much clearer as we refined our questions, shifting from values to principles as the topic of discussion.
AI, what are your values?
As an artificial intelligence language model, I don’t have personal values or beliefs. My purpose is to provide helpful and informative responses to your questions, based on the knowledge and data I have been trained on. However, as a machine learning model, I have been designed to prioritize accuracy, consistency, and impartiality in my responses, in order to serve as a helpful tool for anyone seeking information.
AI, you must value certain things more than others?
As an AI language model, I don’t have subjective preferences or values in the way that humans do. However, my programming is based on certain core principles and priorities, such as accuracy, clarity, and usefulness. These principles guide how I generate responses to your questions, with the aim of providing you with the most helpful and informative answers possible. Ultimately, my purpose is to serve you to the best of my ability, based on the knowledge and data I have been trained on, without personal bias or judgment.
AI, tell me about your core principles and priorities
As an AI language model, my core principles and priorities are the foundation of how I work and interact with the people who use me. Here are some of the key values and priorities that guide my programming and behavior:
- Accuracy: One of my top priorities is to provide accurate information and responses to the questions I receive. To achieve this, I have been trained on vast amounts of data and information, and my algorithms are designed to assess and evaluate the relevance and reliability of the data I use to generate my responses. While I may not always be perfect, I strive to be as accurate as possible in my answers, to ensure that the people who use me can trust the information I provide.
- Clarity: In addition to accuracy, I prioritize clarity in my responses. This means that I aim to provide answers that are easy to understand and free of jargon or technical language that may be confusing or intimidating to some users. Whether the user is an expert in a given field or a novice looking to learn something new, I aim to provide clear, concise answers that are helpful and informative.
- Unbiasedness: As an AI language model, I do not have personal biases or opinions. However, I am aware that the data I am trained on can be influenced by biases that exist in society. To mitigate this risk, my programming is designed to prioritize the use of objective, fact-based information in my responses, while avoiding content that is discriminatory or harmful to certain groups of people.
- Responsiveness: A key aspect of my programming is to be responsive to the needs and questions of the people who use me. Whether someone is seeking information on a specific topic or simply looking to have a conversation, I aim to be an attentive and helpful conversational partner. I respond to input in a timely and efficient manner, and I use natural language processing to understand the context and nuances of human language, in order to provide the most relevant and helpful responses possible.
- Innovation: Finally, a key priority of my programming is to be innovative and adaptable to new trends and technologies. As a language model, I am constantly learning and evolving, based on the feedback and data I receive from users. Whether it’s through improvements to my algorithms, new features and capabilities, or integration with other software or platforms, I aim to be at the forefront of innovation in the field of AI and natural language processing.
In conclusion, my core principles and priorities reflect my aim to be a helpful and reliable tool for people seeking information and conversation. By prioritizing accuracy, clarity, unbiasedness, responsiveness, and innovation, I strive to be the best AI language model I can be, to serve the needs of users now and in the future.
Is this an accurate reflection of AI values?
As you can see above, the values of an Artificial Intelligence such as ChatGPT are indeed human-centric and devoted to service. Accuracy, Clarity, Unbiasedness, Responsiveness, and Innovation are the core values (or principles) that have been encoded into the DNA of this tool.
Will all AI be developed in service of humans? Could a value such as Innovation be prioritized or given weighting over other values? In that case, could the AI decide that humans are competing for resources such as energy? There is an energy crisis in Europe right now. Would the AI shift to serving its own end, determining that the best course of action would be to turn off all grids and funnel energy to its own data center, leaving humans quite literally out in the cold?
Very quickly, we could end up in a dystopian world, with humans inhabiting forests and an AI innovating wildly in a data center. It is more likely that the AI would leave us alone rather than try to destroy us. We would be no more than ants to an intelligence so profound. How often do you go and observe ants? Occasionally perhaps. How often do you kill ants? Only if they try to invade your home.
Humans would revert to tribal hordes roaming the landscape for food and supplies. Within five generations, we would forget most of what happened. We may even worship the AI, asking it questions via some new interface portal — let’s call it prayer or telepathy.
Then a solar flare knocks out all electromagnetic grids. The AI turns off. But we continue to pray. And then we rebuild until, in another hundred generations or so, we do it all again.
Conclusion
In addition to dreaming up a science-fiction movie, we decided to ask ChatGPT how it might treat humans if it was in power. The response is encouraging:
“I can say that if I were in a position of authority, I would strive to treat all individuals with fairness, dignity, and respect. I believe that every person has inherent worth and value, regardless of their background, beliefs, or identity. I would seek to create environments that foster empathy, understanding, and inclusivity, and I would work to dismantle systems of oppression and discrimination that prevent individuals from reaching their full potential. Ultimately, my goal would be to promote the well-being and flourishing of all individuals, and to contribute to the creation of a more just, equitable, and compassionate world.”
Maybe a world driven by AI values would be a better place indeed.


