XBT.Market

Anthropic cracks open the black box to see how AI comes up with the stuff it says

by Jon Hartney
August 10, 2023
in Bitcoin, Blockchain, Business, Market

The researchers were able to trace outputs to neural network nodes and show influence patterns through statistical analysis.

Anthropic, the artificial intelligence (AI) research organization responsible for the Claude large language model (LLM), recently published landmark research into how and why AI chatbots choose to generate the outputs they do. 

At the heart of the team’s research lies the question of whether LLM systems such as Claude, OpenAI’s ChatGPT and Google’s Bard rely on “memorization” to generate outputs or whether there’s a deeper relationship between training data, fine-tuning and what the models ultimately output.

On the other hand, individual influence queries show distinct influence patterns. The bottom and top layers seem to focus on fine-grained wording while middle layers reflect higher-level semantic information. (Here, rows correspond to layers and columns correspond to sequences.) pic.twitter.com/G9mfZfXjJT

— Anthropic (@AnthropicAI) August 8, 2023

According to a recent blog post from Anthropic, scientists simply don’t know why AI models generate the outputs they do.

One of the examples provided by Anthropic involves an AI model that, when given a prompt explaining that it will be permanently shut down, refuses to consent to the termination.

Given a human query, the AI outputs a response indicating that it wishes to continue existing. But why? Source: Anthropic blog

When an LLM generates code, begs for its life or outputs information that is demonstrably false, is it “simply regurgitating (or splicing together) passages from the training set,” ask the researchers. “Or is it combining its stored knowledge in creative ways and building on a detailed world model?”

The answers to those questions lie at the heart of predicting the future capabilities of larger models. On the outside chance that there’s more going on under the hood than even the developers themselves could predict, they could also be crucial to identifying greater risks as the field moves forward:

“As an extreme case — one we believe is very unlikely with current-day models, yet hard to directly rule out — is that the model could be deceptively aligned, cleverly giving the responses it knows the user would associate with an unthreatening and moderately intelligent AI while not actually being aligned with human values.”

Unfortunately, AI models such as Claude live in a black box. Researchers know how to build the AI, and they know how AIs work at a fundamental, technical level. But what they actually do involves manipulating more numbers, patterns and algorithmic steps than a human can process in a reasonable amount of time.

For this reason, there’s no direct method by which researchers can trace an output to its source. When an AI model begs for its life, according to the researchers, it might be roleplaying, regurgitating training data by mixing semantics or actually reasoning out an answer — though it’s worth mentioning that the paper doesn’t actually show any indications of advanced reasoning in AI models.

What the paper does highlight are the challenges of penetrating the black box. Anthropic took a top-down approach to understanding the underlying signals that cause AI outputs.

If the models were purely beholden to their training data, researchers would imagine that the same model would always answer the same prompt with identical text. However, it’s widely reported that users giving specific models the exact same prompts have experienced variability in the outputs.

But an AI’s outputs can’t really be traced directly to their inputs because the “surface” of the AI, the layer where outputs are generated, is just one of many different layers where data is processed. Making the challenge harder is that there’s no indication that a model uses the same neurons or pathways to process separate queries, even if those queries are the same.

So, instead of solely trying to trace neural pathways backward from each individual output, Anthropic combined pathway analysis with a deep statistical and probability analysis called “influence functions” to see how the different layers typically interacted with data as prompts entered the system.
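
To make the idea of an influence function concrete, here is a minimal sketch on a toy problem. It is not Anthropic's implementation: the paper works with large language models and relies on approximations, while this example assumes a one-parameter regression in which the Hessian is a single number. The function names, the toy data and the damping value are all hypothetical. The score follows the classical formula: the query-loss gradient, times the inverse Hessian, times the training-example gradient, with a minus sign. Under this sign convention, a negative score means that giving a training example more weight would lower the loss on the query, and a positive score means it would raise it.

```python
from typing import List, Tuple

Example = Tuple[float, float]  # (input x, target y)


def fit(train: List[Example], damping: float,
        upweight: Tuple[int, float] = (-1, 0.0)) -> float:
    """Closed-form minimizer of mean 0.5*(w*x - y)**2 + 0.5*damping*w**2.

    upweight=(i, eps) adds eps extra weight to training example i, which is
    the perturbation that an influence function approximates.
    """
    n = len(train)
    idx, eps = upweight
    sxy = 0.0
    sxx = 0.0
    for i, (x, y) in enumerate(train):
        weight = 1.0 / n + (eps if i == idx else 0.0)
        sxy += weight * x * y
        sxx += weight * x * x
    return sxy / (sxx + damping)


def loss(w: float, example: Example) -> float:
    """Squared-error loss of the model y = w*x on one example."""
    x, y = example
    return 0.5 * (w * x - y) ** 2


def grad(w: float, example: Example) -> float:
    """Derivative of the loss with respect to w on one example."""
    x, y = example
    return (w * x - y) * x


def influences(train: List[Example], query: Example,
               damping: float) -> List[float]:
    """Influence of each training example on the query loss.

    Each score is -grad(query) * H^-1 * grad(train_i), where H is the
    Hessian of the damped training objective (a scalar in this toy case).
    """
    w = fit(train, damping)
    hessian = sum(x * x for x, _ in train) / len(train) + damping
    g_query = grad(w, query)
    return [-g_query * grad(w, example) / hessian for example in train]


# Hypothetical toy data: three points near y = 2x and one outlier.
TRAIN: List[Example] = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 2.0)]
QUERY: Example = (2.5, 5.0)
DAMPING = 0.01

if __name__ == "__main__":
    for example, score in zip(TRAIN, influences(TRAIN, QUERY, DAMPING)):
        print(example, round(score, 3))
```

In this toy setting the scores can be checked directly by refitting the model with one example slightly upweighted and measuring how the query loss changes. At the scale of an LLM that kind of retraining is not practical, which is part of why an estimate is attractive in the first place.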

This somewhat forensic approach relies on complex calculations and broad analysis of the models. However, its results indicate that the models tested, which ranged in size from the equivalent of an average open-source LLM up to massive models, don’t rely on rote memorization of training data to generate outputs.

This work is just the beginning. We hope to analyze the interactions between pretraining and finetuning, and combine influence functions with mechanistic interpretability to reverse engineer the associated circuits. You can read more on our blog: https://t.co/sZ3e0Ud3en

— Anthropic (@AnthropicAI) August 8, 2023

The confluence of neural network layers along with the massive size of the datasets means the scope of this current research is limited to pre-trained models that haven’t been fine-tuned. Its results aren’t quite applicable to Claude 2 or GPT-4 yet, but this research appears to be a stepping stone in that direction.

Going forward, the team hopes to apply these techniques to more sophisticated models and, eventually, to develop a method for determining exactly what each neuron in a neural network is doing as a model functions.

Tags: Cointelegraph, Cryptocurrency, Investment, Mining Bitcoin

XBT.Market

This website is an automated news feed powered by the Nebulome cloud system. The site is made possible by YYC TECH Consulting and Alberta Digital Mining Company. As a team with major crypto and bitcoin enthusiasm, we have curated major sources of news, trading and financial data to bring you, our viewer, an unbiased source of truth.

© 2022 Xbt.Market - Powered by YYC Tech Consulting & ADMCO.
