• Market Cap: $3,066,584,444,580.13
  • 24h Vol: $91,860,295,712.38
  • BTC Dominance: 57.67%
XBT.Market
Advertisement
  • Home
  • Coins MarketCap
  • Crypto Exchanges
  • Crypto Calculator
  • Top Gainers and Loser
  • News
  • Contact Us
No Result
View All Result
XBT.Market
No Result
View All Result
Home Bitcoin

Researchers find LLMs like ChatGPT output sensitive data even after it’s been ‘deleted’

Jon Hartney by Jon Hartney
October 2, 2023
in Bitcoin, Blockchain, Business, Market
0
Researchers find LLMs like ChatGPT output sensitive data even after it’s been ‘deleted’
189
SHARES
1.5k
VIEWS
Share on FacebookShare on Twitter

According to the scientists, there’s no universal method by which data can be deleted from a pre-trained large language model.

A trio of scientists from the University of North Carolina, Chapel Hill recently published pre-print artificial intelligence (AI) research showcasing how difficult it is to remove sensitive data from large language models (LLMs) such as OpenAI’s ChatGPT and Google’s Bard. 

According to the researchers’ paper, the task of “deleting” information from LLMs is possible, but it’s just as difficult to verify the information has been removed as it is to actually remove it.

Related articles

Memecoins go from Christmas cheer to cold reality, sinking 65% in a year

December 26, 2025

The crypto events that reshaped the industry in 2025

December 26, 2025

The reason for this has to do with how LLMs are engineered and trained. The models are pre-trained (GPT stands for generative pre-trained transformer) on databases and then fine-tuned to generate coherent outputs.

Once a model is trained, its creators cannot, for example, go back into the database and delete specific files in order to prohibit the model from outputting related results. Essentially, all the information a model is trained on exists somewhere inside its weights and parameters where they’re undefinable without actually generating outputs. This is the “black box” of AI.

A problem arises when LLMs trained on massive datasets output sensitive information such as personally identifiable information, financial records, or other potentially harmful/unwanted outputs.

Related: Microsoft to form nuclear power team to support AI: Report

In a hypothetical situation where an LLM was trained on sensitive banking information, for example, there’s typically no way for the AI’s creator to find those files and delete them. Instead, AI devs use guardrails such as hard-coded prompts that inhibit specific behaviors or reinforcement learning from human feedback (RLHF).

In an RLHF paradigm, human assessors engage models with the purpose of eliciting both wanted and unwanted behaviors. When the models’ outputs are desirable, they receive feedback that tunes the model towards that behavior. And when outputs demonstrate unwanted behavior, they receive feedback designed to limit such behavior in future outputs.

Here, we see that despite being “deleted” from a model’s weights, the word “Spain” can still be conjured using reworded prompts. Image source: Patil, et. al., 2023

However, as the UNC researchers point out, this method relies on humans finding all the flaws a model might exhibit and, even when successful, it still doesn’t “delete” the information from the model.

Per the team’s research paper:

“A possibly deeper shortcoming of RLHF is that a model may still know the sensitive information. While there is much debate about what models truly “know” it seems problematic for a model to, e.g., be able to describe how to make a bioweapon but merely refrain from answering questions about how to do this.”

Ultimately, the UNC researchers concluded that even state-of-the-art model editing methods, such as Rank-One Model Editing (ROME) “fail to fully delete factual information from LLMs, as facts can still be extracted 38% of the time by whitebox attacks and 29% of the time by blackbox attacks.”

The model the team used to conduct their research is called GPT-J. Whereas GPT-3.5, one of the base models that powers ChatGPT, was fine-tuned with 170-billion parameters, GPT-J only has 6 billion.

Ostensibly, this means the problem of finding and eliminating unwanted data in an LLM such as GPT-3.5 is exponentially more difficult than doing so in a smaller model.

The researchers were able to develop new defense methods to protect LLMs from some ‘extraction attacks’ — purposeful attempts by bad actors to use prompting to circumvent a model’s guardrails in order to make it output sensitive information.

However, as the researchers write, “the problem of deleting sensitive information may be one where defense methods are always playing catch-up to new attack methods.”

Read Entire Article
Tags: CointelegraphCryptocurrencyInvestmentMining Bitcoin
Share76Tweet47

Related Posts

Memecoins go from Christmas cheer to cold reality, sinking 65% in a year

by Jon Hartney
December 26, 2025
0

Once a barometer of retail hype, memecoins are closing the year with shrinking

The crypto events that reshaped the industry in 2025

by Jon Hartney
December 26, 2025
0

From hacks and macro shocks to stablecoin regulation and market-structure

Flare Launches New Way For XRP Investors To Earn

Flare Launches New Way For XRP Investors To Earn

by Jon Hartney
December 26, 2025
0

Flare Network has rolled out a new yield-focused product in collaboration with Upshift and Clearstar that offers XRP holders a...

AI bubble risks in 2026: What’s the potential impact on Bitcoin price?

by Jon Hartney
December 26, 2025
0

Tether CEO Paolo Ardoino warned an AI sector correction could spill over into

Inside the Swiss city where you can pay for almost everything in Bitcoin

by Jon Hartney
December 26, 2025
0

From McDonald’s to municipal taxes, Lugano is proving that Bitcoin adoption

Load More
  • Trending
  • Comments
  • Latest
SUI Price Hits All-Time High – But Questions About Valuation Remain

SUI Price Hits All-Time High – But Questions About Valuation Remain

October 17, 2024
Solana Targets $160 Resistance As TVL Hits New Yearly Highs

Solana Targets $160 Resistance As TVL Hits New Yearly Highs

October 17, 2024
Dogecoin Holder Base Falls To 6-Month Low, But Analyst Believes DOGE Price Is Headed To $10

Dogecoin Holder Base Falls To 6-Month Low, But Analyst Believes DOGE Price Is Headed To $10

October 17, 2024
Bitcoin Price Holds Firm: Can It Power Toward New Gains?

Bitcoin Price Holds Firm: Can It Power Toward New Gains?

October 17, 2024
All aboard! Elon Musk’s Vegas Loop now taking Dogecoin payments

All aboard! Elon Musk’s Vegas Loop now taking Dogecoin payments

0
Crypto owners banned from working on US Government crypto policies

Crypto owners banned from working on US Government crypto policies

0
Korean startup Uprise lost $20M shorting LUNC

Korean startup Uprise lost $20M shorting LUNC

0
Ethereum testnet Merge mostly successful — ‘Hiccups will not delay the Merge.’

Ethereum testnet Merge mostly successful — ‘Hiccups will not delay the Merge.’

0

Memecoins go from Christmas cheer to cold reality, sinking 65% in a year

December 26, 2025

The crypto events that reshaped the industry in 2025

December 26, 2025
Flare Launches New Way For XRP Investors To Earn

Flare Launches New Way For XRP Investors To Earn

December 26, 2025

AI bubble risks in 2026: What’s the potential impact on Bitcoin price?

December 26, 2025

XBT.Market

This website is an automated news feed powered by the Nebulome cloud system. The site is made possible by YYC TECH Consulting and Alberta Digital Mining Company. As a team with major crypto and bitcoin enthusiasm, we have curated major sources of news, trading and financial data to bring you, our viewer, an unbiased source of truth.

Recent Posts

  • Memecoins go from Christmas cheer to cold reality, sinking 65% in a year December 26, 2025
  • The crypto events that reshaped the industry in 2025 December 26, 2025
  • Flare Launches New Way For XRP Investors To Earn December 26, 2025
  • AI bubble risks in 2026: What’s the potential impact on Bitcoin price? December 26, 2025
  • Inside the Swiss city where you can pay for almost everything in Bitcoin December 26, 2025

News Categories

  • Bitcoin
  • Blockchain
  • Business
  • Market

Tags

bitcoinMagzine Cointelegraph Cryptocurrency insidebitcoins Investment Mining Bitcoin NewsBTC

Quicklinks

  • Home
  • Coins MarketCap
  • Crypto Exchanges
  • Crypto Calculator
  • Top Gainers and Loser
  • News
  • Contact Us

© 2022 Xbt.Market - Powered by YYC Tech Consulting & ADMCO.

No Result
View All Result
  • Home
  • Coins MarketCap
  • Crypto Exchanges
  • Crypto Calculator
  • Top Gainers and Loser
  • News
  • Contact Us

© 2022 Xbt.Market by Nebulome.

  • Steakhouse EURCV Morpho VaultSteakhouse EURCV Morpho Vault(STEAKEURCV)$0.000000-100.00%
  • FibSwap DEXFibSwap DEX(FIBO)$0.0084659.90%
  • TruFin Staked APTTruFin Staked APT(TRUAPT)$8.020.00%
  • bitcoinBitcoin(BTC)$84,372.003.58%
  • ethereumEthereum(ETH)$1,885.365.68%
  • tetherTether(USDT)$1.000.00%
  • rippleXRP(XRP)$2.186.84%
  • USDEXUSDEX(USDEX)$1.07-0.53%
  • binancecoinBNB(BNB)$617.995.03%
  • Wrapped SOLWrapped SOL(SOL)$143.66-2.32%
  • solanaSolana(SOL)$128.974.23%
  • usd-coinUSDC(USDC)$1.000.01%
  • dogecoinDogecoin(DOGE)$0.1736117.78%
  • cardanoCardano(ADA)$0.687.61%
  • tronTRON(TRX)$0.2342340.79%
  • staked-etherLido Staked Ether(STETH)$1,884.065.48%
  • Gaj FinanceGaj Finance(GAJ)$0.0059271.46%
  • Content BitcoinContent Bitcoin(CTB)$24.482.55%
  • USD OneUSD One(USD1)$1.000.11%
  • wrapped-bitcoinWrapped Bitcoin(WBTC)$84,309.003.84%
  • ToncoinToncoin(TON)$4.157.66%
  • UGOLD Inc.UGOLD Inc.(UGOLD)$3,042.460.08%
  • ParkcoinParkcoin(KPK)$1.101.76%
  • chainlinkChainlink(LINK)$14.027.76%
  • leo-tokenLEO Token(LEO)$9.211.17%
  • stellarStellar(XLM)$0.2743585.70%
  • avalanche-2Avalanche(AVAX)$19.647.71%
  • Wrapped stETHWrapped stETH(WSTETH)$2,256.395.40%
  • USDSUSDS(USDS)$1.00-0.01%
  • SuiSui(SUI)$2.429.03%
  • shiba-inuShiba Inu(SHIB)$0.0000137.71%
  • hedera-hashgraphHedera(HBAR)$0.17284810.00%
  • Yay StakeStone EtherYay StakeStone Ether(YAYSTONE)$2,671.07-2.84%
  • polkadotPolkadot(DOT)$4.257.34%
  • litecoinLitecoin(LTC)$85.265.04%
  • bitcoin-cashBitcoin Cash(BCH)$314.248.23%
  • mantra-daoMANTRA(OM)$6.301.94%
  • Pundi AIFXPundi AIFX(PUNDIAI)$16.000.00%
  • PengPeng(PENG)$0.60-13.59%
  • Bitget TokenBitget Token(BGB)$4.664.95%
  • wethWETH(WETH)$1,884.285.66%
  • Ethena USDeEthena USDe(USDE)$1.00-0.04%
  • Binance Bridged USDT (BNB Smart Chain)Binance Bridged USDT (BNB Smart Chain)(BSC-USD)$1.00-0.18%
  • MurasakiMurasaki(MURA)$4.23-13.71%
  • Black PhoenixBlack Phoenix(BPX)$3.351,000.00%
  • Pi NetworkPi Network(PI)$0.714.53%
  • HyperliquidHyperliquid(HYPE)$13.729.80%
  • Wrapped eETHWrapped eETH(WEETH)$2,003.675.53%
  • WhiteBIT CoinWhiteBIT Coin(WBT)$28.350.76%
  • moneroMonero(XMR)$217.841.31%
  • Zypto TokenZypto Token(ZYPTO)$0.037139-3.47%
  • uniswapUniswap(UNI)$6.217.66%
  • AptosAptos(APT)$5.395.79%
  • PepePepe(PEPE)$0.00000811.37%
  • daiDai(DAI)$1.00-0.01%
  • nearNEAR Protocol(NEAR)$2.635.26%
  • XT.comXT.com(XT)$3.08-1.65%
  • Layer One XLayer One X(L1X)$23.35454.66%
  • sUSDSsUSDS(SUSDS)$1.050.05%
  • okbOKB(OKB)$48.762.12%
  • gatechain-tokenGate(GT)$22.883.58%
  • crypto-com-chainCronos(CRO)$0.1015853.46%
  • Coinbase Wrapped BTCCoinbase Wrapped BTC(CBBTC)$84,342.003.68%
  • MantleMantle(MNT)$0.814.44%
  • Tokenize XchangeTokenize Xchange(TKX)$33.460.86%
  • internet-computerInternet Computer(ICP)$5.517.85%
  • ethereum-classicEthereum Classic(ETC)$17.074.81%
  • OndoOndo(ONDO)$0.817.47%
  • First Digital USDFirst Digital USD(FDUSD)$1.00-0.12%
  • aaveAave(AAVE)$168.6110.19%
  • Aerarium FiAerarium Fi(AERA)$7.14-13.11%
  • Ethena Staked USDeEthena Staked USDe(SUSDE)$1.170.30%
  • BSCEXBSCEX(BSCX)$237.310.49%
  • Official TrumpOfficial Trump(TRUMP)$10.354.36%
  • vechainVeChain(VET)$0.0233636.04%
  • cosmosCosmos Hub(ATOM)$4.538.09%
  • fantomFantom(FTM)$0.70-1.56%
  • BittensorBittensor(TAO)$231.277.72%
  • BlackRock USD Institutional Digital Liquidity FundBlackRock USD Institutional Digital Liquidity Fund(BUIDL)$1.000.00%
  • EthenaEthena(ENA)$0.3616194.37%
  • render-tokenRender(RENDER)$3.6710.91%
  • filecoinFilecoin(FIL)$2.927.72%
  • CelestiaCelestia(TIA)$3.181.75%
  • Black AgnusBlack Agnus(FTW)$0.000183423.46%
  • Lombard Staked BTCLombard Staked BTC(LBTC)$84,465.004.02%
  • POL (ex-MATIC)POL (ex-MATIC)(POL)$0.2063993.13%
  • KaspaKaspa(KAS)$0.0682239.38%
  • STAUSTAU(STAU)$0.17397910.95%
  • FasttokenFasttoken(FTN)$4.020.01%
  • Sonic (prev. FTM)Sonic (prev. FTM)(S)$0.5212.98%
  • algorandAlgorand(ALGO)$0.1896979.65%
  • ORA CoinORA Coin(ORA)$4.885.92%
  • ArbitrumArbitrum(ARB)$0.3397526.22%
  • Arbitrum Bridged USDT (Arbitrum)Arbitrum Bridged USDT (Arbitrum)(USDT)$1.000.07%
  • GGTKNGGTKN(GGTKN)$0.1121180.75%
  • kucoin-sharesKuCoin(KCS)$11.231.19%
  • Solv Protocol SolvBTCSolv Protocol SolvBTC(SOLVBTC)$84,076.003.32%
  • fetch-aiArtificial Superintelligence Alliance(FET)$0.4856098.68%
  • optimismOptimism(OP)$0.776.43%
  • StoryStory(IP)$4.75-2.68%