lingo.lol is one of the many independent Mastodon servers you can use to participate in the fediverse.
A place for linguists, philologists, and other lovers of languages.

#ComputerScience

6 posts · 6 participants · 0 posts today

@edutooters
Reflecting on #teaching #HighSchool #ComputerScience led to this question: "What would be the best first language to teach computer science?" (CS != programming)
Constraints:
1. It must have a formal and accepted definition, and be open source & widely available
2. It must embody all the elements (& issues) of "classical" 3G languages
3. It must be accessible to teachers and students alike, worldwide (this would be in the context of a global syllabus, CIE)

Boosts welcome!

🔴 **Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation**

“_While we show that integrating CoT monitors into the reinforcement learning reward can indeed produce more capable and more aligned agents in the low optimization regime, we find that with too much optimization, agents learn obfuscated reward hacking, hiding their intent within the CoT while still exhibiting a significant rate of reward hacking._”

Baker, B. et al. (2025) Monitoring reasoning models for misbehavior and the risks of promoting obfuscation. arxiv.org/abs/2503.11926.

#AI #ArtificialIntelligence #LLM #LLMS #ComputerScience #Obfuscation #Preprint #Academia #Academics @ai @computerscience

arXiv.org: Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation

Mitigating reward hacking--where AI systems misbehave due to flaws or misspecifications in their learning objectives--remains a key challenge in constructing capable and aligned models. We show that we can monitor a frontier reasoning model, such as OpenAI o3-mini, for reward hacking in agentic coding environments by using another LLM that observes the model's chain-of-thought (CoT) reasoning. CoT monitoring can be far more effective than monitoring agent actions and outputs alone, and we further found that a LLM weaker than o3-mini, namely GPT-4o, can effectively monitor a stronger model. Because CoT monitors can be effective at detecting exploits, it is natural to ask whether those exploits can be suppressed by incorporating a CoT monitor directly into the agent's training objective. While we show that integrating CoT monitors into the reinforcement learning reward can indeed produce more capable and more aligned agents in the low optimization regime, we find that with too much optimization, agents learn obfuscated reward hacking, hiding their intent within the CoT while still exhibiting a significant rate of reward hacking. Because it is difficult to tell when CoTs have become obfuscated, it may be necessary to pay a monitorability tax by not applying strong optimization pressures directly to the chain-of-thought, ensuring that CoTs remain monitorable and useful for detecting misaligned behavior.

@SocraticEthics The #BrainDrain from #Hungary to the #UnitedStates during #WorldWar2 and the following #ColdWar catapulted #American #Mathematics and #ComputerScience (John von Neumann…), #Chemistry (George Olah…), and #Nuclear #Physics (Leo Szilard…) into undisputed world leadership in #science and #technology.

That brain drain resulted in the Russia-aligned country full of morons that you see today.

#DonaldTheDeplorable is also a #Putin puppet and starting a brain drain FROM here.

#TIL about the Sloot Digital Coding System: the inventor claimed that he could compress a movie to just 8 KB.

It attracted a lot of high-profile investors, but just a few days before Sloot was due to sell his invention, he died of a heart attack.

A hidden hard drive was found in his demonstration devices. Posthumous analysis showed that his code used a known technique and that he probably faked his demonstrations to buy time, thinking he could improve the algorithm beyond its mathematical limits.

#compression #computerscience #math #theranos #programming #algorithms
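
The "mathematical limits" mentioned in the Sloot post can be made concrete with a counting (pigeonhole) argument: a lossless scheme needs a distinct output for every distinct input, and there are only so many outputs of at most 8 KB. The sketch below is illustrative only. The 700 MiB movie size and all function names are assumptions, not figures from the post, and the bound says nothing about how well typical video compresses, only that no lossless scheme can shrink every possible input that far.

```python
# Counting sketch: how many distinct inputs can an 8 KB output tell apart?
KIB = 1024


def size_in_bits(num_bytes: int) -> int:
    """Number of bits in a file of the given size."""
    return num_bytes * 8


def log2_max_outputs(max_output_bytes: int) -> int:
    """Upper bound on log2(count of distinct outputs of at most this size).

    There are 2**(n + 1) - 1 bit strings of length <= n bits,
    which is strictly less than 2**(n + 1).
    """
    return size_in_bits(max_output_bytes) + 1


def log2_fraction_recoverable(input_bytes: int, max_output_bytes: int) -> int:
    """Upper bound on log2 of the fraction of all inputs of exactly
    `input_bytes` that could each receive their own distinct output."""
    return min(0, log2_max_outputs(max_output_bytes) - size_in_bits(input_bytes))


SLOOT_OUTPUT_BYTES = 8 * KIB           # the claimed 8 KB
ASSUMED_MOVIE_BYTES = 700 * KIB * KIB  # hypothetical 700 MiB movie file

if __name__ == "__main__":
    exponent = log2_fraction_recoverable(ASSUMED_MOVIE_BYTES, SLOOT_OUTPUT_BYTES)
    print(f"At most a 2**({exponent}) fraction of such inputs could be recovered.")
```

Under these assumptions the exponent is hugely negative, so almost every possible file of that size has no 8 KB representation at all, whatever algorithm is used.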

I'm thrilled to announce that I've been accepted as a student at #ColoradoChristianUniversityOnline! I'll be pursuing a B.S. in #ComputerScience. The school's class format seems much more manageable for me than other schools I've attended, and I'm feeling optimistic about this new chapter.

Now begins the challenging process of hunting for #scholarships. Unfortunately, I'm discovering just how inaccessible most scholarship websites are for screen reader users. If anyone knows of #accessible scholarship resources or platforms, I'd greatly appreciate your recommendations!