December 09, 2025In Defense of CuriosityAt the NeurIPS Mechanistic Interpretability Workshop, I was asked to give an opinion on Neel Nanda's recent blog post on "pragmatic interpretability." I chose to respond by recounting the story of Venetian glassmaking. Continue reading "In Defense of Curiosity"Posted by David at 01:08 PM
| Comments (4)
October 31, 2025A Halloween Investment ThoughtWhy are AI stocks rising so quickly? Maybe it's because AI investors (and CEOs etc) are the ones who talk with ChatGPT all day. And ChatGPT has convinced them all that their investment ideas are all genius. Spooky! Happy Halloween. Posted by David at 08:43 PM
| Comments (0)
October 21, 2025The Two MerchantsA traveler came upon two merchants selling magical lamps at a crossroads. The first merchant proclaimed: "My lamp contains a perfect genie! It will deduce your deepest desires from watching your every action - how you spend your gold, where you walk, what makes you smile. Without you speaking a single wish, it will fulfill what you truly want!" The second merchant said quietly: "My lamp contains no genie, only a strange light. When you hold it up to examine your life, it shows you the threads connecting your choices to their consequences, the paths you didn't see, the weight of what you carry. It answers no wishes but asks questions you haven't thought to ask." The crowd flocked to the first merchant. "At last!" they cried, "No more agonizing over what to wish for! The perfect genie will know!" But an old woman approached the second merchant. "I've had many wishes granted in my life," she said. "What I lacked was understanding which wishes were worth making." Years later, the traveler returned. Those who bought the first lamp lived in beautiful palaces that felt strangely empty, surrounded by everything they'd unknowingly revealed they wanted - endless sweets for those who snacked when nervous, mountains of gold for those who hoarded pennies, solitude for those who avoided neighbors. They had become caricatures of their unconsidered habits. Those who bought the second lamp lived more simply but with purpose. They had learned to see their true faces, not in a mirror, but in understanding. The lamp had taught them to wish wisely by first teaching them to see clearly. The story above was written by Claude when I exposed it to my own research and writing, and then asked it to critique the modern AI conception of AI alignment. In our field, we are busy building the first kind of lamp, chasing the belief that AI can figure out how to do what people want. That vision is clearly myopic. 
It avoids the central challenge of being human, which is: we don't really know what we want. The amazing opportunity in AI is that it might actually be able to help us develop the insight and wisdom to understand ourselves. We don't need a genie to think for us. We need AI that can improve our thinking. We need AI that can serve as the second kind of lamp. Posted by David at 02:32 PM
| Comments (0)
October 01, 2025When the Exits Close: Lessons from Financial Survival Under Authoritarian Regimes. The Hamburg Banker's Dilemma. In the autumn of 1933, Max Warburg sat in his mahogany-paneled office at M.M. Warburg & Co., the bank his family had operated in Hamburg since 1798. Outside, Nazi brownshirts marched through the streets, but inside the bank, Warburg clung to a belief that would ultimately cost him dearly: This too shall pass. Five years later, in August 1938, he would finally flee Germany after the forced sale of his family's bank to "Aryan" associates. His American cousins, who had begun moving assets abroad when Hitler first rose to power, preserved much of their wealth. The difference between partial and total loss? The courage to act on early warnings rather than wait for certainty. This pattern—the gradual tightening of financial controls, the windows of opportunity that slowly close, the devastating cost of optimism—repeats across history with remarkable consistency.... Posted by David at 05:47 AM
| Comments (0)
September 20, 2025The Truth is Our SuperpowerThe firing of Jimmy Kimmel is shocking, but in the wake of the firings of Lisa Cook and Susan Monarez, it is also ridiculous. It perfectly showcases the weakness of authoritarianism. There is no silencing the truth: every time Trump fires another truth-teller, he looks more fearful and incompetent. An obese emperor with no clothes. Continue reading "The Truth is Our Superpower"Posted by David at 09:42 AM
| Comments (2)
August 28, 2025Starvation, Cook, and BaconIn 1932, Stalin's authoritarian central planning program wrecked Soviet agriculture, starving millions. As the crisis deepened, Trofim Lysenko, a mediocre agronomist backed by Stalin, rejected established genetics as "bourgeois pseudoscience" and reorganized Soviet agriculture around his ideological theories. Plants could be trained by their environment, he claimed. Wheat could learn to resist cold through exposure. Scientific evidence was capitalist propaganda. When the predictable disasters struck, Lysenko did not admit error. Instead, he blamed the scientists he had silenced. The geneticists he had purged were "wreckers" and "saboteurs" whose treachery explained why his methods failed. Thousands of real scientists were imprisoned or executed while millions of Soviet citizens starved. What happened next reveals a three-step method that authoritarians use today.... Continue reading "Starvation, Cook, and Bacon"Posted by David at 09:01 PM
| Comments (0)
August 13, 2025Perplexity Chrome would be a DisasterPerplexity has offered to purchase the Chrome browser if the DoJ forces a split from Google. Posted by David at 07:45 PM
| Comments (3)
May 25, 2025Black Box, Blood MoneyIn May 2025, in a luxury Manhattan townhouse, a man hung suspended over a five-story stairwell. His captors—led by crypto investor John Woeltz—had already beaten him and held a gun to his head...
Continue reading "Black Box, Blood Money"
Posted by David at 08:46 AM
| Comments (0)
April 13, 2025Credibility, not CapabilityThe most important thing we build in technology and academia is not capability, but credibility. It does not matter how fast we calculate, how smart we are, or the brilliance of the products or papers we make, if we cannot answer the question "Why should anybody believe anything we say?" Continue reading "Credibility, not Capability"Posted by David at 10:03 AM
| Comments (2)
March 29, 2025MisgivingsIn my life I have paid a lot of tax. And every year, after participating in debates about how to spend it all—town, state, and country—I have been proud to write each tax check even when I disagree with the decisions. This is the first year I have had serious misgivings. Posted by David at 05:28 AM
| Comments (0)
March 25, 2025Freedom and PurposeI spent 20 years making products in industry before switching to teach in academia, so I am frequently asked to compare the two paths by PhD students (and prospective PhD students) who are facing the choice between them. Here is my answer: industry and academia fundamentally have two different missions, and when choosing between them you should think about what kind of impact you would like to have on the world.... Posted by David at 07:17 AM
| Comments (1)
February 21, 2025What it Means to be HumanMy academic field of artificial intelligence continues to barrel ahead, unrelenting, towards the goal of surpassing human cognition. So in my work I frequently confront the question: what do we envision as the purpose of the human in the world that we are creating? Already an AI can plan, reason, write, and solve complex problems faster and better than a human mind. As these capabilities continue to grow, what role do we envision for the humans? Continue reading "What it Means to be Human"Posted by David at 08:29 PM
| Comments (3)
March 28, 2024The Right Kind of Openness for AIThere is a false dichotomy between two alternatives facing us in the burgeoning AI industry today: "open" versus "closed." This dichotomy is being promoted by both sides: Closed-AI advocates (oddly, including the company named "Open AI") justifiably warn about the misuse risks posed by unregulated use of AI and the geopolitical risks posed by exfiltration of weights of large-scale pretrained models, but then they falsely imply that the only solution to these risks is to lock their AI behind an opaque service interface, with no visibility to the internals provided to outsiders. On the other hand, open-AI advocates (including Yann LeCun, one of the giants of our field) correctly point out the huge community benefits that come from transparency and competition, but then they make the mistake of assuming that benefits will be guaranteed if they throw their trained models over the wall to the public, releasing full model weights openly. Both sides are bankrolled by massive monetary investments and project the polished air of billion-dollar confidence. But the ugly truth is that the AI industry is built around an extraordinary uncertainty: although the industry has become expert in the science of creating AI, we are pitifully unequipped to meet the challenge of understanding AI. This unprecedented state of affairs is a direct outgrowth of the nature of modern machine learning: our clever training processes have created systems that contain orders of magnitude more complexity than has ever been created in software before, but no human has examined it. Beyond a superficial level, we do not currently understand what is good or bad or smart or stupid inside these systems. The long-term risk for humanity comes from our ignorance about the limitations, capabilities, and societal impacts of AI as we continue to develop it. Neither the open nor closed models on their own offer a credible path to cracking this problem. 
Thus we ask: what is the right kind of openness? What ecosystem will lead to a healthy AI industry, built on strong science, transparency, accountability, and innovation? In the talk and paper I have posted at resilience.baulab.info, I discuss the need for a middle path. We do not need to foreclose either open or closed strategies, but we need a framework of standards and services that will create healthy incentives for companies to pursue vigorous innovation, meaningful transparency, and safety in the public interest. Posted by David at 06:08 AM
| Comments (2)
March 16, 2024ReinventedIn my 2017 blog entry, Reinvention, I looked back to recount my jump from industry back to academia. Here is a video from the CSAIL 60th anniversary celebration where I finish telling my personal academic story about a career reinvention. If you watch it to the end, you can see the three big lessons about how to do research that I learned during my PhD - and how I learned those lessons. Continue reading "Reinvented"Posted by David at 05:27 PM
| Comments (2)
October 28, 2023Function Vectors in Large Language ModelsIn 1936, Alonzo Church made an amazing discovery: if a function can treat other functions as data, then it becomes so powerful that it can even express unsolvable problems. We know that deep neural networks learn to represent many concepts as data. Do they also learn to treat functions as data? In a new preprint, my student Eric Todd finds evidence that deep networks do contain function references. Inside large transformer language models (like GPT) trained on ordinary text, he discovers internal vectors that behave like functions. These Function Vectors (FVs) can be created from examples, invoked in different contexts, and even composed using vector algebra. But they are different from regular word-embedding vector arithmetic because they trigger complex calculations, rather than just making linear steps in representation space. Read and retweet the Twitter thread Posted by David at 11:17 AM
| Comments (0)
April 02, 2023Is Artificial Intelligence Intelligent?The idea that large language models could be capable of cognition is not obvious. Neural language modeling has been around since Jeff Elman’s 1990 structure-in-time work, but 33 years passed between that initial idea and first contact with ChatGPT. What took so long? In this blog I write about why few saw it coming, why some remain skeptical even in the face of amazing GPT-4 behavior, why machine cognition may be emerging anyway, and what we should study next. Read more at The Visible Net. Posted by David at 03:08 PM
| Comments (0)
March 28, 2023Catching UpToday, I received an email from my good college friend David Maymudes. David got his math degree from Harvard a few years ahead of me, and we have both worked at Microsoft and Google at overlapping times. He is still at Google now. We have both witnessed and helped drive major cycles of platform innovation in the industry in the past (David designed the video API for Windows and created the AVI format! And we both worked on Internet Explorer), so David is well aware of the important pieces of work that go into building a new technology ecosystem. From inside Google today, he is a direct witness to the transformation of that company as profound new approaches to artificial intelligence become a corporate priority. It is obvious that something major is afoot: a new ecosystem is being created. Although David does not directly work on large-scale machine learning, it touches on his work, because it is touching everybody. Despite being an outsider to our field, David reached out to ask some clarifying questions about some specific technical ideas, including RLHF, AI safety, and the new ChatGPT plug-in model. There is so much to catch up on. In response to David's questions, I wrote up a crash-course in modern large language modeling, which we will delve into in a new blog I am creating. Read more at The Visible Net. Posted by David at 05:44 AM
| Comments (0)
December 28, 2021Running Statistics for PytorchHere is runningstats.py, a useful little module for computing efficient online GPU statistics in Pytorch. Pytorch is great for working with small batches of data: if you want to do some calculations over 100 small images, all the features fit into a single GPU and the pytorch functions are perfect. But what if your data doesn't fit in the GPU all at once? What if they don't even fit into CPU RAM? For example, how would you calculate the median values of a set of a few thousand language features over all of Wikipedia tokens? If the data is small, it's easy: just sort them all and take the middle. But if they don't fit - what to do?
import datasets, runningstats
ds = datasets.load_dataset('wikipedia', '20200501.en')['train']
q = runningstats.Quantile()
for batch in runningstats.tally(q, ds, batch_size=100, cache='quantile.npz'):
    feats = compute_features_from_batch(batch)
    q.add(feats)  # dim 0 is the batch dim; dim 1 is the feature dim.
print('median for each feature', q.quantile(0.5))
Here, online algorithms come to the rescue. These are economical algorithms that summarize an endless stream of data using only a small amount of memory. Online algorithms are particularly handy for digesting big data on a GPU where memory is precious. runningstats.py includes running Stat objects for Mean, Variance, Covariance, TopK, Quantile, Bincount, IoU, SecondMoment, CrossCovariance, CrossIoU, as well as an object to accumulate CombinedStats.... Continue reading "Running Statistics for Pytorch"Posted by David at 02:23 PM
| Comments (0)
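To make the online-statistics idea concrete, here is a minimal sketch (plain Python, not the runningstats.py API) of Welford's classic one-pass algorithm: it tracks the mean and variance of a stream while storing only three numbers, no matter how long the stream is. The class name and method names here are illustrative, not part of any library.

```python
class RunningMeanVar:
    """Welford's online algorithm: one pass, O(1) memory."""
    def __init__(self):
        self.n = 0        # count of values seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # sum of squared deviations from the current mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        # population variance of everything seen so far
        return self.m2 / self.n if self.n else 0.0

stat = RunningMeanVar()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stat.add(x)
print(stat.mean, stat.variance())  # 5.0 4.0
```

The same pattern - a small state object with an `add` method fed one batch at a time - is what lets the Stat objects above digest all of Wikipedia without ever holding it in memory.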
Copyright 2025 © David Bau. All Rights Reserved.