Theoretical Computer Science for AI Alignment … and More
In this terrifying time for the world, I’m delighted to announce a little glimmer of good news. I’m receiving a large grant from the wonderful Open Philanthropy, to build up a group of students and postdocs over the next few years, here at UT Austin, to do research in theoretical computer science that’s motivated by AI alignment. We’ll think about some of the same topics I thought about in my time at OpenAI—interpretability of neural nets, cryptographic backdoors, out-of-distribution generalization—but we also hope to be a sort of “consulting shop,” to whom anyone in the alignment community can come with theoretical computer science problems.
I already have two PhD students and several undergraduate students working in this direction. If you’re interested in doing a PhD in CS theory for AI alignment, feel free to apply to the CS PhD program at UT Austin this coming December and say so, listing me as a potential advisor.
Meanwhile, if you’re interested in a postdoc in CS theory for AI alignment, to start as early as this coming August, please email me your CV and links to representative publications, and arrange for two recommendation letters to be emailed to me.
The Open Philanthropy project will put me in regular contact with all sorts of people who are trying to develop complexity theory for AI interpretability and alignment. One great example of such a person is Eric Neyman—previously a PhD student of Tim Roughgarden at Columbia, now at the Alignment Research Center, the Berkeley organization founded by my former student Paul Christiano. Eric has asked me to share an exciting announcement, along similar lines to the above:
The Alignment Research Center (ARC) is looking for grad students and postdocs for its visiting researcher program. ARC is trying to develop algorithms for explaining neural network behavior, with the goal of advancing AI safety (see here for a more detailed summary). Our research approach is fairly theory-focused, and we are interested in applicants with backgrounds in CS theory or ML. Visiting researcher appointments are typically 10 weeks long, and are offered year-round.
If you are interested, you can apply here. (The link also provides more details about the role, including some samples of past work done by ARC.) If you have any questions, feel free to email hiring@alignment.org.
Some of my students and I are working closely with the ARC team. I like what I’ve seen of their research so far, and would encourage readers with the relevant background to apply.
Meantime, I of course continue to be interested in quantum computing! I’ve applied for multiple grants to continue doing quantum complexity theory, though whether or not I can get such grants will alas depend (among other factors) whether the US National Science Foundation continues to exist, as more than a shadow of what it was. The signs look ominous; Science magazine reports that the NSF just cut by half the number of awarded graduate fellowships, and this has almost certainly directly affected students who I know and care about.
Meantime we all do the best we can. My UTCS colleague, Chandrajit Bajaj, is currently seeking a postdoc in the general area of Statistical Machine Learning, Mathematics, and Statistical Physics, for up to three years. Topics include:
- Learning various dynamical systems through their Stochastic Hamiltonians. This involves many subproblems in geometry, stochastic optimization and stabilized flows which would be interesting in their own right.
- Optimizing task dynamics on different algebraic varieties of applied interest — Grassmanians, the Stiefel and Flag manifolds, Lie groups, etc.
If you’re interested, please email Chandra at bajaj@cs.utexas.edu.
Thanks so much to the folks at Open Philanthropy, and to everyone else doing their best to push basic research forward even while our civilization is on fire.