#132 – Nova DasSarma on why information security may be critical to the safe development of AI systems

By Robert Wiblin and Keiran Harris · Published June 14th, 2022

If a business has spent $100 million developing a product, it’s a fair bet that they don’t want it stolen in two seconds and uploaded to the web where anyone can use it for free.

This problem exists in extreme form for AI companies. These days, the electricity and equipment required to train cutting-edge machine learning models that generate uncanny human text and images can cost tens or hundreds of millions of dollars. But once trained, such models may be only a few gigabytes in size and run just fine on ordinary laptops.

Today’s guest, the computer scientist and polymath Nova DasSarma, works on computer and information security as part of the security team at the AI company Anthropic. One of her jobs is to stop hackers exfiltrating Anthropic’s incredibly expensive intellectual property, as recently happened to Nvidia. As she explains, given models’ small size, the need to store such models on internet-connected servers, and the poor state of computer security in general, this is a serious challenge.

The worries aren’t purely commercial though. This problem looms especially large for the growing number of people who expect that in coming decades we’ll develop so-called artificial ‘general’ intelligence systems that can learn and apply a wide range of skills all at once, and thereby have a transformative effect on society.

If aligned with the goals of their owners, such general AI models could operate like a team of super-skilled assistants, going out and doing whatever wonderful (or malicious) things are asked of them. This might represent a huge leap forward for humanity, though the transition to a very different new economy and power structure would have to be handled delicately.

If unaligned with the goals of their owners or humanity as a whole, such broadly capable models would naturally ‘go rogue,’ breaking their way into additional computer systems to grab more computing power — all the better to pursue their goals and make sure they can’t be shut off.

As Nova explains, in either case, we don’t want such models disseminated all over the world before we’ve confirmed they are deeply safe and law-abiding, and have figured out how to integrate them peacefully into society. In the first scenario, premature mass deployment would be risky and destabilising. In the second scenario, it could be catastrophic — perhaps even leading to human extinction if such general AI systems turn out to be able to self-improve rapidly rather than slowly, something we can only speculate on at this point.

If highly capable general AI systems are coming in the next 10 or 20 years, Nova may be flying below the radar with one of the most important jobs in the world.

We’ll soon need the ability to ‘sandbox’ (i.e. contain) models with a wide range of superhuman capabilities, including the ability to learn new skills, for a period of careful testing and limited deployment — preventing the model from breaking out, and criminals from breaking in. Nova and her colleagues are trying to figure out how to do this, but as this episode reveals, even the state of the art is nowhere near good enough.

In today’s conversation, Rob and Nova cover:

How good or bad information security is today
The most secure computer systems that exist today
How to design an AI training compute centre for maximum efficiency
Whether ‘formal verification’ can help us design trustworthy systems
How wide the practical gap is between AI capabilities and AI safety
How to disincentivise hackers
What listeners should do to strengthen their own security practices
Jobs at Anthropic
And a few more things as well

Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript below.

Producer: Keiran Harris
Audio mastering: Ben Cordell and Beppe Rådvik
Transcriptions: Katy Moore

Highlights

How organisations can protect against hacks

Nova DasSarma: I think that one of the biggest things you can do as an organization to avoid people getting hacked is to give them hardware that you control, and to have a unified fleet of hardware. For example, if you’ve got a fleet of computers that are all MacBooks that are all centrally secured and have encryption on the disk, then you can limit the damage that’s done by a hack to some Windows firewall or something like that to non-existent — because you’re only running on MacOS. And you are also limiting things like, I went on AliExpress to buy a BLÅHAJ plushie or something and it turned out that the vendor sent me a Word macro file that broke into my computer. But I did that on my personal device, and so now they would need to be able to jump another level into your corporate device. And you can do things like lock down the software that’s on there, lock down the websites they can access.
Nova DasSarma: Please use an ad blocker. Ads are important for the health of many companies, but if they’re a legit company, oftentimes you can support them in other ways. Ad networks are one of the easiest ways for an actor to inject malicious code onto a site that might otherwise be safe. So use an ad blocker.
Rob Wiblin: I guess Anthropic is a reasonably new organization, it’s a year or two old. Is there anything important that you did in setting up the way the systems work that secure it to some degree?
Nova DasSarma: Sure. I think that having corporate devices is pretty important. Another thing to think about is we used to talk about trusted networks and having things like a corporate network. And Google’s done some pretty good work with things like BeyondCorp, where you don’t really think about a VPN or something like that for security — you instead think about identity-based authentication. There’s no such thing as a “trusted network” where when you get on it, you no longer have to authenticate to grab files off of a shared SharePoint drive or something like that. You’re always authenticating at every step.
Nova DasSarma: The other thing that we do — that I suggest to every organization — is to think about single sign-on to make sure that you don’t have your users managing passwords for many services, juggling passwords around, where it can get very tedious for them to use a different password for every service. Using things like a password manager and single sign-on can help mitigate some of those flaws.

Top recommendations for personal computer security

Nova DasSarma: Number one I would say is use two-factor authentication everywhere you can. It doesn’t matter if your password gets compromised if you are able to deny access anyways, because they don’t have a hardware key. I think on that same front, use a password manager. Please use a password manager. You are decreasing the blast radius of any given compromised password by doing this. So do that.
Rob Wiblin: And that’s because if you use similar passwords or the same passwords in lots of different places, then if someone steals one, then they’ve stolen all of them.
Nova DasSarma: Exactly. Yes. And that being said, even if you make no other changes and you don’t actually use a password manager, certainly have your mail password be different… That is the skeleton key for resetting so many things. Very frustrating sometimes. And if we had to pick a third thing, I mentioned use an ad blocker. I’m going to say it again. I think that using an ad blocker is really, really, really important here in preventing random malicious code from being injected in your computer.
---
Rob Wiblin: Yeah. I imagine everyone listening to this show is going to be familiar with two-factor authentication, where you get that six-digit code that you take out of your phone or from SMS and plug it in. The thing that we ideally will be switching over to for almost everything is one of these hardware keys, which is a thing that you plug into your computer and you press a little button on it and it does a cryptographic operation to prove that you had that key on you.
Rob Wiblin: I think that there’s a bunch of different ways that’s a whole lot better, but one of them is that it’s a lot less vulnerable to phishing attacks. So effectively, even if you have two-factor authentication where you’re getting that six-digit number, if someone sends you an email and directs you to a fake login website, they can just immediately take both the password that you’ve put in and the six digits that you’ve put in — the second factor that you’ve put in — and just log in as if they were you somewhere else. And that is, I think, quite a common way to break into people’s accounts. But that is basically not possible with these U2F hardware keys that have become reasonably common, and you can now lock up your Google account and your Facebook account and many other accounts with those.
Nova DasSarma: Yeah. And I really recommend those. I also recommend getting two keys, because I think one of the concerns people often have is that they might lose this key, and that’s a reasonable concern. So you should have two of them. Almost any reputable site will let you register multiple keys. Keep one of them in a secure location and keep the other one on your keychain. And you’ll do a lot better.
Nova DasSarma: I also didn’t mention if you’re buying technology, try and ensure that you’re buying it from a vendor who is reputable. It’s very easy to buy things like USB cables and stuff like that from whoever has the lowest dollar amount on Amazon. Keep in mind that if you’re plugging something into your computer, you are giving whoever produced the device hardware access to it. Even something like a USB cable can be pretty compromised. You have no way of looking inside that cable really and checking if there’s a chip there that when your computer’s away will turn into a keyboard to start typing some stuff there or something like that. And that’s not a theoretical attack — we absolutely see these in the wild.
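
For readers curious about how the six-digit codes Rob mentions are produced, here is a minimal sketch of the time-based one-time password scheme standardised in RFC 6238, built on the counter-based scheme in RFC 4226. It is an illustration rather than a description of any particular site's implementation. The function names `hotp` and `totp` and the sample secret are chosen for this example; the secret is the published RFC test value, not a real credential.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Counter-based one-time password, following RFC 4226."""
    # MAC the counter (8 bytes, big-endian) with the shared secret.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select an offset.
    offset = mac[-1] & 0x0F
    # Read 4 bytes at that offset and clear the top bit.
    chunk = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    # Keep the last `digits` decimal digits, left-padded with zeros.
    return str(chunk % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based variant (RFC 6238): the counter is the current time window."""
    return hotp(secret, unix_time // step, digits)


# RFC test secret; at Unix time 59 the code falls in time window 1.
print(totp(b"12345678901234567890", 59))  # prints 287082
```

The code depends only on the shared secret and the clock, not on which website is asking for it. That is why, as Rob describes, a fake login page can collect a valid code and replay it elsewhere, whereas a hardware key ties its response to the site it is talking to.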

State of the art in information security

Rob Wiblin: My perception, as someone who takes a slight amateur interest in information security issues, is that the state of the art is very bad. That we do not really have reliable ways of stopping a really advanced, well-funded adversary from stealing data, if this is something that they’re willing to invest a lot of human capital in. Is that kind of right?
Nova DasSarma: I think that’s kind of right. I’ve got a story here around this. A state that will not be named had an attack that was in the news recently, that was a zero-click vulnerability on iMessage. A “zero-click vulnerability” is one where the user doesn’t have to take any actions for them to be compromised. And this had to do with something called the JBIG2 compression algorithm, which you might have heard about, because back in the day, Xerox used to use this for copiers. It’s a compression algorithm, which means that you can copy things faster. But it turns out that if you turn the compression up too high, it turns zeros to nines and vice versa, which is quite bad for numerics.
Nova DasSarma: That being said, JBIG2 was also the culprit in this case, where their compression algorithm is dynamic — which means that you can specify patterns on the fly. It turns out that if you construct a file that has the JBIG2 codec in it, then you can construct logical gates out of this. Which means that in theory, it’s Turing complete — and in practice, it was Turing complete. So to deliver this vulnerability, they produced a computer within the JBIG2 decompression algorithm to deliver the payload to these phones.
Nova DasSarma: And that’s the sort of thing where you could theoretically have defended against this, but the way that you defended against this was least access — so not being able to access anything on your phones, or not having phones. Both of these things are really quite difficult to implement in an organization above a certain size that doesn’t have a very, very strong security mindset.
Nova DasSarma: So that’s on the state access side. That being said, the thing that works the most is always going to be a social attack. So something where you meet someone at a party, and they seem nice, and they become your friend. And then you let them into your building when you maybe shouldn’t have done that, and they plug a USB into your system, and you’re done. We talk about physical access being the end of the line in security oftentimes. So that being said, yes.
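
Nova's point that logic gates are enough to build a computer can be illustrated with a toy example. The sketch below is not the exploit she describes and has nothing to do with JBIG2 itself; it only shows, under that assumption, how a single primitive gate (NAND here, chosen for this example) can be composed into the other gates and then into an adder. All function names are hypothetical.

```python
def nand(a: int, b: int) -> int:
    """The only primitive: returns 0 when both inputs are 1, otherwise 1."""
    return 1 - (a & b)


# Every other gate is built purely from NAND.
def not_(a: int) -> int:
    return nand(a, a)


def and_(a: int, b: int) -> int:
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))


def xor(a: int, b: int) -> int:
    return and_(or_(a, b), nand(a, b))


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add three bits, returning (sum bit, carry-out bit)."""
    partial = xor(a, b)
    total = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return total, carry_out


def add(x: int, y: int, width: int = 8) -> int:
    """Ripple-carry addition of two unsigned integers, one bit at a time."""
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result  # overflow beyond `width` bits is discarded


print(add(5, 6))  # prints 11
```

Once arithmetic like this is available, registers and control flow can be built from the same pieces. This is the sense in which a file format that lets an attacker express gates can end up hosting arbitrary computation.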

Motivated 14-year-old hackers

Rob Wiblin: I guess I have this stereotype from the past that computer security is bad enough that a motivated 14-year-old who hasn’t been to university yet, but just is really into computers, can probably do some interesting hacking, break into systems that you’d be kind of surprised that they could get into. But I wonder whether that might actually be an outdated stereotype, and whether perhaps things have improved sufficiently that a 14-year-old actually might struggle to do anything interesting at this point. Do you know where we stand on that?
Nova DasSarma: I think that stereotype is still quite accurate. Broadly, there is more software than there used to be. So a lot of the targets that were on that lower end of the security spectrum, there just are more of them. I think that until we find ways to create secure systems by default, instead of having to do security as more of an afterthought, we are going to continue to see situations where a script kiddie with a piece of software that they downloaded off of GitHub can do a vulnerability scan and deface some website or something like that. I think it’s a lot harder for them than it used to be to break into things like whitehouse.gov or something like that.
Rob Wiblin: Yeah, I see. Maybe the top end has gotten more secure as this has become more professionalized, but there’s so many more things on computers now in general that the number of things that are not secure is still plenty.
Nova DasSarma: Exactly, yes. And I think in some ways this is good — having systems that kids are able to break into is in fact a good thing. But we’ve seen some really cool stuff in terms of websites where you’ve got a Capture the Flag scenario, where you’re meant to try and break into one level and then it gets to the next level. Then there’s some key that you have to find for the next one. And these are actually really, really fun. I think it’s a great way to get kids interested in security. I would obviously not condone somebody trying to break into arbitrary websites, but certainly there are tools that are actually fun to do this with.
Rob Wiblin: How illegal is it to break into a website or something that doesn’t matter that much, just as a matter of getting practice and training? Assuming you don’t do any damage whatever?
Nova DasSarma: Very illegal and you shouldn’t do it. But I would say that if you’re interested in trying to do some kind of vulnerability testing, I would contact that website and ask them. Because a lot of Silicon Valley mindset is to ask for forgiveness, not permission. Computer security and data losses is not one of those things. This is what one would call a crime. I don’t recommend it.

Is the typical engineer working on non-safety research increasing or decreasing the odds of an artificial intelligence-related catastrophe?

Nova DasSarma: You know what, that’s a really hard question to answer. I think that if you come at things from the perspective that all capabilities work of any kind is negative, and it is of itself increasing the odds of an AGI-related catastrophe, then that would be an answer to your question. But that’s not my model. In some ways, I think that having capabilities that are within organizations that have pledged to treat things responsibly are probably decreasing these odds.
Nova DasSarma: Things like the OpenAI APIs for accessing models mean that more people have access to language models in a way where there is a gatekeeper — where there is a layer of safety and a way of imposing values onto the users of that model in a way that is fundamentally not true if you have a large number of actors. So I think that it’s very hard to say. I would say that I think certainly there are safety researchers at OpenAI, certainly there are safety researchers at DeepMind, and I think that those organizations also are thinking very thoughtfully about these things. And I’m hopeful that they are decreasing the odds of an AGI-related catastrophe. If you made me answer that question, I think that would be my answer there.
Rob Wiblin: Yeah, it’s certainly the case that it’s not only possible for the safety orientation or the thoughtfulness or the niceness of the actors working on AGI to improve. It could also get worse over time, or people can just become more scattered and more arms race-y over time, which is definitely a factor that complicates this question.
Nova DasSarma: And I think that there’s in some ways an evaporative cooling effect if you say that you’re never going to work at a place that has any capabilities, because then all you have working there are people who are purely interested in, “Let’s crank it to 11. Let’s drive this forward.” I think that having safety researchers there is important. And having this sort of collegiality with other organizations, and having standards, and being able to talk to each other about these things is important. So I think that the typical engineer there is probably decreasing these odds, just out of a matter-of-fact consolidation of intellectual capital and capabilities within a smaller number of folks who can then cooperate.

Interesting design choices with big ML models

Rob Wiblin: Something we haven’t talked about yet that I’m really excited to learn more about is what interesting design choices there are when you’re putting together a cluster, to try to get as much useful compute when you’re doing big ML models. What are the interesting optimizations that you could make?
Nova DasSarma: I love this. This is one of the best parts of my job. So for making a cluster for ML models, there are two basic components: you’ve got these units that can do floating-point operations and matrix multiplication, and you’ve got cables that are connecting them together at a certain speed. And you’ve got some memory, that sort of thing. So the design choices that you’re making are tradeoffs between mostly price and how much bandwidth you have between these units. The other thing is a design choice on the software end: are you writing your software in a way where it can easily be distributed over multiple GPUs and things like that. So that’s also a design choice, though mostly on the software side.
Nova DasSarma: And some logistical choices. You want to try and run your stuff where the power is cheap and the real estate is cheap. It turns out that rent for a large number of servers is actually a big cost. You also want to think about how easy it’s going to be to expand. Hofvarpnir made a pretty big mistake starting out. As a bit of background, almost all of my systems that I run for myself are run on hardware that I own, in people’s basements and stuff. It’s this global network of lots of computers that are all talking to each other, but they’re all on residential connections and they have interruptions and stuff like that. And this turns out to be really cheap when you’re a college student. It’s a great idea for anyone who’s trying to work on a system, and they’re trying to do something scrappy — I recommend doing that first.
Nova DasSarma: But it turns out that time is the limiting factor in some ways. So when we first got our EA Infrastructure money, we were like, “OK, we’re going to buy all these components ourselves and assemble.” And then we had some technical difficulties, and we had some difficulties with the provider not having the right width for the racks and stuff like that, and all sorts of really weird integration issues. And it turns out that it was better in the end to go with an integrator for future expansions, because it means that we can ask them, “We’re going to use these standardized units, and I just want four more of them. We’re just going to put them right next to each other,” and that sort of thing.
Nova DasSarma: Going back to the design thing though, the bandwidth does become a concern there, because you have a limit to how much bandwidth a certain switch can carry, and you might want to get a more expensive switch if you think that your cluster’s going to expand more. So that’s one of those things that’s a more direct design choice.

Side-channel attacks

Rob Wiblin: A vulnerability that I’ve always worried about is that you are inserting passwords, including often your password to your main password manager, into a browser. And all of the extensions that you have within that browser, I think can kind of see those passwords, or they can see the keystrokes that you’re entering into the website. And Chrome extensions, Firefox extensions have a record of being regularly compromised or regularly sold to bad actors, who then use them potentially to steal passwords. Are there a lot of these just gaping holes in the way that ordinary people use computers that are making them much more vulnerable than they really ought to be?
Nova DasSarma: Yeah. In security, we talk about these as side-channel attacks, where the primary channel would be breaking into your bank and the side channel is when you put a keylogger on somebody’s desktop so you can grab their password instead. Certainly extensions are a big concern here. I use quite a few extensions. I think that this is definitely a thing that’s quite useful. I am also in a role where I am being paid to be professionally paranoid, and so I read the code that’s being added to those. Trying to limit the number of them that you’re using, trying to keep an eye on what’s happening there is important.
Nova DasSarma: I would say that browsers are more secure than I certainly thought they were back in the day. Chrome especially has had a lot of work done by a lot of people working full time to sandbox execution of arbitrary code. When you think about programs, oftentimes they are something that’s written by somebody else that’s running on your computer, where your data is.
Nova DasSarma: And the web is increasingly like this. We’re recording right now on Riverside FM. It’s got video, it’s got audio streaming, it’s uploading files, it’s downloading files. It’s able to do all sorts of really, really exciting things. And this is inside the browser. If it was something that you had asked me to download, I would’ve been a lot more concerned. I think that the JavaScript sandboxing ecosystem has gotten very, very advanced. People have put a lot of thought into how to do smart things with it.
Nova DasSarma: I think that browsers in particular are oftentimes more secure than things that you’re running unaudited on your laptop. This is actually something where desktop operating systems have taken a page out of mobile’s book though: sandboxing, by default, is something that’s true on many apps and things like that. Permission dialogues for requesting access to information were not a thing on desktops really, because we didn’t start out thinking about that.
Nova DasSarma: The permission scheme for files on a Linux or Unix operating system has a set of permissions for read, write, and execute for user, group, and everybody. And for the longest time, “everybody” could access everything. This was expected: you were inside of a university, everybody was trusted. And moving to this model where things aren’t trusted by default has been very painful, but I think that browsers have been leading the way on that. So that’s pretty exciting.
Rob Wiblin: As far as I understand what you’re saying, I guess in the bad old days, we had this issue that if you loaded up a website, Internet Explorer or whatever browser you were using was not sufficiently good at sandboxing — which I guess is kind of constraining the code that’s running within that webpage to just interact inside that webpage, inside the browser. Instead, they could frequently find ways to get their tentacles into other parts of the disk to run code that you wouldn’t expect them to be able to run.
Rob Wiblin: But these days we’ve gotten better with Chrome and Firefox, and I guess just better computer security in general — figuring out how do we make sure that this tab that we’re using right now to record this conversation can only do the things that we would expect it to be able to do inside this browser, and not to access files that it can’t access, not to do broader things on our MacBooks that are beyond the permissions that Chrome has given this particular tab. Is that right?
Nova DasSarma: Yeah. I think that’s a really good gloss of this.
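
The read, write, and execute permission scheme Nova describes for Unix-like systems can be inspected with Python's standard `stat` module. The following is a small sketch; the helper names `describe_mode` and `world_accessible` are made up for this example, and the mode values are arbitrary samples rather than anything from the conversation.

```python
import stat


def describe_mode(mode: int) -> dict[str, str]:
    """Render permission bits as 'rwx'-style strings per class."""
    classes = {
        "user": (stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
        "group": (stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
        "other": (stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
    }
    result = {}
    for who, bits in classes.items():
        # Emit the letter when the bit is set, otherwise a dash.
        result[who] = "".join(
            letter if mode & bit else "-" for letter, bit in zip("rwx", bits)
        )
    return result


def world_accessible(mode: int) -> bool:
    """True if the 'everybody' class has any permission at all."""
    return bool(mode & stat.S_IRWXO)


print(describe_mode(0o754))     # {'user': 'rwx', 'group': 'r-x', 'other': 'r--'}
print(world_accessible(0o640))  # False: nothing granted to everybody
```

To check a real file, pass `os.stat(path).st_mode` to either function. A mode such as `0o640` reflects the untrusted-by-default approach Nova mentions, while the older habit of leaving the "everybody" bits set corresponds to modes like `0o666` or `0o777`.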

Articles, books, and other media discussed in the show

Nova’s work:

Hofvarpnir Studios, a nonprofit Nova cofounded with a grant from the Effective Altruism Infrastructure Fund, aims to build a GPU cluster to provide access to compute for AI safety researchers
The Fletcher bot, which does moderation for a large number of Discord servers
In-context Learning and Induction Heads by Nova and others at Anthropic

Nova’s recommended reads and podcasts:

Daniela and Dario Amodei on Anthropic on the Future of Life Institute Podcast
Google Security Blog
The FBI’s InfraGard free mailing list for information security professionals
Friendship is Optimal by Iceman — the fiction that inspired the name “Hofvarpnir Studios”
Darknet Diaries podcast, including episode 29 on Stuxnet

Ways to get involved in systems security:

Learn and practice:

National Security Agency Programs for Students
CVE Program — a centralised database for various systems vulnerabilities
CTF101.org — an online guide for the techniques, thought processes, and methodologies to succeed in Capture the Flag competitions
Metasploit — a penetration-testing framework and database of exploits
Amazon Web Services cloud platform has some articles to try out
MDN — the Mozilla Developer Network, which has resources for web developers
Notes on Contemporary Machine Learning for Physicists by Jared Kaplan
Machine learning online courses from Coursera
Talk to people at EA Global and ICML, and/or email Nova

Job opportunities:

Hofvarpnir is hiring junior DevOps engineers
Anthropic is hiring for roles in data engineering, infrastructure, and security engineering
The 80,000 Hours job board currently has dozens of vacancies for AI safety and policy engineers

Security recommendations for organisations and individuals:

Security recommendations for new organizations — a checklist Nova has compiled to help secure new employee workstations at a new organisation
Organisations can also consider using an identity-based authentication approach (such as Google’s BeyondCorp), as well as implementing single sign-on so users don’t have to manage many passwords for multiple services
Use a password manager, such as 1Password, or the built-in Chrome or Firefox options
Use two-factor authentication for everything, and consider using the physical U2F security keys (just get and register two of them in case you lose one!)

Other podcast episodes and resources from 80,000 Hours:

Rob’s podcast recommendations:

Hear This Idea by Fin Moorhouse and Luca Righetti, including one of Rob’s top episodes: #34 — Anders Sandberg on the Fermi Paradox, Transhumanism, and so much more.
Narratives Podcast with Will Jarvis, including one of Rob’s top episodes: #90 — The 800 Year Decline in Interest Rates with Paul Schmelzing.
Future of Life Institute Podcast, including the latest episode with Daniela and Dario Amodei on Anthropic, as well as a couple of deep-dive series:
- AI Alignment Podcast
- Not Cool: A Climate Podcast
Rationally Speaking Podcast with Julia Galef, including one of Rob’s top episodes: #250 — What’s wrong with tech companies banning people? with Julian Sanchez
Clearer Thinking with Spencer Greenberg, including one of Rob’s top episodes: #97 — Why is self-compassion so hard? with Kristin Neff
The Spanish effective altruism podcast Un equilibrio inadecuado with Fernando Folgueiro, including one of Rob’s top episodes: Julio Elías: Mercados repugnantes

Everything else:

Conversational AI Programming with CodeGen: Let AI Write Code For You by Erik Nijkamp and Donald Rose, on Salesforce’s new 20-billion-parameter code model released in March 2022
Ethical Issues in Advanced Artificial Intelligence by Nick Bostrom
Countdown to Zero Day: Stuxnet and the Launch of the World’s First Digital Weapon by Kim Zetter
Second Biggest Crypto Hack Ever: $600 Million In Ether Stolen From NFT Gaming Blockchain by Jonathan Ponciano
Nvidia Hacked – A National Security Disaster by Dylan Patel (it was later determined Nvidia was breached by Lapsus$)
The Seven People Who Can Turn Off the Internet — video by Half as Interesting
The Shadow Brokers — a suspected NSA insider threat
CISA Call with Critical Infrastructure Partners on Potential Russian Cyberattacks Against the U.S. by the US Cybersecurity and Infrastructure Security Agency

Transcript

Table of Contents

1 Rob’s intro [00:00:00]
2 The interview begins [00:01:08]
3 Why computer security matters for AI safety [00:06:03]
4 State of the art in information security [00:15:45]
5 The hack of Nvidia [00:25:14]
6 The most secure systems that exist [00:34:51]
7 Formal verification [00:46:26]
8 How organisations can protect against hacks [00:52:42]
9 Is ML making security better or worse? [00:56:34]
10 Motivated 14-year-old hackers [00:59:32]
11 Disincentivising actors from attacking in the first place [01:04:12]
12 Hofvarpnir Studios [01:11:04]
13 Capabilities vs safety [01:18:10]
14 Interesting design choices with big ML models [01:27:08]
15 Nova’s work and how she got into it [01:43:45]
16 Anthropic and career advice [02:04:16]
17 $600M Ethereum hack [02:17:01]
18 Personal computer security advice [02:21:30]
19 LastPass [02:29:27]
20 Stuxnet [02:36:31]
21 Rob’s outro [02:38:41]

Rob’s intro [00:00:00]

Rob Wiblin: Hi listeners, this is The 80,000 Hours Podcast, where we have unusually in-depth conversations about the world’s most pressing problems, what you can do to solve them, and whether it’s a good idea to spend hours coming up with a joke about information security for this intro. I’m Rob Wiblin, Head of Research at 80,000 Hours.

I love being able to find people who haven’t done many interviews and giving them some of the exposure their work deserves, and Nova DasSarma is a sterling example of that.

Computer and information security are super interesting in themselves, and they may turn out to be critical to the safe development and deployment of AI systems with very broad capabilities, so I had a lot of pressing questions for Nova.

We also talk about how important it is to have access to enormous computing power, how you set up an AI computation centre most efficiently, and whether making computers secure is a lost cause with current technology.

Oh and we also geek out about what stuff it’s worth everyone doing to ensure they don’t get hacked in a way that does a lot of damage.

At the end of the episode I’m going to highlight a few of my favourite other podcasts, including a few that are pretty similar to this very show. So stick around for the outro if you’re interested to hear that.

All right, without further ado, I bring you Nova DasSarma.

The interview begins [00:01:08]

Rob Wiblin: Today, I’m speaking with Nova DasSarma. Nova studied information systems at the University of Maryland, Baltimore County, before going on to be a system administrator for a range of university research teams, and then a data engineer at various tech companies. She is now the lead systems architect at Anthropic, which means she’s responsible for information and computer security there, as well as engineering their ever-more-elaborate computer systems to get as much useful computation out of them as is practical. Anthropic, you may know, is an AI company that aims to build prototypes of reliable and steerable AI systems. We last discussed Anthropic in episode #107: Chris Olah on what the hell is going on inside neural networks.

Rob Wiblin: Nova is also a cofounder of the nonprofit Hofvarpnir Studios, which aims to build a GPU cluster that can be used by academic researchers to improve the safety of cutting-edge AI systems. Thanks so much for coming on the podcast, Nova.

Nova DasSarma: Hi Rob. Thanks for having me on. I’m excited to talk about why information security and infrastructure matter for AI safety.

Rob Wiblin: Yeah, exactly. I’m hoping we’ll get to chat about why there is this important interaction between information security and AI development, as well as how you got into this somewhat unusual line of work. But first, as always, what are you working on at the moment and why do you think it’s important?

Nova DasSarma: So at Anthropic, I’m working on securing the compute power to run lots of large language model experiments. Turns out that a lot of ML is pretty simple experimental concepts, but they require pretty ridiculous engineering efforts to actually run those experiments. And so a lot of my job is making those feats of engineering possible. And at Hofvarpnir, these days I’m mostly writing some software for helping academics containerize their workflows. It turns out that’s pretty important for being able to scale and secure a lot of the software that you’re working on. We’re also building up that cluster that you talked about.

Rob Wiblin: What is “containerizing a workflow”? I don’t know what any of that means.

Nova DasSarma: So when you’re writing software, oftentimes you’ll have some program that talks to an operating system, and it runs on a particular stack of hardware plus software. “Containerizing a workflow” is talking about taking that software that you’ve written, and then talking about the dependencies that it needs and the software stack that it needs, and bundling that into one big image that you can then run on any kind of hardware. So it’s sort of system agnostic. You also get some security benefits from doing this. Obviously most containers don’t have this, but there are definitely ways to make your containerized workflow give you security benefits as well.

Rob Wiblin: Nice, OK. And the first thing you said was that thinking up experiments that you want to do relating to AI or machine learning is somewhat straightforward, and maybe the surprisingly harder part of it is engineering the hardware that allows you to actually do those experiments in a timely manner. Am I understanding what you said right?

Nova DasSarma: That’s the way that I would think about it. I think that there’s a lot of really interesting ideas and a lot of low-hanging fruit in AI right now — people just don’t have the infrastructure to be able to run the experiments to do things. So for example, if you want to scale a large language model, you’re going to need hundreds of millions of dollars of compute, and you’ll need to be able to organize that in such a way that you use it efficiently. Those problems are more traditional software engineering problems, but I think a lot of the talent there isn’t in the AI safety space, so it becomes a bottleneck.

Rob Wiblin: That’s interesting. It sounds a little bit like the very expensive science projects — like the people working on fusion, and the people working on colliding atoms, and the people who want to look at information from deep space from billions of years ago using these incredibly expensive satellites. You’ve kind of got this hardware bottleneck, where lots of people have ideas for what they’d like to look at, but only so many people can do so many things.

Nova DasSarma: Yeah, for sure. I think it definitely compares to some of these really large science projects. And actually, Anthropic has a lot of physicists there, so I think that’s a very appealing comparison for them.

Rob Wiblin: Yeah. We’ve talked about this issue of hardware being a bottleneck on the show before. That’s been the case for like six years or something like that, or that’s been increasingly the case for maybe the last decade or longer. But I mean, I’m old enough to even remember a time when people would talk about random academics doing interesting work on AI, and that they could do it on what I imagine is a much more limited budget than what’s required today.

Nova DasSarma: For sure. I think that one of the things that we got out of the GPT models is that if you scale a large language model, you can get some pretty interesting phase changes in the results — those hockey stick graphs that make everybody nervous. Those are things that only show up at scale, and academics just don’t have the access to resources. And that’s one of the things that Shauna and I want to fix with Hofvarpnir. But yeah, it’s really unfortunate that academic computing hasn’t really kept up with industry. And that’s why I think a lot of those discoveries come out of Baidu or Microsoft or OpenAI or Anthropic — places that have a whole bunch of money to run a very large compute cluster.

Why computer security matters for AI safety [00:06:03]

Rob Wiblin: OK, yeah. We’ll come back to Hofvarpnir, and compute, and issues like that later on in the conversation. But the thing I was most psyched to talk with you about is computer and information security, and the interactions that has with development of cutting-edge ML models. For those in the audience who are not familiar with this topic, why is Anthropic working on computer security as an important organizational priority?

Nova DasSarma: So for one thing, when you have a large language model, you’re oftentimes going to train it on a large number of GPUs for a long time. This produces this asset, which is these weights for the language model. And this is an incredibly expensive asset for the size of it. You can have something that’s a couple of gigabytes that costs millions to produce. So that’s a very appealing target for a lot of bad actors, so security’s pretty important on that front.

Nova DasSarma: On the other side of things, and more directly safety related, a lot of research in AI can be dual use. So protecting your code, protecting your model weights, and that sort of thing is ever increasingly important as you have more powerful models.
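To put rough numbers on "a couple of gigabytes": the size of a model's weights is approximately the parameter count multiplied by the bytes stored per parameter. The sketch below assumes 16-bit (2-byte) weights, which is an assumption for illustration and not something stated in the conversation. The function name and the example parameter counts are also hypothetical.

```python
def weights_size_gb(num_parameters: int, bytes_per_parameter: int = 2) -> float:
    """Approximate size of model weights in gigabytes (1 GB = 10**9 bytes).

    bytes_per_parameter=2 assumes 16-bit weights (an illustrative assumption).
    """
    return num_parameters * bytes_per_parameter / 1e9


# Illustrative parameter counts only.
for label, params in [("1 billion", 1_000_000_000), ("20 billion", 20_000_000_000)]:
    print(f"{label} parameters: about {weights_size_gb(params):.0f} GB")
```

Even at tens of billions of parameters, the result under these assumptions is tens of gigabytes, which is small relative to the cost of producing it.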

Rob Wiblin: So one important priority is you’re developing these models, and they’ve got these weights that were very costly to figure out what numbers they ought to be. And you want to keep those inside the firm, rather than just have whatever ne’er-do-well come in and swipe them from you. Are there any other kind of key computer security challenges that Anthropic needs to solve over coming years and decades?

Nova DasSarma: For sure. I think that we can talk about security as also something where you have a model that we hope is steerable, and we hope is reliable, that sort of thing. You can talk about that as also a security risk. So if your model is able to do things like cough up social security numbers, or cough up passwords and API keys, that’s also a computer security issue that we care about quite a bit.

Rob Wiblin: What about the idea of, you want to run a model, but you’re not sure quite how safe it is. So you want to kind of constrain what resources it has access to. I guess this is some variation on the idea of boxing in artificial intelligence. Is that also a computer security issue, or is that under a different header?

Nova DasSarma: Yeah, very definitely. It’s one of the things that I’ve been worried about quite a bit. So as we’re recording this, it’s March 31. Yesterday, Salesforce released a 20-billion-parameter code model. And as part of their training loop, they used something called the “human eval environment” — that’s an environment for looking at executions of code. One of my concerns has been that this is sort of an industry standard, but there are some pretty subpar security practices with sandboxing the executions of code in that environment. And honestly, I think it’s one of the places where Anthropic might be interested in giving back, in terms of making it easier for actors to sandbox code executions. Because you really don’t want your nascent AI model running arbitrary code with access to the network.
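As a rough illustration of one small ingredient of this, the sketch below runs a piece of generated code in a separate Python process with a time limit and a throwaway working directory. It is not how any particular evaluation environment works, and it is not a sandbox on its own: it does nothing to block network or filesystem access, which would need OS-level isolation (containers, virtual machines, or similar) layered around it. The function name `run_untrusted` is hypothetical.

```python
import subprocess
import sys
import tempfile


def run_untrusted(code: str, timeout_seconds: float = 5.0) -> dict:
    """Run code in a child Python process with a time limit.

    NOT a full sandbox: this does not restrict network or filesystem access.
    """
    with tempfile.TemporaryDirectory() as scratch_dir:
        try:
            result = subprocess.run(
                [sys.executable, "-I", "-c", code],  # -I: isolated mode
                cwd=scratch_dir,       # start in an empty scratch directory
                capture_output=True,   # collect stdout/stderr instead of printing
                text=True,
                timeout=timeout_seconds,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising this exception.
            return {"ok": False, "stdout": "", "reason": "timeout"}
    return {"ok": result.returncode == 0, "stdout": result.stdout, "reason": "exited"}


print(run_untrusted("print(1 + 1)"))
print(run_untrusted("while True: pass", timeout_seconds=1.0))
```

The time limit only guards against runaway executions. Keeping the code away from the network is the part the conversation flags as the real concern, and that needs isolation below the level of the Python process.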

Rob Wiblin: Yeah. That sounds right. Let’s talk first about this issue of data exfiltration, because it seems like one that is present at the moment and potentially not a simple one to solve. So you train these models, it costs a whole lot of money, it takes a whole lot of compute. Potentially these models could have capabilities that you don’t want to be immediately widely applied. But if the model is, as you say, only various gigabytes of data, it seems like it’s going to be very hard to stop someone from stealing that data if they’re really, really committed to it. What, in practice, can be done?

Nova DasSarma: I think that a lot of security is about detection of bad actors within your organization and with access to your systems. So one of the things you can do is you can limit who has direct access to these model weights. You can look for suspicious patterns. So for example, if you’re operating and all of your employees are in San Francisco, and you see somebody trying to access files from Omaha, Nebraska, you might want to have software that raises an alert for some kind of intrusion detection or exfiltration there. That’s one of the things that’s very much an observability problem in security.
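The location example can be sketched as a very simple detection rule. Everything here is hypothetical: the event fields, the allow-list, and the assumption that each access has already been mapped to a location by some upstream system. Real intrusion detection combines many more signals than this.

```python
from dataclasses import dataclass

# Hypothetical allow-list of places employees are expected to work from.
EXPECTED_LOCATIONS = {"San Francisco, CA"}


@dataclass(frozen=True)
class AccessEvent:
    user: str
    resource: str
    location: str  # assumed to be resolved upstream, e.g. from the source IP


def suspicious_events(events, expected_locations=EXPECTED_LOCATIONS):
    """Return the events whose location is outside the expected set."""
    return [event for event in events if event.location not in expected_locations]


events = [
    AccessEvent("alice", "weights/model.bin", "San Francisco, CA"),
    AccessEvent("alice", "weights/model.bin", "Omaha, NE"),
]

for event in suspicious_events(events):
    print(f"ALERT: {event.user} accessed {event.resource} from {event.location}")
```

A rule like this only raises a flag for someone to investigate. Whether a flagged access is actually an intrusion still has to be checked.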

Rob Wiblin: Yeah. OK, maybe I’ve jumped the gun on that. Perhaps first we should talk about what sort of actors you’re actually worried about. Because I imagine different actors are going to have quite different capabilities or present a different style of threat.

Nova DasSarma: For sure. This is a really tough topic in security, because the more you’re trying to defend against, the less usability you have within your system — the less you’re able to do things like iterate fast, collaborate with other people, that sort of thing. There has to be some measure of trust in the system to do any of these things. “The most secure system is the one that you never turn on,” is something we like to say.

Rob Wiblin: Yeah.

Nova DasSarma: But in terms of actors who might interact with these systems in ways that we would consider negative, corporate espionage is something that is a big topic. And obviously there’s some amount of foreign intelligence services and things like that, especially as AI becomes more capable, that might be concerned about things being produced within labs like OpenAI or Anthropic.

Rob Wiblin: I see. OK, so states would be one grouping.

Nova DasSarma: Yeah. I think states would be a grouping. You can also talk about individual actors and motivated people within your organization. That’s one of the reasons why I think keeping organizations small, making sure that you’ve got good alignment, doing things like background checks, can all be pretty important in limiting your attack surface, especially for those individual actors.

Rob Wiblin: Yeah. That makes a lot of sense. Is there another category?

Nova DasSarma: Between individual, corporate, and state?

Rob Wiblin: Yeah. I thought you might talk about criminals who want to make money, perhaps steal your data and then sell it to whoever.

Nova DasSarma: Oh, for sure.

Rob Wiblin: Or use it personally.

Nova DasSarma: Yeah. In my head, I sort of grouped this in this corporate zone. Because it’s a large number of individuals who are oftentimes very profit motivated to retrieve these sorts of things. We’ve definitely seen that with the group that’s been breaking into Microsoft and Okta, and various other high-profile hacks recently — that’s been something that’s been in the news. So that’s certainly something that we’re on the lookout for.

Rob Wiblin: Yeah. One sort of misuse that probably you don’t have to worry that much about is another respectable company in the United States, like Microsoft, stealing your model and then trying to apply it. Because I imagine that the intellectual property issues there mean that’s just not interesting to them. They’d rather deal with you and try to buy the service, or catch up by hiring people or something like that. So we’re mostly talking about actors other than those ones.

Nova DasSarma: For sure. Yeah. I think that worrying about individual actors and worrying about cybercriminals, so to speak, and state actors are your three things that you really care about. Though the difference can be not that much depending on what country you’re operating in.

Rob Wiblin: Right. Yeah. I guess there are some countries that don’t fully respect other countries’ intellectual property. I suppose you can also see some countries have gotten more respectful of IP over time. Some countries have gotten less respectful of American IP over time. It kind of depends on broader geopolitical factors.

Rob Wiblin: We kind of started talking about this, but is there a clear breakdown of the different misuse cases? I suppose there’s one where, say, a state takes the IP and then uses it for potentially hostile state-based purposes. There’s another, a company just deploying something in order to make money. And then I guess there’s just the IP or the model being spread quite widely and deployed in a whole lot of situations where you might not endorse it, before you think perhaps it’s ready for prime time. Are those kind of the main categories?

Nova DasSarma: I’d say that those are the main categories. I would also throw ransomware on there. We saw something with oil pipelines on the East Coast recently, where it turns out that motivated actors can break into your system, encrypt all your data, and say, “Give us some money or we won’t give it back to you.” And that’s quite hard to defend against in some ways.

Rob Wiblin: Yeah. OK, so we’ve got those various different categories. Different people have quite different visions of how artificial intelligence is going to advance, and how it will end up influencing society. On one extreme, you have the folks who think it’s going to, at some point, advance incredibly quickly and take off within days or weeks, once a machine learning model is capable of self-improving incredibly quickly. On the other hand, you have folks who expect the deployment to be quite gradual, and maybe some capabilities to come quite early, but others to take a very long time in order to reach fruition. How important is it to have kind of a picture in your mind of how artificial intelligence is going to affect the world in the 21st century in order to figure out what your priorities should be, from a computer or information security point of view?

Nova DasSarma: I think from a theory of change perspective, it’s pretty important to have a good model of what an AI takeoff could look like. Especially because if you think that the difference between a subcritical and a supercritical model is very limited, then you might want to work harder on keeping some of those models under wraps more — where you don’t think that they’ll take off, but you don’t really know what would happen if you added another billion parameters to them. That being said, I think that all actors should basically be trying as hard as possible to defend against these sorts of attacks, whether they have a hard or soft takeoff model for issues with AI systems. And of course this is sort of beyond my area of expertise, to some extent. My role is more on the infrastructure end.

Rob Wiblin: Implementation, rather than the big-picture theorizing.

State of the art in information security [00:15:45]

Rob Wiblin: My perception, as someone who takes a slight amateur interest in information security issues, is that the state of the art is very bad. That we do not really have reliable ways of stopping a really advanced, well-funded adversary from stealing data, if this is something that they’re willing to invest a lot of human capital in. Is that kind of right?

Nova DasSarma: I think that’s kind of right. I’ve got a story here around this, if you want to hear it.

Rob Wiblin: Yeah. Go for it.

Nova DasSarma: A state that will not be named had an attack that was in the news recently, that was a zero-click vulnerability on iMessage. A “zero-click vulnerability” is one where the user doesn’t have to take any actions for them to be compromised. And this had to do with something called the JBIG2 compression algorithm, which you might have heard about, because back in the day, Xerox used to use this for copiers. It’s a compression algorithm, which means that you can copy things faster. But it turns out that if you turn the compression up too high, it turns zeros to nines and vice versa, which is quite bad for numerics.

Nova DasSarma: That being said, JBIG2 was also the culprit in this case, where their compression algorithm is dynamic — which means that you can specify patterns on the fly. It turns out that if you construct a file that has the JBIG2 codec in it, then you can construct logical gates out of this. Which means that in theory, it’s Turing complete — and in practice, it was Turing complete. So to deliver this vulnerability, they produced a computer within the JBIG2 decompression algorithm to deliver the payload to these phones.
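To see why being able to build logic gates is enough to get general computation, here is a toy sketch that builds integer addition out of nothing but AND, OR, and XOR on single bits. It has nothing to do with JBIG2 itself and is not the exploit; the choice of these three primitives is an assumption for illustration.

```python
def AND(a: int, b: int) -> int:
    return a & b


def OR(a: int, b: int) -> int:
    return a | b


def XOR(a: int, b: int) -> int:
    return a ^ b


def full_adder(a: int, b: int, carry_in: int):
    """Add three single bits using only the gates above."""
    partial = XOR(a, b)
    total = XOR(partial, carry_in)
    carry_out = OR(AND(a, b), AND(partial, carry_in))
    return total, carry_out


def add(x: int, y: int, width: int = 8) -> int:
    """Ripple-carry addition of two unsigned integers, one bit at a time."""
    result, carry = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result  # the final carry is dropped, so this wraps at 2**width


print(add(19, 23))  # 42
```

The same kind of composition extends from adders to registers, comparisons, and control flow, which is the sense in which a mechanism that offers logical operations can be turned into a small computer.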

Nova DasSarma: And that’s the sort of thing where you could theoretically have defended against this, but the way that you defended against this was least access — so not being able to access anything on your phones, or not having phones. Both of these things are really quite difficult to implement in an organization above a certain size that doesn’t have a very, very strong security mindset.

Rob Wiblin: Security culture.

Nova DasSarma: Yeah. So that’s on the state access side. That being said, the thing that works the most is always going to be a social attack. So something where you meet someone at a party, and they seem nice, and they become your friend. And then you let them into your building when you maybe shouldn’t have done that, and they plug a USB into your system, and you’re done. We talk about physical access being the end of the line in security oftentimes. So that being said, yes.

Rob Wiblin: Right. OK, so one thing is a very well-endowed actor can develop zero-days. We basically live in a world where states are able to figure out completely new ways of breaking into computers, into phones, that people can’t protect against, because no one else is aware of them and potentially they can require no action whatsoever. Even more accessible to actors who have less money is this kind of social engineering attack, where they’ll convince someone to give them access. And this doesn’t require quite the same level of technical chops. But nonetheless, basically it’s extremely hard to secure a system to be very confident that it’s not vulnerable to one or the other of these approaches.

Nova DasSarma: For sure. And on the social engineering side, you don’t need the folks who have the most access in your organization to be compromised by social engineering attacks. Oftentimes those are the folks who are least vulnerable to that. All you need to do is have somebody who is on operations — or somebody who is maybe even the physical security person for the building who connects to your corporate wifi — be compromised, and then they can be the threat vector into your organization.

Rob Wiblin: Yeah. So given that we live in that kind of world, should we just not be training models where it will be disastrous if they leak? It just seems like we just don’t live in a world that’s safe for that kind of ML model yet.

Nova DasSarma: For sure. And that’s something that a lot of labs would definitely have on their minds. I think that it being difficult to secure models is one of the reasons why we wouldn’t want to train such models. Models with a lot of capabilities are oftentimes very alluring for people to build anyways though. And so my perspective on this is that it’s important for me, and people who I work with, to develop tools to defend things anyways. Because if you can disrupt that sort of attack while it’s happening, if you can notice it’s happening, then you’ve got a better chance of keeping things contained longer.

Rob Wiblin: Yeah. I remember a couple of years ago, folks were really worried that the GPT-2 language model and then the GPT-3 language model — if people had broader access to them, or they could reproduce that kind of result — that those models could then be used for crime, or just some kind of negative purpose that we haven’t yet thought of. People thought that perhaps that’d be used to simulate actors on social media, and just create so much noise on social media, and make it impossible to tell who was a real person and who was not.

Rob Wiblin: For that matter, a lot of people, including me, predicted that during the Russian invasion of Ukraine, we’d see a lot of cyberattacks — I guess, competition between the US and Russia, as well as Russia using cyberattacks in order to deactivate infrastructure or personnel within Ukraine. But as far as I know, we haven’t seen very much of that. Is this a little reassuring that, although in any specific case, there’s a lot of potential for information to leak or to be misused, in practice, lots of things that could happen don’t happen?

Nova DasSarma: Well, I hate to disagree here.

Rob Wiblin: OK, yeah. Go for it.

Nova DasSarma: Unfortunately we actually have seen quite a few things that look like this. Not on the AI side. But during the Russian invasion of Ukraine, ongoing, one of the most successful Russian efforts was to disrupt communications out of Kyiv, and that was definitely something that involved a cyberattack on Ukraine. We’ve also seen Ukraine punch back, have a call to action for their homegrown hackers to take on the Russian state there, and we’ve seen that sort of thing.

Nova DasSarma: In the software ecosystem, we’ve seen some serious disruptions from individual actors doing things. Like there’s an npm package — npm being the JavaScript package repository — where somebody pushed a malicious code update that checked whether the code was running on a Russian or a Belarusian computer. And if it was, then it deleted everything on the hard drive. And this was a project that was included by many, many other projects, and it turns out that was quite damaging. There’s a thread on their GitHub, which I have no way of verifying, from somebody claiming to be an NGO operating for whistleblowers out of Belarus, claiming that this actually ended up deleting a whole bunch of data for them. So certainly we have at least some people claiming that this was something damaging.

Nova DasSarma: We’ve also seen on social media evidence of manipulated profiles and things like that. Where images were generated, not by GPT-2, GPT-3, but by things like CLIP. You can see the telltale signs of an AI-generated image, where there are things like an earring is only on one side, various kinds of oddities around the corners of the eyes and hairline and things like that, where we see some of these things. And I think honestly, there are more of these than we know, because if they’re successful, then they’re undetected. You have to be doing it quite badly to be detected there.

Rob Wiblin: Yeah.

Nova DasSarma: One other thing though. People are very, very good at doing these sorts of attacks on their own. And humans are quite cheap compared to somebody who can play GPT-3 like a piano. It’s easier to just hire 1,000 very low-paid workers to do this and have them do this all the time. And it’s way easier to train them than it is to train an ML model. So I think that’s part of the reason. And I think as capabilities increase of these large language models, the potential for abuse increases, because their capabilities outstrip that very cheap labor.

Rob Wiblin: Yeah, that makes a lot of sense. My perception is that the AI and ML field has, in the past, had this kind of academic culture of sharing results so that other people can replicate them. Like publishing papers, basically explaining everything that you’re doing. If it’s important to keep secrets and it’s important to keep models under wraps, I guess we’re expecting and maybe hoping that that culture of just universally sharing models is going to come to an end, or at least be modified in important ways. Is that right?

Nova DasSarma: You know, I think that that’s something that we’ve seen already. For example, the GPT-3 model was released as an API rather than as access to code and weights. That sort of thing. So I think we’re already starting to see signs of those things. That being said, I think that academics have a culture that is going to be quite difficult to change even in the face of something like this.

Rob Wiblin: Right. I guess you might get this cultural change just because most of the cutting-edge work is going on within companies, in these large AI specialist organizations, rather than among academics. Because they don’t have access to the compute, for better or worse.

Nova DasSarma: Yeah. I think that’s one of those things where culture is done by replacement rather than convincing people. And certainly that compute barrier is one of those things that has made this easier, and having a limited number of actors, that sort of thing.

The hack of Nvidia [00:25:14]

Rob Wiblin: Yeah. Are there any historical case studies of information leaks in ML? Are there any cases where an ML model has been stolen in the past?

Nova DasSarma: That’s a great question. I don’t think I can think of one offhand actually. If they have been stolen, then it’s one of those things where they’ve kept hush-hush about it.

Rob Wiblin: I suppose in the past, the incentive has been less, because it’s often been possible to replicate the work one way or another.

Nova DasSarma: Yeah. I think it’s definitely something where we think that the data is the thing that’s valuable, and the data is something that’s open source. So yeah, it’s not something I’ve seen.

Rob Wiblin: Yeah. An audience member wrote in wanting to know your take on this hack of Nvidia that I understand happened in February. For listeners who don’t know, I think there was a group, as yet unidentified, that stole a couple gigabytes worth of data from this semiconductor manufacturer called Nvidia — one of the largest semiconductor manufacturers in the world. I think they have some of the leading chip designs for ML among other applications. Basically, I think the amount of data was not that large. We’re talking gigabytes. But it would probably cost billions of dollars to produce the data that was contained in there — or at least potentially it could be worth billions of dollars to other actors, if they could get access to it and use those results to design their own chips. Do you have any thoughts on that event?

Nova DasSarma: Well, first of all, I would like to say that it’s really unfortunate that that happened from a safety perspective, because a lot of the chip design and things like that inside of Nvidia relies on some secret sauce that only a few engineers inside there know. So having that information be more accessible is probably net bad. That being said, a lot of the really hard work inside of things like making a GPU or other AI accelerator card is going to be on the hardware and the execution side, more than the ideas.

Nova DasSarma: You can certainly take one of Nvidia’s A100 cards, and take some hydrofluoric acid and decap the chips, and take a microscope and take a look at the circuit that’s inside there — and that’s almost certainly something that other companies have already done. So I think that some of the IP loss is unfortunate, but not as bad as it could be. Was it preventable though? I think probably not. I think that Nvidia is a large enough organization that this is the sort of thing that we’re lucky it didn’t happen earlier. Or we actually have no way of knowing — it’s possible that it did happen earlier and we just don’t know about it.

Rob Wiblin: Yeah. It’s interesting. I actually own a bunch of Nvidia stock — I’ve owned it on someone’s recommendation, who will go unnamed, for quite some time. I think since I bought it, it’s more than 10x-ed. Anyway, this prompted me to take a look, when someone told me about this hack. I took a look at the share price, and it hadn’t moved at all when the announcement was made.

Nova DasSarma: I’m not surprised. It’s definitely one of those things where the information is going to maybe level the field for somebody making an NPU or something like that. But the real limiter on doing these is having really talented people who can run fabs, can take these designs and convert them into actual chips, that sort of thing. I imagine that whatever information they have, it’s going to be quite difficult to apply it.

Rob Wiblin: That makes sense. I suppose it might help people catch up to some previous stage perhaps, or gain insights here and there, learn stuff. But to actually replicate what Nvidia is doing based on these files would be tremendously difficult. You would probably have to hire a whole bunch of people from Nvidia and get them to help you to do it.

Nova DasSarma: Yeah. Honestly, if you’re trying to beat Nvidia, that’s the easiest way to do it. But if you look at, for example, AMD — AMD has a chip that’s pretty comparable to the A100 series, it’s the MI200 or MI250 series. One of the reasons that that isn’t as used in things like machine learning is that the software stack just isn’t there. And I think if you were an actor who bought this Nvidia stolen IP, hoping to create your own chips and things like that, it’s possible that you could do something that would be compatible with the Nvidia APIs. It’s possible that you could do something that’s going to be comparable in performance. But by the time that you get there, they will have moved on, there will be another die shrink. Unless you have ongoing access to this information and the hottest team of silicon engineers on the planet, I think it’s not going to be very useful.

Rob Wiblin: OK, yeah. That’s interesting. Coming back to the preventability, I guess you’re just saying that given the current state of information security, in order to keep this information secret, they would have to bury it so deeply that it would interfere with their operations more than it would even be worth, in terms of protecting the information. So basically, if a piece of information is this valuable and this obvious, we should often expect it to be stolen in one way or another.

Nova DasSarma: I think that’s a good expectation to have. It might not be what actually happens. I wouldn’t give greater than 50% on a one-year timeline of, for example, Hofvarpnir’s CTL command being stolen. But it’s a good mindset to have. It makes you think more carefully about what sorts of capabilities you’re developing and things like that. Because if you assume that a bad actor is going to use it, then you are going to be in a better state if they do actually end up using it.

Rob Wiblin: You mentioned in response to the first question in this section that you can think of AI alignment and AI safety as an information security or computer security problem. Could you flesh out that perspective a little bit more?

Nova DasSarma: Sure. So when we talk about an AI system that can do something like, for example, OpenAI Codex or Salesforce’s 20-billion-parameter model that can write some code, you worry that the code that it is putting out into the world is something that will allow it to have capabilities that you might not want it to have. Allowing it to put its sub-agents into the world, that sort of thing. And this is quite similar in some ways to trying to have a boundary of preventing execution by other actors inside of your organization. So I think that there’s a symmetry there, and lessons to be learned from traditional information security.

Rob Wiblin: So what implications does that have for what a place like Anthropic has got to do? I suppose it means maybe you want to just hire a lot more InfoSec people, because they can be potentially involved in the training of ML models, or thinking about how ML models can be designed better, as well as securing their networks themselves.

Nova DasSarma: Absolutely. I think that if you’re out there doing security and you would like to work on a very interesting problem, Anthropic almost certainly has a role for you. That being said, we’ve done a lot of work on looking at things like human eval and trying to figure out ways to do more clever sandboxing, find ways of limiting the capabilities of something like running the Salesforce model, that sort of thing.

Rob Wiblin: Yeah. Tell me more about this Salesforce model. I didn’t quite understand what it is that that model does.

Nova DasSarma: Sure. So Salesforce just released yesterday a 20-billion-parameter model, which they claim does better on evaluations than OpenAI Codex, which is 12 billion.

Rob Wiblin: What’s it evaluating?

Nova DasSarma: It’s evaluating its success at converting human-readable descriptions of problems into code that can solve those problems.

Rob Wiblin: OK. So someone writes a description of something that they would like a computer to do, and then it codes that up and tells whether it’s succeeded at doing that?

Nova DasSarma: Yes, exactly. And you can imagine why a place like Salesforce would want this, given that their business model is absolutely trying to take business rules and convert them into computer systems. So it’s pretty exciting to hear that they’re thinking about ML models here, even if it is somewhat alarming.

Rob Wiblin: Why is it alarming?

Nova DasSarma: It’s alarming because I think that code models in the wild are generally something that increase the ability of people to deploy code. And oftentimes, I think we already have a problem with a lot of people deploying code without thinking about the security implications of it. And that’s even within a very select group of people who have done a bunch of training.

Rob Wiblin: Have gone through computer science degrees.

Nova DasSarma: Maybe done a computer science degree, certainly hung out with people who spent their time in their youth breaking into systems and that sort of thing. I think that there are a lot of folks on the business side of things who might not be thinking quite as carefully about what that code looks like. In my last job, I worked on a no-code system for financial products. Certainly one of the things that I learned from working with a bunch of banks is that a lot of the folks there are more concerned about the bottom line than they are about information security beyond compliance. So it’s certainly something that I’m concerned about.

Rob Wiblin: Yeah. This situation reminds me almost suspiciously exactly of The Sorcerer’s Apprentice thing from Fantasia. You’re saying there’s all of these junior folks who will suddenly have the ability to automate a whole bunch of stuff. But they might not appreciate what they’re getting themselves in for if they just start coding things up using this evaluation model from day to day.

Nova DasSarma: Absolutely. Yeah. I saw a post the other day which was talking about test-driven development, where you produce a set of specifications for what your software should be able to do and what it shouldn’t be able to do. And they’ve got this function for randomness, and it returns eight, and it says that this was randomly selected at the time of writing this code. It’s one of those things where understanding what’s actually wanted is a very, very difficult problem, even for programmers.
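
A minimal sketch of the point Nova is making, in Python. The function name, the 1-to-10 range, and both tests are assumptions made up for illustration; only the "returns eight" detail comes from the conversation. It shows how code can satisfy a written specification while missing what was actually wanted.

```python
def get_random_number() -> int:
    # Picked once when the code was written, so every call returns the same value.
    return 8


def test_returns_small_integer() -> bool:
    # The only requirement someone wrote down: an integer from 1 to 10.
    value = get_random_number()
    return isinstance(value, int) and 1 <= value <= 10


def test_values_vary(samples: int = 100) -> bool:
    # A stricter requirement nobody wrote down: repeated calls should differ.
    return len({get_random_number() for _ in range(samples)}) > 1


if __name__ == "__main__":
    print("meets the written spec:", test_returns_small_integer())
    print("behaves as intended:", test_values_vary())
```

The first check passes and the second fails, which is the gap between the specification and the intent.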

Rob Wiblin: For a 20-billion-parameter model.

Nova DasSarma: Especially for a 20-billion-parameter model, yes.

The most secure systems that exist [00:34:51]

Rob Wiblin: So to what degree could we solve any of these information security issues by putting information that we don’t want to get out there on ice somehow? Like putting things in cold storage, except for the exceptional cases where occasionally you want to access them for some practical reason?

Nova DasSarma: I highly recommend it. Certainly limiting the amount of information you have that is eligible to be exploited and that is easily accessible is a great way to limit your footprint. So things like, if you’ve got a model that you trained and you have a whole bunch of checkpoints and you’re storing them on some online system, consider whether you need to do that. Consider whether you could instead encrypt those and put them on something like Amazon Glacier or Google Cloud Storage or something like that, where you’ve got a cold storage — it’s going to take several hours to restore it, and you can absolutely set an alarm if somebody tries to restore that information without letting you know.

Rob Wiblin: Interesting. What are some of the most secure networks that exist today, or most secure computer systems that exist?

Nova DasSarma: Well, I think that my TI-84 Plus calculator is pretty secure, because it can’t connect. It’s hard for me to comment really on the security of other organizations. I think that everyone’s trying very, very hard to produce systems that are secure and reliable, because that’s very much important for their bottom line.

Rob Wiblin: One system that I’ve heard of on a bunch of computers that is very focused on security, I think is maintaining the encryption keys that underpin the Domain Name System, which I guess allows us to find websites on the internet. Do you understand how that system works at all?

Nova DasSarma: Sure. So DNS is a system where you’ve got a hierarchical name lookup. You’ve got some servers that understand the concept of dot org or dot com, and we call these the root domain servers. And then there are other servers that inherit from these, essentially, that you can then look up things like google.com or anthropic.com or something like that. Because obviously these servers are getting many, many queries a second, and trying to avoid that is pretty important.

Nova DasSarma: It’s something where if you had a sufficiently motivated actor who was able to compromise one of these root servers, you might be able to redirect traffic for a large portion of the internet. So the security measures taken around that are pretty high compared to many other parts. Is that what you’re talking about?
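
The hierarchical lookup described above can be sketched as a toy resolver. This is an illustration under stated assumptions, not how real DNS software is written: the zone tables, server names, and the `ToyResolver` class are all hypothetical, and the addresses come from the range reserved for documentation. It also counts root queries to show why caching answers keeps load off the servers at the top.

```python
from typing import Dict, Optional

# Hypothetical zone data: the top level knows which server handles each suffix.
ROOT_ZONE: Dict[str, str] = {"com": "tld-com", "org": "tld-org"}
TLD_ZONES: Dict[str, Dict[str, str]] = {
    "tld-com": {"example.com": "ns-example-com"},
    "tld-org": {"example.org": "ns-example-org"},
}
AUTHORITATIVE_ZONES: Dict[str, Dict[str, str]] = {
    "ns-example-com": {"www.example.com": "192.0.2.10"},
    "ns-example-org": {"www.example.org": "192.0.2.20"},
}


class ToyResolver:
    def __init__(self) -> None:
        self.cache: Dict[str, str] = {}
        self.root_queries = 0

    def resolve(self, name: str) -> Optional[str]:
        # Answer from the cache when possible, so the top of the tree is not asked again.
        if name in self.cache:
            return self.cache[name]
        labels = name.split(".")
        if len(labels) < 2:
            return None
        # Step 1: ask the top level which server handles this suffix.
        self.root_queries += 1
        tld_server = ROOT_ZONE.get(labels[-1])
        if tld_server is None:
            return None
        # Step 2: ask that server which name server handles the domain.
        domain = ".".join(labels[-2:])
        name_server = TLD_ZONES[tld_server].get(domain)
        if name_server is None:
            return None
        # Step 3: ask the domain's own name server for the address.
        address = AUTHORITATIVE_ZONES[name_server].get(name)
        if address is not None:
            self.cache[name] = address
        return address


if __name__ == "__main__":
    resolver = ToyResolver()
    print(resolver.resolve("www.example.com"))
    print(resolver.resolve("www.example.com"))
    print("root queries:", resolver.root_queries)
```

Whoever controls the tables at the top of this chain controls where every lookup beneath them ends up, which is why those servers get so much protection.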

Rob Wiblin: Yeah. That’s exactly what I’m talking about. I’ve seen a slightly clickbaity video that tried to describe some of the precautions that they took. Is it the original signature keys that verify that it’s a legitimate server or something like that they’re trying to keep secret?

Nova DasSarma: Ah, OK. So I think this has to do more with another system called HTTPS. If you look in your browser’s URL bar, you’ll see an HTTP or an FTP or an HTTPS, oftentimes it’s got a lock beside it. And the way that you verify that this encryption is secure is that your computer has a stack of certificates, and it looks up on that chain what those certificates were authorized by and that sort of thing. The root certificates that were issued by certificate authorities are very much something where you don’t want that compromised. So having a multi-part step where you’ve got cross-signatures is I think the main thing that really secures these — where one certificate authority is guaranteed by several other certificate authorities. So you’ve got this shadowy cabal that’s managing encryption for the internet.

Rob Wiblin: I see, OK. So on this video I saw, it said that they had this thing on cold storage, but that’s not enough. So they put it in a Faraday cage, so no one can use wireless stuff to try to break into it. And they’ve got physical securities, so people with guns, and I think it has a vault underground. Then to get into it, you have to get six out of nine people who have keys to open it up, and then each of them knows only part of the password. I suppose this is an unusual case, because it’s something that you can keep on cold storage. You don’t necessarily have to use it all that often; you only have to access it periodically.

Nova DasSarma: Yeah. It’s definitely one of those things where it’s very valuable information that’s very small. This would be, I think, more difficult with a lot of other sorts of assets that you have in a situation like this though.
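
One standard way to build a "six out of nine" rule like the one Rob describes is Shamir's secret sharing. The conversation does not say which scheme the real key holders use, so treat this as a swapped-in illustration of the general idea; the function names and the choice of prime are assumptions. It needs Python 3.8 or later for the modular inverse.

```python
import secrets
from typing import List, Tuple

# A Mersenne prime; all arithmetic is done modulo this value.
PRIME = 2**127 - 1


def split_secret(secret: int, threshold: int, shares: int) -> List[Tuple[int, int]]:
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in the range [0, PRIME)")
    if not 1 <= threshold <= shares:
        raise ValueError("threshold must be between 1 and the number of shares")
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coefficients = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    points = []
    for x in range(1, shares + 1):
        y = 0
        for coefficient in reversed(coefficients):  # Horner evaluation
            y = (y * x + coefficient) % PRIME
        points.append((x, y))
    return points


def recover_secret(points: List[Tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 returns the polynomial's constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        numerator, denominator = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                numerator = (numerator * -xj) % PRIME
                denominator = (denominator * (xi - xj)) % PRIME
        secret = (secret + yi * numerator * pow(denominator, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    shares = split_secret(123456789, threshold=6, shares=9)
    print(recover_secret(shares[:6]) == 123456789)
```

Any six of the nine shares reconstruct the value. Five or fewer almost certainly produce an unrelated number, so no small group of key holders can act alone.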

Rob Wiblin: I guess it’s a slightly hopeful example in that, to my knowledge, these keys have never successfully been stolen, or maybe I just haven’t heard that.

Nova DasSarma: Not as far as we know.

Rob Wiblin: Not as far as we know. OK, so at least if you don’t need to use data, then maybe you can secure it by sticking it in a very deep bank vault and having people with guns protecting it.

Nova DasSarma: Yeah. I think oftentimes the answer is you have people with guns outside. When you look at, for example, the US military, and you look at software that’s developed for their secure systems, you have something that’s called a SCIF — a sensitive compartmented information facility, if I remember correctly — where you do have a Faraday cage built into the building, and you can’t bring any outside devices in, and you do have people with guns outside. So every day when you go into work, you have your ID badge verified, you go through a metal detector, and you never take any of your code outside.

Nova DasSarma: And that’s definitely something that seems like the way that some of these systems are going. It is of course, very, very inconvenient to do things like this. So if you’re actually trying to get developers to work in here, you need to have copies of things like documentation and package repositories, and various other kinds of infrastructure that we sort of take for granted as developers. You need to replicate all of those things, keep them inside the building, and ensure that when you created them you didn’t also create a vulnerability in there — because these systems eventually will interact with the outside world. That’s part of what you’re doing there. So if you’ve got a sufficiently motivated actor, it’s possible that they got deep cover, something into that building.

Rob Wiblin: So this is if you’re designing the software that runs a tank or some incredibly sensitive military thing? Like strategic nuclear systems would be designed in this kind of extremely secure facility?

Nova DasSarma: I think nuclear systems sounds like something that would be there. I imagine that various three-letter organizations have software that’s developed inside of SCIFs as well.

Rob Wiblin: And I suppose that’s very difficult to do, and I guess it’s not the kind of thing that can run an API where you’re interacting with lots of things in the world. So it’s extremely limiting deployment to, occasionally you compile —

Nova DasSarma: You bundle up an artefact.

Rob Wiblin: — and then send it out.

Nova DasSarma: Yeah. You have a guy with a briefcase that’s full of floppy drives come out, and he says, “Would you like to see these floppy drives?” Yeah. So that’s not something that’s very feasible for an online deployment. You can do things that are limited versions of this though. And oftentimes you can have systems where you’re developing in a more secure environment. For example, you don’t have windows out to the outside, where someone with a camera can read your password while you’re typing it. You can have that kind of security, where you’re doing some good stuff on the inside and deploy to a server room that is relatively secure and only talks to the outside through fiber.

Rob Wiblin: Yeah. I’m slightly asking about these because I’m trying to figure out what is viable today, like what’s the state of the art? Because it seems like there are things that we can probably keep secure most of the time. The problem is that it’s just massively reduced the functionality of these things. So maybe what we want to do is find ways of creeping out the things that we can keep reasonably secure while still maintaining some level of functionality and interaction with the outside world. And that’s the really big challenge.

Nova DasSarma: Yeah. I think that if you can keep most of your system inside these domains and you mostly talk through things like a gatekeeper — you’ve got some piece of software that talks to the outside and converts something into a language or a packet that is ideally formally verified or something like that to not produce bugs in your system. Then that’s, I think, the best you can do, really. You can do data cleaning on the input to your system.
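
A rough sketch of the gatekeeper idea in Python. Nothing here is formally verified, and the message format, command names, and length limit are hypothetical choices for illustration. The point is only that a single small function decides what may cross the boundary, and it rejects anything it does not explicitly recognize.

```python
import re
from typing import Optional, Tuple

# Hypothetical protocol: one command word, then an optional short identifier.
ALLOWED_COMMANDS = {"STATUS", "SUBMIT"}
MAX_PAYLOAD_LENGTH = 64
PAYLOAD_PATTERN = re.compile(r"[A-Za-z0-9_-]*")


def gatekeeper(raw: bytes) -> Optional[Tuple[str, str]]:
    """Return (command, payload) for well-formed input, or None to drop it."""
    try:
        text = raw.decode("ascii")
    except UnicodeDecodeError:
        return None
    command, _, payload = text.partition(" ")
    if command not in ALLOWED_COMMANDS:
        return None
    if len(payload) > MAX_PAYLOAD_LENGTH:
        return None
    # Allow-list of characters: anything outside it is rejected, not escaped.
    if PAYLOAD_PATTERN.fullmatch(payload) is None:
        return None
    return command, payload


if __name__ == "__main__":
    print(gatekeeper(b"SUBMIT job_42"))
    print(gatekeeper(b"SUBMIT job; rm -rf /"))
```

Keeping the accepted language this small is what would make it plausible to reason about the function exhaustively, or eventually to prove properties of it.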

Rob Wiblin: Are there any networks that we really would have expected to be penetrated, that as far as we know have not? I suppose the NSA stands out to me as an example of an organization that, from what we’ve learned, has an incredibly large repository of incredibly sensitive information that they’ve nicked off of our emails and phones and so on. They obviously have a huge target on them. Other countries might be very interested in breaking into the NSA’s networks. And yet, as far as I know, that hasn’t happened. They might want to do it just to embarrass the NSA, but somehow they’ve managed to not have that embarrassment happen. Do you have any thoughts on that?

Nova DasSarma: Yeah, I think that’s an interesting case. I think the NSA is very lucky in a sense, where they’re a great attraction for a security person who is interested in solving probably the world’s greatest challenges. So they absolutely have things like people who are actively looking at and inspecting the traffic that’s going in and out of these networks and things like that. If you’ve got a team that is examining everything and they’ve got enough tools to do that, you can get a lot further than you can get being anybody else I think. This is actually one of the cases where AI might be interesting in improving information security: doing things like these sorts of deep packet inspections, looking for suspicious activity — whether that’s information being exfiltrated or some kind of payload that’s coming in. So we’ve seen some things that look like that.

Rob Wiblin: OK, so you’re saying one reason why maybe the NSA could be doing better than almost anyone else is just the sheer amount of effort that goes into securing their networks — which includes not only people updating the systems regularly, so they’re patched, but people who are inspecting everything that’s going on within the system. So that if something dodgy begins to happen, if someone breaks in, they’re likely to be detected and blocked much faster than they otherwise might be.

Nova DasSarma: Yeah. It also helps that the NSA is part of the federal government. And if you try and break into the NSA and they catch you, then they can send people with guns to your house, whereas many other companies can’t do that. Certainly not as directly. The other side of this is that the NSA has folks who are interested in things like breaking into systems. If you are a 14-year-old hacker who broke into some large system and you’re possibly in danger of going to prison for a very long time, I imagine that you might get an NSA recruiter tapping you on the shoulder saying, “Hey, wouldn’t it be nice if…”

Rob Wiblin: “…you didn’t go to prison.”

Nova DasSarma: Giving you probably one of the most exciting jobs of your life. So doing things like penetration testing — trying to break into those systems — is one of the best ways that you can verify that security. So you have sufficiently smart people, clever people who think outside the box trying to break into those systems. And that helps you discover those vulnerabilities before somebody else does.

Rob Wiblin: OK. Well, I started out this conversation feeling some degree of despair, because it sounded like this was borderline futile trying to prevent this information from getting out. But it seems like there are people who are making some serious effort, and hopefully if we have years or decades to try to improve what the state of the art is, maybe we’re not completely screwed, that any ML model is going to be stolen pretty quick smart.

Nova DasSarma: I think there’s some pretty exciting progress. I think that right now, if we were to produce AGI in our kitchen, it would be stolen. But if you don’t have that case and the timelines are longer, then you have time to do things like build your own formally verified systems that are very robust to these sorts of attacks.

Formal verification [00:46:26]

Rob Wiblin: Yeah. Can you explain what formal verification is?

Nova DasSarma: Formal verification is a logical technique for talking about the execution of an algorithm. You would use something like this to describe the behavior that you want in a system, and then basically create a very, very long, complicated proof that proves that there are no vulnerabilities or unexpected behaviors or undefined behaviors in the system. If you look at something like, for example, the C language — which is a systems language that is used for producing things like the Linux kernel and other really important pieces of software — there are certain kinds of operations that you can do that are just simply not defined in the specification for it.

Nova DasSarma: If you look at almost any piece of software, there are ways you can give it sufficiently weird, sufficiently well-crafted inputs that will cause it to crash, that will cause it to override pieces of memory that you didn’t expect it to. Formal verification is a technique for preventing those things going into software in the first place. Of course, it’s quite difficult. You don’t see very many large pieces of software produced this way. But we’ve seen some pretty interesting examples of things like microkernels that come out of mostly academic labs that are interested in this sort of thing — where you’ve got a kernel that has some very limited functionality, but is completely proven.

Rob Wiblin: Interesting. So this is something that we kind of know how to do, but it’s quite challenging. And I guess the more complicated the program you are making, the harder it is to formally prove that there’s no circumstance under which it might do X. That there’s no input that can produce X as an output.

Nova DasSarma: Yes. If you think about the complications inside of a program, you have a lot of cases where you’re multiplying and putting exponents on the complication of the kinds of inputs that might come into the system. A good technique for making your program able to be formally verified is splitting up the input in ways where you can formally verify parts, and then talk about things at that level of abstraction. So you can say that these parts interact in a particular way.

Rob Wiblin: Is this a kind of new technology? I remember hearing people bring up formal verification years ago, and I think they were talking about it with the sense that this was something that didn’t really work at the time, but might get better in the future. Is it an area of research where we’re improving?

Nova DasSarma: It definitely seems like we’re improving. I think that programming language design in general has come a really long way since the dawn of computing. So I definitely think that there’s progress there, and it’s a pretty exciting field to be working in. That being said, I think we will have to have some really significant advances in programming language theory to have formal verification become a very accepted part of the toolbelt.

Rob Wiblin: Yeah. I was saying I was optimistic because it sounded like there were some people who were succeeding partially when it came to information security. But I guess we should only be optimistic if we think that on balance, information security or computer security is improving over time, rather than staying static or getting worse. Would you say we’re getting better, or at least maybe building up the capability to in future become better at securing information?

Nova DasSarma: Yes. I think I can confidently say that things have gotten better over time. Things have gotten much more complicated and obviously the surface has increased, but as that surface has increased, you’ve also gotten the incentives in play for people to secure things. Moving commerce to eCommerce, having online systems that are handling large amounts of money, has been a fantastic motivator in getting people to think about how they can write software in a way that doesn’t get you broken into.

Nova DasSarma: When I was in middle school, we had some software on the Macs in the computer lab that didn’t let me run NetHack, which is like a text adventure game, and I was pretty upset about it. So at the time it was extremely trivial for me at like 12 to boot this thing into a recovery mode where it didn’t check for a password, and then you could change the password in the password file because it wasn’t very well encrypted and things like that. And that just wouldn’t happen today.

Nova DasSarma: If you look at something like the MacBook M1 chip, a lot of the passwords are secured in a hardware chip that is specifically designed to resist this sort of thing, as opposed to on a file on disk. Now that’s not true of everything. For example, in the Okta hack recently, there was an Excel file that contained the password for a LastPass account — which is certainly something you don’t want to see as somebody on the security side of things. So obviously the last system to secure is humanity. You can have as secure a system as you want, but if you put a sticky note on the outside of your laptop that has your password, which is going to be visible to somebody who is on a Zoom meeting with you, then perhaps your security is not so good after all.

Rob Wiblin: Yeah. I suppose you could have two-factor authentication. Use the keys that you have to physically have in order to secure something, so the password isn’t so central.

Nova DasSarma: For sure. Yeah. I think multifactor authentication is very, very exciting, actually. Something where you can both prove who somebody is and what somebody has is pretty important. We’ve seen some advances in this. For example, now there are more places that might require two-factor authentication, that might require two-factor authentication that isn’t through SMS, and might even support something like FIDO2, which is a protocol where not only is the key that you’re producing out of this hardware device specific to that device and that user, but it’s also specific to the site that you’re authenticating with as well.
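
The site-specific property Nova mentions can be illustrated with a deliberately simplified model. Real FIDO2 authenticators use public-key signatures. This sketch swaps in HMAC from the Python standard library so it runs without extra packages, and the `ToyAuthenticator` class and the example origins are hypothetical.

```python
import hashlib
import hmac
import secrets


class ToyAuthenticator:
    """Stand-in for a hardware key; the device secret never leaves the object."""

    def __init__(self) -> None:
        self._device_secret = secrets.token_bytes(32)

    def _site_key(self, origin: str, user: str) -> bytes:
        # The key depends on the device, the user, and the site being visited.
        message = f"{origin}\x00{user}".encode("utf-8")
        return hmac.new(self._device_secret, message, hashlib.sha256).digest()

    def respond(self, origin: str, user: str, challenge: bytes) -> bytes:
        # Answer a login challenge with the key derived for this exact origin.
        return hmac.new(self._site_key(origin, user), challenge, hashlib.sha256).digest()


if __name__ == "__main__":
    device = ToyAuthenticator()
    challenge = secrets.token_bytes(16)
    genuine = device.respond("https://bank.example", "alice", challenge)
    phished = device.respond("https://bank.example.evil.test", "alice", challenge)
    print("responses differ:", genuine != phished)
```

Because the origin is mixed into the key, a response produced on a lookalike phishing domain is useless against the real site, which is the advantage over a password or an SMS code that a person can be tricked into typing anywhere.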

Rob Wiblin: Yeah. We’ll come back and maybe do a bunch of security advice for individuals and organizations. Because some of that stuff is really quite relevant to how you can in practice secure a network that people are actually doing normal work on. Not secure it against the most persistent threat from all of the best Russian hackers, but secure it against ordinary attacks.

Nova DasSarma: Right. Having enough things in the way that perhaps they will go and attack somebody else.

How organisations can protect against hacks [00:52:42]

Rob Wiblin: I imagine you don’t want to spill all the beans on the security work that you are doing at Anthropic. But what’s one thing that people in the audience might not have heard about that you could do with a network in order to make an adversary think, “Maybe I’ll go and hack someone else. Maybe I’ll go and bother someone else’s network, because this is getting harder.”

Nova DasSarma: For a network specifically?

Rob Wiblin: To be honest, anything to do with Anthropic’s computer systems.

Nova DasSarma: Sure. I think that one of the biggest things you can do as an organization to avoid people getting hacked is to give them hardware that you control, and to have a unified fleet of hardware. For example, if you’ve got a fleet of computers that are all MacBooks that are all centrally secured and have encryption on the disk, then you can limit the damage that’s done by a hack to some Windows firewall or something like that to non-existent — because you’re only running on MacOS. And you are also limiting things like, I went on AliExpress to buy a BLÅHAJ plushie or something and it turned out that the vendor sent me a Word macro file that broke into my computer. But I did that on my personal device, and so now they would need to be able to jump another level into your corporate device. And you can do things like lock down the software that’s on there, lock down the websites they can access.

Nova DasSarma: Please use an ad blocker. Ads are important for the health of many companies, but if they’re a legit company, oftentimes you can support them in other ways. Ad networks are one of the easiest ways for an actor to inject malicious code onto a site that might otherwise be safe. So use an ad blocker.

Rob Wiblin: I guess Anthropic is a reasonably new organization, it’s a year or two old. Is there anything important that you did in setting up the way the systems work that secure it to some degree?

Nova DasSarma: Sure. I think that having corporate devices is pretty important. Another thing to think about is we used to talk about trusted networks and having things like a corporate network. And Google’s done some pretty good work with things like BeyondCorp, where you don’t really think about a VPN or something like that for security — you instead think about identity-based authentication. There’s no such thing as a “trusted network” where when you get on it, you no longer have to authenticate to grab files off of a shared SharePoint drive or something like that. You’re always authenticating at every step.
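
A small sketch of "always authenticating at every step", assuming a hypothetical signed-token format. This is not BeyondCorp's actual design: the key, the token layout, and the function names are invented for illustration, and user names are assumed not to contain the `|` separator. What it shows is that the network address of the caller plays no part in the decision.

```python
import hashlib
import hmac
from typing import Optional

SERVER_KEY = b"demo-only-key"  # hypothetical; a real key would come from a secret store


def _sign(payload: str) -> str:
    return hmac.new(SERVER_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(user: str, expires_at: int) -> str:
    payload = f"{user}|{expires_at}"
    return f"{payload}|{_sign(payload)}"


def authorize(token: str, source_ip: str, now: int) -> Optional[str]:
    """Check identity on every request; `source_ip` is deliberately ignored."""
    parts = token.split("|")
    if len(parts) != 3:
        return None
    user, expires_text, signature = parts
    expected = _sign(f"{user}|{expires_text}")
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    if not expires_text.isdigit() or int(expires_text) <= now:
        return None
    return user


if __name__ == "__main__":
    token = issue_token("alice", expires_at=1000)
    print(authorize(token, source_ip="10.0.0.5", now=500))
    print(authorize("alice|1000|forged", source_ip="10.0.0.5", now=500))
```

Being on an internal address earns nothing here: a forged or expired token is refused from inside the office just as it would be from a coffee shop.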

Nova DasSarma: The other thing that we do — that I suggest to every organization — is to think about single sign-on to make sure that you don’t have your users managing passwords for many services, juggling passwords around, where it can get very tedious for them to use a different password for every service. Using things like a password manager and single sign-on can help mitigate some of those flaws.

Rob Wiblin: Yeah. Anthropic has appeared during the COVID era, which I imagine means that you’ve probably been working remotely, at least in part. Do you think as security becomes an even bigger concern, as hopefully your models become more capable of doing important things, is there a chance that you’ll basically have to stop being remote or have some kind of physical restriction on access to models or data in order to ensure that they are sufficiently secure?

Nova DasSarma: I would not be surprised if that’s something that happens in the future. That being said, I think that we’ve actually had some advantages in terms of starting out with this remote policy. You can’t have a trusted network that everybody’s on if everybody’s on their own network. It’s been important in sort of driving forward the identity-based authentication policies there. I agree that I think physical security is going to be quite important in the future, and there are sorts of mitigations that you can’t really express remotely, but I think that’s in our future timeline.

Is ML making security better or worse? [00:56:34]

Rob Wiblin: For a while, people have talked about how improved machine learning or improved AI could lead to a kind of security apocalypse situation, where AI is just so good at discovering vulnerabilities in systems that they would suddenly become much more transparent to a wider range of actors than they currently are. The response I’ve heard is, “Well, if the bad actors can do that, then surely the good actors can get the same model, find the vulnerabilities and patch them pretty quickly.” So it’s actually not obvious whether this helps with offense or defense. Do you have any view on whether ML itself is making security better or worse?

Nova DasSarma: Hard question. I think that probably it’s going to make things better. The reason I believe this is I think that a lot of the work that’s being done at places like Apple on security is also driven by ML. I know I’ve specifically called out Apple a couple of times here. There are other things that I like — for example, the Google Security Blog is fantastic to read. But I think that security researchers getting access to more sophisticated models is going to be quite positive. I also think that there’s a possibility that we can use ML models to drive formal verification forward, and to drive adoption of programming practices that are more defensive, that are harder to break into, with ML. It is definitely a Spy vs. Spy sort of scenario, where the technology is very much dual use. And it’s certainly going to have some growing pains as these things become more capable, but I’m hopeful that that’s going to result in a more secure — while still being usable — future.

Rob Wiblin: To what extent would computer security benefit from just more money being spent on it by the relevant actors? I ask this because I think I remember reading about someone who got paid a bug bounty for finding this horrific flaw. I can’t remember what it was in: MacOS or the iPhone or something like that. But plausibly, they could have sold this for tens of millions, hundreds of millions of dollars, because of the power of this exploit. I think they got a million dollars or something from Apple for going through their bug bounty program.

Nova DasSarma: It’s pretty good. I see thousands of dollars often, or hundreds.

Rob Wiblin: Right, yeah. So I think this was one of the largest, but it was nevertheless much less than you imagine they might be able to sell it to criminals for, or to state actors for. I suppose most people would rather sell the thing to Apple, all else equal, and they’d probably rather not be on the run from the feds. But would it maybe just help to throw more money basically at getting people who are kind of on the fence between being good actors and bad actors to be good actors?

Nova DasSarma: That’s a great question. Honestly I’m not sure, but my guess is yes. It definitely seems like something where we’ve seen bug bounties show up more often. We’ve seen security researchers get paid better. And I think we’ve seen an increase in the resulting security and a decrease in the prevalence of people trying to passively break into systems for money instead of getting to work on a team of people who are all trying to break into things for money and not going to jail.

Motivated 14-year-old hackers [00:59:32]

Rob Wiblin: I guess I have this stereotype from the past that computer security is bad enough that a motivated 14-year-old who hasn’t been to university yet, but just is really into computers, can probably do some interesting hacking, break into systems that you’d be kind of surprised that they could get into. But I wonder whether that might actually be an outdated stereotype, and whether perhaps things have improved sufficiently that a 14-year-old actually might struggle to do anything interesting at this point. Do you know where we stand on that?

Nova DasSarma: I think that stereotype is still quite accurate. Broadly, there is more software than there used to be. So a lot of the targets that were on that lower end of the security spectrum, there just are more of them. I think that until we find ways to create secure systems by default, instead of having to do security as more of an afterthought, we are going to continue to see situations where a script kiddie with a piece of software that they downloaded off of GitHub can do a vulnerability scan and deface some website or something like that. I think it’s a lot harder for them than it used to be to break into things like whitehouse.gov or something like that.

Rob Wiblin: Yeah, I see. Maybe the top end has gotten more secure as this has become more professionalized, but there’s so many more things on computers now in general that the number of things that are not secure is still plenty.

Nova DasSarma: Exactly, yes. And I think in some ways this is good — having systems that kids are able to break into is in fact a good thing. But we’ve seen some really cool stuff in terms of websites where you’ve got a Capture the Flag scenario, where you’re meant to try and break into one level and then it gets to the next level. Then there’s some key that you have to find for the next one. And these are actually really, really fun. I think it’s a great way to get kids interested in security. I would obviously not condone somebody trying to break into arbitrary websites, but certainly there are tools that are actually fun to do this with.

Rob Wiblin: How illegal is it to break into a website or something that doesn’t matter that much, just as a matter of getting practice and training? Assuming you don’t do any damage whatever?

Nova DasSarma: Very illegal and you shouldn’t do it. But I would say that if you’re interested in trying to do some kind of vulnerability testing, I would contact that website and ask them. Because a lot of Silicon Valley mindset is to ask for forgiveness, not permission. Computer security and data losses is not one of those things. This is what one would call a crime. I don’t recommend it.

Rob Wiblin: But you’re saying if you contact a random website and say, “I think you might have a bunch of vulnerabilities, I am training in this. Would you like me to try to break into your systems and then tell you what to fix?” that enough of them will say yes? That this is a viable method?

Nova DasSarma: I think not very many people will say yes to you, if you’re not somebody with a background in this sort of thing. And if you don’t have a background in this sort of thing, then I would recommend looking at some of these Capture the Flag websites, some of these other sorts of things where somebody has actively set up a really interesting puzzle for this. And I imagine that the NSA has some programs around this, if you’re interested and on the younger end.

Rob Wiblin: If there’s young people in the audience who are interested to try their skills at this sort of thing, what resources can you point them towards? Is there like a Hacker Monthly magazine or a podcast they should be subscribing to?

Nova DasSarma: There’s a thing called the CVE, which is a centralized database for talking about various sorts of vulnerabilities in computer systems. Taking a look at the sorts of things that are there can be quite informative. Oftentimes they have exploits that come with them as a proof of concept for being able to break into those sorts of systems. That’s a good way to get acquainted with the sorts of vulnerabilities that people introduce into these systems.

Nova DasSarma: There’s a site called CTF101.org that talks about forensics and cryptography and exploitation and reverse engineering and that sort of thing. That’s a pretty good resource. There’s a thing called Metasploit, which is another database of exploits that you might want to look at. There are a lot of different kinds of Capture the Flags. Those specifically I think are really good. I think there’s nothing like experience in many, many computer things. It’s very easy to read about something and go, “Oh, that makes sense.” It’s a lot harder to put it into practice, and having a system that’s live that you can try stuff on where they won’t call the police to your house is really good. Trying those is great.

Disincentivising actors from attacking in the first place [01:04:12]

Rob Wiblin: A listener wrote in with this question: “The obvious way to reduce infosecurity risk is to beef up your own security. But another is to disincentivize actors from attacking in the first place. Are there any good ways of doing that other than the obvious of the criminal justice system?”

Nova DasSarma: So you’re talking about counterintelligence type stuff? Like having something that will damage their computer if they log into it or something?

Rob Wiblin: I’m actually not sure what they’re thinking of. I suppose, if you want to talk about options for using criminal justice to discourage people that would also be good. But I guess when I was reading about the Nvidia hack that we were talking about earlier, the hackers themselves wrote that Nvidia tried to hack them back, in order to destroy their systems and then destroy the data that was on them. And they were miffed about this. I don’t know whether there’s any kind of counterattack thing that you can do that makes it less appealing to target you in particular.

Nova DasSarma: That’s a great question. I don’t know if there are automated systems for doing this sort of thing, because you might be a criminal breaking into my system, but if I try and break into your system and succeed, that is also a crime still. I think there are some difficulties there though. Oftentimes I would imagine that it wouldn’t be prosecuted very heavily. I think that this is probably not the road you want to go down, unless you have people in mind who would be interested in actively doing this sort of security thing.

Nova DasSarma: But I think in Nvidia’s case, they specifically wanted to destroy an asset. They had a piece of intellectual property whose destruction would result in the hack being neutralized. Whereas oftentimes, for example, Amazon or something like that might just lose a bunch of money. You might have a bunch of things stolen that a counterhack isn’t really going to save you from.

Rob Wiblin: Right, yeah. I guess this person might have this in mind in part because I think this exists at the state-versus-state level — where one country will hack another, and then the other gets annoyed and so they retaliate in kind. I suppose knowing that retaliation is possible helps to put limits on what countries will do to one another, basically.

Nova DasSarma: For sure, a mutually assured destruction policy is certainly something that’s important in the broader game theoretical sense. I think that most companies and most individual actors do not have the capacity to do these sorts of things. And that you’d be better off spending the time that you’d spend on that in decreasing your attack surface, limiting the software that you use and that sort of thing.

Nova DasSarma: Criminal justice-wise, in financial systems and things like that, we talk about having compliance and auditing, and having very clear policies on what misuse of a system looks like — such that we can do automated detection, such that you’ve got some recording of some kind of bad behavior on your computer systems and it is very clearly something that we told you not to do and is very clearly an abuse of the CFAA. That’s something where I guess on the criminal justice side, keeping good audit records is pretty important for being able to successfully prosecute this sort of thing.

Nova DasSarma: And get on the FBI InfraGard list. There’s a mailing list for information security professionals that is free. It comes from the FBI and oftentimes has some pretty interesting stuff. They recently had a webinar, for example, on cyberattacks related to the Russia-Ukraine crisis.

Rob Wiblin: Nice. We’ll track that down and stick up a link to that in the show notes.

Nova DasSarma: You can find that on YouTube actually. One of their calls was recorded.

Rob Wiblin: To what degree is it still a major problem that people just don’t patch their systems when there’s security updates?

Nova DasSarma: This is very, very, very, very prevalent. Certainly it’s very inconvenient to patch systems. You have to take them down. You might have to test them. There might be something in there that leads to a change in behavior.

Rob Wiblin: Incompatible with the security update.

Nova DasSarma: Yeah. I used to work at the NCBI, the National Center for Biotechnology Information, doing systems administration on some of their large systems. We were trying to upgrade our systems from one version of CentOS, which is an operating system, to another. And in the process, it changed this core system directory component. It wasn’t a security update, but it broke a major piece of functionality that we used to look up users on systems. And this was an incredibly frustrating thing for me when I was working there. And in fact, I didn’t solve it — I instead left the company. Not because of this, but I had other opportunities that I moved onto, but it is something that I did not solve in my time there and that was something I worked on for several months. Especially when you have many pieces of software interacting, patching a system is very nontrivial.

Rob Wiblin: To what degree is this kind of an important thing for the whole software world to be improving? Finding ways of doing security patches that don’t break other software and don’t break systems, so that people are more willing to patch things faster? Maybe they can even just be set up to occur automatically. Possibly this is just fundamentally not possible, because sometimes the security patches require changing something, and changing that thing then breaks something else.

Nova DasSarma: Yeah. I think that the incentives just aren’t in place for people to make a big effort to maintain older versions of their software with security patches. If you look at something like the Linux kernel, this is a project that does this quite well. There have been several versions of the kernel. I think we’re on five something right now — don’t quote me on that, I would have to check. But there are several long-term support versions of the kernel — that might be the four series, or it used to be the three series — where when a security vulnerability was found, maintainers would backport that fix from the latest version to all these long-term support branches.

Nova DasSarma: That means that if you have software that depends on a particular version, you don’t need to upgrade and have to deal with other kinds of compatibility issues to patch your kernel. All you need to do is apply this one fix. And more software should work like this. The concept that you have to be on the latest version of software — as opposed to up to date with your security patches — is a thing that I think makes being up to date with security patches much more difficult.
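The distinction drawn here, between being on the latest version and being up to date on security patches, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the series numbers, the `Branch` class, and the fix identifier `FIX-0001` are all hypothetical and do not reflect the Linux kernel's actual release data or tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    """A maintained release line, such as a long-term support series (hypothetical)."""
    series: int
    patch_level: int = 0
    fixes: set[str] = field(default_factory=set)

    def apply_fix(self, fix_id: str) -> None:
        # A backport bumps only the patch level. The series is unchanged,
        # so software that depends on this series keeps working.
        if fix_id not in self.fixes:
            self.fixes.add(fix_id)
            self.patch_level += 1

    @property
    def version(self) -> str:
        return f"{self.series}.{self.patch_level}"


def backport(fix_id: str, branches: list[Branch]) -> None:
    """Apply one security fix to every maintained branch."""
    for branch in branches:
        branch.apply_fix(fix_id)


def is_patched(branch: Branch, known_fixes: set[str]) -> bool:
    """Being patched means having every known fix, not being on the newest series."""
    return known_fixes <= branch.fixes


# Hypothetical release lines: one current series and two long-term support series.
mainline = Branch(series=5)
lts_old = Branch(series=4)
lts_older = Branch(series=3)

backport("FIX-0001", [mainline, lts_old, lts_older])
```

After the backport, `lts_older` is still on the 3 series but `is_patched(lts_older, {"FIX-0001"})` holds, which is the property the passage argues users actually need.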

Rob Wiblin: Yeah. This sounds very expensive for the Linux people. I guess it’s cheaper for everyone else, but is creating a whole bunch of work, making sure that you patch every previous version of the thing with the current security updates.

Nova DasSarma: Yeah. And it’s something where you’ve got a long-term security policy, you’re only going to let it last for a couple of years or something like that. Maybe a little bit longer. There are ways to mitigate how many versions of the software that you’re keeping track of. I think it’s possible and I think it’s definitely worthwhile. I imagine that there are ways that you can grant to open source projects that are relied on to help them have the capacity to do these sorts of things — to backport these fixes to when there is a security problem, not force people to jump three versions of the software.

Hofvarpnir Studios [01:11:04]

Rob Wiblin: OK, let’s talk now about your other big project, which is Hofvarpnir Studios — which has gotten funding from the Effective Altruism Infrastructure Fund, and now Open Philanthropy as well, and a bunch of other people. I guess while we’re talking about that, we could maybe talk about this issue of system architecture and compute as well. What is the goal for Hofvarpnir? What’s it going to end up doing?

Nova DasSarma: It’s Hofvarpnir Studios, and we think that AI compute is kind of a carrot in terms of driving how research works. So what we’re trying to do is build the capability to run models in a very consistent, very secure way, and offering that capability to AI safety researchers. Specifically we’re working with folks at CHAI and we’re working with Jacob Steinhardt (also at Berkeley), some folks at UT Austin, Mila. We’re hopefully talking with some people at Princeton. And we’re looking at people who specifically are working on safety, that we’ve seen have a proven track record of publishing safety research.

Nova DasSarma: Our goal is to have this differential capability available to them. Because oftentimes they’re not as well funded, or in general when it comes to crunch time right before a conference submission deadline or something like that, oftentimes all the compute’s gone; you can’t get your P4Ds at Amazon. So having these extra resources available means that these folks are able to get that little leg up. And I think that’s what we’re doing.

Nova DasSarma: The reason that we actually did this is because Shauna a few months ago got a 3090, which was a very exciting graphics card. I think she mostly got it to play Half-Life: Alyx or something like that.

Rob Wiblin: Good game, by the way.

Nova DasSarma: It’s a pretty good game, yeah. So we had this graphics card. And we had these friends who are safety researchers who were struggling to get compute at their own labs, and we ended up doing a remote connection to her home desktop with one graphics card because it turns out that was valuable enough — that putting in this effort helped them get their research moving forward. And we were like, “This is ridiculous. This is absolutely the sort of thing that an organization should be doing, and nobody’s doing it.” And then we looked at each other and we were like, “We should do that.” So that’s how we got started.

Nova DasSarma: And the Effective Altruism Infrastructure Fund was pivotal in helping us get those first machines online. And now Open Phil has been helpful in scaling that even further. So I’m very excited to see how this work turns out, and making sure that we’re staying with safety. I think part of that ends up looking like working with the researchers who are currently using our compute to drive decisions about who else gets to use it, because alignment is important. And I can’t claim to be an expert on safety, but I have friends who are. So I can ask Shauna, I can ask Dr. Steinhardt, ask folks at CHAI, that sort of thing.

Rob Wiblin: Yeah. Why on earth is this thing called Hofvarpnir? I should spell it out, I guess, for people who want to Google it. It’s H-O-F-V-A-R-P-N-I-R. Why?

Nova DasSarma: Yeah, it’s a horse from Norse mythology. But the reason that we called it this, it’s very tongue-in-cheek. There’s a piece of fiction called Friendship is Optimal, where there is an AI foom, essentially, and it was developed by a company that’s called Hofvarpnir Studios. And so we took this. One, I think that if somebody reads that and goes, “We should totally do that” at least they can’t call it Hofvarpnir. Two, I think it was one of the things that pointed me at, “Huh. You know, safety might be important for these AI systems.” So it’s sort of a call out to Iceman, who wrote that fiction.

Rob Wiblin: Nice. OK, so the basic idea is to collect a whole bunch of compute that academic research scientists can use if they want to do research that is more tilted towards improving safety and alignment than it is tilted towards improving raw capabilities of AI. And I guess the vision is there’s people who wanted to do this research but they were lacking the compute, so it was slowing them down. And then the availability of this cluster might be a sweetener for people who are kind of on the fence about what sort of work they’d like to do. Because if they’re doing something that other academic safety- and alignment-focused people think is great, then they might be able to get access to this cluster, and effectively it’s kind of like getting a grant on the side.

Nova DasSarma: Yeah, for sure. Right now we’re focusing on people whose work is entirely safety, because we’re not really equipped right now to do a good evaluation of some of those more edge cases. But that’s definitely something that’s down the road if things go well.

Nova DasSarma: There are two other things that I think that Hofvarpnir is important for. One, I talked about it being quite difficult to get access to the amount of compute, and access to these large systems — especially as a beginning infrastructure person who might be interested in AI safety. So this is providing a sort of a playground where there’s somebody who’s had some time to think about these things. And we’ve got opportunities to hire some of these people and train them and hopefully keep them within AI safety. So in some ways it’s an on-ramp there.

Nova DasSarma: The other thing is lots of folks already have GPUs. Many labs have GPUs, but those GPUs might go underutilized because you might only use them say 50% of the time, right? You might not always be running experiments off them. So part of what we do is we also try to incorporate some of those existing resources into that same cluster, and make it possible to schedule across these several spines within this cluster.

Nova DasSarma: I’m going to keep saying Dr. Steinhardt and CHAI — partially because they are our funders, and also because they’re doing a lot of really exciting safety work. So if Dr. Steinhardt wants to run an experiment that’s on more computers than he has available, and CHAI isn’t actually running some stuff on some of their computers, then because we have this unified platform — and because we’ve containerized their workflows, we’ve done a bunch of infrastructure stuff to make it possible to run these transparently — you can schedule your workload to take advantage of these additional computers that wouldn’t otherwise be available to you. So think about it like defragmenting these otherwise siloed pools of compute, and freeing those up specifically for AI safety.

Rob Wiblin: So you’re telling me there are still GPUs sitting out there idle, rather than mining crypto?

Nova DasSarma: I know! Who can believe it?

Rob Wiblin: Well, I suppose they’re in universities, so maybe it’s hard to get permission to use all of the electricity to get some ETH or something.

Nova DasSarma: Yeah. I don’t think that we’ll be mining crypto on these GPUs anytime soon.

Capabilities vs safety [01:18:10]

Rob Wiblin: Yeah. OK, I know this isn’t particularly your area of expertise, but let’s talk a little bit more about the capabilities-versus-safety issue, because it’s something that came up in discussion online when I said that I was going to interview you. For listeners who haven’t heard these ideas before, there’s a distinction that people draw between trying to make AI capable of just doing things in general — like being good at dealing with people, being able to get people to do what they want, which people called “capabilities” — and then there’s talk of “safety and alignment” research, which is more about ensuring that whatever a system is capable of doing, it ends up doing what you want, or using the methods or achieving the outcome that was actually desired by its handler or deployer.

Rob Wiblin: Now some people dispute this distinction a bit. Some people think that these two things kind of go hand in hand more than some other people do. The argument there is that something’s not really capable if you give it instructions to do what you want it to do and it goes off and does something completely different and random — that doesn’t sound super capable, and it’s also not aligned. But there’s plenty of other people who think that there is kind of a distinction here — that there’s some things that you could do that would add capabilities but not do very much to improve alignment and vice versa.

Rob Wiblin: How worried are you about the possibility that your work either at Anthropic or at Hofvarpnir could end up improving capabilities more than is your desired outcome?

Nova DasSarma: I think there’s a chance. Certainly there are a lot of unknown unknowns in research, and it’s possible that we would differentially improve capabilities. Part of the reason we’re also providing additional compute is because I think that a lot of the good research is done on larger language models. I think that there are things that show up in large language models that you don’t see in lesser models.

Rob Wiblin: Could you say a little bit more about the lines of AI safety research that are most bottlenecked by compute?

Nova DasSarma: Oh yeah, sure. I think anything with large language models is pretty bottlenecked by compute. Anything where you’re doing reinforcement learning, you’re doing something kind of pathological with how much you’re able to use the GPU versus CPU on systems. So I think both of those are pretty bottlenecked by compute.

Rob Wiblin: Yeah, that’s interesting. So the issue of the risk that people who are trying to improve AI alignment could accidentally on net make things worse by contributing to AI capabilities and just speeding up progress in AI in general — this has been an issue that I remember people talked about 10 years ago basically. And it shows up in biotechnology as well, this issue of dual use and the fact that if you’re doing research then almost by definition you don’t know exactly what you’re going to find. So couldn’t you accidentally make discoveries — or possibly even predictably make discoveries — that are actually going to hasten the deployment of various AI systems? We’ve talked about that with Chris Olah at some length, so people can potentially go back to that interview if they want to hear his view on that.

Rob Wiblin: A challenging thing with this disagreement, inasmuch as people disagree, is that it seems like the disagreement is quite often driven by quite deep worldview issues. Where the people who are most keen to just push ahead with AI safety research and feel pretty positive about it are folks who think that AI alignment is reasonably likely to succeed — that current AI systems are within shooting distance of being made aligned — and so they feel pretty good about pushing ahead and just making more capable systems and aligning them as we go.

Rob Wiblin: There’s other folks who think that the current ML architecture is just not going to be possible to be made safe, ever. Maybe they also think that dangerous AI systems, things that could actually do damage, are likely to be deployed in the near term, and ideally they’d just like to see everything slowed down so that there’s more time to really pick a different direction in which to go.

Rob Wiblin: People have different fundamental worldviews like that, or quite different visions for how the world is and how it might go in future. It’s challenging for them to discuss these concerns and then reach an agreement. They kind of have to coexist peacefully and, I suppose, hope that over time, one of them will be shown to be right and actually people will converge on having a better understanding of what the risks are, if indeed they are real. Sorry, is there a question there?

Nova DasSarma: Yeah, I think that was more of a statement than a question. But yeah, I think I agree, Rob. That being said, I think folks who think that the underlying architecture of especially large language models is unalignable by its nature — I’m vaguely sympathetic to this worldview in some ways. But even if this were to be true, I don’t think that I would change what I’m doing here, because I think in a lot of ways, it’s at worst a harm reduction measure. We are giving the folks who are working on fundamentally different architectures more time to try and create those things.

Nova DasSarma: And the other thing is, in some ways it looks kind of arms race-y, right? You might be able to say that in the US, you shut down this research somehow — you get enough senators on your side to come out and say, “You can’t do this anymore. It’s national security.” That’s not going to stop that research from happening. It’s just going to drive out safety researchers. I think almost any kind of thing that looks like that just means that you only have bad actors — you only have sufficiently motivated capabilities actors doing this research.

Nova DasSarma: I think that’s not what we want as a field. We want safety to be driving things forward. We want to have results that come out of things like Chris’s interpretability work. The finding of induction heads is not something that came out of capabilities; it came out of looking at how to make these models interpretable. So I think that there’s a large amount of value to be had in doing that research, even if it turns out that these are fundamentally not fully alignable.

Rob Wiblin: Yeah. An audience member wrote in with the question: “Is the typical engineer working on non-safety research at OpenAI or DeepMind increasing or decreasing the odds of an artificial intelligence-related catastrophe?” This question definitely puts you on the spot. It’s one that I know many people are not super keen to answer because they could offend their friends.

Nova DasSarma: You know what, that’s a really hard question to answer. I think that if you come at things from the perspective that all capabilities work of any kind is negative, and it is of itself increasing the odds of an AGI-related catastrophe, then that would be an answer to your question. But that’s not my model. In some ways, I think that having capabilities that are within organizations that have pledged to treat things responsibly is probably decreasing these odds.

Nova DasSarma: Things like the OpenAI APIs for accessing models mean that more people have access to language models in a way where there is a gatekeeper — where there is a layer of safety and a way of imposing values onto the users of that model in a way that is fundamentally not true if you have a large number of actors. So I think that it’s very hard to say. I would say that I think certainly there are safety researchers at OpenAI, certainly there are safety researchers at DeepMind, and I think that those organizations also are thinking very thoughtfully about these things. And I’m hopeful that they are decreasing the odds of an AGI-related catastrophe. If you made me answer that question, I think that would be my answer there.

Rob Wiblin: Yeah, it’s certainly the case that it’s not only possible for the safety orientation or the thoughtfulness or the niceness of the actors working on AGI to improve. It could also get worse over time, or people can just become more scattered and more arms race-y over time, which is definitely a factor that complicates this question.

Nova DasSarma: And I think that there’s in some ways an evaporative cooling effect if you say that you’re never going to work at a place that has any capabilities, because then all you have working there are people who are purely interested in, “Let’s crank it to 11. Let’s drive this forward.” I think that having safety researchers there is important. And having this sort of collegiality with other organizations, and having standards, and being able to talk to each other about these things is important. So I think that the typical engineer there is probably decreasing these odds, just out of a matter-of-fact consolidation of intellectual capital and capabilities within a smaller number of folks who can then cooperate.

Rob Wiblin: Well, I’m glad we’ve wrapped that one up. I’m sure this debate will never come up again.

Nova DasSarma: Yeah. For sure.

Interesting design choices with big ML models [01:27:08]

Rob Wiblin: Something we haven’t talked about yet that I’m really excited to learn more about is what interesting design choices there are when you’re putting together a cluster, to try to get as much useful compute when you’re doing big ML models. What are the interesting optimizations that you could make?

Nova DasSarma: I love this. This is one of the best parts of my job. So for making a cluster for ML models, there are two basic components: you’ve got these units that can do floating-point operations and matrix multiplication, and you’ve got cables that are connecting them together at a certain speed. And you’ve got some memory, that sort of thing. So the design choices that you’re making are tradeoffs between mostly price and how much bandwidth you have between these units. The other thing is a design choice on the software end: are you writing your software in a way where it can easily be distributed over multiple GPUs and things like that. So that’s also a design choice, though mostly on the software side.

Nova DasSarma: And some logistical choices. You want to try and run your stuff where the power is cheap and the real estate is cheap. It turns out that rent for a large number of servers is actually a big cost. You also want to think about how easy it’s going to be to expand. Hofvarpnir made a pretty big mistake starting out. As a bit of background, almost all of my systems that I run for myself are run on hardware that I own, in people’s basements and stuff. It’s this global network of lots of computers that are all talking to each other, but they’re all on residential connections and they have interruptions and stuff like that. And this turns out to be really cheap when you’re a college student. It’s a great idea for anyone who’s trying to work on a system, and they’re trying to do something scrappy — I recommend doing that first.

Nova DasSarma: But it turns out that time is the limiting factor in some ways. So when we first got our EA Infrastructure money, we were like, “OK, we’re going to buy all these components ourselves and assemble.” And then we had some technical difficulties, and we had some difficulties with the provider not having the right width for the racks and stuff like that, and all sorts of really weird integration issues. And it turns out that it was better in the end to go with an integrator for future expansions, because it means that we can ask them, “We’re going to use these standardized units, and I just want four more of them. We’re just going to put them right next to each other,” and that sort of thing.

Nova DasSarma: Going back to the design thing though, the bandwidth does become a concern there, because you have a limit to how much bandwidth a certain switch can carry, and you might want to get a more expensive switch if you think that your cluster’s going to expand more. So that’s one of those things that’s a more direct design choice.

Rob Wiblin: Yeah. OK, I’m just trying to picture all of the various different tradeoffs here. It seems like one of the key issues is all of these different chips kind of talking to one another, or sharing preliminary results with one another in order to coordinate. The natural way to solve that is just to stick them all in the same place. Can you mostly not do that because you want to bring together lots of different chips that people happen to have locally in their own research center?

Nova DasSarma: Yeah. Hofvarpnir is very weird because we don’t have one cluster — we actually have essentially federated clusters: you’ve got these clusters of machines that are localized, and then we schedule across them. So I’ve been doing a lot of work in writing some scheduler stuff for doing this — which hopefully I’ll be able to make public at some point — for marshaling these jobs, and making sure that they schedule correctly across multiple diverse groups of machines with different kinds of capabilities. So for example, some machines have more powerful graphics cards, some of them have access to fast storage, some have more memory, better CPUs. They’re all over the place.
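The scheduler mentioned here has not been made public, so the following is only a minimal sketch of the general idea of placing jobs across machines with differing capabilities. Every name (`Machine`, `Job`, `fits`, `place`), every machine, and every number is hypothetical, and the first-fit strategy is a simple stand-in rather than the approach actually used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Machine:
    name: str
    cluster: str            # which localized cluster the machine belongs to
    free_gpus: int
    gpu_memory_gb: int
    has_fast_storage: bool


@dataclass
class Job:
    name: str
    gpus: int
    min_gpu_memory_gb: int
    needs_fast_storage: bool = False


def fits(job: Job, machine: Machine) -> bool:
    """Check that the machine satisfies every requirement of the job."""
    return (
        machine.free_gpus >= job.gpus
        and machine.gpu_memory_gb >= job.min_gpu_memory_gb
        and (machine.has_fast_storage or not job.needs_fast_storage)
    )


def place(job: Job, machines: list[Machine]) -> Optional[Machine]:
    """First-fit placement: reserve GPUs on the first suitable machine, else None."""
    for machine in machines:
        if fits(job, machine):
            machine.free_gpus -= job.gpus
            return machine
    return None


# Two hypothetical machines in two different localized clusters.
machines = [
    Machine("lab-a-01", "lab-a", free_gpus=2, gpu_memory_gb=24, has_fast_storage=False),
    Machine("lab-b-01", "lab-b", free_gpus=8, gpu_memory_gb=40, has_fast_storage=True),
]

small = place(Job("probe", gpus=1, min_gpu_memory_gb=16), machines)
large = place(Job("finetune", gpus=4, min_gpu_memory_gb=40, needs_fast_storage=True), machines)
too_big = place(Job("pretrain", gpus=16, min_gpu_memory_gb=40), machines)
```

A real federated scheduler would also have to account for things this sketch ignores, such as connection quality between sites, queueing, and jobs that finish and release their GPUs.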

Nova DasSarma: My job at Anthropic is significantly easier in some ways, even if some of those scale problems are harder. Because all the machines are the same. So Hofvarpnir is a lot closer to the stuff that I’m used to, where most of the problems are, “Can you make the things work?”

Rob Wiblin: OK, so with Hofvarpnir, a key issue is how do you get lots of different pieces of equipment to all talk to one another, even though they have internet connections of varying quality and I suppose different kinds of hardware. What are the key design tradeoffs that you face with Anthropic? Where I suppose things are more centralized.

Nova DasSarma: I think a big problem that we have is deciding how big of a computer we need to run certain kinds of experiments, because this determines really big price numbers. That’s probably the biggest concern, deciding how big it needs to be, because depending on the scale, various parts of the software or the hardware might start falling down. You might need to have a different kind of network topology to support things. If certain kinds of bandwidth are more or less important to you, if you’re trying to get better arithmetic intensity on the accelerators, then things look very different. Does that answer your question there?

Rob Wiblin: Yeah, I think so. It sounds a little bit like the trouble that you have when you are renting an office for five or 10 years, and you don’t know how many staff you’re going to have.

Nova DasSarma: Exactly. In fact, we had an issue with this where we actually rented two floors, but we didn’t actually fill both floors. So I think we have some costs in terms of rent because of that.

Rob Wiblin: It’s a classic thing that organizations go from having a glut of space to an extraordinary scarcity of space and then they expand in some lumpy fashion and then suddenly they’ve got way too much space again.

Nova DasSarma: Yeah. I think this is a classic systems architecture problem in some ways. Because it’s not easy to add additional units of space for your employees, and there’s some cost to not having those be contiguous, but then there’s also a cost for pre-allocating that space. I think this is exactly the same with GPUs. If you have these separate clusters that are not geographically the same, they might have a different number of GPUs on a fast connection, then you are limiting the total size of collaboration or matrix multiplications that you can do. So yeah, they’re actually quite comparable.

Rob Wiblin: You talked about increasing the “arithmetic intensity” of the operations of the system. Can you explain what that means?

Nova DasSarma: It’s basically just saying that if you’ve got an accelerator, it’s got a processor, you can run a certain number of instructions per second on it, theoretically. There are various things that influence how many things you can actually do in practice. So if you’ve got a laptop, it’s not running at 100% all the time, and that’s fine. You can have lots of extra space there. That’s actually really good for a laptop. But it’s not good when you’ve got a data-center-scale computer that costs a really large amount of money. So getting the speed of the models depends on finding the bottleneck and hammering out that and making the software able to push more instructions faster, push more memory in the correct ways, store various kinds of caches, and that sort of thing. Arithmetic intensity is referring to a single measure of that efficiency.
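
One common way to make this concrete, not spelled out in the conversation, is the roofline model, where arithmetic intensity is usually defined as arithmetic operations performed per byte of memory traffic. The sketch below assumes that definition; the accelerator figures in the usage note are invented round numbers, not the specifications of any real chip.

```python
def arithmetic_intensity(flops, bytes_moved):
    """Floating-point operations performed per byte moved to or from memory."""
    return flops / bytes_moved


def attainable_flops(intensity, peak_flops, peak_bandwidth):
    """Roofline bound: throughput is capped either by the compute peak
    (FLOP/s) or by memory bandwidth (bytes/s) times arithmetic intensity."""
    return min(peak_flops, intensity * peak_bandwidth)


def matmul_intensity(m, n, k, bytes_per_element=2):
    """Idealized intensity of an (m x k) @ (k x n) matrix multiply,
    assuming each matrix is read or written exactly once."""
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return arithmetic_intensity(flops, bytes_moved)
```

With a hypothetical accelerator that peaks at `100e12` FLOP/s and `1e12` bytes/s, any kernel below 100 FLOP per byte is limited by memory rather than compute, so `attainable_flops(10, 100e12, 1e12)` is only a tenth of the peak. That gap is the kind of bottleneck the bandwidth and caching work described above is meant to close.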

Rob Wiblin: I see. So I suppose at a high level, the thing that you’re trying to avoid in the design of this entire thing at each different stage is that you don’t want to have a chip that is sitting there that could be doing more operations, but it can’t get the instructions into the thing, so the task has had to be routed somewhere else — because, say, it didn’t have enough bandwidth in order to send the instructions there, or maybe it was just poorly designed in terms of where it routes things. Or maybe it’s done a task and now it’s gotten stuck with a bunch of the output, but it can’t send that where it needs to go.

Nova DasSarma: It can’t offload it so it can’t get the input. Yeah.

Rob Wiblin: So you want to have everything working at the same pace.

Nova DasSarma: Exactly.

Rob Wiblin: So that the kitchen works smoothly and you don’t get a bottleneck. I’m not sure whether anyone’s played Overcooked?

Nova DasSarma: Overcooked, yeah, absolutely. I actually love that. Overcooked, Factorio. There are a lot of games that are very appealing to the systems designer in my brain. If I can’t program anymore, I will still play Factorio for some reason.

Rob Wiblin: Yeah. But the basic idea is you’re trying to design a system that works near maximum efficiency, because the amount of all of the different things coming in and out is matched up such that you get the maximum output at the end without any lack of capacity or something getting stuck.

Nova DasSarma: Exactly. So you’re minimizing the amount of slack and you’re maximizing the online time of the most expensive part of your system.

Rob Wiblin: OK, so you have to think about the shadow cost of something not being used, basically — financially or in terms of electricity or space or whatever else.

Nova DasSarma: All sorts of stuff. Exactly. So it’s just a really big optimization problem. It’s great. I like it a lot. You’ve got these very specific metrics that you can measure. You can, down to the clock cycle, think about how your software is going to run and then determine an efficiency number compared to the peak efficiency that you might be able to get out of a chip.

Rob Wiblin: I can see why this seems super fun. This seems like playing Factorio. It seems like playing a computer game in a way. You could get really hooked on it. But you think people haven’t quite cottoned on to how exciting this aspect of AI is — designing the actual equipment, doing the engineering, doing the architecture design to make things work smoothly. Maybe one day people will cotton on.

Nova DasSarma: I hope so. I enjoy these problems and I enjoy being a person who’s valued for working on them. But in a just world, this would not be the case. I think that everyone would be excited about this. And if you’re interested in infrastructure, you probably need to have a pretty good frustration tolerance for some of these things, because it’s not all the fun spreadsheet part of things. Gosh, I can’t believe I said that.

Rob Wiblin: No, I totally understand. I’m with your team on this.

Nova DasSarma: But it is also the implementation, because it turns out that this rack that you bought actually has a width that is one centimeter smaller than it was advertised to be, and so the faceplate of the machine that you’re trying to put in there, the physical geometry doesn’t work. This actually happened to us with the machine. We just took off the faceplate and just fixed it. But it was one of those things where there are all sorts of many, many moving factors in a system like this.

Rob Wiblin: Yeah. Completely random question. Is it the case that specific transistors on a chip can break and that the chip can detect this and just routes operations away from those transistors?

Nova DasSarma: Yeah. So there’s this project called Cerebras, which produces a chip that’s called the WSE-2. It’s 56 times larger than the largest GPU, has 123 times more compute cores. It’s a beast. It’s a crazy project. I’m very excited to see what they’re doing. The way that this works is that if there is a single subunit on that chip that fails, then you can route around it. You can treat these sort of as interchangeable. That is in fact a thing that people are doing, and especially in a large distributed system, not even on the chip level, you can see a certain server or some component of the server fails. And one of the reasons why we care about things like Kubernetes is for scheduling a workload across a distributed system in a way that is resilient to failures and limits downtime or capacity sitting idle. So it’s a piece of software that we use specifically for doing this.

Rob Wiblin: At a high level, what kind of efficiency gain do you get designing a cluster well versus designing it in a garbage way? Is it like twice as efficient or more or less?

Nova DasSarma: I think you can go more than that. There’s a lot of efficiency gains to be had. So for example, let’s talk about if everybody had their own GPU, and you’ve got all the GPUs in a big pile. So assume that I’m a researcher and my lab gives everybody an A100 chip. I’m only running experiments 20% of the time or something like that, but when I am running experiments, I’m running several of them — I’m doing something where it’s like a scan of various kinds of parameters, and I want to run multiple experiments.

Nova DasSarma: I have to do those in serial if I only have one graphics card. I could do them in parallel if I had access to a cluster, because chances are, most of the time, people are not going to be simultaneously trying to use their GPU. You can assume that they’re going to use it maybe 40% of the time or something like that. I’m making that number up. You would need to determine that. But that’s one of those things where having more things interconnected and inter-schedulable means that you can get better throughput.

Rob Wiblin: Yeah. I see. I guess we haven’t explained serial versus parallel.

Nova DasSarma: Yeah. It’s if you do one experiment at a time versus you do them all at once across multiple GPUs, basically.
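
The serial-versus-parallel point can be put into rough numbers. This is a back-of-the-envelope sketch that assumes every experiment in a sweep needs exactly one GPU and takes the same amount of time; the function name and the figures in the usage note are illustrative only.

```python
import math


def sweep_wall_time(num_experiments, hours_each, gpus_available):
    """Wall-clock hours to finish a sweep when at most `gpus_available`
    single-GPU experiments can run at the same time."""
    if gpus_available < 1:
        raise ValueError("need at least one GPU")
    rounds = math.ceil(num_experiments / gpus_available)
    return rounds * hours_each
```

Under these assumptions, an eight-experiment sweep at three hours per run takes 24 hours on a single card (`sweep_wall_time(8, 3, 1)`) and 3 hours if eight pooled cards happen to be free (`sweep_wall_time(8, 3, 8)`).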

Rob Wiblin: Earlier you said some software is designed such that it is more parallelizable, whereas other ones it’s more effort to ensure that things are extremely parallel. Is that right?

Nova DasSarma: Yeah. Absolutely. It’s certainly easier to think about the execution of a program as a series of instructions that execute in a row, and you’re doing one thing logically, and then another. Whereas, when you’re writing something that is designed to be run on a data-center-scale computer, or even if it’s something that’s meant to run on four cores on your CPU, you have to think about —

Rob Wiblin: At some point you’ve got to break it up…

Nova DasSarma: Yeah. You’ve got to break up the problem into subproblems that can then be solved simultaneously and brought back together. And that’s just quite difficult to do.
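
A minimal sketch of that split, solve and recombine pattern, using only Python's standard library. Threads are used here to keep the example short and easy to run; for CPU-bound work in CPython a process pool or an accelerator would normally be needed to get a real speedup, so treat this as an illustration of the structure rather than of performance.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Cut a list into at most `parts` contiguous chunks of near-equal size."""
    size = max(1, -(-len(data) // parts))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(chunk):
    """The subproblem: each worker handles its chunk independently."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Break the problem up, solve the pieces concurrently, then combine."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)  # bring the partial results back together
```

This case is easy because the chunks never need to talk to each other. The difficulty Nova mentions shows up when subproblems depend on one another's intermediate results.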

Rob Wiblin: OK, so that’s another margin of potential improvement, is improving the software to allow more to be brought on simultaneously?

Nova DasSarma: Yeah, absolutely. And especially in machine learning, you talk about various kinds of parallelism. You talk about the node level or the chip level or all sorts of stuff like this.

Rob Wiblin: So if you’re talking about increasing the efficiency of the use of the hardware by multiples, then I guess it explains why this is such an absolutely crucial issue. I don’t know whether you know what is the division between spending on wages versus spending on hardware and compute in various different forms, but I think compute is now a massive part of all the costs of any AI-related project. And so you double the efficiency of that.

Nova DasSarma: It’s $24,000 or something like that for an A100 GPU. You can get a lot of grad students for that.

Rob Wiblin: Right.

Nova DasSarma: So yeah, the compute is horrendously expensive.

Rob Wiblin: Yeah. I imagine that wages for people who are great at server architecture or compute architecture are getting bid up?

Nova DasSarma: I’d say they’re pretty good. Yeah.

Rob Wiblin: Cool. All right. Maybe we’ll stick out some links to some job opportunities.

Nova DasSarma: Maybe hit levels.fyi for this.

Rob Wiblin: Is Hofvarpnir hiring at all?

Nova DasSarma: We’re in a special case where we’re trying to hire for specifically junior candidates who maybe have some experience in DevOps, but are interested in learning more about it. So that’s what we’re hiring. I think we’re trying to hire right now, one full-time as of time of recording for this, who would be doing support stuff for this weird heterogeneous cluster. Because both Shauna and I have full-time jobs, and it turns out that running one and a half jobs all the time might be a little bit stressful sometimes.

Rob Wiblin: How many roles do you envisage there might be at Anthropic in that kind of work, in the fullness of time?

Nova DasSarma: For junior candidates, it’s hard to say. If you already know some stuff about this, then I think that we could easily absorb another four or five people that are doing the sorts of things that I’m doing and not slow down.

Rob Wiblin: Now. Let alone in the future.

Nova DasSarma: It’s really hard to say in the future. It depends significantly on how well we do and what our results are. So I can’t really talk very much about what future plans look like, but certainly right now, I think we could take some more folks.

Nova’s work and how she got into it [01:43:45]

Rob Wiblin: OK. Let’s push on from talking about a whole bundle of stuff to do with Anthropic and computer security in general, and let’s talk about how you got into this job. How did you end up in this role? It seems like you’ve done quite a range of different stuff in your career in the past.

Nova DasSarma: So I think most directly, a friend at Mila, which is another ML place, was talking with Andy over at Anthropic about some of the research they were doing. Andy mentioned that they needed some infrastructure stuff, and I at the time was friends with this person, and they mentioned, “Well, you should talk to Nova.” And so that’s the main way that I got here.

Nova DasSarma: In general, my background is pretty broad. I did information systems at UMBC, and a lot of the work there was on the corporate end, and talking about how you do business information systems. But I think a lot of the best work that I did there was in the Division of Information Technology. I used to work there, working on student systems and working on other systems there. And that’s something that I kind of fell in love with.

Nova DasSarma: Before then, I also did a bunch of bioinformatics. I think you can probably find some papers out there from me on orthologous genes and metagenomics, that sort of thing. A lot of the reason that I got excited about that work was because there were some very direct biological implications and really interesting things that came out of the computer work that I did there. And at the time, bioinformatics was sort of a nascent field in some ways, and there was actually work for somebody who is 12 working on that kind of thing.

Rob Wiblin: You were 12 when you were doing that?

Nova DasSarma: Yes. Both my parents are microbiologists, and so I helped them out with their work quite a bit. You can look up HaloWeb sometime.

Rob Wiblin: Nice. What were the biological implications of the work that you were doing?

Nova DasSarma: I did work on comparative genomics of haloarchaea, which are these very salt-loving microbes that live in places like the Great Salt Lake and the Dead Sea and things like that. And they’ve got some pretty interesting gas vesicle nanoparticles, which is sort of a fancy way of saying there’s this protein structure inside of them, that turns out to be very body neutral and immune system neutral, and very easy to express antigens on. So there are some vaccine implications there.

Rob Wiblin: When you say “body neutral,” you mean it doesn’t affect the human body directly and it doesn’t have proteins that tend to set off our immune system?

Nova DasSarma: Yes. So if you think about prosthetics or something like that, you oftentimes need to make them out of something like titanium, because otherwise your body will interact with them in a way where there’s inflammation caused or something like that.

Rob Wiblin: Yeah. Did you have a preexisting interest in AI alignment and AI safety before getting involved in this? Or are you mostly working at Anthropic because you’re interested in the IT and computer security and engineering aspects of the problem?

Nova DasSarma: I’ve been interested in things that look like ML systems for a long time. One of the most important techniques in determining what a gene does, and what we used to do, is something called a hidden Markov model — which is in some ways a more primitive form of doing machine learning to determine the weights of a Markov chain, so that way you can figure out what protein function is from a gene. So that was something I used to have to do.
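
For readers who have not met hidden Markov models, here is a toy sketch of the forward algorithm, which scores how likely a sequence is under a given model. It does not show the training step Nova alludes to, and the two states and all probabilities are invented for illustration; they are not taken from any real gene-finding tool.

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Probability of the observed sequence under the HMM, summed over
    every possible hidden state path (the forward algorithm)."""
    # alpha[s] = P(observations so far, current hidden state is s)
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        alpha = {
            s: emit_p[s][symbol] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical two-state model over DNA letters.
STATES = ("gene", "background")
START_P = {"gene": 0.5, "background": 0.5}
TRANS_P = {
    "gene": {"gene": 0.9, "background": 0.1},
    "background": {"gene": 0.1, "background": 0.9},
}
EMIT_P = {
    "gene": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "background": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}
```

Calling `forward("GCGC", STATES, START_P, TRANS_P, EMIT_P)` gives the likelihood of that short sequence under the toy model. In practice the transition and emission tables would be estimated from data rather than written by hand.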

Nova DasSarma: And for a long time, I used to write — and actually still do write — these chatbots for various kinds of systems, previously on IRC and now on Discord mostly. A lot of these systems interact with a lot of people, and you have to think very carefully about the UX of these sorts of things, so that people sort of treat the bots well. There’s sort of a humanization in some ways of trying to have the bot interact in a way that’s personable, so that people don’t abuse the bot and are less likely to do damaging things to it.

Nova DasSarma: I run this project, Fletcher, which does moderation for a large number of Discord servers. And in fact, this was one of the ways I got into ML. I run these large-scale ML models, doing inference on a global network of computers for doing this, and that was one of the things that I was working on. I think Andy tends to appreciate that that’s some deployment experience that I bring to the job today.

Nova DasSarma: AI safety in specific, I used to read LessWrong back in high school, and then I sort of dropped away from it for a while. And then I ended up back in the Bay, and people were talking about it. And it definitely seems like the sort of thing where it feels like infrastructure on the safety side is a limiting factor, and it felt like a place that I could try and make there be a differential boost for safety researchers. That’s one of the reasons why I’m at Anthropic specifically, because I think it’s one of the places where infrastructure is the most important, and I think it’s highest impact.

Rob Wiblin: OK, I’ve got a lot of followup questions. So one project you’ve done over the years is use machine learning in order to train a moderation bot that monitors these chat channels, and I guess blocks people and deletes stuff based on its ability to infer that these are things that you don’t want on the network. And you also tried to make it seem like a nice bot, so people wouldn’t try to attack it or make it not work.

Nova DasSarma: Yeah, that’s a good description. Though it’s less going to delete things, and more it forms a model of what a conversation looks like when it is angry or heated or something that you don’t want that’s not on topic. And it’s really quite a good use of ML, I think in some ways, to try and identify those sorts of situations before they escalate, and then escalate them to a human. Because oftentimes, these systems are not going to be reliable enough without a human in the loop, but they can watch everything when a human might not be able to. So a lot of the work with Fletcher is pinging a moderator if something looks like it’s about to get out of hand.

Rob Wiblin: Where do you get the training data for this? Is it just all of the past moderation decisions basically?

Nova DasSarma: So I can’t talk about what the current model is, because I open source it every time I have a better one, to try and avoid people trying to do adversarial things with it. But the original model was a very naive sentiment-based model, where it just looked for very high sentiment — oftentimes high negativity — looking for words where I had a very carefully curated list of things that should be brought to attention. I was able to crowdsource some of this information from moderators who already knew what they were looking for, to convert these into these expert system rules. Then once I had that, and I had some information about whether things did blow up or not, that was something that I was able to use to train models to identify these sorts of things, to build classifiers and that sort of thing. And I think my data pipeline’s kind of interesting there. I wish I could talk more about it.
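
The first-generation approach described here, a curated word list turned into simple rules that escalate to a human, might look roughly like the sketch below. The word list, weights and threshold are all made up; Fletcher's actual rules and models are not described in the conversation.

```python
import re

# Hypothetical curated watchlist: word -> how strongly it suggests a heated exchange.
WATCHLIST = {"idiot": 3, "hate": 2, "stupid": 2, "shut": 1}


def heat_score(message):
    """Sum the weights of watchlisted words appearing in one message."""
    words = re.findall(r"[a-z']+", message.lower())
    return sum(WATCHLIST.get(word, 0) for word in words)


def should_ping_moderator(recent_messages, threshold=4):
    """Escalate to a human when the recent conversation looks heated.

    The rule only flags; it never deletes or blocks anything itself.
    """
    return sum(heat_score(m) for m in recent_messages) >= threshold
```

Logging which flagged conversations actually blew up would then give labelled examples for training a classifier, which is the progression Nova outlines.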

Rob Wiblin: Sure, yeah. Who asked you to do this? Or did you just volunteer to do this?

Nova DasSarma: The original impetus for Fletcher was that I was on a server that was run by a friend of mine who I had a crush on. She was having some trouble with doing some moderation things, and having some trouble specifically with moving things between channels. And I was like, well, this is a way that I can flirt with her. And she is now my fiancée, and we work on a bunch of stuff together, and so I think it did work.

Rob Wiblin: That is a fantastic way to woo someone.

Nova DasSarma: Yeah, I tend to write software as that sort of thing.

Rob Wiblin: Nice. So it sounds like you’ve been doing interesting projects since you were a kid, basically. Are you one of these people who was doing really cool stuff at an age where I was just completely wasting away my time?

Nova DasSarma: I mean, I’m not sure that I would consider whatever you did wasting, because you are here now, but certainly I wasn’t this productive for this whole time. But yeah, I did a lot of science, I did a lot of computing stuff back in the day.

Rob Wiblin: That seems to be particularly common among tech people. I suppose one reason is the equipment’s not so expensive, and there’s just interesting stuff that you can do that doesn’t really require that much permission.

Nova DasSarma: Yeah. I think tech in general has a lot of low-hanging fruit compared to, say, physics. There have been a lot of people working on physics for a long time, and a lot of the easy problems have been solved, and so what you have left are these quite hard problems that you might have to do a lot of thinking for. Whereas I can mess around with a TI-84 Plus and get a lot of value out of that. I can do something that seems worthwhile. I could, for example, make a game, which is something that’s impressive to my friends. Or I can make a computer algebra system, which is impressive to, well, my grades in the end. So I think there are very, very direct applications here, and it definitely appeals to the engineer in me more than the scientist, but I think that’s OK.

Rob Wiblin: Did you have a particular kind of career philosophy that you’ve been following? Like, “I’m going to try to get extremely good at some aptitude, and then I’ll be able to use it in future”? Or have you more just been following your interests year to year?

Nova DasSarma: I think my general guiding philosophy is to make choices that bring me more choices, and to increase my capabilities — not on the AI side, but on the me side. So a lot of the work I do, and a lot of the projects that I choose, are ones where I think I’m going to come away with it better than I was before. For example, a beginner blacksmith starts with very few things and makes their first pair of tongs. And then you produce all these tools that you have built yourself, that let you do really amazing things.

Nova DasSarma: And I think a lot of my career has basically been, I’m going to build a tool that lets me… Here’s an example: I like to read fanfiction. That’s a thing that I enjoy. So when I was choosing projects back in the day, I wrote a piece of software that interfaced a fanfiction site with a computer-readable protocol. Then I wrote another piece of software that took that protocol and converted it into a unified interface, so I could have multiple backends into the same thing. Then I wrote another piece of software that tied into that same thing, that let me read those books in the browser. So I’ve got this stack of four pieces of things that all were individual projects that sort of tied into each other, and it meant that I could do a pretty impressive thing without that much difficulty. And having these tools has served me quite well, I think.
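
The layering in that example, site-specific adapters behind a unified interface with a reader on top, is a familiar pattern. The sketch below uses invented class names and an in-memory stand-in for a real site; it is not a reconstruction of Nova's tools.

```python
import html
from typing import Protocol


class StoryBackend(Protocol):
    """What every site-specific adapter is assumed to provide."""

    def fetch_chapter(self, story_id: str, chapter: int) -> str: ...


class InMemoryBackend:
    """Stand-in for an adapter that would talk to a real site."""

    def __init__(self, stories):
        self._stories = stories  # story_id -> list of chapter texts

    def fetch_chapter(self, story_id, chapter):
        return self._stories[story_id][chapter - 1]  # chapters are 1-indexed


class Library:
    """Unified interface: callers name a source, not a site-specific API."""

    def __init__(self, backends):
        self._backends = backends  # source name -> StoryBackend

    def read(self, source, story_id, chapter):
        return self._backends[source].fetch_chapter(story_id, chapter)


def render_html(title, text):
    """Top layer: turn a chapter into something a browser can display."""
    return f"<h1>{html.escape(title)}</h1><p>{html.escape(text)}</p>"
```

Because each layer only depends on the one below it through a small interface, a new site can be supported by writing one more backend and leaving the rest untouched.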

Rob Wiblin: Yeah. Are there any other important decisions or strategies that you’ve made in the past or used in the past that you think have helped to lead you to a good situation today?

Nova DasSarma: Take opportunities. I tend to be extremely busy, and I think that I probably still lean on the side of taking too many opportunities. But it is far better to do that than it is to let things pass you by. You’ve only got a limited amount of time, and it’s better to get into the habit of just sort of grabbing something. Even if you think that you might not be that good at it, you can always ask for help. And being humble and asking for help, and being willing to step up when nobody else will — or even if they will, still stepping up — is I think really, really important for anybody pretty much anywhere in their career.

Rob Wiblin: Is there any limiting principle to that? Are there any things that you’ve taken on that you kind of wish in retrospect you hadn’t?

Nova DasSarma: Certainly I’ve had projects that have, in the end, become too big for me, or have become something that’s caused me some sort of grief, but I don’t think that I really regret any of it. I’ve learned things from all of those, and as long as I have an exit strategy or a next thing to go to, it’s been pretty good.

Rob Wiblin: So you studied information systems at university.

Nova DasSarma: Also classics.

Rob Wiblin: Also classics. A real Renaissance woman. Can you explain the difference between information systems and information security and computer security and systems engineering? All of these things slightly blur together for me.

Nova DasSarma: Sure. Oh, yeah. I think a lot of these terms are pretty much what you make of them. But if you’re thinking about information systems in general, you’re talking about not only the computer side of things, but also the humans and the data that goes into the systems and out of the systems and the business processes that surround them. So information systems is sort of the study of that.

Nova DasSarma: Computer security is more talking about the boundary between computer systems. And then information security would be the boundary between human-plus-computer systems, and other human-plus-computer systems. Then systems architecture, which is sort of a title that is pretty broad in some senses, is talking about designing those systems. If you think about what an architect does for a building, I like to make the claim that a systems architect does a similar thing, but without an architecture degree. So we make more mistakes. That’s basically what I do at my job, is trying to design these really complicated systems in ways that they will be maintainable and secure and reliable and efficient and all these sorts of things. So there’s these properties of systems that I try and instill in them.

Rob Wiblin: How important was your undergrad in this whole picture?

Nova DasSarma: The experiences I had at UMBC were really quite exciting, and I think that I wouldn’t be here if it wasn’t for UMBC. But in terms of undergrad education, I think that degrees oftentimes don’t prepare people too well for this sort of role.

Rob Wiblin: Why’s that?

Nova DasSarma: I think a lot of the learning that you do is experiential. Especially in computers, I think it’s very much an engineering kind of degree that gets taught more like a science. Some of my most valuable classes were the ones that looked like, “We’re going to force you to do a group project with some arbitrary group of people on something.” I think that that was really quite good. I think the information security program at UMBC is excellent.

Rob Wiblin: And the stuff that’s less useful is the more theoretical computer science, which you don’t get to apply as much?

Nova DasSarma: No, I think that theoretical computer science is really important, but as a new programmer, it’s really hard to know when that’s the right thing to apply. So here’s an example. Let’s talk about Fletcher, actually, because I do this all the time. I could have decided that to write it efficiently, I would have to write something in C, and it has a bunch of very fancy algorithms to do something fast. But I didn’t do that. It’s in Python. And part of the reason that it’s in Python is that it is fast for me to iterate in that language. And it turns out that you can buy enough compute that it doesn’t matter, not in a very broad sense. Like obviously if I was writing this in TI-84 Plus basic, that would be maybe a bad idea, but for Python, it was the right choice because I could move fast.

Nova DasSarma: And it’s really hard to get that intuition without doing stuff. Because I have various functions to this bot that I absolutely know are O(N^2), which is like saying that you’ve got a piece of the program that the time to run it increases as the square of the number of pieces of input. I think a lot of beginning programmers will do things like, well, I’m going to do a bunch of work to replace this with an algorithm that is O(N) or is O(√N), or something like that. And somebody who’s experienced will say, well, I can bound N and that worst case is still fine, even if this algorithm itself is N^2. And those are the sorts of things that I think theoretical computer science stuff sometimes falls down on, compared to just “make some stuff.”
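
A small sketch of the "bound N and keep the simple algorithm" judgment. The duplicate check and the cap of 200 messages are hypothetical; the point is only that a quadratic loop over a small, capped input has a known and modest worst case.

```python
def worst_case_comparisons(n):
    """Number of pairs examined by the nested loop below: n choose 2."""
    return n * (n - 1) // 2


def has_duplicate_message(messages, max_n=200):
    """Quadratic pairwise check, acceptable because the input size is capped."""
    if len(messages) > max_n:
        raise ValueError(f"expected at most {max_n} messages")
    normalized = [m.strip().lower() for m in messages]
    for i in range(len(normalized)):
        for j in range(i + 1, len(normalized)):
            if normalized[i] == normalized[j]:
                return True
    return False
```

With the cap at 200, `worst_case_comparisons(200)` is 19,900 string comparisons. A hash set would bring the check down to linear time, but at this size the simpler loop is unlikely to be the thing that matters.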

Rob Wiblin: So I imagine there’s quite a lot of people in the audience thinking, “Nova seems pretty badass. I’d like to have a career a bit like Nova’s, or I’d like to have a dash of this in my life.” Is there any advice that you haven’t already given to people who are interested in having a career that’s a bit more like yours?

Nova DasSarma: I think that computer science and technical stuff in general is most useful when it is applied to something that you already know about. So if you’re somebody who’s interested in literature, but you also want to think about computer science, then find a journal that is interested in having a website, and build them a website. Build them a workflow tool. If you’re interested in biology, do some bioinformatics. It’s much more rewarding to do stuff when you can see the outcome and see the positive stuff that comes out of it.

Nova DasSarma: That’s one thing. And the other thing is, don’t get discouraged too easily. And if you do get discouraged, talk to somebody about it. Because I wouldn’t be where I am today without mentorship and being willing to reach out to people, to learn from their knowledge and that sort of thing. Send me an email if you’re interested. I’m happy to connect people together, and you should be willing to talk to people.

Nova DasSarma: One of my favorite activities that I do — that I imagine various government functionaries don’t appreciate — is there’s a thing called the Public Information Act, or the Freedom of Information Act, depending on what level of the government you’re at. When I have a question about something that’s running in an agency, I will just send in a request for this. This turns out to be a very rewarding thing, because you will likely get an answer of some kind to this. It’s a really big system, and it sometimes feels like you can’t affect it. But it turns out that there are many ways that you can, and this is a very direct way of doing that.

Rob Wiblin: How much does it help to have a partner who sounds like they’re kind of a fellow traveler, and has quite common interests?

Nova DasSarma: It’s fantastic working with Shauna. It’s really nice. Though the one thing that I would caution against is if you’re going to work with your partner, you’ve got to make sure that you’ve got good boundaries on what is work and what is not. So we both work at Anthropic, and cofounded Hofvarpnir together, and I do Fletcher for things that she’s moderated and that sort of thing. And we do a Burning Man camp, and we do lots of stuff together. And it’s very power couple, but I think being able to turn that off is really important.

Rob Wiblin: Yeah, I guess it reduces the diversification that you have in life, right? Because your relationship and your work are somewhat tied together. Some people recommend getting a partner who you definitely don’t work with, who has kind of separate interests, maybe even has a somewhat different personality.

Nova DasSarma: Yeah, I used to say I shouldn’t date programmers. And honestly, that was what I started out with. But then she stopped doing physics and started doing programming, so I guess I messed up there.

Rob Wiblin: Right. Right. I guess you converged a little bit over time. It sounds like it’s a maybe a slightly higher-risk strategy, but one that’s very rewarding when it’s working, and there’s things that you can do to try to make it not as risky as it could be if you were not thinking about it.

Nova DasSarma: Yeah, I wouldn’t recommend going into a relationship expecting to do this sort of thing. But if you’ve got opportunities and it seems to work out, and you’re being gradual and you’re being thoughtful about it, then yeah, I think it’s pretty good.

Rob Wiblin: What are the aspects of your current job that you most enjoy?

Nova DasSarma: Well, one, they give me all the computers I can eat, which is pretty good. And the scale of what we do at Anthropic is pretty exciting for me. But I think the thing that I most enjoy is being able to be around people who are smarter than me. A lot of the researchers have very, very exciting things that they’re working on, and deep understandings of ML systems that I don’t oftentimes. I’ve got some basic understandings and I’ve got some principles that I work by and I’ve done some deployments, but at the end of the day, I’m not driving the research forward. And being able to be in an organization that is advancing the state of the art, and being able to be a part of that, I can’t recommend anything more. It’s like being in the academy, but without as many of the granting politics.

Rob Wiblin: Right. OK, so you’ve said all of that positive stuff about Anthropic. Now, what are the parts of the job that you don’t like that much, or at least that might not appeal to other people so much?

Nova DasSarma: Well, two things. Parts that don’t appeal to other people: for some reason, people don’t really enjoy doing systems infrastructure, which is honestly their loss and my gain. But I think right now we are sort of people constrained in some of the tasks that we’re working on, and so that can result in things being kind of tough for me to work on. So oftentimes there are quite a few priorities. There are a lot of things that often feel like they’re top priority, so that can be difficult. But we’ve been working on it. We’ve got some new personnel coming on. By the time this airs, we’ll have some additional infrastructure folks on, so I’m excited to have minions.

Anthropic and career advice [02:04:16]

Rob Wiblin: Nice. Let’s push on from your career in general and talk a little bit about Anthropic and the work you’re doing now specifically. I’ve seen different safety agendas that different organizations have — there’s Anthropic, there’s folks at DeepMind, there’s folks at Hemera, there’s Redwood Research — there’s a whole lot of organizations that are started up, each with their own specific conception of what might help to make AI safer in future. Is there any way that people can go about reading these different things or try to form a judgment about which one they’d rather throw their efforts behind? Or is that maybe one where, in your position, you have to defer — to just trust the judgment of people who seem like they’re on the ball in general?

Nova DasSarma: I think if you’re interested in working in AI safety and trying to choose an organization, one thing you can look at is the output. You can look at what sorts of things they’re releasing. If they’re releasing more interpretability work, or if they’re releasing more alignment work, or if they’re releasing more capabilities work, I think that’s, in some ways, the most important thing. You can also take a look at the output from people on podcasts and things like that. For example, I think Dario and Daniela at Anthropic were recently on the FLI podcast talking about what our theory of change is and things like that. So that’s a good way to get an insight into these sorts of things.

Nova DasSarma: I don’t think anyone’s done any ranking or anything like that. The other thing that you should do is talk to people who work there. Oftentimes people are more approachable than you’d expect naively. People are very busy, but it’s often a pleasure to talk with somebody who’s bright and is excited about the future — or maybe concerned about the future and would like to do something to improve it — because not everybody’s doing that. So I would say, try and talk to some people.

Rob Wiblin: What are the opportunities to talk to people working at a place like Anthropic? Conferences, or specific meet and greets for career purposes?

Nova DasSarma: Certainly things like EA Global and ICML, and there are lots of things where there are conferences. I’d say that if you see somebody doing work that looks interesting to you, and you’ve read their papers and you have some questions about it, then you should email them. I think that the worst they can do is ignore it. I don’t think that anyone’s going to hold it against you to send a couple of questions or to ask for a few minutes of time. They might say no. They might say, “Come back after you’ve done X, Y, and Z.” There are definitely lists out there and things like that of ways that you can prepare and ways that you can make the most of time like that. But like I said, try and take those opportunities where they are.

Rob Wiblin: We talked about that issue of cold emailing people whose work you’re excited by in the interview with Chris Olah, so people can go back and listen to that interview from last year. Of course, Chris Olah is also one of your colleagues at Anthropic, so he knows what he’s talking about.

Nova DasSarma: Chris is absolutely a pleasure to work with.

Rob Wiblin: Someone in the audience wrote in with this question. Well, firstly, how did you get so good at DevOps, which we’ve somewhat covered already. But then they wanted to know, “What advice would you have for people who recently graduated with a computer science undergrad degree, but they don’t feel particularly exceptional at DevOps, but nonetheless they’d like to end up in a role like the one you have?” It sounds like this person is asking for a friend.

Nova DasSarma: Well, certainly they should email me. But I of course have to make the pitch that Hofvarpnir is hiring junior DevOps engineers, because I think it’s really, really important to get more infra-y people in AI safety and I’m willing to put in the mentorship hours to do that.

Nova DasSarma: But it turns out that it’s easier than you think to try out some of the things involved in DevOps. Many cloud platforms have free credits, especially for university students. But in general, they’ll give you $200 to play with, which is enough that you can run some pretty interesting things for a pretty long time. So I would recommend making an account at AWS and trying out some of the articles that they have. They’ve got things talking about their Elastic Kubernetes Service, or training a model, that kind of thing. The model thing is I guess more in the researcher end, but for DevOps stuff, try doing something like that.

Nova DasSarma: There are oftentimes opportunities that you see in your life to build services that would improve something. The best thing that you can do for DevOps, in my opinion, is to build a service and then release it widely — release it on Reddit, release it on Product Hunt, Hacker News, that sort of thing. There is nothing like doing a launch. There’s a joke about a QA tester walks into a bar, and they order one beer. They order two beers. They order minus one beer. They order Not a Number beer. And then the first customer walks into the bar and asks where the bathroom is, and the bartender explodes. That’s sort of a good representation of what it’s like to launch. You see all sorts of very, very odd things that you didn’t expect were possible. And it really helps you to sort of broaden your worldview in terms of the sorts of things that people will do with systems. And that’s the best thing to learn as a DevOps person. And yeah, email me. I’m happy to chat.
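
The joke above maps fairly directly onto input validation. As a minimal sketch (the function name `parse_beer_order` and its rules are hypothetical, invented here for illustration), this is roughly what handling the QA tester's orders looks like. The customer asking for the bathroom stands for everything the function was never designed to receive:

```python
import math


def parse_beer_order(raw: str) -> int:
    """Parse a beer count from untrusted text, rejecting the joke's edge cases.

    Hypothetical example: accepts only whole numbers of one or more beers.
    """
    try:
        value = float(raw)
    except ValueError:
        # "where is the bathroom?" ends up here
        raise ValueError(f"not a number: {raw!r}")
    if math.isnan(value) or math.isinf(value):
        # "NaN" parses as a float, so it needs its own check
        raise ValueError(f"not a finite number: {raw!r}")
    if value != int(value):
        raise ValueError(f"not a whole number of beers: {raw!r}")
    if value < 1:
        # covers "-1" and "0"
        raise ValueError(f"must order at least one beer: {raw!r}")
    return int(value)


if __name__ == "__main__":
    for order in ["1", "2", "-1", "NaN", "where is the bathroom?"]:
        try:
            print(order, "->", parse_beer_order(order))
        except ValueError as err:
            print(order, "-> rejected:", err)
```

A launch tends to surface the inputs that no test list anticipated, which is the point of the joke.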

Rob Wiblin: We’ve talked quite a lot about self-directed and organic ways of building skills. Are there any more formal courses of training that people could use in order to build their skills, or is that maybe the wrong way to be thinking about it?

Nova DasSarma: I think that those courses can be useful if you’re on the web side of things. MDN, which is the Mozilla Developer Network, has some interesting stuff. I think that there’s some Coursera courses out there that have looked pretty interesting — mostly on the ML side and less on the DevOps side.

Nova DasSarma: If you’re in a university kind of role, you’re in a good position to apply to one of these roles at a software company, and that’s the best thing you could do. Two things to keep in mind there. One, infrastructure is super in demand, it turns out. I’m not sure why everyone isn’t doing it because it’s the most exciting thing in the world and possibly the best thing that you could be doing. But that’s to your advantage if you’re interested in this sort of thing: I think there are fewer people that you’re competing with compared to a generic software position.

Nova DasSarma: The one choice you might want to make is between something like Google, Facebook, Amazon, those sorts of like large companies — where your DevOps looks pretty different from doing something at, for example, a startup, where the work that you would be doing is very much greenfield work, very much working with tools in a more direct way. You’ll get more mentorship at a place like Google, but you might learn slightly different things and you might need to do more projects on your own to see if you can apply those tools. Because for example, if you work at Google, a lot of the really hard problems have been solved for you. There’s a lot of people working there, and there’s a lot of tooling that’s been developed to make it so when a software engineer wants to launch a product, they have a very specific thing they can do.

Nova DasSarma: Whereas I did a bunch of startups. I’ve worked at a bunch of Y Combinator places, and every single one of them has been from the ground up: you’ve got to look at the problems and draw out a thing on paper and then make that happen, and you can choose basically whatever tools you want. It’s just a very different experience, I think. But internships are a good place for this, if you’re so inclined. I do still recommend doing your own projects though, because I think there’s nothing like that. If you’re looking for more feedback, then that’s where you want to launch, right? If you produce something that has users, those users will want things from you. And I think there’s nothing like that.

Rob Wiblin: Yeah. So let’s talk now about all of the different roles that Anthropic is trying to fill. It sounds like there could be quite a lot of different positions. Maybe it’s worth going through them a little bit systematically. So what are the different categories of roles that someone in the audience could hypothetically apply for?

Nova DasSarma: I think that the main roles that we’re looking for right now are things on the data engineering side; on the infrastructure side, which I’m leading up; and on the security engineer side, which is very, very related. I think the infrastructure and security run hand in hand — you can’t do systems without thinking about security. But those I think are the main ones.

Rob Wiblin: OK, maybe let’s go through those one by one. Not in enormous detail, because people can obviously go on the website and learn more about them, but maybe what sorts of skills or backgrounds do those call for?

Nova DasSarma: So on the data engineering side, I think that we’re looking for folks who are interested in taking a large pile of pretty unstructured data and structuring it and cleaning it and that sort of thing. You’d probably be familiar with tools like Spark or Hadoop. But honestly, if you know how to do sed really well, then that’s also very useful. On the engineering side, that’s pretty important.

Nova DasSarma: On the infrastructure side, I think candidates who would work well are generally strong software engineers — but it’s pretty easy to pivot from software into infrastructure, if you’re so inclined. On the infrastructure side of things, if you’ve had experience with, for example, Kubernetes, that’s a big boon. It’s quite difficult to replicate that experience, so I think that’s pretty important. Having worked with GPUs before, because they’re finicky beasts, unfortunately.

Nova DasSarma: On the security side, if you’ve worked with a large system before, I think that’s pretty important. But in a lot of ways there’s a lot of low-hanging fruit for systematizing stuff. So there are a lot of very broad backgrounds that you could have that would be applicable to that. Mostly you need to come at it with a creative mindset more than anything else.

Rob Wiblin: Yeah. That’s what everyone always says about the computer security stuff. I think Bruce Schneier was really on this line as well. Basically it’s like, you have to be the kind of person who looks at anything and is like, “Here’s the weakness. Here’s how I would break into that.”

Nova DasSarma: Yeah, yeah. I think that Bruce is spot on with that. I think that that’s the most important thing in a security person, being able to think outside of the box. We talk a lot in security about something called “fence post security” — where you’re walking in a desert and you see this really tall fence post in the middle of the desert, and you’re just going to walk around that. Security is about walking around those fence posts that people put up. And so a good security engineer is trying to actually build a fence.

Rob Wiblin: You’re kind of gesturing towards the idea that you can have this tokenistic security or tokenistic barriers, but then if they don’t actually stop someone who’s interested in getting past, then it might not be immediately obvious that’s the case.

Nova DasSarma: Yeah, yeah. Because I think it’s very easy to get the blinders on and think a lot about, for example, SOC 2 compliance or something like that. There are lots of compliance procedures where simply doing the steps is not enough for your system to be secure, and you have to be thinking about an adversary who doesn’t care about those steps.

Rob Wiblin: The rules, yeah.

Nova DasSarma: They care about getting in.

Rob Wiblin: Yeah, yeah. It’s extraordinary how uncommon that mindset is. I think I may even have given this rant with Bruce years ago. But my bank literally calls you up on your mobile phone and tries to have secure conversations with you about your bank account without doing anything to authenticate that they are from the bank.

Nova DasSarma: Have you considered changing banks?

Rob Wiblin: I actually have, yeah. I maybe seriously should do that. And then if you try to be like, “I don’t even know that you’re from my bank. Why would you expect me to talk with you about this?” They get kind of belligerent or baffled about the fact that I’m not willing to talk to a random stranger, even though it’s extremely common to have people call you up and pretend to be from a bank. Anyway, I don’t know what to say about that, but that’s kind of the state of financial infrastructure.

Nova DasSarma: I think that encouraging people to think about ways that systems fall down is extremely valuable. And in general, even if you’re not interested in doing security, you should still be thinking about these things. Because in some ways, security isn’t just about offense-defense. It’s a mindset that lets you solve problems in out-of-the-box ways.

Nova DasSarma: I used to joke that anytime I found a security vulnerability in the system, it was an unexpected patching mechanism. It’s a way that you can update that software without having to go through the standard procedures for it. So in some ways a lot of security looks like this. Or if you’re interacting with a government bureaucracy and you’re having a lot of trouble with the standard way of doing it, maybe you should call a senator, who knows, that might be able to help. They won’t tell you to do that on the website. You need to be thinking about all the ways that you can hit that problem.

$600M Ethereum hack [02:17:01]

Rob Wiblin: I read a couple of weeks ago about these folks who managed to steal hundreds of millions of dollars of some cryptocurrency, I think ETH. Basically they were looking at various patches that had gone through to these interfaces that moved cryptoassets between different blockchains. They noticed that someone had accidentally programmed this check system so that rather than check that it said it was accurate, they instead checked that X and Y were the same. So they could make the check sum be inaccurate, but then say that it’s inaccurate. And it would say, these two match, because they’re both inaccurate, and then it would approve it.

Rob Wiblin: I guess it’s the kind of thing that’s just so easy to overlook when you’re looking at the code. You’re like, oh, it says equals-equals when it should actually say, I don’t know what the positive side is. But it seems like a kind of classic computer security error, and in this case it cost hundreds of millions of dollars of ETH.
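
To make the class of bug described here concrete, the following is a minimal sketch in Python. It is an illustration of the general mistake (comparing two caller-supplied values with each other instead of recomputing the expected value), not a reconstruction of the code involved in the actual incident. All function names are hypothetical.

```python
import hashlib
import hmac


def checksum(payload: bytes) -> str:
    """Compute the checksum the verifier should expect for this payload."""
    return hashlib.sha256(payload).hexdigest()


def flawed_verify(claimed_a: str, claimed_b: str) -> bool:
    # Bug: both values come from the caller, so this only proves that the
    # caller is consistent with itself, not that either value is correct.
    return claimed_a == claimed_b


def sound_verify(payload: bytes, claimed: str) -> bool:
    # Recompute the checksum from the data itself, then compare in
    # constant time against what the caller claimed.
    return hmac.compare_digest(checksum(payload), claimed)


if __name__ == "__main__":
    payload = b"transfer 100 tokens"
    print(flawed_verify("bogus", "bogus"))            # accepts two matching wrong values
    print(sound_verify(payload, "bogus"))             # rejects the wrong value
    print(sound_verify(payload, checksum(payload)))   # accepts the right value
```

The two versions differ by a line, which is part of why this kind of error is easy to miss in review.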

Nova DasSarma: Yeah. I’ve definitely seen that one going around. It turns out that the actual thing that happened there is almost funnier. Maybe not funny. I think it’s caused a lot of economic damage. But the thing that happened there wasn’t that they specified the wrong check exactly. So basically when you think about something like a data store, say it’s a blockchain: you’ve got inputs coming from somewhere and then you store those output results on the chain in some way, or in this ledger.

Nova DasSarma: The way that Ethereum works is that oftentimes the functions that are called on that blockchain store their results in the blockchain, and you might have multiple functions that need to be called to verify the same thing. And because it’s quite difficult to modify protocols after they were created, the Ethereum developers put in a bunch of functionality to modularize things like checks. So for example, if you wanted to upgrade the security — you needed to increase the number of bits in an encryption key or something like that — there are ways to do that without having to have everybody update their software, which would be impossible with a distributed system.

Nova DasSarma: So what the attackers did is, there is a thing on that chain that verifies whether the protocol that you are asking it to verify your signature with — so the verify verifier — is a system output. But if you store something to the chain fast enough, you can say that that check returned true, even if that function hasn’t finished. So they were able to inject this bad check — it is accurate the way you were talking about that — but the way that they were able to do this involved basically a race condition between two systems trying to verify.

Rob Wiblin: Their thing on the chain.

Nova DasSarma: Yeah, exactly.

Rob Wiblin: So it’s slightly more complicated.

Nova DasSarma: Yeah. It’s slightly more complicated and it required a little bit more thinking outside the box than noticing this one vulnerability. But it’s still a great example of how a computer security error can have a very large economic impact.

Rob Wiblin: Yeah, yeah. To come back to what we were talking about, I was just thinking this is an example of someone who has this mindset of just like, “I’m going to break stuff.” Who might look at that code and be like, “Wait, couldn’t you race to put in this alternative thing on the chain, if you got it in first?” And if you’re the kind of person where weaknesses like that jump out at you, then you should go into computer security.

Nova DasSarma: Absolutely. Yeah. I mean, I think that there are many things that you can do other than computer security, but computer security is full of these sorts of things and the rewards for finding it are, I think, much higher than in a lot of other places.

Rob Wiblin: Are there any sorts of people who think they’re not qualified or not suitable to work at Anthropic, or in these kinds of roles, but actually are? Is that a phenomenon?

Nova DasSarma: Yeah, I think we see that sometimes. The things that we’re looking for are folks who are relatively self-directed and are able to pick things up fast. The biggest thing is you might not have a huge ML background. But if you’re a really strong software engineer — like I was saying at the beginning — I think sometimes the ML is pretty easy compared to the software engineering problems, and you can pick up the ML. Jared Kaplan has a really good note out about learning ML, that’s I think really targeted at physicists, but I think it’s one of the clearest things out there on this. So if you think that’s readable, and you’re otherwise a pretty strong software engineer, then I encourage you to apply.

Personal computer security advice [02:21:30]

Rob Wiblin: Let’s just talk a little bit about what people in the audience can potentially do to tighten up their personal computer and information security. And I suppose what stuff they might be able to introduce into their organizations in order to be able to lift their game. We’ll be able to link to a doc that you’ve been involved in writing, called Security recommendations for new organizations, which covers some of this. Maybe let’s try to keep it to three top recommendations. What would number one be?

Nova DasSarma: For an organization or a person?

Rob Wiblin: For a person.

Nova DasSarma: For a person, number one I would say is use two-factor authentication everywhere you can. It doesn’t matter if your password gets compromised if you are able to deny access anyways, because they don’t have a hardware key. I think on that same front, use a password manager. Please use a password manager. You are decreasing the blast radius of any given compromised password by doing this. So do that.

Rob Wiblin: And that’s because if you use similar passwords or the same passwords in lots of different places, then if someone steals one, then they’ve stolen all of them.

Nova DasSarma: Exactly. Yes. And that being said, even if you make no other changes and you don’t actually use a password manager, certainly have your mail password be different.

Rob Wiblin: Because that’s the skeleton key to everything.

Nova DasSarma: That is the skeleton key for resetting so many things. Very frustrating sometimes. And if we had to pick a third thing, I mentioned use an ad blocker. I’m going to say it again. I think that using an ad blocker is really, really, really important here in preventing random malicious code from being injected in your computer.

Rob Wiblin: I didn’t realize that ad networks had become such a common source of malicious code. It’s kind of surprising they can’t tighten themselves up.

Nova DasSarma: I think that the incentives really aren’t in place for that, because it’s not like the ad network is going to stop being used if there is any malware delivered by it, because it’s really hard to trace it back and things like that. I just think that the incentives aren’t in place for this, though it’s less bad than it used to be. I think in the bad old days you could throw any JavaScript you wanted in there. And then we changed that because it was bad. But that being said, I think it just still is quite unfortunate.

Rob Wiblin: It’s such a juicy target.

Nova DasSarma: Yeah. Because if you, for example, send something and you want to target a specific person, you can do all these targeting mechanisms that you wouldn’t otherwise be able to do by saying, “Well, I know that they work in San Francisco and they are probably using a Mac and…”. So it’s worth doing.

Rob Wiblin: Yeah. I imagine everyone listening to this show is going to be familiar with two-factor authentication, where you get that six-digit code that you take out of your phone or from SMS and plug it in. The thing that we ideally will be switching over to for almost everything is one of these hardware keys, which is a thing that you plug into your computer and you press a little button on it and it does a cryptographic operation to prove that you had that key on you.

Rob Wiblin: I think that there’s a bunch of different ways that’s a whole lot better, but one of them is that it’s a lot less vulnerable to phishing attacks. So effectively, even if you have two-factor authentication where you’re getting that six-digit number, if someone sends you an email and directs you to a fake login website, they can just immediately take both the password that you’ve put in and the six digits that you’ve put in — the second factor that you’ve put in — and just log in as if they were you somewhere else. And that is, I think, quite a common way to break into people’s accounts. But that is basically not possible with these U2F hardware keys that have become reasonably common, and you can now lock up your Google account and your Facebook account and many other accounts with those.
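
The phishing resistance described here comes from the key's response being tied to the site the browser is actually talking to. The sketch below is a deliberately simplified model in Python and rests on several assumptions: real hardware keys use public-key signatures rather than a shared secret, the function names are hypothetical, and the `.example` origins are placeholders. It only shows why a response produced on a look-alike site fails on the real one, whereas a typed six-digit code is just a string that a fake site can relay unchanged.

```python
import hashlib
import hmac
import secrets


def sign_challenge(device_secret: bytes, origin: str, challenge: bytes) -> bytes:
    """Stand-in for the hardware key: the response covers the origin
    reported by the browser as well as the server's challenge."""
    message = origin.encode() + b"|" + challenge
    return hmac.new(device_secret, message, hashlib.sha256).digest()


def server_accepts(device_secret: bytes, expected_origin: str,
                   challenge: bytes, response: bytes) -> bool:
    """The real site only accepts responses computed for its own origin."""
    expected = sign_challenge(device_secret, expected_origin, challenge)
    return hmac.compare_digest(expected, response)


device_secret = secrets.token_bytes(32)   # simplification: shared with the server
challenge = secrets.token_bytes(16)       # fresh per login attempt

# User is on the genuine site: the browser reports the genuine origin.
honest = sign_challenge(device_secret, "https://bank.example", challenge)

# User is on a phishing page that relays the genuine challenge: the browser
# reports the phishing origin, so the response is bound to the wrong site.
phished = sign_challenge(device_secret, "https://bank-login.example", challenge)

print(server_accepts(device_secret, "https://bank.example", challenge, honest))
print(server_accepts(device_secret, "https://bank.example", challenge, phished))
```

The first check succeeds and the second fails, because the origin is part of what gets signed and the user never types anything an attacker could copy.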

Nova DasSarma: Yeah. And I really recommend those. I also recommend getting two keys, because I think one of the concerns people often have is that they might lose this key, and that’s a reasonable concern. So you should have two of them. Almost any reputable site will let you register multiple keys. Keep one of them in a secure location and keep the other one on your keychain. And you’ll do a lot better.

Nova DasSarma: I also didn’t mention if you’re buying technology, try and ensure that you’re buying it from a vendor who is reputable. It’s very easy to buy things like USB cables and stuff like that from whoever has the lowest dollar amount on Amazon. Keep in mind that if you’re plugging something into your computer, you are giving whoever produced the device hardware access to it. Even something like a USB cable can be pretty compromised. You have no way of looking inside that cable really and checking if there’s a chip there that when your computer’s away will turn into a keyboard to start typing some stuff there or something like that. And that’s not a theoretical attack — we absolutely see these in the wild. So I guess that’s the other thing. You asked for three and I gave you four, so sorry about that.

Rob Wiblin: No, that’s totally fine. Where should I go to buy hardware like a USB hub? I’ve bought those things on Amazon before. Is that a problem?

Nova DasSarma: I think there are some concerns with Amazon in terms of mixed inventory and things like that. If you can buy them directly from vendors, this can be better. I live in the US, and I have some amount of extra trust there for US companies and trying to get hardware that was at least designed in the US. This isn’t necessarily something that will always help you, but it’s at least something that can help as a heuristic.

Rob Wiblin: What about going into a shop, like going into a computer hardware shop? Is that probably better than Amazon?

Nova DasSarma: It might be better than Amazon in some ways, because I think that their supply chain looks kind of different. Amazon does things like mingling their inventory, so they might have a device that they bought from several different people in the same bin, and you have no way of really knowing whether the thing that you got was from the original vendor — or worse, if somebody swaps something out in the box or something like that.

Rob Wiblin: You’ve spoken positively about MacBooks. Generally do you think MacBooks are a good laptop from a security point of view?

Nova DasSarma: I think they are. Mac makes it very easy to always have encryption on your hard drive by default. You should always enable that. And I think that their encryption hardware is pretty good. They have some good incentives in place to ensure privacy and things like that. So I sort of trust that they’re making a good effort at it.

Rob Wiblin: Is iPhone still substantially better than Android phones? Even like the Google-sold ones?

Nova DasSarma: I would say that’s probably true. But I use a Pixel, and my general philosophy here is that if you are accessing something or if it is accessible from your phone, then you should treat it as compromised.

Rob Wiblin: I see. So phones are just sufficiently insecure in general that you don’t want to be doing anything sensitive on them, ideally.

Nova DasSarma: Yeah. There are some things that you can do. On Android, with things like a work profile — you can set apps that are for your work basically on a separate partition from everything else. And I recommend doing that if you’re going to use an Android phone, which I do.

Rob Wiblin: Obviously people should keep their operating system and their phone and their browser up to date with all the necessary patches. Any preference on browser choice or browser behavior, beyond having the ad blocker?

Nova DasSarma: So I use Firefox for my personal stuff. And the reason I do that is less on the security side and more that I think that there should be multiple browsers, and that the web should not be siloed into a single browser’s thing. That being said, if you’re going to choose one browser, you probably should choose Chrome. I think that Google has more folks working on it than Mozilla has on Firefox, so some of those security properties can be better — assuming that your threat actors don’t include Google.

Rob Wiblin: Yeah, an important thing to add there.

LastPass [02:29:27]

Rob Wiblin: For a number of years, I used LastPass. And I was incredibly irritated that despite the fact that these U2F keys were becoming very common, widely available, and dramatically improved security by having this multifactor authentication — something that you definitely have to have in your hand in order to be able to log in — that LastPass just did not implement this, despite the fact that their whole thing is to be the most secure thing, because it’s holding all of your most sensitive information. Anyway, I switched away from LastPass. Do you have any views on what password manager people ought to use? And should I be as disgusted with LastPass, as I in fact am? I’m extra annoyed with them because they claimed for years that they were always about to implement it and then they just stopped even saying that.

Nova DasSarma: That’s definitely a strike against them, because I think that two-factor authentication using hardware tokens is absolutely the way of the future. I expect that to broadly expand as we get easier libraries for people to implement and more demands on things like a SOC 2 audit for security that is more than just wallpaper. I agree that’s definitely something that I would love to see out of LastPass in terms of expressing a better interest in securing your passwords better.

Nova DasSarma: I think a lot of folks use 1Password. 1Password’s been pretty good on this front. That being said, the best password manager is the one that you use. So if you are a person who’s thinking about getting a password manager, I highly recommend starting with something like the built-in Chrome one or the built-in Firefox one. It’s pretty easy to migrate away from them, but you will use it. And that is the most important thing: having separate passwords for every site, and consistently using them, contains the vulnerability to one site if one password gets leaked or broken.
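
What a password manager does when it creates a login can be sketched in a few lines of Python. This is an illustrative sketch and not how any particular product works: the function names, the 20-character default, and the example site names are assumptions. It uses the standard library's `secrets` module, which is intended for security-sensitive randomness.

```python
import secrets
import string

# Characters to draw from; real managers let you adjust this per site.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20) -> str:
    """Return a random password drawn uniformly from ALPHABET."""
    if length < 12:
        raise ValueError("choose at least 12 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def passwords_for(sites: list[str]) -> dict[str, str]:
    """Give every site its own independent password, so one leak
    exposes only the account it belongs to."""
    return {site: generate_password() for site in sites}


if __name__ == "__main__":
    vault = passwords_for(["mail.example", "bank.example", "forum.example"])
    for site, password in vault.items():
        print(site, password)
```

Because each password is generated independently, a breach at one site reveals nothing about the others.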

Rob Wiblin: A vulnerability that I’ve always worried about is that you are inserting passwords, including often your password to your main password manager, into a browser. And all of the extensions that you have within that browser, I think can kind of see those passwords, or they can see the keystrokes that you’re entering into the website. And Chrome extensions, Firefox extensions have a record of being regularly compromised or regularly sold to bad actors, who then use them potentially to steal passwords. Are there a lot of these just gaping holes in the way that ordinary people use computers that are making them much more vulnerable than they really ought to be?

Nova DasSarma: Yeah. In security, we talk about these as side-channel attacks, where the primary channel would be breaking into your bank and the side channel is when you put a keylogger on somebody’s desktop so you can grab their password instead. Certainly extensions are a big concern here. I use quite a few extensions. I think that this is definitely a thing that’s quite useful. I am also in a role where I am being paid to be professionally paranoid, and so I read the code that’s being added to those. Trying to limit the number of them that you’re using, trying to keep an eye on what’s happening there is important.

Nova DasSarma: I would say that browsers are more secure than I certainly thought they were back in the day. Chrome especially has had a lot of work done by a lot of people working full time to sandbox execution of arbitrary code. When you think about programs, oftentimes they are something that’s written by somebody else that’s running on your computer, where your data is.

Nova DasSarma: And the web is increasingly like this. We’re recording right now on Riverside FM. It’s got video, it’s got audio streaming, it’s uploading files, it’s downloading files. It’s able to do all sorts of really, really exciting things. And this is inside the browser. If it was something that you had asked me to download, I would’ve been a lot more concerned. I think that the JavaScript sandboxing ecosystem has gotten very, very advanced. People have put a lot of thought into how to do smart things with it.

Nova DasSarma: I think that browsers in particular are oftentimes more secure than things that you’re running unaudited on your laptop. This is actually something where desktop operating systems have taken a page out of mobile’s book though: sandboxing, by default, is something that’s true on many apps and things like that. Permission dialogues for requesting access to information were not a thing on desktops really, because we didn’t start out thinking about that.

Nova DasSarma: The permission scheme for files on a Linux or Unix operating system has a set of permissions for read, write, and execute for user, group, and everybody. And for the longest time, “everybody” could access everything. This was expected: you were inside of a university, everybody was trusted. And moving to this model where things aren’t trusted by default has been very painful, but I think that browsers have been leading the way on that. So that’s pretty exciting.
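
Editor's illustration: the read/write/execute scheme Nova describes can be inspected with Python's standard `stat` module. The following is only a minimal sketch of how the three permission classes are encoded in a file mode. The helper names `describe` and `world_accessible` are hypothetical and chosen for this example; they are not part of any tool mentioned in the conversation.

```python
import stat

# Permission bits for each class, in read/write/execute order.
CLASSES = {
    "user": (stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
    "group": (stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
    "everybody": (stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
}


def describe(mode: int) -> dict:
    """Render a Unix file mode as an 'rwx'-style string per class."""
    return {
        name: "".join(
            letter if mode & bit else "-"
            for letter, bit in zip("rwx", bits)
        )
        for name, bits in CLASSES.items()
    }


def world_accessible(mode: int) -> bool:
    """True if 'everybody' has any of read, write, or execute."""
    return bool(mode & stat.S_IRWXO)


# The old "everybody can access everything" default.
print(describe(0o777))
# {'user': 'rwx', 'group': 'rwx', 'everybody': 'rwx'}

# A more restrictive mode: owner read/write, group read, nobody else.
print(describe(0o640))
# {'user': 'rw-', 'group': 'r--', 'everybody': '---'}
print(world_accessible(0o640))
# False
```

To check a real file rather than a literal mode, something like `describe(os.stat(path).st_mode)` should work on Linux and macOS, since only the permission bits are tested.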

Rob Wiblin: As far as I understand what you’re saying, I guess in the bad old days, we had this issue that if you loaded up a website, Internet Explorer or whatever browser you were using was not sufficiently good at sandboxing — which I guess is kind of constraining the code that’s running within that webpage to just interact inside that webpage, inside the browser. Instead, they could frequently find ways to get their tentacles into other parts of the disk to run code that you wouldn’t expect them to be able to run.

Rob Wiblin: But these days we’ve gotten better with Chrome and Firefox, and I guess just better computer security in general — figuring out how do we make sure that this tab that we’re using right now to record this conversation can only do the things that we would expect it to be able to do inside this browser, and not to access files that it can’t access, not to do broader things on our MacBooks that are beyond the permissions that Chrome has given this particular tab. Is that right?

Nova DasSarma: Yeah. I think that’s a really good gloss of this. We’ve developed a lot of tools recently. You were asking about whether things have gotten better or worse here. Our testing for security vulnerabilities has gotten unimaginably better over time. There are things like fuzzers, there’s a thing called American Fuzzy Lop — look it up, it’s a piece of software that generates inputs to systems to try and cause undefined behavior and things like that. Being able to produce these weird automated test streams to break into things is a new innovation, fuzzers in general.
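
Editor's illustration: to give a rough sense of what a fuzzer does, here is a toy mutation-based fuzzer. It is far simpler than American Fuzzy Lop, which also uses coverage feedback to guide its mutations, and it makes several assumptions. The target `parse_record` is a hypothetical parser with a deliberately planted bounds bug. Because Python has no undefined behaviour, any exception other than the expected `ValueError` stands in for a crash.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Hypothetical target: [length][payload...][checksum]."""
    if not data:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    # Planted bug: no check that the checksum byte actually exists.
    checksum = data[1 + length]
    if checksum != sum(payload) % 256:
        raise ValueError("bad checksum")
    return payload


# A valid seed input that the parser accepts.
SEED = bytes([3]) + b"abc" + bytes([sum(b"abc") % 256])


def fuzz(target, seed: bytes, iterations: int = 200, rng_seed: int = 0):
    """Overwrite one random byte per iteration and record crashes."""
    rng = random.Random(rng_seed)  # fixed seed so runs are repeatable
    crashes = []
    for _ in range(iterations):
        data = bytearray(seed)
        data[rng.randrange(len(data))] = rng.randrange(256)
        candidate = bytes(data)
        try:
            target(candidate)
        except ValueError:
            pass  # the parser rejected the input cleanly, as intended
        except Exception as exc:  # anything else counts as a crash
            crashes.append((candidate, type(exc).__name__))
    return crashes


if __name__ == "__main__":
    found = fuzz(parse_record, SEED)
    print(f"{len(found)} crashing inputs found")
    if found:
        print("example:", found[0])
```

Here, corrupting the length byte makes the parser read past the end of its input, which is the kind of mistake real fuzzers are good at surfacing in native code.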

Nova DasSarma: That being said, for browsers specifically, we used to have things like plugins. I think maybe it’s not clear what the difference between a plugin and an extension is. A plugin is oftentimes something that’s developed in native code. So you’ve got something like the Java virtual machine, or the Java plugin. It’s probably going to be written in C and it’s something that runs on your hard drive, separate from the browser process. You’ve got Flash. Flash is oftentimes very controversial in terms of being very easy to use, a great creativity boon, and also the source of countless —

Rob Wiblin: Ungodly numbers.

Nova DasSarma: Oh, for sure. Yeah. It was quite bad. The death of it was very bittersweet, I think. But for example, I’ve got an extension for making sure that things only run inside of the Facebook sandbox, and that’s something where I can verify that it does what it says it does. I can rely on the JavaScript sandbox to restrict access to other sites and things like that.

Stuxnet [02:36:31]

Rob Wiblin: Yeah. OK, we’ve taken up a ton of your time, and I’m sure people have got the impression that you have a ton of stuff on the go to immediately go work on. Maybe a final question. Do you have a favorite story of a hack or a computer security breach that will be fun to share with people?

Nova DasSarma: Sure. The one that springs to mind is Stuxnet, which you may be familiar with as quite an interesting state actor hack. Basically, even though the centrifuges being used for enriching uranium were on a completely air-gapped network, and even though they were in a secure facility with the guys with the guns outside, the US national security apparatus managed to get a virus onto these systems that essentially adjusted the RPM of the centrifuges to get them to break themselves. It’s very hard for me to give a concise version of this, but it’s definitely worth looking up. I would also recommend the podcast Darknet Diaries for this.

Rob Wiblin: I started listening to Darknet Diaries. I think its opening line is, “These are stories from the dark side of the internet.”

Nova DasSarma: I know! It’s so dramatic.

Rob Wiblin: And I was like, “I’m not sure that serious computer security people listen to this podcast.” But it sounds like actually they do.

Nova DasSarma: They do, I think. It’s certainly entertaining. And it’s a good way to get ideas of crazy things that people do to break into each other’s stuff.

Rob Wiblin: Yeah. I’ve been listening to that podcast recently, and I’m glad I shouldn’t be too ashamed of the fact. On Stuxnet, there’s this great book about it called Countdown to Zero Day. It’s a very fun potboiler that goes into all of the technical aspects of it, which are truly remarkable. I don’t know when we’ll see the next most impressive technical hack from that point of view.

Nova DasSarma: I have no idea. I’m very excited to see it.

Rob Wiblin: For sure. Well, my guest today has been Nova DasSarma. Thanks so much for coming on The 80,000 Hours Podcast, Nova.

Nova DasSarma: Thanks for having me.

Rob’s outro [02:38:41]

Rob Wiblin: OK, so as promised in the intro, I’m going to go through a bunch of other shows you might be interested to subscribe to if you like The 80,000 Hours Podcast.

Don’t worry, we’re not getting any ad revenue. I’m just a helpful person and a big fan of podcasts.

Probably the single most similar show to this one anywhere is called Hear This Idea. Like all of the shows I’m about to mention, it’s another deep interview podcast, and the two hosts (Fin Moorhouse and Luca Righetti) describe it as “a podcast about ideas that matter — showcasing new thinking in philosophy, the social sciences, and effective altruism.”

A fun episode from them to check out would be #34: Anders Sandberg on the Fermi Paradox, Transhumanism, and so much more.

Another one worth checking out is Narratives Podcast, hosted by Will Jarvis. He describes the show as being about “the ways the world is better than in the past, the ways it is worse, and paths toward a better, more definite vision of the future.” A top episode for me was #90: The 800 Year Decline in Interest Rates with Paul Schmelzing.

A third one, which many of you will already know about, is the Future of Life Institute Podcast, which is focused on existential and catastrophic risks and how to prevent them, as well as longtermism more broadly.

While the topics are similar to this, I’d say the FLI Podcast is a notch more serious and technical than us. If you look back into their extensive archives, you’ll find a series of episodes on AI alignment and another on climate change, which in aggregate would take you very deep on those topics.

A reasonable place to start is just their most recent episode: Daniela and Dario Amodei on Anthropic.

Fourth, a hugely popular show that has been running for an incredible 12 years is the Rationally Speaking Podcast with Julia Galef. Julia is open to covering a wide range of topics, because her main goal is to showcase reasonable and evenhanded conversations about difficult and sometimes divisive issues.

A fairly recent episode that was memorable to me was #250: What’s wrong with tech companies banning people?, a conversation with Julian Sanchez.

Another show along the same lines is called Clearer Thinking with Spencer Greenberg. Spencer pitches the show as a “podcast about ideas that truly matter” featuring “fun, in-depth conversations with brilliant people.” Spencer talks to all sorts of folks, including some I think have great ideas and others I think are pretty misguided — but he gives guests plenty of room to lay out the case for their personal perspective on things.

A memorable recent conversation for me on that show was #97: Why is self-compassion so hard? with Kristin Neff.

A final one that almost none of you will have heard of is called Un equilibrio inadecuado with Fernando Folgueiro. It’s basically an attempt to do this show, but in Spanish with Spanish-speaking guests. I know we have plenty of Spanish speakers in the audience. I can speak Spanish reasonably well and I still had to slow it down to follow, but it was a great way to learn vocabulary that’s relevant to the topics we talk about on this show.

A good one to check out first on there would be Julio Elías: Mercados repugnantes. Currently the show isn’t getting new episodes, but hopefully Fernando or someone else in the Spanish effective altruism community will be able to pick it up in future.

I just mentioned a lot of shows and episodes here, but if it’s easier you can head to the transcript for this conversation on the website, scroll to the bottom and find this section, which will have links to all of them so you can go through them systematically.

All right, The 80,000 Hours Podcast is produced and edited by Keiran Harris.

Audio mastering and technical editing for this episode by Ben Cordell and Beppe Rådvik.

Full transcripts and an extensive collection of links to learn more are available on our site and put together by Katy Moore.

Thanks for joining, talk to you again soon.

Learn more

AI security

Risks from power-seeking AI systems

Machine Learning PhDs

Computer Science PhD

Related episodes

August 4, 2021

#107 – Chris Olah on what the hell is going on inside neural networks

October 25, 2019

#64 – Bruce Schneier on how insecure electronic voting could break the United States — and surveillance without tyranny

January 19, 2021

#90 – Ajeya Cotra on worldview diversification and how big the future could be

October 2, 2018

#44 – Paul Christiano on how OpenAI is developing real solutions to the ‘AI alignment problem’, and his vision of how humanity will progressively hand over decision-making to AI systems

August 11, 2021

#108 – Chris Olah on working at top AI labs without an undergrad degree

June 3, 2019

#58 – Pushmeet Kohli on DeepMind’s plan to make AI systems robust & reliable, why it’s a core issue in AI design, and how to succeed at AI research

August 19, 2021

#109 – Holden Karnofsky on the most important century

March 5, 2021

#92 – Brian Christian on the alignment problem

About the show

The 80,000 Hours Podcast features unusually in-depth conversations about the world's most pressing problems and how you can use your career to solve them. We invite guests pursuing a wide range of career paths — from academics and activists to entrepreneurs and policymakers — to analyse the case for and against working on different issues and which approaches are best for solving them.

Get in touch with feedback or guest suggestions by emailing [email protected].
