
Speedrun blooper reel of a year of AI app development.

This video shows how a simple glitch can have big repercussions. An app update, intended to improve its functionality, instead caused it to produce incorrect job recommendations. A reminder that the smallest details can have a big impact.
00:00

Hey everyone, my name is Lucas Meier. I am not a firefighter. Imagine a job interview where one of the components is actually an interview with a psychologist. Now imagine software, an app, that takes a recording of a conversation like that and turns it into a report.

00:28

You could imagine that a prompt for an app like that would look kind of like this. You would throw in a bunch of example reports. You would throw in a transcript of the actual interview that happened. Maybe some private notes from the psychologist who did the interview.

00:48

And maybe some generic instructions on what the report should look like, how long it should be, things like that. So I have this app in production, and it works kind of great, except that from time to time we have this problem where it misgenders the candidate. It would say "him" when it's really her, or "her" when it's him, because we don't actually feed any of the real names into the language model.
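As a rough illustration, not the speaker's actual prompt, a minimal Python sketch of the prompt structure described above (the tag names and wording here are assumptions) might look like this:

```python
# Minimal sketch (illustrative, not the production app) of the prompt structure
# described above: example reports + interview transcript + the psychologist's
# private notes + generic instructions about the report.

def build_report_prompt(example_reports, transcript, private_notes, instructions):
    """Assemble the report-generation prompt from its four parts."""
    examples = "\n\n".join(
        f"<example_report>\n{r}\n</example_report>" for r in example_reports
    )
    return (
        f"{examples}\n\n"
        f"<transcript>\n{transcript}\n</transcript>\n\n"
        f"<notes>\n{private_notes}\n</notes>\n\n"
        f"{instructions}"
    )

prompt = build_report_prompt(
    example_reports=["...example report 1...", "...example report 2..."],
    transcript="...transcript of the interview...",
    private_notes="...private notes from the psychologist...",
    instructions="Write a report of roughly two pages in the same style as the examples.",
)
```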

01:15

So I thought I would fix that, because normally in the interview there are some clues to figure out at least what the gender is. So I try to fix it in the prompt. Let's do it together. I add this: analyze the transcript, try to figure out what the name and the gender of the candidate are, and write the results in <candidate> tags.

01:36

Like all the cool kids do with the chain-of-thought thing, where you first have it write some things out so that it remembers them later, and no one really knows why it works, but it's how we do these things. So, who thinks this would work? No one? Okay, thanks, thank you.
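A hedged sketch of what that added instruction could look like; the <candidate> tag follows the talk, but the exact phrasing is illustrative:

```python
# Sketch of the chain-of-thought fix described above. The wording is illustrative;
# the <candidate> tag is the one mentioned in the talk.

GENDER_INSTRUCTION = (
    "Before writing the report, analyze the transcript and try to figure out "
    "what the name and the gender of the candidate are. Write your conclusion "
    "inside <candidate>...</candidate> tags, then write the report using those details."
)

# Appended to the generic report instructions from the earlier sketch.
instructions = "Write a report of roughly two pages in the same style as the examples."
instructions = instructions + "\n\n" + GENDER_INSTRUCTION
```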

01:59

I also thought it would work, because I shipped it. But then I got a phone call from one of my clients. They were saying that they were using it for a job applicant for head of HR, and that they were getting a report that was really recommending that this person would be a great fireman.

02:20

And then I got another phone call, about a job opening for a marketing associate, where she was also being recommended as a great fireman. So what happened? Any ideas? "Typo."

02:38

"Typo." Okay, so. "Misgendered." "Misgendered." So what happened is, we actually asked the LLM to write something, right? What do LLMs do when you ask them to write something and they don't know it? Number one?

02:57

"Make it up." "Make it up." Oh, here we go: make it up. Number two, though, is: look at your examples. And it turns out that in the transcripts from these two clients who were complaining, there were no clues whatsoever about the gender or the name of the person.

03:24

So what did the LLM do? It's actually not that crazy. It just took a name from the examples. It wrote: candidate, Meneer van Doorn. But because of the way that chain of thought works, it has now convinced itself that its job is to write a report for Meneer van Doorn.

03:45

Who is already in the examples, because Meneer van Doorn was one of the example reports, and he was applying to be a fireman. Okay, so you would think: easy fix. Maybe not an easy fix. Let's try to fix it.

04:03

Let's add this in the middle: when it's not clear from the transcript what the name or gender is, always use Mevrouw de Vries. Who thinks that's a good fix? Like, one, two. What could go wrong? Like, should I ship this? Any idea what's going to go wrong if we ship this? Nothing? So "nothing" is actually a pretty good answer.
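Again as an illustrative sketch, not the actual prompt, the fallback rule added in the middle might read something like this (the placeholder name is the one from the talk; the wording is assumed):

```python
# Sketch of the second attempted fix: a fallback rule added in the middle of the
# prompt. The placeholder name is the one from the talk; the wording is assumed.

FALLBACK_RULE = (
    "If the name or the gender of the candidate is not clear from the transcript, "
    "always use 'Mevrouw de Vries' in the <candidate> tags."
)
```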

04:36

But I think what I learned from this whole endeavor is that the thing you need to check, if you make a fix like this, is whether there happens to be a Mevrouw de Vries in your examples. So this whole adventure is a reminder to myself that if you make software like this, which I do, you really need to be intimately familiar with all the examples in your prompt. Because if anything in the case that you're actually using it for right now happens to overlap too much with an example, then the LLM is going to lean into that example a hundred times too much, and you'll get bullshit like this where everyone is an amazing fireman.
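That takeaway lends itself to a cheap automated guard. This is a sketch under assumptions, not the speaker's code: before shipping a placeholder name or a prompt change, check whether it already appears in any example report.

```python
# A cheap guard for the lesson above (a sketch, not the speaker's code): check that
# a placeholder name does not already occur in any of the example reports, since
# overlap makes the model lean far too hard on that one example.

def overlaps_with_examples(name, example_reports):
    """Return True if `name` appears in any example report (case-insensitive)."""
    needle = name.lower()
    return any(needle in report.lower() for report in example_reports)

example_reports = [
    "...example report for a firefighter candidate...",
    "...example report for another candidate...",
]

# Fails loudly before shipping if the fallback name collides with an example.
assert not overlaps_with_examples("Mevrouw de Vries", example_reports), (
    "Fallback name appears in an example report; pick a different placeholder."
)
```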

05:21

That's it for me. Go ship something cool, and on to the next speaker. Oh, maybe, do I get a question? I don't know. One question. One burning question.

05:38

One burning question. "So your clients conduct job interviews with psychologists? Is that a thing?" Yes, that is a thing. If you're doing a job interview for, I don't know, a CTO, some CEO, but also sometimes just for, like, a fireman.

06:02

As part of the interview process, they have a sort of psych-eval thing. It's like an external company where they send the candidate, and they just want, you know, someone else to say, okay, this person is not all fucked up, I think you're good.

06:18

And they use this software to make their job easier.