6 minutes

Whisper Clipboard

Whisper Clipboard is a new tool for building prompts for language models. It captures spoken questions and copied context, then serves them back through a convenient paste queue. It is activated with keyboard shortcuts, saving time and effort.
00:00

So hi, I'm Kanishk. I'm actually from the US, and I'm only in Amsterdam for a month for work. I've been to AI Tinkerers in SF, and I liked it, so I decided to come to one here. For context, me and my friend have been working on this tool for a bit, and it's called Whisper Clipboard. Basically, a lot of the time when we want to craft a prompt to ask a language model, we want to intersperse questions, like thoughts we have, with context about the problem.

00:42

And this helps us do that. So for example, let's say I'm reading an algorithms textbook, and this is a description of the solution to, like, the N-Queens problem. So one thing I can do is hit Command Shift X to start recording, and I'll ask, Hey Claude, I don't quite understand this description of the N-Queens algorithm.

01:10

Can you explain it? Here it is. So that's my recording, and then I can highlight the parts I want, and whenever I copy something, that copied part will show up there as well. And then I can give it some more context, like, and here is a more formal pseudocode description of it. Could you maybe please also give me an example that works through it? And then I can highlight the next part, and it'll show up as well. And the UI feature, or I guess UX feature, of this that I like a lot is that it's like a queue. I can Control V, and each Control V will go to the next one, next one, next one, and then I can press Enter.
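The queue behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not the tool's actual implementation: the names `Segment`, `PasteQueue`, `add_transcript`, `add_clip`, and `next_paste` are hypothetical, and the keyboard hooks and clipboard access are left out.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional


@dataclass
class Segment:
    """One piece of the prompt: a spoken transcript or a copied clip."""
    kind: str  # "voice" or "clip" (hypothetical labels)
    text: str


class PasteQueue:
    """First-in, first-out queue: each paste returns the oldest segment."""

    def __init__(self) -> None:
        self._items: Deque[Segment] = deque()

    def add_transcript(self, text: str) -> None:
        # Would be called when a voice recording has been transcribed.
        self._items.append(Segment("voice", text))

    def add_clip(self, text: str) -> None:
        # Would be called when the user copies highlighted text.
        self._items.append(Segment("clip", text))

    def next_paste(self) -> Optional[str]:
        # Would be called on each paste; returns None once the queue is empty.
        return self._items.popleft().text if self._items else None


queue = PasteQueue()
queue.add_transcript("Hey Claude, I don't quite understand this description. Here it is.")
queue.add_clip("<copied textbook description>")
queue.add_transcript("And here is a more formal pseudocode description of it.")
queue.add_clip("<copied pseudocode>")

# Pasting repeatedly walks through the segments in the order they were captured.
prompt_parts = []
while (part := queue.next_paste()) is not None:
    prompt_parts.append(part)
```

A queue rather than a stack fits the demo here, because the segments come back in the same order in which they were spoken and copied.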

01:58

So this is trying to reduce the activation energy of using stuff like Claude, where we need to give context from multiple sources along with instructions interspersed between them. And yeah, that's basically it.

02:17

[Applause] This is very cool. I'm curious. Essentially you're kind of maintaining a clipboard-like thing. Would that be fair? Yes. How do you build that context across different kinds of, you know, if it's the web browser, or even if it's just an application that you're using, and also across both text and voice? How do you bring all of these things together? Because as a user, I'm thinking, I'm in this app, or on that web browser, or this tab, so how do you think about that? So right now we're triggering off, like, the Command C set of key presses. And this app actually started out more as just a transcription app that my friend made for himself, because he kind of injured his fingers. He got, like, RSI from, like, playing too much basketball.

03:13

So he wanted to get transcription out of it. And so right now all we're doing is Command C, but we have seen other transcription apps that, for example, take in the current focused window as context when feeding your transcription to models like Whisper. So for example, if I'm in my Messages app and I hit Command Shift X here and start recording, some other apps will take what's on your screen and feed it as context into Whisper. And the value of this is that if there are proper nouns, like names or Twitter handles or stuff like that, Whisper will actually be able to get them. So this is a future extension that I've seen in other transcription apps, but right now we only trigger off Command C.
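As a rough sketch of the window-context idea, the open-source `openai-whisper` Python package accepts an `initial_prompt` argument to `transcribe`, which can bias the model toward particular spellings. The helper names `build_context_prompt` and `transcribe_with_context`, the way on-screen text is obtained, and the crude term extraction below are all assumptions for illustration; this is not how Whisper Clipboard or any specific app is known to do it.

```python
import re


def build_context_prompt(window_text: str, max_terms: int = 30) -> str:
    """Collect @handles and capitalized words from on-screen text as spelling hints.

    Deliberately crude: it will also pick up ordinary sentence-initial words.
    """
    handles = re.findall(r"@\w+", window_text)
    names = re.findall(r"\b[A-Z][a-z]+\b", window_text)
    unique_terms = []
    for term in handles + names:
        if term not in unique_terms:  # de-duplicate while keeping order
            unique_terms.append(term)
    return ", ".join(unique_terms[:max_terms])


def transcribe_with_context(audio_path: str, window_text: str) -> str:
    """Transcribe a recording, passing terms from the focused window as a prompt."""
    import whisper  # openai-whisper package, imported lazily

    model = whisper.load_model("base")
    result = model.transcribe(
        audio_path,
        initial_prompt=build_context_prompt(window_text),
    )
    return result["text"]


# Text that a (hypothetical) screen-capture step might hand over.
hints = build_context_prompt("Kanishk: ping @ai_tinkerers about Amsterdam")
```

Keeping the prompt to a short list of terms is a deliberate choice in this sketch, since Whisper only conditions on a limited amount of prompt text.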

04:06

I imagine that going through a longer research session, the clipboard environment could get quite noisy. In my experimentation, I found that some models are better if you have a series, a stack of clips, and some are completely outside the topic of the others. Have you thought of, I don't know, grouping them, or kind of making some judgment like, yeah, this looks like you just copied a message from a random DM, as opposed to this being topical to the prompt that you're currently building? We don't have that yet.

04:42

Actually, he fixed this, like, this morning so I could demo it, and we have not released it to anyone yet. Yeah, we don't have that yet, but more on the transcript side, he does have plans to add more LLM-based features to the transcripts. I've started a job, so I've stopped working on this. If you're interested in using it, I'll forward your info, like your Twitter, to him, and he would love for you guys to try it out.

05:23

Just curious how you thought about this. I think it's awesome that you can very easily create context around copying something, transcribing, et cetera. But you still have to paste it into Claude. What is the reasoning behind not just hitting Enter on the clipboard, basically, and getting the answer there? Have you thought about that? I just haven't thought about it. With this kind of stack-based, I guess, queue, it's fairly easy: you just press Command V five times.

05:56

Yeah, we just didn't think about it. Okay, okay, yeah, I like it. [silence]