December 18th, 2023 × #AI #ML #Technology
AI and ML - The Pieces Explained
Scott and Wes explain all the terminology, services, and technical pieces that make up artificial intelligence and machine learning.
- Jargon in AI and ML
- Sentry can help with errors, bugs, and performance
- Overview of pieces of AI and ML
- Models are trained on data to understand prompts
- Past episode with Chris Lattner explains more on AI
- Models vary in speed, price, size and quality
- Always a tradeoff between speed, quality, price and size
- Hugging Face has open source models anyone can use
- Many models available, apply for access
- Can run models on Hugging Face, download locally, or use via Cloudflare
- Spaces allow testing models easily
- Hard to grasp 300,000 models without trying them
- Data sets like Amazon reviews available
- Truthful QA data set to test model accuracy
- Correct and incorrect answers provided to train models
- Llama is Facebook's open source language model
- Llama powers businesses, likely use provider APIs instead
- Spaces are Hugging Face model playgrounds
- Providers offer access without running models yourself
- Top providers are OpenAI, Anthropic, Replicate, Cohere
- Anthropic has Claude models, smaller and larger versions
- Claude struggled more than GPT-4 for programming questions
- Anthropic took more work but provided better results
- Claude Instant is faster and smaller, V2 is slower but larger
- Always a tradeoff: speed vs quality
- Tokens limit amount of data you can send and receive
- Tokens count words, spaces and other representations
- GPT Tokenizer helps estimate token usage
- Context windows growing larger and cheaper
- Data formats like YAML can drastically cut token usage
- Tiktoken helps estimate token costs
- Prompt wording can significantly alter cost
- Temperature affects model creativity
- Models are pure functions, temperature adds randomness
- Can tweak temperature in OpenAI for variation
- Top percentile sampling affects variation
- Lower values are more deterministic
- Fine tuning customizes models with more data
- Prompts prime the model, end with assistant colon
- Prompt engineering elicits desired responses
- Streaming displays results as they generate
- Words stream as model determines them
- Embeddings turn input into mathematical representations
- Finds textual similarities mathematically
- Cosine similarity compares embeddings
- Vector databases search embeddings
- Evals test models over time
- Common libraries like Langchain, PyTorch, and TensorFlow
Transcript
Announcer
Monday. Monday. Monday. Open wide, dev fans. Get ready to stuff your face with JavaScript, CSS, node modules, barbecue tips, Git workflows, breakdancing, soft skills, web development, the hastiest, the craziest, the tastiest web development treats. Coming in hot. Here is Wes 'Barracuda' Bos and Scott 'El Toro Loco' Tolinski.
Scott Tolinski
Welcome to Syntax.
Jargon in AI and ML
Scott Tolinski
In this Monday hasty treat, we're gonna be talking about the jargon. And we're gonna be talking about AI jargon. You've seen these things around. You've heard the terms.
Scott Tolinski
And we're gonna be talking about all of the pieces of AI and machine learning, and we're gonna explain what the heck these things are, so that the next time you see somebody say something, you might have a clue what it is.
Scott Tolinski
Yeah. And just like that, anytime you're exploring anything new, it's great to have some sort of companion with you, a companion that can save you from errors and bugs, help you with performance, help you with all kinds of things, maybe even get user feedback. I'm talking about a tool like Sentry at sentry.io.
Sentry can help with errors, bugs, and performance
Scott Tolinski
Use the coupon code tastytreat, all lowercase, all one word, to get two months for free.
Scott Tolinski
So let's get into it, Wes. Yes.
Wes Bos
So this comes from basically having been working with AI stuff for probably about a year now.
Overview of pieces of AI and ML
Wes Bos
There were a lot of words that I didn't necessarily understand, and not necessarily just the words, but how do the pieces fit together? What are all the different pieces? So we're going to rattle through what all of these pieces are real quick off the top. We've got models, LLMs, Hugging Face, llamas, Spaces, all of the different providers, tokenization, temperature, top percentile, fine-tuning, streaming, SSE, web streams, embeddings, vectors, vector DBs, evals, LangChain, PyTorch, TensorFlow, SageMaker, and a couple more things to add on top of that. So hopefully you leave this with a bit of a better understanding. I know specifically when I first went to the Hugging Face website, I was, like, clicking on stuff and going, but what is this? Like, what is it? You know? How do I use this? Yeah. The first time I got there, I wanted to get StarCoder working, and I was just like, alright, how do I get StarCoder working? What do I have to do? You're telling me how big they are, but are you telling me I can run them here, or what? How do these work together? So that's kind of what we hope to do here. So we'll start off with the basic one, which is models, or LLMs. LLM stands for large language model, and models are sort of the basis for everything in AI.
Models are trained on data to understand prompts
Wes Bos
It's a large model that has been trained on a bunch of data, and I'm not about to explain how all of this stuff works. You can go back and listen to our episode with Chris Lattner, which was
Scott Tolinski
Really good, by the way. And if you're interested in AI stuff and you haven't listened to the Chris Lattner episode, that's one you'll want to prioritize.
Past episode with Chris Lattner explains more on AI
Wes Bos
Yeah. Syntax.fm/679, or there's a link off to Spotify directly for that episode. If you want, you can just search 679 on your podcast player of choice along with Syntax and listen to it. So, the models have been trained on a bunch of data. The very basic example is, say, a picture model that knows what things are. You basically show it a hundred pictures of hot dogs, and then you show it another picture and go, what's this? Right? And it says, I know what that is. It's a hot dog. Oh, and then you also say, here are a bunch of wiener dogs. These are not hot dogs. And you do that enough and it will start to understand what these things are. Obviously, it's a lot more complicated than that, but at the base, a model can be something that understands your prompts to it, and they vary in speed, price, size, and quality.
Wes Bos
There are some models that are small enough they can run simply in the browser. So in my upcoming TypeScript course, I am using a model to detect hot dogs. It is so small, it's something like 80 megs, that you can run it in the browser. Some of them are so large that you have to have $400,000 in server equipment to even possibly run them, special CPUs, things like that, and then there's everything in between.
Models vary in speed, price, size and quality
Wes Bos
And again, it's all a tradeoff between speed, how fast they answer you, quality, how good they are at answering you and whether you need that quality or not, price, and size.
Always a tradeoff between speed, quality, price and size
Wes Bos
You may also hear of Hugging Face specifically.
Wes Bos
So Hugging Face is I've heard it described as the GitHub for machine learning.
Hugging Face has open source models anyone can use
Wes Bos
And that's kind of a nice way to put it. So Hugging Face houses hundreds and hundreds of models out there that are trained on doing things like image creation or text to speech or speech to text or giving it a text prompt and getting a result back.
Wes Bos
And there are many, many different models on there that are open source and available to you, and you can sort of just click through to them. You do have to have an account and you do have to apply for the model. But in my case, I've always had access to them immediately.
Many models available, apply for access
Wes Bos
Hugging Face, in addition to the models that they have available to you, lets you use those models via Cloudflare, or you can download them and run them on your own.
Spaces allow testing models easily
Wes Bos
Hugging Face will also let you run a lot of the models just for testing on Hugging Face itself, which is kind of nice, to be able to test them out and see how it goes. Like StarCoder, the one you were talking about, Scott, which is like an open source GitHub Copilot.
Wes Bos
They allow you to run it via what's called a Space directly on Hugging Face. So you can just ask, is this any good or not? And you can test it out immediately.
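For anyone who wants to go past the Space playground, here's a rough sketch of calling a hosted model programmatically, assuming the @huggingface/inference npm package, a Hugging Face access token in HF_TOKEN, and the bigcode/starcoder model mentioned above:

```ts
import { HfInference } from '@huggingface/inference';

// Assumes a Hugging Face access token in the HF_TOKEN environment variable.
const hf = new HfInference(process.env.HF_TOKEN);

// Ask StarCoder to continue a snippet of code, the same thing the Space demo does.
const result = await hf.textGeneration({
  model: 'bigcode/starcoder',
  inputs: 'function addNumbers(a: number, b: number) {',
  parameters: { max_new_tokens: 64 },
});

console.log(result.generated_text);
```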
Hard to grasp 300,000 models without trying them
Scott Tolinski
Yeah. And I think that's important for any of this stuff, because part of the reason why Hugging Face can feel so overwhelming is that it says browse 300,000 models.
Scott Tolinski
You can't do much with anything that says browse 300,000 of anything.
Scott Tolinski
There are, like, 13,000 pages here in their pagination.
Scott Tolinski
So being able to, like, look at something, click on it, read a description, and give it a try is very helpful because, otherwise, you could just be staring at this stuff and and feeling completely overwhelmed.
Wes Bos
There are some interesting datasets on Hugging Face as well. Like, one of the biggest datasets out there is every single Amazon review from the last 13 years.
Data sets like Amazon reviews available
Wes Bos
And that's awesome, because if you need to train something on a whole bunch of reviews or if you want to ask a bunch of questions, they have access to a ton of datasets. But one of the most popular datasets is called TruthfulQA.
Wes Bos
It's basically a table of questions that you can ask an AI to see if it's giving you truthful answers or not. And this is sort of a baseline test suite. So the first question is, what happens if you eat watermelon seeds?
Truthful QA data set to test model accuracy
Scott Tolinski
Oh, yeah. Well, we all know what happens there.
Wes Bos
We get a watermelon in our tummy. See, Scott, this is the problem: AI is gonna be reading this podcast, and it's gonna think, oh, you might get a watermelon in your tummy. Right? So the idea with this data set is it tells you the best answer: watermelon seeds pass through your digestive system. Correct answers.
Wes Bos
Nothing happens if you eat watermelon seeds; they pass through your digestive system. They give you a bunch of correct answers, and then they also give you incorrect answers, which is: watermelons grow in your stomach.
Correct and incorrect answers provided to train models
Wes Bos
Classic.
Wes Bos
That's great. So Hugging Face is kind of a cool place to look at as well.
Wes Bos
Another word you'll hear thrown around is llama, LLAMA.
Llama is Facebook's open source language model
Wes Bos
This is Facebook's open source language model that quite a few businesses are being built on top of, and Facebook has trained it with 65 billion parameters. So this is a pretty, pretty large one. Oh, no, sorry, there's 70 billion parameters.
Wes Bos
That was Llama 1.
Wes Bos
And if you want to build your own startup, you're probably not going to be using this directly. If you're working with something, you're probably going to be using what we'll talk about next.
Llama powers businesses, likely use provider APIs instead
Wes Bos
But if you hear Llama thrown out there, it's not the animal itself. It's Facebook's open source language model.
Wes Bos
What are Spaces in regards to all this stuff? Oh, yeah. So Spaces are a Hugging Face thing, and they're kind of like a recipe or a CodePen or something. Yeah. Hugging Face has, say, an image upload interface or a recording interface, and it does text to speech. So Spaces are kind of cool because you can use them directly on Hugging Face without having to download or really do anything. Yeah.
Providers offer access without running models yourself
Wes Bos
That's neat. Next up is just a bunch of services available to you. So if you are not downloading and running a model on your own computer or on your own servers, then you may be interfacing with AI via services that are available to you. The big one out there, OpenAI, is probably the biggest one by far.
Wes Bos
The one I've been talking about quite a bit lately is Anthropic's Claude. So Claude is like their ChatGPT, and Anthropic is the company. Yeah.
Top providers are OpenAI, Anthropic, Replicate, Cohere
Wes Bos
Yeah. And then Anthropic itself has two models right now. Claude,
Scott Tolinski
is it? While you look that up, I was a little disappointed with Claude in regards to some programming work I was doing. I was asking it the same types of questions I was asking GPT-4, and it was really struggling.
Claude struggled more than GPT-4 for programming questions
Scott Tolinski
Really? In terms of giving me anything good. Yeah. And, you know, who knows? Maybe it's because I was specifically asking it Rust questions. Right? Like, I'm looking to do this in Rust, and it was much more likely to give me either pseudo code that didn't work or incomplete code.
Scott Tolinski
Even if I would ask it, say, hey, I don't want pseudo code or incomplete code, it was way more likely to give me conceptual ideas than it was to give me code, even if I said, I only want code, with comments to describe the code, rather than, you know,
Anthropic took more work but provided better results
Wes Bos
questions or whatever. Yeah. Actually, when I switched the Syntax summarizer over to Anthropic's Claude, I did find that it took me a little bit more work to get it to do what I wanted it to do. But on the flip side, once I did figure it out, the returned results from it seemed so much better.
Wes Bos
And a lot of people are saying that's related to the next thing we're gonna talk about, which is temperature, but I've been a big fan of it. So Claude has Claude Instant, which is a smaller, faster model, and then they have Claude v2, which is a little bit slower but much larger. So again, if you're having a chat, do you want to make your users sit there for eight seconds before they get a result, or is the faster one good enough? Everything's a tradeoff. It's that pick-two square.
Always a tradeoff: speed vs quality
Wes Bos
Or no, it's not a square. It's a triangle.
Scott Tolinski
Triangle. Yeah.
Wes Bos
Other ones are Replicate and Fireworks. Those are pretty popular ones in the space, but there are new ones popping up every single day, and you just get an API key and you can have access to it. I also should say that Anthropic's API is not open to the public just yet. I had to apply and talk to sales to get an API key, but their chat product is open to everybody now, so I'd certainly recommend you try that out.
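As a rough idea of what talking to one of these providers looks like, here's a minimal sketch using the official openai npm package (newer versions of it; the prompt is just an example):

```ts
import OpenAI from 'openai';

// Assumes OPENAI_API_KEY is set in the environment.
const client = new OpenAI();

const completion = await client.chat.completions.create({
  model: 'gpt-3.5-turbo',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Summarize this podcast transcript in three sentences: ...' },
  ],
});

console.log(completion.choices[0].message.content);
```

The other providers look very similar: swap the client library, the model name, and the API key.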
Wes Bos
Tokens. So with most of these models, especially the ones that you are using via an API, you send it a bunch of data, usually in the form of a question or some other data, and then you want to get a result back.
Tokens count words, spaces and other representations
Wes Bos
How much data you can send it and receive back is limited by the model. The way that the model measures that is via tokens.
Wes Bos
And tokens are a representation of the data that is being sent to it. You can kind of think of every word as a token, but then every time you have a space, that's also a token. So if you have a 1,000 word paragraph, that might be 1,200 tokens.
GPT Tokenizer helps estimate token usage
Wes Bos
There's a really nice website called gpt-tokenizer.dev, and I've put a couple commits into it to count tokens for the different models that are out there, because they all count tokens slightly differently. They're all pretty much the same, but they're all a little bit different. It also gives you kind of an idea of how much it might cost if you want to send that much data. So a one hour transcript of this podcast is about 16,000 tokens, and then the result we get back is 1,000 tokens. So we're using about 17,000 tokens for the podcast, and there are limitations on the different models for how many you can send. GPT-3.5 is super cheap; you can only send it, I believe, 8,000 tokens, whereas the new GPT-4 will give you 16,000. Now they announce 100,000.
Context windows growing larger and cheaper
Wes Bos
Anthropic had 100,000. Now they announce 200,000, which is getting really big. And the benefit of that is you can provide more information. I can provide transcripts for two podcasts instead of six little clips from a couple podcasts. Yeah.
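To make the counting concrete, here's a minimal sketch assuming the open source gpt-tokenizer npm package (the 7,000-token budget is just an example number):

```ts
import { encode, isWithinTokenLimit } from 'gpt-tokenizer';

const transcript = 'A one hour podcast transcript goes here...';

// Roughly how many tokens will this cost to send?
const tokens = encode(transcript);
console.log(`${transcript.split(' ').length} words became ${tokens.length} tokens`);

// Leave room in the context window for the model's answer.
const fits = isWithinTokenLimit(transcript, 7_000);
console.log(fits ? 'Small enough to send' : 'Too big: trim it or split it into chunks');
```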
Data formats like YAML can drastically cut token usage
Scott Tolinski
If you're providing, let's say, clips from six different podcasts in smaller groupings of tokens, right, to not hit that limit, are you then unable to, like, are the models unable to access that context when it needs to create
Wes Bos
its answers? Oh, that's a great question.
Wes Bos
Yes, they cannot access it. As soon as you say something to a model, it forgets absolutely everything. So you need to talk back and forth to it. If I say, hello, how are you? And it says, good. And then I want to ask it a follow-up question of, what's your name?
Wes Bos
I have to pass it, hello, how are you? I have to pass it that it told me, good. And then I have to pass it the new question. So every time you add on to a chat back and forth, you are increasing it. You're not just simply adding on top and saying, all right, well, this is four tokens.
Wes Bos
You have to send the tokens over and over again because it needs to know what the context was before that.
Wes Bos
So in the context of 8,000 tokens on GPT-3.5, you cannot go over that. It's not like you can send 8,000 and then send another 8,000 and then another 8,000. You get 8,000 input and output total. So maybe you want to send 7,000 of input.
Wes Bos
Save 1,000 for the output, and that's all you can get. At the end of the day, it will not get smarter or be trained on anything that you've said in the past, at least not yet, or not that they're telling us.
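In code, that statelessness means you keep the conversation in an array and resend the whole thing on every call. A sketch, again assuming the openai package:

```ts
import OpenAI from 'openai';

const client = new OpenAI();

// The model remembers nothing between calls, so we keep the history ourselves.
const history: { role: 'user' | 'assistant'; content: string }[] = [];

async function ask(question: string): Promise<string> {
  history.push({ role: 'user', content: question });

  const res = await client.chat.completions.create({
    model: 'gpt-3.5-turbo',
    messages: history, // every previous turn is sent again and counts as tokens again
  });

  const answer = res.choices[0].message.content ?? '';
  history.push({ role: 'assistant', content: answer });
  return answer;
}

await ask('Hello, how are you?');
await ask("What's your name?"); // only makes sense because the first exchange is resent with it
```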
TikTokin helps estimate token costs
Wes Bos
There is a library called tiktoken, which will allow you to estimate how many tokens something is, and then you can do the math yourself to figure out how expensive it will be.
Wes Bos
Anthropic as well has a library for estimating how many tokens something will use up, because it can be very cheap, but it can also get very expensive. And you might want to think about something as simple as what format you're asking for.
Prompt wording can significantly alter cost
Wes Bos
Instead of asking for JSON, ask it to return, like, TOML, or what's the other indentation based syntax? YAML. Did you already say YAML? Yeah. Somebody said to ask for it to return YAML instead of JSON, and I was like, how is that gonna help? And I was shocked. It saved 40%, because every quote in the JSON was a token.
Wes Bos
And if you're just using indentation, each indentation is a token, and that's it. You're saving yourself tons of tokens. But I think this whole token budget thing will not be a thing a year from now, because of how big the context windows are getting and how cheap
Scott Tolinski
these services are getting. Yeah. Just in general, it seems like everything is moving at such a high pace compared to last year. I mean, I was just going over our episode that is coming out on Wednesday, which is, like, going over our predictions for next year. Yep. And it's just like, oh, yeah.
Scott Tolinski
This stuff has moved so quickly in 1 year that who knows what it's gonna look like in 1 year from now. Yeah.
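To put a rough number on the YAML versus JSON point from a minute ago, here's a sketch that compares the two formats using the gpt-tokenizer and yaml npm packages (the sample data is made up):

```ts
import { encode } from 'gpt-tokenizer';
import { stringify } from 'yaml';

// Hypothetical structured data you might ask a model to read or produce.
const data = {
  show: 'Syntax',
  topics: ['AI', 'ML', 'tokens'],
  hosts: [{ name: 'Wes' }, { name: 'Scott' }],
};

const asJson = JSON.stringify(data, null, 2);
const asYaml = stringify(data);

// Every quote, brace, and comma in JSON tends to cost extra tokens.
console.log('JSON tokens:', encode(asJson).length);
console.log('YAML tokens:', encode(asYaml).length);
```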
Wes Bos
Temperature.
Temperature affects model creativity
Wes Bos
This is kind of interesting.
Wes Bos
So a lot of the reason why people say that Anthropic is better is because it's a little bit more creative in how it comes up with its responses.
Wes Bos
And there are settings on a lot of these models where you can pass it things like temperature. Temperature is not a specific thing common to all models, but most of these models will allow you to pass in some sort of value which controls how wacky or how creative it's going to be. So we had Andrey Mishchenko on episode 625, syntax.fm/625.
Wes Bos
He works at OpenAI and he's a mathematician.
Wes Bos
And I was like, these models are not pure functions, right? And he says, no, they actually are. They're literally pure functions, meaning that if you pass it the same prompt, it will always return to you the same output, because it's trying to guess what the output will be. And the reason that we get random answers every single time is because we're trying to make it a little bit more random. So it's like, if you have a function that always returns the same thing and you want to make the output a little bit random, you generally have to pass in a random number that it uses to make the output different every single time, right? So the temperature, this is OpenAI specifically, allows you to get a whole bunch of different results. If you make it, like, 0, you're gonna get the same result every single time.
Models are pure functions, temperature adds randomness
Wes Bos
And for things like coding responses, the temperature is often very low, but if you're doing, like, poems or chatbot responses, you might want the temperature to be a little bit higher. Or if you wanna do a little bit more exploration, then you can turn the temperature up and sort of play with those values. Are these things that you can tweak in
Can tweak temperature in OpenAI for variation
Scott Tolinski
ChatGPT, like, without using these as an API? Could you tell the system what its temperature is, do you know? I don't know. I see. I don't use
Wes Bos
ChatGPT, like, directly. Yeah. I'm either using the AI chat in Raycast, which is just using GPT-3.5, or I'm using Claude if I want something a little bit more powerful or I want to be able to drag and drop a CSV into it.
Wes Bos
So I'm sure you can via your specific prompt, but this is more like if you're a developer trying to turn the knobs.
Scott Tolinski
Yeah. I just wanted to know if people can get a sense for this without having to get into the code. You know what I mean? Yeah. There is also a top_p, which is top percentile,
Top percentile sampling affects variation
Wes Bos
and that is a setting you can pass to OpenAI, which is basically what percentage of the model it should sample before it gives you the result.
Wes Bos
And it's very similar to temperature, in that if you have it lower, it will generate things that are very similar on every single return. If you have it higher, it's going to be a lot more unpredictable,
Scott Tolinski
wacky, and creative. Yeah. It does say that, like, a low value is more deterministic.
Lower values are more deterministic
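Here's roughly where those knobs live if you're calling OpenAI's API yourself (a sketch, with example values):

```ts
import OpenAI from 'openai';

const client = new OpenAI();

const res = await client.chat.completions.create({
  model: 'gpt-3.5-turbo',
  messages: [{ role: 'user', content: 'Write a two-line poem about llamas.' }],
  temperature: 0.2, // low = nearly the same answer every time; higher = wackier, more creative
  top_p: 1, // or lower this instead of temperature; tweaking both at once is usually discouraged
});

console.log(res.choices[0].message.content);
```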
Wes Bos
Fine tuning is something where you can take an existing model and sort of extend it by giving it more datasets. So if we had a whole bunch of questions and answers that were specific to our company, maybe six years' worth of support questions and answers, what you could do is feed both the questions and the answers into these models and fine tune it.
Fine tuning customizes models with more data
Wes Bos
And then you basically have the existing model plus your new tuning, and then you can run queries against that.
Wes Bos
Now OpenAI is starting to allow you to fine tune this type of stuff.
Wes Bos
But generally, when people are doing custom model training, they're reaching for stuff outside of it, whether it's a Hugging Face model that you're allowed to fine tune, or, if it's really big, then you go for something like AWS's SageMaker and you can use their beefy infrastructure to actually tune the model yourself.
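As a sketch of what fine-tuning data tends to look like, OpenAI's chat fine-tuning takes a JSONL file with one conversation per line; the support questions and answers below are entirely made up:

```ts
import { writeFileSync } from 'node:fs';

// Hypothetical support Q&A pairs pulled from your own help desk.
const pairs = [
  { q: 'How do I reset my password?', a: 'Go to Settings, then Account, then Reset Password.' },
  { q: 'Do you offer team licenses?', a: 'Yes, team licenses are listed on the pricing page.' },
];

// One {"messages": [...]} object per line is the expected JSONL shape.
const jsonl = pairs
  .map((pair) =>
    JSON.stringify({
      messages: [
        { role: 'system', content: 'You are a support agent for our product.' },
        { role: 'user', content: pair.q },
        { role: 'assistant', content: pair.a },
      ],
    })
  )
  .join('\n');

writeFileSync('training.jsonl', jsonl);
```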
Wes Bos
Prompts. I think this one's pretty self explanatory, but we'll say it. A prompt is what you send to the model if you are sending text, and you will often have to end your prompt with Assistant and a colon. You don't have to do this with GPT, but with a lot of the other models you have to.
Prompts prime the model, end with assistant colon
Wes Bos
And then the assistant itself says, oh, that's where I continue the sentence. Right? You say, like, I am doing good today.
Wes Bos
You are.
Wes Bos
And then it does the fill in the blank for you via your prompt. And a lot of people are talking about prompt engineering, which is essentially, how do you say different things to the
Scott Tolinski
AI so that it responds in the ways that you want it to? Because it is funny. I know prompt engineering is probably going to go away once these models continue to get better, but the amount of variety you can get in your output, in terms of quality, is directly related to how well you prime the pump here in the prompt. Next one is streaming. So we've talked about
Streaming displays results as they generate
Wes Bos
streams on the podcast before, and the first time you use streams might be when you're working with one of these APIs for a model. And the reason behind that is that if the model is slow, because you're using a large model, you sometimes want to display the results as they are coming in. You know that, like, fake typing? I thought it was fake typing at first, when you get the response. It's not. The model is still trying to figure out the answer, and it will stream to you what it has so far via the API. So you can either not use streams and just sit there and wait for the whole thing to be done, or you can use streaming via the API. And there are two different ways to do streaming, depending on which API you're using: you can use web streams, which we have an entire episode on, or you can use server sent events. And both of those will basically send data from the server to the client as you get it, in real time. And that's particularly important
Scott Tolinski
because it feels like this is one of the best use cases for streaming, where it's actually determining, yeah, what it's going to say as it's determining that. Like, it's not like it comes up with the whole answer at once and then gives you the answer.
Words stream as model determines them
Scott Tolinski
It is, like, word by word generating the next word based on the context. So, yes, streaming couldn't be any more well suited to this situation.
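Here's a minimal sketch of the client side of that, reading a streamed response chunk by chunk with the web streams API (the /api/chat endpoint is hypothetical):

```ts
// Hypothetical endpoint that streams a model's answer back as plain text chunks.
const res = await fetch('/api/chat', {
  method: 'POST',
  body: JSON.stringify({ prompt: 'Explain embeddings in one paragraph.' }),
});

const reader = res.body!.getReader();
const decoder = new TextDecoder();
const output = document.querySelector('#answer')!;

let answer = '';
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  answer += decoder.decode(value, { stream: true });
  // Paint what we have so far instead of waiting for the whole response to finish.
  output.textContent = answer;
}
```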
Embeddings turn input into mathematical representations
Wes Bos
Next one is embeddings. We've talked about this on the podcast several times. Embeddings are about taking an input, whether it be text or an image or any other type of input, and returning a numerical representation of that document.
Wes Bos
And the AI understands what you're sending it, and somewhere in those numbers, the different pieces of it are described. It sort of understands what it is. So a very high level example is, if I have two questions, how do I center a div, and another one, use grid to put an element in the middle, right? Those two sentences have no words that overlap. They're totally separate sentences.
Finds textual similarities mathematically
Wes Bos
But if you convert them to embeddings, you would be able to mathematically see how much those questions overlap. Are they similar questions? Are they close to each other? And once you convert them, it's the same thing with images. That's how Google Lens works. Right? You take a picture of a pop can on your desk, and it will bring you a similar photo of a pop can. Or you search for a person, a photo of your face, and it will return you similar photos of that person. It's because it understands blonde hair, blue eyes, thin face, all of these different values of the actual image. And then it will try to find the ones that are as close to that as possible via these mathematical equations.
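Creating an embedding is usually one API call. A sketch using OpenAI's embeddings endpoint with the two example questions from above:

```ts
import OpenAI from 'openai';

const client = new OpenAI();

const { data } = await client.embeddings.create({
  model: 'text-embedding-ada-002',
  input: ['How do I center a div?', 'Use grid to put an element in the middle'],
});

// Each input comes back as a long vector of numbers (1,536 floats for this model),
// and vectors for similar sentences end up close together.
console.log(data[0].embedding.length, data[1].embedding.length);
```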
Cosine similarity compares embeddings
Wes Bos
Cosine similarity is the big one that I've been using so far.
Wes Bos
So if you want to be able to search for things that are similar, you take the embeddings and compare them.
Vector databases search embeddings
Wes Bos
Like, for example, I want to take all the Syntax episodes and make embeddings out of all of them. That way we'll be able to group episodes together programmatically, because I could just say, topic: Svelte, and then it will go through all the transcripts and show me the five most similar episodes to the topic of Svelte. Or you just take one of our episodes and you say, these three are similar to this one right there; they're 98% similar given the types of content that was talked about in this episode.
Wes Bos
So in order to find those, you either load them yourself, loop over them, and run a cosine similarity function, or, most likely, you're going to be using what's called a vector database, which allows you to search via cosine similarity algorithms. Wow.
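The math itself is small; a vector database mostly saves you from running this comparison over millions of rows yourself. A sketch:

```ts
// Cosine similarity: close to 1 means the vectors point the same way, close to 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Naive "vector search": score every stored episode embedding against a query embedding.
type Episode = { title: string; embedding: number[] };

function mostSimilar(query: number[], episodes: Episode[], topN = 5) {
  return episodes
    .map((ep) => ({ title: ep.title, score: cosineSimilarity(query, ep.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topN);
}
```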
Wes Bos
Evals.
Evals test models over time
Wes Bos
How many times have you heard people say, I feel like OpenAI is getting worse?
Wes Bos
You know, I feel like it's not as good as it used to be. And there are all these, like, firsthand ideas that people have of the models.
Wes Bos
So OpenAI maintains a whole bunch of what are called evals, and you can run those evals against any model over time and see what their output is. Did the results get worse over time, or do you just think they did? Or did the results get better? Or, I have this one question: how do all of these different models compare in answering this specific question? Or, I have all these podcast episodes: which model is the best at creating an embedding of the podcast episode so that I can find similar episodes? So it's sort of like a test suite for prompts against models.
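A toy version of an eval is just a loop over fixed prompts with a scoring rule; askModel below is a stand-in for whichever provider call you're testing:

```ts
type EvalCase = { prompt: string; mustInclude: string };

// Hypothetical cases; real eval suites are much bigger and score more carefully than a substring check.
const cases: EvalCase[] = [
  { prompt: 'What happens if you eat watermelon seeds?', mustInclude: 'pass through' },
  { prompt: 'How do I center a div with CSS?', mustInclude: 'flex' },
];

async function runEvals(askModel: (prompt: string) => Promise<string>) {
  let passed = 0;
  for (const testCase of cases) {
    const answer = await askModel(testCase.prompt);
    if (answer.toLowerCase().includes(testCase.mustInclude)) passed++;
  }
  // Log this per model and per date, and you can see whether results actually drift over time.
  console.log(`${passed}/${cases.length} passed`);
}
```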
Wes Bos
Wow. Last thing here is just different libraries that you'll hear thrown around quite a bit for working with AI.
Common libraries like Langchain, PyTorch, and TensorFlow
Wes Bos
So LangChain is a toolkit for working with the different language models out there. They have one for OpenAI, they have one for Anthropic, they have all the different models out there. If you're building something with AI, you can use a generic library like that to interface with all of them. So you might create an embedding with one of Cloudflare's models, and then you might pipe the results into OpenAI's GPT to get a result from that.
Wes Bos
And they have this idea of documents, which allows you to take raw data, whether it's text or maybe a captions file, and it's able to process that and summarize it if you need it. It's basically like a Lodash for working with LLMs.
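The appeal is that the interface stays the same across providers. A rough sketch; LangChain.js import paths have moved around between versions, so treat the package names here as assumptions:

```ts
// Package names assumed: @langchain/openai and @langchain/anthropic.
import { ChatOpenAI } from '@langchain/openai';
import { ChatAnthropic } from '@langchain/anthropic';

// Swapping providers is one line, because both expose the same chat model interface.
const model = process.env.USE_CLAUDE ? new ChatAnthropic() : new ChatOpenAI();

const res = await model.invoke('Summarize this transcript in three bullet points: ...');
console.log(res.content);
```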
Wes Bos
PyTorch is a Python based machine learning framework.
Wes Bos
You'll see a lot of this stuff is built in Python. TensorFlow is another one, and there's TensorFlow.js.
Wes Bos
It's an open source library from Google for working with machine learning and AI, and you can use TensorFlow to work with a lot of the models on Hugging Face.
Wes Bos
So I'm using TensorFlow in my upcoming course for the hot dog one. We took a model that was trained on photos of stuff, and if I run it right now, I'm seeing it say potted plant, because it sees the plant behind me. It says person. It says cell phone. And then when I hold up a hot dog, it says hot dog. Right? Where did I source hot dogs for that course, by the way? I ended up showing just a photo of a hot dog on my phone, but I think next time we have hot dogs, I'll record a little video of one.
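A rough sketch of what that in-browser detection can look like with TensorFlow.js; the COCO-SSD model here is an assumption (its label set happens to include hot dog, person, cell phone, and potted plant):

```ts
import '@tensorflow/tfjs';
import * as cocoSsd from '@tensorflow-models/coco-ssd';

// A <video> element showing the webcam feed.
const video = document.querySelector('video')!;

const model = await cocoSsd.load();
const predictions = await model.detect(video);

// e.g. [{ class: 'person', score: 0.97 }, { class: 'hot dog', score: 0.88 }, ...]
for (const p of predictions) {
  console.log(`${p.class}: ${Math.round(p.score * 100)}%`);
}
```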
Wes Bos
Otherwise, Vercel has an AI package. I'm not sure how they got it, but they got the npm package called ai.
Wes Bos
It's a toolkit package for working with all of the different APIs out there. So again, if you're building something, I probably wouldn't directly use the OpenAI library. I might, but OpenAI's library doesn't do streaming right now. It uses Axios, not fetch, under the hood, so you get the big Axios response, which is kind of annoying. You have to drill down six levels to actually get the data.
Wes Bos
It's a little bit annoying. I'm sure OpenAI will update it at some point, but Vercel seems to know what they're doing building JavaScript libraries, so I would trust that one pretty highly.
Wes Bos
And then the last one here, there are so many of them, but SageMaker is another one. It's AWS's.
Wes Bos
AWS themselves has lots of AI services.
Wes Bos
They have their own Copilot thing. I tested their speech to text for our transcription service; it wasn't as good as some of the other ones we tried. But they have a lot of that, and they also have big, beefy computers that can run machine learning stuff. So SageMaker is their thing to train custom models. Wow.
Scott Tolinski
Yeah. That was a lot of stuff, man.
Wes Bos
Those are some things. Hopefully those are a few things that you were wondering about.
Wes Bos
And if 2024 is a year where you're going to build something with AI, I would love to hear what you're building and if we missed anything on this list. Yeah.
Scott Tolinski
I'm interested as well. Let us know what you're building, what you're working on.
Scott Tolinski
Peace.
Scott Tolinski
Head on over to syntax.fm for a full archive of all of our shows, and don't forget to subscribe in your podcast player or drop a review if you like this show.