🪪 Certificate chains, Dingo, and ML in Go with Riccardo Pinosio and Jan Pfeifer

Jonathan Hall:

This show is supported by you, our listener. Stick around till after the news to hear more about that. This is Cup o' Go for Sunday, 12/07/2025. Keep up to date with the important happenings in the Go community in about fifteen minutes per week. I'm Jonathan Hall.

Shay Nehmad:

And I'm Shay Nehmad.

Jonathan Hall:

Hey, Shay. Don't talk to you on Sunday very often.

Shay Nehmad:

Yeah. It's a it's a chill episode. Like, it's easy like Sunday morning. I am here to update that, two weeks ago, I said that I'm going to a basketball game, and I'm gonna take a Waymo. And that happened, and the basketball game was awesome.

Shay Nehmad:

Cool.

Jonathan Hall:

And the Waymo didn't catch on fire.

Shay Nehmad:

No. It drove super smoothly through, like, quite a lot of traffic challenges next to the Chase Center, people, like, walking on the street on the way to the game and whatever.

Shay Nehmad:

It was a super smooth ride. Not sponsored by Waymo, by the way. Of course. I don't even know if they use Go, so they can go and, you know, whatever.

Jonathan Hall:

I don't know.

Shay Nehmad:

But, yeah, it was a very good game. Awesome. Any other NBA fans, if you're following, Deni Avdija is having a really good season. I'm I'm enjoying I don't know. Maybe if Deni's listening.

Shay Nehmad:

I don't know if he knows Go. Oh. But Deni, if you're listening, go Deni. He's this Israeli player in the NBA.

Jonathan Hall:

If Waymo is using Go, do you think they've upgraded to Go 1.25.5 already?

Shay Nehmad:

I would re highly recommend it because of the new security releases that I'm about to tell them about.

Jonathan Hall:

Yeah. Tell us about those.

Shay Nehmad:

So Go 1.25.5 and Go 1.24.11 are released with two security fixes related to the x509 package. x509, if it sounds familiar to you, if you're like, wait, crypto stuff usually I don't remember, but this I remember. This is certificates, which is something you usually see on every website. Right? And well, hopefully, you see on every website you visit.

Jonathan Hall:

Not on my website. I don't mess with that silly stuff.

Shay Nehmad:

Yeah. We should have you should look for that little, you know actually, browser user experience has done a pretty good job. Even my mom, when, like, she visits an unsafe website, she can figure out that something's going on.

Jonathan Hall:

Nice.

Shay Nehmad:

And underneath, you know, TLS, like SSL, there is X.509. And there are two very easy to understand bug fixes there, which is, again, something pretty rare with security fixes, but I like it. One of them is from Philippe Antoine at Catena Cyber, which we mentioned on the show before. I really like their website. The issue is that when you construct an error string in the hostname error, there's no limit to the number of hosts that will be printed out. So if you wanna DOS some Go code, probably a server that does stuff with certificates, you can send a certificate with a really, really, really big number of hosts, and then the last one will have an error.

Shay Nehmad:

And then it will just concat a really, really long error string and might cause runtime problems, because it's like quadratic runtime.

Jonathan Hall:

Mhmm. That

Shay Nehmad:

makes a lot of sense. Right? As an attack vector, it's just like a field you control. Yep. How would you fix it?

Shay Nehmad:

What do you think, Go team? Or Philippe?

Jonathan Hall:

I don't know.

Shay Nehmad:

So there's this weird case where, well, the spec says you can send as many as you want, but you just have to be sort of pragmatic about it. So A, they're using strings.Builder instead of just concatenation, which is a good trick to know generally, right? strings.Builder is a more efficient way always, I wanna say, question mark. Is that true, or am I misinforming the people?

Jonathan Hall:

Say it again?

Shay Nehmad:

What was the question? Is strings.Builder always more efficient

Jonathan Hall:

Always.

Shay Nehmad:

Than than just concatting, or am I misinforming

Jonathan Hall:

I think it always is. Whether it's worth the extra boilerplate.

Shay Nehmad:

Oh, got it. Got it.

Jonathan Hall:

Is is the question. Right? But I think it's always more efficient.

Shay Nehmad:

Yeah. So they use strings.Builder, and also just pragmatically, they're like, okay, if a certificate has 100 or more, we're not printing them. So all the devs that relied on the error message listing all 102 domains get fewer domains now. And the second fix is also pretty cool.
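As a rough sketch of the shape of that first fix, not the actual crypto/x509 patch, and with the host list and the cap of 100 paraphrased from the discussion: build the message with strings.Builder instead of repeated concatenation, and stop after a fixed number of hosts.

```go
// Sketch only: join host names with strings.Builder instead of repeated
// string concatenation (which is quadratic), and cap how many are printed.
package main

import (
	"fmt"
	"strings"
)

const maxHostsInError = 100 // the cap mentioned in the episode

func hostListForError(hosts []string) string {
	var b strings.Builder
	for i, h := range hosts {
		if i == maxHostsInError {
			b.WriteString(", ...")
			break
		}
		if i > 0 {
			b.WriteString(", ")
		}
		b.WriteString(h)
	}
	return b.String()
}

func main() {
	hosts := []string{"cupogo.dev", "www.cupogo.dev", "api.cupogo.dev"}
	fmt.Println("certificate is valid for " + hostListForError(hosts) + ", not example.com")
}
```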

Shay Nehmad:

Do you remember how certificate chains work?

Jonathan Hall:

Vaguely. Like, you can you can, like, certificates can sign certificates can sign certificates.

Shay Nehmad:

You have a few root certificates that you trust. Right? And then they give out permission to certificate authorities, like Let's Encrypt or whatever, to give out certificates in their name, etcetera, etcetera. Yeah. What do you think happens if I get a constraint, like an exclude constraint, in my certificate? So like, let's say our website, right, www.cupogo.dev

Jonathan Hall:

Mhmm.

Shay Nehmad:

Let's say we personally had the certificate there. Although Right. I've I you set that up, so I don't know. It's probably some managed service. Right?

Jonathan Hall:

I think transistor.fm handles that for us.

Shay Nehmad:

Yeah. So let's say we manage the whole thing. We build it from the ground up, you know Mhmm. Whatever, and we held the certificate. And then we had api.cupogo.dev.

Shay Nehmad:

And we didn't want the certificate to cover that because that was served from another server. Sure. Yeah. Should we be able to then define a certificate on star? If we excluded api.cupogo.dev, should we then be allowed to have, you know, *.cupogo.dev in our certificate?

Jonathan Hall:

So we could say *.cupogo.dev except for api.cupogo.dev? Yes. Should we? I don't know. Like, according to the spec, I don't know, but it seems logical that it could potentially be allowed.

Shay Nehmad:

You just you struck the nail on the head because this is just not well defined in the spec. Okay.

Jonathan Hall:

This is

Shay Nehmad:

just a case that's not well defined in the spec. It's like two edge cases, extra features. Right? Star and exclusion are both not like the happy path flow, so it sorta slipped through. And they fixed it.

Shay Nehmad:

It's a minor security bug, but to me, it was interesting just because of the way, you know, you think about specs like these as, oh, if we just implement everything perfectly to spec, everything will be fine. But when the rubber hits the road, where you actually have to implement these specs, you discover they're just as faulty as the code that implements them. Like, sometimes they're just plain incomplete.

Jonathan Hall:

So what was the bug, and and how did they fix it?

Shay Nehmad:

Bug is sort of a, you know, strong word to use for it. But let's say you have an exclusion, and then in one of the leaves of the chain, I mentioned the chain, right? Yep. So one certificate below, you have the star. You should now get an error.

Shay Nehmad:

You should get So

Jonathan Hall:

it was ignoring the exclusion, like like the star was overriding the exclusion previously?

Shay Nehmad:

Yes. And it

Jonathan Hall:

And it it will throw an error

Shay Nehmad:

that looks like, you know, *.cupogo.dev is excluded by constraint, blah blah blah. Okay. Which to me makes more sense. Like, when you evaluate the name constraints now in a certificate chain, having an excluded subdomain precludes the use of a wildcard, like a star. Got it.

Shay Nehmad:

Which makes sense.

Jonathan Hall:

And it

Shay Nehmad:

is a CVE. It is a CVE. Okay. But I'm not sure why. Like, the thing is, it did get a CVE number.

Shay Nehmad:

It's even a 6.5, but I don't understand how I would use it. Like, I don't get it. I think it's just they discovered a thing that was wrong, and because it's in the crypto space, it just automatically gets a CVE or something. But I racked my brain, and I don't understand how I can use it to actually attack something. So the leaf certificate can claim the subdomain using a star, but both of them are, by definition, owned by the same certificate owner and everything's trusted.

Shay Nehmad:

It's just a minor bug, but I don't understand how the only thing that would happen is that your excluded subdomain will get the certificate. So everything is just, like, maybe too secure. I don't know. Anyway, it was an interesting bug to look at, and, yes, Waymo should definitely upgrade if they use Go. The rest of our listeners that probably do use Go, and Deni Avdija, who I hope is not using Go because you're supposed to be practicing your three point shot right now, you should upgrade.
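For anyone who wants to see the second scenario concretely, here is a minimal, self-contained sketch, not the Go team's actual fix or test, of the setup described above: a CA whose name constraints exclude api.cupogo.dev signs a leaf for *.cupogo.dev, and on a patched Go (1.25.5 / 1.24.11) verification should reject the wildcard.

```go
// Sketch of the scenario only; error handling is elided for brevity.
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

func main() {
	caKey, _ := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	leafKey, _ := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)

	caTmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "episode CA"},
		NotBefore:             time.Now(),
		NotAfter:              time.Now().Add(time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
		// The exclusion under discussion: this CA must not issue for api.cupogo.dev.
		ExcludedDNSDomains: []string{"api.cupogo.dev"},
	}
	caDER, _ := x509.CreateCertificate(rand.Reader, caTmpl, caTmpl, &caKey.PublicKey, caKey)
	caCert, _ := x509.ParseCertificate(caDER)

	leafTmpl := &x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: "*.cupogo.dev"},
		NotBefore:    time.Now(),
		NotAfter:     time.Now().Add(time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		// The wildcard that overlaps the excluded subdomain.
		DNSNames: []string{"*.cupogo.dev"},
	}
	leafDER, _ := x509.CreateCertificate(rand.Reader, leafTmpl, caCert, &leafKey.PublicKey, caKey)
	leafCert, _ := x509.ParseCertificate(leafDER)

	roots := x509.NewCertPool()
	roots.AddCert(caCert)
	_, err := leafCert.Verify(x509.VerifyOptions{Roots: roots})
	// On patched versions this should be a non-nil "excluded by constraint" error.
	fmt.Println("verify error:", err)
}
```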

Jonathan Hall:

Awesome. I have a quick update to share. I I could have sworn we talked about this on a previous episode, but I can't find which episode, so maybe we didn't. Maybe I dreamed that.

Shay Nehmad:

It has the update on past news items label.

Jonathan Hall:

I I thought we talked about this, but maybe I just imagined talking about I don't know.

Shay Nehmad:

So before before you go into it, if any of the Trello developers, Atlassian developers are here on the show, if you could have parameterized labels, that would be awesome. If I could, like, add updated on past news items and then a date, that would be really, really good, or a link.

Jonathan Hall:

Anyway, this is an accepted and, as far as I can tell, merged proposal. So I think it will be in Go 1.26, although it's not mentioned in the draft release notes yet. But that doesn't mean anything. It's still draft release notes. The proposal is to allow a type parameter on the right-hand side of an alias type declaration, which previously was not allowed.

Jonathan Hall:

So you can create type aliases in Go. You probably know that, where you say, like, type foo, and then you say equals something. So normally, you would say, like, type foo int, and then it creates a new type called foo whose underlying type is int, right? Yes.

Jonathan Hall:

Or you could do type foo equals int, and now foo is an alias to int, and then you could use foo and int interchangeably, and the compiled code is identical. And there, you know, they're aliases rather than a new type with that underlying type.

Shay Nehmad:

What's the case where you usually use those aliases? The I can't imagine you actually overwrite int.

Jonathan Hall:

I don't do it for int. No.

Jonathan Hall:

Yes. Although, there are aliases, you know, byte, like, in a byte slice. Well, that's a special case. uint8 and byte are aliases of each other built into the language. Right?

Jonathan Hall:

So there is that. There are a few cases like that. It's usually used, at least I usually use it, and the reason it was added to the language, when you're migrating types between packages, and you want both packages to remain compatible with each other during the migration, which might last months or even years. So you could use an alias to say that, you know, type foo equals otherpackage.Foo.

Shay Nehmad:

I also use it when there's just a really long type I have to type. Whenever it has, like, way too many things, you know, for example, a function pointer. That's just a trick that's still with me from C++. Like, whenever you have to define a function pointer type, you define the alias first just so it has a nice name, and it doesn't have all the function-call-looking things, like open paren, close paren, and the parameters, which is why I really like this proposal, because it hits on a similar issue.

Jonathan Hall:

So the proposal is that you can now put type parameters or generics on the right-hand side of an alias type, which isn't something that has, like, bitten me that often, but there have been a couple times where I wished I could do that. So I think it's a nice change to the language. Here we are, almost 10 versions after generics were added. We're still making generics better.
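A rough sketch of the territory for readers who want it in code. The first two declarations are what Go already supports; the commented part illustrates the kind of declaration the accepted proposal is described as adding, and the exact final syntax and rules are an assumption here, so check the proposal text itself.

```go
package example

// Plain alias: foo and int are interchangeable; no new type is created.
type foo = int

// Generic type alias (supported since Go 1.24): type parameters on the left,
// used within the type on the right-hand side.
type Set[T comparable] = map[T]struct{}

// What the episode describes the proposal as allowing: a type parameter
// itself as the right-hand side of an alias, e.g. inside a generic function.
// Shown as a comment because it needs a Go version with the change:
//
//	func process[T any](items []T) {
//		type elem = T // previously rejected by the compiler
//		_ = items
//	}
```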

Shay Nehmad:

Make making generics great again.

Riccardo Pinosio:

Yes.

Jonathan Hall:

Alright.

Shay Nehmad:

Talking about making the language better, question mark. Someone brought to our attention, a project called Dingo. Oh.

Jonathan Hall:

Maybe the dingo ate your baby.

Shay Nehmad:

Dingolang.com, which is not Go, or is it, question mark?

Jonathan Hall:

It kinda looks like Go, kind of.

Shay Nehmad:

Yeah. So it's this metalanguage that's supposed to fix, air quote, some things that are wrong with Go. It offers some nice... so the way it works, it's like a metalanguage above Go. Right? You can find it on GitHub, and it seems like it has, like, a thousand stars and whatever.

Shay Nehmad:

It is you know, when I looked at the readme, and we shared this sentiment, it looks like, oh my god, the readme's so long, it has a website, it has a logo, it has all these tags on it, and the road map, and and features, quick start. Like, that's so much documentation. It looks so nice. I was sure it was like a a proper project. It is very early, so that's one thing to remember.

Shay Nehmad:

It's like super early, it's like one person and their, you know, subscription, which to me just raises the bar on, okay, for something to be open source and for me to actually trust it, it's not enough to have a lot of stars and a really, really good looking readme. Basically, I have to find someone that actually uses it. But before we talk about the meta of the project itself, what even is it? Right? Try to explain it.

Shay Nehmad:

It's not that hard.

Jonathan Hall:

Well, I think you you you kind of explained it already. It's a it's a transpiled language. Mhmm. So you write a dot dingo file, and then you transpile it to Go, and it adds things like enums and error handling sugar, and there's a short list of things here that it that it does.

Shay Nehmad:

It basically makes the Go look kinda more Rust-y and kinda more TypeScript-y. So it takes

Jonathan Hall:

some Yeah. Features from the

Shay Nehmad:

Yeah. Result types, and you can match on the result, and it's like, okay. So, you know, sum types with pattern matching, and the null coalescing thing, which, you know, everybody who uses TypeScript or JavaScript will know. It's like, user?.address?.city?.name ?? "unknown". So it's like Mhmm.

Shay Nehmad:

Okay, you can define a variable called something, and then try to reach super deeply into a struct where you don't know if the thing is there or not, and worst case, you have a fallback without writing the word if. And it also has some functional utilities, like map and reduce and filter. Mhmm. So if you are one of these people that want to make Go slightly more Rust-y, would you tell them to reach for Dingo?

Jonathan Hall:

Yeah. So I would tell them to reach for Rust.

Shay Nehmad:

I I think there are two cases here. If it's greenfield, you know, you're just working on something new, or you just wanna nerd around with languages, then maybe, you know, it's just a side project, maybe you can check Dingo out, but this is a super new project. If you look at the issues, some like, half of them are trolling, half of them are the hello world example from the readme is not transpiling at the moment.

Jonathan Hall:

Yeah. I noticed the the the package's own CI pipeline is you know, half of its tests are failing right now. So

Shay Nehmad:

Yeah. So it's very early stage, but let's say it worked. Let's say, like, the CI was green and all the examples worked and whatever. Putting aside, like, the the the fact that the project is kinda new, you know, I would seriously question if you need something like this. I do think it's an interesting project to do.

Shay Nehmad:

Like, I would love to write it for fun and for learning, in the same sense that I would love to do, you know, Advent of Code, which we talked about two weeks ago. Yeah. And, you know, there's no point, nobody benefits from me solving Advent of Code and sharing how I solved it. It's not something that anybody could use, but it's interesting and people might learn from it. I think here it's the same thing where, you know, putting aside all the AI-generated readme nonsense, if you actually look at the features, you're like, oh, you know, maybe I would like to use null coalescing or find something similar that I could do within my Go code.

Shay Nehmad:

Mhmm. Like, if there's a specific struct I find myself null coalescing all the time, maybe I can wrap it in some, you know, interface that makes it look similar-ish. Right. Like a get-or-panic or something like that. True.
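As a concrete version of that idea, here is a minimal plain-Go sketch, with made-up type and helper names (nothing here comes from Dingo), of wrapping the nil checks so the call site reads roughly like Dingo's user?.address?.city?.name ?? "unknown".

```go
package main

import "fmt"

type City struct{ Name string }
type Address struct{ City *City }
type User struct{ Address *Address }

// CityNameOr walks the pointer chain and falls back to def when anything
// along the way is nil: the plain-Go spelling of optional chaining plus
// null coalescing.
func CityNameOr(u *User, def string) string {
	if u == nil || u.Address == nil || u.Address.City == nil {
		return def
	}
	return u.Address.City.Name
}

func main() {
	var u *User
	fmt.Println(CityNameOr(u, "unknown")) // unknown

	u = &User{Address: &Address{City: &City{Name: "San Francisco"}}}
	fmt.Println(CityNameOr(u, "unknown")) // San Francisco
}
```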

Jonathan Hall:

You know, I think it's a fun toy project. I have concerns about it, like, for serious work. One of those is why would I transpile to Go rather than just compile straight to Go objects that run on the Go runtime? I suppose maybe the main reason not to do that would be that there's no way to distribute libraries as object files right now. So unless the Go compiler tools are extended to allow compiling arbitrary files to Go objects, there would be no way to, like, write my library in Dingo and have someone else use it in a Go program without having the intermediate Go.

Jonathan Hall:

So maybe that's valid. Yeah. But

Shay Nehmad:

And, you know, some examples are kind of hard to argue with, especially enums. Like, enums are a notoriously weak spot in Go, and the Dingo enums look pretty nice. You just define an enum and it generates all the Go code for you, which is what I used to do with protobuf, and, you know, that's a good example. Result types, I don't know about that. Error propagation, that's actually nice, and people have talked about it for a long time.
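For contrast with what Dingo generates, this is a minimal sketch of the plain-Go enum that people usually hand-write (or generate with stringer or protobuf): a defined type, iota constants, and a String method.

```go
package color

import "fmt"

// Color is the usual iota-based Go "enum".
type Color int

const (
	Unknown Color = iota
	Red
	Green
	Blue
)

// String is the boilerplate you would otherwise generate.
func (c Color) String() string {
	switch c {
	case Red:
		return "red"
	case Green:
		return "green"
	case Blue:
		return "blue"
	default:
		return fmt.Sprintf("Color(%d)", int(c))
	}
}
```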

Shay Nehmad:

We talked about it on the show when they had that big error handling discussion. Remember that? Yeah. But they were like, okay, we're not gonna talk about error handling anymore. So I guess people are now implementing it in metalanguages as well, which we can't complain about, because they can't implement it in the real language Yeah.

Shay Nehmad:

Because it's not allowed anymore, etcetera, etcetera. I just wanted to point out before we drop this that I think it's a cool, like, fun toy project, like you mentioned. But remember that, you know, just the the fact that something looks really good right now doesn't mean it's good. I feel like almost how I need to talk to my parents about deepfakes right now. This looks like a deepfake of a of a real open source project, but when you look at the hands, you know, they have six fingers, and when you look at Yeah.

Shay Nehmad:

the fabric, it's like... so that's what I feel right now. It's a toy project by one person, only a single contributor, this Jack person. Jack, super cool stuff, but because of the landing page and all that, if you don't actually look at the code itself and try to run it, it seems like a very serious project with lots of work put into it, where in reality, it's a lot of tokens being generated, and the quality is sort of not there yet. So I just want people to keep that in mind when they look at a project like this, or generally, like, open source coming up. It's gonna be harder to detect projects that are actually good versus projects that are, you know, cool and whatever, but still immature.

Shay Nehmad:

And to me, the thing that uncovers it is that all the markdown files are really well written with no typos, but folder names and GitHub descriptions, like things that are not AI-generated normally, do have typos, like a misspelled "landing page." But then you open the landing page itself, and it has things that humans would never write. Debugstrategies.md. You know what I mean? Like, nobody's gonna write that on a personal project.

Shay Nehmad:

So cool stuff, cool ideas, fun toy project. I don't think it's, like, anything more at the moment. And to be fair, it does say on the readme in a lot of places, like, oh, this is not ready for production. This is just, you know, a work-in-progress, active-development thing. So, you know, it's not trying to be something it's not.

Jonathan Hall:

It also says production ready, real API server, no toy examples. So it caught itself in that regard. Yeah.

Shay Nehmad:

Well, you know, it might have been different chat sessions that wrote to each one with different context windows. Yeah. So that's Dingo. That's our opinion on it. I don't think I'll be checking it out.

Shay Nehmad:

I'd like I won't follow the development because it doesn't solve any pain that I care about. Agreed. But people trying to push the language in in different and interesting ways is always good, whether it's from the inside or from the outside. Jonathan, because of this Sunday vibe of this episode, something I realized we've been going for very, very long.

Jonathan Hall:

Yeah. It's twenty-five minutes already in real time.

Shay Nehmad:

Yeah. Yeah. Maybe the edited episode is, like, two minutes. I hope there's slightly more usable content that's not gonna be thrown on the editing room floor. However, there is one topic I want to say that we're not gonna talk about.

Shay Nehmad:

Oh. You brought to my attention this really cool Reddit post, this time not from the Reddit from the Golang subreddit Yeah. But from the Reddit engineering subreddit. A, you sent it to me with the old.reddit.com.

Jonathan Hall:

That's where I got it. I that's how I found it.

Shay Nehmad:

So yeah. Jonathan uses Linux, by the way, just unrelated.

Jonathan Hall:

And Shay will never let you forget. Let the record show, Shay brings it up more than I do.

Shay Nehmad:

What is this post about?

Jonathan Hall:

What is

Shay Nehmad:

this post about?

Jonathan Hall:

It's basically a a shit post on Python. Not really.

Shay Nehmad:

My favorite.

Jonathan Hall:

Yeah. So Reddit has been rewriting some of their back end services from Python to Go, and they've just recently switched over the comment endpoints to use Go. And that's what this is about. But we're not talking about that, as you said.

Shay Nehmad:

Yes. We don't wanna talk about that because I suddenly realized that Reddit, you all are like my neighbors. The Reddit HQ is here in in San Francisco. I, like, I would like to find one of you. If any of our listeners know any of the Reddit developers, maybe someone who worked on this migration, try to connect them to us.

Shay Nehmad:

We would love to have them, like, here on the show. And I will also comment on that Reddit post and and say that. I'll tag whoever the yeah. Katie Shannon. I'll try to reach out to her.

Shay Nehmad:

If any of our listeners know Katie, let her know that we're looking for her. But, yeah, obviously, we love these stories of, oh, we used Python for a thing. It was horrible. And then we used Go, and everything's awesome. And the gratuitous, you know, let's look at the p99 latency comparison.

Shay Nehmad:

But I'm actually happy that this post is kind of old. It's from four months ago. Because now, at this point, they've already spent enough time with it in production that they have insights about, okay, we know it's faster and whatever, but are our developers having a good experience or a bad experience, are they enjoying it, is this a migration that will continue to other modules or not. You know, they're slightly more informed than just running the benchmarks and discovering that the compiled language is faster than an interpreted one.

Shay Nehmad:

Very cool.

Jonathan Hall:

Alright. I think we have a couple lightning round items. Let's do our our break. Couple lightning round items, and then we actually have an interview. Yeah.

Jonathan Hall:

We're gonna be talking about ML and AI stuff in Go with Hugot, so stick around for that.

Shay Nehmad:

Lightning round.

Jonathan Hall:

So first up in in this week's lightning round, you may recall I was in Utah. Gosh. Was it two months ago? It's been forever, it feels like. Nice eyes.

Jonathan Hall:

But also feels like it was yesterday. I was there at the Go West Conference, which was a lot of fun. The videos from the conference are now online. If you want to see the fancy gopher tie I was wearing, you can go check that out. Organizer of the Amsterdam Go Meetup group.

Jonathan Hall:

Backing out of there. And does anybody here listen to Cup o' Go?

Jonathan Hall:

Yeah? Alright. I'm the cohost of Cup o' Go. Has anybody seen my YouTube channel, Boldly Go? Got a few hands there too.

Jonathan Hall:

Cool.

Shay Nehmad:

Is there any talk you'd recommend other than your talk?

Jonathan Hall:

Yes. Definitely.

Shay Nehmad:

Which of course everybody will need to watch with the inflammatory title. You're already running my code in production.

Jonathan Hall:

Yeah. There were a bunch of great talks. I'll I'll just call one out that I really thought was good because it was kind of unique, and that was the fundamentals of memory management in Go. And this might sound kind of boring. If you think memory management is boring, then you should watch this talk because it approaches it from a different angle.

Jonathan Hall:

It's not like, here's the technical stuff about how memory management works. It's more like why memory management is important and why you should care, and you actually probably should care at least twenty minutes worth. So go watch that talk. And honestly, there were a bunch of great talks there, but I don't have time to talk about them all. This is a lightning round.

Shay Nehmad:

My lightning round thing is do you know how slow is channel based iteration? Slow. Yeah. It is slow. It is actually very slow.

Shay Nehmad:

And if you're the sort of person that likes looking at benchmarks and benchmark code, but you're looking for a short blog post to do it, then Zach Musgrave has got you covered. He posted this blog post called "How slow is channel-based iteration," and it has all the gratuitous benchmarks you could have on a topic as simple as how we should do iteration. So go check that out if you're interested. It's from DoltHub, which is an interesting thing in itself. It's like a database where you can clone the DB and create branches and whatever, that sort of approach.

Shay Nehmad:

Very cool stuff, Zach.
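These are not Zach's actual benchmarks, but a minimal sketch of the kind of comparison the post runs: ranging over a slice directly versus pushing the same values through an unbuffered channel.

```go
package iterbench

import "testing"

const n = 1 << 16

var sink int // prevents the compiler from optimizing the loops away

func BenchmarkSliceRange(b *testing.B) {
	data := make([]int, n)
	for i := 0; i < b.N; i++ {
		s := 0
		for _, v := range data {
			s += v
		}
		sink = s
	}
}

func BenchmarkChannelRange(b *testing.B) {
	data := make([]int, n)
	for i := 0; i < b.N; i++ {
		ch := make(chan int)
		go func() {
			for _, v := range data {
				ch <- v // every element pays for a channel send/receive
			}
			close(ch)
		}()
		s := 0
		for v := range ch {
			s += v
		}
		sink = s
	}
}
```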

Jonathan Hall:

And last up on the lightning round, tomorrow, or maybe today if you're listening to this. I don't know when this episode's coming out exactly, but Monday, December 8, 1PM Central Eastern European time, ask me anything with the GoLand team. If you use GoLand or want to use GoLand or have questions about GoLand, check this out. Sorry we're not mentioning it sooner. I know a lot of you are not gonna listen to this episode as soon as it comes out, so you'll miss it already.

Jonathan Hall:

But if you have the chance and you're interested, there's a link to that: ask me anything with the GoLand team, Monday, December 8.

Shay Nehmad:

Yeah. You can fire up, you know, GoLand as soon as you hear this, and by the time it boots up, the AMA will be over already.

Jonathan Hall:

I think that's it.

Shay Nehmad:

Yes. Let's take a short break and then run to our interview about AI and ML and all that good stuff. GoMLX. This show is supported by you, as Jonathan mentioned at the top of the episode. This is a hobby.

Shay Nehmad:

We do it for fun, but it is kinda expensive. If you wanna support us, you can do so by doing a ton of things. Listening, which you're already doing, is a great start. The most direct way is to join our Patreon, which is a place you can drop, you know, a monthly contribution and help us offset the costs of running the show. And we really appreciate all our existing and future Patreons.

Shay Nehmad:

Existing more, because by definition, you've paid more already. So, you know, you got a special place in our hearts. To find the links to all the things, you can go to cupogo.dev, which I have already mentioned a ton of times as a domain example on the show. I don't know if it's considered, like, native advertising, that when I'm trying to explain certificate chains, the example I give is cupogo.dev.

Shay Nehmad:

Is that, like, not legit? I don't know. Anyway, I'm thinking about it. I'm sorry if you found that a bit, you know, tactless. But what you can find on that domain is a link to our Slack channel, a link to our swag store, all past episodes with transcripts, and you can also find our email address, news@cupogo.dev.

Shay Nehmad:

Other than all that, a great way to support the show is to leave a review on Spotify or Apple Podcasts or wherever you listen to your podcasts or sharing the episode with a friend or a coworker. If something we mentioned is useful to you, you're like, oh, Dingo. Someone on my work Slack mentioned it and suggested that we move to it. So I should share the episode and let him know that Jonathan and Shay think that's not a good idea. You know, that'll be good.

Shay Nehmad:

Other than that, we wanted to, you know, apologize on behalf of our swag store thing. It has a lot of production delays recently, and people have been experiencing you know, they're they're ordering the thing, it's not coming. If that happens to you, let me know via email. I'm I'm working on it. I know one person has done so already, and I'm I'm still working on it.

Shay Nehmad:

I'm sorry it's taking so long. But, yeah, just some delays in our swag store. So if you ordered something and it didn't arrive in a timely fashion, shoot us an email, we'll take care of it.

Jonathan Hall:

Alright. I think that's it for our break. Stick around for our interview, and we'll see you next week.

Shay Nehmad:

So, John, you know, someone recently told me that Go is a very verbose language. You know what I mean? Gotta type if error is not equal to nil. Very, very verbose. And I was like, maybe you know, it's like French.

Shay Nehmad:

They get paid by the letter even if you don't really use it. Like, if error is not nil, with letters they don't pronounce. If only I had someone on the call who introduces letters that we can't hear in French all the time.

Riccardo Pinosio:

So what you're saying is that as go programmers, we should demand to be paid according to the lines of code that we produce. Right?

Shay Nehmad:

Yeah. Instead of hourly billing. It's just like French.

Riccardo Pinosio:

All the error handling just churning out money.

Shay Nehmad:

Yeah. Error spelled with, like, e-u-o-r-o-u-r-r-e-u-x. Erreur. Error. Hello, Riccardo and Jan.

Shay Nehmad:

Please introduce yourselves.

Riccardo Pinosio:

Maybe I'll start. Yeah. So I'm Riccardo. I'm head of R&D at Knights Analytics, and I've been working for around two years now also on open source and Go, in particular for machine learning, running machine learning models in Go. That actually started from the needs that we had at Knights Analytics, our data management platform, to be able to run these kinds of models in a streamlined way without requiring Python microservices or other stuff.

Riccardo Pinosio:

And we then thought, well, let's open source what we have been working on. And, well, some of the listeners might remember that a year and a half ago or so, when Jonathan was still in The Netherlands, we did an episode on Hugot, which is a library that is kind of the equivalent of, or aims to be the equivalent of, well, Hugging Face transformers. For those that don't know, Hugging Face transformers is a very commonly used Python library to run machine learning models. And we aimed to develop something that could be used by the Go community with the same kind of ergonomics, right, to make it easy to run these kinds of pipelines. And so we open sourced it, I think, around a year and a half ago.

Riccardo Pinosio:

And since then, we've been doing quite a lot of work extending what you can do with it. And part of that has been also working with Jan, who can introduce himself in a moment. And that has actually supercharged, I think, what the library can do. I still remember, it's been about a year that I've been working with Jan, and at the beginning, we were brainstorming, hey, this could actually be a Go equivalent of, like, the stack that we use in Python, where you have Hugot as sort of the high-level embeddings layer and then GoMLX as the sort of equivalent to PyTorch under the hood, you know, doing the difficult heavy lifting.

Riccardo Pinosio:

And we've made quite some progress in that direction. So Hugot's matured. It also started to be picked up by other projects. So for example, if you've heard of the Bento streaming library in Go, they now have preprocessors that rely on Hugot for NLP and machine learning streaming preprocessing. We've also seen recently that, for example, for the Go bindings for Redis, the examples that they gave on how to actually calculate embeddings and store them in Redis use Hugot.

Riccardo Pinosio:

And actually, a lot of this has been possible because of the work that Jan has been doing. So maybe we'll go over it in a moment, but I'll leave the floor to Jan to introduce himself because he has a very impressive CV.

Jonathan Hall:

Uh-huh. Alright. That's a great intro.

Shay Nehmad:

Such an intro. You know? How about you introduce yourself?

Riccardo Pinosio:

Really, like, warming up the stage for the star of the evening, you know.

Jonathan Hall:

Alright, Jan. Tell us about yourself.

Jan Pfeifer:

Okay. I've been in this industry for long, so I'll start from the end of it, which is the more interesting part, and you can ask if you want me to go further back. But I'm at the end, or the beginning, depends on how you see it, of my FIRE journey. FIRE as in financial independence, retire early. Right?

Jan Pfeifer:

It's not that early because I've worked a lot, but I started working very early in my life, always in computer science. But, anyway, I'm basically retired, and I've been, for the last year and a half, fully working on open source. That's my hobby and my passion, GoMLX and some related projects. Before that, I spent most of my more important and impactful professional life in machine learning and applied research, almost fifteen years at Google and ten years before that at Yahoo, a few startups, and a few other things. But that's a very long story short.

Jonathan Hall:

Alright. So we know who you guys are. We know that you're working on Hugot and GoMLX. Let's talk a little bit about what's new. Maybe we could do that for just a couple minutes.

Jonathan Hall:

What's new since the last interview, for those who listened a year and a half ago when we did that?

Riccardo Pinosio:

I'm Riccardo Pinosio. I'm a machine learning engineer, and I currently work at Knights Analytics, which is a company that builds, essentially, a solution for master data management.

Jonathan Hall:

What are the big highlights in the last year and a half in the Hugot world?

Riccardo Pinosio:

Yeah. So I think when we did the first interview, we had essentially a small set of pipelines, or things that you could do. Right? The one that we actually basically built it for, and that most people actually started using it for, was the calculation of embeddings. Right?

Riccardo Pinosio:

There was a time when vector search became very popular. And so everybody was calculating embeddings and then storing embeddings in vector databases and so on and so forth. And we had that need, and that was sort of the beginning, what we had at the time. Since then, there's been quite a lot of development. So the first big highlight, I would say, is that at the time, we only supported ONNX Runtime as a back end.

Riccardo Pinosio:

So that meant that if you wanted to use any type of model... ONNX Runtime is something from Microsoft. And if you wanted to use any type of model, you basically needed to install on your system the C libraries, right, that would allow you to essentially then run inference. It, at the beginning, was basically just binding to the ONNX Runtime API. And so you needed the C library both for tokenization, tokenization is when you take a sentence and need to split it into words, let's say.

Riccardo Pinosio:

And also for the inference, for example, the calculation of the vector embeddings. And that was fine for our purposes because for our product, we basically just containerize everything, and then we can put in whatever .so we want and so forth. But binding to C has a lot of issues when you can't do that. Right? And we started seeing a lot of people that said, oh, actually, it would be nice if you could do this in native Go, basically.

Riccardo Pinosio:

Right? And with the work that we have been doing with Jan, we have been able to deliver that, in the sense that now we have a Go back end for Hugot. That is called SimpleGo. It was built by Jan. He can explain what it does better than me.

Riccardo Pinosio:

But that means that you can now use Hugot without having to install any of the C stuff. Right? That is also, for example, what the streaming library does. Right? You don't need to use the ONNX Runtime back end.

Riccardo Pinosio:

It will come at some performance cost, but that means you don't need C. And to be able to deploy Hugot in the cloud and so on and so forth, it's a big win. The second thing that is new is that we have extended the capabilities of what Hugot can do. On the one hand, we can do more pipelines. So for example, we can do image recognition, which people have already started using.

Riccardo Pinosio:

So we extended beyond just vector embeddings. We can do image recognition. We can do re-ranking, like cross-encoding. And since recently, we can also do generation. So actually, you know, use Phi-3 models, this kind of stuff, to actually be able to build chatbots.

Riccardo Pinosio:

And that's actually the thing that we're still very much active on now, to try to improve that part of, well, if I want to use a generative model and build my own local LLM, right, my local chatbot or whatever. That is the focus of our attention at the moment, also because we want to use it in our product. So that's one of the main drivers. And the other thing that came in Hugot, and this also relies a lot on what Jan has built, is training, the ability of actually doing fine-tuning of your models. So not just inference, but also, say, giving examples to the models and, for example, making your vector embeddings better for your use case.

Riccardo Pinosio:

And that relies heavily on GoMLX, because without GoMLX, it would not have been possible. Actually, at the beginning, I started doing that by writing the bindings to the ONNX training API. So ONNX Runtime had a training API for on-device training. And then I wrote the bindings for that. That was still bindings to C.

Riccardo Pinosio:

And then exactly the moment when I finished the bindings, they deprecated the library.

Shay Nehmad:

Mhmm.

Riccardo Pinosio:

So that was, like, very nice. So where do I go from there? Fortunately, GoMLX then came into view. And with GoMLX, we are able to also train the models without requiring any Python. So that is kind of the upshot of the new things.

Riccardo Pinosio:

And as you can see, a lot of these things rely on what Jan has been doing. So maybe I'll leave Jan to explain a bit about SimpleGo, GoMLX, what GoMLX is, and so on.

Jonathan Hall:

That's my big question. As I said before we were recording, one nice thing about hosting a podcast like this is I can be stupid without shame. So I'm going to ask a stupid question without any shame. What in the world is GoMLX?

Riccardo Pinosio:

That's a very good question.

Jan Pfeifer:

Yeah. GoMLX is like PyTorch or JAX or TensorFlow for Go. It's a machine learning framework. It's it's meant to build and train your models and serve them also. It's it's for people who want to do ML.

Jan Pfeifer:

So it's a little bit lower level than where Hugot is, whereas Hugot I see more as where people want to use ML. So I think GoMLX is where someone who wants to train a model, fine-tune a model, will go to. So these are layers. Right? Hugot is further up in the stack. Then comes GoMLX, the machine learning framework.

Jan Pfeifer:

And GoMLX itself, well, I did the SimpleGo implementation, which works, it serves a purpose, but it's by no means as professional as another back end that GoMLX also uses, which is XLA, the real professional accelerator for speedy computations. But it's raw. Nobody wants to use it directly. You always use it through some machine learning framework. XLA is by Google, and it's the one that JAX and TensorFlow use, and PyTorch can also use it if you're doing large scale, because XLA is really good for that.

Jan Pfeifer:

But XLA is just number crunching. You still need the ML framework on top to make things for humans to use. Okay. And yes. So one characteristic of GoMLX that fits your needs is that it works with the concept of back ends.

Jan Pfeifer:

It's a generic ML framework that can run on XLA, can run on a back end that is pure Go, so there is no need to install anything in C. And we're already working on different back ends, so we want to make it run on WebGL, maybe on Vulkan also. We also want to use ONNX Runtime as a back end to execute. So there are a few of those plans in the pipeline still. Does this paint a picture of where these things are?

Jonathan Hall:

Basically. So I mean, I've not used any of those other tools you talked about. I tinkered with TensorFlow many, many years ago when it was brand new, just kind of to, you know, get my toes wet. But that's as far as I've ever done any machine learning at all. But I have a better sense now of what GoMLX does than I did before I asked.

Riccardo Pinosio:

Everything after that is easy. I mean, if you used TensorFlow at the beginning, the whole situation is much easier now. So

Shay Nehmad:

And it is important to note, just to clarify for the listener, this is AI in the bigger picture. This is machine learning. This is not just for LLMs and things like that. So if I were to train a model that uses some regression to, I don't know, match audio

Riccardo Pinosio:

Yes.

Shay Nehmad:

And does something Mhmm. Not like a large language model or even necessarily a transformer. Like, I can use GoMLX to do the classic dog or cat, like, classifier, for example.

Jan Pfeifer:

Or the other day, someone asked to do an air conditioning controller, a smart air conditioning controller, and the output of the machine learning is how much energy to put into the cooler, and the inputs are the temperatures. So all types of machine learning problems. AI or LLMs, all these terms are a bit confusing. LLMs are one type, the most successful by far, but one type of machine learning. Machine learning is the generic tool, a mathematical or statistical tool, to learn things by example, if you will.

Shay Nehmad:

Yeah. So I just want to make that distinction clear for users, for our listeners, sorry. Yeah. Because there are a lot of things coming out right now that call themselves AI frameworks.

Shay Nehmad:

But if you open them under the hood, they're just like, oh, this is actually just calls to OpenAI in a library. It's not actually, like, machine learning. And it still might be a useful framework or whatever, but it's very far removed in the, like, layers of of machine learning, AI, AI engineering, and software. I would consider in that regard GoMLX to be pretty low level.

Jan Pfeifer:

Yeah. The the thing is if you put the stamp AI in your product, you immediately get investment from everywhere, even within the companies. Right? So if I could rename GoMLX, I'll put an AI somewhere in the name. And Yeah.

Jan Pfeifer:

But yeah.

Shay Nehmad:

Now now now you gotta get with the times. Now it's agents.

Jan Pfeifer:

Oh, yeah. AgenTic

Shay Nehmad:

machine learning framework. That's the golden ticket now. I wanna ask, maybe this is again, just like Jonathan's, an obvious question, but I think this will actually be a good leading question for you both. Why? So I think every listener who did work with data science and had, like, machine learning projects, especially in production, knows that normally, right now in the industry, the standard is to do this with Python.

Shay Nehmad:

Even in your examples, in the tutorial for GoMLX, I think, or maybe it was Hugot, one of them, the tutorial was in a Python Jupyter Notebook. It had Go code in it, but, you know what I mean? Which was cool to see, because I've never seen a Jupyter Notebook with Go code inside it before. That was really cool. But the extension of the file is literally .ipynb.

Shay Nehmad:

You know what I mean? Like, IPython notebook. I guess the big question I have is, why go through all this effort to even do it in Go and not just use, you know, Python libraries that already exist?

Riccardo Pinosio:

I would say something, because you mentioned the Jupyter Notebook, which is that, in principle, Jupyter Notebook can run not just Python. Right? So I think originally, when it was developed, it could also run R. I'm not sure if you guys know R, as, like, the programming language. It was actually one of my, well, my second programming language that I worked with.

Shay Nehmad:

Yeah. Yeah. R and Julia and all these, like, scientific MATLAB, all these, like,

Riccardo Pinosio:

scientific Yeah. Exactly. So it's just a kernel, so it can run multiple programming languages. But, actually, Jan also wrote the framework, the tool, to be able to run Go, right, in Jupyter Notebooks.

Riccardo Pinosio:

And, actually, I didn't know that. So we started working on the machine learning thing, and then I started finding, you know, Jan's name on all sorts of cool libraries for machine learning in Go, like GoNB, I think. Right? That's the name, Jan. So that's very nice.

Riccardo Pinosio:

Why? From my perspective, and then I think Jan can give his own. But from my perspective, well, the reason for us was very practical: we needed to deploy solutions and build software that did not rely on Python, for a variety of reasons, security reasons. The kind of stuff that we develop needs to be deployed in situations where not even one vulnerability is allowed. We actually have zero vulnerabilities in our software, with all of the scanners that we run.

Shay Nehmad:

Hey, hackers. That poses the challenge. The zero vulnerabilities. Go prove him wrong.

Riccardo Pinosio:

Yes. Please do not. Please don't do that. But so we couldn't use Python, that was one of the reasons. The other reason was it simplified deployment, because otherwise you need to have microservices, for example, and then you need to have a microservice that serves predictions of the model if your main application is in Go, and so on and so forth.

Riccardo Pinosio:

And being able to do it in Go just completely streamlined our architecture. And so that's essentially one of the main drivers for us. Jan, if you want to chip in on this point.

Jan Pfeifer:

Yeah. Well, my story is a little different, but has some common points. I worked in ML. I was there when TensorFlow was created in Google, right, for a long time. I always disliked Python.

Jan Pfeifer:

More than I disliked Python, I disliked how people use Python without types and things like that. I was always in the applied research team, and that's where the rubber meets the road, if you will, of crazy researcher code that, you know, is super fancy, does really smart things, but is really bad code, and then we have to productionize that thing. And so I disliked that. I also disliked TensorFlow. I had many issues with it, but I used it every day.

Jan Pfeifer:

But it was a giant itch that I'd been meaning to scratch. And as for Go, I always loved Go. I used it from the start. Being at Google, that was easier, and I think machine learning should exist in every language. It's like a database; there should be a good framework in every language.

Jan Pfeifer:

There was no machine learning in Go, so I thought, okay, let's do this. And I think Go is an awesome language to productionize these things, as Riccardo mentioned. So it needs something like that. Someone could leverage this to do awesome services.

Jan Pfeifer:

I mean, machine learning should be everywhere, and you need services to drive that, to take users' requests and drive them into the proper ML modeling and stuff like that. And Go is perfect for that. Now Go is also very good for the problems machine learning practitioners or data scientists work on. It's very easy to optimize processes that are not necessarily machine learning, like preprocessing data sets and stuff like that; parallelism is super natural in Go, which in Python is incredibly slow in some cases.

Riccardo Pinosio:

Also, I would say it's interesting because Go is a particularly good language, I think, also for data scientists that are not software engineers. So if you come from a data science, more machine learning background, right, Python, obviously, is very popular, because as a machine learning researcher or data scientist, you don't necessarily want to worry about the production aspects, right, in principle. The alternative used to be, like, okay, well, then you have to do it in C, right, or whatever. And those languages, that's much more difficult. Scientists are not going to, I mean, I'm generalizing here.

Riccardo Pinosio:

But a lot of them are not going to go through the pain to do that. And the beauty of Go is that it it's kinda like in between. Right? It's a it's a very elegant, language, something that, you know, data scientists or scientists can actually use and learn very quickly. And that is, I think, also a great advantage.

Jonathan Hall:

I I agree. My my brother does data science work, and he and I work on a project together. And so Python and R are his comfort zone. Right? But I've never used either one to any real extent.

Jonathan Hall:

So, you know, we're working on this project together, and I'm writing it in Go. And I wouldn't say he's proficient in Go yet, but he can certainly understand it. And he's commented several times that he likes it. It's easy for him to understand, you know, coming from the R background. I think it's a shame that there aren't more people using Go.

Jonathan Hall:

I think the strengths that Python has as a language, being relatively easy to understand, fairly accessible, Go has virtually all the same advantages. It's just missing the libraries and the runtime integration that, thankfully, you guys are now building for us.

Riccardo Pinosio:

That, in a way, yeah, that's kind of also the goal, right, and the aim. Yeah.

Jan Pfeifer:

Now, I should say that it's a lot about the ecosystem, the flourishing ML ecosystem that we have in Python. We don't yet have that in Go, but I think there is space. This world is big, and there is space for more than one flourishing ML ecosystem. Now ML, or data science, is much more than the language itself. The tooling is also important.

Jan Pfeifer:

Right? And that's why part of this work has been doing other associated projects, like a kernel for Jupyter Notebooks, so you can easily write notebooks in Go, plot graphs, see how training is going, things like that. These are needed to make this successful or usable in practice. It's not just the auto differentiation and machine learning math. You have to have a good environment.

Riccardo Pinosio:

And you also need this kind of soft stuff, let's say, around, right, to to to strengthen the ecosystem. Yeah.

Shay Nehmad:

So the long and short of it is, subjectively and objectively, Go has some benefits in your opinion for this sort of workload. And subjectively, but importantly, you're, like, tired of Python's nonsense. I relate so badly. So I feel you.

Jan Pfeifer:

Can I say something about the state of affairs, if I were to guess? I think of what you need to do ML in Go right now, with Hugot and GoMLX, maybe you have 2% of what exists in Python. But the 2% covers, I don't know, 80% of the use cases. So maybe I'm being too, it's more than 2%. I don't know.

Jan Pfeifer:

But, you know, it Mhmm. It covers a lot of ground and enables lots of use cases. For many things, one can use Go if they are familiar with it or they choose to do it.

Riccardo Pinosio:

Yeah. I mean, it's the eighty-twenty rule kind of thing. Right? Like, I mean, also for Hugot, the main driver, and this is still the big workhorse of why people are using it, is the embeddings for semantic search. Right?

Riccardo Pinosio:

Mhmm. Because it's a use case that, you know, many applications are now just integrating. Right? The being-able-to-do-image-recognition thing is useful, some people will use it, but it's definitely not, you know, the main driver.

Riccardo Pinosio:

And the other main driver, I hope, at some point, will also be the generative stuff, because I think that is definitely, you know, useful, to be able to run your chatbots locally, for instance, using Hugot. But yeah. So there's definitely that 2% or 3% of the work that often drives most of the use cases. Yeah.
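Since embeddings for semantic search keeps coming up as the workhorse use case, here is a tiny, dependency-free sketch of what the comparison step boils down to once a model, whether served through Hugot, GoMLX, Ollama, or anything else, has produced the vectors. No Hugot or GoMLX API is shown here.

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of two equal-length embedding vectors:
// dot(a, b) / (|a| * |b|). Higher means "more semantically similar".
func cosine(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	// Toy 4-dimensional "embeddings"; real ones have hundreds of dimensions
	// and come out of the model.
	query := []float32{0.1, 0.9, 0.2, 0.0}
	docA := []float32{0.1, 0.8, 0.3, 0.1}
	docB := []float32{0.9, 0.1, 0.0, 0.2}
	fmt.Printf("query vs docA: %.3f\n", cosine(query, docA))
	fmt.Printf("query vs docB: %.3f\n", cosine(query, docB))
}
```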

Shay Nehmad:

I think a lot of people who do AI or AI-adjacent workloads right now, myself included, whenever I have to do something like run a model locally or compute embeddings, embeddings is a great example, and I wanna work in Go, which I do, like, in my own projects, I use Go, say I had to embed some text right now. Yep. I just spin up Ollama, and everything that's AI is a network call away. Like, it's never in my actual Go runtime, which has the benefit of, you know, I don't have to write anything, but has this downside of, okay, I have to run an Ollama server. I have to download the model.

Shay Nehmad:

And to me, it also feels like I have no idea how it works. It's like, okay, it might be running locally on my machine, but, you know, I'm not writing Ollama. I'm not writing these models. I'm not actually tuning them. If they don't work well, I'm out of luck.

Shay Nehmad:

I don't know what to do, and I don't have anything to tweak. If I were to take this specific workload, just as an example, because it's not a workload that I'm doing right now, and I were to say, okay, I really wanna do it, quote unquote, myself. I wanna embed this text myself, chunk it myself, and not, like, have a third party do it for me over a network call. Even though it's happening locally, it's still, like, you know... do you think I would have more insight or more control in the process, or would it sorta look the same? It would be like, get embeddings, and you have no idea what's happening.

Shay Nehmad:

Or will I have more, like, capability to tweak it and understand what's happening and go, like, deeper into the weeds if I were to use, like, you know, the Go tooling?

Riccardo Pinosio:

I would say, I mean, of course, I would say so, right. On the Ollama side, though, one thing that I also don't understand, maybe that has changed, is, indeed, it's a network call away. But actually, Ollama is written in Go. Right? So at some point, I was also looking at, okay, is there, like, an SDK for Ollama that I can use without having to go over the network? And at the time, I couldn't find it.

Riccardo Pinosio:

Maybe that has changed. But I would say that there are benefits in doing it not over the network, but using something like Hugot, for example. One of them, most simply, is that you can build, for example, workflows that also fine-tune your embeddings. So you can use the off-the-shelf open source models, but then if you have a specific use case that requires you to fine-tune the embeddings for that use case, I think Hugot plus GoMLX makes it quite easy now to do that, and also to get insight into what's happening in terms of all the different quality metrics you can calculate there.

Riccardo Pinosio:

That's at least my 2¢ on this. Yeah. What do you think?

Shay Nehmad:

Is it like will it allow me to have a deeper understanding of what I'm doing with these workloads?

Jan Pfeifer:

It certainly allows it, but if your use case is just to use the given LLM that you're given, then it doesn't matter, right, whether it's local or over the network, except for the latency and whatever you're paying. Yeah.

Shay Nehmad:

I mean, the use case is to use a thing that's off the shelf until it doesn't work very well. And then,

Jan Pfeifer:

you know

Shay Nehmad:

what I mean? If it's a Go library, I always feel like, you know, let's say I'm using a serialization library. I mean, you're a basic user of, let's say, protobuf. And you use it for a month and it's good, and then you want a specific feature. Like, you wanna figure out exactly how oneof works.

Shay Nehmad:

When it's Go, you can, like, double-click into the code you're looking at. You know what I mean? Go to definition, understand what's going on. What I'm trying to understand is, will Hugot and GoMLX allow me to have that insight of, like, clicking into the embedding code and maybe understanding what's going on? Or are they similar to the Python experience, where you click on the thing and then two clicks in, boom, you're in C.

Shay Nehmad:

It's just like grabbing C code, and it's hard to have that understanding of what's going on. Definitely, I'll have more understanding than just running something locally, over the network, and working with it. But I'm trying to understand, is everything implemented in Go? Or at some point, will I hit, oh, it's wrapping a C library that actually does the hard work?

Jonathan Hall:

As long as it's not wrapping Python.

Jan Pfeifer:

So let me say something. Go is very fast. For many things, it's super good. But if you're doing math, number crunching, and it matters. Right?

Jan Pfeifer:

If every percentage point you squeeze matters, then you do things in C, C++, or Rust, and then you mix in assembly. And so the lowest levels, if you're using the very fast engine, will be done in one of those. Now, not everything needs all that performance, and hence the Go engine. Right? The simple Go engine that I wrote.

Jan Pfeifer:

It's much slower, but for many problems, it's there. It's simple to read. It doesn't have any black magic, which makes it simpler to read and to evaluate. But once money matters, if you have a thousand servers and you're serving a million users, then you want to do whatever is fastest, and then there's really what I call black magic code inside these things: super-optimized libraries squeezing out the maximum performance the CPU, GPU, or TPU allows. And those will be in assembly, C++, and more recently, some in Rust.

Jan Pfeifer:

But it's still, I think, mostly C++. Yeah.

Riccardo Pinosio:

I think it's going to be difficult to compete on that level with C libraries. Like, if you use Hugot and you keep clicking in your IDE, at some point, you will hit C. Right? I can tell you that. So it's very hard to compete with the amount of man-hours that basically went into optimizing that thing.

Riccardo Pinosio:

Right? But what we can do is offer, like, as Jan was mentioning, it's not just about the performance. It's also about the ecosystem. Right? How easy is it to use?

Riccardo Pinosio:

In Hugot, it's like three lines. Right? You import the Hugot library, and then you can set up your pipeline, you know, quite easily in just a few lines of Go. Now, with the simple Go backend, you don't even need the C stuff. Right?
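
For a sense of what "a few lines of Go" looks like, this is roughly the shape of a Hugot feature-extraction pipeline. The identifiers below (NewGoSession, FeatureExtractionConfig, NewPipeline, RunPipeline) are from memory of the hugot README and may differ between versions, and the model path is purely illustrative, so treat this as a sketch rather than copy-paste code.

```go
package main

import (
	"fmt"
	"log"

	"github.com/knights-analytics/hugot"
)

func main() {
	// Pure-Go backend session (no C dependency). Names follow the hugot README
	// as I recall it; verify against the version you actually install.
	session, err := hugot.NewGoSession()
	if err != nil {
		log.Fatal(err)
	}
	defer session.Destroy()

	config := hugot.FeatureExtractionConfig{
		ModelPath: "./models/all-MiniLM-L6-v2", // illustrative path to a local ONNX sentence-embedding model
		Name:      "embedder",
	}
	pipeline, err := hugot.NewPipeline(session, config)
	if err != nil {
		log.Fatal(err)
	}

	result, err := pipeline.RunPipeline([]string{"semantic search in Go"})
	if err != nil {
		log.Fatal(err)
	}
	// Print the raw output struct; its exact shape depends on the hugot version.
	fmt.Printf("%+v\n", result)
}
```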

Riccardo Pinosio:

So it's quite straightforward to do. You do see it now, for example, if you go to the documentation on vector embeddings for the Go SDK for Redis, you will see that they now use Hugot to calculate the vector embeddings. It's quite simple to do. Right?

Riccardo Pinosio:

If you want that kind of performance Jan was mentioning, yeah, you will hit C in Hugot too. There's no way around it.

Jan Pfeifer:

Can I take a different take on this question? I think it always depends on what the goal of the user is. If the goal of the user is to improve their embeddings, to retrieve better results for whatever the user wants to search, then they only need to know the machine learning level: what the model is doing and how to train it better. And everything up to that level can be Go. Only if they want to know how matrix multiplications that optimize cache locality need to be made.

Jan Pfeifer:

Then, if they really want the fastest one, they will end up in the oneDNN library that is done by Intel, in C or C++, I don't know.

Jonathan Hall:

Right.

Riccardo Pinosio:

Yeah. So even the framework level, you know, you can do in Go. It's really just the optimization part, let's say. Yeah. Cool.

Jonathan Hall:

So we talked quite a bit about the architecture, how you're doing this, Python versus Go, a bunch of the technical stuff. Let's do a more fun question. I'm curious if there are any ML or AI projects or problems that you've worked on that you think are particularly fun. Have you been identifying ice cream flavors, or anything that's just fun to talk about over a beer?

Jan Pfeifer:

Should I start or we can Yeah.

Riccardo Pinosio:

You can start. Yeah.

Jan Pfeifer:

Okay. I'll say two that I'm very happy about. Alright. The first one was an exercise in learning reinforcement learning with AlphaZero. You know, the famous DeepMind algorithm that beat everyone in chess and then Mhmm.

Jan Pfeifer:

Or in Go. Right? They started with AlphaGo. I reimplemented it for a small game called Hive. And because of the Go backend, that thing compiles to WebAssembly, and I just serve it from GitHub.

Jan Pfeifer:

You can play it. It was trained with GoMLX, and it's compiled to WebAssembly and served with GoMLX. And it works, and anyone can learn how to do this type of game with AlphaZero. If you want to implement it, it's very easy. Cool.
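
For context on how a Go program ends up playable in the browser and served from GitHub: the standard toolchain can target WebAssembly directly, and the resulting .wasm file plus the wasm_exec.js glue shipped with Go can be hosted on any static site. This is a generic sketch of the pattern, not Jan's Hive code; the nextMove function and its behavior are made up for illustration.

```go
//go:build js && wasm

// Build with: GOOS=js GOARCH=wasm go build -o game.wasm
// Serve game.wasm alongside wasm_exec.js, found at $(go env GOROOT)/lib/wasm/
// (misc/wasm/ on older Go releases), from any static host such as GitHub Pages.
package main

import "syscall/js"

func main() {
	// Expose a Go function to the page's JavaScript; a real game would run its
	// move-selection logic (e.g. an AlphaZero-style policy) here instead.
	js.Global().Set("nextMove", js.FuncOf(func(this js.Value, args []js.Value) any {
		boardState := args[0].String() // hypothetical serialized board passed in from JS
		_ = boardState
		return "pass" // placeholder move
	}))
	// Block forever so the Go runtime keeps running and the exported function stays callable.
	select {}
}
```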

Jan Pfeifer:

Another one that was super fun was working with flow matching, like stable diffusion, to generate images. I did a small one. I didn't have the hardware to train seriously, but it does generate flowers, pretty flowers, out of nothingness, out of randomness. And it's very educative and

Riccardo Pinosio:

I haven't seen these flowers. Now you'll have to show me those generated flowers. On my side, we actually did a lot of projects that are very interesting using Hugot. Unfortunately, a lot of them are kind of proprietary things that I can't talk about. But what I would say is that, actually, for me, the most fun thing has been to see how other people are using this, right, in their projects.

Riccardo Pinosio:

So we actually talked about this with Jan. You know? We're starting this thing, and we're putting out these open source libraries motivated by our own needs. But then, will people use them? Will people contribute back to them? Right?

Riccardo Pinosio:

And there's always this question around open source where, you know, a few people do the work, and then you always hope that some contributions will come in. And, you know, at the beginning, there weren't that many contributions coming back in. But, actually, in the last year, that has improved. Right? So people have started using Hugot in, as I mentioned, Bento.

Riccardo Pinosio:

We have seen it used in the Redis docs, but then also, you know, people opening more issues on the Hugot page. Hey, we wanted to do image recognition. Can you implement an image recognizer? Can you add the cross encoders?

Riccardo Pinosio:

A lot of, you know, Go developers that needed variations on the semantic search use case came to the Hugot GitHub and asked, okay, well, do you have the cross encoder? Because I want to build my own note-taking app, and I need to be able to do, you know, nice ranking, right, and embedding. And it's just been really fun to see people using something you build. Right?

Riccardo Pinosio:

So this is, yeah, for me, the first open source library that people are actually actively using. Jan has a lot more experience than me in open source. So for me, it's been a very fun ride. You know?

Riccardo Pinosio:

Yeah. Nice. And then there's always a bit of a thrill. Right? Like, oh, will people like this thing?

Riccardo Pinosio:

And then, you know, oh, maybe they will criticize how this was implemented. You know? Because Go programmers can always be quite critical of what's happening, and so on and so forth. But, you know, it's been really fun.

Jonathan Hall:

Well, I wanna thank you guys both for coming on the show and for the tools you're building. Even though they're not tools that I directly use, I'm really glad that they're there, because I think it helps to advance, you know, the Go ecosystem, but also, just in general, to advance

Shay Nehmad:

Yeah. If you're a backend slash data scientist person at a company, and someone is making the argument, oh, no, but we have to use Python because Python is for data science: you heard it here. Maybe you can get by with using Go.

Shay Nehmad:

Yeah. Yeah. For

Jan Pfeifer:

many applications, you can. For many common applications, you're perfectly fine. And

Shay Nehmad:

any person that joins this, like, bandwagon, you know, helps advance it just a little bit more, you know: one bug report, one feature. Maybe you have your own little library, and suddenly, boom, six months from now, this tiny niche of data science is also available in Go.

Jan Pfeifer:

Yes. I think, to tell you the truth, it should be available in every language, but my favorite one is Go, so I'm

Shay Nehmad:

Yeah.

Jan Pfeifer:

I'm pushing for it in our corner of the world.

Jonathan Hall:

Well, on that note, since Go is your favorite language, it's a great segue into the question we ask all of our guests, which is: who has been the most influential person for you, and there's two of you here, in your Go journey? Perhaps somebody who introduced you to the language, or somebody who helped you learn new idioms, or something like that. Who has influenced you the most in your Go journey?

Riccardo Pinosio:

You want to start, Jan?

Jan Pfeifer:

Okay, yeah. I started early. I started all by myself, out of curiosity, at Google; Go was created there. Right? And it was perfect for what I was doing.

Jan Pfeifer:

I was controlling hundreds of thousands of servers doing machine learning. And I needed something to control all this parallelism, things failing all the time and recovering. So I wrote a giant orchestrator there at Google. Everyone wrote one of those at some point in their Google life.

Jan Pfeifer:

And it was hard to get code review for it, because very few people were doing Go yet. And at some point, the Go team had very strict readability requirements for Go. But since we didn't have anyone else on the team, and we needed to get the product going, after some negotiation, Robert Griesemer was reviewing my code for six months, eight months. I was sending him changes, he was reviewing them, and it was super useful. I mean, not only for Go, but also in general about language. I think it was very interesting.

Jonathan Hall:

What I wouldn't give to have someone like that review my code for six months.

Jan Pfeifer:

That was lucky, yes. Yeah. It was a lucky experience. Yeah.

Shay Nehmad:

Grab all the comments, feed them into GoMLX, and build a little, you know what I mean? A small language model that reviews your code the same way.

Jan Pfeifer:

There are some things I hope Robert will never see in my GoMLX code, because some things are just not possible.

Riccardo Pinosio:

After this one, he will. Right?

Shay Nehmad:

One other downside of open source, people seeing all your

Jan Pfeifer:

graphs. Yeah.

Jonathan Hall:

How about you, Riccardo?

Riccardo Pinosio:

I'm afraid I'm not gonna be very original on this. I would mention two people. So one is my coworker and also co-contributor on Hugot. That is Rob Kevel, who is not here today, but he's also one of the main driving forces behind Hugot. And my profile is different in the sense that I've been working with Go

Riccardo Pinosio:

well, now it's three years, but not more than that. Before that, it was Python. And before that, it was actually R. And before that, it was Lisp. Right?

Riccardo Pinosio:

So I actually come from a very different tradition. So actually, when I was mentioning the types, you know, I actually don't necessarily care too much about types in principle. I think you can do very good things without them. But I had a very different approach to programming when I started working with Rob. And he is, in a way, the quintessential Go programmer in terms of: let's keep things simple.

Riccardo Pinosio:

Let's not generalize before we actually need to generalize. And I learned a lot from him in working on Hugot. And the second person I will mention, and now it's gonna sound like I'm glazing him, but it's definitely Jan. Because in working with Jan in the last year, on Hugot, but also on, for example, the ONNX library, I have learned so much about,

Shay Nehmad:

you know, that you know, we

Riccardo Pinosio:

can use Go for other things, not only cloud backend engineering. And so these are my two candidates. Yeah.

Jonathan Hall:

Great. Good answers. Well, thanks again, guys, for coming on and talking to us about what you're working on. We will, of course, have links in the show notes to both Hugot and GoMLX. Is there anything else that you guys would like our readers... our readers.

Jonathan Hall:

I guess someone could read the transcript. Our listeners. Is there anything else you'd like our listeners to be aware of? Anything you wanna share? Resources relevant to the topic?

Riccardo Pinosio:

One thing I will mention quickly, related to what I was saying before: for us, it's very important that people use the libraries, because that's also how we actually get bug reports and so on and so forth. So you don't need to be a data scientist or machine learning engineer. Right? You can be, like, a developer or a backend developer, and, hey, you want to use some machine learning in your Go application. You think of a use case.

Riccardo Pinosio:

Please come and, you know, try Hugot, try GoMLX, and tell us if there is something you find that is not there. Because otherwise, obviously, we develop Hugot in particular based on our own needs at Knights Analytics, our own company. That would be wonderful. Some people have already started doing that, but we are always looking for more. Yeah.

Jonathan Hall:

Awesome.

Jan Pfeifer:

I will make a last call on this. Also, if you want to learn about ML, you know, GoMLX was done for researchers and for productionization, but it's an awesome place to learn about ML and experiment with new things. So if you're curious and you like Go, don't hesitate. Go for it. We are all helpful there.

Jan Pfeifer:

We are few but helpful people, and it's a great platform to learn on.

Riccardo Pinosio:

Few but friendly.

Jan Pfeifer:

Alright.

Jonathan Hall:

Well, thanks again, guys. Appreciate your time and sharing your knowledge with us. Thank you, Jonathan. Thank you. We'll have you back on in a year and a half to talk about the new advancements in Hugot.

Riccardo Pinosio:

Yes. We'll look even more shiny, I hope.

Jonathan Hall:

Awesome. Thanks, guys.

Shay Nehmad:

Rover, exit out. Goodbye.

Creators and Guests

Jonathan Hall, Host: Freelance Gopher, Continuous Delivery consultant, and host of the Boldly Go YouTube channel.
Shay Nehmad, Host: Engineering Enablement Architect @ Orca