⭒ How to Raise a Creole: PTX, DSLs, and the Case for Optimizing the Universal Dialect

My brilliant teammates have published a post here, and you should read it. Full credit goes to them; I played a small part in the condensation portion.

First, I want to open with a little riddle, which is this:

How can you end up as a native speaker of a language that wasn't anybody's native language to begin with?

Once upon a time, long before we had Duolingo, we had merchants scattered across the ports of Malacca, calling out prices in mutually unintelligible tongues and mutually bewildering each other with their replies.

When speakers of mutually unintelligible languages are forced together, and their brains have lost the neuroplasticity needed to learn each other's tongues, they converge on a pidgin: a stripped-down, functional common language that nobody speaks natively but everyone can use.

Now, when these mutually unintelligible speakers go on to have children, their children grow up hearing them speak pidgin. And these children, with their hungry, highly-neuroplastic brains, absorb this language, and they end up as native speakers of a language their parents weren't even native in.

This native tongue, my friends, is called a creole.

Standard Kernel's Mission

Now, after telling the story of those merchants who lacked a common language but found their way to a pidgin anyway, we're ready to tell the story of Standard Kernel, and our search for such a universal dialect.

Once upon a time, there was a fractured kingdom of GPU programmers, each with their own dialects (DSLs), born from different yet equally opinionated philosophies of what programming a GPU should feel like.

We at Standard Kernel seek to speak to the whole kingdom, by teaching our LLMs to move freely between clans and optimize across DSLs, so they can write super efficient kernels regardless of whose dialect they're beholden to.

We need a pidgin. Not just any pidgin, mind you, but a pidgin that's sustainable to raise a creole in.

My Momma Can't Raise No Creole in PTX

You can probably see where this is going. Our DSLs, of course, are the mutually unintelligible merchants. And their pidgin, the stripped-down functional language they all secretly share, is PTX.

At first glance, PTX seems like the most natural candidate for a universal language. As long as our merchants target NVIDIA GPUs, they'll compile down to PTX. And as long as we serve these merchants, we'll optimize our kernels in PTX.

So it follows, then, that we'll teach our dutiful LLMs to optimize PTX, right?

Let's raise a good creole LLM to speak PTX like the good creole it is. The dream would be for our LLM to go native in PTX: to become a creole speaker, the generation that grows up with the pidgin as a first language and develops fluency in it.

Alas! No LLM can be a PTX creole. PTX is extremely verbose at the token level, and even when it fits in the context window, it's one of those painful languages that resists intuition.
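To get a rough feel for that verbosity, here is a minimal sketch in Python. It compares one line of CUDA C++ with a hand-written, PTX-style rendering of the same statement. Everything in it is an assumption made for illustration: the PTX fragment is not real compiler output (the register names and the label are made up), and `crude_tokens` is a hypothetical stand-in that splits on words and punctuation, which is not how an actual LLM tokenizer works.

```python
import re

# One line of CUDA C++: the guarded body of a SAXPY-style kernel.
CUDA_SOURCE = "if (i < n) y[i] = a * x[i] + y[i];"

# Hand-written, PTX-style rendering of the same statement.
# Illustrative only: NOT real compiler output; registers and label are made up.
PTX_SKETCH = """
setp.ge.s32 %p1, %r4, %r5;
@%p1 bra $L_done;
mul.wide.s32 %rd5, %r4, 4;
add.s64 %rd6, %rd1, %rd5;
add.s64 %rd7, %rd2, %rd5;
ld.global.f32 %f2, [%rd6];
ld.global.f32 %f3, [%rd7];
fma.rn.f32 %f4, %f1, %f2, %f3;
st.global.f32 [%rd7], %f4;
$L_done:
"""


def crude_tokens(text):
    """Split text into identifier-like words and single punctuation marks.

    This is a rough proxy for token count; a real LLM tokenizer
    would split the same text differently.
    """
    return re.findall(r"[A-Za-z_0-9]+|[^\sA-Za-z_0-9]", text)


cuda_count = len(crude_tokens(CUDA_SOURCE))
ptx_count = len(crude_tokens(PTX_SKETCH))

print(f"CUDA line:  {cuda_count} crude tokens")
print(f"PTX sketch: {ptx_count} crude tokens")
print(f"ratio:      {ptx_count / cuda_count:.1f}x")
```

The exact ratio depends entirely on the tokenizer and the kernel, so treat the number as a vibe rather than a measurement. The point is only that a single readable line fans out into many register-level instructions.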

Toward a Pidgin Worth Raising a Creole In

It turns out there's a way for our LLMs to read PTX that doesn't require suffering, and that's by simply not asking them to read PTX.

No, we have a special pidgin, one that's neither the DSLs nor PTX.

For protective reasons, I won't tell you how we got our hands on a suspiciously good representation that our LLMs find suspiciously readable. We call it the "condensed normalized representation," which is a bureaucratic mouthful of a name for something we're quite fond of. :-)

What I will tell you is that our condensed normalized representation does quite well, and lets us learn and combine optimizations across DSLs in ways that are unconstrained by any individual DSL. Read the full post on the official Standard Kernel website, here.