An extreme caricature example of a "lumper" would just use the word "computer" to label all Turing-complete devices with logic gates. In that mindset, having a bunch of different words like "mainframe", "pc", "smartphone", "game console", "FPGA", etc. is redundant because they're all "computers", which makes the various other words pointless.
On the other hand, the Splitters focus on the differences and I previously commented why "transpiler" keeps being used even though it's "redundant" for the Lumpers : https://news.ycombinator.com/item?id=28602355
We're all Lumpers vs Splitters to different degrees for different topics. A casual music listener who thinks of orchestral music as background sounds for the elevator would "lump" both Mozart and Bach together as "classical music". But an enthusiast would get irritated and argue "Bach is not classical music, it's Baroque music. Mozart is classical music."
The latest example of this I saw was someone complaining about the word "embedding" used in LLMs. They were asking ... if an embedding is a vector, why didn't they just re-use the word "vector"?!? Why is there an extra different word?!? Lumpers-vs-splitters.
It's good to be aware of that from an engineering standpoint, because the host language will have significantly different limitations, interoperability and ecosystem, compared to regular binary or some VM byte-code.
Also, I believe that they are meaningfully different in terms of compiler architecture. Outputting an assembly-like language is quite different from generating an AST of a high-level programming language. Yes, of course it's fuzzy, because some compilers use intermediate representations that in some cases are fairly high-level, but still they are not meant for human use and there are many practical differences.
It's a clearly delineated concept, why not have a word for it.
Yes, I know. You could argue that a C compiler is a transpiler, because assembly language is generally considered a programming language. If this is you, you have discovered that there are sometimes concepts that are not easy to rigorously define but are easy for people to understand. This is not a rare phenomenon. For me, the difference is that a transpiler is intending to target a programming language that will be later compiled by another compiler, and not just an assembler. But, it is ultimately true that this definition is still likely not 100% rigorous, nor is it likely going to have 100% consensus. Yet, people somehow know a transpiler when they see one. The word will continue to be used because it ultimately serves a useful purpose in communication.
It's basically a flowchart showing all of the different things that we mean when we say compiler/interpreter/transpiler, and which bits they have in common.
Funny, but it has two paths for transpiler - the kind that parses and outputs source from an AST, and the asm.js kind, that actually just uses a high-level language as an assembly-ish target.
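To make the second path concrete, here is a minimal sketch of using a high-level language as an assembly-ish target. It lowers one arithmetic expression into flat, three-address-style Python statements. The function name `lower_expression` and the temporaries `t0`, `t1`, ... are made up for illustration, and it assumes the input doesn't already use names like `t0` or `result`. It is not how asm.js or any real toolchain emits code.

```python
import ast


def lower_expression(source: str) -> str:
    """Lower one arithmetic expression into flat three-address-style Python."""
    ops = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}
    lines = []

    def emit(node) -> str:
        # Leaves are returned as operands; no instruction is emitted for them.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        # Each binary operation becomes exactly one "instruction" on its own line.
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            left = emit(node.left)
            right = emit(node.right)
            temp = f"t{len(lines)}"  # hypothetical temporary name
            lines.append(f"{temp} = {left} {ops[type(node.op)]} {right}")
            return temp
        raise ValueError(f"unsupported syntax: {ast.dump(node)}")

    result = emit(ast.parse(source, mode="eval").body)
    lines.append(f"result = {result}")
    return "\n".join(lines)


print(lower_expression("a * b + c * 2"))
```

The output is still valid Python, but nobody would write it by hand: `t0 = a * b`, `t1 = c * 2`, `t2 = t0 + t1`, `result = t2`. The target syntax is high-level while the shape of the code is closer to an instruction list.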
So you do know the difference.
In my book, transpilers are compilers that consume a programming language and target human-readable code, to be consumed by another compiler or interpreter (either by itself, or to be integrated in other projects).
i.e. the TypeScript compiler is a transpiler from TS to JS, the Nim compiler is a transpiler from Nim to C, and so on.
I guess if you really want to be pedantic, one can argue (with the above definition) that `clang -S` might be seen as a transpiler from C to ASM, but at that point, do words mean anything to you?
import functools as ft

def fact(n):
    # Multiply 1..n inclusive; the initial value 1 makes fact(0) return 1.
    lst = range(1, n + 1)
    return ft.reduce(lambda acc, x: acc * x, lst, 1)

Amusing that there's not a list comprehension in sight.
An assembler is a type of compiler that takes in an assembly language and outputs machine code.
A transpiler is a type of compiler that takes in a language commonly used by humans to directly write programs and outputs another language commonly used by humans to directly write programs. E.g. c2rust is a C to unsafe Rust compiler, and since both are human-used languages it's a transpiler. Assembly language isn't commonly written by humans though it used to be, so arguably compilers to assembly language are no longer transpilers even though they used to be.
The existence of a transpiler implies a cispiler, a compiler that takes in code in one language and outputs code in that same language. Autoformatters are cispilers.
A transpiler to me focuses on having to change or understand the code as little as possible - perhaps it can operate on the syntax level without having to understand scopes, variable types, the workings of the language. It does AST->AST transforms (or something even less sophisticated, like string manipulation).
In my mind, you could have a C++ to C transpiler (which removes C++ constructs and turns them into C ones, although C++ is impossible to compile without a rich understanding of the code), and you could have a C++ to C compiler, which would be a fully featured compiler, architected in the way I described in the start of the post, and these would be two entirely different pieces of software.
So I'd say the term is meaningful, even if not strictly well defined.
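As a rough illustration of the "AST->AST transform" flavour described above, here is a small sketch that strips type annotations from Python source using the standard `ast` module (it needs Python 3.9+ for `ast.unparse`). The names `StripAnnotations` and `strip_types` are made up for this example. It only handles plain `def` functions and annotated assignments, and it never needs to understand scopes or types, which is the point.

```python
import ast


class StripAnnotations(ast.NodeTransformer):
    """Remove type annotations without looking at scopes or types."""

    def visit_FunctionDef(self, node):
        self.generic_visit(node)  # also visits the arguments and the body
        node.returns = None       # drop the "-> T" part
        return node

    def visit_arg(self, node):
        node.annotation = None    # drop the ": T" on a parameter
        return node

    def visit_AnnAssign(self, node):
        self.generic_visit(node)
        if node.value is None:
            # A bare declaration such as "x: int" has no runtime effect.
            return ast.copy_location(ast.Pass(), node)
        plain = ast.Assign(targets=[node.target], value=node.value,
                           type_comment=None)
        return ast.copy_location(plain, node)


def strip_types(source: str) -> str:
    tree = StripAnnotations().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


typed = "def add(a: int, b: int = 2) -> int:\n    total: int = a + b\n    return total\n"
print(strip_types(typed))
```

Every rewrite here is local to one node. A tool that had to resolve overloads or templates, as in the C++ to C case, could not get away with this.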
I think the note about generators may be a good definition for when one language is "more powerful" than another; at least it's a good heuristic:
> The input and output languages have the syntax of JavaScript but the fact that compiling one feature [generators] requires a whole program transformation gives away the fact that these are not the same language. If we’re to get beyond the vagaries of syntax and actually talk about what the expressive power of languages is, we need to talk about semantics.
If a given program change is local in language X but global in language Y, that is a way in which language X has more expressive power.
This is kind of fuzzy because you can virtually always avoid this by implementing an interpreter, or its moral equivalent, for language X in language Y, and writing your system in that DSL (embedded or otherwise), rather than directly in language Y. Then anything that would be a local change in language X is still a local change. But this sort of requires knowing ahead of time that you're going to want to make that kind of change.
Sadly https://people.csail.mit.edu/files/pubs/stopify-pldi18.pdf is 403. But possibly https://people.csail.mit.edu/rachit/files/pubs/stopify-pldi1... is the right link.
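A small-scale sketch of the local-versus-global point, in Python rather than JavaScript: the same countdown written once with a generator and once by hand for a hypothetical target that lacks generators. The names `countdown` and `Countdown` are made up, and this is not what Babel emits. It only shows how the local variable and the loop position have to be moved out of the function body.

```python
def countdown(n):
    """Generator version: pausing and resuming is a language feature."""
    while n > 0:
        yield n
        n -= 1


class Countdown:
    """Hand-desugared version for a target without generators.

    The local variable becomes a field, and the loop body is re-entered
    on every call to __next__ instead of being suspended in place.
    """

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


# Both versions produce the same sequence.
print(list(countdown(3)), list(Countdown(3)))
```

Adding a second `yield` to the generator is a one-line change. In the class version it would mean introducing an explicit state field and restructuring `__next__`, which is the kind of non-local change the quote is pointing at.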
For example, Google Scholar search for "transpiler" yields just 3200 results, compared to ~1.4M for "compiler".
Off the top, let's compare that to "serverless."
"BabelJS is arguably one of the first “transpilers” that was developed so that people could experiment with JavaScript’s new language features that did not yet have browser implementations"
Just my two cents. Haxe was created a long time ago, and BabelJS is arguably not one of the first "transpilers" people can play with.
[1] https://en.wikipedia.org/wiki/Haxe
[2] https://haxe.org
Poly-transpiler? It will also trigger more people.
When used, it has often been implied that a compiler that outputs to a human-readable programming language wouldn't be a "real compiler".
Sure, a transpiler is a specialized form of compiler. However that doesn't mean it's not much clearer to describe a transpiler using the more specific name. As such recommending someone replace "compiler" with "transpiler" (when appropriate) does not mean using compiler is wrong. It simply means that, outside of some very niche-interest poetry, using transpiler is better!
So eloquently put, what starts off as just simple syntactic conversion usually snowballs into semantics very quickly.
These things live on a continuum. Still, I think the different words are useful. They put forward different concepts and ideas. It helps frame things.
Is JIT also meaningless?
But ultimately if you don’t want to use a word, don’t use it. Not wanting to hear a word says more about the listener than the speaker
[1] https://github.com/FranklinChen/p2c
[2] https://en.wikipedia.org/wiki/Stalin_%28Scheme_implementatio...
So, where does BabelJS sit? Somewhere in between, depending on what language features you used in the input code. Obviously generators require heavy transformations, but other features don't.
> This is pretty much the same as (2). The input and output languages have the syntax of JavaScript but the fact that compiling one feature requires a whole program transformation gives away the fact that these are not the same language
It is not really the same as (2); you can't cherry-pick the example of Babel and generalise it to every transpiler ever. There are several transpilers that translate from one high-level language to another, such as Kotlin to Swift, i.e., targeting the same level of abstraction.
Wonder what this person would say about macro expansion in Scheme; maybe that should also be considered a compiler as per their definition.