$\begingroup$

How can a language that compiles indirectly, targeting the languages of existing systems or software, get proper type hints, autocompletion, diagnostics, and syntax highlighting, including for custom file extensions (such as XML components)? I am aware of the Language Server Protocol, which most editors support, but I am not sure it can handle a language where the compiler used, and the handling of certain additional file extensions, are up to the implementation (the platform).

Multi-compiler language: a language that targets multiple languages behind a platform, where a platform is software that allows for scripting, such as Node.js or Unreal Engine. You may think of it as something like Haxe, which targets other languages, but I call the targets platforms rather than languages because the final languages have varying applications.

There is no single compiler to use: I have a single programming language, but its semantics and implementation may vary a bit according to the developer's project.

Dependency manager

I have only planned (not finished) a scripting toolset that consists of a unique programming language and a dependency manager (or package manager). These parts of the toolset overlap with each other.

In that dependency manager, package manifests would describe a package that belongs exclusively to one platform. The package registry available to that package varies according to that platform, meaning that platforms do not conflict.

A platform describes which compiler subset of the programming language to use. That compiler subset handles file extensions other than the language's own, such as user interface components described in XML files. The platform also determines how the package is run by the execution command.
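To make the platform description above a little more concrete, here is a minimal Python sketch of what such a descriptor might hold. Everything in it is an assumption for illustration: the `Platform` class, the `run_command` value, and the `check_xml_component` handler are hypothetical names, not part of any existing tool.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import Callable, Dict, List

# In this sketch a handler takes file contents and returns diagnostic messages.
Handler = Callable[[str], List[str]]


def check_xml_component(text: str) -> List[str]:
    """Hypothetical handler: report malformed XML component files."""
    try:
        ET.fromstring(text)
        return []
    except ET.ParseError as err:
        return [f"XML error: {err}"]


@dataclass
class Platform:
    """Hypothetical platform descriptor (names are illustrative only)."""
    uri: str
    run_command: List[str]
    extra_handlers: Dict[str, Handler] = field(default_factory=dict)

    def diagnose(self, filename: str, text: str) -> List[str]:
        # Pick the handler registered for this file extension, if any.
        suffix = PurePosixPath(filename).suffix
        handler = self.extra_handlers.get(suffix)
        if handler is None:
            return [f"no handler for '{suffix}' files on this platform"]
        return handler(text)


platform = Platform(
    uri="http://www.nodejs.org/2009",
    run_command=["node", "out/main.js"],  # assumed run command
    extra_handlers={".xml": check_xml_component},
)
```

An editor integration could call `platform.diagnose(...)` for any file whose extension is not the language's own, so the per-platform behaviour stays in one described place.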

Example manifest:

{
    "id": "com.n1.n2",
    "version": "0.1.0",
    "platform": "http://www.nodejs.org/2009",
    "compilerOptions": {
        "sources": ["src"]
    }
}

Here, the platform property is a URI identifying the platform. In my design, platforms would be explicitly installed through a command. Platform URIs beginning with file: would locate a platform in the host file system without the need to install it.
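The lookup rule described here could be sketched as follows in Python. This is only an illustration under assumptions: the `INSTALLED` table, its install path, and the `resolve_platform` function are hypothetical, since the toolset is not implemented.

```python
import json
from urllib.parse import unquote, urlparse

# Hypothetical table filled in by the toolset's platform install command.
INSTALLED = {"http://www.nodejs.org/2009": "/opt/platforms/nodejs"}


def resolve_platform(manifest_text: str, installed=INSTALLED) -> str:
    """Return the directory of the platform named by a manifest."""
    manifest = json.loads(manifest_text)
    uri = manifest["platform"]
    parsed = urlparse(uri)
    if parsed.scheme == "file":
        # file: URIs point straight at the host file system, no install needed.
        return unquote(parsed.path)
    try:
        return installed[uri]
    except KeyError:
        raise LookupError(f"platform not installed: {uri}") from None
```

A language server could run the same lookup when a project is opened, so that the editor and the build agree on which platform is in effect.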

For example, there could be the Node.js platform, or the Unreal Engine platform. The semantics and available APIs would vary according to the platform, including interpretation of certain, if not all, meta-data (meta-data are plain attributes attached to definitions).

A further complication is that I had the idea of describing package scripts, such as the build script, as actual nested packages instead of using development dependencies. I am not sure if this is possible.

Compilers

Compilers would be implemented as subsets of the single compiler of the language. Here is a compilation cycle taken from my project:

Chart

I only suppose this is possible. The major work in implementing a compiler subset is the bytecode-to-artifacts step (usually transpilation from the unique bytecode to code in another form), plus handling custom file extensions such as user interface components in XML.

The Haxe language might have something in common with this, but one difference is that a library in Haxe may support more than one target (for example, a Haxe library may be designed to target either C++, SWF, or Lua). In my case, the packages available in the registry vary according to the manifest's assigned platform.

$\endgroup$
  • 1
    $\begingroup$ Can you describe what you mean by a "N-platforms programming language"? I'm not sure I fully understand some of the question. In particular, I don't fully understand how most of the question body relates to the question title. However, one of the main ways languages get editor support is by using the Language Server Protocol (LSP). Many editors support this and many languages have "language servers" that communicate with editors using this protocol. $\endgroup$ Commented Apr 3, 2024 at 15:24
  • $\begingroup$ @DavidYoung I'm aware of LSP, but I was not thinking it would support handling languages in that way (i.e. when the language leaves custom file extensions and some behavior up to the implementor). $\endgroup$ Commented Apr 3, 2024 at 16:22
  • 3
    $\begingroup$ If the language doesn't precisely specify its semantics then it's hard to provide an LSP because you have to guess what implementation-defined decisions are made by the user's chosen implementation. If it does (perhaps there are different semantics for each target platform, but as long as there are specified semantics), then the LSP should implement those semantics. $\endgroup$ Commented Apr 3, 2024 at 17:15
  • $\begingroup$ @kaya3 I see... so I give up on this idea forever :-| $\endgroup$ Commented Apr 3, 2024 at 17:24
  • 2
    $\begingroup$ I think it's better to think of a language's semantics as something self-contained, without regard for what language it's compiled to. If the program will do the same thing on all target platforms then there is no problem to solve; if not, then just pretend you really have N slightly different languages and you need to provide LSPs for each. $\endgroup$ Commented Apr 3, 2024 at 17:36

2 Answers

$\begingroup$

A multi-compiler programming language can receive editor support (autocompletion, type hints, diagnostics) by separating its front-end (language semantics) from its back-ends (platform compilers) and connecting the front-end to editors through a language server that implements the Language Server Protocol (LSP). The LSP doesn’t care how many compilers or targets exist; it only needs a consistent parser, type checker, and symbol resolver that understand the language's syntax and semantics.

So, you’d design a shared core that handles parsing, syntax trees, and type inference for your language. Each platform-specific compiler (for Node.js, Unreal, etc.) would then build on this core to generate target-specific code or artifacts. The editor, through your language server, would talk only to that shared front-end.
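As a rough illustration of that split, here is a toy Python sketch. It assumes a made-up language with only `let NAME = INT` declarations; `front_end`, `BACK_ENDS`, and `compile_for` are hypothetical names, and the two back-ends stand in for real platform compilers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Decl:
    name: str
    value: int


def front_end(source: str) -> Tuple[List[Decl], List[str]]:
    """Shared core: parse and check a toy 'let NAME = INT' language."""
    decls: List[Decl] = []
    diagnostics: List[str] = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if (len(parts) == 4 and parts[0] == "let"
                and parts[2] == "=" and parts[3].isdecimal()):
            decls.append(Decl(parts[1], int(parts[3])))
        else:
            diagnostics.append(f"line {lineno}: expected 'let NAME = INT'")
    return decls, diagnostics


# Back-ends only ever see declarations the front-end has already checked.
BACK_ENDS: Dict[str, Callable[[List[Decl]], str]] = {
    "javascript": lambda ds: "\n".join(f"const {d.name} = {d.value};" for d in ds),
    "lua": lambda ds: "\n".join(f"local {d.name} = {d.value}" for d in ds),
}


def compile_for(target: str, source: str) -> str:
    decls, diagnostics = front_end(source)
    if diagnostics:
        raise ValueError("; ".join(diagnostics))
    return BACK_ENDS[target](decls)
```

A language server would call only `front_end` and forward its diagnostics to the editor; it never needs to touch `BACK_ENDS`.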

If platforms expose different APIs, you can load platform-specific type definition files (like TypeScript’s lib.dom.d.ts) or metadata descriptors depending on the project’s manifest. This way, even though your language compiles to multiple platforms, it still provides unified and context-aware editor support.
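A small Python sketch of that idea, under assumptions: the API tables, the signature notation, the Unreal platform URI, and the `completions` function are all invented for illustration, and a real server would read such descriptors from files shipped with each platform.

```python
from typing import Dict, List

# Symbols every project sees, whatever its platform (hypothetical).
CORE_API: Dict[str, str] = {"print": "(value: Any) -> Void"}

# Hypothetical per-platform API descriptors, keyed by the manifest's platform URI.
PLATFORM_APIS: Dict[str, Dict[str, str]] = {
    "http://www.nodejs.org/2009": {"readFile": "(path: String) -> String"},
    "file:///platforms/unreal": {"spawnActor": "(cls: Class) -> Actor"},
}


def completions(platform_uri: str, prefix: str) -> List[str]:
    """Offer core symbols plus those of the project's platform only."""
    symbols = {**CORE_API, **PLATFORM_APIS.get(platform_uri, {})}
    return sorted(
        f"{name}: {signature}"
        for name, signature in symbols.items()
        if name.startswith(prefix)
    )
```

With these tables, a Node.js project is offered `readFile` but not `spawnActor`, while both kinds of project still see `print`.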

$\endgroup$
$\begingroup$

There are several things an IDE needs to know to give rich support to the programmer:

  1. The syntax of the language - keywords, parsing rules, etc
  2. The semantics of the language - for instance, its error handling model, its type system
  3. The standard library - what built-in functionality does the IDE need to know about because it's available to every program
  4. The module system - where should the IDE look for user-defined functionality, within the current project and its dependency configuration

In some languages, each of these can themselves be customised. For instance, operator overloading could be seen as a limited customisation of the syntax. Or there may be macro and meta-programming abilities which allow users to write programs which don't look like the "core" language, based on rules they provide as a loadable module.

The more you allow different modules to influence the language, the more complex the definitions of the syntax and semantics need to be; but how they influence it still needs to be defined somewhere, so that is what the IDE needs to be shown.

If the differences between targets aren't defined within a module system, but are just ad hoc changes to the language, then you don't really have one language at all, you have a family of related languages. The IDE integration will need to be built for each of them; perhaps using a common core to save duplication of implementation. An interesting example is VB.net and C#: the syntax is completely different, but most of the semantics, and the standard library, are shared; an IDE would probably see them as separate languages with lots of overlapping details.

What the IDE doesn't generally need to know is how the program will be compiled. You can compile a PHP program to WebAssembly with the right toolkit, but that doesn't change what PhpStorm needs to show to a user.

$\endgroup$
  • $\begingroup$ And all those things can be supplied by a language server. $\endgroup$ Commented Apr 5, 2024 at 9:06
