The Onyx Programming Language (wasmer.io)
197 points by bkolobara on Dec 1, 2023 | hide | past | favorite | 138 comments


I would really like if the language reference[1] had a "rationale" or "principles" section describing the decisions at a high level. At the moment it dives straight into the minutiae of syntax, built-in types, etc.

The language marketing insists it is "data-oriented", even in the footer of every page. But I couldn't find any more detail on this.

My favourite example of this is Austral, which has a detailed rationale[2] that justifies the choices made and the tradeoffs. It really helped me decide whether I was interested in the language.

Odin, cited in the blog post, has a shallow rationale on its FAQ page[3], but seems to have had more time/maturity to document specific "data-oriented" features like built-in SoA[4].

[1] https://docs.onyxlang.io/book/Overview.html

[2] https://austral-lang.org/spec/spec.html#rationale

[3] https://odin-lang.org/docs/faq/

[4] https://odin-lang.org/docs/overview/#soa-data-types


Thanks for the feedback! I agree my documentation is very lacking in that department. I hope in the next few days to have a page in the documentation that describes my rationale, and the tradeoffs made during development.


Roc also has a great site on how to present a "new" language. Especially the sections that explain what they mean by being "fast", "friendly", and "functional".

https://www.roc-lang.org/


Thanks for these links!

Austral's rationale documentation is indeed really interesting. My preferences go in another direction (notably toward expression-based languages), but it's refreshing to read well-explained design decisions.

The section about linear types[1] is also a gem in this regard. I would even consider it a reference source for introducing the motivations and principles behind the Rust type system and borrow checker. Only the syntax would have to be made a bit more rusty.

[1] https://austral-lang.org/spec/spec.html#rationale-linear-typ...


> For many people, semicolons represent the distinction between an old and crusty language and a modern one, in which case the semicolon serves a function similar to the use of Comic Sans by the OpenBSD project.

That's a delightful sentence, and in the context of a design that makes opinionated choices based on its goals, entirely reasonable.

(I don't know if I'll agree with all of those choices after I've thought about them sufficiently to be comfortable having an opinion, but I do very much appreciate that they're explicit, deliberate, and explained)


Same with the section on capability security!


To go along with that I really need a "Why should I use this" type of thing.

Similar to what you are suggesting.


It seemed inevitable that a new language would choose WASM as a runtime.

We get a bytecode runtime to rival the JVM and CLR, but with a diversity of stakeholders (no single vendor trying to push a language or enterprise offering), and it's JITed even on Apple devices and on the damn web! Pretty amazing.

What makes me excited about this: Getting to poke around, at runtime, my compiled .NET C# code using PowerShell (in Snoop) was pretty eye-opening to me. The "standard" interoperability between PowerShell (Core?) and .NET objects as well, of course.

It feels great to have a common runtime making calls out to other languages so seamless. Yet it seems to me like it never reached the popularity it deserved; maybe it will get another try, with a lot more enthusiasm (no more Oracle/MS foot on the brake).


Except the detail that it is mostly driven by Google, given Chrome's market share.


Wrong, Google drove NaCl for a long time (portable LLVM bytecode), but then Mozilla countered with asm.js, which kickstarted and heavily influenced the development of WebAssembly.


Wrong, because you are only focusing on what led to WebAssembly 1.0, almost a decade ago, not what happened after the Web turned into ChromeOS.


> A data-oriented, expressive, and modern programming language

What problems does this uniquely solve that other languages & environments don't?


Unfortunately they bury the meatier sales pitch in the documentation instead of putting it front and center on the home page: https://docs.onyxlang.io/book/philosophy/onyx.html and perhaps also https://docs.onyxlang.io/book/philosophy/wasm.html and https://docs.onyxlang.io/book/philosophy/memory.html

Basically, it's really all about certain WASM use cases, and everything right down to its memory management strategy is optimized for those cases.


It looks to be a nicer C ultimately.

It looks like it fits pretty squarely in the "if you want to do C level programming but you'd also like nice lambdas, sum types, and type inference"


Yeah but we have a lot of languages that do that now. Some of them are even mentioned on Onyx's homepage.


And only need it to compile to webassembly.


Oh, I missed that part. That's quite the design choice.


Exactly. Three things that should be on the front page of any new language: 1) examples of syntax (check), 2) quick-start instructions (check), 3) a short description of motivation/rationale, including what other languages were considered and why they fall short.


Notably, this is the description on GitHub[0]:

>A modern language for WebAssembly.

[0] https://github.com/onyx-lang/onyx


Great question. It looks like a cool language, but why should I use it?


Looks neat. Some obvious basic questions that I imagine most people would wonder:

1. Is it value based (C++, Rust, Go, TCL) or reference based (basically everything else).

2. How is memory freed? GC?

3. If GC how do you deal with the fact that Wasm GC is still in progress?

4. What about concurrency? Is it single threaded?

5. How does error handling work?

6. Does it support modern type features - Option<>, sum types, etc.

7. It's imperative, does that mean things that should be expressions (e.g. if/else) are statements?


Those are all great questions. I will add to the docs to explain more details later, but for now:

1. It is value based, like C. There is obviously passing by pointer, but you do that explicitly.

2/3. Memory is manually managed. To make this easier, like Zig and Odin, everything that allocates should take an allocator as a parameter. This allocator can be the general purpose heap allocator, or it can be an arena, ring buffer, or even tracked allocations that can be freed all at once. More details on that to follow in the documentation.

4. It does have multi-threading support, thanks to the atomics WASM proposal. The standard library currently has support for basic multi-threaded primitives like mutexes, semaphores, and condition variables. Note, there is currently not async/await or any kind of concurrent execution.

5/6. It does have modern type features like sum types. It has Optional(T) and Result(Ok, Err) out of the box. That is the preferred method of error handling.

7. It is mostly statement-oriented, but there are features like Elixir's pipe operator (|>) and if/else expressions (value if condition else otherwise).


About 2/3, does the compiler / language somehow help me remember to free memory I've allocated, or avoid double frees? Are there smart pointers or sth like that? (I've been coding lots of C/C++ :-))


8. What does blazingly-fast build times mean quantitatively? How does it scale with project size?


The largest project I have in Onyx is currently around 30000 lines (not all that large I know). That project on my Intel i7-1165G7 laptop compiles in about 80ms.

There are currently no large Onyx projects that can really test how this number scales, but I would guess the growth is not linear. So as an off-the-cuff estimate, I would say a million-line project could compile in about 4-5 seconds.

Also worth noting that the entire compiler is currently single-threaded. So that number could get much better.


That's impressively fast. Well done!


Clearly Wasmer wants to do WebAssembly. But for a whole new, general-purpose programming language, what's the case for that target, versus, say, LLVM intermediate representation?


Onyx is entirely independent from Wasmer. It is a separate project developed by me, over the past 3 years. It does however use Wasmer for running on Windows, MacOS, and optionally on Linux.

I started making Onyx because I simply wanted to create my own programming language for the fun of it. I chose WebAssembly as the target for my compiler because it was relatively easy to write my own full-stack solution, instead of relying on something like LLVM. The language evolved over the years until I had something I felt was worth sharing. When the Wasmer team reached out to see if I wanted to do a collaboration blog post, I said yes.

The main reason I didn't go with LLVM was simply that I wanted to write everything myself (for fun) and to have a super-fast compiler. LLVM is not the fastest backend, but that is understandable given all the optimizations it does.


Very cool project! If I wanted to try it out in a repl based environment (maybe like a jupyter notebook?) where/how would you recommend I get started?


Onyx does not really support a REPL environment at the moment because of the way it is compiled. Its execution cannot be paused in the same way as Python or JS. If you want to give it a try without committing to installing anything, I would recommend the Onyx Playground at https://try.onyxlang.io


Could you please explain a bit about what you mean by how it's compiled? My immediate thought when I saw this was dreaming of a tiny modern lisp repl that I could store in the cloud and run on any computer.

Even pointing to somewhere in the code that I could start poking around would be great.


The issue right now is how all WebAssembly modules in general are executed. WASM modules currently must contain everything needed to run them when they are built. There is no way to dynamically define a new function or variable after the module has been built, which would be required for an Onyx REPL to work.

You could definitely use Onyx to create a modern lisp interpreter that can run in the browser without any dependencies, but it might be worth porting an existing C project if that is your goal. I'm not gonna stop you from making something like that in Onyx though; it would be very cool!

There is, however, a possibility for an Onyx REPL to exist outside of the browser. This is because Onyx has another runtime it can use alongside WASM, called OVMwasm. This runtime is entirely custom (you can see the source code in the 'interpreter' folder on the Onyx GitHub), so it could be modified to do non-standard things with WASM, like adding new functions dynamically. That would take some work to achieve, but in theory it should be possible.


Well, it's Friday and I have some time to kill. Before I start: creating a language is hard, and I am just making random comments on the internet.

From the "why webassembly" section :

> When making a programming language, you need to decide how your programs will actually be executed.

In the abstract, maybe, and obviously some language features might make an interpreter or a compiler harder or easier to write. But in general, and in practice, the language semantics and the "execution mode" can be treated independently. Some of the languages mentioned even have multiple styles of execution (AOT, JIT, interpretation, compilation, etc.).

> Onyx does not rely on libraries like LLVM to generate its code, because WASM is so simple to target.

In terms of correctness, maybe. It will be interesting to see what happens as performance and more complex code patterns need to be supported. Most languages evolve some type of high-level representation between the AST and their target of choice, for language-specific transformations and error reporting (SIL, ClangIR, and HIR? Not sure about that last one).

>My strategy was to wrap libwasmer.a, the standalone library version of Wasmer, into my own custom WASM loader, to allow imported functions to be linked against native libraries

Doesn't this both limit portability and introduce potential security risks (and thus negate the whole point of WASM)?

I think the interaction with the outside world is what WASI and the others are trying to address.


The most interesting part of this language to me is that it's streamlined for WASM compilation, and I think that should be put up front more than it is


I'm in no way against a new language! Love programming languages, particularly ones that push boundaries and challenge assumptions!

(Great work on that front btw Onyx Team)

But! When I see "powered by WebAssembly", I can't see a good reason for choosing a unique language when any language in theory can target WASM/WASI.

In other words Javascript/Python is looking at your lunch and is very hungry.


When I see "powered by WebAssembly", I take it as a sign that the language is going to be designed around WebAssembly's VM interface rather than the implicit standard of the C memory model and POSIX-like API which basically all popular programming languages are designed for. Even languages like Python have the concepts of stdin/stdout/stderr, a filesystem, threads, and a single globally accessible heap baked into their design.

It would take a lot of work to amend Python's design to be able to easily use features that are somewhat unique to WebAssembly, like the various things a .wasm file can import or export. As it stands, languages like Python can be run on WASM, but they use inelegant hacks to replicate the execution-environment features that their interpreters require, such as malloc, and usually it is assumed that the target will be WASI.

A language designed for WASM from the ground up would let WASM be used to its fullest potential: a lightweight execution environment that makes extremely minimal assumptions about what interfaces are going to be provided to the program.


That is exactly the way that I see it too.

While Onyx does have core library support for doing standard POSIX things like stdin/stdout/stderr, file operations, and networking, Onyx can be used in an entirely different environment without any of these things. In fact, the standard libraries are (almost) entirely separate from the compiler (i.e. the compiler makes no assumptions about what the standard library is), so if the standard library does not suit your use case for one reason or another, you can write your own. It will be a bit of work, but there would be nothing in your way.


> Onyx can be used in an entirely different environment without any of these things. In fact, the standard libraries are (almost) entirely separate from the compiler (i.e. the compiler makes no assumptions about what the standard library is), so if the standard library does not suit your use case for one reason or another, you can write your own. It will be a bit of work, but there would be nothing in your way.

Isn't that true for most native languages as well? C++ has multiple stdlibs. Rust, C, and C++ can be used on bare metal.

The orthogonality of the standard library and the compiler/language semantics (maybe with the exception of the types) seems pretty common these days.


C/C++ compilers are tightly coupled to their runtimes, including their standard libraries. GCC is particularly bad about this.

On MacOS there is only one C standard library, and it is the interface to the operating system.

On Windows, your choice of standard library is tightly coupled to your tool chain.

On Linux, you can kind of mix and match, but it also impacts your loader.


Your comment seems a bit self-contradictory to me. You mention a strong coupling between runtimes and compilers, but then go on to give examples of the same compilers used with different runtimes.

I suspect that the terminology we are using might be the culprit. Thinking of it in terms of compilers, distributions/packages, and targets might be helpful. Both LLVM and GCC are quite flexible in how they can be packaged and distributed to target various platforms. A compiler distribution typically includes a (default) runtime, default options, standard libraries, and supporting tools (linker, debugger, etc.) for each supported target in the case of cross-compilation.

A distribution is definitely strongly coupled to a host OS/configuration, but the compiler proper is not really.


That is a good point. I don't do a lot of embedded bare metal work, so I kind of forgot that is possible. I guess it's something that is not unique to Onyx, but a good thing to know nonetheless.


As a related fun example that I use at work all the time, take a look at newlib sometime!

https://sourceware.org/newlib/

And related to that again is https://keithp.com/picolibc/


I don't think I fully agree. Yes, it might be possible to design a programming language that has better synergy with WASM than the ones we currently have, but IMO this won't have a particularly big impact on developer productivity or performance. WASM was designed to be the target of current languages, so in a sense it does capture a lot of things that already exist or can be mapped easily to existing languages. Designing against WASM might create a case of overfitting. But only the future will tell.


Here is a specific example, again using Python, of how a programming language not designed for WASM might not be ideal: suppose you are writing a Python script which is to be compiled to WASM and needs to import a function. How do you specify that in the script? There could be some Python function like `wasm_import("my_function", "u32", ["u32"])`, which allows the programmer to specify a function to be imported... and the problem with that is that in Python everything normally happens at runtime, but the compiler which is turning your script into a .wasm file needs to know in advance what functions you are going to import. So you face a dilemma: either you mess with the fundamental assumption of Python that everything happens at runtime, or you add new syntax which allows specifying metadata, or you use magic comments, or do some other hacky thing.

This might seem strange to those who are used to today's dominant WASM paradigm, where the programmer usually controls both the VM and the .wasm program — such as website developers who create a .wasm file and then also write the Javascript to load it into the browser. But one of the most exciting things about WASM, to me, is the potential for domain-specific execution environments in which the code which loads and executes the WASM is written much earlier and by a different entity than whoever writes the WASM.


I think the relatively less-hacky approach would be to have the python code use the 'import' syntax to get those functions, and then how you intercept those and redirect them to DTRT is a decision with multiple possible approaches depending on your preferred set of trade-offs.

In Perl I'd probably do something like having a .pm file that does

    package MyBindings;
    
    use v5.38;
    use Exporter 'import';
    
    our @EXPORT_OK = qw(function names to export go here);
    
    use Wasm::Bindings;
    
    my_function u32 [u32]
    other_function ...
    ...
where the stuff after 'use Wasm::Bindings' is some sort of import-defining DSL (I invented one to pseudocode in, there's probably a 'real' one already you'd be better using in practice) and the 'use Wasm::Bindings;' statement switches in a different parser for the rest of the file when you load it normally. Then you could have external tools that simply know 'ignore everything up to that use line, those lines are Perl code, everything after it is yours though and should be handled however required.'

You can register custom loaders in https://bun.sh/ that I think would let me provide similar functionality there (maybe via a .wasmbind file extension or similar) and vague memory says python has ways and means to do such things but I've not looked in some years.

(I suspect you're still right that a language designed for WASM will end up being more ergonomic in that regard even so, and also that over time we'll likely find other ways it's more ergonomic that I haven't currently thought of, but I still don't think the situation for other languages is -quite- as bad as you suggest)


The problem you are describing isn't really related to WASM or any particular target system or programming language. And targeting wasm wouldn't really address it. It looks to me like a packaging issue.

If I follow you correctly, you are simply saying that in Python it's possible to refer to some code in a way that the compiler can't anticipate, and thus the compiler can't make sure the relevant code is present at runtime.

First, this is a problem in pretty much every language. Even in very static languages like C++, one can dlopen one's way into this exact situation. The JVM/.NET also have missing-class situations when someone screws up the packaging.

This is solved by having a sensible code-loading API. Java classloaders or the .NET equivalent manage that.


Serious question, in what use cases do you see JavaScript especially (but also Python) competing in the "on top of WASM" space?

Any time I see a language that compiles directly to WASM my instinct is that it will compete with Zig, Rust or C/C++/emscripten.


In the "on top of WASM" case, I imagine that the advantage of Python and JS will be exactly the same as the "not on top of WASM case". Namely, mainly development convenience at the cost of performance.


So it's scenarios where the deployment target has to be WASM, and these languages are chosen to be used on top of it. Like some of these new FaaS clouds which are targeting WASM-only deployments, or e.g. Fastly edge compute?

That makes sense, the mention of JS got me thinking about browsers which is where I spend most of my time.


Add syntax to the list - it’s horrible. The target audience is more used to the basic syntax of javascript.


I quite like its syntax. I also don’t think the target audience is beginners, this seems like a passion project someone is making for themselves and people who have a similar taste to them.


> passion project someone is making for themselves and people who have a similar taste to them.

From that angle I quite like it. Personally, I also like the syntax.


I've been coding JavaScript (TypeScript) for many years, and I like Onyx's syntax too.


How optimized is the generated code? One of the benefits of using LLVM is the amazing set of optimization passes, after all. Will Onyx have to reimplement a lot of these optimizations?


The compiler does not have many optimizations implemented. Currently, it just uses constant folding and generally reduces unneeded operations where possible. There are plans to do function inlining and better tree-shaking, but those are a ways out.

The "nice" thing about WebAssembly is that the job of generating optimized machine instructions is up to the WASM embedder, which Wasmer is doing very well right now. Onyx does have a second runtime, currently only implemented for x86_64 Linux, that has some more optimizations, but it does not compile to native code; just to an interpreted bytecode.


Can Onyx's compiler be compiled to Wasm to itself run in the browser?

It's niche, but I've thought it'd be cool if there were a mini Wasm-based game building tool in the browser, but that requires a Wasm language with a compiler that can run in the browser.


Currently, it cannot. But it is written in C with a reasonable separation between the OS things and the compiler things, so compiling it with Emscripten should be possible without too much modification.

This is similar to what the Onyx Playground currently does. To do that I just POST your project files to a server, save everything to disk, run the compiler, delete everything, and send the resulting WASM file back to the browser, to be run in a WebWorker.


> To navigate these limitations I came up with a way for any WebAssembly embedder, like Wasmer, to interact with native system components. ... I added a custom section to the WASM binaries output by Onyx that specifies which native libraries it wants to link against.

So the main reasons for using WASM, and all of its advantages, are lost. It's neither sandboxed nor self-contained any more, but it is slower than "native" compilation and still missing libraries (compared to the JVM, CLR, BEAM, ...). Don't get me wrong, I know about the advantages of WASM for the language implementor (although I'd use WASM GC), but as a user I don't know why I should use Onyx.


Nice name, analogous to Ruby.

From wikipedia,

Onyx - A banded variety of chalcedony, a cryptocrystalline form of quartz.

Ruby - A clear, deep, red variety of corundum, valued as a precious stone.


Is Crystal also analogous to ruby?


Ahh yes. Totally forgot about crystal.

Crystal - A solid whose atoms (or molecules) are arranged in a repeating pattern. Crystals are found naturally or can be made artificially.

Someone should use Diamond for the next programming language; I really like the wiki definition.

Diamond - A carbon crystal formed under heat and pressure.


> Fast Compilation: Onyx's compiler is written entirely in C and features incredibly fast compilation. The web-server for this website was compiled in 47 milliseconds.

Ok, that got my attention. It's refreshing to see a new language that doesn't take 10 seconds to build a Hello World example.

Couple of questions;

1. The Windows installation is a bit tedious. Any plan to support Scoop for installing Onyx?

2. Given that gamedev seems to be a focus, with the raylib integration, how is the performance compared to something like C? Will a comparable Onyx program get me at least 50% of the performance of C? I haven't used WebAssembly much, so I am curious how this works out.

I was just thinking of creating a new project to try out Raylib 5. So your post has perfect timing for me!

Best of luck to your language. Looks lovely in the online playground. Now going to set it up on my device


I've been trying to build a programming language targeting only wasm but never seem to get the time to work on it.

I like the syntax, mainly the pipe operator |>

A few questions,

1. Any plans to use binaryen to optimize the wasm output?

2. How is memory management handled? Are you going to use wasm-gc?

3. How does the C FFI work? Do you convert the WASM types to C types?


> Onyx is strictly type-checked. However, the type-inference systems in Onyx usually allow you to omit types.

>

> x := 10

I...

Why is this a feature in every new language? Can't we have a language that is more verbose and explicit, not less? I'd love it if named parameters were mandatory, not optional.

(Named parameters are when you name the arguments you pass to functions, like you can do in Python and Groovy: `foobar(arg1: 123, arg2: 'hello')`)

Most of the problems I run into are due to implicit behaviour that no one bothers to explain. In Onyx here, having type inference means I now have to remember that x := 10 means x will be a signed 32-bit integer, instead of, you know, the code that I'm writing remembering it for me.

And I'm just guessing here. Maybe x is a double. Or unsigned since its initial value isn't negative. Or maybe it was signed 32-bit integer but the Onyx developers changed it to 128-bit long long for version 666.0.0. The point is I have to look this stuff up or remember it instead of, you know, it being right there.

I don't even know what you gain by doing this. Less code is less messy but also hides a lot of information from you. Hiding information should be something an IDE does, not the language itself.

Thank you for reading my rant.


Well on the other end of the extreme we had Java, which had an awful lot of this:

    HashMap<String, List<Integer>> hashmap = new HashMap<String, List<Integer>>();
Latest Java lets you omit the left type annotation (I hear, haven't tried it):

    var hashmap = new HashMap<String, List<Integer>>();
And in many languages the type parameters on the right can also be inferred --- so long as later code determines them --- so you get something like:

    var hashmap = new HashMap();
Hopefully we can all agree that the first option is needlessly verbose. There's more contention between the last two, I think.

My preference would be to do some type inference, but maintain the property that you can tell the type of every expression without looking outside of it (except perhaps for an immediate enclosing function call). This requires, for example:

- The third option isn't allowed, you need to write the second option instead.

- You must annotate function argument types.

- In Rust, you couldn't write `.collect()`, you'd instead write `.collect::<Vec<_>>()`.

- The `x := 10` example is actually somewhat ambiguous. If the language fixes the type of `10` then `x := 10` is legal. If it's an unspecified type (as is typically the case), you'd have to write the type down.


> The `x := 10` example is actually somewhat ambiguous. If the language fixes the type of `10` then `x := 10` is legal. If it's an unspecified type (as is typically the case), you'd have to write the type down.

For that case I like a type signifier as part of the number literal expression, like this: `x := 10f32` or `x := 10i32`.


I have often used a Java utility class of static inferring generic constructors:

    public static <E> ArrayList<E> newArrayList() { return new ArrayList<E>(); }
    public static <K,T> HashMap<K,T> newHashMap() { return new HashMap<K,T>(); }
So code can look simpler, but just as clear:

    import static GenericConstructors.*;

    ...

    ArrayList<String> names = newArrayList();
    HashMap<String, List<Integer>> hashmap = newHashMap();
The "x := 10" example is one reason it feels safer to me to declare the variable type, and infer the constructor types, than the other way around.

--

What would resolve this whole issue is standardized editor/IDE visualization support for showing all inferred types, just one toggle button/key away.

Inferred types simplify writing code. But when reading code, why should we have to mentally emulate the language's inference algorithm? It is, by definition, supposed to be automating that for us.


Inferring diamond types has been built into the language for _decades_.

Your static utility functions save only 2 characters, but will add massive confusion for other developers.

  List<String> names = new ArrayList<>();
or even better just use var:

  var names = new ArrayList<String>();

Those static functions are neither simpler nor cleaner. Don’t do this.


> Well on the other end of the extreme we had Java, which had an awful lot of this: HashMap<String, List<Integer>> hashmap = new HashMap<String, List<Integer>>();

This is untrue. Inferring diamond types has been built into the language for over a decade.

    Map<String, List<Integer>> map = new HashMap<>();


I don't think you're actually disagreeing, since justinpombrio said Java had a lot of type parameter verbosity, which implies it no longer does.


Would you be willing to use a language that looked verbose in the text files, but you had an IDE that hid the verbose parts from you?

Take your Java example. If the code in the .java file looked like

    HashMap<String, List<Integer>> hashmap = new HashMap<String, List<Integer>>();
but IntelliJ showed it to you like

    hashmap = new HashMap();


I see what you mean, but do appreciate the quality of life improvement of such things. In your example, that looks a lot like a native int. It might be something different, but 99.9% of the time it’s going to be an int. I think it’s reasonable to say that unless told otherwise, a thing that looks like an int is an int. Same with `x:=3.14`. That could be a long double, but it’s vastly more likely to be a plain old float. Why not make that the default?

An extra advantage I see is that non-default types stand out. At a glance at the code, `x:=10` is the most common int type. It’s plain, unadorned. It’s the usual, boring thing. Things that are less common stand out: oh, this is unsigned for some reason. Guess I should see why. I like the pattern where unusual things are easier to visually scan for.

But at the end of the day, darn it, compiler, you know very well that I mean 3.14 to be a float. Stop making me say so each time!
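For what it's worth, Rust picks a default along exactly these lines: an unsuffixed floating-point literal is f64 unless context pins it to something else. A minimal sketch:

```rust
fn main() {
    let x = 3.14; // no annotation: defaults to f64
    let y: f32 = 3.14; // same literal, inferred as f32 from the annotation

    // The default is observable in the value's size.
    assert_eq!(std::mem::size_of_val(&x), 8);
    assert_eq!(std::mem::size_of_val(&y), 4);
    println!("{x} {y}");
}
```

The unusual choice (f32) is the one that needs an annotation, so it stands out, much as described above.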


I can't remember which language, but there is one where every number is a double and there is no int type.

Is Onyx like that? I don't know. It would take me maybe two minutes to look up. The point is I have to look it up to be certain when that certainty could be part of the language.


The other side of "why do I have to remember this when I could write it" is "why do I have to write it when the compiler already knows it?"

I love complete inferred type systems, using one has completely changed how I think about types and what I expect from a programming language. I feel ill every time I have to use go or typescript. Like I am teaching the compiler how to compile. This is not my job, it's a waste of time and energy, it should already know. Things like rescript and ocaml are where my attention is going for typed languages in the future.


Same reason we don't have to type `(((2int + 2int)int) / 8int)int`. Type inference already exists in any ALGOL-like. Modern languages just tend to remove the arbitrary restriction that variables are the places where types must be added.


In C, there's no type inference. Integer literals have the type "int" by default, and they need a suffix to be unsigned or long. They get truncated and promoted implicitly for convenience's sake.

  /* 'a' : int
   * 'a' is truncated to char (self-evidently safe here)
   */
  char c = 'a';
  /* 1 : int
   * c : char
   * c is promoted to int
   * 1 + c : int
   * 1 + c is truncated to char
   */
  char d = c + 1;
  /* d : char
   * c : char
   * d and c are promoted to int
   * d - c : int
   * d - c is truncated to char
   */
  char e = d - c;
If you want to avoid this behavior, you have to use the type suffixes explicitly, pretty much like that "(((2int + 2int)int) / 8int)int" expression you're making fun of.

  // Formally undefined in C: shifting a 32-bit int by its full width
  // (on x86 the shift count is masked, so this often yields 1, not 0).
  1 << 32;
  // 2^32. "U" makes it unsigned.
  // "LL" makes it "long long", 64 bits on Windows and Linux.
  1ULL << 32;
  // -2147483648 (signed 32-bit)
  // i.e. 0x80000000
  1 << 31;
  // 2147483648 (unsigned 32-bit)
  // i.e. 0x80000000
  1U << 31;
I don't think this is a good thing. It's very confusing.

I prefer the type inference approach, e.g. in Rust, where they're i32 if the type cannot be inferred and the literal has no type suffix. And I like that no two integers can be used by the same binary operator unless they have the same type, so you need to explicitly cast them to the right type.
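A short Rust sketch of both points: the i32 default for otherwise-unconstrained literals, and the explicit cast required before mixing integer types:

```rust
fn main() {
    let a = 1; // nothing else constrains it, so `a` defaults to i32
    let b = 1u64 << 32; // suffix required: a 32-bit 1 can't be shifted by 32

    // `a + b` alone would be a type error (i32 + u64);
    // the cast has to be spelled out.
    let sum = a as u64 + b;
    assert_eq!(sum, 4_294_967_297);
}
```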


The integer literals might be misleading from the real point here, which is that every node in an expression tree in C has a type. The compiler infers most of these types. It infers that (1 / 2) is an int and that (1 / 2.0) is a double. That is type inference, and it's exactly the same as the sort of type inference that figures out what "auto" means in C++.


> that (1 / 2.0) is a double.

That is not type inference. 1 has the type int. 2.0 has the type double. Then, 1 gets converted to double. Then the whole expression has the type double. This isn't type inference; this is like saying JavaScript has type inference because it deduces that the expression ('4' - true) has the "number" type (i.e. double precision floating point).

Compare with Haskell, where a numeric literal like 32 has the type (Num a) => a, i.e., it's polymorphic, and the type is actually inferred based on the context it's used in (it could be Int, Integer, Double, Rational, whatever). If you ask it the type at a REPL, it just tells you "32 :: Num a => a", whereas C would tell you that 32 has the type int (if there were a REPL for C).


> 1 has the type int. 2.0 has the type double. Then, 1 gets converted to double. Then the whole expression has the type double. This isn't type inference.

This is just a difference in terminology. To me, what you’re describing is exactly how a type inference algorithm works. This is also the traditional academic definition of type inference, but it sounds like you’re just using it to mean “inference of the types of variables” (which makes sense as the programmer-facing definition, to be fair, because basically all languages have type inference in other places)

> This is like saying JavaScript has type inference

JS doesn’t have type inference because it’s not statically typed, but Typescript does, and it works exactly how you described it.

The difference is, with compiled languages, the compiler needs to know the type of every expression and subexpression ahead of time, to know what code to emit.


I guess you're right. I think of type inference as "there is no pre-set type for a literal; the type is inferred based on the context it's used in." But that isn't the definition of type inference. The definition is just that its type can be deduced at compile-time without explicit annotation. But that means that, as long as you don't have to do explicit casting for every expression (e.g. "(5i32 + 6i32) as i32"), all expressions are type-inferred no matter what programming language you're in.

Like, in Rust, the type of a literal depends on the context in which the variable it initializes is later used.

  // Due to usage below first line,
  // 5 is retroactively reanalyzed as if it were 5u8
  let a = 5;
  let b: u8 = a + 1;
Whereas in C, literals always have a set type, and the only reason you can use, e.g., int literals in non-integer expressions is due to the great amount of implicit type conversions in C.

C++, which has a form of type inference, works differently: the type of a variable is always the same as the type of expression initializing it. The closest equivalent in C++:

  // "a" is inferred to be an "int" because 5 is always of type "int" 
  auto a = 5;
  uint8_t b = a + 1; // Implicit truncation from "int" to "uint8_t"
The true equivalent in Rust of the integer literal 5 from C (on 64-bit Linux) would be 5i32, because it's always the same sign, type, and size in every expression. There's never any doubt about the type of an expression or a literal, and hence no need for a type inference algorithm, only implicit conversions. Hence, the equivalent in Rust of the above C++ is this (depending on the platform):

  let a = 5i32;
  let b: u8 = (a + 1i32) as u8;


Well, technically

    int(int(int(2) +::<int> int(2)) /::<int> int(8))
We must disambiguate integer and floating-point arithmetic operators, for sure.


> Same reason we don't have to type `(((2int + 2int)int) / 8int)int`. Type inference already exists in any ALGOL-like.

That's not why you don't have to do that.

Many (most statically typed?) ALGOL-likes have strict types for literals (e.g., 2 is always a signed int; if you want specifically unsigned you might say 2u, if you want a double you say 2.0, and if you want a single-precision float you say 2.0f, or something) and strict rules for how math between them works and what types it produces. This has been true since long before type inference became common, and it is why you don't have to say 2int+2int — 2 is syntactically defined as int.

There is no inference.


It is still type inference though. If 2 is an int, the compiler performs type inference to figure out that the expression 2 + 2 is also an int. It is just that traditional languages only use type inference for expressions, not declarations.


It's type propagation. Which can be seen as a form of inference or not, depending just where you draw the line.


So are auto, decltype, and templates. In C++ we properly call it type deduction to distinguish it from actual H-M-style inference, which C++ lacks. What it is called doesn't detract from the parent's argument.


Yes. I hadn't meant my comment as argument on either side, just added context.


What's the type of the binary + operator in C?


Reasonably, x in this case would be generic/polymorphic (just as 10 is a generic/polymorphic constant) with its type bounded by the equivalent of Num trait. Haskell (and to some extent Go) gets it right.

Also reasonably, each source file should start with declaring the version of a language it was written in, so that e.g. Onyx 666.0 changing its default integer width wouldn't affect the code in the file with "///OnyxVer=665.0" at the top of it. Now that is a feature I would like every new language to have.
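Rust's editions are an existing version of this idea, albeit per-crate rather than per-file: each crate pins the language revision it was written against in its manifest, and newer compilers keep honoring it. A minimal Cargo.toml sketch (the package name here is a placeholder):

```toml
[package]
name = "example"
version = "0.1.0"
# This crate keeps compiling under 2018-edition rules even on
# compilers that also support later editions.
edition = "2018"
```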


I think it’s fine if and only if there are 1st party tools for editors to provide hints about the type the language will infer. If I can mouse over and see the type it thinks the var is, and see static analysis errors if I try to treat it as the wrong type without casting first, then it’s fine.

That is, as long as the core tools for the language provide that info somewhere that’s not more than barely more hidden than explicitly typing it out, I think that’s Ok.


> Why is this a feature in every new language?

Because superfluous type checker incantations are, aside from breaking up flow of thought when writing code, annoying visual noise when reading it.

> I'd love it if named parameters were mandatory, not optional.

Named parameters are a different thing than type incantations, and I agree that it would be good for any non-operator function that takes more than 1 argument to require named parameters.


>Because superfluous type checker incantations are, aside from breaking up flow of thought when writing code, annoying visual noise when reading it.

I disagree. Implicit typing for the most part makes things harder to read. The less I have to guess about what type something is, the better. I never use stuff like "var" or "auto", unless forced to, for this very reason.


I don't see the value of, say, dot_product(vector1=x, vector2=y) compared to dot_product(x, y).


> I don't see the value of, say, dot_product(vector1=x, vector2=y) compared to dot_product(x, y).

I would extend the exception for operators to functions which implement (especially unary/binary) mathematical operators in languages where you cannot make them actual operators, as long as the operands either are interchangeable or have a clear conventional order.


My personal rule of thumb is 3-ish. More than that, and I switch from positional to named args. I appreciate how Python can let you insist on named-only args for when you want to remove the temptation to use a long list of positional args.


2 parameters are usually fine, but then I have seen quite a few bugs with atan2(y,x).
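atan2 is a good example precisely because both parameters have the same type, so nothing catches a swap. In Rust (whose f64::atan2 uses the same y-first convention), the mistake is invisible for points on the diagonal and silently wrong everywhere else:

```rust
use std::f64::consts::FRAC_PI_2;

fn main() {
    // Point on the positive x axis: the correct angle is 0.
    let (x, y) = (1.0_f64, 0.0_f64);
    let correct = y.atan2(x); // atan2(y, x): angle of the point (x, y)
    let swapped = x.atan2(y); // arguments reversed: silently returns pi/2

    assert_eq!(correct, 0.0);
    assert!((swapped - FRAC_PI_2).abs() < 1e-12);
}
```

With named (or at least distinctly typed) arguments, the swapped call simply wouldn't compile.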


Programming is not unlike writing in a human language. Sometimes you use a proper noun, sometimes a pronoun, sometimes you omit the subject or object completely, even in rigorous technical writing. There is no hard rule; learning to communicate unambiguously and efficiently is an art.


But that would only help you if you assign the numeric literal directly to a variable. If you use it in an expression like `foo(10)` or `10 * bar` you would still not see the numeric type specified.

So if you want the type to be always explicit, the type specifier should be coupled with the literal like `10[i32]` or similar, so you can write `foo(10[i32])`.
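Rust's literal suffixes are essentially this `10[i32]` idea with a different spelling: the type travels with the literal itself, so it stays visible inside arbitrary expressions. A small sketch, where `foo` is a hypothetical stand-in function, not any real API:

```rust
// `foo` is a hypothetical function used only for illustration.
fn foo(n: i32) -> i32 {
    n * 2
}

fn main() {
    let a = foo(10i32); // the literal's type is visible at the call site
    let b = 10u8 * 3; // u8 arithmetic, clear at a glance
    assert_eq!(a, 20);
    assert_eq!(b, 30);
}
```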


I might not be against that...

Assigning variables or calling methods (when you discard the return value) would look the same as is typical in many languages:

    int x = foo(); // foo() returns int but we can see that in what type x is
    foo(); // we don't care what foo() returns
Named arguments in method calls would have to include the type, so as to not hide what type a method-used-as-argument returns:

    CoolObject cool_obj = bar(int arg1: foo())


Imagine if you had to write English sentences annotating every noun and verb: "(Bob: Person) (went: Verb) to the (beach: Place)." That's what mandatory type annotation feels like to me.


> Can't we have a language that is more verbose and explicit, not less?

What's wrong with the old verbose languages?


No no, I'm with you on this. I hate that Go allows this as well.

If you have a type system, then force the user to specify exactly what type they want something to be. It's completely reasonable in my mind to write var x float32 = 10, the compiler can then deal with adding the .0 if it wants.


Eh, Rust has pretty solid type inference (sounds like you'd hate it), but with the rust-analyzer extension, you can get inlay hints that tell you the inferred type of every variable, so you're never actually left wondering. Seems like the best of both worlds.


I'd prefer the other way around. If the language was verbose but its tools (IDEs, analyzer extension) hid redundant info from you.


Anyone have any experience with Onyx that would care to comment?


> When linking, I find the libraries on disk, load them, call a function in each dynamic library that returns a list of procedures that can be used when linking. This external linking enables Onyx to use any native C library, from Raylib and OpenGL, to PostgreSQL and OpenSSL.

This is really cool. I wonder if there is any overhead to using native libraries from wasm like this. My intuition says it works similarly to node-ffi and JNI, but I hope it's not as bad.


I haven't directly measured any throughput or latency numbers. I know it is relatively fast, but I know it is not as seamless as I would like.

Currently, for WASM to call any imported function, its arguments have to be formatted in a particular way specified by the wasm-c-api. Then it can invoke the imported function. Only after that can the arguments be translated back into the native calling convention in order to call the real native function. I don't know how that compares to node-ffi and JNI, but I imagine it's not as slow.

I have a piece of documentation I need to write that will explain how calling native libraries works from Onyx, so that should be able to go into more detail.


I am getting network errors loading some paths on the onyxlang.io website. That's not a good advertisement for Wasmer.


Oh no! Do you mind providing the paths where the errors occurs or the requests ids returned by Wasmer, so we can investigate further?

Thanks!


I think the requests that were failing were pretty random, actually, but I didn't get any response from the following. After a few minutes the request would time out with an NS_ERROR_NET_HTTP2_SENT_GOAWAY error shown in Firefox.

https://onyxlang.io/

https://onyxlang.io/playground

The second one is a 404 but I wasn't even seeing that earlier (this is linked from the first page of the docs).

Currently, I don't have any issues but the problems I was having lasted at least a few hours which is why I mentioned it.


Thanks a lot for the info, I'll relay to the team to investigate further


Does it support hotreload?

To me that's the piece that's missing in all these new languages

Being able to hot patch code saves you from having to recompile everything and reload the state of your program, which saves a lot of time.

I was excited when Zig announced planning it, but it's been like 4 years now and they went radio silent, unfortunately.


That is a planned feature, because I really want that too! I believe it should be possible and not take years to develop, especially because Onyx has its own WASM runtime, so I can completely control how and when certain things happen, like pausing for a hot reload. This runtime is currently Linux-only, but I should be able to address that relatively soon. I don't know if it would be possible with Wasmer as the runtime without Wasmer making severe modifications to its code generation and runtime. Hot reload is a tricky problem, but I believe that, for the developer experience, it is worth trying.


How did you find the experience of compiling to wasm? Do you build an IR first at all or just ast -> wasm?


I found it relatively easy. I planned a lot of things out on whiteboards and in notebooks, but the actual code writing was mostly straightforward. For Onyx, I went straight from AST to WASM. The trickiest part was probably deciding how non-primitive things would be represented in WASM, like structures and tagged unions. But all of that came later, after I had the basics down. I started with the simplest little compiler that could only compile binary operations on integers and basic function calls. Then I expanded the compiler outward once I had the underlying data structures and code paths figured out.


Why does every language designer feel compelled to invent unique syntax? We need a common syntax for all new languages. Only internals differ.

This arbitrary combination of :: { () : core std f fmt -> is just annoying.


Since the guy who makes this is in the thread and I'm not the first to say stuff about the website…

Seems cool but I was trying to read one of the examples on <https://onyxlang.io/> and the carousel changed pages, throwing me off.

Would you consider not using a(n auto-advancing) carousel with code examples?

Not unrelated: <https://shouldiuseacarousel.com> (short answer: "no")


I totally agree the auto-advancing carousel was a bad idea. I wanted to add something with movement to the page, but that was definitely not the right idea. The carousel has been removed!


I'm not sure a specific compilation target that is meant to be as simple and generic as possible makes a new language a good idea.

New languages mean you start all over and wipe away decades of tools, knowledge, libraries, development environments, package managers, workflows and now people are in the wild west again. To justify that you have to have some enormous benefits.


Not to be confused with ONNX, a popular format for machine learning models that is also commonly used with WebAssembly. Even the logo looks similar.

https://onnx.ai/

https://cloudblogs.microsoft.com/opensource/2021/09/02/onnx-...


Not to be confused with Fedora Onyx, the budgie version of atomic Fedora.

https://fedoraproject.org/onyx/



Not to be confused with Onyx, the son of rappers Playboi Carti and Iggy Azalea.


Not to be confused with Onyx the gemstone. Or the other way around


Not to be confused with Onyx the clothing store.


Funny to see Onyx be mentioned in the wild! :D


Not to be confused with Onex Hacking Tool.

https://github.com/MasterScott/onex


nit: seamless not seemless.

Actually, TIL that "seemless" is a real word, but it's a synonym for unseemly.


I have irrational anger towards := assignment operator, probably because of Pascal course in high school.

It would be interesting to hear if anyone else has such stories regarding language syntax and features.


In Go, exported function names are capitalised, e.g. MyFunc() would be exported and myFunc() would not. This always looks so jarring to me and I've avoided Go largely because of it, which must rank as one of the more snowflake reasons to avoid a programming language.


Yes, I hate it. And in general, capitalized function names break the mental model of inferring a token's kind from its name. In my world, a capitalized token is always a type; lowercase is a variable (or property, function, etc).

Popular counter-argument is that it can be fixed with syntax highlighting, but for me it's a flaw in the language syntax.


For brave service in the forces of self-awareness, I hereby award you https://trout.me.uk/medal.jpg

May you wear it with pride.


It bugs me too. Mainly because I don't see anything wrong with "=", plus it's shorter and more intuitive.


The example in the blog post could have been written a little bit better, but in Onyx ":=" is actually a combination of two operations: defining a new variable with ':', and assigning the value with '='. If the type can be inferred on the right hand-side of the assignment, you can leave out the type between the ':' and the '=' and it effectively becomes a single operator. An alternative way to write that could have been 'input: str = "111 110 121 120";'.

To assign in Onyx, you only have to use '='.


As a mathematician, I see everything wrong with using “=” for something other than equality. Especially when writing something like x = x + 1.


I also have this irrational anger, but as a result of PL/SQL. It's also an objectively hideous operator that is inconvenient to type.


I have an irrational (?) hate for having to declare variables at all because it constantly hassles me while testing and copying/modifying code.


Coming from Go, I got used to :=. I would’ve preferred if it defaulted to const though


Coming from Pascal, I like := :-)



