Creating your own computer language

#1

I’m an enthusiast of computer language design and implementation, and would like to start this thread to talk about compilers, language implementations, pros and cons of language paradigms (functional vs declarative vs procedural) and other techniques (JITting, immutability). Needless to say, I’m definitely not an expert in many of these topics, so I hope we can discuss them and learn together. :slight_smile:

For my part, yes, designing and implementing computer languages is one of my passions (even my master’s thesis was related to this topic). I’ve also been working on a BASIC (yes, BASIC) compiler project (https://github.com/boriel/zxbasic), and I’ve worked on other projects too (some of them lost forever :frowning: )

I’ve always wanted to implement my own programming language (and I really mean it), but it’s a daunting task that many people discourage me from attempting. We could talk about this too: joining an existing effort or not, which tool stack to use, which language to bootstrap from, interpreted vs compiled vs “jitted” languages…

So, if you were to design your own computer language, what would it be like? If you’re already very happy with an existing one, which one is it and why do you love it?

#2

Thanks for creating this thread! :grin: I’m always up for a good discussion about the design and implementation of programming languages. Warning: long, wall-of-text post ahead.

Have you seen the make-a-lisp project? I went through that process, and it was really eye-opening. Lisp is a great language (or family of languages, technically) for implementing your own language – I have 3 or 4 books on that topic alone, explaining how to implement interpreters and compilers using Lisp. If you haven’t tried Racket, I’d highly recommend it. It’s tailor-made for making your own language.

Fuck the naysayers. Making your own programming language might not be practical, and it’s absolutely difficult, but it’s incredibly educational, and worth it for what you learn alone, even if the language you create never gains widespread adoption.

That said, I’m less interested in justifying what I do and more interested in doing it. So: let’s talk shop! :grin:

It’s hard to say what my ideal language would look like…I could name features I would want (or not), but it’s hard to decide on some points, and ultimately I think it’s impossible to build a language that’s ideal for everyone.

You can’t please all the people all the time, and last night, all those people were at my show.

– Mitch Hedberg

That said, my ideal language, just for me, would be one where:

  • Typing is there, but it’s optional to specify types.
  • Typing is strong. You shouldn’t be able to implicitly coerce types.
  • Memory management is automatic, or at least Rust-like – in other words, I shouldn’t ever have to call malloc.
  • Whitespace doesn’t matter (much as I love Python, it got that one wrong).
  • Lisp-like macros are available, such that it’s trivial to write a code generator.
  • Everything is an object. I like this in Ruby especially, but I fucking hate Java. It should be inspired by JS, Ruby, and Smalltalk.
  • Functions are first-class objects.
  • Beginners can jump in and start writing code immediately. There should be near-zero learning curve, and the language should be optimized for the “principle of least surprise”.

This is just off the top of my head. Given these features, I’d probably go with a Lisp dialect similar to Clojure, but not JVM-based. I’m willing to forgo Lisp-like macros, but if I’m giving up features it’s not really my ideal language, right? :wink:

Getting back to your compiler: BASIC is awesome, I don’t care what they say. Making a BASIC dialect for the Sinclair ZX Spectrum is downright esoteric, and I absolutely love it. If you need a hand with that, just let me know.

I noticed you said “compiler”, not “interpreter” – I don’t believe I’ve ever seen a compiled version of BASIC, but it makes sense! Especially on a machine like the ZX Spectrum. The examples on your github page, in the screenshots, look absolutely stunning! :heart_eyes: I see you have a way to run examples on an emulator…that’s fantastic. Are you just a huge fan of the ZX Spectrum? What inspired you to create a BASIC dialect for the Spectrum in particular? And what sort of problems are you currently running into?

This is getting way too long. Sorry. I get excited about programming languages, and Lisp in particular. :laugh: Ping me back, man. I’m always up to chat about language design and such.

#3

Quick reply. A language should be used, i.e. you cannot be its only user/developer. So building a community should take at least 50% of the creator’s time, and 30% should go to documentation :slight_smile: so coding will be only 20%, or you should find teammates who will cover those areas.

Open source it, no doubt about it. It cannot be hidden.

Start building up a pile of data (notes). Be ready to do this for 5 years.

Divide your wishes by 100 and focus on perfecting that 1%.

#4

I don’t mind long posts. Actually I think we need to slow down and read / write higher-quality content (like this thread) instead of plain quick responses.

Thanks for this link!! :+1: Indeed I’ve just discovered some good ideas in the Python implementation!

I agree. For example, working on the ZX Basic project has helped me sharpen my skills (git-flow, testing implementations, refactoring strategies, programming patterns) regardless of the language itself, and yes, I simply enjoy doing it.

I agree with all of your ideas, but optional types and no implicit coercion might be opposing goals: if types are implicit (optional), you might expect some type coercion (btw, the wat bit made my day :rofl::rofl:). For example, take something like:

     var a = 5    // <int> (i.e. int32 | int64)
     var b = 6.5  // <float> (i.e. double)

then, to operate a + b one of them must be coerced:

     var c = a + b  // The code analysis: coerce(<int>, <float>) => promote <float> (implicit)
     var d = float(a) + b  // Explicit by the user, code dirtier on longer expressions, like:
     var e = float(a) + b + float(a) * b + float(a) + b + float(a) * b

I’m a bit eclectic on this, i.e. I’d allow type promotion only. Even so, explicit typecasting can (and I consider this important) be done with a cleaner syntax, something that usually doesn’t get much care. Consider these 3 alternative notations for the last declaration:

     var e = a:float + b + a:float * b + a:float + b + a:float * b
//
     var e = float:a + b + float:a * b + float:a + b + float:a * b
//
     var e = (float a) + b + (float a) * b + (float a) + b + (float a) * b

Since I’m very used to Python and C, the 1st and the last ones are more familiar to me.
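
Going back to the promotion-only rule, here is a minimal Python sketch of how a type checker might apply it. The names `RANK`, `promote` and `can_assign` are made up for illustration and are not taken from ZX Basic or any real compiler; a real language would have more numeric types than these two.

```python
# Hypothetical numeric tower for the toy language: higher rank = wider type.
RANK = {"int": 0, "float": 1}


def promote(left: str, right: str) -> str:
    """Result type of a binary arithmetic op: the wider of the two operand types."""
    return left if RANK[left] >= RANK[right] else right


def can_assign(target: str, source: str) -> bool:
    """Implicit conversion is allowed only when it widens (promotion only)."""
    return RANK[source] <= RANK[target]


print(promote("int", "float"))     # type of `a + b` in the example above
print(can_assign("float", "int"))  # widening: allowed implicitly
print(can_assign("int", "float"))  # narrowing: would need an explicit cast
```

With this rule `var c = a + b` type-checks as a float, while assigning a float to an int variable would be rejected unless the user writes the cast.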

Absolutely agree, and if possible, not using a GC.

For the rest yes, I agree. Perhaps the significant whitespace was nice because it enforced some sort of Python readability, but that can also be achieved with a code formatter on a git commit hook.
Lisp macros are a bit complicated, especially if one wants to end up with a compiled executable. On the other hand, having things like templates, generics or decorators will provide a lot of versatility.

Any help is welcome, but I also understand it’s not a very attractive project. As @arthur.tkachenko pointed out, a language should be used and have some traction. ZX Basic is definitely used (that makes me happy, and it’s the main reason I have kept working on it for 10+ years), but mostly by people used to programming in BASIC or on vintage machines. Most of them are not comfortable with modern languages or don’t know much about compilers (and frankly, I need to document the project to attract other developers; currently the Wiki only documents the BASIC language and how to use the compiler). This is another reason I’m most interested in creating a language (to describe something?) as a new project.

There’s another (incomplete) list of programs created with ZX Basic here, if you’re curious.

Warning: long paragraph ahead :cold_sweat:

Well, I’ve had this idea in mind since I was a child! (“running basic like machine code!” :baby:). But after graduating I started a job and parked the idea. Later I bumped into a group of retro-hackers and enthusiasts and noticed they mostly programmed in BASIC (the interpreted one), Z80 asm and a few of them in C (z88dk, a C compiler), and we started discussing this idea.

For most of us, BASIC was the first computer language we met, and that brings back some nostalgic feelings. :slight_smile: It was designed to be easy to learn and to be interpreted (creating a line-numbered BASIC interpreter in Python is rather easy). For the machines of that time it was more than enough: they had 64K (and were called microcomputers, very similar to what microcontrollers are today, but with 1/100th of the current PIC’s memory :rofl:). I’m not sure BASIC is a language that could scale well to large codebases :roll_eyes:.
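
To give an idea of how little it takes, here is a rough Python sketch of a line-numbered interpreter. It is a toy dialect invented for this post, not Sinclair BASIC and not ZX Basic: it only knows `LET`, `PRINT`, `GOTO`, `IF … THEN <line>` and `END`, and it cheats by handing expressions to Python’s `eval`, so the expression syntax is Python’s rather than BASIC’s.

```python
def run(source: str) -> list[str]:
    """Run a tiny line-numbered BASIC-like program and return the PRINTed lines."""
    # Parse "10 PRINT X" into {10: "PRINT X"}.
    program = {}
    for line in source.strip().splitlines():
        number, _, statement = line.strip().partition(" ")
        program[int(number)] = statement.strip()

    order = sorted(program)  # line numbers in execution order
    variables = {}
    output = []

    def evaluate(expr: str):
        # Shortcut for a toy: reuse Python's evaluator with builtins disabled.
        # Never do this with untrusted input.
        return eval(expr.strip(), {"__builtins__": {}}, variables)

    pc = 0  # index into `order`
    while pc < len(order):
        keyword, _, rest = program[order[pc]].partition(" ")
        pc += 1  # default: fall through to the next line
        if keyword == "LET":
            name, _, expr = rest.partition("=")
            variables[name.strip()] = evaluate(expr)
        elif keyword == "PRINT":
            output.append(str(evaluate(rest)))
        elif keyword == "GOTO":
            pc = order.index(int(rest))
        elif keyword == "IF":  # IF <condition> THEN <line number>
            condition, _, target = rest.partition(" THEN ")
            if evaluate(condition):
                pc = order.index(int(target))
        elif keyword == "END":
            break
        else:
            raise SyntaxError(f"unknown statement: {keyword}")
    return output


PROGRAM = """
10 LET I = 1
20 PRINT I
30 LET I = I + 1
40 IF I <= 3 THEN 20
50 END
"""
print(run(PROGRAM))  # ['1', '2', '3']
```

The whole thing is a dictionary of lines plus a program counter, which is pretty much why the language suited interpreters on small machines.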

The challenging parts:

  • Due to its nature, compiling BASIC into efficient machine code for a small-memory machine can be complex: the Z80 has only one 8-bit accumulator register to operate with, and it only does additions and subtractions of 8/16-bit integers, so even a multiplication or a division requires several assembly instructions. BASIC variables are always floats (which are very expensive to operate on) or strings. ZX Basic tries to do some type inference to use integers whenever possible, but types can also be declared explicitly.

  • For the syntax, I first tried to make it as compatible as possible with the original Sinclair BASIC to allow people to quickly get used to the language, but I also added other extensions (e.g. functions and subroutines, control-flow constructs, inline assembly, etc.).

  • But by far the hardest part has been the successive refactorings (I can comment in a follow-up if you’re interested).
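
On the first point, a quick illustration of what “several assembly operations” means for a multiplication: the usual software approach is shift-and-add. This is a Python sketch of the idea only, with the hypothetical name `mul8`; it is not the routine ZX Basic actually emits, and on a real Z80 each step would be a handful of shift, add and branch instructions.

```python
def mul8(a: int, b: int) -> int:
    """Multiply two unsigned 8-bit values using only shifts and adds."""
    assert 0 <= a <= 0xFF and 0 <= b <= 0xFF
    result = 0   # 16-bit running total
    addend = a   # `a` shifted left once per step
    for _ in range(8):  # one iteration per bit of the multiplier
        if b & 1:       # lowest bit set: add the shifted multiplicand
            result = (result + addend) & 0xFFFF
        addend = (addend << 1) & 0xFFFF
        b >>= 1
    return result


print(mul8(13, 11))    # 143
print(mul8(255, 255))  # 65025, the largest possible result, still fits in 16 bits
```

Eight loop iterations for a single 8-bit multiply, and floats are far worse, which is why inferring integer types pays off so much.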

Currently I’m improving the array implementation to allow arrays to be declared at any memory address (similar to pointers in C?). Another thing I have in mind is Tail Call Optimization (TCO).

I like long posts (and so is my reply, sorry): actually they’re needed for transmitting a lot of information :joy:
Thanks for your interest, and I hope my long double reply was not too much. :slight_smile:

#5

No problem! That’s awesome, glad it could help. :grin:

Good point! What I had in mind was to explicitly coerce types when operations require it, much like in your examples. I agree, only allowing type promotion seems like the way to go. Otherwise, you’re naturally going to lose precision (like going from a long to an int) and you might get results that violate the Principle of Least Surprise.

Oy, right. Using a GC comes with its own set of headaches…I mean, if you implement using a language that has its own GC, it’s not so bad, but then you have to worry about that GC’s behavior (which you have no control over). Then, if you implement your own GC, it’s like a whole new system, beside the language! That’s like a 5 year project, all on its own. Much better to adopt a system like Rust, if you’re worried about performance. Of course, that makes the language harder to use…I guess I’m still on the fence here. Convince me!

Not at all! Proper macros, Lisp macros (not at all like the ones in C) are translated at compile-time – I’ve actually implemented a simple-yet-functional macro system myself in the make-a-lisp project. Template systems and generics are fine and well, but a true macro system is so much more powerful, especially with reader macros. Decorators, though…that’s another matter entirely. I’d love to have decorators in my ideal language, come to think of it. I’ve missed those from Python, back when I used Django to make websites…
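
Just to show the shape of it, here’s a minimal sketch of compile-time expansion in Python (since we both know it). This is not the make-a-lisp code: S-expressions are modelled as nested lists, and `MACROS`, `defmacro`, `unless` and `macroexpand` are names I’m making up for the example.

```python
# S-expressions are modelled as nested Python lists; symbols are plain strings.
MACROS = {}


def defmacro(name):
    """Register a function that rewrites one form into another."""
    def register(fn):
        MACROS[name] = fn
        return fn
    return register


@defmacro("unless")
def unless(condition, body):
    # (unless c body)  =>  (if c nil body)
    return ["if", condition, "nil", body]


def macroexpand(form):
    """Expand macros recursively, before any evaluation or code generation."""
    if not isinstance(form, list) or not form:
        return form  # atoms and empty lists are left alone
    head = form[0]
    if isinstance(head, str) and head in MACROS:
        return macroexpand(MACROS[head](*form[1:]))  # re-expand the result
    return [macroexpand(item) for item in form]


program = ["print", ["unless", ["<", "x", 0], ["sqrt", "x"]]]
print(macroexpand(program))
# ['print', ['if', ['<', 'x', 0], 'nil', ['sqrt', 'x']]]
```

By the time the compiler proper sees the program, `unless` is gone and only `if` remains, so nothing about macros has to survive into the executable.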

I had heard that the ZX Spectrum had an active community still writing code, but I had never met anyone, online or otherwise, that actually did it – that’s awesome! I grew up in the 90s and 00s, and sadly missed out on the Apple II, Commodore 64, and all of the other wonderful, hackable machines of that era. How would you say the ZX Spectrum stacks up against the Apple and Commodore machines?

Is it possible to make it a language that scales to large codebases? I remember BASIC as relying on a lot of goto statements, global variables…obviously those would have to go, but maybe you could write a language that builds on the concepts of BASIC (INT, maybe, for “intermediate”?). Just a thought I had – not sure if it fits in with your plans.

Have you ever read Hackers: Heroes of the Computer Revolution, by Steven Levy? If not, it’s an excellent history of hacking, being partially devoted to assembly code wizardry. I’m also reminded of the Story of Mel here. One of my goals in life is to write a game entirely in 6502 assembly, to learn some of those tricks and really get a feel for how a computer works at the lowest level.

Wow! I know the 6502, a little bit, but not the Z80 – does it have X and Y registers, like the 6502? What about a stack pointer? How are the memory addressing modes? Can you recommend any solid documentation on the Z80? I honestly like to read that sort of thing, just for fun. I’m obsessed with the machines of the 80s, really.

Consider me interested! :grin: What is it about the refactorings that’s been so challenging? Is it just the general breakage that comes with a major refactoring, or is it something specific to compiler design?

Now that sounds challenging! How are you handling conflicts in requested addresses? For example, let’s say you have a program requesting an array of four 32-bit integers at address 0xff00. Then you have a request later on for an array (doesn’t matter what size/contents) at address 0xff02. Do you have some sort of packing algorithm, for when the arrays fill up the available slots? I’m super curious about how this works in particular. I could see a simple solution working well, to a point, but eventually you’d need a system similar to malloc.
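
To make the question concrete, this is the kind of “simple solution” I’m picturing: a compile-time overlap check over the requested byte ranges. It’s a Python sketch with made-up names (`FixedArray`, `find_conflicts`), and I have no idea whether ZX Basic does anything like it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedArray:
    name: str
    address: int       # first byte
    count: int         # number of elements
    element_size: int  # bytes per element

    @property
    def end(self) -> int:
        """One past the last byte occupied."""
        return self.address + self.count * self.element_size


def find_conflicts(arrays):
    """Return pairs of names whose byte ranges overlap."""
    conflicts = []
    ordered = sorted(arrays, key=lambda a: a.address)
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.address >= first.end:
                break  # sorted by address: nothing later can overlap `first`
            conflicts.append((first.name, second.name))
    return conflicts


a = FixedArray("a", 0xFF00, count=4, element_size=4)  # bytes 0xFF00..0xFF0F
b = FixedArray("b", 0xFF02, count=2, element_size=1)  # bytes 0xFF02..0xFF03
c = FixedArray("c", 0xFF10, count=8, element_size=1)  # starts right after a
print(find_conflicts([a, b, c]))  # [('a', 'b')]
```

That only reports the clash, of course. Deciding what to do about it (error out, or relocate one of them) is where it starts to look like an allocator.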

The make-a-lisp project may be able to help there. :slight_smile: TCO is part of the challenge, although to be fair, it’s probably harder in BASIC than it is in Lisp. I’ve only implemented the latter, so you know better than I do.
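
For reference, one way to get the effect of tail calls without growing the stack is trampolining: the function returns its tail call as a thunk instead of making it, and a small driver loop runs the thunks. The sketch below is Python with hypothetical names (`trampoline`, `countdown`); a compiler targeting the Z80 would more likely turn the tail call into a plain jump, which I’m only assuming here.

```python
def trampoline(fn, *args):
    """Run `fn`, bouncing on returned thunks so the Python stack never grows."""
    result = fn(*args)
    while callable(result):
        result = result()
    return result


def countdown(n, acc=0):
    """Sum n + (n-1) + ... + 1 with the recursive call in tail position."""
    if n == 0:
        return acc
    return lambda: countdown(n - 1, acc + n)  # tail call returned, not made


print(trampoline(countdown, 100_000))  # 5000050000
```

Calling `countdown` recursively 100,000 deep would blow past Python’s default recursion limit; with the trampoline the stack depth stays constant.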

Thanks for replying! long double, that’s gold. :laugh: Keep 'em coming!
