Compilers: How To Make a Programming Language
Other compiler-related topics out of scope:
- What are programs anyway (Forth, LISP)
- Computation: Turing machines, lambda calculus
- Compiler optimization
- GC (motivated by LISP)
aesthetics:
- stay motivated; stick close to a c-like, imperative, procedural model because that's what people are used to
motivation / goals / what questions are we trying to answer:
- I want to be able to make a toy procedural language.
- like C, Algol, JS, Lua, etc.
- possible user motivations:
- I want to make a game scripting language
- I want to make a DSL for my job (and I want syntax highlighting!)
- I want to do simple static analysis of my projects
ok what do we want to cover
- classical compiler structure (lexer -> parser -> codegen)
- semantic analysis / type checking
Title | Page |
---|---|
Type Checking in Compiler Design | https://www.geeksforgeeks.org/type-checking-in-compiler-design/ |
Type Checking | https://www.brainkart.com/article/Type-Checking_8086/ |
Type Checking (Slides) | https://www.slideshare.net/dipongkersen81/type-checkingcompilier-design |
What is Static Type Checking? | https://www.tutorialspoint.com/what-is-static-type-checking |
Type Checking in Compiler Design | https://www.wikitechy.com/tutorials/compiler-design/type-checking-in-compiler-design |
Type Systems | https://www.csd.uwo.ca/~mmorenom/CS447/Lectures/TypeChecking.html/node1.html |
V: Type Checking | https://www.youtube.com/watch?v=-TQVAKby6oI |
- modern compiler structures (IR / SSA, optimization)
- interpreters vs. JITs vs. AOTs vs. "transpilers"
- executables and linkers
- regular expressions?
- regular languages / grammars / language structure / automata
- terminology?
the c compilation process
- translation units
- preprocessing -> (the whole compilation process) -> object files -> linking
- the C compilation model has fallen out of favor; people don't like compiling all these files separately
- ABIs and FFI
- should maybe be in separate article
experts / consultants:
- Bill
- NeGate
The actual progression
- Simple expression interpreter (parse and evaluate)
- Classical compiler construction (lex -> parse -> output), semantic analysis / type checking
- motivation: complex structures! recursion! etc.
- many of these resources exist and cover different aspects of the process in different ways
- Grammars and language structure
- Types of output (interpreter vs. AOT vs. JIT, etc.)
- We can probably find resources on specific ones of these
- Modern phases (IR / SSA)
- Mention WASM?
- The terrors of the real world
- Executables and linkers
- The C ABI and FFI
- Debug info
- Codegen
- Specifically: machine code generation, of reasonable quality
- Note: not necessary for all "compilers"
- Topics: register allocation, instruction selection, instruction scheduling
- Some examples of optimization passes
- There are not a lot of resources for this. Place a public TODO here?
- Appendix:
- Grammar basics (BNF, EBNF)
- Need not go into exhaustive detail on categories of grammars
- C is not the only language
- LISP
- Forth
- This could be a whole topic maybe
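A minimal sketch of the first step in the progression above (a simple expression interpreter that parses and evaluates), in Python. The grammar, written as EBNF in the comments, and every name here (`tokenize`, `Parser`, `evaluate`) are assumptions for illustration. It uses plain recursive descent, one method per grammar rule, and evaluates while parsing instead of building a tree.

```python
import re

# Assumed grammar (EBNF):
#   expr   = term   { ("+" | "-") term } ;
#   term   = factor { ("*" | "/") factor } ;
#   factor = NUMBER | "(" expr ")" | "-" factor ;

TOKEN = re.compile(r"\s*(?:(\d+)|([-+*/()]))")


def tokenize(text):
    """Lexer: turn source text into ("num", int) and ("op", str) tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"cannot tokenize input at offset {pos}")
        number, op = match.groups()
        tokens.append(("num", int(number)) if number else ("op", op))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", None)

    def accept(self, *ops):
        """Consume and return the next token if it is one of `ops`."""
        kind, value = self.peek()
        if kind == "op" and value in ops:
            self.pos += 1
            return value
        return None

    def expr(self):
        value = self.term()
        while (op := self.accept("+", "-")):
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while (op := self.accept("*", "/")):
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        kind, value = self.peek()
        self.pos += 1
        if kind == "num":
            return value
        if kind == "op" and value == "-":
            return -self.factor()
        if kind == "op" and value == "(":
            inner = self.expr()
            if self.accept(")") is None:
                raise SyntaxError("expected ')'")
            return inner
        raise SyntaxError(f"unexpected token {value!r}")


def evaluate(text):
    parser = Parser(tokenize(text))
    result = parser.expr()
    if parser.peek()[0] != "eof":
        raise SyntaxError("unexpected trailing input")
    return result
```

With this sketch, `evaluate("1 + 2 * 3")` returns 7 and `evaluate("(1 + 2) * 3")` returns 9. Swapping the arithmetic in `expr`, `term`, and `factor` for tree-building or instruction-emitting code is one way to move from this step toward the lex -> parse -> output structure in the next item.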
Link dump
Books
- Engineering a Compiler: [Well liked] http://www.r-5.org/files/books/computers/compilers/writing/Keith_Cooper_Linda_Torczon-Engineering_a_Compiler-EN.pdf
- Compiler Design in C: [May have a full implementation inside] https://holub.com/goodies/compiler/compilerDesignInC.pdf
- Dragon Book: [Potentially outdated -- mixed reviews] http://ce.sharif.edu/courses/94-95/1/ce414-2/resources/root/Text%20Books/Compiler%20Design/Alfred%20V.%20Aho,%20Monica%20S.%20Lam,%20Ravi%20Sethi,%20Jeffrey%20D.%20Ullman-Compilers%20-%20Principles,%20Techniques,%20and%20Tools-Pearson_Addison%20Wesley%20(2006).pdf
Webpages
- lua grammar: http://lua-users.org/wiki/LuaGrammar
- pascal railroad diagrams: https://www.cs.utexas.edu/users/novak/grammar.html
- tons of links: https://github.com/aalhour/awesome-compilers
- expression parsing examples:
- pratt parsing and recursive descent: https://journal.stuffwithstuff.com/2011/03/19/pratt-parsers-expression-parsing-made-easy/
- dunno, not recursive descent: https://www.cs.rochester.edu/u/nelson/courses/csc_173/grammars/parsing.html
- gary bernhardt's compiler from scratch: https://www.destroyallsoftware.com/screencasts/catalog/a-compiler-from-scratch
- lambda calculus interpreter: https://justine.lol/lambda/
- chibicc (full, readable C compiler): https://github.com/rui314/chibicc
- A Compiler Writing Journey (has many pages/topics): https://github.com/DoctorWkt/acwj
From NeGate
- Near-Optimal Instruction Selection on DAGs: https://llvm.org/pubs/2008-CGO-DagISel.pdf
- The Design and Implementation of Gnu Compiler Generation Framework: https://www.cse.iitb.ac.in/~uday/courses/cs715-10/cs715-gcc-intro-handout.pdf
- Lecture Notes on Static Single Assignment Form: https://www.cs.cmu.edu/~rjsimmon/15411-f15/lec/10-ssa.pdf
- Simple and Efficient Construction of Static Single Assignment Form: https://pp.info.uni-karlsruhe.de/uploads/publikationen/braun13cc.pdf
- LLVM Greedy Register Allocator – Improving Region Split Decisions: https://llvm.org/devmtg/2018-04/slides/Yatsina-LLVM%20Greedy%20Register%20Allocator.pdf
- NULLSTONE Optimization Categories: http://www.nullstone.com/htmls/category.htm
Articles that we need to write
- Codegen (needs more details)
- Debug info