CoRecursive: Coding Stories
Tech Talk: Scala Native with Denys Shabalin
Host: Adam Gordon Bell
Guest: Denys Shabalin, Research Assistant at EPFL, Creator of Scala Native
Date: January 1, 2018
Overview
This episode dives into Scala Native, a project spearheaded by Denys Shabalin, which brings the Scala programming language (traditionally dependent on the JVM) to native code through an optimizing ahead-of-time compiler and minimal runtime. Host Adam Gordon Bell interviews Denys about the motivations, design trade-offs, implementation, and direction of Scala Native, and discusses how it enables new scenarios for Scala, especially in environments where JVM overhead is problematic.
Key Discussion Points & Insights
1. Scala and Its Dual Paradigm Nature
- Scala: A language blending object-oriented (OO) and functional programming, designed to be neutral and flexible.
- Unique approach: “You can perfectly do object oriented, classical object oriented programming and fancy functional programming at the same time, which is pretty unique because most languages are heavily on one side or another.” (B, 01:06)
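As a rough illustration of that blend (not an example from the episode), the sketch below models data with a class hierarchy and then processes it with immutable collections and pattern matching. The names `Shape`, `Circle`, `Rect`, and `Shapes` are made up for this illustration.

```scala
// Object-oriented side: a sealed trait with concrete subclasses,
// each providing its own implementation of `area`.
sealed trait Shape {
  def area: Double
}
final case class Circle(radius: Double) extends Shape {
  def area: Double = math.Pi * radius * radius
}
final case class Rect(width: Double, height: Double) extends Shape {
  def area: Double = width * height
}

object Shapes {
  // Functional side: immutable list, higher-order function, no mutation.
  def totalArea(shapes: List[Shape]): Double =
    shapes.map(_.area).sum

  // Pattern matching over the same class hierarchy.
  def describe(shape: Shape): String = shape match {
    case Circle(r)  => s"circle of radius $r"
    case Rect(w, h) => s"rectangle $w x $h"
  }

  def main(args: Array[String]): Unit = {
    val shapes = List(Circle(1.0), Rect(2.0, 3.0))
    shapes.foreach(s => println(describe(s)))
    println(totalArea(shapes))
  }
}
```

The same types serve both styles: `area` is dispatched dynamically as in classical OO code, while `describe` and `totalArea` treat shapes as plain immutable data.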
2. What is Scala Native?
- Definition: Scala Native compiles Scala code to native binaries (e.g., x86, ARM), not requiring a JVM or any VM at runtime.
- Comparison to Scala.js: Just as Scala.js targets JavaScript, Scala Native targets native binaries of the kind a C compiler produces (B, 02:13)
- Native Benefits: Lower memory footprint and fast startup, vital for use cases like CLI tools and low-latency applications (B, 03:57)
3. Motivations for Scala Native
- JVM’s drawbacks: Heavyweight, slow startup, and substantial memory use, especially for short-lived or user-facing apps (B, 03:57)
- “The JVM is a really heavy, heavy machinery... it’s really hard to support all this functionality without incurring some overhead.” (B, 03:57)
- Need for flexibility in areas like systems programming and native interop
4. The ‘Golden Cage’ Metaphor and System-Level Access
- JVM restricts low-level operations for safety (“the cage”), but Scala Native allows raw memory access and pointers for maximum flexibility (B, 07:45)
- “We let you use low level system level tools like raw access to memory, raw pointers and stuff like that... it’s actually necessary in some domains like systems programming.” (B, 07:45)
5. Interoperability and Memory Management
- C interop: Scala Native exposes language extensions for easy C and systems programming (B, 08:47)
- Memory management: Offers both garbage collection and manual memory management. The host contrasts this with Java, which “purports to be memory managed, but if you look at these high performance apps, they’re using backdoors...” (A, 11:13)
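To give a feel for what the C interop extensions look like, here is a minimal sketch that binds the C standard library's `strlen` and calls it from Scala. It assumes a Scala Native 0.4-or-later toolchain, where the interop package is `scala.scalanative.unsafe` (releases from around the time of the episode used `scala.scalanative.native` instead). The object name `CLib` is made up for this illustration, and the code only builds with Scala Native, not on the JVM.

```scala
import scala.scalanative.unsafe._

// Binding to a C function: the body is the `extern` marker and the
// symbol is resolved by the native linker. `CLib` is a hypothetical name.
@extern
object CLib {
  def strlen(s: CString): CSize = extern
}

object Main {
  def main(args: Array[String]): Unit = {
    // c"..." builds a NUL-terminated C string literal.
    val length = CLib.strlen(c"hello")
    println(length)
  }
}
```

Note that the binding is expressed with an annotation and an ordinary method signature rather than new syntax, which matches the approach to extensions described later in the episode.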
6. Compilation Pipeline
- From Scala source to native binary:
  - Parse & typecheck via upstream Scala compiler
  - Emit NIR (Native Intermediate Representation) for Scala Native toolchain (B, 12:03)
  - Linking & whole program code elimination
  - Optimizer passes
  - Emit LLVM IR and native code via LLVM
- Quote: “...you end up with native binary. But it’s not the one step thing like from Scala source native binary.” (B, 12:03)
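The "linking & whole program code elimination" step can be pictured as a reachability walk over a call graph: anything not reachable from the program's entry points is dropped. The sketch below is a conceptual illustration in plain Scala, not Scala Native's actual linker; the `CallGraph` type and the method names in the sample graph are hypothetical.

```scala
object Reachability {
  // Hypothetical call graph: each method name maps to the methods it calls.
  type CallGraph = Map[String, Set[String]]

  // Collect every method transitively reachable from the entry points.
  def reachable(graph: CallGraph, entryPoints: Set[String]): Set[String] = {
    @annotation.tailrec
    def loop(pending: List[String], seen: Set[String]): Set[String] =
      pending match {
        case Nil                  => seen
        case m :: rest if seen(m) => loop(rest, seen) // already visited
        case m :: rest =>
          val callees = graph.getOrElse(m, Set.empty[String])
          loop(callees.toList ++ rest, seen + m)
      }
    loop(entryPoints.toList, Set.empty[String])
  }

  def main(args: Array[String]): Unit = {
    val graph: CallGraph = Map(
      "main"         -> Set("parse", "run"),
      "parse"        -> Set("tokenize"),
      "run"          -> Set.empty[String],
      "tokenize"     -> Set.empty[String],
      "unusedHelper" -> Set("tokenize")
    )
    val kept    = reachable(graph, Set("main"))
    val dropped = graph.keySet -- kept
    println(s"kept: ${kept.toList.sorted.mkString(", ")}")
    println(s"dropped: ${dropped.toList.sorted.mkString(", ")}")
  }
}
```

In this toy graph `unusedHelper` is never called from `main`, so a whole-program pass could leave it out of the final binary even though it calls a method that is kept.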
7. Garbage Collection Design
- The JVM offers several collectors: G1 (latency-oriented), Parallel (throughput-oriented), and the deprecated CMS, with a high degree of concurrency (B, 16:18)
- Scala Native GCs:
  - Boehm: Conservative, slower, easy to use
  - Immix: Precise, faster, main focus for future improvements (B, 18:49)
  - None: Allocation only, for benchmarking or short jobs
- Overhead relative to a “perfect” collector: Immix about +20%, Boehm about +100% (B, 22:24)
- JVM GC typically <5% overhead thanks to concurrency
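For readers who want to try the collectors, the choice is a build setting. The fragment below is a sketch that assumes the sbt plugin from Scala Native 0.4 or later; the setting names differ from the releases current when the episode was recorded, so treat it as illustrative rather than as what was discussed on the show.

```scala
// build.sbt (assumes the sbt-scala-native plugin, 0.4.x or later, is installed)
import scala.scalanative.build._

enablePlugins(ScalaNativePlugin)

// Pick one collector: GC.immix, GC.boehm, or GC.none.
nativeConfig ~= { _.withGC(GC.immix) }
```

Switching to `GC.none` gives the allocation-only mode mentioned above, which only suits benchmarks or short-lived programs that finish before memory runs out.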
8. Language Fidelity and Extensions
- Language compatibility: Scala Native aims for “one to one Scala,” with only minor edge-case differences (null handling, casting) (B, 24:52)
- Extensions: Adds C-style constructs via annotations and intrinsics, not new syntax, ensuring compatibility and ease of cross-compilation (B, 27:09)
9. Cross-Compilation and Standard Library
- Full support for sbt cross-projects, which allow a single codebase to target the JVM, JavaScript, and native (B, 28:01)
- “The idea for cross compilation is you can create a cross project with one or more platforms...publish one jar per every platform you want to support.” (B, 28:01)
- Scala Native reimplements subsets of necessary JDK APIs in Scala for compatibility; draws from Apache Harmony for non-GPL licensing (B, 30:05)
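A cross-project of the kind described above might be declared as follows. This is a sketch assuming the sbt-crossproject plugins from the portable-scala organization; the plugin and Scala versions shown are examples and should be checked against current releases, and the project name `core` is made up.

```scala
// project/plugins.sbt (versions are examples; check for current releases)
addSbtPlugin("org.portable-scala" % "sbt-scalajs-crossproject"      % "1.3.2")
addSbtPlugin("org.portable-scala" % "sbt-scala-native-crossproject" % "1.3.2")
addSbtPlugin("org.scala-js"       % "sbt-scalajs"                   % "1.16.0")
addSbtPlugin("org.scala-native"   % "sbt-scala-native"              % "0.4.17")

// build.sbt
lazy val core = crossProject(JVMPlatform, JSPlatform, NativePlatform)
  .crossType(CrossType.Pure) // one shared source tree for all platforms
  .in(file("core"))
  .settings(
    name := "core",
    scalaVersion := "2.13.12"
  )

// One sbt project per platform, each producing its own artifact.
lazy val coreJVM    = core.jvm
lazy val coreJS     = core.js
lazy val coreNative = core.native
```

Each of the three derived projects can then be compiled, tested, and published separately, which is how one codebase ends up as one jar per supported platform.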
10. Porting Large Codebases & Framework Support
- Barriers include missing Java APIs (especially IO and networking), but progressing (B, 32:53)
- Major port: scalac (the Scala compiler) to native, resulting in significantly faster cold builds: “cold build with ScalaC on JVM can be like two to three times slower than cold build on native.” (B, 34:18)
- Challenges centered mainly on library coverage, not the Scala Native core
11. String and Data Type Interoperability
- Scala strings are immutable, backed by arrays and GC'd, distinct from C strings; helpers exist for conversions (B, 37:16)
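The difference between the two string representations can be shown without any native tooling. The sketch below is a conceptual illustration in plain Scala of what a conversion has to do: copy the characters into a byte buffer terminated by NUL, and decode up to the first NUL on the way back. It is not Scala Native's implementation (current releases provide `toCString` and `fromCString` for this), and the names `CStringSketch`, `toNulTerminated`, and `fromNulTerminated` are made up.

```scala
import java.nio.charset.StandardCharsets

object CStringSketch {
  // Encode a Scala String as UTF-8 bytes followed by a terminating NUL,
  // the layout C functions expect behind a `char*`.
  def toNulTerminated(s: String): Array[Byte] =
    s.getBytes(StandardCharsets.UTF_8) :+ 0.toByte

  // Decode the bytes before the first NUL (or all bytes if there is none).
  def fromNulTerminated(bytes: Array[Byte]): String = {
    val end = bytes.indexOf(0.toByte)
    val len = if (end < 0) bytes.length else end
    new String(bytes, 0, len, StandardCharsets.UTF_8)
  }

  def main(args: Array[String]): Unit = {
    val raw = toNulTerminated("hello")
    println(raw.length)             // five characters plus the terminator
    println(fromNulTerminated(raw))
  }
}
```

One consequence of the C layout is visible here: a Scala string that contains an embedded NUL character does not round-trip, because decoding stops at the first NUL.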
12. Platform Support
- Focus on 64-bit (tested on x86-64 Mac/Linux, experimental ARM); should work on any 64-bit architecture (B, 38:28)
13. Early Adopters and Experimental Projects
- Command-line tools: Scala Native excels due to instant startup (B, 39:23)
- Dinosaur: Native-first experimental web framework (B, 39:23)
- Some early attempts at iOS exist, but Objective-C interop is missing (B, 41:10)
- AWS Lambda/serverless use is appealing in theory but has not been attempted yet (B, 40:53)
14. Technical Inspirations
- Scala.js: Provided precedent and motivation (B, 41:50)
- Swift (SIL): Inspired the intermediate representation and pre-LLVM optimizations. Main difference: Swift uses reference counting, Scala Native uses GC (B, 41:50)
15. Upcoming Features & Roadmap
- Focus on library/API coverage (multithreading, networking) (B, 45:03)
- Ongoing work on GC for improved performance
- Incremental performance/code quality improvements
- All language features largely stable (B, 45:03)
Notable Quotes & Memorable Moments
- On Scala’s dual paradigms: “It really tries to blend both together because there is both good and bad on both ends.” (B, 01:06)
- On the JVM as a barrier: “The JVM is a really heavy, heavy machinery... it’s really hard to support all this functionality without incurring some overhead.” (B, 03:57)
- On the ‘golden cage’: “We let you use low level system level tools like raw access to memory, raw pointers and stuff like that... it’s actually necessary in some domains like systems programming.” (B, 07:45)
- On the major advantage of native for CLIs: “For command line tools, it’s extremely important to start up quick, do your job and then die. This is the area where JVM is really bad at right now...” (B, 03:57)
- On cross-compilation: “The idea for cross compilation is you can create a cross project with one or more platforms...publish one jar per every platform you want to support.” (B, 28:01)
- On porting ScalaC: “Cold build with ScalaC on JVM can be like two to three times slower than code build on native.” (B, 34:18)
- On project philosophy: “If you can observe the difference from the reference foundation as a bug, well” (B, 32:26)
Important Timestamps
| Topic | Timestamp |
| --- | :---: |
| Scala’s dual paradigm & intro to Native | 01:06–03:47 |
| JVM limitations & ‘golden cage’ | 03:57–11:42 |
| Compilation process to native code | 12:03–15:36 |
| Garbage collection in JVM and Scala Native | 16:18–24:34 |
| Language compatibility & Extensions | 24:52–27:46 |
| Cross-compiling and standard library strategy | 28:01–31:38 |
| Porting frameworks / ScalaC to Native | 32:53–36:54 |
| String/Array interoperability | 37:16–38:15 |
| Platform support | 38:28–39:15 |
| Experimental/adoption stories | 39:23–41:46 |
| Technical inspirations, future work | 41:50–47:18 |
Resources and Community
- scala-native.org
- Twitter: @scala_native
- Gitter chat, GitHub issues and discussions (B, 46:35)
- “Right now we have a bit more than 60 contributors overall... it’s really nice to see people interested in the project.” (B, 47:26)
Conclusion
The episode presents a thorough, candid exploration of the motivations for and challenges in bringing Scala to native environments. Denys Shabalin outlines both technical innovations (such as advanced compilation, flexible memory management, and broad API emulation) and the broader vision—to make Scala more flexible and useful beyond the JVM. Key domains that benefit include command-line tools, experimental web frameworks, and potentially serverless computing, all enabled by rapid startup and lower memory demands. The project remains open and growing, with library support and GC innovations as prime areas for future contributions.
