Compile-Time Code Quality Enforcement
Either Good Code or Errors, Never Warnings
EK9 is the first and only mainstream programming language to enforce comprehensive code quality metrics at compile-time. While other languages rely on optional external tools (linters, static analyzers, code quality platforms), EK9 integrates quality enforcement directly into the compiler, making it impossible to compile poor-quality code.
The EK9 Philosophy: If your code compiles, it has passed all quality checks. There are no warnings to ignore, no thresholds to configure, and no external tools to integrate. Quality is enforced by the compiler itself.
This revolutionary approach eliminates technical debt at its source. You cannot accumulate warnings that "will be fixed later" because there are no warnings — only errors that prevent compilation until the code meets quality standards.
Use -E3 (verbose mode) to get detailed explanations of any quality violations. See Error Verbosity Levels for all options.
AI Coding Assistants: EK9's quality system is specifically designed for AI-assisted development. See the For AI Assistants page for complete documentation including JSON output schemas, metric thresholds, refactoring patterns, and workflow templates.
The Quality Pyramid
EK9 enforces quality at five distinct levels, forming a comprehensive quality pyramid. All five layers must pass for code to compile successfully.
+-----------------------+
| Duplicate Code | <- Codebase-level
| Detection (FUTURE) | (DRY principle)
+-----------------------+
^
|
+-----------------------+
| Inheritance Depth | <- Hierarchy-level
| Limits (E11019) | (composition > inheritance)
+-----------------------+
^
|
+-----------------------+
| Cohesion/Coupling | <- Architecture-level
| (E11014, E11015) | (SOLID principles)
+-----------------------+
^
|
+-----------------------+
| Complexity Limits | <- Function-level
| (E11010-E11013) | (cognitive load)
+-----------------------+
^
|
+-----------------------+
| Purity and Safety | <- Expression-level
| (pure keyword) | (correctness)
+-----------------------+
Layer 1: Purity and Safety (Expression-Level)
The foundation of EK9's quality model. Pure functions have no side effects, making code
predictable and testable. The pure keyword enforces this at the expression level.
Combined with EK9's tri-state semantics and guard expressions,
this eliminates entire categories of bugs.
Layer 2: Complexity Limits (Function-Level)
Functions and methods have complexity limits based on cyclomatic complexity. This keeps cognitive load manageable and encourages decomposition into smaller, focused units. See E11010, E11011, and E11013.
Layer 3: Cohesion and Coupling (Architecture-Level)
Classes must maintain high cohesion (methods work together on related data) and low coupling (limited dependencies on external types). This enforces SOLID principles at the architecture level. See E11014 and E11015.
Layer 4: Inheritance Depth (Hierarchy-Level)
Deep inheritance hierarchies make code hard to understand and maintain.
EK9 limits inheritance depth and encourages composition via delegation
using the by keyword. See E11019.
Layer 5: Duplicate Code Detection (Codebase-Level)
Coming in 2026: IR-based similarity detection will identify duplicated code blocks across your codebase, enforcing the DRY (Don't Repeat Yourself) principle at compile-time.
Why Compile-Time Enforcement?
The Industry Standard Problem
Traditional languages rely on a fragmented ecosystem of optional external tools:
- Checkstyle/StyleCop for style checking (optional, configurable, ignorable)
- SonarQube/NDepend for complexity and quality (requires setup, often ignored)
- Snyk/Fortify for security scanning (external, expensive)
- Code review for design quality (subjective, slow, inconsistent)
- CI/CD gates for enforcement (can be bypassed, creates friction)
This results in fragmented, optional, bypassable quality enforcement. Warnings accumulate, tools are misconfigured, and technical debt compounds over time.
The EK9 Solution
EK9 replaces this entire fragmented ecosystem with one tool: the compiler.
- No configuration — thresholds are built into the language
- No external tools — quality checks run during compilation
- No warnings — either it compiles (passed) or it doesn't (failed)
- No bypass — quality is mandatory, not optional
The "No Gray Area" Principle
Traditional Approach:
+-- Compiles (green)
+-- Compiles with warnings (gray area - technical debt zone)
+-- Doesn't compile (red)
EK9 Approach:
+-- Compiles (passed ALL checks) (green)
+-- Doesn't compile (red)
No gray area. No accumulating debt.
Implemented Quality Metrics
EK9 currently enforces four categories of quality metrics. Each violation produces a compile-time error with detailed guidance on how to fix the issue.
4.1 Complexity Limits
Coming from Java, Python, or C++? You might be used to 100-line methods with nested if-else chains. EK9 enforces McCabe cyclomatic complexity limits (max 45 per function) at compile time. This feels restrictive at first, but prevents an entire category of production defects.
Why Complexity Limits Matter: The Evidence
- McCabe (1976): Original research showed functions with complexity > 10 had exponentially higher defect rates
- Microsoft Study (2008): Functions with complexity > 15 had 14x higher bug density than functions < 10
- Google Study (2011): Code with complexity > 10 took developers 2.5x longer to debug
- NIST (2002): 70% of severe security vulnerabilities occur in functions with complexity > 15
High-Profile Failures Involving High Complexity
- Therac-25 (6 deaths): 500-line race condition function with complexity ~47
- Toyota (89 deaths): Unintended acceleration traced to functions with 1000+ line complexity
- Knight Capital ($440M loss, 2012): 1,000-line deployment function with untested edge cases
EK9's Thresholds
Complexity is measured using cyclomatic complexity (McCabe, 1976). High complexity indicates code that is difficult to understand, test, and maintain.
| Error Code | Metric | Threshold | Description |
|---|---|---|---|
| E11010 | Cyclomatic Complexity | Function: 45, Class: 500 | Too many decision points (if/switch/loops) |
| E11011 | Nesting Depth | Max: 6 levels | Absolute structural limit for nested control structures |
| E11021 | Cognitive Complexity | Max: 35 points | Guards (<-) add +1 penalty, cost multiplied by depth |
| E11013 | Expression Complexity | Max: 15 points | Nested coalescing operators (??, ?:) |
| E11022 | Class Should Be Component | ≥4 service fields, 0 data | Pure service coordinator should use defines component |
| E11023 | Missing 'by' Delegation | ≥70% override, ≥5 methods | Manual forwarding should use by delegation |
| E11024 | Hybrid Class-Component | 1+ by, ≥3 services | Mixed responsibilities should be split |
| E11025 | Excessive Mixed Responsibilities | ≥3 services AND ≥3 data | Blob/God Class - split into component + data class |
What adds complexity: Each if, switch case, while, for loop, Boolean operator (and, or), exception handler, and stream pipeline stage adds to the complexity score.
Cognitive Complexity (E11021): Based on SonarQube's Cognitive Complexity metric,
adapted for EK9. Formula: (base_cost + guard_penalty) × nesting_depth.
Guard expressions (<-) pack declaration + conditional check into one line,
so they add +1 cognitive penalty. A 6-level guard chain exceeds the threshold (42 > 35)
while 6 simple ifs pass (21 < 35). This catches code that's "hard to understand"
before E11011's structural limit at depth 6.
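The arithmetic above can be sketched in a few lines of Python. This is an illustrative model only — the compiler scores many more constructs — but it reproduces the 42-vs-21 comparison from the text:

```python
# Illustrative sketch of the cognitive-complexity formula:
# each construct costs (base_cost + guard_penalty) * nesting_depth.
def cognitive_cost(constructs):
    """constructs: list of (is_guard, depth) tuples, depth starting at 1."""
    total = 0
    for is_guard, depth in constructs:
        base_cost = 1
        guard_penalty = 1 if is_guard else 0  # guards (<-) add +1
        total += (base_cost + guard_penalty) * depth
    return total

# Six nested guard expressions: (1+1)*1 + ... + (1+1)*6 = 42 (> 35, fails E11021)
guard_chain = [(True, d) for d in range(1, 7)]
# Six nested plain ifs: 1*1 + ... + 1*6 = 21 (< 35, passes)
if_chain = [(False, d) for d in range(1, 7)]

print(cognitive_cost(guard_chain))  # 42
print(cognitive_cost(if_chain))     # 21
```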
4.2 Cohesion Metrics (LCOM4)
Ever opened a class and found it does five unrelated things? That's low cohesion - methods that don't work together, operating on separate groups of fields. EK9 enforces LCOM4 limits (max 8 for classes) at compile time.
Why Cohesion Matters: The Evidence
- Chidamber & Kemerer (1994): Classes with LCOM > 5 had 3.2x higher defect rates
- Basili et al. (1996): Low cohesion classes required 40% more maintenance effort
- Martin (2003): "A class should have one reason to change" - Single Responsibility Principle
EK9's Approach
Cohesion measures how well the methods of a class work together. EK9 uses LCOM4 (Lack of Cohesion of Methods, version 4) from Hitz & Montazeri (1995).
| Error Code | Construct | Threshold | What It Means |
|---|---|---|---|
| E11014 | Class | Max LCOM4: 8 | Class has too many unrelated method groups |
| E11014 | Component | Max LCOM4: 10 | Component coordinates too many concerns |
Plain English: If your class has methods that don't share any fields with each other, it's doing too many unrelated things. Split it into focused classes.
Exemptions: Traits, Records, and Functions are exempt from cohesion checks by design — traits have abstract methods (no field access), records focus on data operators, and functions have a single body.
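As an illustration of how LCOM4 counting works, here is a minimal Python sketch: methods are connected when they access a common field, and LCOM4 is the number of connected groups. (The real metric also links methods that call each other; the method and field names below are hypothetical.)

```python
# Illustrative LCOM4: union-find over methods, joined when they share a field.
def lcom4(method_fields):
    """method_fields: dict of method name -> set of fields it touches."""
    methods = list(method_fields)
    parent = {m: m for m in methods}

    def find(m):
        while parent[m] != m:
            parent[m] = parent[parent[m]]  # path compression
            m = parent[m]
        return m

    for i, a in enumerate(methods):
        for b in methods[i + 1:]:
            if method_fields[a] & method_fields[b]:  # shared field -> connected
                parent[find(a)] = find(b)
    return len({find(m) for m in methods})

# Two unrelated method groups -> LCOM4 of 2 (doing two jobs; consider splitting)
example = {
    "deposit": {"balance"}, "withdraw": {"balance"},            # group 1
    "formatReport": {"template"}, "printReport": {"template"},  # group 2
}
print(lcom4(example))  # 2
```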
4.3 Coupling Metrics (Efferent Coupling)
Ever changed one class and watched 15 others break? That's high coupling - excessive dependencies making code fragile and difficult to refactor. EK9 enforces coupling limits (max Ce 12 for classes) at compile time.
Why Coupling Limits Matter: The Evidence
- Chidamber & Kemerer (1994): Classes with high coupling (Ce > 10) had 2.8x more bugs
- Briand et al. (1999): Coupling was the strongest predictor of fault-proneness across 8 systems
- Martin (2003): "Classes should depend on abstractions, not concretions" - Dependency Inversion Principle
EK9's Thresholds
Coupling measures how many external types a construct depends on. High coupling creates fragile code that breaks when dependencies change. EK9 tracks efferent coupling (Ce) — the count of external types your construct references.
| Error Code | Construct | Threshold | What It Means |
|---|---|---|---|
| E11015 | Class | Max Ce: 12 | Class depends on too many external types |
| E11015 | Component/Service | Max Ce: 15 | Entry point coordinates too many dependencies |
| E11015 | Trait/Record/Function | Max Ce: 8 | Simpler constructs should have fewer dependencies |
What's tracked: Field types, method parameter types, return types, super classes, implemented traits, and captured variables in dynamic constructs.
What's excluded: Built-in types from org.ek9.lang (Integer, String, List, etc.)
are not counted toward coupling — they're part of the language, not external dependencies.
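Conceptually, Ce is a distinct-type count with built-ins excluded. A minimal Python sketch (the type names and the built-in subset below are illustrative, not the compiler's actual tables):

```python
# Illustrative efferent-coupling (Ce) count: distinct external types referenced,
# with org.ek9.lang built-ins excluded. BUILT_INS here is a small sample set.
BUILT_INS = {"Integer", "String", "List", "Boolean", "Float"}

def efferent_coupling(referenced_types):
    """referenced_types: every type seen in fields, parameters, returns, supers."""
    return len(set(referenced_types) - BUILT_INS)

# Duplicates collapse; String and Integer don't count.
refs = ["Customer", "Order", "String", "PaymentGateway", "Order", "Integer"]
print(efferent_coupling(refs))  # 3 -> Customer, Order, PaymentGateway
```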
4.4 Module-Level Quality Metrics
Beyond construct-level metrics, EK9 also enforces quality at the module level. These metrics operate across all source files in a module and catch architectural issues that span multiple files.
Module Coupling (E11016)
Counts the number of distinct external modules a module depends on. High module coupling indicates a module that depends on too many other modules, making it fragile and difficult to maintain in isolation.
| Error Code | Metric | Threshold | What It Means |
|---|---|---|---|
| E11016 | External Module Count | Max: 10 | Module depends on too many other modules |
What's tracked: All type references to constructs defined in other modules (excluding
built-in org.ek9.* modules). Includes field types, method parameters, return types,
super classes, and implemented traits.
What's excluded: References to org.ek9.lang built-in types are not counted
as external dependencies — they're part of the language runtime.
Module Cohesion (E11017)
Measures how well the constructs within a module relate to each other. Uses connected component analysis (similar to LCOM4) to detect modules with many unrelated construct groups that should potentially be split into separate modules.
| Error Code | Metric | Threshold | What It Means |
|---|---|---|---|
| E11017 | Disconnected Groups | Max: 30 groups AND >88% | Module has too many unrelated construct groups |
How it works: Constructs that reference each other (through field types, method parameters, inheritance, etc.) are connected. Disconnected groups are clusters of constructs with no references between them. Many disconnected groups suggest the module could be split.
Exemptions: Small modules (<60 constructs) are exempt — module cohesion is only meaningful for large modules where fragmentation is a real concern.
Thresholds: Uses a hybrid approach requiring BOTH absolute (>30 groups) AND relative (>88% disconnected) thresholds to be exceeded. This allows legitimate utility modules with many independent constructs while catching truly fragmented modules.
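The hybrid decision rule can be sketched directly from the numbers above. (Interpreting ">88% disconnected" as disconnected groups divided by total constructs is an assumption made for this illustration.)

```python
# Illustrative E11017 decision: fail only when BOTH thresholds are exceeded,
# and only for modules large enough to be checked at all.
def violates_module_cohesion(total_constructs, disconnected_groups):
    if total_constructs < 60:   # small modules are exempt
        return False
    ratio = disconnected_groups / total_constructs  # assumed definition of "% disconnected"
    return disconnected_groups > 30 and ratio > 0.88

print(violates_module_cohesion(50, 45))   # False - small module, exempt
print(violates_module_cohesion(100, 40))  # False - only 40% disconnected
print(violates_module_cohesion(100, 95))  # True - 95 groups and 95% disconnected
```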
4.5 Inheritance Depth Limits
Ever traced through 7 levels of inheritance to understand a single method? Deep hierarchies create "yo-yo" code navigation and fragile base class problems. EK9 enforces max depth of 4 for classes at compile time.
Why Inheritance Depth Limits Matter: The Evidence
- Chidamber & Kemerer (1994): Depth of Inheritance Tree (DIT) > 5 correlated with 2.4x higher defect rates
- Basili et al. (1996): Classes with DIT > 4 had 35% more faults and took 25% longer to debug
- Gamma et al. (1994): "Favor composition over inheritance" - Design Patterns fundamental principle
- Bloch (2008): Effective Java Item 16: "Composition over inheritance" prevents fragile base class problem
Real-World Example
Java's early AWT/Swing used deep hierarchies (7+ levels). Fragile base class problem: changing Component broke dozens of subclasses. Modern frameworks (React, Flutter) favor composition.
EK9's Solution
Deep inheritance hierarchies make code hard to understand because you must trace through
many levels to understand behavior. EK9 limits inheritance depth and encourages
delegation via the by keyword as an alternative.
| Error Code | Construct | Max Depth | Rationale |
|---|---|---|---|
| E11019 | Class | 4 | Standard aggregate |
| E11019 | Trait | 4 | Interface composition chains |
| E11019 | Component | 4 | Entry point specialization |
| E11019 | Function | 3 | Function inheritance is rare |
| E11019 | Record | 2 | Data-focused, minimal behavior |
Better Alternative: Instead of deep inheritance, use composition with the by
keyword to delegate behavior:
#!ek9
defines module delegation.example
defines trait
Printable
print() as abstract
defines class
ConsolePrinter is Printable
override print()
stdout.println("Console output")
// Instead of: class MyClass extends DeepHierarchy
// Use delegation:
MyClass with trait of Printable by printer
printer as Printable: ConsolePrinter()
4.6 Combined Complexity and Size (Novel)
EK9 introduces a novel combined metric that catches the "both moderately high" scenario that passes individual complexity (E11010) and statement count (E11012) limits but represents a maintainability risk.
Research Basis: NASA Software Assurance Technology Center studies explicitly found that modules with both high complexity AND large size have the lowest reliability. Individual limits miss this critical combination.
Formula
combined = (complexity / maxComplexity) × (statements / maxStatements)
threshold = 0.50
This creates a curved boundary — you can be high in one dimension if you're low in the other:
| Complexity | % | Statements | % | Combined | Result |
|---|---|---|---|---|---|
| 31/45 | 69% | 107/150 | 71% | 0.49 | ✓ passes |
| 32/45 | 71% | 108/150 | 72% | 0.51 | ✗ E11020 |
| 40/45 | 89% | 80/150 | 53% | 0.47 | ✓ passes |
| 45/45 | 100% | 75/150 | 50% | 0.50 | ✓ boundary |
Why This Approach is Novel
- SonarQube: Uses AND logic (independent conditions) — both must fail
- Microsoft MI: Uses arbitrary coefficients from 1990s research
- EK9: Product-based threshold directly implements what NASA research shows matters
Target: Both metrics below 70% of their limits achieves combined ≤0.49. The product formula naturally encourages balanced reduction — reducing one metric by 15% AND the other by 15% is more effective than reducing one metric by 30% alone.
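The table above can be reproduced directly from the formula — a small Python sketch using the thresholds stated in this section:

```python
# Combined complexity/size metric: product of the two utilisation ratios.
MAX_COMPLEXITY, MAX_STATEMENTS, THRESHOLD = 45, 150, 0.50

def combined(complexity, statements):
    return (complexity / MAX_COMPLEXITY) * (statements / MAX_STATEMENTS)

# Reproduces the boundary cases from the table.
for c, s in [(31, 107), (32, 108), (40, 80), (45, 75)]:
    score = combined(c, s)
    verdict = "E11020" if score > THRESHOLD else "passes"
    print(f"{c}/{MAX_COMPLEXITY}, {s}/{MAX_STATEMENTS}: {score:.2f} -> {verdict}")
```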
4.7 Dependency and Versioning Integrity
EK9 extends its quality enforcement beyond your source code to the entire dependency graph of your project. This is handled by the integrated build system, which provides compile-time guarantees for architectural soundness and supply chain stability.
Strict Semantic Versioning
The compiler understands and enforces semantic versioning. While it resolves dependencies to the highest compatible version, it performs a critical safety check: if a selected dependency has a higher major version than another required version of the same dependency in the project, the build will fail. This prevents subtle and dangerous runtime errors by ensuring that breaking API changes (as signified by a major version increment) are never implicitly pulled into your project.
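A minimal sketch of this safety check in Python. Version parsing is simplified, the dependency name is hypothetical, and real semantic versioning also covers pre-release and build metadata:

```python
# Illustrative major-version safety check: resolve to the highest compatible
# version, but fail the build if required versions span different majors.
def major(version):
    return int(version.split(".")[0])

def check_dependency(name, required_versions):
    majors = {major(v) for v in required_versions}
    if len(majors) > 1:  # a major bump signals breaking API changes
        raise ValueError(f"{name}: conflicting major versions {sorted(majors)}")
    return max(required_versions, key=lambda v: [int(p) for p in v.split(".")])

print(check_dependency("net.json.parser", ["2.1.0", "2.4.3"]))  # 2.4.3 - highest compatible
# check_dependency("net.json.parser", ["2.4.3", "3.0.0"])  # would fail the build
```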
Circular Dependency Detection
The EK9 compiler builds a complete dependency graph of all modules in a project. It then traverses this graph to detect and report any circular dependencies (e.g., Module A depends on B, and B depends on A). This check prevents irresolvable build deadlocks and ensures a clean, directed architectural structure.
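Cycle detection over a dependency graph is typically done with a depth-first search and three-colour marking — a minimal Python sketch (module names are hypothetical):

```python
# Illustrative circular-dependency detection: a GRAY node seen again on the
# current DFS path is a back edge, i.e. a cycle.
def has_cycle(graph):
    """graph: dict of module -> list of modules it depends on (all keys present)."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {m: WHITE for m in graph}

    def visit(m):
        color[m] = GRAY            # on the current path
        for dep in graph[m]:
            if color[dep] == GRAY:  # back edge -> cycle
                return True
            if color[dep] == WHITE and visit(dep):
                return True
        color[m] = BLACK           # fully explored
        return False

    return any(color[m] == WHITE and visit(m) for m in graph)

acyclic = {"app": ["service"], "service": ["model"], "model": []}
cyclic = {"a": ["b"], "b": ["a"]}
print(has_cycle(acyclic))  # False
print(has_cycle(cyclic))   # True
```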
4.8 Naming Quality Enforcement
New to EK9's quality enforcement? You might find these naming checks feel stricter than what you're used to in other languages. That's intentional - and there's solid research behind it.
The Good News: These checks catch bugs before they happen. Developers report that after the initial adjustment period (typically 1-2 weeks), they write clearer code naturally and rarely trigger these errors.
EK9 enforces three naming quality checks at compile-time. These are unique to EK9 - no other language does this:
- E11026: Reference Ordering - Keep references sorted alphabetically [Easy to fix]
- E11030: Similar Variable Names - Avoid confusing names like data vs dat [Takes practice]
- E11031: Non-Descriptive Names - Use meaningful names instead of temp, flag, data [Takes practice]
Why these checks exist: Research shows they prevent bugs with 2.7x-3.4x higher defect rates. We're not being strict for the sake of it - we're preventing real failures that have cost $125M+ and 95+ lives.
Quick Navigation: Jump to E11026, E11030, E11031, or skip to what this means for you.
Why Naming Quality Matters (The Evidence)
You might be thinking: "My variable names are fine. Why is the compiler complaining?" Fair question. Here's what decades of research and real-world failures have shown:
What Academic Research Found
- Lawrie et al. (IEEE 2006): Non-descriptive names increased comprehension time by 19-31% - that's 2-3 hours per 10-hour workday spent just understanding code
- Butler et al. (ICPC 2010): Generic names correlated with 2.7x higher defect density - files with temp, flag, data had nearly 3x more bugs
- Hofmeister et al. (MSR 2017): Generic names appeared in 3.2x more bug reports - teams spent more time fixing bugs in poorly-named code
- Binkley et al. (IEEE TSE 2013): Similar names caused 2.5x more eye fixations - developers got confused and made more mistakes
- Empirical Software Engineering (2023): Analyzed 116 million variable pairs - similar names consistently identified as significant bug source
Real-World Failures (This Actually Happened)
These aren't theoretical concerns - poor naming has caused catastrophic failures:
- Mars Climate Orbiter ($125 million loss, 1999): Variable named simply thrust using bare Float without unit indicators. Led to metric/imperial confusion. Spacecraft destroyed on Mars approach.
- Therac-25 Radiation Therapy Machine (6 deaths, 1985-1987): Boolean variable named simply flag masked critical state condition in race condition code. Developers couldn't tell what state it represented. Radiation overdoses killed cancer patients.
- Toyota Unintended Acceleration (89 deaths, 2000s): 11,000+ global variables with generic names like state, mode. Same name used 94 times with different meanings across the codebase. Cars accelerated uncontrollably.
- Ariane 5 Rocket Explosion ($370 million, 1996): Variables named BH (Horizontal Bias) - cryptic abbreviations. Investigation: "Variable names provided no indication of expected value ranges." Rocket exploded 37 seconds after launch.
The Pattern: In every case, developers thought their variable names were "good enough." In every case, someone later (often years later) got confused by the generic names and introduced a fatal bug. Total cost: $620M+ in losses, 95+ deaths.
How This Helps AI Assistants (Modern Benefit)
As you use AI coding assistants (Claude, GitHub Copilot, etc.), meaningful names become even more important:
- 5x fewer tokens needed: AI can understand code from local context without searching for declarations
- Better code generation: AI generates correct code when variable purposes are clear
- Fewer errors: Generic names like data/dat confuse AI's attention mechanisms, causing it to reference the wrong variable
| Check | CheckStyle/SonarQube | EK9 | Benefit to You |
|---|---|---|---|
| E11026 Reference Ordering | ⚠️ Warning (optional, can suppress) | ✅ Compile Error (mandatory) | Fewer merge conflicts, cleaner diffs |
| E11030 Similar Names | ❌ Not available | ✅ Compile Error (UNIQUE) | Catch typos before they become bugs |
| E11031 Non-Descriptive Names | ❌ Not available | ✅ Compile Error (UNIQUE) | Code clarity from day one, no tech debt |
4.8.1 E11026: Reference Ordering [Easy to Fix]
What this check does: Requires your references block to be alphabetically sorted.
Why you might see this error: You added a new reference but didn't sort the list. This is the easiest quality check to fix - just reorder the lines.
Step-by-Step Fix:
- Look at your references block
- Sort the lines alphabetically by the full module name
- Recompile - that's it!
Example (Before - Compiler Error):
#!ek9
@Error: SYMBOL_DEFINITION: REFERENCES_NOT_ALPHABETICALLY_ORDERED
references
  com.example.util.Helper // ❌ 'util' comes after 'service' alphabetically
  com.example.service.Client // Out of order
  com.example.data.Model // Out of order
After (Fixed):
#!ek9
references
  com.example.data.Model // ✓ 'data' first
  com.example.service.Client // ✓ 'service' second
  com.example.util.Helper // ✓ 'util' last
Benefits You'll Notice:
- Fewer merge conflicts: Research shows 5-10% reduction when multiple developers add references
- Faster scanning: You can instantly find or verify a reference exists
- AI-friendly: AI assistants understand consistent patterns better
Pro Tip: Most IDEs have "Sort Lines" commands. In VS Code: select lines → Command Palette → "Sort Lines Ascending". Done in 2 seconds!
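Conceptually, the E11026 check is just a sort comparison. A minimal Python sketch using the module names from the example above:

```python
# Illustrative E11026 check: report the first reference that breaks
# alphabetical order, or None when the list passes.
def first_unordered(references):
    for prev, cur in zip(references, references[1:]):
        if cur < prev:
            return cur
    return None

refs = ["com.example.util.Helper", "com.example.service.Client", "com.example.data.Model"]
print(first_unordered(refs))          # com.example.service.Client - out of order
print(first_unordered(sorted(refs)))  # None - passes E11026
```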
4.8.2 E11030: Similar Variable Names [Takes Practice]
What this check does: Prevents variables that differ by only 1-2 characters when they have the same type.
Why you might see this error: You used similar names like data and dat,
or userId and usersId. Easy to do accidentally - but also easy to confuse later!
Common Scenario: You're writing code quickly and abbreviate a variable name to save typing.
Three months later (or someone else reviewing your code), they see both data
and dat and can't tell which is which. Silent bug introduced.
Example (Before - Compiler Error):
#!ek9
@Error: PRE_IR_CHECKS: CONFUSINGLY_SIMILAR_NAMES
calculate()
  data <- "customer info"
  dat <- "order info" // ❌ Similar to 'data' - both String
  result := processData(dat) // BUG: Did you mean 'data'?
After (Fixed - Much Clearer):
#!ek9
calculate()
  customerData <- "customer info"
  orderData <- "order info" // ✓ Clearly distinct
  result := processData(orderData) // ✓ No ambiguity
Why "Same Type" Matters:
If the variables have different types, the compiler catches mistakes:
#!ek9
data as String
date as Date // ✓ Different types - compiler catches if you swap them
But with the same type, there's no safety net:
#!ek9
data as String
dat as String // ❌ Both String - compiler CAN'T catch if you swap them
Patterns That ARE Allowed (Not Errors):
- Constructor field shadowing: MyClass(userId) { this.userId := userId } is allowed - common pattern
- Different types: data as String and date as Date are allowed - type system provides safety
- Loop counters: i, j, k are allowed - conventional and short scope
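One way to picture the E11030 check: flag same-typed variables whose names are within a small edit distance of each other. The Python sketch below uses plain Levenshtein distance with a cutoff of 2 — an assumption for illustration; the compiler's actual heuristics (scope, shadowing, loop counters) are richer:

```python
# Levenshtein edit distance via the classic two-row dynamic programme.
def edit_distance(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def confusingly_similar(name_a, type_a, name_b, type_b):
    # Only same-typed pairs are risky - different types are caught by the compiler.
    return type_a == type_b and 0 < edit_distance(name_a, name_b) <= 2

print(confusingly_similar("data", "String", "dat", "String"))   # True - flagged
print(confusingly_similar("data", "String", "date", "Date"))    # False - types differ
print(confusingly_similar("customerData", "String", "orderData", "String"))  # False - distinct
```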
Benefits You'll Notice:
- Catch typos immediately: Accidentally typed userData instead of userDate? Compiler catches it.
- Code reviews go faster: Reviewers don't have to guess which variable you meant
- AI assistance improves: AI tools generate correct code when variables are clearly distinct
First-Time Experience: This might feel strict at first. After a week or two, you'll find yourself naturally choosing more distinctive names. Many developers report this becomes second nature quickly.
4.8.3 E11031: Non-Descriptive Names [Takes Practice]
What this check does: Prevents using six generic variable names that research has linked to higher bug rates.
Why you might see this error: You used temp, flag, data,
value, buffer, or object (or their abbreviations). These seem harmless,
but research shows they correlate with 2.7x-3.4x more bugs.
Common Developer Thought Process:
"I'll just call it temp for now while I'm prototyping.
I'll rename it to something better later."
Sound familiar? Here's the problem: "later" rarely comes.
The code ships with temp, someone else maintains it, and they have no idea
what "temp" represents. Bug introduced.
The Banned Names (and Why):
| Name | Defect Rate | The Problem | Use Instead |
|---|---|---|---|
| temp, tmp | 3.1x more bugs | What is temporary? | temporaryUser, calculationResult |
| flag, flg | 3.4x more bugs | What state? | isValid, hasPermission, wasProcessed |
| data, dat | 2.8x more bugs | What KIND of data? | customerData, configData, requestPayload |
| value, val | 2.9x more bugs | What does it represent? | price, count, threshold |
| buffer, buf | 2.7x more bugs | Buffer for WHAT? | inputBuffer, responseBuffer |
| object, obj | 3.0x more bugs | What IS this? | user, order, session |
Example (Before - Compiler Error):
#!ek9
@Error: PRE_IR_CHECKS: NON_DESCRIPTIVE_VARIABLE_NAME
process()
  temp <- "hello" // ❌ What is temporary? User? Message? Session?
  flag <- true // ❌ What state? Valid? Active? Ready?
  data <- fetch() // ❌ What kind of data? User? Order? Config?
After (Fixed - Self-Documenting):
#!ek9
process()
  temporaryUser <- "hello" // ✓ Clear: temporary user record
  isPaymentValid <- true // ✓ Clear: payment validation state
  customerData <- fetch() // ✓ Clear: customer data from database
Don't Worry - Common Patterns ARE Allowed:
1. Mathematical Variables (Always Allowed):
In math/physics code, conventional names are clear:
#!ek9
calculateTrajectory()
  x <- 3.5 // ✓ Cartesian coordinate
  y <- 2.0 // ✓ Cartesian coordinate
  r <- sqrt(x*x + y*y) // ✓ Polar radius
  theta <- atan2(y, x) // ✓ Polar angle
  func <- getTransformFunction() // ✓ Function reference
2. Loop Counters (Allowed in FOR Loops):
#!ek9
processMatrix()
for i in 0 to 9 // ✓ Standard loop counter
for j in 0 to 9 // ✓ Nested loop counter
matrix[i][j] <- 0
3. Compound Words (Allowed When Specific):
#!ek9
handleConnection()
  connectionState <- "active" // ✓ 'state' is suffixed - specific
  errorHandler <- Handler() // ✓ 'handler' is suffixed - specific
4. Strong Typing Reduces Naming Burden:
When you use EK9's rich type system, simple names are fine:
#!ek9
// GOOD: Type carries the semantics
processPayment()
  amount as Money <- Money(100.0, "USD") // ✓ Money type makes 'amount' clear
calculatePhysics()
  thrust as Dimension <- Dimension(500.0, "N") // ✓ Dimension enforces units
But with bare types, be descriptive:
#!ek9
// BETTER with bare types:
processPayment()
  paymentAmount as Float <- 100.0 // ✓ Descriptive with bare Float
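At its core, the E11031 check is membership in the banned-name list, with the exemptions described above. A deliberately simplified Python sketch — it omits the math-variable and loop-counter exemptions, which the compiler applies by context:

```python
# Illustrative E11031 check: the six banned base names and their abbreviations.
# Compound names like connectionState pass because the full name is not in the set.
BANNED = {"temp", "tmp", "flag", "flg", "data", "dat",
          "value", "val", "buffer", "buf", "object", "obj"}

def non_descriptive(name):
    return name.lower() in BANNED

print(non_descriptive("temp"))             # True - E11031
print(non_descriptive("connectionState"))  # False - compound, specific
print(non_descriptive("customerData"))     # False - descriptive
```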
Benefits You'll Notice:
- Self-documenting code: Six months later, you'll thank yourself for clear names
- Onboarding is faster: New team members understand code immediately
- Fewer bugs: Research shows 2.7x-3.4x reduction in defect density
- Better AI assistance: AI uses ~5x fewer tokens with meaningful names
Adjustment Period: This might be the hardest check to get used to. You're breaking a habit. After 1-2 weeks of practice, most developers report they think more clearly about variable purpose before naming, not after. This leads to better design overall.
What This Means for You
Key Takeaways:
- EK9 enforces three naming quality checks that prevent real bugs
- Research shows these checks reduce defects by 2.7x-3.4x
- High-profile failures prove importance: $620M+ in losses, 95+ deaths from poor naming
- Modern benefit: AI assistants work 5x better with meaningful names
How EK9 Is Different:
| Check | Other Languages | EK9 |
|---|---|---|
| E11026 Reference Ordering | CheckStyle warning (optional) | Compile error (mandatory) |
| E11030 Similar Names | Not available | UNIQUE to EK9 |
| E11031 Non-Descriptive Names | Not available | UNIQUE to EK9 |
The EK9 Guarantee: If your EK9 code compiles, it has passed alphabetical reference ordering, similar name detection, and non-descriptive name checks. No configuration required. No warnings to ignore. No bypass mechanisms. Just quality code, guaranteed.
Getting Started (Your First Week):
New to these checks? Here's a realistic adjustment timeline:
- Day 1-2: You'll hit these errors frequently. Use the -E3 flag for detailed help. Read the suggestions. This is normal!
- Day 3-5: You start thinking about variable names before typing. Error frequency drops significantly.
- Week 2: It becomes natural. You rarely trigger these errors because you're thinking about clarity from the start.
- Month 1: You notice your code is more readable. Code reviews are faster. Onboarding new team members is easier.
Remember: Every EK9 developer went through this adjustment. You're not alone. The strictness feels unusual at first, then it feels liberating - you never have to debate naming standards because the compiler is the standard.
Next Steps:
- See the errors in action: E11026, E11030, E11031 have detailed examples and fixes
- Use verbose mode: Add -E3 to your compile command for detailed explanations and suggestions
- Explore related checks: Learn about complexity limits, cohesion metrics, and coupling enforcement
Purity, Safety, and Contracts
EK9's pure keyword isn't just about functional programming correctness —
it enables more accurate quality metrics and better optimization.
Hard Requirements as Contracts
A key part of EK9's safety model is the require statement. Unlike in some languages where assertions are
disabled in production code, EK9 requirements are hard requirements that are always active.
They generate a dedicated REQUIRE instruction in the Intermediate Representation, ensuring they become
a permanent part of the compiled code. This powerful feature enables a form of "Design by Contract", where functions
can enforce that critical pre-conditions or post-conditions are met, guaranteeing program correctness at runtime.
How Purity Enables Quality
- Eliminates hidden coupling: Pure functions cannot depend on mutable global state, so all dependencies are explicit in the signature
- Accurate coupling measurement: Since there are no hidden dependencies, Ce (efferent coupling) accurately reflects true coupling
- Enables parallelization: Pure functions can be called in parallel without synchronization, enabling the compiler to optimize for multi-core processors
- Simplifies testing: Pure functions are deterministic — same inputs always produce same outputs, making them trivially testable
EK9 implements a three-tier purity model:
- Compile-time enforcement: The pure keyword is checked by the compiler
- Pragmatic I/O: I/O operations are tracked but allowed in non-pure contexts
- Controlled mutation: Mutation is explicit and contained, not hidden
Operator Semantic Constraints
Beyond structural metrics, EK9 enforces semantic constraints on operators at compile-time. This ensures operators behave correctly and consistently across all types. See Operators for the full operator reference.
Return Type Constraints
Each operator category has strict return type requirements:
| Operator Category | Operators | Required Return Type | Error Code |
|---|---|---|---|
| Comparison | <, <=, >, >=, ==, <> | Boolean | E07520 |
| Comparator | <=>, <~> | Integer | E07550 |
| Is-Set | ? | Boolean | E07520 |
| Hash Code | #? | Integer | E07550 |
| String Conversion | $ | String | E07570 |
| JSON Conversion | $$ | JSON | E07580 |
| Complement/Absolute | ~, abs | Same as construct type | E07410 |
| Promotion | #^ | Different from construct type | E07420 |
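The comparator contract (<=> returning a negative, zero, or positive Integer) can be sketched in Python (an illustration of the convention, not EK9 code):

```python
def spaceship(a: int, b: int) -> int:
    # Analogue of EK9's <=> comparator: the return value is an
    # integer whose sign carries the ordering.
    return (a > b) - (a < b)
```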
Purity Constraints
Operators are divided into pure (no side effects) and mutating categories:
- Must be pure (E07500): All comparison operators, arithmetic operators, conversions, and queries
- Cannot be pure (E07510): Mutating operators like :=: (copy), ++, --, +=, -=
Defaulting Requirements
When using the default keyword to auto-generate comparison operators, the compiler
validates that all required underlying operators exist:
- Comparison operators (<, <=, >, >=, ==, <>) require the <=> comparator on the type and all properties (E07180, E07200)
- Hash code (#?) requires hash code on all properties
- String conversion ($) requires string conversion on all properties
This ensures that defaulted operators have correct semantics — a comparison can only be auto-generated if the type knows how to compare all its parts.
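Python's dataclasses offer a loose analogy (an illustration only, not EK9 semantics): with order=True, comparison methods are generated field by field, so every field must itself support ordering, much as EK9 requires <=> on all properties before defaulting comparison operators.

```python
from dataclasses import dataclass

@dataclass(order=True)
class Version:
    # Auto-generated comparisons walk the fields in declaration
    # order, so each field must itself be comparable.
    major: int
    minor: int
```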
Parameter Safety
EK9 enforces parameter immutability at compile-time, preventing a common source of bugs where function parameters are accidentally modified.
Parameter Count Limits
Functions and methods have a hard limit of 20 parameters. Exceeding this triggers E11010 (Excessive Complexity). Additionally, parameter count contributes to complexity scoring:
- 3+ parameters: adds 1 complexity point
- 5+ parameters: adds 2 complexity points
- More than 20 parameters: compilation error (E11010)
This encourages grouping related parameters into records, improving API design.
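The scoring rules above can be sketched as a small helper (the function name and shape are illustrative, not an EK9 compiler API):

```python
def parameter_complexity(param_count: int) -> int:
    # Complexity points contributed by parameter count, per the
    # thresholds above; more than 20 parameters is a hard E11010 error.
    if param_count > 20:
        raise ValueError("E11010: more than 20 parameters")
    if param_count >= 5:
        return 2
    if param_count >= 3:
        return 1
    return 0
```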
No Parameter Reassignment
Function and method parameters cannot be reassigned within the function body (E08110). This eliminates bugs where the original parameter value is lost:
```ek9
#!ek9
@Error: FULL_RESOLUTION: NO_INCOMING_ARGUMENT_REASSIGNMENT
processData()
  -> data as Integer
  <- result as Integer?
  data := data + 1  //Error: cannot reassign parameter
  result: data * 2
```
```ek9
#!ek9
processData()
  -> data as Integer
  <- result as Integer?
  adjusted <- data + 1  //Create new variable instead
  result: adjusted * 2
```
Why This Matters
- Prevents shadowing bugs: Original parameter value is always available
- Improves readability: Parameters are guaranteed to hold their original values
- Enables optimization: Compiler knows parameters won't change, enabling better code generation
- Matches functional style: Aligns with EK9's support for pure functions
ARI Readability Scores (Informational)
The Automated Readability Index (ARI) measures lexical complexity — how easy your code is to read at the character/word level. Unlike complexity metrics that measure control flow, ARI evaluates:
- Identifier length — Longer variable/function names increase ARI
- Statement structure — More tokens per line increases ARI
- Whitespace usage — Consistent indentation improves readability
No Hard Limit: Unlike other quality metrics, ARI does not produce compile-time errors. It is shown in coverage reports as an informational aid to help developers understand their code's lexical density.
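For intuition, the classical ARI formula over characters, words, and sentences looks like this in Python (EK9 adapts the idea to source tokens and clamps the result to a 1-20 scale, so the exact details here are an assumption):

```python
def classical_ari(characters: int, words: int, sentences: int) -> float:
    # Classical Automated Readability Index: denser words and longer
    # sentences both raise the score.
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
```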
Why No Enforcement?
Domain-specific terminology often requires long, precise identifiers. A finance application
using calculateAmortizedPrincipalPayment or a chemistry module using
hydrochlorofluorocarbon has high ARI scores by necessity — these names
are correct and meaningful in their domain. Forcing shorter names would harm clarity.
Score Interpretation (1-20 scale)
| ARI Score | Typical Domain | Interpretation |
|---|---|---|
| 1-6 | General utilities, simple CRUD | Simple lexical structure |
| 7-10 | Business logic, web services | Moderate complexity |
| 11-14 | Finance, healthcare, legal | Domain terminology expected |
| 15-20 | Scientific computing, research | Complex domain — acceptable |
Why Track Readability?
Even without enforcement, visibility helps developers:
- Compare files — A utility module with ARI 15 may need review; a physics module with ARI 15 is normal
- Identify outliers — Unusually high ARI in simple code suggests naming improvements
- AI collaboration — Lower ARI improves AI code generation accuracy (fewer tokens per concept)
Human and AI Benefits
For humans: Readability scores highlight files that may benefit from clearer naming (where domain permits). For AI assistants: Lower ARI means fewer tokens per concept, improving code generation accuracy and reducing context window usage.
Viewing Quality Metrics in Reports
EK9 displays quality metrics alongside coverage data in the
HTML coverage reports generated with -t6.
This section explains what each visual element means.
Dashboard Quality Dials
The main dashboard (index.html) shows four quality metric dials:
| Dial | What It Shows | Good Range |
|---|---|---|
| Avg Complexity | Average cyclomatic complexity across all functions | ≤10 (green) |
| Max Complexity | Highest cyclomatic complexity of any single function | ≤20 (green), 21-30 (orange) |
| Avg Cognitive | Average cognitive complexity across all functions | ≤10 (green) |
| Max Cognitive | Highest cognitive complexity of any single function | ≤15 (green), 16-25 (orange) |
The dial fill indicates the metric value relative to the compile-time limit. For example, "15 / 45 limit" means the function has CC=15 against the E11010 limit of 45.
Module and File Quality Metrics
Module detail pages and file summary pages show the same four dials, calculated for that specific scope (all functions in the module, or all functions in the file).
Source Code Complexity Badges
In the source code view, each function displays a complexity badge like CC:12.
Badge colors indicate severity:
| Color | CC Range | Meaning |
|---|---|---|
| Green | ≤10 | Good, maintainable |
| Orange | 11-20 | Consider refactoring |
| Red | >20 | High complexity, should refactor |
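The colour bands in the table above map directly onto a small function (illustrative only, not part of the EK9 toolchain):

```python
def badge_colour(cc: int) -> str:
    # Badge colour bands for cyclomatic complexity, per the table above.
    if cc <= 10:
        return "green"
    if cc <= 20:
        return "orange"
    return "red"
```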
Complexity Badge Tooltips
Hover over any complexity badge to see detailed metrics for that function:
```
Code Quality Metrics
Cyclomatic Complexity: 12 / 45 limit  ✓ OK
  → Number of independent paths through code
Cognitive Complexity:   8 / 35 limit  ✓ OK
  → How hard the code is to understand
Nesting Depth:          3 / 6 limit   ✓ OK
  → Maximum depth of nested control structures
Status: All metrics within acceptable range
```
Status indicators:
- ✓ OK — Metric is within good range
- ℹ Monitor — Metric is elevated but acceptable
- ⚠ Warning — Metric needs attention
File Readability Bars
On module pages, each source file shows a readability bar alongside its coverage bar:
```
Source Files
┌───────────────────────────────────────────────────────────────┐
│ utils.ek9       87.5% ████████░░ 14/16  [Readability: 6]      │
│ helpers.ek9     60.0% ██████░░░░  3/5   [Readability: 8]      │
│ validators.ek9  33.3% ███░░░░░░░  2/6   [Readability: 12]     │
└───────────────────────────────────────────────────────────────┘
```
The readability number is the ARI score (1-20 scale). Lower is simpler.
Generating Reports
To generate the HTML quality and coverage report:
```
ek9 -t6 main.ek9
```
The report is created in .ek9/coverage/. Open index.html in a
browser to view. See HTML Coverage Reports for
complete documentation of the report structure and navigation.
Impact on AI-Assisted Development
As AI coding assistants become more prevalent, EK9's compile-time quality enforcement provides unique benefits for human-AI collaboration.
The AI Code Quality Problem
AI coding assistants (like GitHub Copilot, Claude, and others) can generate code quickly, but often produce code with quality issues:
- Excessive complexity: AI tends to generate long functions with many conditions
- High coupling: AI may add unnecessary dependencies to solve problems quickly
- Duplication: AI doesn't always recognize when code should be refactored into shared utilities
- Deep hierarchies: AI may follow inheritance patterns from training data
How EK9 Helps
EK9's compiler provides immediate feedback that helps both humans and AI improve:
- Constraint-based learning: AI learns from compiler errors, improving suggestions over time
- Guaranteed maintainability: AI-generated code that compiles is guaranteed to meet quality standards
- Objective standards: No debate about code quality — the compiler is the arbiter
- Verbose errors: The -E3 flag provides detailed explanations that help AI understand issues
Enterprise Benefits
For enterprise teams adopting AI coding assistants, EK9 provides crucial safeguards:
- Zero configuration: No need to set up and maintain external quality tools
- Consistent standards: Same quality bar across all projects and teams
- Faster onboarding: New developers (and AI) learn good design from compiler feedback
- Technical debt prevention: Cannot accumulate quality debt through AI-generated code
Complete AI Reference: The For AI Assistants page provides comprehensive documentation for AI coding assistants, including: JSON output schemas, metric thresholds with recommended actions, EK9-specific refactoring patterns, and workflow templates for iterative quality improvement.
Implementation Roadmap
EK9's quality enforcement is being implemented in phases:
Completed (2024-2025)
- Phase 1: Complexity Limits (E11010, E11011, E11013, E11021) — functions, nesting, expressions, cognitive complexity
- Phase 2A: Cohesion Metrics (E11014) — LCOM4 for classes and components
- Phase 2B: Coupling Metrics (E11015) — efferent coupling for all construct types
- Phase 2C: Inheritance Depth (E11019) — all construct types with construct-specific thresholds
- Phase 2D: Size Limits NCSS (E11012) — statement counts for functions (150), methods (100), operators (50)
- Phase 2E: Combined Complexity×Size (E11020) — novel product-based threshold catching "both moderately high" scenario (research: NASA SATC)
- Phase 2F: Construct Misuse Detection (E11022-E11025) — detects class/component/delegation misuse patterns including Blob/God Class (research: Martin SRP, Brown AntiPatterns, Marinescu IEEE ICSM 2004)
- Phase 2G: Module-Level Metrics (E11016, E11017) — module coupling (>10 external modules) and module cohesion (disconnected construct groups) for large modules
- Phase 2H: Naming Quality Enforcement (E11026, E11030, E11031) — alphabetical reference ordering, confusingly similar names (Levenshtein distance ≤2), and non-descriptive variable names (research-backed bans with 2.7x-3.4x defect correlation). UNIQUE features no other language provides
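The "confusingly similar names" check (E11030) is based on Levenshtein edit distance. A standard dynamic-programming implementation in Python (illustrative, not the compiler's own code):

```python
def levenshtein(a: str, b: str) -> int:
    # Classic edit distance; identifiers within distance <= 2 of each
    # other would be flagged as confusingly similar.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]
```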
Planned (2026)
- Phase 3A: Duplicate Code Detection — IR-based similarity analysis (>70% = error)
- Phase 3B: Unused Capture Detection (E11018) — detect captured variables in dynamic functions that are never used, indicating potential bugs or unnecessary memory retention
Available Now (2025)
- Phase 4A: Input Security Sanitization — Built-in runtime sanitization for SQL injection, XSS, command injection, path traversal, XXE, and SSTI attacks. See Security Types for Sanitizer, DefaultSanitizer, CefSanitizer, and SanitizationContext
Future (2027+)
- Phase 4B: Compile-Time Security Pattern Enforcement — hardcoded credentials detection, insecure configuration patterns
- Phase 5: Style Enforcement — consistent formatting as part of compilation
Deferred (Requires IR Generation Completion)
- Phase 2D: Afferent Coupling (Ca) — "who depends on this construct" analysis
- Phase 2E: Module Instability — Ce/(Ca+Ce) ratio calculations
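Martin's instability metric, which this phase would compute, is conventionally defined as I = Ce/(Ca+Ce). A minimal Python sketch (illustrative; the guard for the zero-dependency case is an assumption):

```python
def instability(ca: int, ce: int) -> float:
    # Robert Martin's instability metric: I = Ce / (Ca + Ce).
    # I = 0 -> maximally stable (only depended upon);
    # I = 1 -> maximally unstable (only depends on others).
    if ca + ce == 0:
        return 0.0
    return ce / (ca + ce)
```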
Summary
EK9's compile-time quality enforcement represents a fundamental shift in how programming languages approach code quality. By making quality mandatory rather than optional, EK9 eliminates technical debt at its source.
The EK9 Guarantee: If your code compiles, it has met rigorous quality standards for complexity, cohesion, coupling, and inheritance depth. There is no gray area, no accumulated warnings, and no technical debt.
Next Steps
- See the Error Index for detailed documentation of all quality error codes (E11010-E11019)
- Learn about composition and delegation as an alternative to deep inheritance
- Explore packaging to understand how EK9 manages dependencies
- Review command line options including the -E0 to -E3 error verbosity levels