This file contains project conventions, build steps, test patterns, and gotchas that are useful for AI agents working on the scheme-langserver codebase.
scheme-langserver is a Language Server Protocol (LSP) implementation for Scheme, written in Chez Scheme and managed with the Akku package manager.
Key subsystems:
- virtual-file-system/: File-node tree, library-node tree, documents, index-nodes
- analysis/: Tokenizer, abstract interpreter, identifier reference resolution, type inference, dependency graph (file-linkage)
- protocol/: LSP message parsing and API handlers
- util/: Shared utilities (matrix, dedupe, path, io, etc.)

The server supports multiple Scheme dialects: r6rs (default), r7rs, s7.
- Chez Scheme (scheme binary)
- Akku (akku binary) for dependency management

# Always source this before running anything
source .akku/bin/activate
This sets CHEZSCHEMELIBDIRS so Chez can find libraries under .akku/lib/ and
.akku/libobj/.
bash build.sh
This produces a static binary via compile-chez-program run.ss --static.
scheme --script run.ss
# or
./run
Tests use SRFI-64 ((srfi :64 testing)).
Boilerplate at the top of every test file:
#!/usr/bin/env scheme-script
;; -*- mode: scheme; coding: utf-8 -*- !#
;; Copyright (c) 2022 WANG Zheng
;; SPDX-License-Identifier: MIT
#!r6rs
(import
(chezscheme)
(srfi :64 testing)
...)
Basic pattern:
(test-begin "group-name")
(test-equal expected actual)
(test-equal #t (predicate? value))
(test-end)
(exit (if (zero? (test-runner-fail-count (test-runner-get))) 0 1))
Single test file (fast, preferred during development):
source .akku/bin/activate
scheme --script tests/analysis/dependency/test-file-linkage.sps
All tests (slow, run via test.sh):
bash test.sh
.so cache

Akku caches compiled .so files under .akku/libobj/. After editing any
.sls source file, delete the corresponding .so cache before running tests,
or you will see errors like:

- incompatible fasl-object version
- variable <name> is not bound

Safe incantation after editing analysis/**/*.sls:
rm -rf .akku/libobj/scheme-langserver
If the error persists, also clear workspace-level caches:
rm -f .akku/libobj/scheme-langserver/analysis/workspace.chezscheme.so
rm -f .akku/libobj/scheme-langserver/analysis/workspace.chezscheme.wpo
Workspace fixtures live under tests/resources/workspace-fixtures/<name>/.
A minimal fixture for testing workspace / linkage / identifier analysis:
tests/resources/workspace-fixtures/simple-lib/
├── lib.scm.txt # r6rs library source (renamed to .txt for txt-filter)
└── consumer.scm.txt # another library that imports lib
Use .scm.txt extension so generate-txt-file-filter accepts them.
Initialize in tests with:
(let* ([fixture (string-append (current-directory)
"/tests/resources/workspace-fixtures/simple-lib")]
[workspace (init-workspace fixture 'txt 'r6rs #f #f)]
...)
...)
init-workspace arguments:
- path: absolute path to fixture directory
- 'txt: use generate-txt-file-filter
- 'r6rs: top environment (also 'r7rs, 's7)
- #f: threaded? (use #f in tests)
- #f: type-inference? (use #f unless testing type inference)

Helper for locating children:
(find (lambda (child) (string=? (file-node-path child) expected-path))
(file-node-children root-file-node))
Or use walk-file for recursive lookup:
(walk-file root-file-node (string-append fixture "/math.scm.txt"))
Observed conventions in existing code:
- let / let* bindings: +4 spaces from the let keyword.
- let / let* body: +2 spaces from the let keyword.
- Nested let body: continue +2 per nesting level (do not flatten everything to the same column).

```scheme
(import
  (chezscheme)
  (srfi :64 testing))

(test-begin "group-name")
(let* ([foo (init-foo)]
    [bar (workspace-bar foo)]
    [baz (construct-baz bar)])
  (process-baz baz)
  (test-equal #t
    (contain? (type:interpret-result-list baz) check-base)))
(test-end)
```
- Identifiers: kebab-case
- Record accessors: <record>-<field> (e.g. file-node-path)
- Private helpers: private:<name> or just internal define
- Library names: (library (scheme-langserver <path>) ...).
- Comments: ; for inline, ;; for section dividers inside functions.

Prefer for-each over map when the result is discarded (side-effect only).
This is a common fix in the codebase.
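As a minimal sketch of the for-each guideline (the `visit!` helper and the file names are made up for illustration, not taken from the codebase):

```scheme
(import (chezscheme))

;; Hypothetical side-effecting helper: remembers every path it is given.
(define visited '())
(define (visit! path)
  (set! visited (cons path visited)))

;; Avoid: map builds a result list that nobody reads, and R6RS leaves the
;; order in which map applies the procedure unspecified.
;; (map visit! '("a.sls" "b.sls" "c.sls"))

;; Prefer: for-each applies the procedure from the first element to the last
;; and is used only for its side effects.
(for-each visit! '("a.sls" "b.sls" "c.sls"))

(display visited) ; prints (c.sls b.sls a.sls)
(newline)
```

Because `visit!` conses onto the front, the paths end up in reverse visiting order, which is only predictable because for-each guarantees left-to-right application.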
- string=? for strings
- equal? for lists / deep structures
- eq? for symbols and small integers
- = for numeric comparison only

| Layer | May import from |
|---|---|
| protocol/ | analysis/, virtual-file-system/, util/ |
| analysis/ | virtual-file-system/, util/ |
| virtual-file-system/ | util/ only |
| util/ | nothing inside the project (only standard libs) |
Never let analysis/ import protocol/.
directory-list returns bare filenames

(directory-list "/some/dir")
;; => ("foo.sls" "bar.sls") -- NOT full paths
Always prepend the directory when constructing child paths:
(string-append dir (string (directory-separator)) entry)
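The same idea can be wrapped in a small helper. This is only a sketch: `child-paths` is a hypothetical name, not an existing project function, and it assumes `dir` does not already end with a separator.

```scheme
(import (chezscheme))

;; Hypothetical helper: turn the bare names returned by directory-list
;; into paths that start with the directory they came from.
(define (child-paths dir)
  (map (lambda (entry)
         (string-append dir (string (directory-separator)) entry))
       (directory-list dir)))

;; List the current directory with the "." prefix attached to each entry.
(for-each (lambda (path) (display path) (newline))
          (child-paths "."))
```

Using map is appropriate here because the resulting list is the whole point of the call.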
source-file->annotations has two arities

- (source-file->annotations path): re-reads from disk
- (source-file->annotations source path): parses the provided string

Prefer the 2-arity version when you have already read the file into memory, to avoid double I/O.
Library files and script files are told apart by the (library (name) ...) header.

- With a header (library file), get-library-identifiers-list returns a non-empty list.
- Without one (script file), get-library-identifiers-list returns '().

This distinction affects:

- init-library-node: script files attach directly under the root library-node
- refresh-workspace-for: script files bypass the linkage graph and go straight to undiagnosed-paths

path->uri and uri->path

Located in util/path.sls. The URI format is file:///absolute/path.
path->uri now correctly handles . and .. in relative paths.
util/matrix.sls

- encode / decode use row-major order.
- matrix-expand grows a square matrix; matrix-shrink removes a row/column.
- The side length is (sqrt (vector-length matrix)).

inner:pair? vs inner:list? in the type system

In Scheme a list is a chain of pairs terminated by '(), so (cons x '()) is both a pair and a list. The langserver type system distinguishes them:
| Type | Meaning | Typical producers |
|---|---|---|
| inner:pair? | Any cons cell (proper or improper list) | cons, list (single element) |
| inner:list? | Proper list (chain of pairs ending in '()) | '(), append, reverse, list (≥0 elements) |
Trap: cons’s type rule in rnrs-meta-rules.sls returns inner:pair?, while append returns inner:list?. If you rewrite an accumulator loop from ``(append result `(,x))`` to (cons x result), the type inferrer sees the recursive argument as inner:pair? instead of inner:list?, which can break substitution generation for named-let bindings. The fix is to keep append, or to add a reverse at the return point and teach the type system that consing onto an inner:list? yields an inner:list?.
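To make the two accumulator shapes concrete, here is a small sketch. `squares-append` and `squares-cons` are hypothetical functions written for this note. At runtime they return the same list; the difference described above concerns only how the langserver's type inferrer classifies the accumulator.

```scheme
(import (chezscheme))

;; Shape 1: append keeps the accumulator a proper list at every step.
(define (squares-append xs)
  (let loop ([rest xs] [result '()])
    (if (null? rest)
        result
        (loop (cdr rest)
              (append result (list (* (car rest) (car rest))))))))

;; Shape 2: cons onto the accumulator and reverse once at the return point.
(define (squares-cons xs)
  (let loop ([rest xs] [result '()])
    (if (null? rest)
        (reverse result)
        (loop (cdr rest)
              (cons (* (car rest) (car rest)) result)))))

(display (squares-append '(1 2 3))) ; prints (1 4 9)
(newline)
(display (squares-cons '(1 2 3)))   ; prints (1 4 9)
(newline)
```

Shape 1 copies the accumulator on every iteration, so it is quadratic in the list length; shape 2 is linear but is the one the trap above is about.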
ufo-match wildcard

ufo-match uses :_ as the “match anything, don’t bind” wildcard, not _.
_ is treated as a normal pattern variable.
grep -r "library-import-process" tests/
Look at the (export ...) list at the top of the .sls file.
The server can write a structured log (read-message / send-message pairs with timestamps) that is invaluable for tracking down latency or silent crashes.
Key technique: compare read-message timestamps with send-message timestamps for the same id.
# Extract request/response timeline
awk '
/^(read-message|send-message)$/ { mode=$0; next }
/^2026 / { if(mode!="") ts=$0; next }
mode=="read-message" && /"id":13,/ { printf "req id=13 @ %s\n", ts }
mode=="send-message" && /"id":13,/ { printf "resp id=13 @ %s\n", ts }
' ~/ready-for-analyse.log
If send-message stops but read-message continues, the main loop is alive but the request-queue worker thread is stuck or dead. Check:

- init-references under workspace-mutex?
- make-engine + expire interacting badly with workspace-mutex?
- Could the type interpreter (type:interpret → private-generate-cartesian-product-procedure) be throwing uncaught exceptions inside the engine wrapper?

Replay scripts

- bin/log-debug.sps: single-threaded replay (#f threaded). Fast, good for verifying fixes.
- bin/parallel-log-debug.sps: multi-threaded replay (#t threaded). Closer to real clients, but request ordering differs because all messages are injected instantly.

A vector-in-list bug to watch for
analysis/type/substitutions/rules/trivial.sls defines index-of using car/cdr/null?. If a caller passes a vector (e.g. (index-of (list->vector rests) index-node)), the car call throws "~s is not a pair". In multi-threaded mode this exception may be swallowed by the engine layer instead of reaching private:try-catch, leaving the worker thread dead and all subsequent requests orphaned.
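The failure mode is easy to reproduce in isolation. The `index-of` below is a hypothetical stand-in written with car/cdr/null?; the real definition in trivial.sls may differ in detail.

```scheme
(import (chezscheme))

;; Hypothetical list-only index-of: returns the position of target, or #f.
(define (index-of items target)
  (let loop ([rest items] [i 0])
    (cond
      [(null? rest) #f]
      [(eq? (car rest) target) i]
      [else (loop (cdr rest) (+ i 1))])))

(define rests '(a b c))

;; A list argument works as intended.
(define list-result (index-of rests 'b))

;; A vector argument is not null?, so car is applied to a non-pair and
;; raises. The guard turns the exception into a symbol so it can be seen.
(define vector-result
  (guard (e [#t 'raised])
    (index-of (list->vector rests) 'b)))

(display list-result)   ; prints 1
(newline)
(display vector-result) ; prints raised
(newline)
```

When hunting this class of bug, wrapping the suspect call in a guard like this during single-threaded replay is one way to surface an exception that would otherwise disappear.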
check-duplicate-identifiers and collect-parameter-pairs

Two helpers live in analysis/identifier/util.sls (extracted from reference.sls):

- check-duplicate-identifiers document pairs: takes a list of (symbol . index-node) pairs, detects duplicates with an eq-hashtable, and appends a "Duplicate identifier: ..." diagnosis (severity 1 / Error).
- collect-parameter-pairs index-node: recursively extracts parameter symbols and their index-nodes from lambda/define parameter lists; handles flat lists, nested lists, and improper-list rest args. Returns a list of (symbol . index-node) cons cells.

Used in lambda.sls, case-lambda.sls, let.sls, letrec.sls, let-values.sls, do.sls, define.sls, and with-syntax.sls.
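The two techniques can be sketched on plain datums. Everything here is an assumption made for illustration: `find-duplicate-symbols` and `collect-symbols` are hypothetical names, they work on raw S-expressions instead of index-nodes, and they return values instead of appending diagnoses to a document.

```scheme
(import (chezscheme))

;; Duplicate detection with an eq-hashtable: a symbol seen a second time
;; is reported. The cdr of each pair stands in for an index-node.
(define (find-duplicate-symbols pairs)
  (let ([seen (make-eq-hashtable)])
    (let loop ([rest pairs] [duplicates '()])
      (cond
        [(null? rest) (reverse duplicates)]
        [(hashtable-ref seen (caar rest) #f)
         (loop (cdr rest) (cons (caar rest) duplicates))]
        [else
         (hashtable-set! seen (caar rest) #t)
         (loop (cdr rest) duplicates)]))))

;; Recursive parameter collection: handles flat lists, nested lists,
;; and an improper-list rest argument.
(define (collect-symbols params)
  (cond
    [(symbol? params) (list params)]
    [(pair? params)
     (append (collect-symbols (car params))
             (collect-symbols (cdr params)))]
    [else '()]))

(display (find-duplicate-symbols '((x . node-1) (y . node-2) (x . node-3))))
(newline) ; prints (x)
(display (collect-symbols '(a (b c) . rest)))
(newline) ; prints (a b c rest)
```

An eq-hashtable fits because symbols are compared by identity, which matches the eq? convention for symbols used elsewhere in this file.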
usage-count tracking

The identifier-reference record has a mutable usage-count field (default 0).

- Not incremented in find-available-references-for (that function is called for internal lookups, guard checks, etc., not all of which represent a genuine “use”).
- Incremented in abstract-interpreter.sls when step successfully resolves a leaf symbol (the [else branch of the top-level cond).
- private:check-unused-imports in workspace.sls scans import clauses after step and reports imported references with usage-count = 0 as "Unused import: ..." (severity 2 / Warning). Supports plain, only, except, rename, and alias imports.

--no-verify

The repository has a pre-commit hook (.git/hooks/pre-commit) that runs the protocol API test suite. Do not bypass it with git commit --no-verify. If the hook fails because tests are too slow or broken, fix the tests or the hook first, then commit normally.
Note: The hook is intentionally slow (often 2–5 minutes on a cold cache) because it runs the full protocol API test suite in parallel. It first compiles shared modules via a warm-up test, then forks the remaining tests. If you see it hanging, it is usually waiting for Chez Scheme to compile .so files, not deadlocked. Be patient, or run bash test.sh manually beforehand to warm the cache.
| Location | Issue | Impact |
|---|---|---|
| analysis/identifier/rules/library-import.sls | alias modifier does not add refs when used inside a (library ...) form (script-level import-process works fine) | Low: alias is rare in library headers |
| analysis/abstract-interpreter.sls:74 | Missing recursion guard for self-defined macro partial evaluation | Medium: can infinite-loop on certain macros |
| protocol/apis/document-sync.sls:44 | Document sync has a TODO for optimization | Low: performance only |
| doc/analysis/file-linkage.md:148 | Matrix shrink on file deletion is now implemented via shrink-file-linkage! | Resolved |
| analysis/type/substitutions/rnrs-meta-rules.sls:182 | cons type rule returns inner:pair?, not inner:list?. The type system treats inner:pair? and inner:list? as disjoint. matrix-from/matrix-to now work around this by using cons inside the loop and reverse at the return point, so the accumulator stays correctly typed as inner:list?. No change to cons’s rule itself was needed. | Resolved: workaround in place |
| analysis/type/domain-specific-language/interpreter.sls | private-with used candy:match-right when input contained **1/.... This fragmented list-valued bindings (e.g. map’s higher-order params) into multiple flat pairs that overwrote each other during fold-left + private-substitute, causing type collapse (e.g. (inner:list? something? ...) → inner:list?). | Fixed (2025-05-11): unconditional candy:match-left preserves bindings intact |
| analysis/abstract-interpreter.sls:270 | Global eq-hashtable private:expander-doc-cache-ht was accessed unsafely from threaded-map, causing bucket-list corruption (100% CPU hang or nonrecoverable invalid memory reference). Cache removed; private:find-expander-doc-for-node now computes directly. | Fixed (2025-05-26) |
| analysis/workspace.sls | bf98f11 added clear-expander-doc-cache! and clear-references-for inside private-init-references, which runs in parallel via threaded-map. Both mutate global/shared state without synchronization. Moved to serial pre-phase before threaded-map (under workspace-mutex). | Fixed (2025-05-26) |
| scheme-langserver.sls:235 | When the client closes the connection without sending exit, read-message returns #f on EOF. The main loop called thread-pool-stop!, but worker threads were blocked in request-queue-pop’s condition-wait and could never consume the kill-thread job. Deadlock caused the process to remain alive after the client disconnected. | Fixed (2025-05-26): (exit 0) on EOF instead of waiting for thread-pool-stop! |
| protocol/request.sls:26 | read-message has no exception guard around parse-content. Malformed JSON (unclosed strings, invalid escapes, NaN/Infinity, non-object roots like []/42/true/null) propagates as unhandled json-error or assq crashes, killing the server. | High: any malformed LSP message crashes the server |
| protocol/request.sls:54 | read-content does not validate content-length from get-content-length. Negative values, non-numeric strings, or malformed headers (e.g. Content-Length: 10: extra) cause get-bytevector-n to crash. | High: malformed HTTP-style header crashes the server |
| analysis/workspace.sls:150 | threaded-map calls private-init-references without exception guard. Sub-thread exceptions leave optional-finished? unset, causing de-optional to condition-wait forever while workspace-mutex is held, blocking all subsequent requests. | Fixed (2025-05-28): try/except added in threaded-map lambda; errors written to document-diagnoses |
| protocol/analysis/request-queue.sls:59 | expire acquires workspace-mutex when tickal-task-stop? is true. Intent is correct (cancelled task may be updating workspace), but implementation is incomplete (does not wait for sub-threads to finish). Currently harmless because with-mutex is reentrant, but provides no actual protection either. | Low: retained for future completion |
| analysis/workspace.sls | Attempted post-phase undefined-identifier diagnostic (5545e4c, reverted in 4a13a70). find-available-references-for returns empty for local bindings (let/lambda/define params) as well as truly undefined symbols. Distinguishing the two requires reliable binding-position tracking across all binding forms (including quoted symbols and library-name components), which proved too fragile in the current AST-walker architecture. | Withdrawn: requires deeper binding-tracking before retry |
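For the open read-content entry, one possible shape of a Content-Length check is sketched below. This is not the project's fix: `parse-content-length` is a hypothetical name, and the sketch assumes the header value has already been extracted into a string with surrounding whitespace removed.

```scheme
(import (chezscheme))

;; Hypothetical validator: returns the length as an exact non-negative
;; integer, or #f when the header value cannot be used as a byte count.
(define (parse-content-length text)
  (let ([n (string->number text)])
    (and n (integer? n) (exact? n) (>= n 0) n)))

(display (parse-content-length "42"))        ; prints 42
(newline)
(display (parse-content-length "-1"))        ; prints #f
(newline)
(display (parse-content-length "10: extra")) ; prints #f
(newline)
(display (parse-content-length "abc"))       ; prints #f
(newline)
```

A caller could reject the message when the result is #f instead of passing the value on to get-bytevector-n.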
# Run a single test file quickly
source .akku/bin/activate && scheme --script tests/analysis/test-workspace.sps
# Clear all compiled caches for the project
rm -rf .akku/libobj/scheme-langserver
# Find all .sls files under analysis/
find analysis -name "*.sls" | sort
# Check which tests import a specific module
grep -rl "library-import" tests/
# Count test assertions in a file
grep -c "test-equal\|test-assert" tests/analysis/dependency/test-file-linkage.sps
# Run LSP message-level robustness tests
source .akku/bin/activate && scheme --script tests/robustness-lsp-replay.sps
# Log replay — single-threaded (deterministic)
source .akku/bin/activate && scheme --script bin/log-debug.sps
# Log replay — multi-threaded (concurrent, closer to real-world)
source .akku/bin/activate && scheme --script bin/parallel-log-debug.sps
# Clear caches before replaying after any code change
rm -rf .akku/libobj/scheme-langserver
# Compare response counts between single-thread and multi-thread replays
# (different counts often reveal concurrency-related bugs)
grep -c '"id":' ~/scheme-langserver.out
Place production logs at ~/ready-for-analyse.log. Both replay scripts reconstruct the LSP JSON-RPC stream and run the server, writing outputs to ~/scheme-langserver.out (responses) and ~/scheme-langserver.log (diagnostics).
Key things to check when responses are missing:
Client cancellation — Search the log for $/cancelRequest with the same id. LSP allows clients to cancel stale requests; the server silently drops them (no response is expected).
I/O errors at EOF — If the client disconnects without sending exit, send-message may fail with Broken pipe. This produces error: failed on ... + Failed to send error response pairs in the log. These are normal I/O errors, not logic bugs.
didChange no longer auto-cancels — As of the LSP-compliance fix, textDocument/didChange only enqueues itself; it no longer wipes pending hover/definition/documentSymbol requests. If you see massive response loss in multi-thread replay, suspect stale .so caches first.
Response diffing — parallel-log-debug.sps should now produce the same (or more) responses as log-debug.sps. If multi-thread returns fewer responses despite the fix, check ~/scheme-langserver.log for exceptions.
Initialization:

1. init-virtual-file-system: scan directory tree, create file-nodes + documents
2. init-library-node: build library-node tree from library headers
3. init-file-linkage: build dependency adjacency matrix
4. init-references: run abstract interpreter (step) over all files

Incremental update:

1. update-file-node-with-tail (or attach-new-file)
2. refresh-file-linkage&get-refresh-path
3. shrink-paths produces topological batches
4. init-references re-runs step on affected batches

| Record | Fields (mutable marked) | Purpose |
|---|---|---|
| file-node | path, name, parent, folder?, children, document | VFS node |
| library-node | identifier, parent, children, file-nodes | Library hierarchy |
| document | uri, text, index-node-list, ordered-reference-list, diagnoses | Parsed source |
| index-node | datum/annotations, parent, children, excluded-references, import-in-this-node, export-to-other-node | AST node |
| file-linkage | path->id-map, id->path-map, matrix | Dependency graph |
| identifier-reference | identifier, document, index-node, initialization-index-node, library-identifier, type, parents, type-expressions, usage-count (mutable) | Symbol reference |
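As a rough illustration of a record with one mutable field, in the spirit of usage-count on identifier-reference, consider the sketch below. `demo-reference` is a hypothetical, cut-down record with only two fields; the real record has the fields listed in the table and may be declared differently.

```scheme
(import (chezscheme))

;; Hypothetical cut-down record: identifier is immutable, usage-count is
;; mutable and starts at 0 through the protocol.
(define-record-type demo-reference
  (fields identifier (mutable usage-count))
  (protocol
    (lambda (new)
      (lambda (identifier) (new identifier 0)))))

;; Bump the counter when a reference is genuinely used.
(define (mark-used! ref)
  (demo-reference-usage-count-set!
    ref
    (+ 1 (demo-reference-usage-count ref))))

(define refs (map make-demo-reference '(car cdr cons)))
(mark-used! (car refs))

;; References whose counter is still 0 are candidates for an
;; "unused" report.
(define unused
  (map demo-reference-identifier
       (filter (lambda (ref) (zero? (demo-reference-usage-count ref)))
               refs)))

(display unused) ; prints (cdr cons)
(newline)
```

Only the field marked mutable gets a setter (`demo-reference-usage-count-set!`), which is why the table calls out mutability explicitly.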