scheme-langserver

Workspace

The analysis/workspace.sls module is the central coordinator of the language server. It owns the mutable state that represents an opened project (the workspace), drives the multi-phase analysis pipeline, and handles incremental updates when files change.


1. Overview

A workspace aggregates four large sub-structures:

Sub-structure             | Module                           | Purpose
--------------------------|----------------------------------|--------
Virtual file system (VFS) | virtual-file-system/file-node    | Tree of directories / files discovered on disk.
Document store            | virtual-file-system/document     | Per-file AST (index-node), raw text, line-length cache, and diagnostic list.
Library index             | virtual-file-system/library-node | Hierarchical index of (library ...) identifiers extracted from files.
Dependency graph          | analysis/dependency/file-linkage | Directed graph encoding import / load relationships between files.

In addition, the workspace record stores a file-filter predicate (facet), configuration flags (threaded?, type-inference?, top-environment), a mutex for parallel analysis, and a list of undiagnosed paths used by the LSP diagnostic provider.

All operations on a workspace fall into one of three categories:

  1. Initialization – build every sub-structure from scratch.
  2. Incremental refresh – patch a changed file and re-analyse the smallest affected transitive closure.
  3. File-system mutations – add or remove files from the VFS.

2. Workspace Record

(define-record-type workspace
  (fields
    (mutable file-node)          ; root of the VFS tree
    (mutable library-node)       ; root of the library tree
    (mutable file-linkage)       ; dependency graph instance
    (immutable mutex)            ; Chez Scheme mutex or '()
    (immutable facet)            ; file-filter predicate
    (immutable threaded?)        ; enable parallel analysis?
    (immutable type-inference?)  ; enable type inference?
    (immutable top-environment)  ; 'r6rs | 'r7rs | 's7 | 'goldfish
    (mutable undiagnosed-paths)))

The record's constructor protocol creates the mutex automatically when threaded? is true; otherwise the mutex field holds '(). The three mutable fields (file-node, library-node, file-linkage) are replaced wholesale during a full refresh, while undiagnosed-paths is appended to incrementally.
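The way a constructor protocol can decide the mutex field is sketched below. This is a minimal illustration, not the server's code: demo-workspace is a hypothetical, trimmed-down record, and make-lock is a caller-supplied thunk standing in for Chez Scheme's make-mutex so that the sketch also runs on non-threaded builds.

```scheme
(import (rnrs))

;; Hypothetical, trimmed-down record; the real workspace has more fields.
(define-record-type demo-workspace
  (fields
    (mutable file-node)      ; stands in for the VFS root
    (immutable mutex)        ; a lock object, or '() when single-threaded
    (immutable threaded?))
  (protocol
    (lambda (new)
      ;; make-lock is an assumption of this sketch: a thunk that returns
      ;; a fresh lock (make-mutex in threaded Chez Scheme).
      (lambda (file-node threaded? make-lock)
        (new file-node
             (if threaded? (make-lock) '())
             threaded?)))))

(define single (make-demo-workspace 'root #f (lambda () 'lock)))
(define multi  (make-demo-workspace 'root #t (lambda () 'lock)))

;; A mutable field can be swapped wholesale, as a full refresh does.
(demo-workspace-file-node-set! multi 'new-root)
```

Here (demo-workspace-mutex single) is '() and (demo-workspace-mutex multi) is the lock object, which is the distinction every call site later tests with null?.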


3. Initialization Pipeline

3.1 init-workspace

init-workspace is the main entry point used by the LSP server when a folder is opened. It accepts several optional parameters and resolves defaults left-to-right:

path → identifier → top-environment → threaded? → type-inference? → facet

Typical call from the server:

(init-workspace "/project" 'akku 'r6rs #t #f)
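Left-to-right default resolution of this kind is commonly written with case-lambda, where each shorter arity delegates to the next longer one. The sketch below assumes that shape; demo-init-workspace is hypothetical, it only returns the resolved arguments, and the default values shown are illustrative rather than the server's real defaults.

```scheme
(import (rnrs))

;; Hypothetical stand-in for init-workspace. Each clause fills in one more
;; default and delegates; the last clause receives every parameter.
(define demo-init-workspace
  (case-lambda
    [(path) (demo-init-workspace path 'akku)]
    [(path identifier) (demo-init-workspace path identifier 'r6rs)]
    [(path identifier top-environment)
     (demo-init-workspace path identifier top-environment #f)]
    [(path identifier top-environment threaded?)
     (demo-init-workspace path identifier top-environment threaded? #f)]
    [(path identifier top-environment threaded? type-inference?)
     ;; illustrative default facet: accept every path
     (demo-init-workspace path identifier top-environment threaded?
                          type-inference? (lambda (p) #t))]
    [(path identifier top-environment threaded? type-inference? facet)
     (list path identifier top-environment threaded? type-inference? facet)]))

;; Same shape as the typical server call: facet is the only defaulted value.
(define resolved (demo-init-workspace "/project" 'akku 'r6rs #t #f))
```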

The pipeline executed inside is strictly ordered:

init-virtual-file-system
        ↓
init-library-node
        ↓
init-file-linkage
        ↓
get-init-reference-batches   (from file-linkage)
        ↓
init-references

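Why this order matters: each stage consumes the output of the previous one. init-library-node walks the VFS tree, init-file-linkage builds the dependency graph over the files found there, get-init-reference-batches derives its batches from that graph, and init-references analyses those batches.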

3.2 init-virtual-file-system

Recursively walks the directory tree starting at path. For every path accepted by the facet filter it creates a file-node. If the path is a regular file it also creates a document via init-document.

The resulting tree is later navigated with walk-file (from virtual-file-system/file-node).
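The recursive walk with a facet filter can be sketched as follows. To stay runnable without touching a real disk, the sketch mocks the file system as nested lists; the node layout (path document children), the helper names, and the example facet are all assumptions of this illustration.

```scheme
(import (rnrs))

;; Mock disk entries: (dir name child ...) or (file name).
(define (demo-join parent name)
  (if (string=? parent "") name (string-append parent "/" name)))

(define (ends-with? s suffix)
  (let ([n (string-length s)] [m (string-length suffix)])
    (and (>= n m) (string=? (substring s (- n m) n) suffix))))

;; Returns a node (path document children), or #f when the facet rejects path.
(define (demo-init-vfs entry parent facet)
  (let ([path (demo-join parent (cadr entry))])
    (cond
      [(not (facet path)) #f]
      [(eq? (car entry) 'file)
       (list path 'document '())]           ; regular file: gets a document
      [else
       (list path #f                         ; directory: no document
             (filter (lambda (x) x)          ; drop rejected children
                     (map (lambda (child) (demo-init-vfs child path facet))
                          (cddr entry))))])))

;; Collects the paths of all nodes that carry a document.
(define (demo-file-paths node)
  (if (cadr node)
      (list (car node))
      (apply append (map demo-file-paths (caddr node)))))

(define disk
  '(dir "project"
     (file "a.sls")
     (file "README.md")
     (dir ".git" (file "config"))
     (dir "lib" (file "b.sls"))))

;; Example facet: skip .git directories and Markdown files.
(define (facet path)
  (not (or (ends-with? path ".git") (ends-with? path ".md"))))

(define vfs (demo-init-vfs disk "" facet))
```

With this mock disk, (demo-file-paths vfs) yields the two .sls paths; the rejected .git directory is never descended into.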

3.3 init-document

Performs the first parse of a source file:

  1. Reads the file as a string (read-string).
  2. Chooses a meta-library based on top-environment ((scheme base) for r7rs, (chezscheme) for r6rs, etc.).
  3. Creates a document with URI, text, and the meta-library’s identifier table.
  4. Runs source-file->annotations on the already-read string (not re-reading from disk) to produce a list of annotated AST nodes.
  5. Wraps each annotation in an index-node and stores the list in the document.

If the file cannot be read, an empty document is produced.
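A rough sketch of the read-once, parse-from-string flow follows. It is illustrative only: demo-parse returns plain data where source-file->annotations produces annotated nodes, the document is modelled as a three-element list, and read-text is a caller-supplied procedure standing in for the disk read.

```scheme
(import (rnrs))

;; Reads every datum from an already-loaded string (no second disk read).
(define (demo-parse text)
  (let ([port (open-string-input-port text)])
    (let loop ([acc '()])
      (let ([datum (get-datum port)])
        (if (eof-object? datum)
            (reverse acc)
            (loop (cons datum acc)))))))

;; Hypothetical document shape: (uri text top-level-forms).
(define (demo-init-document uri read-text)
  ;; Only the read is guarded: an unreadable file gives an empty document.
  (let ([text (guard (e [#t #f]) (read-text uri))])
    (if text
        (list uri text (demo-parse text))
        (list uri "" '()))))

(define ok
  (demo-init-document "a.sls" (lambda (uri) "(define x 1) (display x)")))
(define missing
  (demo-init-document "b.sls"
    (lambda (uri) (error 'read-text "cannot read" uri))))
```

The guard deliberately wraps only the read, so a failure to load the file and a failure to parse its text remain separate concerns.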

3.4 init-library-node

Walks the VFS tree and extracts library identifiers from every file. Files that contain a (library ...) form are inserted under their identifier path in the hierarchical library-node tree. Files without a library declaration (script files) are attached directly to the root library node so that they still participate in the dependency graph and incremental refresh pipeline.

The root library node is created automatically; children are attached by make-library-node.
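The attachment rule (library files under their identifier path, scripts at the root) might look like the sketch below. The tree layout (name paths children) and both procedure names are hypothetical; the real library-node record and make-library-node are not reproduced here.

```scheme
(import (rnrs))

;; Identifiers declared by (library <name> ...) forms; a script yields '().
(define (demo-library-identifiers forms)
  (map cadr
       (filter (lambda (form)
                 (and (pair? form)
                      (eq? (car form) 'library)
                      (pair? (cdr form))))
               forms)))

;; Returns a new tree with path attached under identifier. Intermediate
;; nodes are created on demand; an empty identifier attaches to this node.
(define (demo-attach node identifier path)
  (let ([name (car node)] [paths (cadr node)] [children (caddr node)])
    (if (null? identifier)
        (list name (cons path paths) children)
        (let* ([key (car identifier)]
               [child (or (find (lambda (c) (eq? (car c) key)) children)
                          (list key '() '()))]
               [others (remp (lambda (c) (eq? (car c) key)) children)])
          (list name paths
                (cons (demo-attach child (cdr identifier) path) others))))))

(define root (list 'root '() '()))
(define tree
  (demo-attach
    (demo-attach root '(demo util) "src/util.sls")   ; library file
    '() "run.ss"))                                   ; script file
```

After both calls the root node lists run.ss directly, while src/util.sls sits two levels down under demo and then util.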

3.5 init-file-linkage

Builds the dependency graph. See doc/file-linkage.md for details. The result is an adjacency matrix plus bidirectional path↔id maps.
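For readers unfamiliar with the representation, here is a small sketch of an adjacency matrix paired with path↔id maps. The layout (a flat n*n vector, a hashtable for path→id, a vector for id→path), the edge direction, and all names are assumptions of this illustration, not the file-linkage module's actual structure.

```scheme
(import (rnrs))

;; Linkage as a vector: #(path->id id->path matrix n).
(define (demo-make-linkage paths)
  (let ([path->id (make-hashtable string-hash string=?)]
        [id->path (list->vector paths)]
        [n (length paths)])
    (let loop ([rest paths] [id 0])
      (unless (null? rest)
        (hashtable-set! path->id (car rest) id)
        (loop (cdr rest) (+ id 1))))
    (vector path->id id->path (make-vector (* n n) 0) n)))

(define (demo-id linkage path)
  (hashtable-ref (vector-ref linkage 0) path #f))

;; Records "from imports to": matrix[from][to] = 1.
(define (demo-add-edge! linkage from to)
  (let ([n (vector-ref linkage 3)])
    (vector-set! (vector-ref linkage 2)
                 (+ (* n (demo-id linkage from)) (demo-id linkage to))
                 1)))

;; Paths that directly import `to`, found by scanning its column.
(define (demo-referenced-by linkage to)
  (let ([n (vector-ref linkage 3)] [col (demo-id linkage to)])
    (let loop ([row (- n 1)] [acc '()])
      (if (< row 0)
          acc
          (loop (- row 1)
                (if (= 1 (vector-ref (vector-ref linkage 2) (+ (* n row) col)))
                    (cons (vector-ref (vector-ref linkage 1) row) acc)
                    acc))))))

(define linkage (demo-make-linkage '("a.sls" "b.sls" "c.sls")))
(demo-add-edge! linkage "a.sls" "b.sls")
(demo-add-edge! linkage "c.sls" "b.sls")
```

A row scan answers "what does this file import", a column scan answers "who imports this file"; the second query is the one incremental refresh relies on.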

3.6 init-references

Accepts a list of batches (each batch is a list of file paths).

If threaded? is true, each batch is processed inside a with-mutex block. Before dispatching parallel work, init-references serially extracts the syntax-diagnoses and clears per-document state (document-diagnoses-set! and clear-references-for) for every path in the batch. It then uses threaded-map to run private-init-references on each path concurrently. If threaded? is false, the same extract-then-clear-then-analyse sequence runs serially with plain for-each and no lock.

private-init-references performs the actual per-file analysis:

  1. Runs the abstract interpreter (step) with the current file-node, library-node, linkage, and document.
  2. Runs process-library-identifier-excluded-references to resolve identifiers that are not covered by the library system (e.g. top-level bindings).
  3. Optionally runs construct-substitutions-for (type inference) if type-inference? is enabled. Errors during type inference are caught and logged as warnings rather than crashing the server.
  4. Marks the document as not refreshable (document-refreshable?-set! document #f), indicating it is up-to-date.

4. Incremental Update

When the user edits a file the LSP client sends textDocument/didChange. The server calls update-file-node-with-tail followed later by refresh-workspace-for.

4.1 update-file-node-with-tail

Updates a single document with new text and prepares the workspace for a minimal re-analysis.

Steps:

  1. Snapshot old library identifiers of the target file.
  2. Parse new text with source-file->annotations and replace the document’s index-node-list and text.
  3. Mark reverse dependents refreshable: using the old linkage graph, every file that references the target file (via get-reference-path-to) is marked refreshable? = #t.
  4. Compare library identifiers: if the set of (library ...) names changed, the file may have changed its export contract.
    • Detach the file from old library nodes (and prune empty library nodes) using for-each.
    • Re-attach it under the new library identifiers using for-each.
    • Rebuild the entire dependency graph (init-file-linkage) because the mapping from imports to files may have shifted globally.
    • Mark the new reverse dependents refreshable using the new linkage graph.

This is the most expensive path; in practice library-header edits are rare compared to body edits.
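The branch between the cheap and the expensive path can be summarised in a few lines. The sketch below is only a decision table: the step labels are hypothetical symbols, and the comparison is an order-insensitive set equality, which is an assumption about how the identifier sets are compared.

```scheme
(import (rnrs))

;; #t when both lists hold the same identifiers, ignoring order.
(define (demo-same-identifiers? old new)
  (if (and (for-all (lambda (id) (member id new)) old)
           (for-all (lambda (id) (member id old)) new))
      #t
      #f))

;; Follow-up work after the new text has been parsed (hypothetical labels).
(define (demo-plan-update old-identifiers new-identifiers)
  (if (demo-same-identifiers? old-identifiers new-identifiers)
      '(mark-old-dependents)
      '(mark-old-dependents detach reattach rebuild-linkage mark-new-dependents)))

;; Body edit: the library header is unchanged.
(define body-edit (demo-plan-update '((demo util)) '((demo util))))
;; Header edit: the library was renamed.
(define header-edit (demo-plan-update '((demo util)) '((demo tools))))
```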

4.2 refresh-workspace-for

Performs the actual re-analysis after a document has been marked refreshable.

  1. If the document is not refreshable, does nothing.
  2. If the file has no library identifiers, appends its path to undiagnosed-paths and re-analyses that single file only.
  3. Otherwise:
    • Calls refresh-file-linkage&get-refresh-path to update the linkage row for this file and obtain the transitive closure of affected files (backwards along dependency edges).
    • Restricts the closure to files that are currently marked refreshable.
    • Calls shrink-paths (topological batching from file-linkage) to order the refreshable subset into parallel-safe batches.
    • Appends all affected paths to undiagnosed-paths so that the diagnostic publisher knows which files need fresh diagnostics.
    • Runs init-references on the batches.

4.3 refresh-workspace

A blunt but safe escape hatch: rebuilds the VFS, library tree, and linkage graph from scratch, then re-analyses everything and resets undiagnosed-paths to the full set of analysed paths. Used when the server detects a situation that incremental logic cannot handle reliably.


5. File-System Mutations

A helper allows the VFS to reflect file creation without a full workspace rebuild.

5.1 attach-new-file

Inserts a new path into the existing VFS tree.


6. Threading & Type Inference

6.1 Threading

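When threaded? is #t, init-references processes each batch inside a with-mutex block: a serial pre-phase extracts syntax-diagnoses and clears per-document state, then threaded-map runs private-init-references on the batch's files in parallel (see §3.6).

Because the entire batch (serial pre-phase + threaded-map) happens inside the mutex, editor sync and background analysis are fully isolated for the duration of the batch. Batches themselves are processed serially by the single-consumer request-queue, while files inside a batch run in parallel under the mutex. See the Workspace Mutex section for the design rationale.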

6.2 Type Inference

When type-inference? is #t, private-init-references calls construct-substitutions-for after the abstract interpretation step. Any exception raised by the type engine is caught and logged with warning; the document is still marked up-to-date so that analysis of downstream files can proceed. This prevents a single complex file from freezing diagnostics for the whole project.
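The catch-and-continue behaviour can be illustrated with R6RS guard. In this sketch run-step and infer-types are caller-supplied thunks standing in for step and construct-substitutions-for, the warnings list stands in for the server's logger, and the returned symbol stands in for clearing the refreshable flag; all of these are assumptions.

```scheme
(import (rnrs))

(define warnings '())

;; Hypothetical per-file analysis.
(define (demo-analyse path run-step infer-types type-inference?)
  (run-step)
  (when type-inference?
    ;; Any exception from the type engine is recorded and swallowed.
    (guard (e [#t (set! warnings (cons path warnings))])
      (infer-types)))
  'up-to-date)                  ; reached even after a type-inference failure

(define result
  (demo-analyse "a.sls"
                (lambda () 'stepped)
                (lambda () (error 'infer-types "too complex"))
                #t))
```

Only the type-inference call sits inside the guard, so an error raised by the interpreter step would still propagate in this sketch.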


7. Call Chains & Integration

7.1 Server startup

protocol/connection.sls  or  run.ss
        ↓
init-workspace
        ↓
init-virtual-file-system → init-document → source-file->annotations
        ↓
init-library-node
        ↓
init-file-linkage
        ↓
get-init-reference-batches → shrink-ids
        ↓
init-references → private-init-references → step
                                              construct-substitutions-for (optional)

7.2 Document change (LSP textDocument/didChange)

protocol/apis/document-sync.sls (did-change)
        ↓
update-file-node-with-tail
        ↓ (later)
refresh-workspace-for
        ↓
refresh-file-linkage&get-refresh-path
        ↓
shrink-paths
        ↓
init-references

7.3 Diagnostic publication

workspace-undiagnosed-paths is consumed by the diagnostic loop (typically in the connection handler). After publishing diagnostics for a path the server removes it from the list.
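A minimal sketch of such a consume-and-remove loop is shown below. The procedure name and the publish! callback are hypothetical, and the sketch assumes the same path may have been appended more than once, so it drops every occurrence after publishing.

```scheme
(import (rnrs))

;; Publishes each pending path once and returns the remaining list, which
;; is empty when every publish succeeds. publish! is caller-supplied; in a
;; real server it would send textDocument/publishDiagnostics.
(define (demo-publish-all undiagnosed-paths publish!)
  (let loop ([pending undiagnosed-paths])
    (if (null? pending)
        '()
        (begin
          (publish! (car pending))
          ;; remove all other occurrences of the path just published
          (loop (remove (car pending) (cdr pending)))))))

(define sent '())
(define remaining
  (demo-publish-all '("a.sls" "b.sls" "a.sls")
                    (lambda (path) (set! sent (cons path sent)))))
```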


8. Design Notes


9. Workspace Mutex

9.1 Design purpose

workspace-mutex is not a generic lock that protects every mutable field in the workspace. Its purpose is specific:

Isolate editor document-sync operations from background analysis operations so that the workspace is never in a partially-updated state while step or clear-references-for is running.

In other words, it is a read/write exclusion barrier between two actors:

Actor | Operations | Files
------|------------|------
Editor (write) | didChange, didOpen, didClose, did-change-watched-files | protocol/apis/document-sync.sls, protocol/apis/file-change-notification.sls
Background analysis (read + derived write) | init-references → step → write references/diagnoses | analysis/workspace.sls

When threaded? is #f the mutex is '() and never acquired; the single thread naturally serialises everything. When threaded? is #t the mutex is created via (make-mutex) and used at every boundary where editor traffic and analysis could otherwise interleave.

9.2 Why this matters

update-file-node-with-tail (called by didChange) performs a wholesale replacement of a document’s core state:

  1. Re-tokenises the text (source-file->annotations), producing new document-diagnoses.
  2. Rebuilds the AST (document-index-node-list).
  3. Replaces document-text and document-line-length-vector.
  4. Updates file-linkage and library-node if the library header changed.

If step (or clear-references-for) is traversing the old index-node-list while didChange swaps the tree out from under it, the result is a dangling pointer or a half-initialised node — exactly the kind of crash that c752796 fixed by serialising the clear phase.

9.3 Critical section in init-references

(if (workspace-threaded? workspace-instance)
  (with-mutex (workspace-mutex workspace-instance)
    ;; Serial pre-phase: keep syntax diagnoses, then reset per-document state.
    (let ([path+syntax-pairs
            (map
              (lambda (path)
                (let* ([...]  ; bindings elided in the source
                       [syntax-diagnoses
                         (filter
                           (lambda (d) (string-prefix? "Syntax error:" (cadddr d)))
                           (document-diagnoses document))])
                  (document-diagnoses-set! document '())
                  (clear-references-for (car index-node-list))
                  (cons path syntax-diagnoses)))
              paths)])
      ;; Parallel phase: still inside the mutex.
      (threaded-map
        (lambda (pair)
          (private-init-references workspace-instance (car pair) (cdr pair)))
        path+syntax-pairs)))
  ...)

The with-mutex block now covers the entire batch: serial extraction of syntax-diagnoses, clearing of per-document state, and the subsequent threaded-map parallel analysis. This guarantees that no editor didChange can interleave with step or clear-references-for at any point during the batch.

Trade-off: holding the mutex for the full duration of threaded-map means didChange notifications are blocked until the batch finishes. In practice the batches are small (topological slices from shrink-paths) and the blocking time is acceptable. The alternative — running threaded-map outside the mutex — left a theoretical race where didChange could replace document-index-node-list while a default-pool worker was still traversing the old tree via step.

9.4 Critical section in document-sync.sls

(define (did-change workspace params)
  (let ([body (lambda () ... (update-file-node-with-tail workspace file-node text) ...)])
    (if (null? (workspace-mutex workspace))
      (body)
      (with-mutex (workspace-mutex workspace) (body)))))

The body of did-change mutates document text, re-parses, and rebuilds the index-node tree. Wrapping it in workspace-mutex ensures these mutations are atomic with respect to init-references.

9.5 Critical section in file-change-notification.sls

did-change-watched-files (file-system watcher events such as git checkout) can attach, update, or delete file nodes. It uses the same pattern:

(if (null? (workspace-mutex workspace))
  (body)
  (with-mutex (workspace-mutex workspace) (body)))
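Since the same null-check-then-lock pattern appears at several call sites, it could be factored into one helper. The sketch below is a suggestion under stated assumptions, not code from the project: demo-call-with-optional-lock is a hypothetical name, and acquire! and release! are caller-supplied so the example runs without a threaded build (in threaded Chez Scheme they would presumably be mutex-acquire and mutex-release).

```scheme
(import (rnrs))

;; Runs thunk under lock unless lock is '(). dynamic-wind releases the lock
;; even if thunk exits through an exception.
(define (demo-call-with-optional-lock lock acquire! release! thunk)
  (if (null? lock)
      (thunk)
      (dynamic-wind
        (lambda () (acquire! lock))
        thunk
        (lambda () (release! lock)))))

;; Trace the order of operations with a fake lock.
(define events '())
(define (note! x) (set! events (cons x events)))

(demo-call-with-optional-lock 'fake-lock
  (lambda (l) (note! 'acquire))
  (lambda (l) (note! 'release))
  (lambda () (note! 'body)))
```

After the call, events holds release, body, acquire (newest first), showing that the body ran between acquire and release. With '() as the lock only the body runs, which matches the single-threaded mode.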

9.6 Relationship to request-queue-mutex

The project deliberately maintains two separate locks:

Lock | Protected resource | Held during
-----|--------------------|------------
request-queue-mutex | queue (slib queue) and tickal-task-list | push, pop, remove:from-request-tickal-task-list
workspace-mutex | Workspace mutable state (document, index-node, linkage, library-node) | init-references entire batch (serial pre-phase + threaded-map), didChange, expire callback

Once a worker thread dequeues a task, request-queue-pop returns a thunk and releases the queue mutex before the thunk is invoked. The actual execution of request-processor (and the engine that wraps it) runs outside the queue mutex. This prevents a slow request from starving the I/O thread or the timer thread.

workspace-mutex is acquired only when the thunk actually touches workspace state.

9.7 Single-threaded fallback

When threaded? is #f, (workspace-mutex workspace) is '(). Every call site checks (null? (workspace-mutex workspace)) and skips the lock. In this mode the request-queue still exists but has only one worker thread, so natural serialization makes the mutex unnecessary.