This document describes the design and usage of analysis/dependency/file-linkage.sls. This module is responsible for maintaining the file dependency graph (file-linkage) within a scheme-langserver workspace, providing the infrastructure for batch analysis, reference lookup, topological sorting, and incremental refresh.
A scheme-langserver workspace contains many Scheme source files. These files reference each other via import, load, include, etc. The file-linkage module abstracts this reference network as a directed graph and provides the following capabilities:
root-file-node + root-library-node), parse their import / load, and construct an adjacency matrix.shrink-ids, grouping files in a cycle into the same batch.file-linkage Record Type(define-record-type file-linkage
(fields
(mutable path->id-map) ; string-hashtable: path -> integer id
(mutable id->path-map) ; eq-hashtable: integer id -> path
(mutable matrix))) ; vector<integer>: adjacency matrix (flattened)
path->id-map: Maps absolute file paths to consecutive integer IDs using string-hash and equal?.id->path-map: Reverse mapping using eq-hashtable, for converting matrix row/column indices back to paths.matrix: A one-dimensional vector representing a square matrix. If there are N files, the matrix length is N*N. A value of 1 at (i, j) means file i directly depends on file j (i.e., i imports / loads j).The matrix is stored in row-major order using encode / decode from util/matrix.sls:
(encode N i j) => (+ (* i N) j)
(decode N index) => `(,i ,j)
Since N is known at initialization time, all subsequent matrix operations derive the dimension via (sqrt (vector-length matrix)).
init-file-linkageEntry point, supports two calling conventions:
(init-file-linkage root-file-node root-library-node)
(init-file-linkage root-file-node root-library-node top-environment)
Optional top-environment values are 'r6rs (default), 'r7rs, and 's7, which determine which import parser to use.
Initialization happens in two steps:
init-maps: Recursively traverses all library-nodes and file-nodes under root-library-node, assigning a unique integer ID to each file and populating path->id-map and id->path-map.init-matrix: Traverses all files again. For each file, parses its AST (via document-index-node-list) and extracts:
get-imported-libraries-from-index-node)load-process)Then writes 1 to the corresponding position in the matrix.
(get-imported-libraries-from-index-node root-library-node index-node [top-environment])
This function selects a different processor based on top-environment:
| Environment | Processor |
|---|---|
r6rs |
library-import-process |
r7rs |
library-import-process-r7rs |
s7 |
library-import-process-r7rs |
The processor extracts the list of imported library identifiers from the index-node, then uses walk-library to find the corresponding library-node under root-library-node, and finally returns the paths of all file-nodes under those library nodes.
(file-linkage-take linkage from-path to-path) ; => 0 or 1
(file-linkage-set! linkage from-path to-path) ; sets to 1 (default)
(file-linkage-from linkage from-path) ; => list of files directly depended on by from-path
(file-linkage-to linkage to-path) ; => list of files that directly depend on to-path
Internally implemented by scanning the corresponding row/column of the matrix.
(get-reference-path-to linkage to-path) ; => all (direct + indirect) files that depend on to-path
(get-reference-path-from linkage from-path) ; => all files that from-path (direct + indirect) depends on
Internally uses linkage-matrix-to-recursive and linkage-matrix-from-recursive for BFS/recursive expansion. The result includes the starting file itself.
(file-linkage-head linkage)
Returns a list of files with in-degree 0 (i.e., files not depended on by any other file). In the scheme-langserver batch processing flow, these files are typically the starting points for analysis.
refresh-file-linkage&get-refresh-pathWhen a user modifies or saves a file in the editor, there is no need to rebuild the entire graph. This function handles incremental updates:
(refresh-file-linkage&get-refresh-path
linkage root-library-node file-node
new-index-node-list new-library-identifier-list
[top-environment])
Steps:
id for the file. If it is a newly added file and new-library-identifier-list is non-empty, dynamically expand path->id-map / id->path-map, and expand the matrix capacity via matrix-expand.old-imported-file-ids) and its new dependency set after modification (new-imported-file-ids).0) and write new dependencies (set to 1).reference-id-from)reference-id-to)shrink-file-linkage!When a file is physically deleted from the workspace, its row and column must be removed from the linkage matrix to keep the graph consistent. This function performs a matrix shrink:
(shrink-file-linkage! linkage removed-path)
Steps:
id of removed-path in path->id-map.path->id-map / id->path-map pairs, skipping the removed id and renumbering all higher ids down by one so ids stay contiguous.matrix-shrink to produce a new (N-1) × (N-1) matrix that copies every value except the removed row and column.file-linkage record.This is called from protocol/apis/file-change-notification.sls (did-delete) so that deletions are reflected immediately in the dependency graph.
get-init-reference-batches(get-init-reference-batches linkage)
Splits the file dependency graph into topological batches (list<list<path>>). Example usage:
(let ([batches (get-init-reference-batches linkage)])
; batches looks like '((file-a file-b) (file-c) (file-d file-e))
; Files within the same batch do not depend on each other (or are in the same cycle),
; so they can be analyzed in parallel.
; Different batches have a strict dependency direction and must be analyzed sequentially.
)
shrink-ids: Maximizing Parallelism in Dependency ChainsIn a typical Scheme project, the file-linkage graph contains many long import chains. However, a significant portion of these chains are parallel rather than sequential — they branch out independently and do not depend on each other. The purpose of shrink-ids is to exploit this parallelism by “peeling” the graph layer by layer, grouping as many independent files as possible into the same batch.
The implementation is based on Kahn’s algorithm for topological sorting, adapted to maximize batch size:
(define (shrink-ids matrix ids)
...)
Pre-processing ($O(n^2)$ one-time cost):
eq-hashtable (id-set) for $O(1)$ membership testing of the ids set.ids, scan the matrix row to find neighbors that are also in ids. Build:
out-degrees: internal out-degree of each node within the ids subgraph.in-adj: reverse adjacency list (for each node, which nodes have an edge to it).Layer peeling:
in-adj to decrement the out-degree of its predecessors. When a predecessor’s out-degree drops to 0, it becomes ready for the next batch and is pushed onto the next queue.SCC (cycle) handling:
threaded-map. This is the correct trade-off because cyclic dependencies are invalid in Scheme’s library system anyway; splitting them does not improve analysis accuracy, but keeping them together preserves parallelism.Complexity:
shrink-pathsshrink-paths is the path-wrapper around shrink-ids: it converts paths to IDs, calls shrink-ids, and converts the result back to paths.
Its primary use case is in analysis/workspace.sls inside refresh-workspace-for. When a file is modified, refresh-file-linkage&get-refresh-path returns an “impact set” of paths that need re-analysis. This impact set often contains long import chains. shrink-paths flattens these chains into the minimum number of batches with the maximum number of files per batch, which is then passed to init-references for efficient parallel reference rebuilding.
util/matrix.slsfile-linkage relies on util/matrix.sls for low-level matrix operations:
| Function | Description |
|---|---|
matrix-take |
Reads the value at position (i, j) |
matrix-set! |
Sets the value at position (i, j) |
matrix-from |
Returns all column indices with value 1 in the given row (outgoing neighbors) |
matrix-to |
Returns all row indices with value 1 in the given column (incoming neighbors) |
matrix-expand |
Expands an N x N matrix to (N+1) x (N+1), used when adding a new file |
find-cycle |
Detects cycles in the graph via DFS |
encode/decode |
Converts between 1D vector index and 2D coordinates |
In analysis/workspace.sls, init-workspace calls init-file-linkage:
(let ([linkage (init-file-linkage root-file-node root-library-node top-environment)])
...)
In protocol/apis/references.sls and protocol/apis/document-highlight.sls, get-reference-path-to / get-reference-path-from are used to determine the search scope, avoiding a full workspace scan.
In analysis/abstract-interpreter.sls, get-init-reference-batches is used to obtain topological batches. The abstract interpreter then performs type inference and identifier analysis on each batch in order.
Relevant tests are located at:
tests/analysis/dependency/test-file-linkage.sps
Coverage includes:
init-linkage-matrix: Verifies that the dependency workspace.sls -> io.sls is correctly written into the matrix.get-init-inference-path: Verifies that get-init-reference-batches includes the specified file.file-linkage-to: Verifies reverse lookup (who depends on error-code.sls).r7rs / s7 environment support: Verifies import parsing and dependency graph construction under different Scheme dialects.Expanded coverage (Groups A–E) using dedicated workspace fixtures under tests/resources/workspace-fixtures/:
file-linkage-from (outgoing deps), file-linkage-head (in-degree-0 nodes), file-linkage-set! / file-linkage-take (matrix mutation), and accessor non-empty validation (path->id-map, id->path-map, matrix).get-reference-path-from and get-reference-path-to on simple-lib and two-libs fixtures, including self-inclusion behavior.shrink-paths (direct ordering, single-file, empty input) and get-init-reference-batches (full-project layering) on acyclic fixtures.refresh-file-linkage&get-refresh-path returns correct refresh set for an existing file, and matrix-expand correctly grows a square adjacency matrix while preserving old values and zeroing the new row/column. Also fixes a decode bug in matrix-expand where the old node count was used instead of new-count.shrink-ids).shrink-file-linkage! removes a file from the linkage maps and matrix, renumbers remaining ids, and is a no-op for non-existent paths. Also verifies matrix-shrink produces correct smaller matrices and handles the 1×1 → 0×0 edge case.file-linkage is the core component of the scheme-langserver dependency analysis layer. Its responsibilities can be summarized as:
load relationships from the file system into an adjacency matrix.