Madinah

TS Compiler API

Background#

When building an SDK, we often define path aliases so that developers can write short alias paths instead of long relative ones.
For example:

{
  // ...
  "baseUrl":"src",
  "paths": {
    "@/package": ["./index"],
    "@/package/*": ["./*"],
  }
}

With this configuration the alias always resolves from the src directory, so files nested deep inside src can reference top-level files directly. This avoids long chains of relative path segments, keeps imports tidy, and makes it easier to reorganize the directory structure.

import Components from "@/package/ui/header"

This is pleasant during development, but once the code is packaged as an SDK, the host project that consumes it may not have the same aliases configured. If the configurations differ, the alias imports cannot be resolved. We therefore need to rewrite alias imports back to relative paths during the TypeScript build so that every consumer can resolve them, and doing that requires some understanding of the TypeScript compilation process.

TS Compilation Process#

Source Code ~~ Scanner ~~> Token Stream
Token Stream ~~ Parser ~~> AST (Abstract Syntax Tree)
AST ~~ Binder ~~> Symbols
AST + Symbols ~~ Checker ~~> Type Validation
AST + Checker ~~ Emitter ~~> JavaScript Code

Scanner#

The scanner in TS (TypeScript) is the first stage of the compiler, also known as the lexical analyzer. It is responsible for converting the character stream in the source code file into a series of lexical units (tokens).
The working process is as follows:

  1. Read the character stream: The scanner reads characters from the source code file one by one.
  2. Identify lexical units: The scanner combines characters into recognized lexical units, such as identifiers, keywords, operators, constants, etc., based on a set of predefined lexical rules. In theory this matching is described with finite automata or regular expressions; TypeScript's scanner is hand-written and branches on the current character.
  3. Generate lexical units: Once a complete lexical unit is recognized, the scanner reports its kind (a ts.SyntaxKind value) and makes its text and value available to the next stage of the compiler.
  4. Handle special cases: The scanner also handles special cases, such as comments, string literals, and the parsing of escape characters.

For example, consider the following TypeScript code snippet:

let age: number = 25;

The scanner will read characters one by one and generate the following lexical units:

  1. let keyword
  2. age identifier
  3. : colon (operator)
  4. number keyword
  5. = equals (operator)
  6. 25 numeric constant
  7. ; semicolon (separator)

Tokens are produced in the order they appear in the source, and the scanner repeats this process until every character in the file has been consumed. This stage only extracts tokens; it performs no syntactic or semantic analysis.
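The four steps above can be sketched as a toy scanner. This is only an illustration for a tiny subset of the language, not how the real compiler is implemented; the names `Token`, `KEYWORDS`, and `scan` are made up for this example.

```typescript
// A toy scanner for a tiny subset of TypeScript (illustration only).
type TokenKind = "keyword" | "identifier" | "number" | "punctuation";

interface Token {
  kind: TokenKind;
  text: string;
  pos: number; // offset of the token's first character
}

// Hypothetical keyword list, just large enough for the example below.
const KEYWORDS = ["let", "const", "var", "number", "string"];

function scan(source: string): Token[] {
  const tokens: Token[] = [];
  let pos = 0;
  while (pos < source.length) {
    const ch = source.charAt(pos);
    // Special case: whitespace is trivia and is skipped.
    if (/\s/.test(ch)) {
      pos++;
      continue;
    }
    const start = pos;
    if (/[A-Za-z_$]/.test(ch)) {
      // Identifier or keyword: consume letters, digits, `_` and `$`.
      while (pos < source.length && /[A-Za-z0-9_$]/.test(source.charAt(pos))) pos++;
      const text = source.slice(start, pos);
      const kind: TokenKind = KEYWORDS.indexOf(text) !== -1 ? "keyword" : "identifier";
      tokens.push({ kind, text, pos: start });
    } else if (/[0-9]/.test(ch)) {
      // Numeric constant: consume consecutive digits.
      while (pos < source.length && /[0-9]/.test(source.charAt(pos))) pos++;
      tokens.push({ kind: "number", text: source.slice(start, pos), pos: start });
    } else if (":=;".indexOf(ch) !== -1) {
      // Single-character punctuation.
      pos++;
      tokens.push({ kind: "punctuation", text: ch, pos: start });
    } else {
      throw new Error(`Unexpected character '${ch}' at ${pos}`);
    }
  }
  return tokens;
}

const tokens = scan("let age: number = 25;");
console.log(tokens.map((t) => `${t.kind}:${t.text}`).join(" "));
```

Running it on the snippet above yields seven tokens, in the same order as the list.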

import * as ts from "typescript";

// TypeScript has a singleton scanner
const scanner = ts.createScanner(ts.ScriptTarget.Latest, /*skipTrivia*/ true);

// That is initialized using a function `initializeState` similar to
function initializeState(text: string) {
  scanner.setText(text);
  scanner.setOnError((message: ts.DiagnosticMessage, length: number) => {
    console.error(message);
  });
  scanner.setScriptTarget(ts.ScriptTarget.ES5);
  scanner.setLanguageVariant(ts.LanguageVariant.Standard);
}

// Sample usage
initializeState(`
var foo = 123;
`.trim());

// Start the scanning
let token = scanner.scan();
while (token !== ts.SyntaxKind.EndOfFileToken) {
  console.log(ts.SyntaxKind[token]);
  token = scanner.scan();
}

output

VarKeyword
Identifier
FirstAssignment
FirstLiteralToken
SemicolonToken

FirstAssignment and FirstLiteralToken are enum aliases that share their numeric values with EqualsToken and NumericLiteral, which is why the reverse lookup ts.SyntaxKind[token] prints the alias names.

Parser#

The parser in TS (TypeScript) converts TypeScript source code into an Abstract Syntax Tree (AST), which the later stages use for static analysis, type checking, and emit.
The parser does not read characters itself. It asks the scanner for tokens such as keywords, identifiers, operators, and constants, and organizes them into a tree according to the grammar, reporting a syntax error wherever the tokens do not form a valid construct.

import * as ts from "typescript";

function printAllChildren(node: ts.Node, depth = 0) {
  console.log(new Array(depth + 1).join('----'), ts.SyntaxKind[node.kind], node.pos, node.end);
  depth++;
  node.getChildren().forEach(c => printAllChildren(c, depth));
}

const sourceCode = `
var foo = 123;
`.trim();

const sourceFile = ts.createSourceFile('foo.ts', sourceCode, ts.ScriptTarget.ES5, true);
printAllChildren(sourceFile);

output

SourceFile 0 14
---- SyntaxList 0 14
-------- VariableStatement 0 14
------------ VariableDeclarationList 0 13
---------------- VarKeyword 0 3
---------------- SyntaxList 3 13
-------------------- VariableDeclaration 3 13
------------------------ Identifier 3 7
------------------------ FirstAssignment 7 9
------------------------ FirstLiteralToken 9 13
------------ SemicolonToken 13 14
---- EndOfFileToken 14 14
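
The tree above comes from node.getChildren(), which also returns tokens and SyntaxList wrappers. As a rough comparison, the sketch below walks the same source with ts.forEachChild, which only visits the AST nodes stored as properties of each node. The `kinds` array and `collect` function are names chosen for this example.

```typescript
import * as ts from "typescript";

const parsed = ts.createSourceFile("foo.ts", "var foo = 123;", ts.ScriptTarget.ES5, true);

// Collect one indented line per AST node visited by forEachChild.
const kinds: string[] = [];
function collect(node: ts.Node, depth = 0): void {
  kinds.push(new Array(depth + 1).join("  ") + ts.SyntaxKind[node.kind]);
  ts.forEachChild(node, (child) => collect(child, depth + 1));
}

collect(parsed);
console.log(kinds.join("\n"));
```

Tokens such as VarKeyword and SemicolonToken do not show up in this walk, so forEachChild is usually the more convenient choice when only the structure of the program matters.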

Binder#

The general flow of a JavaScript transpiler is roughly

SourceCode ~~Scanner~~> Tokens ~~Parser~~> AST ~~Emitter~~> JavaScript

For TS, however, this flow is missing a key step: TypeScript's semantic system. To support type checking (performed by the checker), the binder connects the declarations scattered across the source code into a coherent symbol table that the checker can use. The binder's main responsibility is to create symbols (Symbols).

  1. Simple understanding: [image]

  2. In-depth exploration of the structure: [image]

In the structure above, each declaration node carries pos and end offsets, which uniquely identify the declaration and therefore the scope that a reference belongs to.
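
The binder itself is not exposed as a public API, but its symbols can be observed through the type checker. The sketch below is a minimal illustration under a few assumptions: the file `example.ts` exists only in memory, and the compiler host is overridden so that nothing is read from disk. It shows that the declaration of foo and a later reference to it resolve to the same symbol.

```typescript
import * as ts from "typescript";

// Hypothetical in-memory file, so the sketch does not depend on the disk.
const fileName = "example.ts";
const source = "const foo = 1;\nfoo;";

// noLib / noResolve / types: [] keep the program limited to our single file.
const options: ts.CompilerOptions = { noLib: true, noResolve: true, types: [] };
const host = ts.createCompilerHost(options);
host.getSourceFile = (name, languageVersion) =>
  name === fileName ? ts.createSourceFile(name, source, languageVersion, true) : undefined;

const symbolProgram = ts.createProgram([fileName], options, host);
const symbolChecker = symbolProgram.getTypeChecker();

// Collect every identifier named `foo`: the declaration and the reference.
const fooNodes: ts.Identifier[] = [];
function walk(node: ts.Node): void {
  if (ts.isIdentifier(node) && node.text === "foo") fooNodes.push(node);
  ts.forEachChild(node, walk);
}
walk(symbolProgram.getSourceFile(fileName)!);

const symbols = fooNodes.map((node) => symbolChecker.getSymbolAtLocation(node));
const declaration = symbols[0]!.declarations![0];

console.log(symbols[0] === symbols[1]); // both identifiers share one symbol
console.log(ts.SyntaxKind[declaration.kind], declaration.pos, declaration.end);
```

The declaration printed at the end is an ordinary AST node, so its pos and end are the same offsets shown in the parser output earlier.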

Checker#

The checker uses the symbols created by the binder to perform type inference, type checking, and so on.
Code example:

import * as ts from "typescript";
import path from 'path'

// Create a TypeScript project
const program = ts.createProgram({
  rootNames: [path.join(__dirname, './check.ts')], // Paths of all files to check in the project
  options: {
    ...ts.getDefaultCompilerOptions(),
    baseUrl: '.'
  }, // Compilation options
});

// Collect the diagnostics reported before emit (syntactic and semantic errors)
const diagnostics = ts.getPreEmitDiagnostics(program)

// Print error messages
diagnostics.forEach((diagnostic) => {
  console.log(
    `Error: ${ts.flattenDiagnosticMessageText(diagnostic.messageText, "\n")}`
  );
});

check.ts

const a:string  = 1
console.log(a)

const b = ({)

output

Error: Type 'number' is not assignable to type 'string'.
Error: Property assignment expected.
Error: '}' expected.
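
Besides reporting errors, the checker can be asked what type it inferred for a node. The following sketch assumes an in-memory file named `example.ts` and a compiler host overridden to serve it; the `types` map and `record` helper are names invented for this example.

```typescript
import * as ts from "typescript";

// Hypothetical in-memory file with three declarations to inspect.
const fileName = "example.ts";
const source = [
  "let age = 25;",
  'const label = "Ada";',
  "function double(x: number) { return x * 2; }",
].join("\n");

// noLib / noResolve / types: [] keep the program limited to our single file.
const options: ts.CompilerOptions = { noLib: true, noResolve: true, types: [] };
const host = ts.createCompilerHost(options);
host.getSourceFile = (name, languageVersion) =>
  name === fileName ? ts.createSourceFile(name, source, languageVersion, true) : undefined;

const typeProgram = ts.createProgram([fileName], options, host);
const typeChecker = typeProgram.getTypeChecker();

// Map each declared name to the type the checker inferred for it.
const types: Record<string, string> = {};
function record(name: ts.Node): void {
  types[name.getText()] = typeChecker.typeToString(typeChecker.getTypeAtLocation(name));
}

ts.forEachChild(typeProgram.getSourceFile(fileName)!, (statement) => {
  if (ts.isVariableStatement(statement)) {
    statement.declarationList.declarations.forEach((d) => record(d.name));
  } else if (ts.isFunctionDeclaration(statement) && statement.name) {
    record(statement.name);
  }
});

console.log(types);
```

Note that the let variable is widened to number, while the const keeps its literal type.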

Emitter#

  • emitter.ts: the emitter that turns TS into JavaScript
  • declarationEmitter.ts: the emitter that creates declaration files (.d.ts) for TypeScript source files (.ts)

The emit phase calls the Printer to convert the AST into text. The name is fitting: it prints the AST as text.

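import * as ts from 'typescript';

// printNode needs a SourceFile for context; an empty in-memory file is enough here
const sourceFile = ts.createSourceFile('temp.ts', '', ts.ScriptTarget.Latest);
const printer = ts.createPrinter();
const result = printer.printNode(
  ts.EmitHint.Unspecified,
  makeNode(),
  sourceFile,
);
console.log(result);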

function makeNode() {
  return ts.factory.createVariableStatement(
    undefined,
    ts.factory.createVariableDeclarationList(
      [
        ts.factory.createVariableDeclaration(
          ts.factory.createIdentifier('video'),
          undefined,
          ts.factory.createKeywordTypeNode(ts.SyntaxKind.StringKeyword),
          ts.factory.createStringLiteral('conference'),
        ),
      ],
      ts.NodeFlags.Const,
    ),
  );
}
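
To see the emit step end to end without building nodes by hand, ts.transpileModule can be used: it parses a single file, runs the built-in transforms, and prints JavaScript, skipping type checking. A minimal sketch, with the target and newline options chosen only to keep the output predictable:

```typescript
import * as ts from "typescript";

const transpiled = ts.transpileModule("let age: number = 25;", {
  compilerOptions: {
    target: ts.ScriptTarget.ES2015,
    newLine: ts.NewLineKind.LineFeed,
  },
});

// The type annotation is erased; the rest is printed back as JavaScript.
console.log(transpiled.outputText);
```

The printed text is the original statement with the type annotation removed.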

Transformers#

The sections above introduced the stages TS goes through when compiling code. TS also provides "lifecycle"-like hooks that let us run custom transformers during compilation.

  • before runs the transformer before TypeScript's own transforms (the AST still contains TypeScript syntax such as type annotations)
  • after runs the transformer after TypeScript's own transforms (the AST has already been converted to JavaScript)
  • afterDeclarations runs the transformer after the declaration step (you can transform type definitions, i.e. the .d.ts output, here)
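
The difference between before and after can be observed with a small probe. The sketch below registers the same transformer at both stages and records whether it still encounters a number type annotation; `probe` and `sawTypeAnnotation` are names made up for this example, and ts.transpileModule is used only to avoid setting up a full program.

```typescript
import * as ts from "typescript";

// Records, per stage, whether a `number` type keyword was seen in the AST.
const sawTypeAnnotation: Record<string, boolean> = {};

function probe(stage: string): ts.TransformerFactory<ts.SourceFile> {
  return (context) => (sourceFile) => {
    sawTypeAnnotation[stage] = false;
    const visit = (node: ts.Node): ts.Node => {
      if (node.kind === ts.SyntaxKind.NumberKeyword) sawTypeAnnotation[stage] = true;
      return ts.visitEachChild(node, visit, context);
    };
    ts.visitEachChild(sourceFile, visit, context);
    return sourceFile; // the probe only observes, it does not change the tree
  };
}

ts.transpileModule("let age: number = 25;", {
  compilerOptions: { target: ts.ScriptTarget.ES2015 },
  transformers: { before: [probe("before")], after: [probe("after")] },
});

console.log(sawTypeAnnotation);
```

The before probe sees the annotation and the after probe does not, because TypeScript's own transform has already erased it by then. A transformer that needs type syntax therefore belongs in before.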

API#

Visiting#

  • ts.visitNode(node, visitor) runs the visitor on a single node, typically the root node, and returns the result
  • ts.visitEachChild(node, visitor, context) runs the visitor on each child of a node and returns the node, updated if any child changed
  • ts.isXyz(node) checks the node type, for example, ts.isVariableDeclaration(node)

Nodes#

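  • ts.factory.createXyz creates a new node (and returns it), for example ts.factory.createIdentifier('world'). The older top-level ts.createXyz functions were deprecated in TypeScript 4.0
  • ts.factory.updateXyz returns a node with the given changes applied (the original node is reused if nothing changed), for example ts.factory.updateVariableDeclaration()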
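
The visiting and node APIs can be tried out together without a full program by using ts.transform on a single parsed file. The following is a minimal sketch; the `renameFoo` transformer and the foo-to-bar rename are invented for illustration, and ts.visitEachChild is applied to the root directly to keep the types simple.

```typescript
import * as ts from "typescript";

// Replaces every identifier `foo` with a freshly created identifier `bar`.
const renameFoo: ts.TransformerFactory<ts.SourceFile> = (context) => (sourceFile) => {
  const visit = (node: ts.Node): ts.Node => {
    if (ts.isIdentifier(node) && node.text === "foo") {
      return context.factory.createIdentifier("bar");
    }
    // Not a match: keep the node and continue with its children.
    return ts.visitEachChild(node, visit, context);
  };
  return ts.visitEachChild(sourceFile, visit, context);
};

const input = ts.createSourceFile("foo.ts", "var foo = 123;", ts.ScriptTarget.ES2015, true);
const transformation = ts.transform(input, [renameFoo]);
const renamed = ts
  .createPrinter({ newLine: ts.NewLineKind.LineFeed })
  .printFile(transformation.transformed[0]);
transformation.dispose();

console.log(renamed);
```

Only the nodes on the path from the root to the replaced identifier are recreated; untouched subtrees are reused as they are.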

Writing a Transformer#

import path from 'path';
import * as ts from 'typescript';

const transformer =
  (_program: ts.Program) => (context: ts.TransformationContext) => {
    return (sourceFile: ts.Bundle | ts.SourceFile) => {
      const visitor = (node: ts.Node): ts.Node => {
        // Replace the identifiers `babel` and `typescript` with string literals
        if (ts.isIdentifier(node)) {
          switch (node.escapedText) {
            case 'babel':
              return ts.factory.createStringLiteral('babel-transformer');
            case 'typescript':
              return ts.factory.createStringLiteral('typescript-transformer');
          }
        }
        // Otherwise keep the node and recurse into its children
        return ts.visitEachChild(node, visitor, context);
      };

      return ts.visitNode(sourceFile, visitor);
    };
  };

const program = ts.createProgram([path.join(__dirname, './02.ts')], {
  baseUrl: '.',
  target: ts.ScriptTarget.ESNext,
  module: ts.ModuleKind.ESNext,
  declaration: true,
  declarationMap: true,
  jsx: ts.JsxEmit.React,
  moduleResolution: ts.ModuleResolutionKind.NodeJs,
  skipLibCheck: true,
  allowSyntheticDefaultImports: true,
  outDir: path.join(__dirname, '../dist/transform'),
});

const res = program.emit(undefined, undefined, undefined, undefined, {
  after: [transformer(program)],
});
console.log(res);


More code examples: https://github.com/itsdouges/typescript-transformer-handbook/tree/master/example-transformers

Practical Applications#

import path from 'path';
import { chain, head, isEmpty } from 'lodash';
import ts from 'typescript';

export function replaceAlias(
  fileName: string,
  importPath: string,
  paths?: Record<string, string[]>
) {
  if (isEmpty(paths)) return importPath;

  const normalizedPaths = chain(paths)
    .mapKeys((_, key) => key.replace(/\*$/, ''))
    .mapValues(head)
    .omitBy(isEmpty)
    .mapValues((resolve) => (resolve as string).replace(/\*$/, ''))
    .value();

  for (const [alias, resolveTo] of Object.entries(normalizedPaths)) {
    if (importPath.startsWith(alias)) {
      const resolvedPath = importPath.replace(alias, resolveTo);
      const relativePath = path.relative(path.dirname(fileName), resolvedPath);
      return relativePath.startsWith('.') ? relativePath : `./${relativePath}`;
    }
  }

  return importPath;
}

export default function (_program?: ts.Program | null, _pluginOptions = {}) {
  return ((ctx) => {
    const { factory } = ctx;
    const compilerOptions = ctx.getCompilerOptions();

    return (sourceFile: ts.Bundle | ts.SourceFile) => {
      const { fileName } = sourceFile.getSourceFile();
      function traverseVisitor(node: ts.Node): ts.Node | null {
        let importValue: string | null = null;
        if (ts.isCallExpression(node)) {
          const { expression } = node;
          if (node.arguments.length === 0) return null;
          const arg = node.arguments[0];
          if (!ts.isStringLiteral(arg)) return null;
          if (
            // Can't call getText on after step
            expression.getText(sourceFile as ts.SourceFile) !== 'require' &&
            expression.kind !== ts.SyntaxKind.ImportKeyword
          )
            return null;
          importValue = arg.text;
          // import, export
        } else if (
          ts.isImportDeclaration(node) ||
          ts.isExportDeclaration(node)
        ) {
          if (
            !node.moduleSpecifier ||
            !ts.isStringLiteral(node.moduleSpecifier)
          )
            return null;
          importValue = node.moduleSpecifier.text;
        } else if (
          ts.isImportTypeNode(node) &&
          ts.isLiteralTypeNode(node.argument) &&
          ts.isStringLiteral(node.argument.literal)
        ) {
          importValue = node.argument.literal.text;
        } else if (ts.isModuleDeclaration(node)) {
          if (!ts.isStringLiteral(node.name)) return null;
          importValue = node.name.text;
        } else {
          return null;
        }

        const newImport = replaceAlias(
          fileName,
          importValue,
          compilerOptions.paths
        );

        if (!newImport || newImport === importValue) return null;

        const newSpec = factory.createStringLiteral(newImport);

        let newNode: ts.Node | null = null;

        if (ts.isImportTypeNode(node))
          newNode = factory.updateImportTypeNode(
            node,
            factory.createLiteralTypeNode(newSpec),
            node.assertions,
            node.qualifier,
            node.typeArguments,
            node.isTypeOf
          );

        if (ts.isImportDeclaration(node))
          newNode = factory.updateImportDeclaration(
            node,
            node.modifiers,
            node.importClause,
            newSpec,
            node.assertClause
          );

        if (ts.isExportDeclaration(node))
          newNode = factory.updateExportDeclaration(
            node,
            node.modifiers,
            node.isTypeOnly,
            node.exportClause,
            newSpec,
            node.assertClause
          );

        if (ts.isCallExpression(node))
          newNode = factory.updateCallExpression(
            node,
            node.expression,
            node.typeArguments,
            [newSpec]
          );

        if (ts.isModuleDeclaration(node))
          newNode = factory.updateModuleDeclaration(
            node,
            node.modifiers,
            newSpec,
            node.body
          );

        return newNode;
      }

      function visitor(node: ts.Node): ts.Node {
        return traverseVisitor(node) || ts.visitEachChild(node, visitor, ctx);
      }
      return ts.visitNode(sourceFile, visitor);
    };
  }) as ts.TransformerFactory<ts.Bundle | ts.SourceFile>;
}
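
Two details of replaceAlias above are worth a closer look. It strips the trailing * from every key and then matches by prefix, so with the configuration from the Background section the exact key "@/package" also matches "@/package/ui/header". It also treats the mapped path as relative to the current working directory rather than to baseUrl. The sketch below is one possible variant that addresses both, under stated assumptions: `resolveAlias` and `examplePaths` are hypothetical names, fileName and baseUrl are absolute POSIX-style paths, only the first target of each pattern is used, and the first matching pattern wins (TypeScript itself prefers the longest matching prefix).

```typescript
import * as path from "path";

// Hypothetical helper: returns a relative specifier for `importPath`,
// or `importPath` unchanged when no alias pattern matches.
function resolveAlias(
  fileName: string, // absolute path of the importing file
  importPath: string,
  baseUrl: string, // absolute path that `paths` targets are relative to
  paths: Record<string, string[]>
): string {
  const patterns = Object.keys(paths);
  for (let i = 0; i < patterns.length; i++) {
    const pattern = patterns[i];
    const target = paths[pattern][0];
    if (!target) continue;

    let mapped: string | undefined;
    const star = pattern.indexOf("*");
    if (star === -1) {
      // A key without a wildcard must match the whole import path.
      if (importPath === pattern) mapped = target;
    } else {
      const prefix = pattern.slice(0, star);
      const suffix = pattern.slice(star + 1);
      const fits = importPath.length >= prefix.length + suffix.length;
      if (
        fits &&
        importPath.slice(0, prefix.length) === prefix &&
        importPath.slice(importPath.length - suffix.length) === suffix
      ) {
        // Substitute the part matched by `*` into the target.
        const matched = importPath.slice(prefix.length, importPath.length - suffix.length);
        mapped = target.split("*").join(matched);
      }
    }
    if (mapped === undefined) continue;

    const absolute = path.posix.resolve(baseUrl, mapped);
    const relative = path.posix.relative(path.posix.dirname(fileName), absolute);
    return relative.charAt(0) === "." ? relative : "./" + relative;
  }
  return importPath;
}

// The alias configuration from the Background section.
const examplePaths: Record<string, string[]> = {
  "@/package": ["./index"],
  "@/package/*": ["./*"],
};

console.log(
  resolveAlias("/repo/src/ui/footer/index.ts", "@/package/ui/header", "/repo/src", examplePaths)
);
```

In a transformer, baseUrl would come from compilerOptions.baseUrl, resolved against the project directory when it is relative.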

References#

https://www.youtube.com/watch?v=BU0pzqyF0nw
https://github.com/basarat/typescript-book
https://github.com/itsdouges/typescript-transformer-handbook
https://github.com/LeDDGroup/typescript-transform-paths
https://github.com/nonara/ts-patch
https://github.com/LeDDGroup/typescript-transform-paths/blob/v1.0.0/src/index.ts
https://github.com/microsoft/TypeScript-Compiler-Notes
