Writing rules
Rules are plain YAML files loaded from a directory, one rule per file. Every .yml and .yaml file under the rules directory is parsed on startup, so adding a rule requires no code changes and no restarts.
Minimal rule
id: CBR-PY-DEBUG-PRINT
title: "Leftover debug print"
severity: low
languages: [python]
cwe: [CWE-532]
message: |
  print() calls in production code leak information. Ship logs through
  the structured logger instead.
regex: |
  \bprint\s*\(
`cyscan scan .` picks the rule up on its next run. There is no compilation step.
Required fields
| Field | Type | Notes |
|---|---|---|
| id | string | Stable identifier. Convention: CBR-<LANG>-<SHORT-NAME> or CBR-<CATEGORY>-<NAME> for polyglot rules. |
| title | string | One-line human summary. Shows in the text reporter. |
| severity | critical / high / medium / low / info | |
| languages | list | Any of python, javascript, typescript, go. Empty for policy (dependency) rules. |
| message | multi-line string | Detailed description + remediation hint. Shown in SARIF + reports. |
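As a rough illustration of the table above, the following Python sketch checks a rule that has already been parsed into a dictionary. The field names and severity values come from the table; the function `rule_problems` is hypothetical and is not how cyscan validates rules internally.

```python
# Hypothetical pre-flight check for a rule already parsed into a dict.
# Field names and severity values are taken from the table above;
# the function itself is illustrative and not part of cyscan.
REQUIRED_FIELDS = ("id", "title", "severity", "languages", "message")
SEVERITIES = {"critical", "high", "medium", "low", "info"}


def rule_problems(rule: dict) -> list[str]:
    """Return human-readable problems; an empty list means the rule looks complete."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in rule]
    severity = rule.get("severity")
    if severity is not None and severity not in SEVERITIES:
        problems.append(f"unknown severity: {severity}")
    return problems


rule = {
    "id": "CBR-PY-DEBUG-PRINT",
    "title": "Leftover debug print",
    "severity": "low",
    "languages": ["python"],
    "message": "print() calls in production code leak information.",
}
print(rule_problems(rule))
```

A check like this could run before a rule file is committed, ahead of the validation the tool performs itself.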
Two matcher types
Regex
Simple, fast, line-by-line.
regex: |
  \bhashlib\.(md5|sha1)\b
- The pattern is trimmed — trailing newlines from YAML block literals are stripped for you.
- Ruby-style anchors and lookbehind are not supported. Use a tree-sitter query if you need structure.
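To get a feel for line-by-line matching, the sketch below applies the pattern above to a small sample using Python's `re` module. This is only a stand-in: the regex dialect cyscan uses may differ from Python's, and `matching_lines` is a hypothetical helper.

```python
import re

# The pattern as a YAML block literal would deliver it, trailing newline included.
raw_pattern = r"\bhashlib\.(md5|sha1)\b" + "\n"
# Mirror the documented trimming before compiling.
pattern = re.compile(raw_pattern.strip())


def matching_lines(source: str) -> list[int]:
    """Return the 1-based numbers of the lines the pattern matches."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if pattern.search(line)
    ]


sample = (
    "import hashlib\n"
    "h = hashlib.md5(data)\n"
    "ok = hashlib.sha256(data)\n"
    "old = hashlib.sha1(data)\n"
)
print(matching_lines(sample))  # [2, 4]
```

The `\b` after the alternation is what keeps `hashlib.sha256` from matching.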
Tree-sitter query
Structured, language-aware. Matches against the AST — immune to whitespace and comment variations.
query: |
  (call
    function: (attribute
      object: (identifier) @m (#eq? @m "pickle")
      attribute: (identifier) @fn (#match? @fn "^(loads?)$"))) @call
The finding is reported at the first captured node. The example above reports on @m (the pickle identifier). To report on the whole call instead, move @call first:
query: |
  ((call) @call
    function: (attribute
      object: (identifier) @m (#eq? @m "pickle")
      attribute: (identifier) @fn (#match? @fn "^(loads?)$")))
Query syntax reference by language:
- Python — tree-sitter-python/src/grammar.js
- JavaScript — tree-sitter-javascript/src/grammar.js
- Go — tree-sitter-go/src/grammar.js
Rules are validated at load time. A broken query does not crash the scan: the rule prints a warning and is skipped.
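The benefit of structural matching can be sketched with Python's standard `ast` module, used here purely as an analogy for a tree-sitter query. It looks for the same shape as the query above (a call on an attribute of the name `pickle` whose attribute is `load` or `loads`). The function `pickle_load_calls` is hypothetical and says nothing about how cyscan runs queries.

```python
import ast


def pickle_load_calls(source: str) -> list[int]:
    """Return line numbers of calls shaped like pickle.load(...) or pickle.loads(...)."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "pickle"
            and node.func.attr in ("load", "loads")
        ):
            hits.append(node.lineno)
    return sorted(hits)


sample = (
    "import pickle\n"
    "a = pickle.loads(blob)\n"
    "b = pickle . load (  # odd spacing and a comment\n"
    "    fh)\n"
    "# pickle.loads(blob) inside a comment is not a call\n"
)
print(pickle_load_calls(sample))  # [2, 3]
```

The oddly spaced call on line 3 is found and the commented-out call on line 5 is not, which a line-based regex would struggle to get right in both cases.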
Optional fields
cwe: [CWE-89, CWE-564]
fix_recipe: CWE-89-PARAMETERIZE # platform-side recipe slug, ignored by OSS
# literal replacement for `cyscan fix`
fix: |
  hashlib.sha256
fix:
Literal replacement text spliced over the matched range by cyscan fix. See Autofix.
dependency:
Converts the rule into a supply-chain policy rule. Mutually exclusive with regex and query.
id: CBR-DEP-EVENT-STREAM-MALWARE
title: "event-stream 3.3.6 was trojaned"
severity: critical
dependency:
  ecosystem: npm
  name: event-stream
  version: { min: "3.3.6", max: "3.3.6" }
message: |
  This exact release was compromised to steal Bitcoin wallet credentials.
See Supply-chain scanning for the full schema.
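One way to read the `version` block is as a range check on the installed version. The Python sketch below assumes both bounds are inclusive and that versions are purely numeric and dot-separated; neither assumption is confirmed here, and the actual semantics are defined by the supply-chain schema. Both function names are hypothetical.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Split a purely numeric dotted version such as "3.3.6" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def in_range(installed: str, minimum: str, maximum: str) -> bool:
    """Assumed semantics: both bounds are inclusive."""
    return parse_version(minimum) <= parse_version(installed) <= parse_version(maximum)


print(in_range("3.3.6", "3.3.6", "3.3.6"))  # True: the pinned release
print(in_range("3.3.5", "3.3.6", "3.3.6"))  # False: the release before it
```

Setting `min` and `max` to the same value, as the rule above does, pins the policy to a single release.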
Organising rules
Put related rules in subdirectories — cyscan walks the tree recursively.
my-rules/
├── python/
│   ├── sqli-concat.yml
│   └── weak-hash.yml
├── javascript/
│   └── eval.yml
├── secrets/
│   └── github-token.yml
└── supply/
    └── blocked-licenses.yml
Testing a rule
- Drop a fixture file next to the rule with the pattern you want to catch.
- Run `cyscan scan <fixture>` to confirm it fires.
- Run `cyscan rules validate` to confirm no YAML / query errors.
- Add a real-code test in CI:
cyscan scan tests/fixtures --rules ./my-rules --format json \
| jq -e '[.[] | select(.rule_id == "CBR-PY-MY-RULE")] | length > 0'
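If jq is not available in CI, the same assertion can be sketched in Python. This assumes, as the jq filter above implies, that the JSON report is an array of finding objects with a `rule_id` key. The function `rule_fired` is hypothetical, and the sample report is hand-written, not real cyscan output.

```python
import json


def rule_fired(report_json: str, rule_id: str) -> bool:
    """True if at least one finding in the report carries the given rule id."""
    findings = json.loads(report_json)
    return any(finding.get("rule_id") == rule_id for finding in findings)


# Hand-written stand-in for scanner output, not real cyscan output.
sample_report = '[{"rule_id": "CBR-PY-MY-RULE"}, {"rule_id": "CBR-PY-OTHER"}]'
print(rule_fired(sample_report, "CBR-PY-MY-RULE"))  # True
```

In a pipeline, the report text would come from the scanner's standard output, and a `False` result would be turned into a non-zero exit code.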
Rule IDs we reserve
Cyscan's bundled rules use the CBR-* prefix:
- `CBR-PY-*`: Python SAST
- `CBR-JS-*`: JavaScript SAST
- `CBR-GO-*`: Go SAST
- `CBR-SECRETS-*`: secret detection (polyglot)
- `CBR-SUPPLY-*`: OSV advisory matches
- `CBR-DEP-*`: dependency policy rules
For your own rules, pick a different prefix (e.g. ACME-PY-*) so upgrades to the bundled pack never collide.
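A small guard can catch accidental use of the reserved prefix before a custom rule is merged. The Python sketch below is a hypothetical helper, not a cyscan feature; it assumes rule IDs are compared case-sensitively.

```python
# Reserved prefix taken from the section above; the helper is hypothetical.
RESERVED_PREFIX = "CBR-"


def reserved_ids(rule_ids: list[str]) -> list[str]:
    """Return the custom rule IDs that collide with the bundled prefix."""
    return [rule_id for rule_id in rule_ids if rule_id.startswith(RESERVED_PREFIX)]


print(reserved_ids(["ACME-PY-SQLI", "CBR-PY-MY-RULE", "ACME-JS-EVAL"]))
# ['CBR-PY-MY-RULE']
```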
Next step
- Autofix: ship `fix:` blocks with your rules
- Supply-chain scanning: write `dependency:` policy rules