How to level up your technical documentation with Microsoft's style guide and LLMs

How AI can transform even Microsoft's own documentation to meet its style standards.

Introduction

As a security researcher at Corelight, I produce a significant amount of technical documentation. Much of this documentation ends up in GitHub repositories or gets deployed to our Corelight sensors, where consistency and clarity are crucial. Writing technical documentation that meets style guide standards can be incredibly time-consuming, especially when you're juggling multiple projects and deadlines.

That's why I developed the llm-styleguide-helper tool: to cut the time it takes me to write technical documentation that meets professional standards. After seeing how much time it saved me, I decided to release it as open source so that others can speed up their documentation work too.

Style guides are essential for maintaining consistency in technical documentation, but applying them by hand is slow and error-prone. What if you could automate this process using AI? That's exactly what the llm-styleguide-helper tool does: it combines Vale linting with an LLM to fix style guide violations automatically.

The tool runs a pipeline that scans documents for style violations and generates AI prompts to fix them. You can copy these prompts into your favorite large language model (LLM) by hand, or use the --gemini flag to process the corrections automatically through Google's Gemini CLI. I chose Gemini for automatic mode for two reasons. First, it is the only online LLM I am aware of that does not require an API key, and API-key usage can be costly with no upper bound on spend. Second, I already use Gemini for other work at Corelight. The methodology introduced here can be adapted to any organization's style guide or writing standards, so it is not tied to one documentation team or one set of rules.

The llm-styleguide-helper is part of the LLM-Ninja collection of AI-powered tools for document processing and LLM integration. More about LLM-Ninja and other LLM techniques can be found in my earlier blog posts.

The problem with manual style guide compliance

Traditional style guide compliance involves:

  1. Manually reviewing documents
  2. Identifying violations
  3. Making corrections
  4. Re-reviewing for missed issues
  5. Iterating until all violations are resolved

This process is not only tedious but also prone to human error and inconsistency. Even Microsoft's own documentation (example below) isn't immune to style guide violations.

Enter LLM-styleguide-helper

The llm-styleguide-helper is a Python script that streamlines this process with four capabilities:

  1. Automated detection: Uses Vale to scan documents for style violations
  2. AI-powered fixing: Generates prompts for LLMs to automatically correct issues
  3. Iterative refinement: Continuously improves until all violations are resolved
  4. Microsoft Style Guide integration: Specifically designed to work with Microsoft's comprehensive style guide

Two ways to use the tool

The tool offers two workflows. Manual mode, covered first, works with any LLM. Automatic mode uses Gemini to correct the style of your input text without any copying and pasting.

Manual mode: Generate prompts for your favorite LLM

The tool generates detailed prompts that you can copy and paste into ChatGPT, Claude, or any other LLM of your choice. This gives you full control over the AI model, allowing you to review changes before applying them.

When you run the tool in manual mode, it creates .prompt files next to each of your original documents. These prompt files contain:

  • Original document content: The complete text of your document
  • Vale alerts in JSON format: All detected style violations with line numbers and descriptions
  • Vocabulary definitions: Relevant definitions from the Microsoft Style Guide for any vocabulary-related issues
  • AI instructions: Clear directions for the LLM on how to fix the issues while preserving the original meaning
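The four parts listed above can be pictured as one assembled string. The following is a minimal sketch, not the tool's actual prompt format: the function name build_prompt, the section labels, the instruction wording, and the sample alert are all assumptions made for illustration.

```python
import json


def build_prompt(document_text, alerts, definitions):
    """Assemble one prompt from the four parts of a .prompt file.

    The section labels and instruction wording are illustrative
    assumptions, not the tool's real prompt text.
    """
    instructions = (
        "Rewrite the document so that every alert below is resolved. "
        "Preserve the original meaning and return only the corrected document."
    )
    # One line per vocabulary term; empty when no vocabulary alert fired.
    vocab = "\n".join(f"- {term}: {text}" for term, text in definitions.items())
    return "\n\n".join([
        "## Instructions\n" + instructions,
        "## Vale alerts (JSON)\n" + json.dumps(alerts, indent=2),
        "## Vocabulary definitions\n" + (vocab or "(none)"),
        "## Original document\n" + document_text,
    ])


# Hypothetical alert, shaped like one entry of Vale's JSON output.
alerts = [{"Line": 1, "Check": "Microsoft.We", "Message": "sample message"}]
prompt = build_prompt("Please report it to us.", alerts, {})
print(prompt)
```

Putting the document last keeps the instructions and alerts together at the top, which makes the prompt easier to skim before you paste it into an LLM.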

For example, if you have a file called README.md, the tool will create README.md.prompt. You can then:

  1. Open the .prompt file in your text editor
  2. Copy the entire content
  3. Paste it into your preferred LLM (ChatGPT, Claude, etc.)
  4. Copy the LLM's response (the corrected document)
  5. Save it as a new file (e.g., README.md.fixed)

This approach provides you with complete control over which AI model to use, allowing you to review and edit changes before applying them.

Manual iterative refinement

The beauty of manual mode is that you can iteratively refine your documents until you're satisfied with the results. Here's how:

  1. Run the tool to generate your first set of prompts
  2. Process with your LLM and save the corrected version
  3. Run Vale again on your corrected file to check for remaining issues:
  4. Copy Vale's output (the remaining alerts) and paste it into your LLM
  5. Ask the LLM to fix the remaining issues in your corrected document
  6. Repeat steps 3-5 until you're satisfied with the results
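If you prefer to script step 3 (re-running Vale on the corrected file), a minimal Python sketch might look like the following. It assumes Vale is on your PATH and relies on Vale's --output=JSON mode, which reports a mapping from file path to a list of alerts. The helper names and the sample alert are hypothetical.

```python
import json
import subprocess


def parse_vale_json(raw):
    """Flatten Vale's JSON report into (file, line, check, message) rows."""
    report = json.loads(raw) if raw.strip() else {}
    rows = []
    for path, alerts in report.items():
        for alert in alerts:
            rows.append((path, alert["Line"], alert["Check"], alert["Message"]))
    return rows


def run_vale(path):
    """Run Vale on one file and return the flattened alerts.

    Vale can exit with a non-zero status when it reports alerts, so the
    return code is deliberately not treated as a failure here.
    """
    result = subprocess.run(
        ["vale", "--output=JSON", path],
        capture_output=True,
        text=True,
    )
    return parse_vale_json(result.stdout)


# Made-up sample in the shape of Vale's JSON output, so the parser can be
# tried without Vale installed.
sample = '{"README.md.fixed": [{"Line": 7, "Check": "Microsoft.Passive", "Message": "sample message"}]}'
rows = parse_vale_json(sample)
print(rows)
```

Calling run_vale("README.md.fixed") returns the same row format, which you can paste into your LLM as the list of remaining alerts.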

This manual iteration process gives you complete control over the refinement cycle. You can decide when to stop based on your quality requirements, and you can even manually edit the LLM's suggestions before applying them.

Automatic mode: seamless Gemini CLI integration

Use the --gemini flag to automatically process your documents through Google's Gemini CLI. This provides a fully automated workflow that handles the entire process from detection to final correction.

When you use the --gemini flag, the tool:

  1. Creates temporary .prompt files (same as manual mode) with Vale
  2. Automatically sends them to Gemini CLI for processing
  3. Receives the corrected content from Gemini
  4. Saves the results to .fixed files (e.g., README.md.fixed)
  5. Repeats the process until no more improvements are detected or the file is clean
  6. Cleans up the temporary .prompt files

This creates a fully automated iterative refinement process that continues until your document meets the style guide standards. The tool intelligently stops when it detects no further improvements for 3 consecutive iterations, preventing infinite loops.
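The stopping behavior can be sketched as a small loop. This is an illustration under stated assumptions rather than the tool's code: lint and fix are stand-ins for the Vale run and the Gemini call, the patience parameter mirrors the three-iteration rule described above, and max_iterations is an extra safety cap that I have added.

```python
def refine(text, lint, fix, patience=3, max_iterations=20):
    """Repeat lint -> fix until the text is clean or progress stalls.

    lint(text) returns a list of alerts and fix(text, alerts) returns a
    corrected text. Both are placeholders for Vale and the LLM call.
    """
    best_text, best_count = text, len(lint(text))
    history = [best_count]
    stalled = 0
    for _ in range(max_iterations):
        if best_count == 0 or stalled >= patience:
            break
        candidate = fix(best_text, lint(best_text))
        count = len(lint(candidate))
        history.append(count)
        if count < best_count:
            # Keep the candidate only when it reduces the alert count.
            best_text, best_count, stalled = candidate, count, 0
        else:
            stalled += 1
    return best_text, history


# Toy stand-ins: flag a few words, and fix one flagged word per call.
BANNED = ["please", "we", "e.g."]


def fake_lint(text):
    return [word for word in text.split() if word in BANNED]


def fake_fix(text, alerts):
    words = text.split()
    words.remove(alerts[0])
    return " ".join(words)


fixed, history = refine("please note we use e.g. Vale", fake_lint, fake_fix)
print(fixed, history)
```

Keeping the best version seen so far, instead of the latest one, means a bad LLM response cannot make the saved result worse than an earlier iteration.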

Microsoft style guide integration

The tool is specifically designed to work with Microsoft's open-sourced style guide. Microsoft's style guide is comprehensive and well-maintained, making it an excellent choice for technical documentation. The tool leverages Microsoft's extensive vocabulary definitions and style rules to provide accurate and consistent corrections.

Real-world example: Microsoft's own documentation

Let's examine how this tool can enhance even Microsoft's official documentation. Using their own SECURITY.md file as an example:

Initial analysis

Running Vale on Microsoft's SECURITY.md reveals 26 style violations:

Here's what Vale's output looks like:

The output shows multiple types of issues:

  • Use of first-person plural ("our", "we", "us")
  • Passive voice constructions
  • Missing Oxford commas
  • Missing contractions ("do not" instead of "don't")
  • Undefined acronyms
  • Abbreviations ("e.g." instead of "for example")
  • And many more...

Automatic processing

With the llm-styleguide-helper, we can automatically fix these issues:

The tool processes the file through 4 iterations:

  • Iteration 1: 26 alerts → 3 alerts
  • Iteration 2: 3 alerts → 1 alert
  • Iteration 3: 1 alert → 0 alerts
  • Iteration 4: 0 alerts (clean!)

Key improvements made

The AI automatically transformed the content:

  • Before: "please report it to us as described below"
  • After: "report it as described below"
  • Before: "Please do not report security vulnerabilities"
  • After: "Please don't report security vulnerabilities"
  • Before: "our PGP key; please download it"
  • After: "the Pretty Good Privacy (PGP) key. Please download it"
  • Before: "ensure we received your original message"
  • After: "make sure Microsoft received your original message"
  • Note: The LLM could infer from the input content that “we” means “Microsoft”

Complete before and after comparison

Here's the full wdiff comparison showing every change made by the AI:

This comprehensive diff shows how the AI systematically addressed every style violation, from removing first-person pronouns to standardizing terminology and improving clarity.
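If wdiff isn't installed, a similar word-level comparison can be approximated with Python's standard difflib module. The sketch below is a stand-in, not how the comparison in this post was produced. The word_diff helper is hypothetical and borrows wdiff's default [-removed-] and {+added+} markers.

```python
import difflib


def word_diff(before, after):
    """Return a wdiff-style string with [-removed-] and {+added+} words."""
    a, b = before.split(), after.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(a[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(b[j1:j2]) + "+}")
        if tag == "equal":
            out.append(" ".join(a[i1:i2]))
    return " ".join(out)


print(word_diff("please report it to us as described below",
                "report it as described below"))
# prints: [-please-] report it [-to us-] as described below
```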

Beyond Microsoft documentation

The beauty of this approach is its versatility. This methodology can be applied to any type of content that needs style guide compliance:

Open source README files

Many open source projects have inconsistent documentation. The tool can:

  • Standardize terminology across all documentation
  • Ensure consistent formatting and structure
  • Apply project-specific style guidelines
  • Maintain professional tone and clarity

User documentation

Technical documentation often suffers from:

  • Inconsistent voice and tone
  • Mixed terminology
  • Unclear instructions
  • Style inconsistencies

The AI can automatically:

  • Convert passive voice to active voice
  • Standardize technical terms
  • Improve readability
  • Ensure consistent formatting

Blog posts and articles

Content creators can use this tool to:

  • Maintain consistent brand voice
  • Apply editorial style guides
  • Ensure professional quality
  • Save time on manual editing

Academic and technical papers

Research and technical writing can benefit from:

  • Consistent citation formats
  • Standardized terminology
  • Improved clarity and readability
  • Professional presentation

How it works

The tool operates through a sophisticated pipeline:

    1. Document discovery: Recursively scans directories for .txt and .md files
    2. Style analysis: Runs Vale in JSON output mode to detect violations
    3. Vocabulary lookup: Automatically finds relevant definitions from Microsoft’s style guide
    4. Prompt generation: Creates comprehensive prompts for AI models
    5. AI processing:
      • Manual mode: Generates prompts for you to copy to your favorite LLM
      • Automatic mode (with --gemini): Sends prompts directly to Gemini CLI
    6. Iterative refinement:
      • Automatic mode: Repeats until optimal results are achieved
      • Manual mode: You can manually repeat the process by running Vale again on the fixed content
    7. Smart stopping:
      • Automatic mode: Automatically stops when no further improvements are possible
      • Manual mode: You decide when to stop based on your quality requirements
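Step 1 of the pipeline, document discovery, can be sketched with pathlib. This is a minimal illustration, assuming the extension alone decides which files qualify. The find_documents name is hypothetical, and the temporary directory exists only to make the example runnable.

```python
import tempfile
from pathlib import Path


def find_documents(input_dir, suffixes=(".md", ".txt")):
    """Recursively collect files whose final extension is in suffixes."""
    root = Path(input_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in suffixes
    )


# Build a small throwaway tree to show which files are picked up.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs").mkdir()
    (root / "README.md").write_text("x")
    (root / "README.md.fixed").write_text("x")
    (root / "docs" / "guide.txt").write_text("x")
    (root / "docs" / "logo.png").write_text("x")
    found = [p.relative_to(root).as_posix() for p in find_documents(root)]

print(found)
```

Because only the final extension is checked, generated files such as README.md.fixed or README.md.prompt are skipped on later runs.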

Technical features

  • Vale integration: Leverages industry-standard linting
  • Microsoft Style Guide support: Built-in support for Microsoft's comprehensive style guide
  • Gemini CLI integration: Seamless automation with Google's Gemini models
  • Flexible LLM support: Works with any LLM via manual prompt copying
  • Batch processing: Handles entire document collections
  • Smart iteration: Prevents infinite loops with intelligent stopping conditions

Getting started

The tool requires:

  • Python 3.7+
  • Vale linting tool (install via brew install vale on macOS)
  • Microsoft Style Guide repository
  • Optional: Gemini CLI for automatic processing

Basic setup:

Important: If you plan to use the automatic Gemini processing (with the --gemini flag), you'll need to install and configure Gemini CLI:

This will prompt you to sign in with your Google account and grant permissions for Gemini.

Vale configuration

You'll need to create a .vale.ini configuration file in your project root. This file tells Vale to use the Microsoft style guide and how to process your files:

After creating this file, run vale sync to download the required style files.

Note: The BlockIgnores line tells Vale to ignore any ... tags that LLMs might add to their output, so that Vale doesn't style-check the AI's thinking process.

Note: Vale’s alert levels are “suggestion”, “warning”, and “error”, in increasing severity. To see only warnings and errors, change “MinAlertLevel” to “warning”. To see only errors, change “MinAlertLevel” to “error”.
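The same severity ordering can be applied after the fact to alerts parsed from Vale's JSON output, where each alert carries a "Severity" field. The sketch below is a hypothetical helper that mirrors what MinAlertLevel does. It is not part of Vale or of the tool, and the sample alerts are invented.

```python
SEVERITY_ORDER = {"suggestion": 0, "warning": 1, "error": 2}


def filter_alerts(alerts, min_level="suggestion"):
    """Keep alerts whose Severity is at or above min_level."""
    threshold = SEVERITY_ORDER[min_level]
    return [a for a in alerts if SEVERITY_ORDER[a["Severity"]] >= threshold]


# Invented alerts, shaped like entries in Vale's JSON output.
sample_alerts = [
    {"Check": "Microsoft.Passive", "Severity": "suggestion"},
    {"Check": "Microsoft.We", "Severity": "warning"},
    {"Check": "Microsoft.Contractions", "Severity": "error"},
]
kept = [a["Check"] for a in filter_alerts(sample_alerts, "warning")]
print(kept)
```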

Command line arguments

The script offers several command-line options to customize its behavior:

Required arguments

  • --input-dir: Directory containing your .txt and .md files to process
  • --styleguide-dir: Path to the Microsoft Style Guide's a-z-word-list-term-collections directory

Optional arguments

  • --gemini: Enable automatic processing with Gemini CLI (default: manual mode)
  • --model: Specify which Gemini model to use (e.g., 'gemini-2.5-flash', 'gemini-1.5-pro')
  • --vale-ini: Path to a custom Vale configuration file (default: uses .vale.ini in current directory)
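The options above map naturally onto Python's argparse. The following is a sketch of how such a parser could be declared, not the script's actual source. The help strings, the defaults, and the example paths are assumptions.

```python
import argparse


def build_parser():
    """Declare the documented options (illustrative, not the tool's source)."""
    parser = argparse.ArgumentParser(
        description="Generate LLM prompts that fix Vale style alerts."
    )
    parser.add_argument("--input-dir", required=True,
                        help="Directory containing .txt and .md files")
    parser.add_argument("--styleguide-dir", required=True,
                        help="Path to a-z-word-list-term-collections")
    parser.add_argument("--gemini", action="store_true",
                        help="Process prompts automatically with Gemini CLI")
    parser.add_argument("--model", default=None,
                        help="Gemini model name, for example gemini-2.5-flash")
    parser.add_argument("--vale-ini", default=".vale.ini",
                        help="Path to a Vale configuration file")
    return parser


# Hypothetical invocation; the directory paths are placeholders.
args = build_parser().parse_args([
    "--input-dir", "docs",
    "--styleguide-dir", "styleguide/a-z-word-list-term-collections",
    "--gemini",
    "--model", "gemini-2.5-flash",
])
print(args)
```

Omitting --gemini leaves args.gemini as False, which corresponds to the manual mode default described above.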

Example commands

Basic usage (manual mode):

Automatic processing with a specific model:

Using a custom Vale configuration:

Conclusion

The llm-styleguide-helper demonstrates how AI can transform the tedious task of style guide compliance into an automated, efficient process. By combining the precision of linting tools with the intelligence of large language models, we can achieve consistent, professional-quality documentation at scale.

Whether you're maintaining open source documentation, creating user guides, or writing technical content, this tool can help ensure your content meets Microsoft's high standards of style and consistency. The fact that it can improve even Microsoft's own documentation speaks to its effectiveness and potential impact.

As we move toward more automated content creation and editing workflows, tools like this will become essential for maintaining quality and consistency across all types of written content that follow Microsoft's style guide.
