
Spark: A Zero-Dependency Open-Source Repository Health Toolkit

By Rudra Sarker • Published May 9, 2026

Why Repository Health Matters

There are over 400 million repositories on GitHub. The vast majority are abandoned -- half-finished projects with no README, no license, no CI, and no documentation. Even among active projects, there is enormous variation in quality. Some repositories are exemplary: clean structure, comprehensive docs, automated testing, consistent contribution guidelines. Others are functional but opaque -- the code works, but onboarding a new contributor requires tribal knowledge that only exists in the maintainer's head.

Repository health is not just aesthetics. It directly affects adoption. A developer evaluating two libraries that solve the same problem will usually choose the one with better documentation, clearer licensing, and visible CI badges. Health scores also help maintainers prioritize: if your project scores 45 out of 100, the report shows which improvements would raise the score the most.

Existing tools tackle pieces of this problem. There are linters, security scanners, documentation generators, and compliance checkers. But most of them require installing a dependency chain that pulls in dozens of packages. For a tool whose entire purpose is to assess whether your repository is well-organized, requiring a bloated dependency tree feels counterproductive. I wanted something lighter.

Zero-Dependency Philosophy

Spark has zero runtime dependencies. Every line of functionality is built on Python's standard library. No third-party packages, no pip install cascades, no version conflict nightmares. If you have Python 3.10 or later installed, Spark works.

pip install spark-oss-repo-toolkit
# That's it. Zero additional packages installed.

This was a deliberate design choice with real tradeoffs. Using only stdlib means I had to implement things that libraries like rich or click would provide out of the box -- terminal formatting, argument parsing helpers, color output. But the benefit is significant: Spark will never break because a transitive dependency released a backwards-incompatible change. It will never conflict with your project's own dependencies. It will never surprise you with a supply-chain vulnerability in an indirect dependency.
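
To make the tradeoff concrete, here is a minimal sketch of stdlib-only color output of the kind that rich would otherwise provide. It is not Spark's actual implementation: the function names, the small palette, and the choice to honor the NO_COLOR environment variable are all assumptions made for illustration.

```python
import os
import sys
from typing import TextIO

# ANSI SGR codes. This palette is illustrative, not Spark's actual one.
_CODES: dict[str, str] = {"red": "31", "green": "32", "yellow": "33", "bold": "1"}
_RESET = "\033[0m"


def supports_color(stream: TextIO) -> bool:
    """Return True when the stream is a terminal and NO_COLOR is unset or empty."""
    if os.environ.get("NO_COLOR"):
        return False
    return hasattr(stream, "isatty") and stream.isatty()


def colorize(text: str, style: str, enabled: bool = True) -> str:
    """Wrap text in an ANSI escape sequence, or return it unchanged."""
    if not enabled or style not in _CODES:
        return text
    return f"\033[{_CODES[style]}m{text}{_RESET}"


use_color = supports_color(sys.stdout)
print(colorize("PASS", "green", use_color), "README found")
print(colorize("FAIL", "red", use_color), "LICENSE missing")
```

When output is piped to a file or a CI log, supports_color returns False and the text is printed without escape sequences.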

The codebase is also mypy strict-compatible. Every function has explicit type annotations, and the entire project passes mypy --strict with zero errors. For a developer tool that assesses code quality, holding itself to the same standard is non-negotiable.

How Scoring Works

Spark provides 8 CLI commands, each serving a distinct purpose in the repository assessment workflow:

  • validate -- Checks for required files: README, LICENSE, CONTRIBUTING guide, CODE_OF_CONDUCT, changelog
  • assess -- Runs a comprehensive evaluation and produces a 0-100 maturity score
  • discover -- Maps the repository structure and identifies missing conventions
  • scaffold -- Generates missing standard files from templates
  • health -- Quick health check with pass/fail indicators across key dimensions
  • locales -- Manages internationalization strings
  • integration-links -- Verifies links to CI services, documentation sites, and package registries
  • version -- Displays version and environment information
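
As an illustration of the kind of check validate describes, the sketch below looks for the required files at the repository root using only pathlib. The accepted file names and the function name are hypothetical. Spark's real rules may differ.

```python
from pathlib import Path

# Hypothetical mapping of requirement -> accepted file names.
# Spark's real list may differ.
REQUIRED_FILES: dict[str, tuple[str, ...]] = {
    "README": ("README.md", "README.rst", "README"),
    "LICENSE": ("LICENSE", "LICENSE.md", "LICENSE.txt"),
    "CONTRIBUTING": ("CONTRIBUTING.md",),
    "CODE_OF_CONDUCT": ("CODE_OF_CONDUCT.md",),
    "CHANGELOG": ("CHANGELOG.md", "CHANGELOG"),
}


def missing_required_files(repo_path: str) -> list[str]:
    """Return the requirements for which no accepted file exists at the repo root."""
    root = Path(repo_path)
    return [
        requirement
        for requirement, candidates in REQUIRED_FILES.items()
        if not any((root / name).is_file() for name in candidates)
    ]
```

An empty list means every required file was found. Anything else is a list of what to add.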

The assess command is the centerpiece. It evaluates the repository across multiple dimensions -- documentation completeness, license clarity, CI configuration, code organization, issue and PR templates, security policy, and community health files -- then produces a single numerical score from 0 to 100. The output includes both the aggregate score and a breakdown showing where points were gained and lost, along with specific recommendations for improvement.

$ spark assess ./my-project

Repository Assessment: my-project
================================
Overall Score: 72 / 100

Strengths:
  - README with installation and usage sections
  - MIT License detected
  - CI workflow present (GitHub Actions)

Recommendations:
  - Add CONTRIBUTING.md (score +8)
  - Add SECURITY.md policy (score +5)
  - Include issue templates (score +4)
  - Add CODE_OF_CONDUCT.md (score +3)

The scoring weights are designed to reflect what matters most for open-source adoption. Documentation and licensing carry the highest weight because they are the first things potential users and contributors check. CI configuration and community health files follow, because they signal active maintenance and a welcoming contributor experience.
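
One way to picture a weighted rubric like this is the sketch below. The weights, dimension names, and function names are invented for illustration and are not Spark's published values. The only properties taken from the description above are that documentation and licensing weigh the most and that the result lands between 0 and 100.

```python
# Hypothetical weights that sum to 100. Spark's real rubric may differ.
WEIGHTS: dict[str, int] = {
    "documentation": 30,
    "license": 20,
    "ci": 15,
    "community": 15,
    "templates": 10,
    "security": 10,
}


def _clamp(value: float) -> float:
    """Keep a rating inside the range 0.0 to 1.0."""
    return max(0.0, min(1.0, value))


def maturity_score(ratings: dict[str, float]) -> int:
    """Combine per-dimension ratings (0.0 to 1.0) into a 0-100 score."""
    return round(sum(w * _clamp(ratings.get(name, 0.0)) for name, w in WEIGHTS.items()))


def potential_gains(ratings: dict[str, float]) -> list[tuple[str, int]]:
    """List dimensions by the points a perfect rating would add, largest first."""
    gains = [
        (name, round(w * (1.0 - _clamp(ratings.get(name, 0.0)))))
        for name, w in WEIGHTS.items()
    ]
    return sorted((g for g in gains if g[1] > 0), key=lambda g: g[1], reverse=True)
```

Sorting the gains is what turns a single number into a prioritized list of recommendations.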

CI Integration

Spark is designed to run in CI pipelines. Because it has zero runtime dependencies, adding it to a GitHub Actions workflow takes one install step and one command, with no caching setup, extra dependency installation, or virtual environment management. The tool exits with a non-zero code when the health score falls below a configurable threshold, making it a natural gatekeeper in branch protection rules.

# .github/workflows/spark-health.yml
name: Repository Health Check
on: [push, pull_request]
jobs:
  health:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - run: pip install spark-oss-repo-toolkit
      - run: spark assess . --min-score 60

This setup fails the CI build if the repository health score drops below 60. You can adjust the threshold to match your standards. For mature projects, setting it to 80 or higher means that a PR which removes a community health file or breaks the documentation structure fails the check as soon as the score dips below the threshold.

Plugin System

No single tool can anticipate every dimension of repository quality that every team cares about. Some teams prioritize accessibility compliance. Others care about containerization best practices. Some want to enforce specific documentation formats.

Spark addresses this with a plugin system built around the SparkPlugin base class. Writing a custom check is straightforward:

from spark.plugin import SparkPlugin

class AccessibilityCheck(SparkPlugin):
    name = "accessibility"
    description = "Checks for accessibility documentation"

    def run(self, repo_path: str) -> dict:
        # Custom logic here
        return {
            "score": 85,
            "details": "Accessibility statement found",
            "recommendations": ["Add ARIA labels documentation"]
        }

Plugins integrate seamlessly into the existing assess command. When Spark discovers a plugin, it extends its scoring rubric to include the plugin's dimensions. The final report merges the built-in scores with plugin scores, giving you a unified health assessment that reflects your organization's specific requirements.
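
To show how a plugin result might feed into a combined score, here is a self-contained sketch. It defines a local stand-in for SparkPlugin so it runs without Spark installed. The ACCESSIBILITY.md rule and the averaging merge_scores function are hypothetical, since the actual merge rule is not specified here.

```python
from pathlib import Path


class SparkPlugin:
    """Local stand-in for spark.plugin.SparkPlugin, used only for this sketch."""

    name = ""
    description = ""

    def run(self, repo_path: str) -> dict:
        raise NotImplementedError


class AccessibilityCheck(SparkPlugin):
    name = "accessibility"
    description = "Checks for accessibility documentation"

    def run(self, repo_path: str) -> dict:
        # Hypothetical rule: look for ACCESSIBILITY.md at the repository root.
        if (Path(repo_path) / "ACCESSIBILITY.md").is_file():
            return {
                "score": 100,
                "details": "Accessibility statement found",
                "recommendations": [],
            }
        return {
            "score": 0,
            "details": "No accessibility statement found",
            "recommendations": ["Add ACCESSIBILITY.md"],
        }


def merge_scores(builtin_score: int, plugin_results: list[dict]) -> int:
    """Hypothetical merge rule: plain average of the built-in and plugin scores."""
    scores = [builtin_score] + [int(result["score"]) for result in plugin_results]
    return round(sum(scores) / len(scores))
```

A real rubric would more likely weight each plugin than average it equally. The average is used here only to keep the sketch short.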

Internationalization

Open source is global. Spark ships with built-in support for three languages: English, Spanish, and French. All user-facing strings -- CLI output, assessment reports, recommendations -- can be displayed in any of the supported languages. The i18n system uses standard Python gettext conventions, and the locales command manages translation files.
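
For readers unfamiliar with gettext, the sketch below shows the standard-library pattern of loading a catalog with a fallback to the untranslated strings. The "spark" domain, the "locales" directory, and the load_translator helper are placeholder names, not confirmed details of Spark's layout.

```python
import gettext

# Assumed names: the domain and directory below are placeholders.
DOMAIN = "spark"
LOCALE_DIR = "locales"


def load_translator(language: str, locale_dir: str = LOCALE_DIR) -> gettext.NullTranslations:
    """Load the catalog for a language, falling back to untranslated strings."""
    return gettext.translation(
        DOMAIN, localedir=locale_dir, languages=[language], fallback=True
    )


translator = load_translator("es")
_ = translator.gettext
# Prints the Spanish string when a compiled catalog exists, else the original.
print(_("Overall Score"))
```

With fallback=True, a missing catalog does not raise an error, so the tool stays usable in English when a translation has not been installed.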

This matters because developer tooling should not assume English proficiency. A maintainer in Latin America should be able to run a repository assessment and get actionable recommendations in Spanish. The goal is to lower the barrier to good open-source practices for the widest possible audience.

Get Started

Spark is open-source under the MIT License. Install it, assess your repositories, and start improving your open-source health:

Spark OSS Repository Toolkit

8 CLI Commands · Zero Runtime Dependencies · 0-100 Scoring · Plugin System
mypy Strict · 3 Languages · Python 3.10+ · MIT License