feat: add file-based Devin provider to fix ENAMETOOLONG on Windows #5

Open: jb-improving wants to merge 1 commit into main from feat/fix-enametoolong-devin-provider
jb-improving (Collaborator) commented on Jul 2, 2026

Summary

On Windows, passing a large prompt (more than about 32k characters) as a command-line argument to the Devin CLI fails with an ENAMETOOLONG error, because the operating system caps the length of a command line. This PR adds a new file-based provider (devinProvider.js) that writes the prompt to a temporary file and passes the file path via the --prompt-file flag, so the prompt itself never appears on the command line.

Changes

devinProvider.js (new file)

  • Implements a promptfoo custom provider class (DevinProvider) that exports the standard id() and callApi() interface
  • Writes the full prompt to a temporary file in the OS temp directory, then invokes Devin CLI with --prompt-file instead of passing the prompt as a positional CLI argument
  • Uses async spawn (instead of spawnSync) for non-blocking execution
  • Parses token usage from Devin's --export JSON output, aggregating input/output/cache tokens from agent steps
  • Supports both provider mode and grader mode (auto-detected from prompt format, same as devin.js)
  • Cleans up temp files in a finally block to avoid leftover artifacts
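
For reviewers who want the overall shape without opening the diff, the temp-file flow described above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: the binary name `devin`, the `command` and `extraArgs` config keys, the `devin-prompt-` directory prefix, and the `run` helper are all assumptions made for the example. Only the `--prompt-file` flag and the id()/callApi() interface come from the description, and token parsing and grader mode are left out.

```javascript
// Illustrative sketch of a file-based promptfoo custom provider.
const { spawn } = require('node:child_process');
const fs = require('node:fs/promises');
const os = require('node:os');
const path = require('node:path');

class DevinProvider {
  constructor(options = {}) {
    const config = options.config || {};
    this.providerId = options.id || 'devin-file-provider';
    // ASSUMPTION: the CLI binary is called "devin"; `command` and `extraArgs`
    // are hypothetical config keys used here to keep the sketch testable.
    this.command = config.command || 'devin';
    this.extraArgs = config.extraArgs || [];
  }

  id() {
    return this.providerId;
  }

  async callApi(prompt) {
    // A fresh directory per call avoids collisions between concurrent test cases.
    const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'devin-prompt-'));
    const promptFile = path.join(dir, 'prompt.txt');
    try {
      await fs.writeFile(promptFile, prompt, 'utf8');
      // Only the short file path goes on the command line, never the prompt.
      const args = [...this.extraArgs, '--prompt-file', promptFile];
      const output = await run(this.command, args);
      return { output };
    } catch (err) {
      return { error: `Devin CLI call failed: ${err.message}` };
    } finally {
      // Runs on success and on failure, so no prompt files are left behind.
      await fs.rm(dir, { recursive: true, force: true });
    }
  }
}

// Hypothetical helper: run a command asynchronously and resolve with its stdout.
function run(command, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { stdio: ['ignore', 'pipe', 'pipe'] });
    let stdout = '';
    let stderr = '';
    child.stdout.on('data', (chunk) => { stdout += chunk; });
    child.stderr.on('data', (chunk) => { stderr += chunk; });
    child.on('error', reject); // e.g. the binary was not found
    child.on('close', (code) => {
      if (code === 0) resolve(stdout.trim());
      else reject(new Error(`exit code ${code}: ${stderr.trim()}`));
    });
  });
}

module.exports = DevinProvider;
```

Because the argument list stays short regardless of prompt size, the command-line length limit should no longer depend on the prompt. Returning `{ error }` instead of throwing is one possible choice for reporting a failed call; the provider in this PR may handle failures differently.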

promptfooconfig.yaml (modified)

  • Adds a commented-out provider entry referencing file://devinProvider.js with the label Devin SWE-1.6 (file provider)
  • Left commented out so users can enable it as needed without affecting the default configuration

Design Decisions

  • Original devin.js provider left in place: The existing exec-based devin.js provider remains active and unchanged. This ensures ongoing testing is not disrupted. The new file-based provider is available as an opt-in alternative.
  • Provider entry commented out in config: The new provider is validated and ready to use but is not enabled by default. Users can uncomment the entry in promptfooconfig.yaml to switch to the file-based provider.

Validation

  • Ran npx promptfoo eval --no-cache with both the exec-based and file-based providers enabled
  • 14/14 tests passed (100%), with no errors and no failures
  • Both providers produced correct output across all test cases (deterministic, rubric-based, and script-based evaluations)
  • Token usage tracking confirmed working via --export parsing
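
As a rough illustration of the token aggregation confirmed above, the sketch below sums per-step counts into a promptfoo-style tokenUsage object. The shape of the export is an assumption: the `steps` array and the `usage.input_tokens`, `usage.output_tokens`, and `usage.cache_tokens` field names are hypothetical placeholders, not Devin's documented --export schema, so they would need to be swapped for the real field names.

```javascript
// Sketch: aggregate token counts from an already-parsed export object.
// ASSUMPTION: `steps`, `usage`, `input_tokens`, `output_tokens`, and
// `cache_tokens` are hypothetical names chosen for illustration only.
function aggregateTokenUsage(exportJson) {
  const totals = { prompt: 0, completion: 0, cached: 0 };
  const steps = Array.isArray(exportJson && exportJson.steps) ? exportJson.steps : [];
  for (const step of steps) {
    const usage = (step && step.usage) || {};
    // Missing or non-numeric counts are treated as zero.
    totals.prompt += Number(usage.input_tokens) || 0;
    totals.completion += Number(usage.output_tokens) || 0;
    totals.cached += Number(usage.cache_tokens) || 0;
  }
  // Design choice in this sketch: cached tokens are reported separately
  // and are not added to the total.
  return { ...totals, total: totals.prompt + totals.completion };
}

const example = aggregateTokenUsage({
  steps: [
    { usage: { input_tokens: 10, output_tokens: 5, cache_tokens: 2 } },
    { usage: { input_tokens: 3 } },
  ],
});
console.log(example);
```

The returned object could be attached to the provider response as `tokenUsage`. Steps without usage data are skipped rather than treated as errors, which keeps a partial export from failing the whole evaluation.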

Next Steps

  • Cross-platform testing: Additional testing on macOS and Linux is needed before unifying providers. The temp file approach should work cross-platform, but this has only been validated on Windows so far.
  • Provider unification: Once cross-platform validation is complete, the plan is to update other providers (gh_copilot.js, claude.js, agy.js, kiro.js) with the same file-based pattern to prevent ENAMETOOLONG across all providers.
  • Enable by default: After validation, consider replacing the exec-based devin.js provider with the file-based approach as the default.

Add devinProvider.js as a promptfoo custom provider that writes prompts
to a temp file and passes them via --prompt-file instead of as a CLI
argument. This avoids the Windows ENAMETOOLONG error when prompts exceed
the 32k character command-line limit.

The provider also uses async spawn and parses token usage from Devin's
--export output. The provider entry in promptfooconfig.yaml is added
but commented out, ready to be enabled when needed.
jb-improving requested a review from trayburn on July 2, 2026 at 16:21