Command line interface #
mbf-fastq-processor is configured exclusively through a TOML document. The CLI is therefore intentionally minimal and focuses on selecting the configuration and the working directory.
Usage #
mbf-fastq-processor process [config.toml] [--allow-overwrite]
mbf-fastq-processor template
mbf-fastq-processor verify [config.toml] [--output-dir <OUTPUT_DIR>]
mbf-fastq-processor interactive [config.toml]
mbf-fastq-processor completions <SHELL>
Process #
Processes FASTQ files as described in <config.toml> (see the TOML format reference). Relative paths are resolved against the current shell directory.
The config.toml argument can be omitted if and only if the current directory contains exactly one .toml file and that file has both an [input] and an [output] section.
(not) “Done” marker file #
By default, the existence of any output file leads to an early abort before any processing happens (other output files might already have been created with 0 bytes at this point, though). If you pass --allow-overwrite (or if an output.incomplete file exists), existing output files are overwritten instead.
The output.incomplete file exists until mbf-fastq-processor exits successfully. You can therefore detect incomplete runs by the presence of that file.
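A minimal sketch of such a check in a wrapper script. It assumes the marker sits in the directory you pass in and is literally named output.incomplete, as in the text above; the function name run_is_incomplete is made up for this example.

```shell
#!/bin/sh
# Report whether a previous run left its "incomplete" marker behind.
# $1: directory to look in (assumption: the marker sits next to the outputs)
# $2: marker file name, defaulting to output.incomplete as named in the text
run_is_incomplete() {
    dir="${1:-.}"
    marker="${2:-output.incomplete}"
    [ -e "$dir/$marker" ]
}

# Example use:
#   if run_is_incomplete ./results; then
#       echo "previous run did not finish" >&2
#   fi
```

If your marker has a different name or location, pass it as the second argument rather than editing the function.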
Behaviour #
- Exit status 0 denotes success; non-zero exit codes indicate configuration, IO, or data validation failures.
- Error messages go to stderr. Helpful hints for configuration issues also go to stderr.
- --help goes to stdout. Running without arguments shows the help on stderr.
- By default there is no stdout output. Progress can change that.
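Because errors arrive on stderr and the exit status signals failure, a calling script can keep both. The following is a sketch only: run_logged and the log file name are hypothetical helpers for this example, not part of the tool.

```shell
#!/bin/sh
# Run a command, keep its stderr in a log file, and pass on the exit status.
# $1: log file for stderr; remaining arguments: the command to run.
run_logged() {
    log="$1"
    shift
    status=0
    "$@" 2>"$log" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "failed with exit status $status, see $log" >&2
    fi
    return "$status"
}

# Example use:
#   run_logged run.stderr.log mbf-fastq-processor process config.toml
```

The wrapper only distinguishes zero from non-zero, since the text above does not assign meanings to individual non-zero codes.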
Template #
Outputs a configuration file showing all the options, ready to be ‘uncommented’.
The template is also available here.
Appropriate parts of the template are also shown when a configuration error is detected.
Verify #
The verify command runs processing in a temporary directory and compares the outputs against expected outputs (or expected failures) in the same directory as the configuration file.
This is useful for:
- Testing that your pipeline produces expected results
- Regression testing during development (many of mbf-fastq-processor’s test cases use this exact facility)
- Validating that changes don’t affect output
Usage #
mbf-fastq-processor verify [config.toml] [--output-dir <OUTPUT_DIR>] [--unsafe-call-prep-sh]
If no configuration file is specified, the tool will auto-detect a single .toml
file in the current directory if that file contains both [input] and [output] sections.
Behavior #
- Creates a temporary directory and copies your TOML file into it with adjusted input file paths
- Alternatively, if a file named ‘copy_input’ exists, copies the input FASTQ files and keeps the TOML paths relative
- If a prep.sh exists in the working directory: if --unsafe-call-prep-sh is passed, copies it to the temporary directory and executes it. If not, aborts with an error message
- If no ‘expected_error.txt’ (or .regex) exists, runs ‘process’ (in the temporary directory). If the configuration uses stdin (segment = --stdin--) and a file named ‘stdin’ exists in the config directory, pipes that file’s content to the subprocess as stdin
- If ‘expected_error.txt’ (or .regex) exists, runs ‘validate’ instead of ‘process’
- If test.sh exists, runs that instead of ‘process’ or ‘validate’ and skips all output verification (the test.sh must do that itself)
- Compares all output files (matching the output prefix) against expected files in the config’s directory
- If files called ‘stdout’ or ‘stderr’ exist, compare these to actual stdout/stderr
- If a file called ‘expected_error.txt’ exists, verifies that stderr (from ‘validate’) contains that message and that the return code was non-zero.
- If a file called ‘expected_runtime_error.txt’ exists, verifies that ‘validate’ succeeds but that stderr (from ‘process’) contains that message and the return code was non-zero.
- (The same applies to .regex files, except that a regex match is used instead of a string search.)
- Normalizes non-reproducible content in JSON/HTML reports (timestamps, paths, versions) before comparison
- Reports missing files, unexpected files, and content mismatches
- Exit code 0 indicates successful verification; non-zero indicates failure
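Since verify reports its result through the exit code, it can be looped over a set of test cases. The sketch below assumes one directory per test case, each holding a file named config.toml. The PROCESSOR variable and the verify_all function belong to this example only and are not features of the tool.

```shell
#!/bin/sh
# Run `verify` in every directory directly below $1 that has a config.toml.
# Assumption: one test case per directory, configuration named config.toml.
# PROCESSOR is a variable of this sketch so the binary can be swapped out.
PROCESSOR="${PROCESSOR:-mbf-fastq-processor}"

verify_all() {
    root="${1:-.}"
    failed=0
    for config in "$root"/*/config.toml; do
        [ -e "$config" ] || continue   # the glob matched nothing
        case_dir="$(dirname "$config")"
        # verify exits 0 on success and non-zero on failure
        if (cd "$case_dir" && "$PROCESSOR" verify config.toml); then
            echo "ok:   $case_dir"
        else
            echo "FAIL: $case_dir"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}

# Example use:
#   verify_all ./test_cases
```

Each case runs in a subshell after changing into its directory, so relative paths in the configuration resolve the same way as when you run verify by hand from that directory.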
Output Directory Option #
The --output-dir option specifies a directory to copy the actual outputs to if verification fails:
mbf-fastq-processor verify config.toml --output-dir ./debug_output
When verification fails with this option:
- The specified directory is removed if it exists (!)
- All output files from the temporary directory are copied to it
- JSON and HTML files are normalized (same normalization used for comparison)
- stdout/stderr are logged into files named stdout and stderr
- This makes it easy to inspect what the processor actually generated
This is particularly useful for:
- Debugging test failures
- Updating expected outputs after intentional changes
- Understanding differences between expected and actual results
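One way to inspect the differences is to diff each copied file against the same-named file next to the configuration. This is a rough sketch that assumes a flat directory layout; diff_outputs is a hypothetical helper. The copied stdout and stderr files will show up as unexpected unless expected versions exist, and because the copied JSON/HTML reports are normalized, expected reports that are not stored in the same normalized form may show spurious differences.

```shell
#!/bin/sh
# Diff every regular file in the --output-dir copy against the same-named
# file in the expected directory (the one holding the configuration).
# $1: expected directory, $2: actual directory.
# Returns non-zero if any file differs or has no expected counterpart.
diff_outputs() {
    expected="$1"
    actual="$2"
    differing=0
    for file in "$actual"/*; do
        [ -f "$file" ] || continue
        name="$(basename "$file")"
        if [ ! -e "$expected/$name" ]; then
            echo "unexpected file: $name"
            differing=$((differing + 1))
        elif ! diff -u "$expected/$name" "$file"; then
            differing=$((differing + 1))
        fi
    done
    [ "$differing" -eq 0 ]
}

# Example use:
#   diff_outputs . ./debug_output
```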
Stdin Support #
When your configuration uses stdin input (by specifying --stdin-- as an input file), the verify command can simulate stdin input by reading from a file named stdin in the same directory as your configuration file.
For example, if your config.toml contains:
[input]
read1 = '--stdin--'
then verify pipes the content of the file named stdin next to config.toml into the subprocess.
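A sketch of preparing such a stdin file for a test case. The FASTQ record is invented for illustration, and write_stdin_fixture is a hypothetical helper name.

```shell
#!/bin/sh
# Write a one-record FASTQ file named `stdin` into a test-case directory.
# The record content below is made up for illustration.
# $1: directory containing the configuration file (default: current directory)
write_stdin_fixture() {
    case_dir="${1:-.}"
    printf '@read1\nACGTACGT\n+\nIIIIIIII\n' > "$case_dir/stdin"
}

# Example use:
#   write_stdin_fixture .
#   mbf-fastq-processor verify config.toml
```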
Interactive #
The interactive mode takes your configuration file, samples 15 reads from the first 10,000 (both configurable via CLI arguments), and shows you the Inspect results.
Every time you save the configuration file, the results refresh.
This way you can quickly tune and work on your configuration.
Completions #
Generate shell completion scripts for command-line auto-completion in various shells.
Supported Shells #
- bash - Bourne Again Shell
- fish - Friendly Interactive Shell
- zsh - Z Shell
- powershell - PowerShell
- elvish - Elvish Shell
Installation Instructions #
Bash
Add to ~/.bashrc:
source <(mbf-fastq-processor completions bash)
Or, for the environment-based approach (auto-updates):
eval "$(COMPLETE=bash mbf-fastq-processor)"
Fish
Save to Fish completions directory:
mbf-fastq-processor completions fish > ~/.config/fish/completions/mbf-fastq-processor.fish
Or add to ~/.config/fish/config.fish for the environment-based approach:
if command -v mbf-fastq-processor > /dev/null
    COMPLETE=fish mbf-fastq-processor | source
end
Zsh
Add to ~/.zshrc:
source <(mbf-fastq-processor completions zsh)
Or, for the environment-based approach:
eval "$(COMPLETE=zsh mbf-fastq-processor)"
PowerShell
Add to your PowerShell profile:
mbf-fastq-processor completions powershell | Out-String | Invoke-Expression
Features #
Shell completions provide:
- Command and subcommand completion
- File path completion for configuration files
- Directory path completion for output directories
- Shell-specific syntax and behavior
After installing completions, restart your shell or source the configuration file for changes to take effect.