Frequently Asked Questions
Can't find what you're looking for? Contact our support team — we're happy to help!
General
How do I cite PNA/MPX?
Please cite the specific versions of the tools used in your analysis. We've made it easy:
➡️ How to cite
Where can I find example datasets?
We provide several publicly available datasets to help you get started and test your analysis workflows:
➡️ Datasets
Running the Pipeline
Questions about running nf-core/pixelator from start to finish.
How do I run pixelator on an HPC cluster?
The nf-core/pixelator pipeline supports execution on HPC clusters through Nextflow's built-in executors. Configuration options are available for SLURM, SGE, PBS, and other common schedulers.
➡️ See our configuration Quick Start guide
How can I diagnose a pipeline error?
When a run fails, there are several places you can check to determine what went wrong:
| Location | What you'll find |
|---|---|
| `.nextflow.log` | The hidden file in your launch directory containing the full execution trace. |
| `<output dir>/pipeline_info/execution_trace_<timestamp>.txt` | Details on which stages failed or completed. |
| `<output dir>/pixelator/[stage]/<sample_name>.report.json` | Sample-wise quality control metrics generated during the run. These files can be useful to understand if there was an issue with the data. |
| `<work dir>/**/**/*.log` | Logs for individual pipeline tasks (see more on how to find them below). |
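For instance, the execution trace is a tab-separated file, so a quick way to list failed stages is to filter on its `status` column. The snippet below is a sketch: the column names match Nextflow's default trace fields, but the rows are invented, not from a real run.

```python
import csv
import io

# Illustrative excerpt of an execution_trace_<timestamp>.txt file; real traces
# carry more columns, but "name" and "status" are enough to spot failures.
trace = io.StringIO(
    "task_id\tname\tstatus\texit\n"
    "1\tPIXELATOR_AMPLICON (sample1)\tCOMPLETED\t0\n"
    "2\tPIXELATOR_GRAPH (sample1)\tFAILED\t1\n"
)
failed = [row["name"] for row in csv.DictReader(trace, delimiter="\t")
          if row["status"] == "FAILED"]
print(failed)  # → ['PIXELATOR_GRAPH (sample1)']
```

The same filter works with any TSV-aware tool; the point is that failed stages can be pulled out of the trace mechanically rather than by eyeballing the file.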
Finding individual logs
When a specific pipeline step fails, you should see an error message that looks something like this:
```
Caused by:
  <Cause of the error>

Command executed:
  <The command that was executed>

Command exit status:
  1 <or some other non-zero exit status>

Command output:
  <any output from the command>

Command error:
  <the error message>

Work dir:
  <path to the command work dir>

Container:
  <the container that was running>
```
To see the full logs of the specific job that failed, look under the Work dir section of the error message. That folder should contain:

- `.command.sh`: the command Nextflow executed
- `.command.out` / `.command.err`: stdout and stderr from the tool
- `.command.log`: combined stdout and stderr
- `.exitcode`: the process exit status
- `.command.trace`: resource usage (CPU, memory, runtime) for that task
Looking at the log files there can often help diagnose the problem.
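As a minimal sketch of inspecting those files programmatically (using a mock work directory, since real paths are run-specific, with invented file contents), note that they all start with a dot, so a bare `ls` will not show them:

```python
import pathlib
import tempfile

# Mock task work dir standing in for the path printed under "Work dir:";
# the file contents here are invented for illustration.
wd = pathlib.Path(tempfile.mkdtemp())
(wd / ".command.err").write_text("ERROR: process ran out of memory\n")
(wd / ".exitcode").write_text("137\n")

print(sorted(p.name for p in wd.glob(".*")))   # hidden files need an explicit glob
print((wd / ".exitcode").read_text().strip())  # 137 = 128 + SIGKILL, often an OOM kill
```

Exit codes above 128 mean the task was killed by a signal (code minus 128), which is often the fastest clue that a job ran out of memory or hit a scheduler limit.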
If you have a hard time interpreting the log output, pasting it into an LLM can be a good way to get help. Be careful not to submit sensitive information or violate your institutional policies when doing so.
Collect and send a log bundle
If you need help, you can share the log and report files with our support team for faster troubleshooting. The script below collects the most relevant pipeline logs into a single archive you can attach to your support request:

- The Nextflow log (`$NXF_LOG_FILE` if set, otherwise `.nextflow.log` in your launch directory)
- `pipeline_info/execution_trace_*.txt`
- `**/*.log`
- `**/*.report.json`
- `**/*.meta.json`

Note: the logs and QC report JSON files can contain sample names and other sample-level details. If you do not want to share that information, do not send the bundle.

Create a file called `collect_pixelator_logs.sh` with the contents below, then run it in the same directory where you ran `nextflow` and send the resulting `.zip` (or `.tar.gz`) to support.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Optional: pipeline output directory (containing pipeline_info/ and pixelator/)
# Leave empty to skip collecting output-dir artifacts.
OUTPUT_DIR="${OUTPUT_DIR:-}"

# Where to write the archive (defaults to current directory)
DEST_DIR="${DEST_DIR:-$PWD}"

ts="$(date +%Y%m%d_%H%M%S)"
bundle_dir="$(mktemp -d -t pixelator_logs_"$ts"_XXXXXX)"
stage_dir="$bundle_dir/log_bundle_$ts"
mkdir -p "$stage_dir"

# Nextflow log: use NXF_LOG_FILE if set, else .nextflow.log in current directory
nxf_log="${NXF_LOG_FILE:-.nextflow.log}"
if [[ -f "$nxf_log" ]]; then
  mkdir -p "$stage_dir/nextflow"
  cp -- "$nxf_log" "$stage_dir/nextflow/"
fi

# Output-dir artifacts (optional)
if [[ -n "${OUTPUT_DIR}" && -d "${OUTPUT_DIR}" ]]; then
  # Copy a curated subset from OUTPUT_DIR, preserving the folder structure.
  out_root="$stage_dir/output_dir"
  mkdir -p "$out_root"

  # Execution traces (kept under output_dir/pipeline_info/)
  if [[ -d "$OUTPUT_DIR/pipeline_info" ]]; then
    (
      cd "$OUTPUT_DIR"
      find ./pipeline_info -maxdepth 1 -type f -name 'execution_trace_*.txt' -exec sh -c '
        out="$1"; shift
        for f in "$@"; do
          mkdir -p -- "$out/$(dirname "$f")"
          cp -- "$f" "$out/$f"
        done
      ' sh "$out_root" {} +
    )
  fi

  # Logs + report/meta JSONs anywhere under OUTPUT_DIR (preserve relative paths)
  (
    cd "$OUTPUT_DIR"
    find . -type f \( -name '*.log' -o -name '*.report.json' -o -name '*.meta.json' \) -exec sh -c '
      out="$1"; shift
      for f in "$@"; do
        mkdir -p -- "$out/$(dirname "$f")"
        cp -- "$f" "$out/$f"
      done
    ' sh "$out_root" {} +
  )
fi

# Small manifest for context
cat > "$stage_dir/README.txt" <<EOF
Collected: $ts
OUTPUT_DIR: ${OUTPUT_DIR:-"(not set)"}
NXF_LOG_FILE: ${NXF_LOG_FILE:-"(not set)"}
Nextflow log used: $nxf_log
EOF

mkdir -p "$DEST_DIR"

# Create a .zip if possible, otherwise a .tar.gz
archive_base="$DEST_DIR/log_bundle_$ts"
if command -v zip >/dev/null 2>&1; then
  (cd "$bundle_dir" && zip -rq "${archive_base}.zip" "log_bundle_$ts")
  echo "Wrote ${archive_base}.zip"
else
  tar -C "$bundle_dir" -czf "${archive_base}.tar.gz" "log_bundle_$ts"
  echo "Wrote ${archive_base}.tar.gz (zip not found)"
fi

rm -rf "$bundle_dir"
```

Make the script executable:

```bash
chmod +x collect_pixelator_logs.sh
```

Example usage (recommended):

```bash
OUTPUT_DIR="/path/to/your/output_dir" ./collect_pixelator_logs.sh
```

If the output directory is not easily accessible, you can still run:

```bash
./collect_pixelator_logs.sh
```

To see exactly which files were included in the bundle (without extracting it), run:

```bash
unzip -l log_bundle_*.zip
```

or, if your bundle is a `.tar.gz`:

```bash
tar -tzf log_bundle_*.tar.gz
```
For a general Nextflow / nf-core debugging walkthrough, see Troubleshooting basics.
How is cell calling done?
In the nf-core/pixelator pipeline, cells are identified as highly connected subgraphs within the molecular interaction network. Most of these graphs are very small — representing spurious edges, partial subgraphs that failed to connect to a cell, or cell debris. These are easily distinguished from real cells by their size.
Cell calling separates "true" cells from background noise using the cell's number of protein molecules:
| Method | What it does |
|---|---|
| Molecule Rank Plot | Components are ranked by total antibody molecules (UMIs). A sharp "knee" or inflection point typically separates cells from debris or partial disconnected cell graphs. |
| Minimum Threshold | As a fallback, the pipeline requires >8,000 protein molecules per cell (default) to filter out the smallest graphs. |
See the full explanation in our algorithms documentation.
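The minimum-threshold fallback amounts to a simple size filter over the components. The component sizes below are invented to illustrate the idea:

```python
# Total protein molecules per component; values are illustrative. Real cells
# sit orders of magnitude above spurious edges and debris.
sizes = [120_000, 95_000, 60_000, 30_000, 12_000, 9_500, 400, 150, 80]

MIN_MOLECULES = 8_000  # the pipeline's default minimum per cell
cells = [s for s in sizes if s > MIN_MOLECULES]
print(len(cells))  # → 6
```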
Data Analysis
Common questions about working with your results.
How do I interpret the colocalization log ratio?
The colocalization log ratio compares the observed number of edges (spatial connections) joining a pair of proteins to the number expected by chance:
| Value | Interpretation |
|---|---|
| Positive | Colocalization or clustering — proteins are physically closer than random chance suggests |
| Zero | Proteins are randomly distributed relative to each other |
| Negative | Spatial separation — the two proteins are segregated from each other |
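As a sketch of the arithmetic (assuming a base-2 log; the edge counts are invented), a protein pair with four times more edges than expected by chance lands at +2:

```python
import math

observed_edges = 240   # edges joining the protein pair (illustrative)
expected_edges = 60    # edges expected under random placement (illustrative)

log_ratio = math.log2(observed_edges / expected_edges)
print(log_ratio)  # → 2.0, i.e. positive: the proteins colocalize
```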
What normalization method should I use?
We recommend the Centered Log Ratio (CLR) for protein abundance normalization.
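As a sketch of a common per-cell CLR formulation (the pseudocount of 1 is an assumption to handle zero counts; exact implementations vary), each count is log-transformed and centered on the log of the geometric mean:

```python
import math

counts = [120, 30, 0, 5]                  # raw antibody counts for one cell (illustrative)
logs = [math.log(c + 1) for c in counts]  # pseudocount of 1 avoids log(0)
mean_log = sum(logs) / len(logs)          # log of the geometric mean
clr = [v - mean_log for v in logs]        # center each log count on it

print([round(v, 3) for v in clr])         # CLR values sum to zero by construction
```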
How do I assess sample quality?
Review the following benchmarks in your QC report:
| Metric | Target | Why it matters |
|---|---|---|
| Valid Read Saturation | >20% | Ensures a sufficient number of valid reads were captured |
| Graph Edge Saturation | >40% | Indicates the library was sequenced deeply enough to capture network complexity |
| Graph Node Saturation | >60% | Indicates stable identification of sufficient protein molecules |
| Median avg. coreness | >1.5 | High coreness indicates structurally sound, intact cell components with high connectivity |
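To check a report against these targets programmatically, a dictionary comparison is enough. The metric key names below are assumptions for illustration — match them to the actual keys in your report JSON:

```python
import json

# Illustrative report snippet; real report.json files use their own key names.
report = json.loads('{"graph_edge_saturation": 0.55, "graph_node_saturation": 0.48}')

targets = {"graph_edge_saturation": 0.40, "graph_node_saturation": 0.60}
flags = [k for k, target in targets.items() if report.get(k, 0) < target]
print(flags)  # → ['graph_node_saturation']
```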
How do I filter out low-quality cells?
To ensure your analysis is based on high-quality individual cells rather than debris or artifacts, we recommend a data-driven approach. Rather than using "one-size-fits-all" numbers, visually inspect the distribution of metrics for each sample to determine specific thresholds.
Don't use fixed cutoffs — inspect your data and set thresholds per sample.
While we encourage evaluating all available cell QC metrics, you should at minimum filter based on:
| Metric | How to use it |
|---|---|
| Molecule count (`n_umi`) | Use a molecule rank plot to set a minimum threshold. Remove cell fragments below the "knee" in the curve. The pipeline default is >8,000. |
| Isotype fraction (`isotype_fraction`) | Determine a threshold to filter out cells with high levels of non-specific background binding. |
Cells that fail to meet these thresholds should be excluded during the Data Processing stage.
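A minimal sketch of such a filter (the isotype cutoff below is a placeholder, and the per-cell rows are invented — set both thresholds per sample after inspecting the distributions):

```python
# Per-cell QC metrics; the rows are invented for illustration.
qc = [
    {"n_umi": 25_000, "isotype_fraction": 0.010},
    {"n_umi": 12_000, "isotype_fraction": 0.200},  # high background binding
    {"n_umi": 900,    "isotype_fraction": 0.020},  # below the molecule-rank knee
    {"n_umi": 40_000, "isotype_fraction": 0.005},
]

MIN_UMI = 8_000       # pipeline default minimum
MAX_ISOTYPE = 0.10    # hypothetical per-sample cutoff

keep = [c for c in qc if c["n_umi"] > MIN_UMI and c["isotype_fraction"] < MAX_ISOTYPE]
print(len(keep))  # → 2
```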
How do I annotate my cells?
Cell annotation translates protein abundance levels into distinct biological identities. There are two main approaches:
| Approach | Best for | How it works |
|---|---|---|
| Automated annotation | Quick start, efficiency | Uses a pre-annotated reference dataset to map cell identities onto your PNA data. We provide a PBMC reference, or you can supply your own. |
| Marker-based annotation | Precision, gold standard | Manual examination of specific protein marker abundances to confirm cell types with high precision. |
For automated annotation, your reference dataset should closely match your sample's biological context — similar sample type and protein panel will give the most accurate results.
Learn more: