Frequently Asked Questions
Can't find what you're looking for? Contact our support team — we're happy to help!
General
How do I cite PNA/MPX?
Please cite the specific versions of the tools used in your analysis. We've made it easy:
➡️ How to cite
Where can I find example datasets?
We provide several publicly available datasets to help you get started and test your analysis workflows:
➡️ Datasets
Running the Pipeline
Questions about running nf-core/pixelator from start to finish.
How do I run pixelator on an HPC cluster?
The nf-core/pixelator pipeline supports execution on HPC clusters through Nextflow's built-in executors. Configuration options are available for SLURM, SGE, PBS, and other common schedulers.
➡️ See our configuration Quick Start guide
How can I diagnose a pipeline error?
When a run fails, there are several places you can check to determine what went wrong:
| Location | What you'll find |
|---|---|
| `.nextflow.log` | The hidden file in your launch directory containing the full execution trace. |
| `<output dir>/pipeline_info/execution_trace_<timestamp>.txt` | Details on which stages failed or completed. |
| `<output dir>/pixelator/[stage]/<sample_name>.report.json` | Sample-wise quality control metrics generated during the run. These files can be useful to understand if there was an issue with the data. |
| `<work dir>/**/**/*.log` | Logs for individual pipeline tasks (see more on how to find them below). |
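For instance, the execution trace is a tab-separated file, so a quick way to list failed stages is to filter on its `status` column. The snippet below is a sketch: the column names match Nextflow's default trace fields, but the rows are invented, not from a real run.

```python
import csv
import io

# Illustrative excerpt of an execution_trace_<timestamp>.txt file; real traces
# carry more columns, but "name" and "status" are enough to spot failures.
trace = io.StringIO(
    "task_id\tname\tstatus\texit\n"
    "1\tPIXELATOR_AMPLICON (sample1)\tCOMPLETED\t0\n"
    "2\tPIXELATOR_GRAPH (sample1)\tFAILED\t1\n"
)
failed = [row["name"] for row in csv.DictReader(trace, delimiter="\t")
          if row["status"] == "FAILED"]
print(failed)  # → ['PIXELATOR_GRAPH (sample1)']
```

The same filter works with any TSV-aware tool; the point is that failed stages can be pulled out of the trace mechanically rather than by eyeballing the file.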
Finding individual logs
When a specific pipeline step fails, you should see an error message that looks something like this:
```
Caused by:
  <Cause of the error>

Command executed:
  <The command that was executed>

Command exit status:
  1 <or some other non-zero exit status>

Command output:
  <any output from the command>

Command error:
  <the error message>

Work dir:
  <path to the command work dir>

Container:
  <the container that was running>
```
To see the full logs of the specific job that failed, look under the Work dir section of the error message. That folder should contain:

- `.command.sh`: the command Nextflow executed
- `.command.out` / `.command.err`: stdout and stderr from the tool
- `.command.log`: combined stdout and stderr
- `.exitcode`: the process exit status
- `.command.trace`: resource usage (CPU, memory, runtime) for that task
Looking at the log files there can often help diagnose the problem.
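As a minimal sketch of inspecting those files programmatically (using a mock work directory, since real paths are run-specific, with invented file contents), note that they all start with a dot, so a bare `ls` will not show them:

```python
import pathlib
import tempfile

# Mock task work dir standing in for the path printed under "Work dir:";
# the file contents here are invented for illustration.
wd = pathlib.Path(tempfile.mkdtemp())
(wd / ".command.err").write_text("ERROR: process ran out of memory\n")
(wd / ".exitcode").write_text("137\n")

print(sorted(p.name for p in wd.glob(".*")))   # hidden files need an explicit glob
print((wd / ".exitcode").read_text().strip())  # 137 = 128 + SIGKILL, often an OOM kill
```

Exit codes above 128 mean the task was killed by a signal (code minus 128), which is often the fastest clue that a job ran out of memory or hit a scheduler limit.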
If you have a hard time interpreting the log output, pasting it into an LLM can be a good way to get help. Be careful not to submit sensitive information or violate your institutional policies when doing so.
Collect and send a log bundle
If you need help, you can share the log and report files with our support team for faster troubleshooting. The script below collects the most relevant pipeline logs into a single archive you can attach to your support request:

- The Nextflow log (`$NXF_LOG_FILE` if set, otherwise `.nextflow.log` in your launch directory)
- `pipeline_info/execution_trace_*.txt`
- `**/*.log`
- `**/*.report.json`
- `**/*.meta.json`

Note: the logs and QC report JSON files can contain sample names and other sample-level details. If you do not want to share that information, do not send the bundle.

Create a file called `collect_pixelator_logs.sh` with the contents below, then run it in the same directory where you ran `nextflow` and send the resulting `.zip` (or `.tar.gz`) to support.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Optional: pipeline output directory (containing pipeline_info/ and pixelator/)
# Leave empty to skip collecting output-dir artifacts.
OUTPUT_DIR="${OUTPUT_DIR:-}"

# Where to write the archive (defaults to current directory)
DEST_DIR="${DEST_DIR:-$PWD}"

ts="$(date +%Y%m%d_%H%M%S)"
bundle_dir="$(mktemp -d -t pixelator_logs_"$ts"_XXXXXX)"
stage_dir="$bundle_dir/log_bundle_$ts"
mkdir -p "$stage_dir"

# Nextflow log: use NXF_LOG_FILE if set, else .nextflow.log in current directory
nxf_log="${NXF_LOG_FILE:-.nextflow.log}"
if [[ -f "$nxf_log" ]]; then
  mkdir -p "$stage_dir/nextflow"
  cp -- "$nxf_log" "$stage_dir/nextflow/"
fi

# Output-dir artifacts (optional)
if [[ -n "${OUTPUT_DIR}" && -d "${OUTPUT_DIR}" ]]; then
  # Copy a curated subset from OUTPUT_DIR, preserving the folder structure.
  out_root="$stage_dir/output_dir"
  mkdir -p "$out_root"

  # Execution traces (kept under output_dir/pipeline_info/)
  if [[ -d "$OUTPUT_DIR/pipeline_info" ]]; then
    (
      cd "$OUTPUT_DIR"
      find ./pipeline_info -maxdepth 1 -type f -name 'execution_trace_*.txt' -exec sh -c '
        out="$1"; shift
        for f in "$@"; do
          mkdir -p -- "$out/$(dirname "$f")"
          cp -- "$f" "$out/$f"
        done
      ' sh "$out_root" {} +
    )
  fi

  # Logs + report/meta JSONs anywhere under OUTPUT_DIR (preserve relative paths)
  (
    cd "$OUTPUT_DIR"
    find . -type f \( -name '*.log' -o -name '*.report.json' -o -name '*.meta.json' \) -exec sh -c '
      out="$1"; shift
      for f in "$@"; do
        mkdir -p -- "$out/$(dirname "$f")"
        cp -- "$f" "$out/$f"
      done
    ' sh "$out_root" {} +
  )
fi

# Small manifest for context
cat > "$stage_dir/README.txt" <<EOF
Collected: $ts
OUTPUT_DIR: ${OUTPUT_DIR:-"(not set)"}
NXF_LOG_FILE: ${NXF_LOG_FILE:-"(not set)"}
Nextflow log used: $nxf_log
EOF

mkdir -p "$DEST_DIR"

# Create a .zip if possible, otherwise a .tar.gz
archive_base="$DEST_DIR/log_bundle_$ts"
if command -v zip >/dev/null 2>&1; then
  (cd "$bundle_dir" && zip -rq "${archive_base}.zip" "log_bundle_$ts")
  echo "Wrote ${archive_base}.zip"
else
  tar -C "$bundle_dir" -czf "${archive_base}.tar.gz" "log_bundle_$ts"
  echo "Wrote ${archive_base}.tar.gz (zip not found)"
fi

rm -rf "$bundle_dir"
```

Make the script executable:

```bash
chmod +x collect_pixelator_logs.sh
```

Example usage (recommended):

```bash
OUTPUT_DIR="/path/to/your/output_dir" ./collect_pixelator_logs.sh
```

If the output directory is not easily accessible, you can still run:

```bash
./collect_pixelator_logs.sh
```

To see exactly which files were included in the bundle (without extracting it), run:

```bash
unzip -l log_bundle_*.zip
```

or, if your bundle is a `.tar.gz`:

```bash
tar -tzf log_bundle_*.tar.gz
```
For a general Nextflow / nf-core debugging walkthrough, see Troubleshooting basics.
How is cell calling done?
In the nf-core/pixelator pipeline, cells are identified as highly connected subgraphs within the molecular interaction network. Most of these graphs are very small — representing spurious edges, partial subgraphs that failed to connect to a cell, or cell debris. These are easily distinguished from real cells by their size.
Cell calling separates "true" cells from background noise using the cell's number of protein molecules:
| Method | What it does |
|---|---|
| Molecule Rank Plot | Components are ranked by total antibody molecules (UMIs). A sharp "knee" or inflection point typically separates cells from debris or partial disconnected cell graphs. |
| Minimum Threshold | As a fallback, the pipeline requires >8,000 protein molecules per cell (default) to filter out the smallest graphs. |
See the full explanation in our algorithms documentation.
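The minimum-threshold fallback amounts to a simple size filter over the components. The component sizes below are invented to illustrate the idea:

```python
# Total protein molecules per component; values are illustrative. Real cells
# sit orders of magnitude above spurious edges and debris.
sizes = [120_000, 95_000, 60_000, 30_000, 12_000, 9_500, 400, 150, 80]

MIN_MOLECULES = 8_000  # the pipeline's default minimum per cell
cells = [s for s in sizes if s > MIN_MOLECULES]
print(len(cells))  # → 6
```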
Data Analysis
Common questions about working with your results.
How do I interpret the colocalization log ratio?
The colocalization log ratio compares the observed number of edges (spatial connections) joining a pair of proteins to the number expected by chance:
| Value | Interpretation |
|---|---|
| Positive | Colocalization or clustering — proteins are physically closer than random chance suggests |
| Zero | Proteins are randomly distributed relative to each other |
| Negative | Spatial separation — the two proteins are segregated from each other |
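As a sketch of the arithmetic (assuming a base-2 log; the edge counts are invented), a protein pair with four times more edges than expected by chance lands at +2:

```python
import math

observed_edges = 240   # edges joining the protein pair (illustrative)
expected_edges = 60    # edges expected under random placement (illustrative)

log_ratio = math.log2(observed_edges / expected_edges)
print(log_ratio)  # → 2.0, i.e. positive: the proteins colocalize
```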
What normalization method should I use?
We recommend the Centered Log Ratio (CLR) for protein abundance normalization.
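As a sketch of a common per-cell CLR formulation (the pseudocount of 1 is an assumption to handle zero counts; exact implementations vary), each count is log-transformed and centered on the log of the geometric mean:

```python
import math

counts = [120, 30, 0, 5]                  # raw antibody counts for one cell (illustrative)
logs = [math.log(c + 1) for c in counts]  # pseudocount of 1 avoids log(0)
mean_log = sum(logs) / len(logs)          # log of the geometric mean
clr = [v - mean_log for v in logs]        # center each log count on it

print([round(v, 3) for v in clr])         # CLR values sum to zero by construction
```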
How do I assess sample quality?
Review the following benchmarks in your QC report:
| Metric | Target | Why it matters |
|---|---|---|
| Valid Read Saturation | >20% | Ensures a sufficient number of valid reads were captured |
| Graph Edge Saturation | >40% | Indicates the library was sequenced deeply enough to capture network complexity |
| Graph Node Saturation | >60% | Indicates stable identification of sufficient protein molecules |
| Median avg. coreness | >1.5 | High coreness indicates structurally sound, intact cell components with high connectivity |
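To check a report against these targets programmatically, a dictionary comparison is enough. The metric key names below are assumptions for illustration — match them to the actual keys in your report JSON:

```python
import json

# Illustrative report snippet; real report.json files use their own key names.
report = json.loads('{"graph_edge_saturation": 0.55, "graph_node_saturation": 0.48}')

targets = {"graph_edge_saturation": 0.40, "graph_node_saturation": 0.60}
flags = [k for k, target in targets.items() if report.get(k, 0) < target]
print(flags)  # → ['graph_node_saturation']
```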
How do I filter out low-quality cells?
To ensure your analysis is based on high-quality individual cells rather than debris or artifacts, we recommend a data-driven approach. Rather than using "one-size-fits-all" numbers, visually inspect the distribution of metrics for each sample to determine specific thresholds.
Don't use fixed cutoffs — inspect your data and set thresholds per sample.
While we encourage evaluating all available cell QC metrics, you should at minimum filter based on:
| Metric | How to use it |
|---|---|
| Molecule count (`n_umi`) | Use a molecule rank plot to set a minimum threshold. Remove cell fragments below the "knee" in the curve. The pipeline default is >8,000. |
| Isotype fraction (`isotype_fraction`) | Determine a threshold to filter out cells with high levels of non-specific background binding. |
Cells that fail to meet these thresholds should be excluded during the Data Processing stage.
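A minimal sketch of such a filter (the isotype cutoff below is a placeholder, and the per-cell rows are invented — set both thresholds per sample after inspecting the distributions):

```python
# Per-cell QC metrics; the rows are invented for illustration.
qc = [
    {"n_umi": 25_000, "isotype_fraction": 0.010},
    {"n_umi": 12_000, "isotype_fraction": 0.200},  # high background binding
    {"n_umi": 900,    "isotype_fraction": 0.020},  # below the molecule-rank knee
    {"n_umi": 40_000, "isotype_fraction": 0.005},
]

MIN_UMI = 8_000       # pipeline default minimum
MAX_ISOTYPE = 0.10    # hypothetical per-sample cutoff

keep = [c for c in qc if c["n_umi"] > MIN_UMI and c["isotype_fraction"] < MAX_ISOTYPE]
print(len(keep))  # → 2
```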
How do I annotate my cells?
Cell annotation translates protein abundance levels into distinct biological identities. There are two main approaches:
| Approach | Best for | How it works |
|---|---|---|
| Automated annotation | Quick start, efficiency | Uses a pre-annotated reference dataset to map cell identities onto your PNA data. We provide a PBMC reference, or you can supply your own. |
| Marker-based annotation | Precision, gold standard | Manual examination of specific protein marker abundances to confirm cell types with high precision. |
For automated annotation, your reference dataset should closely match your sample's biological context — similar sample type and protein panel will give the most accurate results.
Learn more: