SDRF-Proteomics

1. Status of this Template

This document provides guidelines for annotating affinity-based proteomics experiments in SDRF-Proteomics format.

Type: Technology Template

Status: Released

Version: 1.1.0 - 2026-01

2. Abstract

Affinity proteomics uses affinity-based capture methods (antibodies, aptamers, or other reagents) to detect and quantify proteins. Unlike mass spectrometry-based proteomics, these methods typically produce a single data file per study containing measurements for all samples across all targeted proteins.

This template covers:

  • Olink - Proximity Extension Assay (PEA)

  • SomaScan - Aptamer-based protein profiling

3. Key Differences from MS-Based Proteomics

Understanding these differences is critical for correct SDRF annotation.

3.1. Data File Structure

MS-Based Proteomics                             | Affinity Proteomics
Each row = one sample-to-raw-file relationship  | Each row = one sample in a shared data file
Multiple raw files (one per sample or fraction) | One data file per study containing all samples
Each sample typically has its own data file     | Same data file appears in multiple rows

3.2. Technology Type

MS-Based                                 | Affinity-Based
proteomic profiling by mass spectrometry | protein expression profiling by antibody array or protein expression profiling by aptamer array

3.3. Assay Name Purpose

Important: In affinity proteomics, assay name serves a different purpose than in MS-based proteomics:

  • MS-based: Unique identifier for each MS run/data file

  • Affinity-based: Sample ID from the platform data file - this is the critical linking column that connects SDRF rows to samples in the shared data file

3.4. Columns NOT Applicable

The following MS-specific columns should NOT be included in affinity proteomics SDRF files:

  • comment[proteomics data acquisition method] - No MS acquisition

  • comment[label] - No isotopic labeling

  • comment[cleavage agent details] - No protein digestion

  • comment[fraction identifier] - Typically no fractionation

  • comment[modification parameters] - Not relevant

  • comment[precursor mass tolerance] / comment[fragment mass tolerance] - No MS tolerances

  • comment[dissociation method] / comment[collision energy] - No MS/MS
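
The list above can be turned into a quick header check. The following is a minimal sketch, not part of any official tooling: the function name `find_ms_only_columns` and the lowercase normalization of column names are assumptions made for illustration.

```python
# Columns this template lists as not applicable to affinity proteomics.
MS_ONLY_COLUMNS = {
    "comment[proteomics data acquisition method]",
    "comment[label]",
    "comment[cleavage agent details]",
    "comment[fraction identifier]",
    "comment[modification parameters]",
    "comment[precursor mass tolerance]",
    "comment[fragment mass tolerance]",
    "comment[dissociation method]",
    "comment[collision energy]",
}


def find_ms_only_columns(header):
    """Return the MS-specific columns found in an SDRF header, in header order.

    Hypothetical helper: names are compared after stripping whitespace and
    lowercasing, which is an assumption of this sketch.
    """
    normalized = [name.strip().lower() for name in header]
    return [name for name in normalized if name in MS_ONLY_COLUMNS]


header = ["source name", "assay name", "comment[label]", "comment[data file]"]
print(find_ms_only_columns(header))  # ['comment[label]']
```

An empty result means none of the listed MS-specific columns are present in the header.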

4. Template Hierarchy

base
  └── affinity-proteomics (THIS TEMPLATE)
        ├── olink (Olink-specific columns)
        └── somascan (SomaScan-specific columns)

5. Checklist

5.1. Required Columns

Column | Requirement | Description | Example Values
source name | REQUIRED | Unique identifier for the biological sample | patient_001_plasma, sample_A
characteristics[organism] | REQUIRED | Species (NCBI Taxonomy) | homo sapiens
characteristics[organism part] | REQUIRED | Anatomical part or sample type | blood plasma, blood serum, CSF
characteristics[disease] | RECOMMENDED | Disease state (use "normal" for healthy) | rheumatoid arthritis, normal
characteristics[biological replicate] | REQUIRED | Biological replicate number or "pooled" | 1, 2, pooled, not available
assay name | REQUIRED | Sample ID from platform data file - critical linking column | XB6, Sample_ID_1, L14_NIST
technology type | REQUIRED | Must be affinity-specific technology type | protein expression profiling by antibody array
comment[instrument] | REQUIRED | Affinity platform/instrument | Olink Explore HT, SomaScan Assay 7K
comment[data file] | REQUIRED | Single data file name (same for all rows) | olink_npx.csv, somascan_results.adat

Column | Requirement | Description | Example Values
comment[sample type] | RECOMMENDED | Sample classification from platform | sample, sample_control, negative_control, plate_control
comment[plate] | RECOMMENDED | Plate identifier for batch effect analysis | 1, 2, plate_A
comment[panel name] | RECOMMENDED | Commercial panel name | Olink Explore 3072, SomaScan 7K
comment[panel version] | OPTIONAL | Panel version | v4.1, 2023-01
comment[quantification unit] | RECOMMENDED | Platform-specific measurement unit | NPX, RFU, MFI, pg/mL
comment[technical replicate] | RECOMMENDED | Technical replicate number | 1
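
As a rough illustration of how the REQUIRED entries in the checklist could be checked before running the full validator, here is a minimal sketch. The helper name `missing_required_columns` is hypothetical, and the list covers only the columns marked REQUIRED in the table above.

```python
# Columns marked REQUIRED in the checklist above.
REQUIRED_COLUMNS = [
    "source name",
    "characteristics[organism]",
    "characteristics[organism part]",
    "characteristics[biological replicate]",
    "assay name",
    "technology type",
    "comment[instrument]",
    "comment[data file]",
]


def missing_required_columns(header):
    """Return the REQUIRED columns that are absent from an SDRF header.

    Hypothetical helper: comparison ignores case and surrounding whitespace.
    """
    present = {name.strip().lower() for name in header}
    return [col for col in REQUIRED_COLUMNS if col not in present]


header = [
    "source name",
    "characteristics[organism]",
    "characteristics[organism part]",
    "assay name",
    "technology type",
    "comment[instrument]",
    "comment[data file]",
]
print(missing_required_columns(header))
# ['characteristics[biological replicate]']
```

This only checks that the columns are present; it does not check their values, which is the job of the validator described in the Validation section.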

6. Technology Type Values

Use the appropriate technology type for your platform:

Platform    | Technology Type Value
Olink (PEA) | protein expression profiling by antibody array
SomaScan    | protein expression profiling by aptamer array
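
The mapping above is small enough to encode as a lookup. The sketch below assumes that the platform can be recognized from the first word of the comment[instrument] value (as in "Olink Explore HT" or "SomaScan Assay 7K"); that rule and the helper name `expected_technology_type` are illustrative assumptions, not part of the template.

```python
# Technology type values from the table above, keyed by lowercase platform name.
TECHNOLOGY_TYPES = {
    "olink": "protein expression profiling by antibody array",
    "somascan": "protein expression profiling by aptamer array",
}


def expected_technology_type(instrument):
    """Return the technology type for an instrument name, or None if unknown.

    Assumption: the first word of the instrument name identifies the platform.
    """
    parts = instrument.split()
    if not parts:
        return None
    return TECHNOLOGY_TYPES.get(parts[0].lower())


print(expected_technology_type("Olink Explore HT"))
# protein expression profiling by antibody array
print(expected_technology_type("SomaScan Assay 7K"))
# protein expression profiling by aptamer array
```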

7. Understanding the Data File Structure

7.1. One Data File, Multiple Rows

In affinity proteomics, one data file contains all samples. The SDRF reflects this:

  • Same comment[data file] value for all rows

  • assay name (Sample ID) distinguishes different samples

  • Multiple rows may exist for the same sample (one per protein measured, if tracking at that granularity)

Example:

source name  | assay name   | comment[data file]
sample_1     | Sample_ID_1  | olink_npx.csv
sample_2     | Sample_ID_2  | olink_npx.csv
sample_3     | Sample_ID_3  | olink_npx.csv

7.2. Assay Name = Sample ID

The assay name column contains the Sample ID from the platform data file:

  • This is the key that links SDRF rows to the data file

  • Each unique Sample ID in the data file = one unique assay name in SDRF

  • Must match exactly what appears in the platform output
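
The linking rules above can be sketched as a set comparison between the SDRF assay name column and the Sample IDs in the platform file. This is a minimal sketch under stated assumptions: the platform file is taken to be a long-format CSV with a sample identifier column called `SampleID` (adjust `sample_id_column` for other layouts), the `Assay` and `NPX` columns and their values are made-up illustration data, and `check_linking` is a hypothetical helper.

```python
import csv
import io


def check_linking(sdrf_text, data_text, sample_id_column="SampleID"):
    """Compare SDRF assay names with Sample IDs found in a platform data file.

    sdrf_text is tab-separated SDRF content; data_text is comma-separated
    platform output. Both layouts are assumptions of this sketch.
    """
    sdrf_rows = list(csv.DictReader(io.StringIO(sdrf_text), delimiter="\t"))
    data_rows = list(csv.DictReader(io.StringIO(data_text)))

    assay_names = {row["assay name"] for row in sdrf_rows}
    sample_ids = {row[sample_id_column] for row in data_rows}
    data_files = {row["comment[data file]"] for row in sdrf_rows}

    return {
        # Assay names with no matching Sample ID in the data file.
        "missing_from_data_file": sorted(assay_names - sample_ids),
        # Sample IDs in the data file that have no SDRF row.
        "missing_from_sdrf": sorted(sample_ids - assay_names),
        # More than one entry here means rows point at different data files.
        "data_files": sorted(data_files),
    }


sdrf_text = (
    "source name\tassay name\tcomment[data file]\n"
    "sample_1\tSample_ID_1\tolink_npx.csv\n"
    "sample_2\tSample_ID_2\tolink_npx.csv\n"
)
# Illustrative values only; not real measurements.
data_text = (
    "SampleID,Assay,NPX\n"
    "Sample_ID_1,IL6,3.2\n"
    "Sample_ID_2,IL6,4.1\n"
    "Sample_ID_3,IL6,2.7\n"
)
result = check_linking(sdrf_text, data_text)
print(result["missing_from_sdrf"])  # ['Sample_ID_3']
```

Because the comparison uses exact string equality, differences in case or whitespace between the SDRF and the platform output show up as mismatches, which is the intended behavior given the exact-match requirement.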

8. Handling Special Cases

8.1. Pooled Samples

For samples pooled from multiple individuals:

Column                                | Value
characteristics[individual]           | pooled
characteristics[age]                  | pooled (or age range if known, e.g., "40Y-50Y")
characteristics[sex]                  | pooled (if mixed) or specific value if all same sex
characteristics[biological replicate] | pooled

8.2. Control Samples

For negative controls, plate controls, and other QC samples:

Column                      | Value
characteristics[disease]    | not applicable
characteristics[age]        | not applicable
characteristics[sex]        | not applicable
characteristics[individual] | not applicable
comment[sample type]        | negative_control, plate_control, sample_control
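
The values in sections 8.1 and 8.2 could be applied programmatically when building SDRF rows. The following is a minimal sketch: the helper `apply_special_case` and its `pooled` flag are hypothetical, and for pooled samples it only fills age and sex when they are empty, since the template allows a known age range or a shared sex to be recorded instead.

```python
CONTROL_TYPES = {"negative_control", "plate_control", "sample_control"}

# Characteristics set to "not applicable" for control rows (section 8.2).
CONTROL_NA_COLUMNS = (
    "characteristics[disease]",
    "characteristics[age]",
    "characteristics[sex]",
    "characteristics[individual]",
)


def apply_special_case(row, pooled=False):
    """Return a copy of an SDRF row with pooled/control conventions applied."""
    row = dict(row)  # do not modify the caller's dictionary
    if row.get("comment[sample type]") in CONTROL_TYPES:
        for col in CONTROL_NA_COLUMNS:
            row[col] = "not applicable"
    elif pooled:
        row["characteristics[individual]"] = "pooled"
        row["characteristics[biological replicate]"] = "pooled"
        # Keep a known age range or shared sex; otherwise mark as pooled.
        for col in ("characteristics[age]", "characteristics[sex]"):
            if not row.get(col):
                row[col] = "pooled"
    return row


control = apply_special_case(
    {"source name": "neg_ctrl", "comment[sample type]": "negative_control"}
)
print(control["characteristics[disease]"])  # not applicable

pool = apply_special_case(
    {"source name": "pool_1", "characteristics[age]": "40Y-50Y"}, pooled=True
)
print(pool["characteristics[age]"], pool["characteristics[sex]"])  # 40Y-50Y pooled
```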

9. Example: Olink Proteomics

source name | characteristics[organism] | characteristics[organism part] | characteristics[disease] | characteristics[biological replicate] | assay name | technology type | comment[instrument] | comment[panel name] | comment[quantification unit] | comment[data file]
patient_001 | homo sapiens | blood plasma | rheumatoid arthritis | 1 | XB6 | protein expression profiling by antibody array | Olink Explore HT | Olink Explore 1536 | NPX | olink_npx.csv
patient_002 | homo sapiens | blood plasma | normal | 2 | XB65 | protein expression profiling by antibody array | Olink Explore HT | Olink Explore 1536 | NPX | olink_npx.csv

10. Example: SomaScan Proteomics

source name | characteristics[organism] | characteristics[organism part] | characteristics[disease] | assay name | technology type | comment[instrument] | comment[panel name] | comment[quantification unit] | comment[data file]
subject_A | homo sapiens | blood serum | type 2 diabetes mellitus | Sample_001 | protein expression profiling by aptamer array | SomaScan Assay 7K | SomaScan 7K v4.1 | RFU | somascan_results.adat

11. Example: With Controls

For experiments that include various control types:

source name  | characteristics[organism] | characteristics[disease] | assay name | comment[sample type] | comment[plate] | comment[data file]
patient_001  | homo sapiens              | cardiovascular disease   | L14_P001   | sample               | 2              | olink_reveal.parquet
pool_control | homo sapiens              | not applicable           | L14_NIST   | sample_control       | 2              | olink_reveal.parquet
neg_ctrl     | not applicable            | not applicable           | L14_NEG1   | negative_control     | 2              | olink_reveal.parquet

12. Platform-Specific Information

12.1. Quantification Units

Platform | Quantification Unit | Description
Olink    | NPX                 | Normalized Protein eXpression (log2 scale)
SomaScan | RFU                 | Relative Fluorescence Units
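
Because NPX is on a log2 scale, a difference between two NPX values for the same protein corresponds to a fold change of 2 raised to that difference. A minimal sketch of that arithmetic follows; the function name is hypothetical and the input values are illustrative, not real measurements.

```python
def npx_difference_to_fold_change(npx_a, npx_b):
    """Convert an NPX difference (log2 scale) into a linear fold change.

    Intended for comparing the same protein between two samples; NPX is a
    relative unit, so this is not an absolute concentration ratio.
    """
    return 2.0 ** (npx_a - npx_b)


# A difference of 1 NPX corresponds to a doubling on the linear scale.
print(npx_difference_to_fold_change(5.0, 4.0))  # 2.0
print(npx_difference_to_fold_change(4.0, 6.0))  # 0.25
```

RFU values are not log-transformed in the same way, which is one reason the quantification unit should always be recorded alongside the data file.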

12.2. Common Instruments

Platform | Instrument Examples
Olink    | Olink Explore HT, Olink Signature Q100, Olink Flex
SomaScan | SomaScan Assay 7K, SomaScan Assay 11K

12.3. Data File Formats

Platform | Typical Format                         | Extension
Olink    | Excel, CSV, or Parquet with NPX values | .xlsx, .csv, .parquet
SomaScan | ADAT format (proprietary) or CSV       | .adat, .csv

13. Best Practices

  1. Use platform Sample IDs as assay name: The assay name must match exactly what appears in the platform data file to enable proper linking.

  2. Specify technology type correctly: Always use affinity-specific technology types, never MS technology types for non-MS platforms.

  3. Document quantification units: Different platforms use different scales - always specify (NPX, RFU, MFI, pg/mL).

  4. Include plate information: For multi-plate experiments, comment[plate] is important for batch effect analysis.

  5. Classify samples properly: Use comment[sample type] to distinguish biological samples from controls.

  6. Handle pooled samples correctly: Use "pooled" for individual-specific fields when samples are pooled.

  7. Mark controls appropriately: Use "not applicable" for sample characteristics of controls where the value doesn’t apply.

14. Validation

pip install sdrf-pipelines
parse_sdrf validate-sdrf --sdrf_file your_file.sdrf.tsv --template affinity-proteomics

Note: MS-specific validations (label, cleavage agent, etc.) do not apply to affinity proteomics.

15. Template File

The affinity proteomics SDRF template file is available in this directory:

17. References

  • Assarsson E, et al. (2014) Homogenous 96-plex PEA immunoassay exhibiting high sensitivity, specificity, and excellent scalability. PLoS ONE.

  • Gold L, et al. (2010) Aptamer-based multiplexed proteomic technology for biomarker discovery. PLoS ONE.

  • Olink Proteomics: https://www.olink.com/

  • SomaLogic: https://somalogic.com/