SDRF-Proteomics

metaproteomics

v1.0.0 sample

Base SDRF template for metaproteomics experiments (microbial community proteomics). Extends base directly and defines MIxS-aligned sample metadata. When combined with ms-proteomics, sample-metadata columns (organism, disease, cell type) are excluded. Use a child template (human-gut, soil, water) for environment-specific fields.

Key Guidance

The metaproteomics template is aligned with the Genomic Standards Consortium (GSC) MIxS standard for environmental and host-associated sample metadata.

Template hierarchy:

metaproteomics (this template) - shared environmental context
  human-gut - human gut microbiome (MIxS human-gut extension)
  soil - soil metaproteomics (MIxS soil extension)
  water - aquatic metaproteomics (MIxS water extension)

Usage: combine a child template with ms-proteomics, e.g.:

human-gut + ms-proteomics

soil + ms-proteomics

Key columns:

  • characteristics[environmental sample type]: Required. ENVO/EFO term describing the environmental material (soil, seawater, gut microbiome, biofilm).
  • characteristics[geographic location]: Recommended. GAZ term or coordinates.
  • characteristics[collection date]: Optional. ISO 8601 date of sample collection.
  • characteristics[environmental medium]: Recommended. ENVO term for the environmental material.
  • comment[metagenome accession]: Optional. Link to matched metagenome dataset.
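
As a rough illustration of how the key columns above might appear in an SDRF file, the sketch below writes a one-sample, tab-separated fragment with Python's csv module. The cell values are copied from the example values given for each column on this page and do not describe a real dataset. The names `KEY_COLUMNS` and `to_sdrf_fragment` are hypothetical, and the inherited base columns and any child-template columns are omitted, so the output is not a complete, valid SDRF file.

```python
import csv
import io

# Key metaproteomics columns from the list above (inherited base columns omitted).
KEY_COLUMNS = [
    "characteristics[environmental sample type]",
    "characteristics[geographic location]",
    "characteristics[collection date]",
    "characteristics[environmental medium]",
    "comment[metagenome accession]",
]

# One illustrative sample; values are the page's own example values, not real data.
sample = {
    "characteristics[environmental sample type]": "seawater",
    "characteristics[geographic location]": "Pacific Ocean",
    "characteristics[collection date]": "2024-01-15",
    "characteristics[environmental medium]": "seawater",
    "comment[metagenome accession]": "MGYA00001234",
}


def to_sdrf_fragment(rows):
    """Serialize rows as tab-separated text with the key columns as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=KEY_COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


fragment = to_sdrf_fragment([sample])
print(fragment)
```

Keeping the column order in a single list makes it easy to append the columns of a child template (human-gut, soil, water) before writing the file.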

PRIDE ontology terms include MIxS cross-references in their definitions (e.g., MIXS:0001107). Column names use spaces (SDRF convention) rather than underscores (MIxS convention).
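
The MIxS cross-references quoted in this template's column descriptions can be collected into a small lookup table. The sketch below records only the MIxS names and identifiers that appear on this page. The dictionary name `MIXS_XREF` and the helper `mixs_style` are hypothetical; `mixs_style` merely illustrates the spaces-versus-underscores naming difference and does not produce official MIxS field names.

```python
import re

# SDRF column -> (MIxS field name, MIxS identifier), as quoted in the column descriptions.
MIXS_XREF = {
    "characteristics[environmental sample type]": ("env_medium", "MIXS:0000014"),
    "characteristics[geographic location]": ("geo_loc_name", "MIXS:0000010"),
    "characteristics[environmental medium]": ("env_medium", "MIXS:0000014"),
    "characteristics[depth]": ("depth", "MIXS:0000018"),
    "characteristics[altitude]": ("elevation", "MIXS:0000093"),
    "characteristics[temperature]": ("temperature", "MIXS:0000113"),
}


def mixs_style(sdrf_column):
    """Return the bracketed SDRF name with spaces swapped for underscores."""
    match = re.fullmatch(r"(?:characteristics|comment)\[(.+)\]", sdrf_column)
    if match is None:
        raise ValueError(f"not a characteristics/comment column: {sdrf_column!r}")
    return match.group(1).replace(" ", "_")


for column, (field, identifier) in MIXS_XREF.items():
    print(f"{column}\t{field}\t{identifier}")
```

An explicit table is used because the SDRF and MIxS names often differ by more than spacing, for example altitude versus elevation.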

Inheritance

base → metaproteomics

Combination Rules

Columns

18 own + 9 inherited = 27 total
Column Requirement Source Description
characteristics[environmental sample type] required metaproteomics Type of environmental sample analyzed (ENVO or EFO term). Corresponds to MIxS env_medium (MIXS:0000014).
ontology — ontologies: envo, efo
Should be a valid ENVO or EFO term
soil, seawater, gut microbiome, wastewater, biofilm
characteristics[geographic location] recommended metaproteomics Geographic location where sample was collected (GAZ term or coordinates). Corresponds to MIxS geo_loc_name (MIXS:0000010).
allows N/A allows not available
ontology — ontologies: gaz
Should be a valid GAZ term or coordinates
Pacific Ocean, Amazon rainforest, 47.6062 N, 122.3321 W
characteristics[environmental medium] recommended metaproteomics Environmental material from which the sample was obtained (ENVO term). Corresponds to MIxS env_medium (MIXS:0000014).
allows N/A allows not available
ontology — ontologies: envo
Should be a valid ENVO environmental material term
soil, seawater, freshwater, feces, biofilm
characteristics[collection date] optional metaproteomics Date when sample was collected (ISO 8601)
allows N/A allows not available
date
Sample collection date
2024, 2024-01, 2024-01-15
characteristics[sample collection method] optional metaproteomics Method used to collect the environmental sample
allows N/A allows not available
pattern — pattern: ^.+$
Collection method description
grab sample, core sample, swab, filtration
characteristics[depth] optional metaproteomics Depth at which sample was collected. Corresponds to MIxS depth (MIXS:0000018).
allows N/A allows not available
number_with_unit
Depth with unit
10 m, 50 cm, 100 m
characteristics[altitude] optional metaproteomics Altitude or elevation of sampling site. Corresponds to MIxS elevation (MIXS:0000093).
allows N/A allows not available
number_with_unit
Altitude with unit
500 m, 1200 m, 0 m
characteristics[temperature] optional metaproteomics Temperature at sampling location. Corresponds to MIxS temperature (MIXS:0000113).
allows N/A allows not available
number_with_unit
Temperature in Celsius
25 °C, 4 °C, -20 °C
characteristics[ph] optional metaproteomics pH at sampling location
allows N/A allows not available
pattern — pattern: ^\d+(\.\d+)?$
pH value
7.0, 5.5, 8.2
characteristics[sample storage] optional metaproteomics Storage conditions for the sample before analysis
allows N/A allows not available
pattern — pattern: ^.+$
Storage conditions
-80C, liquid nitrogen, 4C
comment[metagenome accession] optional metaproteomics Accession number for matched metagenome data
allows N/A allows not available
accession
Database accession number
MGYA00001234, SRP123456
characteristics[microbiome source] optional metaproteomics Source of the microbiome being studied (e.g., gut microbiome, rhizosphere microbiome)
allows N/A allows not available
pattern — pattern: ^.+$
Microbiome source description
gut microbiome, rhizosphere microbiome, oral microbiome, skin microbiome
characteristics[biomass estimation] optional metaproteomics Estimated microbial biomass in the sample
allows N/A allows not available
pattern — pattern: ^.+$
Biomass estimation
1e9 cells/g, high biomass, low biomass
characteristics[host contamination] optional metaproteomics Level of host protein contamination if known
allows N/A allows not available
pattern — pattern: ^.+$
Host contamination level
low (<5%), moderate (5-20%), high (>20%)
comment[contaminant database] optional metaproteomics Contaminant database(s) used in database search
allows N/A allows not available
pattern — pattern: ^.+$
Contaminant database name(s)
cRAP, MaxQuant contaminants, cRAP;MaxQuant contaminants
characteristics[mock community] optional metaproteomics Identifier or name of mock community standard used
allows N/A allows not available
pattern — pattern: ^.+$
Mock community identifier
ZymoBIOMICS Microbial Community Standard, ATCC MSA-1000
characteristics[mock community composition] optional metaproteomics Description of mock community composition (species and ratios)
allows N/A allows not available
pattern — pattern: ^.+$
Community composition description
8 bacteria + 2 yeasts at defined ratios, even mix of 10 species
comment[expected organism list] optional metaproteomics Semicolon-separated list of organisms expected in mock community
allows N/A allows not available
pattern — pattern: ^.+$
Semicolon-separated organism list
E. coli;B. subtilis;S. cerevisiae;L. fermentum, Bacillus subtilis;Staphylococcus aureus
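
Several of the value formats listed in the table can be checked with a few lines of Python. The following is a minimal sketch, not the validator used by any SDRF tool. The pH pattern is copied from the characteristics[ph] row, while the date and number-with-unit patterns and all function names are assumptions made for illustration.

```python
import re
from datetime import date

PH_PATTERN = re.compile(r"^\d+(\.\d+)?$")  # copied from the characteristics[ph] row
# Assumed patterns: the page names these formats but gives no regular expressions.
DATE_PATTERN = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")
NUMBER_WITH_UNIT = re.compile(r"(-?\d+(?:\.\d+)?)\s+(\S.*)")
MISSING = {"N/A", "not available"}  # accepted by every column shown as allowing them


def is_valid_collection_date(value):
    """Accept YYYY, YYYY-MM, or YYYY-MM-DD, plus the two missing-value markers."""
    if value in MISSING:
        return True
    match = DATE_PATTERN.fullmatch(value)
    if match is None:
        return False
    # Missing month or day defaults to 1 so the calendar check still runs.
    year, month, day = (int(part) if part else 1 for part in match.groups())
    try:
        date(year, month, day)  # rejects month 13, 30 February, and so on
    except ValueError:
        return False
    return True


def parse_number_with_unit(value):
    """Split '10 m' or '-20 °C' into (number, unit); raise ValueError otherwise."""
    match = NUMBER_WITH_UNIT.fullmatch(value)
    if match is None:
        raise ValueError(f"expected '<number> <unit>', got {value!r}")
    return float(match.group(1)), match.group(2)


def is_valid_ph(value):
    """Apply the table's pattern; note that it does not bound the value to 0-14."""
    return value in MISSING or PH_PATTERN.match(value) is not None


def split_organisms(value):
    """Split a semicolon-separated comment[expected organism list] cell."""
    return [name.strip() for name in value.split(";") if name.strip()]


print(is_valid_collection_date("2024-01"))
print(parse_number_with_unit("-20 °C"))
print(split_organisms("E. coli;B. subtilis;S. cerevisiae;L. fermentum"))
```

Because the pH pattern only checks the shape of the number, a value such as 15 would pass; a range check would have to be added separately if needed.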

Contributors