Mouse proteomes

These data were published in Krahmer et al. 2018.

Protein and Phosphopeptide Correlation Profiling of Mouse Liver in Steatosis

To map protein and phosphopeptide localization in the livers of mice fed a low-fat control diet (LFD) or a high-fat diet (HFD) for 3 or 12 weeks, organelles from total liver homogenates were separated by density centrifugation. All fractions of the gradients were analyzed by label-free proteomics and EasyPhos. To generate profiles that reflect subcellular distributions, the intensities for each identified protein or phosphopeptide were scaled from 0 to 1 and plotted over all fractions. Normalized intensities of average profiles from 3-4 biological replicates are shown for each condition. For organelle assignment, a set of organelle marker proteins was used for parameter optimization and training of the SVM-based supervised learning approach implemented in Perseus software (Sigma=0.2 and C=4). With SVM classification, the main subcellular localization was assigned to every identified protein and phosphopeptide, either for each condition separately or for all conditions combined. To assign a second organelle contribution, a separate algorithm determined the highest correlation between the protein or phosphopeptide profile measured in our PCP experiment and in silico generated combination profiles. The correlation value between the experimentally determined profile and the assigned in silico combination profile is given as a measure of the quality of the secondary organelle assignment. The alpha value (0-1) is a quantitative measure of the second organelle contribution. Proteins that significantly changed their subcellular localization between the LFD condition and the HFD conditions were identified by a correlation-based outlier test (FDR<0.2). The analysis was performed pairwise, comparing HFD3 or HFD12 with the LFD control.

Generation of Protein and Phosphopeptide Profiles

To normalize for differences in protein input between the organelle fractions, the LFQ intensities of phosphopeptides were divided by the total sum of intensities in each sample. For protein profiles, the LFQ values were used directly, as they are already normalized for protein input. To generate the protein and phosphopeptide profiles, the intensities for each identified protein or phosphopeptide were scaled from 0 to 1, so that each identified protein or phosphopeptide has a value between 0 and 1 in each of the organelle fractions. The fraction with the maximum intensity was assigned the value 1, whereas fractions in which the protein or phosphopeptide was not quantified were set to 0. Plotting these 0-1 scaled intensities over all fractions yields profiles that are independent of protein levels and represent only the distribution between organelle fractions. To generate median protein and phosphopeptide profiles of the biological replicates, the intensities of each fraction were summed over all biological replicates of one condition, and the summed intensities were then scaled from 0 to 1.
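The scaling steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used for the published analysis. The function names, the use of None for fractions without a quantified value, and the layout of the input (one list of fraction intensities per protein or replicate) are assumptions made for the example.

```python
def normalize_to_sample_sum(table):
    """Divide every intensity by the total intensity of its sample.

    `table` holds one row per phosphopeptide and one column per sample
    (hypothetical layout). Missing values (None) count as 0.
    """
    n_cols = len(table[0])
    totals = [sum(row[j] or 0.0 for row in table) for j in range(n_cols)]
    return [
        [(row[j] or 0.0) / totals[j] if totals[j] else 0.0 for j in range(n_cols)]
        for row in table
    ]


def scale_profile(intensities):
    """Scale one profile to 0-1 by dividing by its maximum intensity.

    Fractions without a quantified value (None) are set to 0.
    """
    filled = [0.0 if v is None else float(v) for v in intensities]
    peak = max(filled)
    if peak <= 0:
        return [0.0] * len(filled)
    return [v / peak for v in filled]


def combined_profile(replicates):
    """Sum replicate intensities per fraction, then scale the sums to 0-1."""
    summed = [
        sum(0.0 if v is None else v for v in fraction)
        for fraction in zip(*replicates)
    ]
    return scale_profile(summed)
```

For example, scale_profile([2.0, None, 8.0, 4.0]) gives [0.25, 0.0, 1.0, 0.5]: the peak fraction becomes 1 and the unquantified fraction becomes 0.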
 

Organelle Assignments

Identification of Separable Compartments and Organelle Markers

To identify the cellular compartments that can be clearly separated by our PCP approach, protein or phosphopeptide profiles (medians from biological replicates) of the proteins and phosphopeptides identified in all three conditions (LFD, HFD3, HFD12) were used for Euclidean hierarchical clustering with average linkage, as implemented in Perseus. This revealed clusters of proteins or phosphopeptides corresponding to distinct subcellular compartments. For these compartments, we then compiled a list of 2199 marker proteins or 4130 marker phosphopeptides, respectively. Markers were chosen based on their documented GO annotations and stable cluster assignment across all experimental conditions (selected marker proteins and phosphopeptides are indicated in Tables S3 and S4). Because annotations in the database overlap and are not validated, a marker selection based exclusively on GO annotations was not useful. Proteins that are subunits of major cytosolic protein complexes and proteins involved in RNA-binding translational complexes, whose position in the gradient overlaps with organelle clusters, are indicated in Table S3. For HFD12, the Golgi apparatus and LD compartments were combined into one category, as they were not separable under this condition.

SVM-Based Assignment of the Main Organelle

The defined marker set was used for parameter optimization and training of our SVM-based supervised learning approach implemented in Perseus software (Deeb et al., 2015). Parameters were set to Sigma=0.2 and C=4. With SVM classification, the main subcellular localization was assigned to every identified protein, either for each condition separately or for all conditions combined (indicated in Tables S3 and S4). For every protein, SVM classification was performed on all fractions of all biological replicates combined. The typical prediction accuracy was around 95% for marker proteins and 90% for marker phosphopeptides.

Assignment of a Secondary Organellar Localization by Correlation Analysis

As most proteins showed dual subcellular localizations, we implemented an algorithm for correlation analysis in Perseus software to estimate the contribution of a second subcellular compartment. This algorithm determines the highest correlation between the protein or phosphopeptide profile measured in our PCP experiment and in silico generated combination profiles (the main organelle profile determined in the preceding SVM analysis combined with every other possible median organelle marker profile). The correlation value between the experimentally determined profile and the assigned in silico combination profile is given in Tables S3 and S4 and is a measure of the quality of the secondary organelle assignment. A quality filter (correlation >0.4) was applied to discard unreliable assignments. The alpha value (0-1) is a quantitative measure of the second organelle contribution.
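The search over combination profiles can be illustrated with a small Python sketch. It is not the Perseus implementation. It assumes that a combination profile is a linear mixture, (1 - alpha) times the main organelle profile plus alpha times a candidate second organelle profile, and that alpha is scanned on a regular grid; the text does not specify either detail. The function names and the toy marker profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long profiles (0.0 if one is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def assign_second_organelle(profile, main, marker_profiles, steps=20, min_corr=0.4):
    """Return (organelle, alpha, correlation) for the best combination profile.

    Assumed mixing model: mix = (1 - alpha) * main + alpha * other.
    Returns None when the best correlation does not pass the quality filter.
    """
    main_profile = marker_profiles[main]
    best = None
    for name, other in marker_profiles.items():
        if name == main:
            continue
        for i in range(steps + 1):
            alpha = i / steps
            mix = [(1 - alpha) * m + alpha * o for m, o in zip(main_profile, other)]
            r = pearson(profile, mix)
            if best is None or r > best[2]:
                best = (name, alpha, r)
    if best is None or best[2] <= min_corr:
        return None
    return best


# Toy median marker profiles over four fractions (invented values).
markers = {
    "ER": [1.0, 0.5, 0.0, 0.0],
    "Mito": [0.0, 0.0, 0.5, 1.0],
    "Cytosol": [0.0, 1.0, 1.0, 0.0],
}
observed = [0.75, 0.375, 0.125, 0.25]  # 75% ER-like, 25% Mito-like
result = assign_second_organelle(observed, "ER", markers)
```

In this toy case the search recovers "Mito" as the second organelle with alpha = 0.25. Because Pearson correlation ignores overall scale, the observed profile does not need to be rescaled before the comparison.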

Correlation-Based Outlier Test

Proteins that significantly changed their subcellular localization between the LFD condition and the HFD conditions were identified by a correlation-based outlier test. The analysis was performed pairwise, comparing HFD3 or HFD12 with the LFD control. In a first step, proteins were quality filtered, retaining only proteins with at least two reproducible profiles among the three biological replicates in both compared conditions. For this, Pearson correlations between the profiles of all three biological replicates were calculated for each of the two compared conditions. Only proteins with a maximal Pearson correlation >0.5 between the top2 profiles in both compared conditions were kept for further analysis. In the next step, the two best-correlated profiles were selected for each protein in each condition. For those top2 profiles, the profile correlations were calculated within each condition and averaged (MeanCorr(within same conditions)). Then the average correlations of the biological replicates between the different conditions were calculated (MeanCorr(between conditions)). Proteins reproducibly changing between both conditions were determined based on the difference of both correlation values: dCorr = MeanCorr(between conditions) - MeanCorr(within same conditions). The calculations were performed for Pearson as well as Spearman correlations. Proteins were then sorted from highest dCorr (likely hit) to lowest dCorr (likely not changing). For each of the comparisons (early and late time points vs LFD), hits from Spearman and Pearson correlations were combined and a combined FDR was calculated. The threshold of the dCorr value was set to 0.28 for both comparisons, resulting in a combined FDR of 0.2.
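The replicate filter and the dCorr calculation for a single protein can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the original analysis code. It uses Pearson correlation only, and it assumes that MeanCorr(between conditions) is the average over the four cross-condition pairs of the top2 profiles, which the text does not spell out. The sign of dCorr follows the formula exactly as written above. All names and toy profiles are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long profiles (0.0 if one is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def best_pair(replicates):
    """Return (correlation, profile_a, profile_b) of the best-correlated pair."""
    return max(
        ((pearson(a, b), a, b) for a, b in combinations(replicates, 2)),
        key=lambda item: item[0],
    )


def dcorr(cond_a, cond_b, min_corr=0.5):
    """dCorr = MeanCorr(between conditions) - MeanCorr(within same conditions).

    Returns None if either condition lacks a replicate pair with
    correlation > min_corr (the quality filter).
    """
    r_a, a1, a2 = best_pair(cond_a)
    r_b, b1, b2 = best_pair(cond_b)
    if r_a <= min_corr or r_b <= min_corr:
        return None
    within = (r_a + r_b) / 2
    # Assumption: average over the four cross-condition pairs of top2 profiles.
    between = sum(pearson(x, y) for x in (a1, a2) for y in (b1, b2)) / 4
    return between - within


# Toy replicate profiles over four fractions (invented values).
lfd = [[1.0, 0.5, 0.1, 0.0], [0.9, 0.6, 0.1, 0.0], [1.0, 0.4, 0.2, 0.1]]
hfd = [[0.0, 0.1, 0.5, 1.0], [0.0, 0.1, 0.6, 0.9], [0.1, 0.2, 0.4, 1.0]]
shifted = dcorr(lfd, hfd)   # profile moves to other fractions
stable = dcorr(lfd, lfd)    # identical conditions, value close to 0
```

The same function could be run with a rank-based correlation in place of pearson to mirror the Spearman variant mentioned in the text.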

To increase the sensitivity for detecting proteins relocalizing to or from LDs (such changes affect only one fraction of the protein profile, so the relocalization has less impact on the overall profile correlation value), we separately identified significantly relocalized LD proteins as those with a significant change in protein abundance (0-1 scaled LFQ intensity) in the LD fraction (FR1) (Student's t-test, FDR 0.2), using the set of proteins with reproducible profiles for both HFD time points compared to LFD.