MDBiomarkers (v 1.01)

Tutorial

On this page

  • How is a biomarker defined?
  • App
    • Overview table
      • Searching for a single biomarker
      • Comparing multiple biomarkers in the Overview table
      • Association Summaries
      • Selecting a biomarker from the table
    • Biomarker-specific details
      • All identifications
      • Known involvement in disease
      • Top (positive) correlations (steroid-naive only) within network module panel
      • DMD vs Healthy Controls (serum; Protein target)
      • DMD vs Healthy Controls (tissue; mRNA target)
      • Treatment-responsive (serum; Protein target) panel
      • Association with age panel
      • Correlations: biomarker and clinical outcomes (steroid-naïve, 4–8 years)

How is a biomarker defined?

For simplicity, we consider a biomarker as a marker of biological activity associated with a UniProt ID and Entrez Gene ID. This means that fragments which originate from the same protein or isoforms which have the same IDs are grouped together in our compilation.

Where the quantification technology provided distinct target names for such fragments or isoforms, and that information was available to us (the Supplemental Material of published papers sometimes omits it), the target names are retained so that the different targets can be told apart. Consider Complement C3 as an example: it has many fragments, including C3a, C3b, C3d, iC3b, and C3adesarg. These all share the same UniProt ID and Entrez Gene ID and are therefore grouped together.

When multiple aptamers (e.g., SomaLogic) or probes (e.g., Affymetrix panels) shared the same target name, they were treated as measuring the same target, but every individual signal was retained and is provided rather than being aggregated. The rationale is that some aptamers or probes may be more or less effective depending on the properties of the serum sample and the biomarker target. What are the implications of this? The website reports the number of significant findings for various associations, such as DMD vs healthy controls. In the Overview table (Section 2.1), you will see the number of significant results as a fraction of all results grouped under a UniProt ID.
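The fraction described above can be illustrated with a minimal sketch. It is not the site's actual code: the record layout, the adjusted p-values, and the 0.05 threshold are all assumptions made for illustration, and the function name `significant_fraction` is hypothetical. P01024 is the UniProt accession for human Complement C3.

```python
from collections import defaultdict

# Hypothetical records: (UniProt ID, target name, adjusted p-value).
# The p-values are invented for illustration only.
results = [
    ("P01024", "C3a", 0.001),
    ("P01024", "C3b", 0.20),
    ("P01024", "C3d", 0.03),
    ("P01024", "iC3b", 0.60),
]

ALPHA = 0.05  # assumed significance threshold on the adjusted p-value


def significant_fraction(records, alpha=ALPHA):
    """Return {uniprot_id: (n_significant, n_total)} over all grouped signals."""
    counts = defaultdict(lambda: [0, 0])
    for uniprot_id, _target, adj_p in records:
        counts[uniprot_id][1] += 1          # every signal counts toward the total
        if adj_p < alpha:
            counts[uniprot_id][0] += 1      # only significant signals count here
    return {uid: tuple(pair) for uid, pair in counts.items()}


summary = significant_fraction(results)
```

With these made-up values, two of the four C3 signals are significant, so the fraction would read 2/4 even though the individual fragments behave differently.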

App

The app has two tabs, as shown in Figure 1: Overview table (Section 2.1) and Biomarker-specific details (Section 2.2).

Figure 1: Tabs in the app

Overview table

This tab is for searching for your biomarker of interest and getting a quick summary of the findings for that biomarker. Figure 2 shows the first five rows of the searchable table.

Figure 2: First five rows of the searchable table in the Overview table tab

There are multiple columns for identification including TargetFullName, Target, UniProt, EntrezGeneID, and EntrezGeneSymbol. Each of these columns is searchable using the white box below the column names.

Searching for a single biomarker

Suppose we want to search for MDC. If we enter “MDC” in the Target column, we get multiple results, as shown in Figure 3. Alternatively, we could enter the full name of the biomarker, “macrophage-derived chemokine”, in the TargetFullName search box.

Figure 3: Results from searching “MDC” in the Target column of the table in the Overview Table tab

A more flexible method is to use the overall search bar. If we enter “MDC” there, as shown in Figure 4, we get multiple results because the search also matches partial strings.

Figure 4: Results from searching “MDC” using the overall search bar of the table in the Overview Table tab

Multiple target names can occur because several fragments come from a single protein, or because the combined datasets use different naming conventions. Because the first entry contains mdc as one of its target names, it is the desired entry. We single-click on this entry to show the biomarker-specific details (Section 2.2).

Comparing multiple biomarkers in the Overview table

Multiple biomarkers can be compared in the Overview table by separating them with the “OR” symbol, “|”. For example, enter CKM|FCER2|FGA in the search box under the EntrezGeneSymbol column to get a comparative overview (see Figure 5).

Figure 5: Results from comparing CKM to FCER2 to FGA in the EntrezGeneSymbol column of the table in the Overview Table tab
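The behaviour of the “|” symbol can be sketched as regular-expression alternation. This is only an illustration: it assumes case-insensitive partial matching, the helper `filter_column` is hypothetical, and the list of gene symbols is made up. How the app implements its search internally is not stated here.

```python
import re

# Illustrative column values; not the contents of the real table.
gene_symbols = ["CKM", "FCER2", "FGA", "FGB", "CKMT2", "IGFBP3"]


def filter_column(values, query):
    """Keep values that match the query anywhere in the string (partial match)."""
    pattern = re.compile(query, re.IGNORECASE)
    return [value for value in values if pattern.search(value)]


matches = filter_column(gene_symbols, "CKM|FCER2|FGA")
```

Under these assumptions CKMT2 is also returned, because it contains “CKM”. This is the same kind of partial matching that produces extra rows when searching for “MDC”.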

Association Summaries

The last few columns, as shown in Figure 2, provide the number of significant findings for multiple associations:

  1. DMD specific (serum) - is this biomarker elevated or depressed in serum of patients with DMD compared to healthy controls?
  2. DMD specific (tissue) - is this biomarker elevated or depressed in tissue of patients with DMD compared to healthy controls?
  3. Treatment responsive - does this biomarker respond to treatment?
  4. Change with age - does this biomarker change with age?

In cases where there are no fragments or isoforms of the protein in the merged dataset (e.g., neuropilin-2), the fractions in the table reflect two things: how consistently a significant finding is obtained across different measurement methods (including different aptamers, probes, etc.), and how reproducibly a given measurement method yields the same significance across studies, based on the adjusted p-values those studies used. In cases where there are different targets/fragments under a single UniProt ID, some fragments may have strong associations and others may not, but these results will be combined into a single fraction (see Section 1). For a detailed picture, select the protein by single-clicking on its row and go to the Biomarker-specific details tab (Section 2.2). We do not recommend screening based on the Overview table alone.

Selecting a biomarker from the table

To get more information on a biomarker, single-click on that row in the table. This will bring you to the biomarker-specific details tab.

Biomarker-specific details

At the top of the page is a drop-down list of all biomarkers, which provides another way to search the database. By default, this search bar shows the first entry of the sorted dataset. Click the search bar, delete its contents with backspace, then start typing the name of your favourite biomarker to see the list of possible options. Click on the one you are interested in, and then click Search on the right-hand side. Note that the label used in this search bar combines the target full name(s), the target ID(s), the UniProt ID, and the Entrez Gene ID.

Direct links for UniProt, QuickGO, and Human Protein Atlas annotations for the selected biomarker are provided at the top of the page.

All identifications

All identifications, including UniProt, EntrezGeneID, EntrezGeneSymbol, TargetFullName, Target, SomaLogic/SOMAscan IDs, and Affymetrix probe IDs, are provided in this subpanel (see, e.g., Figure 6).

Figure 6: All available identifications for the biomarker

Known involvement in disease

Known involvement of the biomarker in disease generally (not necessarily muscle disease) is provided in this subpanel, e.g., Figure 7.

Figure 7: Known involvement in disease

The rest of this biomarker-specific view is a series of panels, discussed below.

Top (positive) correlations (steroid-naive only) within network module panel

The results in this panel use data subsets from a clinical trial in young (4 to <8 years) treatment-naive boys. Weighted correlation network analysis was applied to biomarker concentrations at baseline measured by SOMAscan to create hierarchical clusters (called network modules) of biomarkers with similar responses. For the selected biomarker (row), the top 10 highest correlations with other biomarkers (columns) within the network module are presented, as shown in Figure 8. Labels provide the UniProt ID and target name.

(a) MDC
(b) Creatine Kinase
Figure 8: Examples of the top (positive) correlations (steroid-naive only) within network module panel
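The selection of the top correlations within a module can be sketched as follows. This is a simplified illustration, not the weighted correlation network analysis itself: it assumes the module membership is already known, uses plain Pearson correlation as a stand-in, and works on invented concentrations. The names `pearson` and `top_positive_correlations` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def top_positive_correlations(selected, module, k=10):
    """Rank the other module members by positive correlation with `selected`.

    `module` maps a biomarker label to its baseline concentrations,
    one value per participant, in the same participant order.
    """
    target = module[selected]
    scores = [
        (name, pearson(target, values))
        for name, values in module.items()
        if name != selected
    ]
    positive = [item for item in scores if item[1] > 0]
    return sorted(positive, key=lambda item: item[1], reverse=True)[:k]


# Invented baseline concentrations for four biomarkers in one module.
module = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.5],
    "C": [4.0, 3.0, 2.0, 1.0],
    "D": [1.0, 3.0, 2.0, 4.0],
}
top = top_positive_correlations("A", module)
```

Here biomarker C is dropped because its correlation with A is negative, and the remaining members are listed from strongest to weakest positive correlation.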

DMD vs Healthy Controls (serum; Protein target)

The panel DMD vs Healthy Controls (serum; Protein target) compares proteins measured in the serum of patients with DMD with those of healthy controls. An overall summary at the top of the panel shows the fraction of statistically significant findings across the datasets compiled for this biomarker, the direction of change at baseline relative to controls in the majority of findings, and how the biomarker responds to treatment. This is the same information that the Overview table provides. For example, the summary for IGFBP3 shows that it is lower in serum in DMD than in healthy controls, based on 4 significant findings out of 6, and that it increases in serum on treatment with steroids, based on 3 significant findings.

There are two views for these panels: plot view (default) and table view.

The plot view shows a volcano plot in which the -log10 of the adjusted p-value is plotted against the log2 of the fold change, as shown in Figure 9. Statistically significant datapoints (above the dotted line) are coloured orange and non-significant datapoints are coloured blue. Different shapes are used for different technologies (circle for SOMAscan, triangle for TMT). Hovering over a datapoint brings up a tooltip that indicates the citation, the target ID, the age range of the cohort, the technology used to measure the biomarker, the fold change of that protein in DMD compared to healthy controls, and the adjusted p-value of the comparison. The age range can be filtered; in the plot below, the samples being summarized were restricted to an upper age bound of 12 years.
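As a minimal sketch of how a single finding maps onto the volcano plot, the coordinates can be computed as below. The 0.05 cut-off for the dotted line is an assumption, as is treating the fold change as a positive DMD-to-control ratio; the function name `volcano_point` is hypothetical.

```python
import math


def volcano_point(fold_change, adj_p, alpha=0.05):
    """Return (x, y, significant) for one finding.

    x is log2 of the fold change, y is -log10 of the adjusted p-value,
    and `alpha` is an assumed significance threshold.
    """
    x = math.log2(fold_change)
    y = -math.log10(adj_p)
    significant = adj_p < alpha  # points above the threshold line
    return x, y, significant


# Invented example: a four-fold increase with an adjusted p-value of 0.001.
x, y, significant = volcano_point(4.0, 0.001)
threshold_height = -math.log10(0.05)  # height of the assumed dotted line
```

A fold change below 1 gives a negative x value, so proteins that are lower in DMD fall on the left of the plot and those that are higher fall on the right.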

Clicking on Table shows the same information in tabular format, including p-values, fold changes, etc. This table can also be filtered using its search box. Both the figures and the tables allow the compiled findings from a subpanel to be exported. To export the figure, either take a screenshot or hover over the figure and click the camera button to save it as a PNG file. To export the table, click the Download button in the subpanel and choose either Excel or CSV format.

(a) Creatine Kinase plot
(b) Creatine Kinase table
Figure 9: Plot and table views of the DMD vs Healthy Controls (serum; Protein target) panel under the biomarker-specific details tab.

For biomarkers in which multiple fragments are grouped together (e.g., Complement C3), the tooltip will allow you to distinguish between different targets. The target can be filtered by removing or adding back in specific targets.

DMD vs Healthy Controls (tissue; mRNA target)

The panel DMD vs Healthy Controls (tissue; mRNA target) compares mRNA measured in the tissue of patients with DMD with that of healthy controls. The features described above also apply to this panel.

Treatment-responsive (serum; Protein target) panel

The panel Treatment-responsive (serum; Protein target) shows proteins measured in serum associated with treatment by glucocorticoids. The features described above also apply to the Treatment-responsive (serum; Protein target) panel.

Association with age panel

The association with age panel indicates whether the biomarker showed a statistically significant association with age, and in which direction. At the top of this panel, when available, a table provides information about previously published associations with age in unaffected children, i.e., those without DMD. Below this, information on DMD-specific associations is provided. There are two views available: a plot view (default) and a table view, the latter allowing the table (including the data on healthy unaffected controls) to be exported to an Excel or CSV file. Figure 10 shows the plot view of this panel for Creatine Kinase and Fibrinogen.

The colour of the data point indicates the significance of the result: orange indicates a significant adjusted p-value and blue a non-significant one. The sign of the age association is indicated by the side of the plot the data point is on (left being a negative fold change), as the x-axis label explains. Hovering over a data point brings up a tooltip that indicates the citation, the target ID, the age range of the cohort, the treatment status, the technology used to measure the biomarker, and the adjusted p-value. The table view shows the slope direction (note that modeling techniques varied and, therefore, slopes are not directly comparable across citations), the statistical significance of the finding, the citation from which the result was taken, the technology used to measure the biomarker, the age range of the cohort, the target, the adjusted p-value, and the trend annotation in healthy controls, where available.

In cases where multiple targets are grouped together, these targets are separated along the y-axis.

(a) Creatine Kinase
(b) Fibrinogen
Figure 10: Association with age panel under the biomarker-specific details tab.

Correlations: biomarker and clinical outcomes (steroid-naïve, 4–8 years)

A correlation matrix is provided. The correlations presented are simply the cross-sectional correlations between biomarker concentrations and a variety of clinical outcomes commonly used in young boys with DMD. No significance testing was done. A table view is also available.