Genetic studies of paired metabolomes reveal enzymatic and transport processes at the interface of plasma and urine

Nature Genetics (2023)

The kidneys operate at the interface of plasma and urine, clearing molecular waste products while retaining valuable solutes. Genetic studies of paired plasma and urine metabolomes may identify the underlying processes. We conducted genome-wide studies of 1,916 plasma and urine metabolites and detected 1,299 significant associations. Associations with 40% of the implicated metabolites would have been missed by studying plasma alone. We detected urine-specific findings that provide information about metabolite reabsorption in the kidney, such as aquaporin (AQP)-7-mediated glycerol transport, and different metabolomic footprints of kidney-expressed proteins in plasma and urine that are consistent with their localization and function, including the transporters NaDC3 (SLC13A3) and ASBT (SLC10A2). Shared genetic determinants of 7,073 metabolite-disease combinations represent a resource to better understand metabolic diseases and revealed connections of dipeptidase 1 with circulating digestive enzymes and with hypertension. Extending genetic studies of the metabolome beyond plasma yields unique insights into processes at the interface of body compartments.

The human kidney clears small molecular waste products from plasma while retaining valuable solutes, such as amino acids, to maintain metabolic homeostasis. After glomerular filtration of plasma into the primary urinary ultrafiltrate, its composition is modified in a highly coordinated process along the nephron. Hundreds of highly specialized transport proteins move solutes across the membranes of the cells lining the nephron to reabsorb important molecules while actively excreting toxic or unnecessary ones1. Many of these transport proteins, as well as the enzymes responsible for generating or breaking down the transported metabolites, have been identified through the study of human monogenic diseases. They represent attractive drug targets not only for kidney diseases but also for metabolic diseases, as exemplified by inhibitors of the transporters SGLT2 and URAT1 (refs. 2,3). However, many transporters and enzymes, as well as their in vivo substrates and products, remain to be characterized. We hypothesized that linking information from human genetic studies to plasma and urine metabolomes would provide new insights into the roles of these proteins in health and disease.

Genetic effects on metabolite levels in urine may reflect systemic processes, such as genotype-dependent intestinal absorption of metabolites or hepatic transformation reactions, which are detected in urine because the respective metabolites are filtered from plasma. They may also reflect kidney-specific processes, for example, the active production, reuptake or secretion of small molecules by the cells lining the nephron. Studies with paired measurements of plasma and urine metabolites have the potential to distinguish between these processes.

Here, we study differences and similarities in genetic influences on metabolomes derived from two 'matrices', plasma and urine, to test the hypothesis that both provide complementary information. Through systematic integration of genome-wide genetic information with paired plasma and urine metabolite measurements from 5,023 participants of the German Chronic Kidney Disease (GCKD) study, we highlight underlying systemic and kidney-specific processes. We detected 1,299 genome-wide significant associations and show that studying plasma alone would have missed associations with almost 40% of the metabolites. We highlight examples of urine-specific associations, of footprints that kidney-expressed transporters leave in the plasma and urine metabolomes, and of previously undescribed systemic roles of a kidney-enriched enzyme. This study generates a rich resource for future experimental validation of as yet uncharacterized enzymatic and transport processes that may represent a molecular link between genetic variants and human traits and diseases.

[…] 0.8). In summary, discovery GWAS of the plasma and urine metabolomes identified a wealth of significantly associated loci, the basis for subsequent characterizations.

[…] > 0.8), gray labels indicate genetic regions identified in both plasma and urine without intermatrix colocalization, and red or blue labels indicate genetic regions exclusively identified in plasma or urine, respectively. The number of plasma and urine mQTLs annotated to a gene is given in parentheses (plasma, urine). The pie chart reflects the proportions of the 282 unique genes that were annotated as enzymes and transporters. Official gene symbols for PYCRL and ERO1L are PYCR3 and ERO1A, respectively.

[…] >5 colocalizing regions are color coded and labeled. For the three other groups, all genes assigned to >50 colocalizing metabolite regions are color coded and labeled.

[…] > 0.8; Methods) involving 1,162 mQTLs. Colocalizing associations were divided into four groups (Supplementary Table 10): those where the same genetic signal affected different metabolites in the same matrix ((1) ‘intraplasma’, n = 3,189; (2) ‘intraurine’, n = 3,155), the same metabolite in both plasma and urine ((3) ‘intermatrix, same metabolite’, n = 204) and different metabolites in plasma and urine ((4) ‘intermatrix, different metabolite’, n = 4,048).

[…] >50% of the 3,155 intraurine colocalizations (Fig. 3c). This is consistent with FADS1 encoding a central enzyme in polyunsaturated fatty acid metabolism17 and the predominance of these lipid metabolites in plasma and with NAT8 encoding an N-acetyltransferase highly expressed in the kidney that generates water-soluble molecules for excretion18 and the abundance of N-acetylated metabolites in urine. Similarly, the organic anion transporter encoded by SLCO1B1 and the solute transporters encoded by the SLC17A family show high and specific expression in liver and kidney, respectively, where they transport dozens of physiological and pharmacological substrates19,20.
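
The four colocalization groups named in this section (intraplasma, intraurine, intermatrix with the same metabolite, intermatrix with different metabolites) follow from two attributes of each colocalizing pair: the matrix and the metabolite. The following Python sketch shows one way such a grouping could be assigned. It is an illustration only, not the authors' pipeline; the function name, the tuple layout and the metabolite labels are hypothetical.

```python
from collections import Counter

VALID_MATRICES = {"plasma", "urine"}


def colocalization_group(trait_a, trait_b):
    """Assign a colocalizing pair of traits to one of the four groups.

    Each trait is a (matrix, metabolite) tuple; this layout is an
    assumption made for the sketch, not a format used in the study.
    """
    matrix_a, metabolite_a = trait_a
    matrix_b, metabolite_b = trait_b
    if matrix_a not in VALID_MATRICES or matrix_b not in VALID_MATRICES:
        raise ValueError("matrix must be 'plasma' or 'urine'")
    if matrix_a == matrix_b:
        # Same matrix: the signal is shared by metabolites within plasma or urine.
        return "intraplasma" if matrix_a == "plasma" else "intraurine"
    if metabolite_a == metabolite_b:
        return "intermatrix, same metabolite"
    return "intermatrix, different metabolite"


# Hypothetical pairs, used only to show how group counts could be tallied.
example_pairs = [
    (("plasma", "metabolite_1"), ("plasma", "metabolite_2")),
    (("urine", "metabolite_1"), ("urine", "metabolite_3")),
    (("plasma", "metabolite_1"), ("urine", "metabolite_1")),
    (("plasma", "metabolite_2"), ("urine", "metabolite_3")),
]
group_counts = Counter(colocalization_group(a, b) for a, b in example_pairs)
```

Deriving the group from the pair itself, rather than storing it as a separate label, keeps the tallies per group reproducible from the list of colocalizing pairs.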

[…] 0.8) with rs601338, at which the minor A allele encodes the stop-gain variant p.Trp154Ter (NP_000502.4) that was associated with higher levels of only these two urine metabolites. The encoded fucosyltransferase 2 is a ubiquitously expressed enzyme that mediates the inclusion of fucose into glycans on a variety of glycolipids and glycoproteins. Individuals homozygous for p.Trp154Ter have lower risk of several infectious diseases during childhood25,26, a selective advantage. Indeed, we detected positive selection at this and other loci, including positive controls such as the LCT locus (Methods and Supplementary Table 21). The extended homozygosity of the haplotype carrying the minor, derived allele at the galactosylglycerol mQTL further supported positive selection (Fig. 5b).

[…] >64-fold higher urine but not plasma glycerol levels (Fig. 5g), thereby confirming a single case report through evidence from population studies.

[…] > 0.8), with color coding representing the phenotype category. Effect directions are indicated by the line type (solid, positive association; dashed, inverse association). CNS, central nervous system; NOS, not otherwise specified.

[…] >50% of the observed metabolite variance. Although this translates into much smaller effects on complex diseases such as hypertension, arthropathies or gallstone disease, colocalization can nominate shared pathophysiological mechanisms and inform about potential therapeutic targets, repurposing opportunities and potential side effects of approved drugs. Our study includes numerous such examples, supported by the recent launch of new drugs such as evinacumab, a monoclonal antibody targeting angiopoietin-like 3 (ANGPTL3) to treat dyslipidemia, or the SLC10A2 inhibitor odevixibat to treat cholestasis. Even if a target implicated by metabolites in our study is not desirable or amenable for therapeutic modulation, disease-associated metabolites may represent valuable intermediate biomarkers for risk prediction or response to treatment.

[…] >60 ml min⁻¹ per 1.73 m² with UACR > 300 mg per g (or urinary protein/creatinine ratio > 500 mg per g) were included53. This study used biomaterials collected at the baseline visit, shipped frozen to a central biobank and stored at −80 °C54. A more detailed description of the study design, standard operating procedures and the recruited study population has been published53,55. The GCKD study was registered in the national registry for clinical studies (DRKS 00003971) and approved by local ethics committees of the participating institutions (universities or medical faculties of Aachen, Berlin, Erlangen, Freiburg, Hannover, Heidelberg, Jena, München and Würzburg)53. All participants provided written informed consent. For this project, metabolites were quantified from stored EDTA plasma and spot urine. Information on genome-wide genotypes, covariates and metabolites was available for 4,960 (plasma) and 4,912 (urine) persons.

[…] 4,500 purified standards) was used for metabolite identification. Known metabolites reported in this study conformed to confidence level 1 (the highest confidence level of identification) of the Metabolomics Standards Initiative58,59, unless otherwise denoted with an asterisk. Additional mass spectral entries have been created for compounds of unknown structural identity (unnamed biochemicals; >2,750 in the Metabolon library), which have been identified by virtue of their recurrent nature (both chromatographic and mass spectral). Peaks were quantified using the area under the curve and normalized to correct for variation resulting from instrument interday tuning differences by the median value for each run day. Likewise, metabolites in the ARIC replication sample were also quantified with the Metabolon HD4 platform.

[…] >50% missing data. A total of 130 plasma and 131 urine metabolites were removed, as fewer than 300 genotyped samples were available.

[…] >5% of samples outlying >5 s.d.). Three plasma samples and one urine sample represented an outlier >5 s.d. along one of the first 15 principal components based on metabolites with complete information. The final dataset consisted of 1,296 plasma and 1,401 urine log2-transformed traits for subsequent GWAS. Supplementary Table 2 provides detailed annotation of the metabolites, including heritability estimates for metabolites with at least one genetic association. Over the course of this project, two pairs of formerly distinct urine metabolites were merged because each pair represented the same molecule: X-12739 and X-24527 to the glutamine conjugate of C6H10O2 (1)* and X-23667 and X-24759 to (2-butoxyethoxy)acetic acid.

[…] > 0.8) within a window of ±500 kb around the index SNP based on genetic data from the 1000 Genomes Project phase 3 version 5 of European ancestry using https://snipa.helmholtz-muenchen.de/snipa/?task=proxy_search. For each study, the best available proxy SNP in terms of maximal LD and minimal distance was selected. Summary statistics were downloaded from https://metabolomics.helmholtz-muenchen.de/gwas/index.php?task=download (Shin et al.6), http://www.hli-opendata.com/Metabolome (Long et al.7, only summary statistics with P value < 10⁻⁵), https://omicscience.org/apps/crossplatform/ (Lotta et al.8), https://pheweb.org/metsim-metab/ (Yin et al.10), https://omicscience.org/apps/mgwas/mgwas.table.php (Surendran et al.11) and http://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/; accession numbers for European GWAS are GCST90199621–GCST90201020 (Chen et al.12). Hysi et al.9 shared their summary statistics upon request.

[…] > 0.6 using GCTA-GRM71. GCTA-GREML72 was then used to estimate the proportion of variation in log2-transformed and, in the case of urine, pq-normalized metabolite levels that can be explained by the SNPs for all metabolites that gave rise to an mQTL.

[…] > 0.8). For each mQTL, the GCTA-COJO Slct algorithm version 1.91.6 (ref. 73) was used to identify independent genome-wide significant SNPs (Pconditional < 3.9 × 10⁻¹¹), using a collinearity cutoff of 0.1. For mQTL with multiple independent SNPs, approximate conditional analyses were carried out conditioning on the other independent SNPs in the region using the GCTA-COJO Cond algorithm to estimate conditional effect sizes. Statistical fine mapping was performed for all independent SNPs per mQTL. In loci with a single independent SNP, approximate Bayes factors (ABFs) were calculated from the original GWAS effect estimates using Wakefield's formula74 with a standard deviation prior of 1.33. For mQTL with multiple independent SNPs, ABFs were derived from the conditional effect estimates. The SNP's ABF was used to calculate the posterior probability for the variant driving the association signal (PPA, ‘causal variant’). Credible sets were calculated by summing the PPA across PPA-ranked variants until the cumulative PPA was >99%. The log2-transformed credible set sizes were regressed on the MAFs of independent index SNPs.
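
The fine-mapping steps described above (an approximate Bayes factor per SNP, a posterior probability of association, and a 99% credible set) can be illustrated with a short Python sketch. It uses the commonly cited form of Wakefield's approximation, written in favour of association, and assumes a single causal variant per signal with equal prior probability for every SNP. These choices, the function names and the example effect sizes are assumptions made for illustration and are not taken from the authors' code. Only the prior standard deviation of 1.33 and the >99% cutoff come from the text.

```python
import math


def log_abf(beta, se, prior_sd=1.33):
    """Log approximate Bayes factor in favour of association.

    Wakefield-style approximation: V is the sampling variance of the
    effect estimate and W the prior variance of the true effect.
    """
    v = se ** 2
    w = prior_sd ** 2
    z = beta / se
    shrinkage = w / (v + w)
    # log ABF = 0.5 * log(V / (V + W)) + 0.5 * z^2 * W / (V + W)
    return 0.5 * math.log(1.0 - shrinkage) + 0.5 * z * z * shrinkage


def posterior_probabilities(log_abfs):
    """Convert log ABFs to PPAs, assuming equal priors and one causal SNP."""
    largest = max(log_abfs)
    # Subtract the maximum before exponentiating to avoid overflow.
    weights = [math.exp(value - largest) for value in log_abfs]
    total = sum(weights)
    return [weight / total for weight in weights]


def credible_set(snp_ids, ppas, coverage=0.99):
    """Add PPA-ranked SNPs until the cumulative PPA exceeds the coverage."""
    ranked = sorted(zip(snp_ids, ppas), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for snp_id, ppa in ranked:
        selected.append(snp_id)
        cumulative += ppa
        if cumulative > coverage:
            break
    return selected


# Hypothetical effect estimates (beta, standard error) for three SNPs.
example_ids = ["snp_a", "snp_b", "snp_c"]
example_effects = [(0.50, 0.05), (0.10, 0.05), (0.05, 0.05)]
example_ppas = posterior_probabilities([log_abf(b, s) for b, s in example_effects])
example_set = credible_set(example_ids, example_ppas)
```

For signals with several independent SNPs, the same functions would be applied to the conditional effect estimates rather than the original GWAS estimates, as the text describes.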

We also performed colocalization analyses of mQTLs with disease outcomes and biomarker measurements in the UK Biobank, with two representative kidney function traits and with trans pQTLs using the precomputed pQTL data from Sun et al.79 to gain insights into clinical consequences and potential molecular mediators of mQTLs. Association summary statistics between SNPs and 30 biomarkers from the UK Biobank baseline examination, including the liver function markers AST, ALT, GGT, bilirubin and albumin, were computed using BOLT-LMM80 (application no. 20272) in the same subset of European-ancestry participants as previous studies81. Precomputed GWAS summary statistics of diseases as ascertained in the UK Biobank and analyzed using phecodes were obtained from https://www.leelabsg.org/resources (1,403 binary traits) and from https://yanglab.westlake.edu.cn/data/ukb_fastgwa/imp_binary/ (2,325 of 2,989 binary traits82; traits containing job-coding terms were excluded from the analysis). There were 816 phecodes analyzed in both, but only unique phecodes were counted for positive colocalizations. We used GWAS summary statistics of creatinine-based and cystatin C-based eGFR (eGFRcrea and eGFRcys) from Stanzick et al.83, who meta-analyzed kidney function GWAS from the CKDGen Consortium and the UK Biobank. The GWAS summaries were downloaded from the CKDGen data website at https://ckdgen.imbi.uni-freiburg.de. Colocalization testing between mQTL and trans pQTL was performed within a window of ±500 kb around the mQTL's index SNP when at least one trans pQTL association with P < 0.05 ÷ 409 ÷ 3,000 for plasma and P < 0.05 ÷ 410 ÷ 3,000 for urine was present within a window of ±100 kb around the index SNP. Similarly, colocalization analysis between mQTL and biomarkers, diseases and kidney function traits was performed within ±500 kb of the index SNP when there were one or more associated variants with MAF > 0.01 and P < 0.05 ÷ 409 or P < 0.05 ÷ 410, respectively, within ±100 kb of the index SNP, using the method outlined above.

[…] > 0.01).
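
The screening thresholds quoted above are plain divisions of 0.05, and the screen itself asks whether any sufficiently associated variant lies within ±100 kb of the index SNP. A minimal Python sketch of that arithmetic and screen follows. The divisors are copied from the text, but this excerpt does not restate what each divisor counts. The dictionary-based variant records, the function names and the example positions are hypothetical.

```python
def bonferroni_threshold(alpha, *divisors):
    """Divide alpha successively by each divisor."""
    threshold = alpha
    for divisor in divisors:
        threshold /= divisor
    return threshold


# Thresholds for the trans pQTL screen (0.05 / 409 / 3,000 and 0.05 / 410 / 3,000).
PQTL_THRESHOLDS = {
    "plasma": bonferroni_threshold(0.05, 409, 3000),
    "urine": bonferroni_threshold(0.05, 410, 3000),
}

# Thresholds for biomarkers, diseases and kidney function traits.
TRAIT_THRESHOLDS = {
    "plasma": bonferroni_threshold(0.05, 409),
    "urine": bonferroni_threshold(0.05, 410),
}


def should_test_colocalization(index_pos, variants, p_threshold,
                               screen_window=100_000, min_maf=0.0):
    """Return True if any variant near the index SNP passes the screen.

    Each variant is a dict with 'pos', 'maf' and 'p' keys; this record
    layout is an assumption made for the sketch.
    """
    return any(
        abs(variant["pos"] - index_pos) <= screen_window
        and variant["maf"] > min_maf
        and variant["p"] < p_threshold
        for variant in variants
    )


# Hypothetical variants around an index SNP at position 1,000,000.
example_variants = [
    {"pos": 1_050_000, "maf": 0.20, "p": 1e-6},
    {"pos": 1_400_000, "maf": 0.30, "p": 1e-12},
]
trait_screen = should_test_colocalization(
    1_000_000, example_variants, TRAIT_THRESHOLDS["plasma"], min_maf=0.01
)
pqtl_screen = should_test_colocalization(
    1_000_000, example_variants, PQTL_THRESHOLDS["plasma"]
)
```

In this sketch the screen only decides whether a region is tested; the colocalization analysis itself would then use the wider ±500 kb window mentioned in the text.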

[…] 1.2 million individuals. Nat. Commun. 12, 4350 (2021).

[…] >25% are labeled with the associated metabolite and most likely gene; error bars represent h² variance) can contain interesting biological information: for example, three metabolites with larger estimated heritabilities in urine than in plasma are N-acetylated amino acids, all of which have an mQTL at NAT8. NAT8 is highly and selectively expressed in the kidney, where the encoded enzyme N-acetylates molecules to make them water soluble for subsequent excretion.

[…] >3% are shown). The two inner bands represent the effect size of the mQTL in plasma (framed in red) and urine (framed in blue). Shades of orange indicate positive effect sizes, shades of aquamarine negative ones. The two outer bands represent the variance in metabolite levels in plasma and urine explained by the index SNP of the corresponding mQTL, where a darker shade of green corresponds to a greater explained variance.