In addition to the 207 sequences collected in Norway that were
included in this study, three additional isolates were sequenced and excluded because they coded for truncated proteins.

CagA EPIYA genotyping
To discriminate the East Asian from the European isolates, the CagA genotype was determined in the 20 Korean samples and 50 of the Norwegian ones. Amplification and sequencing of the 3’ region of the cagA gene were performed as described by Yamaoka et al.

Amplification of vacA
To confirm the African origin of one of the Norwegian samples, PCR amplification of the vacA signal sequence and mid-region was performed as described by Atherton et al.

Biogeographic analysis
Reference phylogenetic tree
A reference phylogenetic tree was constructed using concatenated housekeeping (HK) genes (atpA, efp,
ppa, yphC, ureI, trpC, and mutY) collected from the H. pylori Multi Locus Sequence Typing (MLST) database http://pubmlst.org/helicobacter/ as described by Falush et al. In addition, 19 of the 29 currently sequenced H. pylori genomes (see Appendix 1 for further annotation) collected from the National Center for Biotechnology Information (NCBI) database http://www.ncbi.nlm.nih.gov and four Norwegian isolates, sequenced according to the H. pylori MLST protocol, were used in the reference tree construction. In total, 393 sequences were aligned using ClustalW, and regions with gaps were removed using BioEdit. Model selection in MEGA5 was used to determine the best-fit model for maximum likelihood (ML) analysis. PhyML v3.0 was used to generate 1000 ML bootstrap trees using the generalized time-reversible (GTR) model in which both the discrete gamma distribution (+G) with five rate categories and invariable sites (+I) were set to 0.61, as this was the model with the lowest Bayesian Information
Criterion score. A consensus tree was constructed with Phylip’s Consense package and imported into FigTree v1.3.1 http://tree.bio.ed.ac.uk/software/figtree/ for further visualization. These resolved trees contain monophyletic groups that do not contradict more frequent groups, with a default threshold of 50% (majority rule). As a supplement, a strict analysis with a higher threshold was included, in which only groups occurring in more than 75% of the bootstrap trees are retained.

PldA phylogenetic tree
The phylogenetic tree for pldA gene sequences was constructed using the same method as described for the reference tree. The pldA sequences were obtained through a BLAST search of jhp_0451, limiting the search to H. pylori genome sequences. Only pldAON sequences coding for the entire OMPLA protein were included in this study. In addition, 19 of the 29 currently sequenced H. pylori genomes collected from the NCBI database were aligned with the pldA gene sequences from the 227 isolates described in the current study. Genomes containing pldA genes that coded for truncated proteins were excluded from analyses.
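
The two consensus thresholds applied to the bootstrap trees (50% majority rule and the stricter 75% cut-off) can be illustrated with a minimal Python sketch. This is not the Consense implementation: each bootstrap tree is represented here simply as a set of clades (each clade a frozenset of taxon labels), the function names `clade_support` and `consensus_clades` and the toy taxa `s1` to `s4` are hypothetical, and the sketch covers only the frequency threshold, not the additional step of adding less frequent groups that do not contradict more frequent ones.

```python
from collections import Counter


def clade_support(bootstrap_trees):
    """Return the fraction of bootstrap trees in which each clade occurs.

    Each tree is given as a set of clades; each clade is a frozenset of
    taxon labels (a simplification assumed for this illustration).
    """
    counts = Counter()
    for clades in bootstrap_trees:
        counts.update(set(clades))  # count each clade at most once per tree
    n_trees = len(bootstrap_trees)
    return {clade: count / n_trees for clade, count in counts.items()}


def consensus_clades(bootstrap_trees, threshold=0.5):
    """Keep the clades found in strictly more than `threshold` of the trees."""
    support = clade_support(bootstrap_trees)
    return {clade for clade, frac in support.items() if frac > threshold}


# Toy data: four hypothetical taxa and ten bootstrap replicates.
A = frozenset({"s1", "s2"})
B = frozenset({"s3", "s4"})
C = frozenset({"s1", "s3"})
D = frozenset({"s2", "s4"})
E = frozenset({"s1", "s2", "s3"})

trees = (
    [{A, B}] * 6    # ((s1,s2),(s3,s4))
    + [{A, E}] * 2  # (((s1,s2),s3),s4)
    + [{C, D}] * 2  # ((s1,s3),(s2,s4))
)

majority = consensus_clades(trees, threshold=0.5)   # 50% majority rule
strict = consensus_clades(trees, threshold=0.75)    # stricter 75% cut-off
print(sorted(sorted(c) for c in majority))
print(sorted(sorted(c) for c in strict))
```

In this toy example the clade {s1, s2} occurs in 8 of 10 replicates and {s3, s4} in 6 of 10, so both survive the 50% threshold but only {s1, s2} survives the 75% threshold. Raising the threshold in this way trades resolution for confidence in the groups that remain.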