In this case, we have chosen to compare our results with Pace and Scholtz’s scale, but other scales are qualitatively very similar, with Ala, Glu, Met,
Leu, Phe, Lys and Gln generally acknowledged as being helix-forming residues. For instance, one secondary structure propensity scale that is commonly found in biochemistry textbooks lists Glu as the most favorable helix residue, which is more consistent with the composition of the glycine repeats in FliH. However, this same scale also lists Tyr as being somewhat unfavourable in helices, whereas in FliH Tyr is strongly favoured in position x1 of AxxxG and GxxxG motifs. This underscores the often-stated caveat that context is everything in protein structure. The presence of glycine in such helical segments reinforces this point, as glycine residues are not normally acknowledged as being helix formers except within certain local sequence contexts.

Looking beyond the PDB to find proteins with glycine repeats

No sequences in the PDB set that we downloaded contain helices with glycine repeats anywhere near the length of those found in some FliH proteins. As a relatively small fraction of all known protein sequences have had their structures solved, one would
have a better chance of finding long glycine repeats by searching a larger database of protein sequences (not structures), such as the Swiss-Prot database. Some preliminary analysis was performed as a starting point for addressing this problem. The entire Swiss-Prot database, which consisted of 261,515 sequences at the time that it was downloaded, was searched for FliH-like glycine repeat segments. Of course, since these sequences do not contain secondary
structure information, there was no way to limit the search to α-helices. Eighteen sequences were found that contained repeat segments of length 11 or longer; however, all of these segments consisted of low-complexity repeats (for instance, the protein with Swiss-Prot accession number P19260 contains the repeat GSAGGSAGGSAGGSAGGSAGGSAGGSAGGSAGGSAGGSAGGSAGGSAGG), and thus were in no way analogous to repeats in FliH. The longest glycine repeat segment that was not a low-complexity repeat was of length 10, which was found in a presumably uncharacterized protein from Rickettsia japonica simply called “17 kDa surface antigen” (Swiss-Prot accession number Q52764). Further analysis would have to be done with this Swiss-Prot-derived sequence information in order to identify repeat segments that are similar to those found in FliH.

Conclusion

While many different short protein sequence motifs have been characterized, the glycine repeats in FliH and YscL are an unusual example.