Carbohydrates, commonly known as sugars, are complex biological molecules linked to many fundamental cellular processes in living organisms, so accurate scientific information about them is important. New research by scientists at the University of York Structural Biology Laboratory, however, reveals that much of the deposited data on carbohydrate structures may be flawed.

Structural studies of large biological molecules such as proteins and glycoproteins (molecules that combine protein with carbohydrate elements, the latter known as glycans) are vital in determining how these molecules function. Reporting the correct structures of glycans is increasingly important for the approval of new drugs by regulatory bodies such as the US Food and Drug Administration. The focus on the pharmaceutical and therapeutic potential of glycoproteins is also driving a rise in new protocols and techniques for their production, which in turn is increasing the amount of data on new carbohydrate-containing protein structures. As a result, a much wider range of data on carbohydrate structures is now available for statistical analysis.

To determine molecular structure, scientists use techniques like X-ray crystallography, and the resulting data is deposited in the worldwide Protein Data Bank (PDB). A new study has analyzed the conformation and fit to experimental data of a subset of the deposited carbohydrates: N-glycan-forming D-pyranosides, chosen because they are all expected to be in the same naturally favoured low-energy conformation, which makes anomalies easier to identify.


3-D representation of one of the studied carbohydrate structures. The monosaccharide on the right fits the experimental data (blue mesh) well and is in the expected conformation, while the one on the left shows a distorted conformation and a poor fit to the data. Credit: CCP4mg (www.ccp4.ac.uk)

Dr. Jon Agirre, Professor Gideon Davies, Professor Keith Wilson, and Dr. Kevin Cowtan found that nearly two-thirds of N-glycan D-pyranosides show a poor fit to the experimental data.

Agirre says, “64 percent of all N-glycan D-pyranosides show a correlation to density of less than 0.8, reflecting a poor fit to the experimental data. Indeed, 12 percent show a correlation smaller than 0.5. On top of that, about 25 percent of the studied sugars are in energetically improbable conformations; these are almost certainly wrong.”

Davies adds, “This creates a vicious circle: publication and deposition of incorrect structures informs subsequent statistical analyses that suggest the deposited structures are normal.”

The software developed to perform the analysis, Privateer, has been published by the Collaborative Computational Project No. 4 (CCP4).