Dissecting efficiency of a 5' rapid amplification of cDNA ends (5'-RACE) approach for profiling T-cell receptor beta repertoire.

Department of Biotechnology and Bioindustry Sciences, National Cheng Kung University, Tainan, Taiwan. Molecular Diagnostic Laboratory, Department of Pathology, National Cheng Kung University Hospital, Tainan, Taiwan.

PloS one. 2020;(7):e0236366

Abstract

Deep sequencing of T-cell receptor (TCR) genes is a powerful way to profile the immune repertoire. To prepare a TCR sequencing library, multiplex polymerase chain reaction (mPCR) is widely applied and is highly efficient: most mPCR products contain the region critical for antigen recognition, which also indicates regular V(D)J recombination. Multiplex PCR, however, may suffer from primer bias. A promising alternative is 5'-RACE, which avoids primer bias by using only one primer pair. In 5'-RACE data, however, non-regular V(D)J recombination (e.g., TCR sequences without a V gene segment) has been observed, and its frequency varies (30-80%) between studies. This suggests that neither the cause of non-regular TCR sequences nor how to reduce them is yet well understood. Although it is possible to speculate about the cause by comparing 5'-RACE protocols, careful experimental confirmation is needed, and no such systematic study is yet available. Here, we examined the 5'-RACE protocol of a commercial kit and demonstrated how a modification increased the fraction of regular TCR-β sequences to >85%. We also found a strong linear correlation between the fraction of short DNA fragments and the percentage of non-regular TCR-β sequences, indicating that the presence of short DNA fragments in the library was the main cause of non-regular TCR-β sequences. Therefore, thorough removal of short DNA fragments from a 5'-RACE library is the key to high data efficiency. We strongly recommend conducting a fragment length analysis before sequencing; the fraction of short DNA fragments can then be used to estimate the percentage of non-regular TCR sequences. As deep sequencing of TCR genes is still relatively expensive, good quality control of this kind should be valuable.
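
The quality-control idea in the abstract (measure the fraction of short fragments, then use a linear relationship to estimate the percentage of non-regular TCR-β sequences) could be sketched roughly as follows. This is a minimal illustration, not the authors' analysis: the function names, the length cutoff, the use of simple fragment counts rather than instrument-reported molarity, and the calibration numbers in the demo are all hypothetical placeholders, and a real calibration would have to come from libraries that were both fragment-analyzed and sequenced.

```python
from statistics import mean


def short_fragment_fraction(lengths, cutoff_bp):
    """Fraction of fragments shorter than cutoff_bp (the cutoff is an assumption)."""
    if not lengths:
        raise ValueError("no fragment lengths given")
    return sum(1 for n in lengths if n < cutoff_bp) / len(lengths)


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def estimate_non_regular_pct(short_fraction, slope, intercept):
    """Predicted percentage of non-regular sequences for a new library."""
    return slope * short_fraction + intercept


if __name__ == "__main__":
    # Placeholder calibration pairs, NOT values from the study:
    # (short-fragment fraction, % non-regular sequences) per library.
    calib_fraction = [0.10, 0.20, 0.40]
    calib_non_regular = [10.0, 20.0, 40.0]
    slope, intercept = fit_line(calib_fraction, calib_non_regular)

    # Placeholder fragment lengths (bp) for a new library; 300 bp is an
    # arbitrary illustrative cutoff, not one stated in the abstract.
    new_library = [150, 220, 480, 510, 530, 560, 600, 620]
    frac = short_fragment_fraction(new_library, cutoff_bp=300)
    print(estimate_non_regular_pct(frac, slope, intercept))
```

In practice the short-fragment fraction would more likely be read from a fragment analyzer trace than from a list of lengths, and the prediction is only as reliable as the calibration set behind the fitted line.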