Raman Spectroscopic and Chemometric Investigation of Lipid–Protein Ratio Contents of Soybean Mutants

Seeds belonging to fourth generation mutants (M4) of Ataem-7 cultivar (A7) variety and S04-05 (S) breeding line salt-tolerant soybeans were investigated by Raman spectroscopy, complemented by chemometrics methods, in order to evaluate changes induced by mutations in the relative lipid–protein contents, and to find fast, efficient strategies for discrimination of the mutants and the control groups based on their Raman spectra. It was concluded that gamma irradiation caused an increase in the lipid to protein ratio of the studied Ataem-7 variety mutants, while it led to a decrease of this ratio in the investigated S04-05 breeding line mutants. These results were found to be in agreement with data obtained by reflectance spectrum analysis of the seeds in the full ultraviolet to near-infrared spectral region and suggest the possibility of developing strategies where gamma irradiation can be used as a tool to improve mutant soybean plants targeted to different applications, either enriched in proteins or in lipids. Ward's clustering and principal component analysis showed a clear discrimination between mutants and controls and, in the case of the studied S-type species, discrimination between the different mutants. The grouping scheme is also found to be in agreement with the compositional information extracted from the analysis of the lipid–protein contents of the different samples.


Introduction
Soybean (Glycine max (L.) Merr.) is an important oilseed crop and economically the most significant bean in the world. 1 It has assumed a prominent role in human and animal diets owing to its high nutritional value, in particular its high protein and oil content, and its versatility for different uses. Around 400 byproducts of soybean are valuable for various industries, where they are used as ingredients in the production of adhesives, biofuel, soaps, doughs, alcohol, paints, and fertilizers, among others. [2][3][4][5][6][7][8][9][10] Because soybean contains no starch, it is also a good source of protein for diabetics.
Genetic transformation has been widely used as an attractive advancement in soybean breeding programs, allowing the agronomic qualities of soybean to be improved. [11][12][13] Induced mutation techniques are useful for increasing the frequency of mutations and thereby generating genetic variation. Developing new variants to solve different agricultural crop problems is usually achieved successfully by mutation breeding in a shorter period than by classical breeding processes. Gamma radiation is well known to cause rapid and reliable physicochemical and biological alterations in organisms. [14][15][16] In mutation breeding studies, the next crucial step is to select the desired genotypes for new cultivars. Because the mutants selected from the M2 and M3 generations differ in several traits, it is important to discriminate the desired genotypes in order to develop new cultivars. Molecular structural evaluation is therefore advantageous for determining the traits of the mutants.
Numerous methods have been developed over the last couple of decades to measure the protein and oil content of crop seeds. Conventional analysis techniques are mainly destructive, expensive, and labor intensive, 9,17,18 and alternative, cheaper, essentially nondestructive techniques have been acquiring increasing importance for this purpose. Raman spectroscopy has been used along with infrared (IR) spectroscopy, in both the mid-IR (MIR) and near-IR (NIR) regions, as a reliable analysis tool to probe and characterize biomolecules in complex systems, since these techniques are particularly well suited to identifying functional groups belonging to proteins, lipids, and other biochemically relevant molecules through the characteristic spectral features they give rise to. 10,[19][20][21] Raman spectroscopy has several advantages over infrared that make it a more effective technique for compositional analysis of biochemically relevant systems. First, Raman spectra are considerably less perturbed by the presence of water (a ubiquitous constituent of biological materials) in the sample than infrared spectra (including the NIR region). In fact, water gives rise to bands of very weak Raman intensity, while in infrared the water bands are very intense and overshadow wide spectral regions that are relevant for the investigation of other species. 22 Second, Raman spectroscopy requires only very minor preprocessing of the samples, making it a practical, easy-to-use, and essentially nondestructive technique. Finally, Raman spectra of biomolecules usually show bands with narrower profiles than infrared spectra, thus allowing a more detailed discrimination of different spectral features and facilitating identification of the characteristic bands ascribable to functional groups belonging to different constituents present in the sample. 10
Both Raman and infrared spectroscopies have been used in the last decade for composition analysis of soybean seeds, in particular of protein, oil, carbohydrate, and water contents. 9,10,18 Recently, compositional alterations in second generation soybean mutants obtained by harvesting gamma-irradiated soybeans, as well as conformational alterations of their protein isolates, were investigated by means of infrared spectroscopy. 23,24 In the present study, salt-tolerant mutant soybean seeds were investigated. The original seeds (obtained from the Black Sea Agricultural Research Institute, Samsun, Turkey) were subjected to gamma irradiation (4.29 Gy/min dose rate) to induce random mutations, allowing subsequent selection of the salt-tolerant mutants. The seeds were propagated and harvested over several generations to test the stability of the salinity tolerance of the selected mutants. A detailed description of the procedures followed to generate and select the mutants and to evaluate their salt tolerance has been provided elsewhere. 25,26 The present article focuses on the determination of the changes in the lipid-protein ratio of Ataem-7 soybean line (A7) and S04-05 soybean breeding line (S) M4 generation mutants by Raman spectroscopy, together with multivariate analyses of the spectral data. This study aims to demonstrate that Raman analysis allows a reliable, fast determination of this fundamental compositional feature and that it may also be used successfully to discriminate between the mutants and the control groups (non-irradiated soybean seeds of the same variety).

Materials and methods
In this study, seeds of the Ataem-7 soybean variety (A7) and the S04-05 breeding line (S), provided by the Black Sea Agricultural Research Institute, Samsun, Turkey, were irradiated with a Cs-137 gamma source at doses of 150, 200, and 250 Gy (4.29 Gy/min dose rate). Six mutants (one derived from the Ataem-7 variety and five derived from the S04-05 soybean breeding line) induced by this irradiation were selected among M4 generation seeds that had been tested under both in vitro and in vivo conditions and proven to tolerate 90 mM NaCl.
Samples were named A7-C (control) and A7-M (mutant) in the case of the Ataem-7 variety, and S-C (control) and S1-M, S2-M, S3-M, S4-M, and S5-M (mutants) for the S04-05 breeding line. The lipid and protein contents of the soybean seeds were determined by UV-Vis-NIR reflectance spectral analysis in the 800-2500 nm wavelength range, with 1 nm resolution, using a Foss 6500 NIR system, in order to obtain reference values for testing the results obtained by Raman spectroscopy. 27 The total lipid and protein contents were calculated according to the calibration programme, on a dry weight basis, with respect to the control samples and are represented as percentage values.
Raman spectra were acquired using a B&W Tek iRaman Plus-785 micro-Raman system equipped with a 785 nm excitation laser. The laser power at the sample was set to 280 mW. Before use, all soy seeds were peeled and gently split in half. No further treatment was applied to the seeds for spectra collection. Each sample was placed on an aluminum-covered thin glass slide, and Raman spectra were recorded from different parts of the seeds in the 400-3000 cm⁻¹ spectral range. An integration time of 30 s was used, and 32 scans were averaged for subsequent spectral and statistical analysis.
Spectral post-processing, such as baseline correction, normalization, and calculation of derivative spectra, other corrections such as multiplicative scatter correction (MSC) and standard normal variate (SNV), and the multivariate analyses were performed using OMNIC (Thermo Fisher Scientific) and The Unscrambler (CAMO Software). Band areas were calculated using OMNIC in a supervised way and used to estimate the differences in the lipid-protein ratios of the different soybean mutants relative to the controls. The estimated uncertainty in the percent changes in lipid-protein ratios is smaller than 2%. Cluster analysis and principal component analysis (PCA) of the A7 soybeans were performed using the recorded raw spectra, subjected only to baseline correction. For the analysis of the S soybeans, which included five different mutants, the spectra were first vector normalized, SNV and MSC corrections were applied, and then the second derivative spectra were obtained and used to perform the analyses. Ward's method using squared Euclidean distances was used to construct the dendrograms.
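The preprocessing chain described above for the S spectra (vector normalization, SNV, MSC, second derivative) can be sketched as follows. This is a minimal illustration in Python with NumPy and SciPy, not the OMNIC or Unscrambler implementation used in the study. The function names, the Savitzky-Golay window and polynomial order, and the synthetic spectra are all assumptions made for the example.

```python
import numpy as np
from scipy.signal import savgol_filter


def vector_normalize(spectra):
    """Scale each spectrum (row) to unit Euclidean norm."""
    norms = np.linalg.norm(spectra, axis=1, keepdims=True)
    return spectra / norms


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum individually."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def msc(spectra, reference=None):
    """Multiplicative scatter correction.

    Each spectrum is regressed on a reference (default: the mean spectrum)
    and the fitted offset and slope are removed.
    """
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        slope, offset = np.polyfit(ref, row, 1)
        corrected[i] = (row - offset) / slope
    return corrected


def second_derivative(spectra, window=15, polyorder=3):
    """Savitzky-Golay second derivative along the wavenumber axis.

    window and polyorder are illustrative choices, not values from the study.
    """
    return savgol_filter(spectra, window_length=window,
                         polyorder=polyorder, deriv=2, axis=1)


def preprocess(spectra):
    """Apply the four steps in the order given in the text."""
    return second_derivative(msc(snv(vector_normalize(spectra))))


# Synthetic stand-in for measured spectra: one Gaussian band with a random
# multiplicative scale and additive offset per spectrum (hypothetical data).
rng = np.random.default_rng(0)
wavenumber = np.linspace(400, 3000, 651)
band = np.exp(-0.5 * ((wavenumber - 1450) / 20) ** 2)
scales = rng.uniform(0.5, 2.0, size=(6, 1))
offsets = rng.uniform(0.0, 0.3, size=(6, 1))
raw = scales * band + offsets + rng.normal(0, 0.001, size=(6, wavenumber.size))

processed = preprocess(raw)
print(processed.shape)
```

In this toy case MSC alone already maps all six spectra back onto the mean spectrum, because they differ only by scale and offset; real spectra also differ in composition, which is the information the later multivariate steps work on.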

Results and discussion
Soybean has a complex composition, which includes complete proteins, carbohydrates, and lipids, as well as vitamins and minerals. 28 The usual absolute amounts of the three main types of nutrients in soybean (proteins, carbohydrates, and lipids) are approximately 40%, 30%, and 20% of the total weight, respectively. 29 It has been shown that gamma irradiation within the doses used in the present investigation gives rise to mutant soybean species in which essentially only the relative contents of lipids and proteins are affected, the remaining constituents, including trace constituents, being essentially unaffected by this procedure. 23 The vector-normalized average Raman spectra of the control and mutant groups of the two investigated soybean lines, A7 and S, are depicted in Figures 1 and 2. The spectra exhibit differences in the relative intensities of the bands, demonstrating that the nutrient composition of the various groups is dissimilar. A general assignment of the observed bands is given in Table I, following the data in the literature. 10,20,30 Above 1200 cm⁻¹, the Raman spectra are dominated by bands with major contributions from lipids (1200-1500 cm⁻¹ range, ascribable mainly to the CH2 twisting, wagging, and scissoring modes and CH3 deformation vibrations; 1720-1750 cm⁻¹, C=O stretching; and 2800-3000 cm⁻¹, CH2/CH3 stretching modes) or proteins (1500-1650 cm⁻¹, amides I and II). 10 Contributions of other constituents of the soybeans to the total band intensity in these spectral regions can be expected to be minor.
Besides, since the total amount of nutrients other than lipids and proteins can be expected to be nearly constant in all groups studied, 23 the observed intensity differences can be attributed to the relative amounts of lipids and proteins in the different groups. A good test of this assumption, in the case of the lipid-related spectral ranges, consists in plotting the ratios of the bands observed in the different spectra in the three lipid-related regions against each other. As expected, straight lines were obtained, confirming that the band intensities in these spectral regions arise essentially from the lipids (a few outliers, most probably resulting from artifacts of the baseline correction treatment, were identified and removed from the dataset). These four spectral regions were then used in the present study to estimate the variations in the lipid-protein ratios of the various groups of mutants compared to the corresponding control groups. The results are summarized in Table II.
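The consistency test described above, in which band quantities from the three lipid-related regions are plotted against each other and expected to fall on straight lines, can be sketched as a pairwise linear fit. This is a minimal Python illustration under stated assumptions: the variable names, the proportionality factors, and the synthetic band areas are hypothetical, and the study itself used band areas obtained in OMNIC.

```python
import numpy as np


def linearity(x, y):
    """Least-squares line y = slope * x + intercept and its R^2."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    r2 = 1.0 - residuals.var() / y.var()
    return slope, intercept, r2


# Synthetic band areas for 20 spectra (hypothetical data): each region is
# taken to be proportional to a common lipid amount plus small noise.
rng = np.random.default_rng(1)
lipid_amount = rng.uniform(0.8, 1.2, size=20)
area_ch_def = 1.0 * lipid_amount + rng.normal(0, 0.002, 20)  # 1200-1500 cm^-1
area_co_str = 0.2 * lipid_amount + rng.normal(0, 0.002, 20)  # 1720-1750 cm^-1
area_ch_str = 3.0 * lipid_amount + rng.normal(0, 0.002, 20)  # 2800-3000 cm^-1

pairs = {
    "CH def vs C=O str": (area_ch_def, area_co_str),
    "CH def vs CH str": (area_ch_def, area_ch_str),
    "C=O str vs CH str": (area_co_str, area_ch_str),
}

results = {}
for name, (x, y) in pairs.items():
    slope, intercept, r2 = linearity(x, y)
    results[name] = r2
    print(f"{name}: slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.3f}")
```

A low R^2 for one pair, or individual points far from the fitted line, would flag either a non-lipid contribution in that region or a preprocessing artifact of the kind mentioned in the text.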
The striking result is that the studied salt-tolerant soybean mutants developed from A7 and S plants were found to have responded differently to the gamma irradiation, with the lipid content increasing in A7 mutants and decreasing in S mutants. Indeed, as shown in Table II, the lipid-protein ratio increases by ca. 6% in A7 mutants compared to the control group, while for S mutants the general trend is the opposite, though the different mutants were found to respond in a somewhat different manner to the gamma irradiation treatment: in S3-M and S1-M, the mutants where the changes are largest, the lipid-protein ratio decreases by 9% and 7%, respectively, in relation to the control group; the decrease in the lipid-protein ratio for S4-M and S5-M is more modest (2% and 3%, respectively); and S2-M is predicted to have a lipid-protein ratio practically unchanged compared to the control group. Identifying the reasons for the different behavior of the two soybean lines, or of the different mutants within the S line, is beyond the scope of this investigation, but the present results clearly demonstrate the great potential of gamma irradiation for developing mutant soybean plants targeted to different applications, enriched either in proteins or in lipids.
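The percent variations quoted above can be read as relative changes of a lipid-to-protein band-area ratio with respect to the control. The sketch below assumes this definition, since the exact weighting of the individual band areas is not stated in the text. The function names and the band areas are hypothetical and chosen only to show the arithmetic.

```python
def lipid_protein_ratio(lipid_area, protein_area):
    """Ratio of the summed lipid band area to the protein band area."""
    return lipid_area / protein_area


def percent_variation(ratio_mutant, ratio_control):
    """Relative change of the mutant ratio with respect to the control, in %."""
    return 100.0 * (ratio_mutant - ratio_control) / ratio_control


# Hypothetical band areas in arbitrary units (not measured values).
control = lipid_protein_ratio(lipid_area=40.0, protein_area=20.0)
mutant = lipid_protein_ratio(lipid_area=38.0, protein_area=20.0)

change = percent_variation(mutant, control)
print(f"{change:+.1f}%")  # negative sign: lipid-protein ratio decreased
```

With this sign convention, a positive value corresponds to a mutant enriched in lipids relative to proteins and a negative value to one enriched in proteins.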
In Table II, the results obtained from the Raman data are compared with quantitative data determined using ultraviolet-visible-near-infrared (UV-Vis-NIR) analysis, the agreement between the two sets of data being very good. Considering the costs of the two types of analyses, it is noteworthy that the cheaper and faster Raman analysis can provide such a good estimate of the lipid-protein ratio variation in the studied samples.
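One simple way to put a number on the agreement between the two data sets is to compare the two rows of percent variations column by column. The sketch below uses the values as they are printed in Table II, in the order given there. Taking the first row as the Raman result and the second as the UV-Vis-NIR result follows the order of the table caption and is an assumption, as is the choice of summary statistics.

```python
import numpy as np

# Percent variations relative to the controls, as printed in Table II.
# Assumption: first row = Raman, second row = UV-Vis-NIR (caption order).
raman = np.array([-6.0, -1.0, -9.0, -2.0, -3.0, 6.0])
uv_vis_nir = np.array([-6.2, 1.3, -9.4, -5.6, -0.3, 4.2])

difference = raman - uv_vis_nir
mean_abs_difference = np.mean(np.abs(difference))      # percentage points
pearson_r = np.corrcoef(raman, uv_vis_nir)[0, 1]
same_sign = np.sign(raman) == np.sign(uv_vis_nir)

print(f"mean absolute difference: {mean_abs_difference:.2f} percentage points")
print(f"Pearson correlation: {pearson_r:.2f}")
print(f"samples with the same direction of change: {same_sign.sum()} of {same_sign.size}")
```

With only six samples these statistics are indicative at best; they are meant to show how the comparison can be made, not to replace the authors' own assessment.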
A cluster analysis of the different soybean groups, according to Ward's method using squared Euclidean distances, was performed for both A7 and S mutants. The results are presented in Figures 3 and 4, respectively. For the A7 variety, since only one type of mutant was investigated, the raw spectra (with baseline correction only) were used, allowing a good clustering of the samples, as shown in [...] Figure 7) shows that, in consonance with this conclusion, the main differences between the two spectra occur below 1200 cm⁻¹, i.e., in the fingerprint spectral region where, with a few exceptions, the bands cannot be assigned to a single class of compounds but receive significant contributions from several species present in the sample (see Table I). On the other hand, the lipid- and protein-dominated spectral regions are rather similar in these two spectra, as expected, since they correspond to points in the scores plot with relatively similar PC1 scores (see Figure 6). An interesting observation resulting from the comparison of the Raman spectra shown in Figure 7 is that, in the spectrum of the mutant, the characteristic band of phenylalanine at about 1003 cm⁻¹ is considerably more intense than in the spectrum of the control, indicating that the amino acid contents of the two samples differ, at least as far as the relative amount of phenylalanine is concerned; phenylalanine is well known to occur in considerable amounts in soybean (up to ~2% of the total weight). 10 On the whole, the principal component analyses performed on both A7 and S salt-tolerant soybean mutants are in very good agreement, on the one hand, with the trends extracted from the cluster analysis regarding the similarities and dissimilarities of the various S line species studied and, on the other hand, with the results obtained from the Raman intensities on the effects of gamma irradiation on the relative contents of lipids and proteins of the soybeans.
The combination of Raman spectroscopy with chemometric analyses is thus a convenient strategy for discriminating between the mutants and the control groups, besides being a fast, reliable method for evaluating compositional parameters, specifically the lipid-protein ratio.
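The two chemometric steps used above, Ward's clustering and PCA, can be sketched on synthetic data as follows. This is a minimal Python illustration with NumPy and SciPy, not the Unscrambler workflow used in the study. The two-band model spectra, the group compositions, and all names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def pca_scores(X, n_components=2):
    """PCA via SVD of the mean-centred data.

    Returns the scores and the explained variance ratio of the
    first n_components principal components.
    """
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]


# Synthetic stand-in for control and mutant spectra (hypothetical data):
# one lipid-like and one protein-like band with different relative weights.
rng = np.random.default_rng(2)
wavenumber = np.linspace(400, 3000, 300)
lipid_band = np.exp(-0.5 * ((wavenumber - 1440) / 25) ** 2)
protein_band = np.exp(-0.5 * ((wavenumber - 1600) / 25) ** 2)


def make_group(lipid, protein, n=5):
    base = lipid * lipid_band + protein * protein_band
    return base + rng.normal(0, 0.002, size=(n, wavenumber.size))


control = make_group(1.00, 1.00)
mutant = make_group(1.15, 0.95)
X = np.vstack([control, mutant])

# Ward's method. SciPy's "ward" linkage minimises the within-cluster variance,
# i.e. the same merging criterion; the heights it reports are on a Euclidean
# rather than a squared Euclidean scale.
Z = linkage(X, method="ward")
labels = fcluster(Z, t=2, criterion="maxclust")

scores, explained = pca_scores(X)
print("cluster labels:", labels)
print("explained variance ratio:", np.round(explained, 3))
```

In this toy case the only systematic difference between the groups is the lipid-protein balance, so both the two-cluster cut of the dendrogram and the sign of the PC1 score separate controls from mutants.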

Conclusion
In this study, the relative lipid-protein contents of Ataem-7 variety and S04-05 breeding line salt-tolerant soybean seeds belonging to fourth generation mutants (M4), together with the corresponding controls, were investigated by means of Raman spectroscopy and compared with the data obtained using the UV-Vis-NIR reflectance spectroscopy detection method.

[Figure caption fragment] Axes were normalized to unity.

[Table I fragment] OCO bending of esters/phenylalanine
a Assignments refer to the expected major contributors. In the case of the methylenic vibrations, for example, contributions to polysaccharides and proteins can also be expected.

Table II. Lipid-protein ratios in the studied soybean control groups and salt tolerant mutants, and percent variation relative to the controls determined by Raman spectroscopy and quantitative analysis using UV-Vis-NIR reflectance spectroscopy.
Percent variation relative to the controls: -6%, -1%, -9%, -2%, -3%, 6%
Average lipid/protein ratio: Percent variation relative to the controls: -6.2%, 1.3%, -9.4%, -5.6%, -0.3%, 4.2%