One of the most useful properties of steel, and a trait that is highly valued by engineers, is its great strength. A wide variety of steels are commercially available, the differences between them being essentially due to the amount of various alloying elements which are added and the subsequent processing to which the steel is subjected. Changes in these parameters have a profound effect on the microstructure, leading to great variations in the macroscale properties of the metal.
Interactions between all of these factors are complex and generally poorly understood. As a result, it is not currently possible to predict the strength of a steel without testing, although complete knowledge of its composition and processing history ought to be sufficient information to be able to do so. The ability to predict strengths from these variables remains a highly desirable goal.
An empirical approach is an alternative way to model such complex phenomena, and neural networks are well suited to these problems. A description of the mathematical theory behind neural networks can be found in any number of textbooks, so discussion of the associated issues is omitted from this report. One of the key features of the algorithms used to build the models in the present report is that they were developed within a Bayesian framework. This means that it is possible to calculate error bars on predictions, which indicate the degree of uncertainty in the models. As a result, a new measure of the error in a model, the 'Log Predictive Error' (LPE), can be defined, which does not penalise a model too heavily for poor predictions if suitably large error bars accompany them. Details are presented in [1], and the relevant software is freely downloadable from [2]. A user-friendly interface is commercially available from Neuromat (www.neuromat.co.uk).
Two neural networks have been developed to model different measures of the strength of steels: the yield strength (or elastic limit, sometimes abbreviated to YS) and the ultimate tensile strength (or UTS). Although these two properties are intimately linked, the relationship between them is non-trivial. Stronger steels, meaning steels with higher values for the tensile strength, are more likely to have high values for the yield strength too, but this is not necessarily the case, as can be ascertained from figure 1, which shows the relationship between these two properties for the steels in the database used to train the models.
For certain applications, this decoupling of the two properties can be exploited. Structural engineers often use steel to strengthen buildings, so high-strength alloys are of particular interest to them. But in some cases (for example when constructing a building in an area particularly prone to earthquakes), it is critical that the metal be capable of tolerating large strains without suffering permanent damage; in effect, a small value of the YS/UTS ratio is desired. It is, therefore, important to understand how and why the strength of steels (both the yield and the tensile strength) varies.
Figure 1: The relationship between the YS and the UTS of various steels.
It is thought that the increased strength, when compared with that of plain iron, arises from several mechanisms. Firstly, thanks to the seminal work of Hall and Petch, the grain size, d, is known to be intimately related to the strength of any metal; the magnitude of this effect is proportional to d^{-1/2} (details can be found in standard textbooks on the subject). Secondly, any addition will to some extent enter into solid solution in the iron matrix, so some solid solution strengthening will also be present, its magnitude depending on the solubility of each element. Thirdly, precipitation of secondary phases may contribute to the overall strength. Other factors believed to be important include 'forest' dislocation strengthening, sub-grain size and texture. Finally (and most importantly), the intrinsic strength of the iron matrix, the friction stress opposing dislocation motion, greatly affects the strength of any alloy. Many published equations [3] for the yield stress, \sigma_{YS}, are therefore of the type
\sigma_{YS} = \sigma_0 + \sigma_{ss} + \sigma_{ppt} + k_y\,d^{-1/2},

where \sigma_0 is the friction stress of the iron matrix, \sigma_{ss} the solid solution strengthening contribution, \sigma_{ppt} the precipitation strengthening contribution and k_y\,d^{-1/2} the Hall-Petch grain size term,
although most models omit one or more of the individual terms.
In writing such an expression, we assume that all strengthening mechanisms are additive. This is not necessarily true [3], a fact that compromises existing models using this approach: although models of this form can be made to fit experimental data satisfactorily, there is no theoretical justification for them. A neural network model makes no assumptions about the data, other than requiring that the output varies continuously with all of the input parameters. It seems reasonable to assume that this restriction is satisfied in the present case.
All neural networks require a database from which they 'learn' trends in the data, which are then used to make predictions. The present study was completed using the results kindly donated by Corus from numerous experimental studies conducted during the 1950s and 60s. Because the data has been collated from many disparate sources, none of which were originally intended to be used to build a neural network, its quality is variable. Many studies recorded the contents of only certain components of the steel, and processing information was often incomplete.
For some elements (e.g. Nb, V), it was thought safe to assume that none was present if no value had been recorded. For others, such as the impurities S and P, such an assumption would have been unreasonable. Where no composition was recorded for these elements, the value was set to the average over those steels for which the data were available. In other instances, inequalities were documented (e.g. <0.02wt% Al); in these cases, reasonable (but arbitrary) values were assigned for the relevant content.
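As an illustration only, the following sketch shows how such an imputation scheme might be implemented; the element lists, the handling of inequality entries and the example values are hypothetical and are not taken from the Corus database.

```python
import numpy as np
import pandas as pd

# Hypothetical raw records: NaN means "not recorded"; strings like "<0.02" are inequalities.
raw = pd.DataFrame({
    "Nb": [0.03, np.nan, np.nan],   # deliberate addition: assume zero if unrecorded
    "S":  [0.025, np.nan, 0.031],   # impurity: impute the database average
    "Al": ["<0.02", 0.01, np.nan],  # inequality entry
})

ASSUME_ZERO = ["Nb", "V"]   # assumed absent if not recorded
IMPUTE_MEAN = ["S", "P"]    # impurities: use the average of the recorded values

def resolve_inequalities(col: pd.Series) -> pd.Series:
    """Replace '<x' strings by an arbitrary (but reasonable) value below the bound."""
    def parse(v):
        if isinstance(v, str) and v.startswith("<"):
            return 0.5 * float(v[1:])   # arbitrary choice: half the quoted upper bound
        return v
    return col.map(parse).astype(float)

for element in raw.columns:
    raw[element] = resolve_inequalities(raw[element])
    if element in ASSUME_ZERO:
        raw[element] = raw[element].fillna(0.0)
    elif element in IMPUTE_MEAN:
        raw[element] = raw[element].fillna(raw[element].mean())

print(raw)
```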
For certain elements, content data was very sparse; so much so that it was not deemed useful to include it as a variable in the model. In all, information relating to 16 elements was sufficiently detailed to be considered statistically significant in the network analysis. They were: C, Si, Mn, P, S, Cr, Mo, Ni, Al, B, Cu, N, Nb, Sn, Ti and V.
It should be noted, however, that some of the elements were included even though as few as 60 of the experimental steels reported their content. Consequently, users of the model ought to be wary when predicting strengths of steels in which the secondary alloying elements and impurities are present in quantities different from the database average. Very large error bars, indicating that the neural network has little confidence in its solution, will in any case accompany such predictions.
Information regarding the processing was equally imprecise. Often, steels that had been tempered had no record of the length of time for which, or the temperature at which, they were held, and could therefore not be used in the database. The post-austenitising cooling rate was equally rarely noted; where no special mention was made, steels were assumed to have been air cooled (corresponding to a rate of 1 K/s).
Omitting insufficiently precise data vastly reduced the size of the databases used to train the models: that used for the YS model consisted of 1560 different steels, and 1563 lines were used in the UTS model. Most steels were common to both databases.
Some authors (e.g. [3], [4]) have suggested that the solid solution strengthening effect of the alloying elements is related to the square root of the concentration. All alloying elements will strengthen (or weaken) the steel to some extent by entering into solid solution, although they could potentially affect the strength by other mechanisms as well. The number of atoms distorting the lattice is the most important factor in solid solution strengthening, so the square root of the atomic concentration was thought to be a useful parameter describing this effect.
Where non-linear relationships of this kind are suspected, it is appropriate to supply suitably transformed variables as additional input parameters for the neural network. In an attempt to produce more accurate predictions, the network was therefore trained with the square roots of the atomic concentrations (as well as the standard wt% concentrations) as inputs. Atomic concentrations were calculated by dividing the weight percent of each element by its atomic weight (expressed to 1 d.p.).
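A minimal sketch of this transformation is given below, assuming the atomic weights shown (rounded to 1 d.p.) and, as described above, no renormalisation of the atomic concentrations; the function and dictionary names are illustrative.

```python
import math

# Atomic weights to 1 d.p.
ATOMIC_WEIGHT = {"C": 12.0, "Si": 28.1, "Mn": 54.9, "Ni": 58.7, "Nb": 92.9, "V": 50.9, "N": 14.0}

def sqrt_atomic_inputs(wt_pct: dict) -> dict:
    """Extra network inputs: square root of (wt% / atomic weight) for each element."""
    return {f"sqrt_at_{el}": math.sqrt(w / ATOMIC_WEIGHT[el]) for el, w in wt_pct.items()}

# Example: a hypothetical 0.15 wt% C, 1.0 wt% Mn, 0.3 wt% Si steel.
print(sqrt_atomic_inputs({"C": 0.15, "Mn": 1.0, "Si": 0.3}))
```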
When microalloying elements (i.e. niobium, titanium and/or vanadium) are added to a steel, the dominant effect is precipitation strengthening [5], since the solubility of these elements in ferrite is low. Their strong affinity for both carbon and nitrogen causes them to precipitate out as very stable carbides or nitrides. More realistically, the particles are complex carbonitrides, which remain present in the austenitic steel during any heat treatment and act as barriers to grain growth through Zener drag. As well as this pinning effect, it has been reported that niobium (and to a lesser extent vanadium) can delay the recrystallisation of the steel during any heat treatment. Once cooled, the refined austenitic microstructure translates to a smaller grain size in the ferrite, which in turn, according to the celebrated Hall-Petch relationship, strengthens the steel.
The extent of strengthening by these means is clearly controlled by the relative concentrations of the elements present. This study will concentrate on the effect of niobium and vanadium; the interactions with titanium and aluminium (which also precipitate out, reducing the amount of nitrogen available for the formation of other nitrides) will not be considered.
The chemical formulae for the pure precipitates of interest are VC, VN, NbC and NbN. It was therefore thought that a ratio of the form

\frac{at_{Nb} + at_{V}}{at_{C} + at_{N}},

where at_X is the atomic percent of element X, might be a suitable measure of the strengthening due to carbonitride precipitation. Accordingly, this number was calculated for each line in the database and used as another input variable.
In developing the strength models, it was decided that only variables over which the steelmaker has direct control would be included as inputs. This means that the model will be able to make predictions of realistic experiments that could actually be carried out. If reliable, such an algorithm would be very useful to industry, since it could 'replace' (or rather, make suggestions for appropriate) experimental castings.
Essentially, this approach has meant that the only other possible input variables were process parameters, of which there were six: the austenitising temperature and the subsequent cooling or quench rate; the tempering time and temperature; and the rolling reduction and roll finishing temperature. When a particular steel was not subjected to any tempering treatment, this was indicated to the model by setting the tempering time to zero and the tempering temperature to some value between the maximum and the minimum of the tempering temperatures of those steels which were tempered. Similarly, if the steel was not rolled, the rolling reduction was set to 0%, with the finishing temperature fixed at some random value in the range of the finishing temperatures of those that were rolled.
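To make this encoding concrete, the sketch below shows one way the flag values described above could be assigned; the field names and the temperature ranges are hypothetical.

```python
import random

# Illustrative ranges observed among the treated steels (hypothetical values).
TEMPER_TEMP_RANGE = (550.0, 700.0)    # degrees C, tempered steels
FINISH_TEMP_RANGE = (750.0, 1050.0)   # degrees C, rolled steels

def encode_processing(record: dict) -> dict:
    """Fill in process inputs for steels that were not tempered or not rolled."""
    enc = dict(record)
    if not record.get("tempered", False):
        enc["temper_time_h"] = 0.0
        # any value between the minimum and maximum of the tempered steels
        enc["temper_temp_C"] = 0.5 * sum(TEMPER_TEMP_RANGE)
    if not record.get("rolled", False):
        enc["rolling_reduction_pct"] = 0.0
        enc["finish_temp_C"] = random.uniform(*FINISH_TEMP_RANGE)
    return enc

print(encode_processing({"tempered": False, "rolled": False}))
```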
Many models were trained, using different combinations of input variables. Two pairs of models will be described here: one pair used 22 variables (the wt% concentrations and the processing parameters) and will be referred to as the 'simple' models; the other, 'complex', models used the square roots of the atomic concentrations and the 'precipitate ratio' (defined earlier) as extra inputs, a total of 39.
Figure 2: Variation of σ_ν (left), LPE (centre) and TE (right) in individual models, as a function of the complexity of the model. Top row: complex YS model; Bottom row: complex UTS model.
The training was conducted on models with between 1 and 20 hidden units. For any given number of hidden units, six individual models were trained using different (random) initial values for the weights. It should be recalled that different initial values result in very different models, so effectively 120 separate models were trained on each database.
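The original models were of course trained with the Bayesian software of [2]; purely to illustrate the training schedule (1 to 20 hidden units, six random weight initialisations each, giving 120 models), the sketch below uses scikit-learn's MLPRegressor on placeholder data as a stand-in. It is not Bayesian and therefore does not produce the σ_ν or LPE quantities discussed next.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((1560, 22))          # placeholder inputs (22 'simple' variables)
y = rng.random(1560)                # placeholder normalised strengths

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

models = []
for n_hidden in range(1, 21):            # 1 to 20 hidden units
    for seed in range(6):                # six random weight initialisations
        net = MLPRegressor(hidden_layer_sizes=(n_hidden,), activation="tanh",
                           max_iter=2000, random_state=seed)
        net.fit(X_train, y_train)
        test_error = float(np.mean((net.predict(X_test) - y_test) ** 2))
        models.append({"hidden": n_hidden, "seed": seed,
                       "net": net, "test_error": test_error})

print(len(models), "models trained")     # 120, as in the text
```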
Figure 2 shows how the perceived noise, σ_ν, the log predictive error (LPE) and the test error (TE) varied with the number of hidden units for the neural networks trained on the 'complex' databases. The corresponding graphs for the 'simple' databases showed similar trends.
Although σ_ν decreases in both models as the number of hidden units increases, the scales should be noted: for the yield strength model the minimum is around 0.05, whereas it falls as low as 0.025 for the tensile strength model. This could be taken to mean that the YS data are inherently noisier. In fact, the values quoted for the yield strength sometimes referred to the 0.2% proof stress rather than the lower yield point; the difficulty of determining the yield stress experimentally has been detected by the model.
In choosing the committee of models to use as a predictor, the difference between the 'simple' and 'complex' models becomes more apparent. Again, readers should be aware of the scales when inspecting figures 3 and 4. Careful scrutiny of figure 4 reveals that the variation in the combined test error is greater for the complex UTS model than for the corresponding simple model, and that the minimum error is much smaller for the complex model.
Figure 3: Test error as a function of number of models in the committee for the YS models. Left (a): 'Simple' model; Right (b): 'Complex' model.
It is suggested that a greater variation in the committee test error is a positive sign: the limited variation in the committee error produced by the best 'simple' models may be due to the fact that they are all very similar. As a result, the advantage of forming a committee, namely improved generalisation over the whole input space so that the overall prediction is in some sense 'averaged out' to an optimal value, is foregone.
Although the preceding argument could be used to support the simple model for the YS network, figure 3 indicates that the minimum committee error is comparable in both cases. However, because the yield strength and tensile strength of steel are related, and since the YS data are so much noisier, it was thought that using similar database structures to model both properties would be reasonable.
Methods of selection are subjective. It is equally reasonable to suggest that the simple models may all have similar test errors because they all model the underlying metallurgy very well, and so would form the best committees too.
Figure 4: Test error as a function of the number of models in the committee for the UTS models. Left (a): 'simple' model; Right (b): 'complex' model.
In fact there are further inconsistencies in the way in which the committee was chosen. The trial committees are formed by taking the 'best' individual model, then the 'best' two, three and so on, the 'best' models being ranked in order of decreasing log predictive error. However, the 'best' committee is then chosen as the one with the smallest combined test error. Ranking the models using some other criterion, for example by increasing TE, could give an even smaller committee test error.
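A sketch of the selection procedure just described is given below, assuming each trained model carries a log predictive error (`lpe`) and its test-set predictions (`test_pred`); these field names, and the use of mean squared error as the combined test error, are illustrative assumptions.

```python
import numpy as np

def pick_committee(models, y_test):
    """Rank models by decreasing LPE, grow the committee one model at a time,
    and keep the size that gives the smallest combined test error."""
    ranked = sorted(models, key=lambda m: m["lpe"], reverse=True)
    best_members, best_error = ranked[:1], float("inf")
    for size in range(1, len(ranked) + 1):
        # committee prediction = mean of the members' predictions
        committee_pred = np.mean([m["test_pred"] for m in ranked[:size]], axis=0)
        error = float(np.mean((committee_pred - y_test) ** 2))
        if error < best_error:
            best_members, best_error = ranked[:size], error
    return best_members, best_error
```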
Unfortunately, the pressure of time has not allowed a more detailed investigation of these questions. The best model to predict the UTS was (perhaps arbitrarily) taken to be the committee formed by taking the average of the 11 'complex' models with the highest LPE, as indicated in figure 4(b). For the YS the best committee was made up of only 4 models, as can be ascertained in figure 3(b).
Figure 5: (a) Predicted UTS vs experimental UTS for the entire database; (b) Predicted YS vs experimental YS for the entire database.
The committee models were subsequently tested on the entire database (training + testing) in order to ascertain how well they perform. Their predictive performance is presented in figure 5. Both results are satisfactory. As with all graphs in this report, the error bars indicate one standard deviation and represent only the uncertainty in the model, taking no account of the perceived noise. The UTS predictions are clustered more tightly around the diagonal than in the corresponding YS plot, signifying that it is the more accurate of the two models.
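For reference, a commonly used way of combining an L-member committee's outputs into a single prediction and error bar (a sketch of the approach described in [1]; the exact expressions implemented in the software may differ) is

\bar{y} = \frac{1}{L}\sum_{l=1}^{L} y_l, \qquad
\sigma^2 = \frac{1}{L}\sum_{l=1}^{L} \sigma_{y,l}^{2} + \frac{1}{L}\sum_{l=1}^{L}\left(y_l - \bar{y}\right)^{2},

where y_l and \sigma_{y,l} are the prediction and modelling uncertainty of member l; the plotted error bar is then ±σ (one standard deviation), to which the perceived noise σ_ν would need to be added to obtain a full prediction interval.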
However, the graphs are slightly misleading: they plot the data with which the models were trained, for which a good fit is to be expected. To test whether the models are 'good', we ought not to concern ourselves solely with minimising test errors; useful models must also reproduce known metallurgical trends.
Pure iron is a relatively weak material, but the addition of carbon is known to greatly increase its strength. The solubility of carbon in ferrite is very low, so the effect of its entering into solid solution is small. Instead, increasing the carbon content increases the proportion of pearlite present in the steel. Since pearlite is stronger than plain ferrite, more carbon results in a stronger steel. A graph presenting the predicted UTS and YS for a steel containing 1wt% Mn, 0.15wt% Si, no microalloying additions and 'average' values for the other variables is presented in figure 6.
Figure 6: Variation of UTS and YS with carbon content.
The model has reproduced the trends suggested by metallurgical considerations: both the UTS and the YS are predicted to increase as more C is added to the steel. Note also that the UTS is always predicted to be higher than the YS, even though neither property is an input to the model of the other.
It is known that ferrite is softer than pearlite, so it is likely to yield first under an applied stress and to work harden before the pearlite behaves plastically. It therefore seems reasonable to postulate that the volume fraction of pearlite does not have a significant impact on the yield strength of a mainly ferritic steel. On the other hand, when greater stresses are applied, all of the phases present are obliged to undergo large plastic strains, so the pearlite content will then have an effect on the material behaviour.
In summary: carbon is expected to have a greater effect on the UTS than on the YS. This is frequently observed in practice, and is predicted by the neural network models.
Although carbon is routinely added to iron in order to strengthen it, an excessive amount brings other problems: the enhanced weldability and better toughness of low-carbon steels mean that they are generally preferred. Alloying with other elements is known to strengthen iron without the drawbacks associated with a high carbon content; consequently, it has become common practice to introduce silicon into steels.
Empirical regression equations for the yield and tensile strengths of ferrite-pearlite steels take the form

\sigma_{YS} = a_0 + a_1\,(\%pe) + a_2\,d^{-1/2} + a_3\,w_{Mn} + a_4\,w_{Si} + a_5\,w_{Nf}, \qquad (1.1)\ \&\ (1.2)

\sigma_{UTS} = b_0 + b_1\,(\%pe) + b_2\,d^{-1/2} + b_3\,w_{Mn} + b_4\,w_{Si} + b_5\,w_{Nf}, \qquad (2.1)\ \&\ (2.2)
where (%pe) denotes the amount of pearlite present as a percentage of the total microstructure, d the grain size and wX the wt% content of element X (Nf refers to the free nitrogen). These expressions were derived by simple linear regression several decades ago, 1.1 and 2.1 in 1963 [4], 1.2 and 2.2 in 1978 (as quoted in [6]).
All of these equations depend upon the volume fraction of pearlite, a microstructural quantity that is not an input for our neural network. Nevertheless, the neural network can still be tested against them: it is thought that silicon strengthens purely (or at least overwhelmingly) by entering into solid solution in the ferrite. Values quoted in the literature [3] for the magnitude of this effect (83 MPa per wt% in solution) are very close to the coefficients of w_Si in equations (1) and (2), evidence that would support this hypothesis. Because silicon does not affect the microstructure (for example, by altering (%pe)), varying the silicon content of a steel affects the strength purely by altering the amount of silicon in solid solution.
Figure 7: The effect of Si content on the YS (left) and the UTS (right) of two steels.
As can be seen in figure 7, the effect of silicon on a 0.15wt% C, 0.6wt% Mn ('ferritic') steel was indeed to increase the strength. Although the effect on the UTS is not entirely clear for small additions, above about 0.1wt% Si the relationship appears to be linear; the gradient of the line through the predictions is roughly 110 MPa per wt% Si. Figure 7(a) indicates that the YS is also directly related to the silicon concentration, with a gradient of about 50 MPa per wt%. Although the gradients calculated do not correspond exactly with the coefficients in equations (1) and (2), they are of a comparable magnitude, and a more precise fit would remain well within the error bars.
On the same axes, strength predictions are shown for one steel with a higher volume fraction of the pearlite phase. [It is not possible to do this explicitly with the model: setting concentration inputs to 0.6wt% C and 1wt% Mn ought to ensure that a good deal of pearlite is present]. These results are rather more surprising: they appear to indicate that silicon concentration and strength are not linearly related in pearlitic steels - at least not for small additions.
Gladman et al. [7] proposed a model, developed by purely regressive means, for the strength of pearlitic steels as a function of the volume fraction of ferrite, f_\alpha, the interlamellar spacing, S_0, the grain size, d, the free nitrogen content, w_{Nf}, and the Mn and Si contents, w_{Mn} and w_{Si}:

\sigma_{YS} = f_{1}\!\left(f_\alpha, S_0, d, w_{Nf}, w_{Mn}, w_{Si}\right), \qquad \sigma_{UTS} = f_{2}\!\left(f_\alpha, S_0, d, w_{Nf}, w_{Mn}, w_{Si}\right). \qquad (3.1)\ \&\ (3.2)

[The units are MPa, although S_0 and d are to be expressed in mm.]
In many ways, the difference between equations (3) and equations (1) and (2) is an artificial construct: intuitively, a continuous increase in pearlite content will lead to a continuous change in strength. Yet different models for high- and low-pearlite steels create a discontinuity in predicted strength for steels with an intermediate pearlite content, where we switch from one model to the other. It would be more satisfactory to have one model, predicting the strength in a continuous manner. One of the attractive features of neural networks is that they can provide predictions over the entire input space.
It is intriguing that the 'best fit' equations (3) imply that silicon has a greater effect on the UTS than on the YS. The very fact that the coefficients of w_Si are different in the two equations might indicate that silicon does not act purely as a solid solution strengthener. It might equally indicate that the assumption that the strengthening is directly proportional to concentration is flawed, although the error bars given for the w_Si coefficient (about ±21 MPa for both equations) could conceivably place each value at about the accepted value for solid solution strengthening, as in (1) and (2). The marked convexity of the YS 'curve' observed in figure 7 could also explain the underestimate of the w_Si coefficient in (3.1): a curve of this shape could lead to an underestimate of the gradient of the linear section in the range 0.3-0.9wt% Si, where the experiments were conducted.
We should, however, be wary of model predictions: few data were available for very low Si concentrations, and the predictions below 0.05wt% are accompanied by large error bars, so must be treated with caution; and the yield strength model is known to be considerably less accurate than the tensile strength model. In fact, the whole model is far more confident when predicting properties of ferritic steels, due to the nature of the training database. Perhaps, therefore, too much ought not be read into these discrepancies, although these trends were apparent for all high-carbon steels.
The addition of manganese to a steel increases its strength in (at least) three ways: firstly, by solid solution strengthening; secondly, it lowers the transformation temperature, which refines the grain size, which will strengthen the steel according to the Hall-Petch relationship; finally, it alters the eutectoid carbon content, increasing the pearlite content of the steel. Because of these multiple effects, it is not possible to test the absolute accuracy of our model with respect to the effect of manganese against the theoretical predictions of expressions (1), (2) and (3) as we did with silicon.
However, it is possible to predict qualitatively that increasing the Mn content will strengthen a steel. The model concurs with this prediction, as can be seen in figure 8. Whilst investigating annealed 0.15wt% C steels, Gladman and Pickering [4] reported that "increasing the Mn content from 0.5 to 1.5% caused the UTS to increase from 27.6 to 33.6 ton/in²" (426 to 519 MPa, an increase of 93 MPa). The numerical results of the network simulation were a UTS of 392 MPa at 0.5wt% Mn and 488 MPa at 1.5wt% Mn, an increase of 96 MPa. Although the numbers generated by simulation and experiment are not precisely equal (because the intrinsic strength of the steel due to other additions or processing is different), the increment upon Mn addition is very similar. This is a most satisfactory result.
Figure 8: Increasing the YS and UTS of a 0.15wt% C, 0.4wt% Si steel by adding Mn.
Tests were also conducted to see whether the model adequately describes the effect of alloying niobium and vanadium into a steel. The metallurgical theory behind these additions was discussed briefly in the 'Model development' section.
It is well known [5] that Nb is a more effective strengthening agent than V, which has a much higher solubility at the elevated temperatures characteristic of austenitisation. Figure 9 confirms that the model has 'learnt' this trend: it qualitatively predicts, for this 0.25wt% C steel, that adding niobium has the greater effect.
Microalloying additions strengthen the steel essentially by forming carbonitrides. In vanadium-alloyed steels the nitride is more stable than the carbide [8], and it is thought that the majority of the particles are VN. Clearly, then, the amount of nitrogen in the steel will affect the extent of the precipitation strengthening. Zajac et al. [8] found experimentally that the yield strength of a V-microalloyed steel increased by some 5 MPa for every 0.001wt% N added, essentially independently of the processing conditions. These findings were put to the model, and the results are shown in figure 10.
Figure 9: The relative effect of Nb and V additions on the strength of a 0.25wt% C steel.
The model correctly predicts that increasing the nitrogen content of the steel will cause a strengthening effect. Although the model has produced large error bars, indicating a good deal of uncertainty in its predictions, the numerical output coincides with the experimentally determined values mentioned above: the anticipated 50 MPa increase in yield strength for an extra 0.01wt% N in solution was accurately predicted.
Sparse data for microalloyed steels in the database has meant that large error bars were endemic in these sorts of predictions, particularly when trying to predict the tensile strengths. Nevertheless, the neural network has achieved a reasonable degree of accuracy.
Figure 10: The effect of varying N content in a V-microalloyed 0.25wt% C steel.
The age of the experimental data used to train the neural network has meant that, in most instances, the concentrations of the minor elements present in the steels were not recorded; the missing data were usually assigned some 'average' value. Because this had to be done for so many of the inputs for certain elements, it would be unreasonable to expect the neural network to fully capture the effect that they have on the strength of the steel.
It is well established [9] that ductile failure is initiated by the nucleation of voids at second phase particles. In steels, these particles are either carbides, silicates or sulphides. Increasing the concentration of S in a steel might therefore be expected to increase the number of such inclusions, leading to a greater likelihood of void nucleation and thus a lower UTS. The effect of S on the YS will be less marked.
These metallurgical arguments provide a physical justification for the trends seen in figure 11. Note the very large error bars, except around 0.03wt% S, the 'average' value given to the missing data.
Many other investigations have been carried out. For example, a small addition of P increases the strength, in keeping with the large solid solution strengthening effect associated with this element. Other tendencies indicated by the model are equally supported by theory.
Figure 11: The effect of S concentration on the YS and UTS of steel.
Figure 12: The effect of roll finishing temperature on yield strength.
The final parameters in the model are the processing variables. Again, there was a problem with sparse data here, particularly for the 'cooling rate' variable; most samples were air cooled, or assumed to cool at that rate. In testing the effect of these variables, we must settle for qualitative predictions. For example, tempering is known to reduce strength, but precise quantitative models of the tempering process against which we can test the neural network do not exist. Temper weakening is correctly predicted.
Figure 12 shows how the yield strength of rolled steel varies as a function of the temperature at which rolling finishes. The condition of the steel as it cools from 800 to 500 °C is recognised to be of importance: a decrease in the roll finishing temperature leads to smaller grains (i.e. a larger value of d^{-1/2}) and hence a stronger material. The model reflects this.
This report has described the development of neural network models describing the strength of steels and discussed the problems associated. Two committees were formed and tested to see whether their trends fitted with known metallurgical phenomena. Both models were satisfactory, although noise in the yield strength data meant that the tensile strength model performed considerably better.
The advantages of the Bayesian approach, which allows error bars to be calculated, were clear. Difficulties in selecting which committee is likely to produce the best physical model were also apparent. Further work, based on an expanded and more accurate database, would be likely to produce better models.
The author would like to express his gratitude to Thomas Sourmail for the considerable amount of time that he dedicated to explaining neural network theory and helping with software difficulties. Thanks are also due to Harry Bhadeshia, the project supervisor; his metallurgical knowledge was very helpful in suggesting suitable tests and interpreting the results.
[1] D. J. C. MacKay, 'Bayesian non-linear modelling with neural networks', University of Cambridge Programme for Industry (Modelling Phase Transformations in Steels), 1995. http://www.msm.cam.ac.uk/phase-trans/mphil/lit.html
[2] http://wol.ra.phy.cam.ac.uk/mackay/Software.html
[3] F. B. Pickering (vol. ed.), Materials Science and Technology: A Comprehensive Treatment, vol. 7, VCH, 1992.
[4] F. B. Pickering and T. Gladman, 'An investigation into some factors which control the strength of carbon steels', in Metallurgical Developments in Carbon Steels, Iron and Steel Institute Special Report No. 81, 1963.
[5] T. Gladman, The Physical Metallurgy of Microalloyed Steels, The Institute of Materials, 1997.
[6] S. B. Singh, H. K. D. H. Bhadeshia, D. J. C. MacKay, H. Carey and I. Martin, 'Neural network analysis of steel plate processing', Ironmaking and Steelmaking 25 (5), 1998, 355.
[7] T. Gladman, I. D. McIvor and F. B. Pickering, 'Structure-property relationships in high-C ferrite-pearlite steels', Journal of the Iron and Steel Institute 210, 1972, 916.
[8] S. Zajac, T. Siwecki, W. B. Hutchinson and R. Lagneborg, 'Strengthening mechanisms in vanadium microalloyed steels intended for long products', ISIJ International 38 (10), 1998, 1130.
[9] R. W. K. Honeycombe and H. K. D. H. Bhadeshia, Steels: Microstructure and Properties, Edward Arnold, 1995.
[10] H. K. D. H. Bhadeshia, 'Neural networks in materials science', ISIJ International, 1999.