ANN-based Shear Capacity of Steel Fiber-Reinforced Concrete Beams Without Stirrups

Experimental results on the shear capacity of steel fiber-reinforced concrete (SFRC) beams without mild steel stirrups still differ significantly from the values predicted by current design equations and other available formulations. In this paper we propose the use of artificial intelligence to estimate the shear capacity of these members. A database of 430 test results reported in the literature is used to develop an artificial neural network-based formula that predicts the shear capacity of SFRC beams without shear reinforcement. The proposed model yields maximum and mean relative errors of 0.0% for the 430 data points, which represents a better prediction (mean Vtest / VANN = 1.00 with a coefficient of variation of 1 × 10^-15) than the existing expressions, where the best model yields a mean value of Vtest / Vpred = 1.01 and a coefficient of variation of 27%.


Introduction
Since concrete is strong in compression but weak in tension, adding steel fibers to the material can compensate for its limited tensile strength, because the fibers keep crack widths small (Amin et al. 2016). In structural applications, steel fiber-reinforced concrete is combined with regular steel reinforcement.

Data Gathering
A database comprising 430 test results (outcomes of repeated tests were averaged) reported in the literature (see Lantsoght 2019b) was used to feed all ANN models. Tab. 1 shows the input and output variables/ranges considered in this study. Geometrical variables include the beam width (b), the effective depth (d), and the clear shear span to effective depth ratio (av/d), as depicted in Fig. 1. The longitudinal reinforcement is characterized by the reinforcement ratio ρ = As / (bd), where As is the rebar area, and by the steel yield strength (fy). The concrete mix is characterized by the maximum aggregate size (da) and the average concrete compressive strength (fc,cyl, taken from cylinders). Lastly, the fiber factor (F) described before and the tensile strength of the steel fibers (ftenf) were also used as inputs. The output is the sectional shear capacity (Vutot) as shown in Fig. (b), which includes the beam self-weight. In total, nine input variables and one output variable were adopted. The dataset considered is available in Developer (2019a). In most cases a fiber factor within 0.5-1 was adopted (higher values result in concrete mixes with low workability); for further details on fibers, see Lantsoght (2019a). Electronic copy available at: https://ssrn.com/abstract=3457585
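As a small worked illustration of two of the derived inputs named above, the sketch below computes the reinforcement ratio ρ = As / (bd) and the ratio av/d. The function names, the units (mm, mm²) and the beam dimensions are assumptions made for this example only; they are not taken from the database.

```python
def reinforcement_ratio(a_s: float, b: float, d: float) -> float:
    """Longitudinal reinforcement ratio rho = As / (b * d), dimensionless."""
    if b <= 0 or d <= 0:
        raise ValueError("beam width and effective depth must be positive")
    return a_s / (b * d)


def shear_span_ratio(a_v: float, d: float) -> float:
    """Clear shear span to effective depth ratio av / d, dimensionless."""
    if d <= 0:
        raise ValueError("effective depth must be positive")
    return a_v / d


# Hypothetical beam (illustrative values, not a database entry):
# As = 603 mm^2, b = 150 mm, d = 250 mm, av = 625 mm
rho = reinforcement_ratio(a_s=603.0, b=150.0, d=250.0)
av_over_d = shear_span_ratio(a_v=625.0, d=250.0)
print(f"rho = {rho:.5f}, av/d = {av_over_d:.2f}")
```

Any consistent set of length units gives the same dimensionless ratios.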

Introduction
Machine learning, one of the six disciplines of Artificial Intelligence (AI) without which the task of having machines acting humanly could not be accomplished, allows us to 'teach' computers how to perform tasks by providing examples of how they should be done (Hertzmann and Fleet 2012). The decision about which modelling technique to use in an arbitrary problem depends primarily on the availability of both the theory explaining the underlying phenomena and the data. When there is abundant data (also called examples or patterns) explaining a certain phenomenon, but its theory richness is poor, machine learning (e.g., Artificial Neural Networks) can be a perfect tool. An illustration of the several possible scenarios is presented in Basheer and Hajmeer (2000), as shown in Fig. 2 (shadowed areas represent regions where any of the contiguous tools might be used). [...] practical applications, virtually covering any field of knowledge (Wilamowski and Irwin 2011, Prieto et al. 2016). In its most general form, an ANN is a mathematical model designed to perform a particular task, based on the way the human brain processes information, i.e. with the help of its processing units (the neurons). ANNs have been employed to perform several types of real-world basic tasks. Concerning functional approximation, ANN-based solutions are frequently more accurate than those provided by traditional approaches, such as multi-variate nonlinear regression, besides not requiring a good knowledge of the function shape being modelled (Flood 2008).
The general ANN structure consists of several nodes arranged in L vertical layers (input layer, hidden layers, and output layer) and connected to each other, as depicted in Fig. 3.
Associated with each node in layers 2 to L, also called a neuron, is a linear or nonlinear transfer (also called activation) function, which receives the so-called net input and transmits an output. All ANNs implemented in this work are called feedforward, since data presented in the input layer flows in the forward direction only, i.e. every node only connects to nodes belonging to layers located at the right-hand side of its layer, as shown in Fig. 3. The computing power of ANNs makes them suitable to efficiently solve small to large-scale complex problems.
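The feedforward mechanism described above can be sketched in a few lines. This is a minimal illustration, not the software used in this work: it assumes the net input of a neuron is the weighted sum of the previous layer's outputs plus a bias, and the tiny 2-2-1 network with its weights, biases and transfer functions is entirely hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

# One layer = (weights, biases, transfer function); weights[k] holds the
# synaptic weights feeding neuron k of that layer.
Layer = Tuple[List[List[float]], List[float], Callable[[float], float]]


def layer_output(inputs: Sequence[float], layer: Layer) -> List[float]:
    """Compute each neuron's net input and pass it through the transfer function."""
    weights, biases, transfer = layer
    outputs = []
    for w_row, bias in zip(weights, biases):
        net_input = sum(w * x for w, x in zip(w_row, inputs)) + bias
        outputs.append(transfer(net_input))
    return outputs


def feedforward(x: Sequence[float], layers: Sequence[Layer]) -> List[float]:
    """Propagate an input pattern forward, layer by layer (no feedback links)."""
    y = list(x)
    for layer in layers:
        y = layer_output(y, layer)
    return y


def identity(net_input: float) -> float:
    return net_input


# Hypothetical network: 2 inputs, 2 hidden neurons (tanh), 1 output (identity).
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0], math.tanh),
    ([[2.0, -1.0]], [0.5], identity),
]
y = feedforward([1.0, 1.0], layers)
print(y)
```

Each layer only reads the outputs of the layer before it, which is what makes the network feedforward.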

Learning
Each connection between two nodes is associated with a synaptic weight (real value), which, together with each neuron's bias (also a real value), are the most common types of neural net unknown parameters that will be determined through learning. Learning is nothing else than determining the network's unknown parameters through some algorithm in order to minimize the network's performance measure, typically a function of the difference between predicted and target (desired) outputs. When ANN learning has an iterative nature, it consists of three phases: [...] just of the inputs themselves. Learning is called supervised (e.g., functional approximation, classification) or unsupervised (e.g., clustering), depending on whether the data used is labelled or unlabelled, respectively. During iterative learning, while the training dataset is used to tune the network unknowns, a process of cross-validation takes place by using a set of data completely distinct from the training counterpart (the validation dataset), so that the generalization performance of the network can be attested. Once 'optimum' network parameters are determined, typically associated with a minimum of the validation performance curve (called early stop; see Fig. 3 in [...]), many authors still perform a final assessment of the model's accuracy by presenting to it a third, fully distinct dataset called 'testing'. Heuristics suggest that early stopping avoids overfitting, i.e. the loss of the ANN's generalization ability. One of the causes of overfitting might be learning too many input-target examples suffering from data noise, since the network might learn some of its features, which do not belong to the underlying function being modelled (Haykin 2009).
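The early-stop idea can be illustrated with a short sketch that picks the epoch at which the validation performance curve reaches its minimum. The 'patience' rule used to decide when to stop looking is a common convention assumed here for illustration; it is not described in this paper, and the error values are made up.

```python
from typing import Sequence


def early_stop_epoch(validation_errors: Sequence[float], patience: int = 3) -> int:
    """Return the epoch (0-based) whose network parameters would be kept.

    Training is assumed to stop once the validation error has not improved
    for `patience` consecutive epochs (an assumed stopping rule).
    """
    if not validation_errors:
        raise ValueError("at least one validation error is required")
    best_epoch = 0
    best_error = validation_errors[0]
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_error, best_epoch = error, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch


# Hypothetical validation curve: it decreases, then rises as overfitting sets in.
curve = [0.90, 0.60, 0.45, 0.40, 0.42, 0.47, 0.55, 0.30]
print(early_stop_epoch(curve))               # stops before reaching the last epoch
print(early_stop_epoch(curve, patience=10))  # patient enough to scan the whole curve
```

With a short patience the search stops at the first clear minimum and never sees later epochs, which is the usual trade-off between training time and the chance of finding a lower validation error.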

The Universal Approximation Theorem
For a nonlinear input-output mapping, this theorem states (Haykin 2009) that a single hidden layer multi-layer perceptron network (MLPN), with (i) any bounded, monotone-increasing and continuous activation function for the hidden neurons, and (ii) an identity transfer function for the output neurons, is sufficient to compute an arbitrarily good approximation of any continuous function in a general n-dimensional space, i.e. the absolute difference between any estimated and target outputs can be less than any ε > 0, for all input space values. However, the theorem does not say that the aforementioned network features are optimal in the sense of learning time or generalization.
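In symbols, the statement can be written as below. The notation is chosen here for illustration and is not taken from this paper: f is the continuous target function, φ the hidden-layer activation function, m the number of hidden neurons, and α_i, w_ij and b_i the output weights, hidden weights and hidden biases, respectively.

```latex
F(x_1, \dots, x_n) = \sum_{i=1}^{m} \alpha_i \,
    \varphi\!\left( \sum_{j=1}^{n} w_{ij} x_j + b_i \right),
\qquad
\left| F(x_1, \dots, x_n) - f(x_1, \dots, x_n) \right| < \varepsilon
\quad \text{for all } (x_1, \dots, x_n) \text{ in the input space.}
```

The theorem guarantees that such m, α_i, w_ij and b_i exist for any ε > 0, but gives no procedure for finding them.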

Implemented ANN features
The 'behavior' of any ANN depends on many 'features'; 15 ANN features were implemented in this work (including data pre/post processing ones [...]). With respect to the ANN formulation used in Abambres and Lantsoght (2018), a few changes were carried out for this work: (i) the elimination of performance improvements (feature 14), although that feature is still integrated in the code for possible future use, and (ii) the algorithm used in feature 4. The latter is described next. It might happen that the actual distribution pt-pv-ptt to be used in the simulation is not equal to the one imposed a priori (before step 1).

Network Performance Assessment
Several types of results were computed to assess network outputs, namely (i) maximum error, (ii) % errors greater than 3%, and (iii) performance, which are defined next. All of them are based on the relative error (in %) of a single output variable and data pattern,

$$ e_{qp} = 100 \, \frac{\left| d_{qp} - y_{qLp} \right|}{\left| d_{qp} \right|} \qquad (1) $$

where (i) dqp is the q-th desired (or target) output when pattern p within iteration i (p = 1,…, Pi) is presented to the network, and (ii) yqLp is the net's q-th output for the same data pattern. Moreover, the denominator in eq. (1) is replaced by 1 whenever |dqp| < 0.05, while dqp in the numerator keeps its real value. This exception to eq. (1) aims to reduce the apparent negative effect of large relative errors associated with target values close to zero. Even so, this trick may still lead to (relatively) large solution errors while groundbreaking results are depicted as regression plots (target vs. predicted outputs).

Maximum Error
This variable measures the maximum relative error, as defined by eq. (1), among all output variables and learning patterns.

Percentage of Errors > 3%
This variable measures the percentage of relative errors, as defined by eq. (1), among all output variables and learning patterns, that are greater than 3%.

Performance
In functional approximation problems, network performance is defined as the average relative error, as defined in eq. (1), among all output variables and data patterns being evaluated (e.g., training, all data).
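The three measures defined above can be computed as in the following sketch. It assumes that eq. (1) is the percentage relative error |d - y| / |d|, with the denominator replaced by 1 whenever |d| < 0.05; the function names and the sample target/output values are hypothetical.

```python
from typing import List, Sequence, Tuple


def relative_errors(targets: Sequence[float], outputs: Sequence[float],
                    small_target: float = 0.05) -> List[float]:
    """Relative error in % for each target/output pair (assumed form of eq. (1))."""
    if len(targets) != len(outputs) or not targets:
        raise ValueError("targets and outputs must be non-empty and of equal length")
    errors = []
    for d, y in zip(targets, outputs):
        # Near-zero targets: denominator replaced by 1 to avoid inflated errors.
        denominator = 1.0 if abs(d) < small_target else abs(d)
        errors.append(100.0 * abs(d - y) / denominator)
    return errors


def assess(targets: Sequence[float], outputs: Sequence[float],
           threshold: float = 3.0) -> Tuple[float, float, float]:
    """Return (maximum error, % of errors above threshold, performance)."""
    errors = relative_errors(targets, outputs)
    max_error = max(errors)
    pct_above = 100.0 * sum(e > threshold for e in errors) / len(errors)
    performance = sum(errors) / len(errors)  # average relative error
    return max_error, pct_above, performance


# Hypothetical targets and network outputs (single output variable).
targets = [100.0, 200.0, 0.01, 50.0]
outputs = [101.0, 190.0, 0.02, 50.0]
max_error, pct_above, performance = assess(targets, outputs)
print(max_error, pct_above, performance)
```

For several output variables, the same functions can be applied to the flattened list of all target/output pairs.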

Software Validation
Several benchmark datasets/functions were used to validate the developed software, involving low- to high-dimensional problems and small to large volumes of data. Due to the paper length limit, validation results are not presented herein, but they were made public by Researcher (2018).
Moreover, several papers involving the successful application of this software have already been published and can be downloaded here.

Parametric Analysis Results
Aiming to reduce the computing time by cutting the number of combos to be run (note that all features combined lead to hundreds of millions of combos), the whole parametric simulation was divided into nine parametric sub-analyses (SAs), where in each one feature 7 only takes a single value. This measure aims to make the performance ranking of all combos within each 'small' analysis more [...] Intel® Core™ i7 8700K @ 3.70-4.70 GHz.
Tab. 5. ANN feature (F) methods used in the best combo from each parametric sub-analysis (SA).

Proposed ANN-Based Model
The proposed model is the one, among the best ones from all parametric SAs, exhibiting the lowest maximum error (SA 9). That model is characterized by the ANN feature methods {1, 2, 6, 4, 5, 7, 5, 1, 3, 3, 1, 5, 3, 1, 3} in Tabs [...]. It is worth recalling that, in this manuscript, whenever a vector is added to a matrix, the former is added to every column of the latter (valid in MATLAB).
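For readers implementing the model outside MATLAB, the vector-plus-matrix convention recalled above can be made explicit as in the sketch below. The function name and the numbers are hypothetical; the only assumption is that the vector has one entry per matrix row, so that it can be added to each column.

```python
from typing import List, Sequence


def add_vector_to_columns(matrix: Sequence[Sequence[float]],
                          vector: Sequence[float]) -> List[List[float]]:
    """Add a column vector to every column of a matrix (given as a list of rows)."""
    if len(matrix) != len(vector):
        raise ValueError("vector length must equal the number of matrix rows")
    # Entry i of the vector is added to every element of row i,
    # which is the same as adding the whole vector to each column.
    return [[m_ij + v_i for m_ij in row] for row, v_i in zip(matrix, vector)]


# Hypothetical 2 x 3 matrix (e.g. 2 neurons, 3 data patterns) and a 2 x 1 vector.
matrix = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
vector = [10.0, 20.0]
result = add_vector_to_columns(matrix, vector)
print(result)
```

In array libraries the same effect is usually obtained through broadcasting, provided the vector is shaped as a column.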

Input Data Preprocessing
For future use of the proposed ANN to simulate new data Y1,sim (9 x Psim matrix), concerning Psim patterns, the same data preprocessing (if any) performed before training must be applied to the input dataset. That preprocessing is defined by the methods used for ANN features 2, 3 and 5 (respectively 2, 6 and 5; see Tab. 2), which should be applied after all (possible) qualitative variables in the input dataset are converted to numerical (using feature 1's method). Next, the necessary preprocessing to be applied to Y1,sim, concerning features 2, 3 and 5, is fully described.

Dimensional Analysis and Dimensionality Reduction
Since neither dimensional analysis (d.a.) nor dimensionality reduction [...], where one recalls that operator './' divides row i in the numerator by INP(i, 2).

ANN-Based Analytical Model
Once the preprocessed input dataset {Y1,sim}n after (9 x Psim matrix) has been determined, the next step is to present it to the proposed ANN to obtain the predicted output dataset {Y5,sim}n after (1 x Psim vector), which will be given in the same preprocessed format as the target dataset used in learning. In order to convert the predicted outputs to their 'original format' (i.e., without any transformation due to normalization or dimensional analysis; the only transformation visible will be the (possible) qualitative variables written in their numeric representation), some postprocessing is needed, as described in detail in 3.7.3. Next, the mathematical representation of the proposed ANN is given, so that any user can implement it to determine {Y5,sim}n after, thus countering the claim that ANNs are 'black boxes'.

Conclusions
This paper shows how artificial neural networks (ANN) can be used to predict the shear capacity of steel fiber-reinforced concrete (SFRC) beams without stirrups. For this purpose, a database of 430 test results gathered from the literature was adopted. Nine input variables were taken to describe the problem, whereas the maximum sectional shear force at collapse (including beam self-weight) was the selected target variable. After an extensive ANN-based parametric analysis, the resulting 'optimal' model yielded maximum and mean relative errors of 0.0% for all the 430 data points, which outperforms (for those 430 instances) the currently available formulas and code provisions.
One limitation of this study is that the proposed model can only be used within the variable ranges of the dataset. While it covers the practical ranges of all material properties, it does not cover large-sized beams. As such, we recommend that further tests be performed to cover the missing realistic scenarios, so that more robust and versatile data-driven analytical models (based on larger and richer datasets) can be developed. This study has not yet allowed a full description of the mechanics underlying the shear behavior of SFRC members without stirrups, but parametric studies by means of accurate and robust ANN-based models will facilitate the evaluation and improvement of existing and future mechanistic models.