While shell-based RC systems do offer flexibility, they also have downsides, including copy-and-paste leading to subtly different behaviour across units. Dependency resolution was also a bit of a hack layered on top of the scripts to deal with concepts like runlevels.
The declarative approach of a proper configuration format is a better and more scalable solution.
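To make the dependency point concrete, here is a rough sketch of what declarative ordering buys you. The unit names and the `UNITS` table are made up for illustration, and this is not how any particular init system implements it. It only shows that once each unit states what it must start after, the start order can be computed rather than hand-maintained in scripts.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical unit declarations: each unit lists only what it must start after.
UNITS = {
    "network": [],
    "syslog": [],
    "database": ["network", "syslog"],
    "webapp": ["database", "network"],
}

def start_order(units):
    """Return a start order in which every unit follows its dependencies."""
    # TopologicalSorter takes a mapping of node -> predecessors and
    # raises graphlib.CycleError if the declarations are circular.
    return list(TopologicalSorter(units).static_order())

order = start_order(UNITS)
print(order)
```

A circular declaration fails loudly with a `CycleError` instead of producing a boot sequence that is wrong in some subtle way.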
Yes, training is the most expensive part, but it's still an additional trillion or so floating-point operations per generated token of output. That's not nothing, computationally.
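For a sense of where a number like that comes from, here is a back-of-the-envelope sketch. It assumes the common rule of thumb that a dense transformer's forward pass costs roughly 2 floating-point operations per parameter per token, and it ignores the attention cost over the context. The 500-billion-parameter figure is purely hypothetical, chosen because it lands on a trillion; it is not a claim about any specific model.

```python
def forward_flops_per_token(n_params):
    """Approximate forward-pass FLOPs per generated token for a dense model."""
    # Rule of thumb (an assumption): one multiply and one add per parameter.
    return 2 * n_params

HYPOTHETICAL_PARAMS = 500e9  # illustrative model size, not a real figure

flops_per_token = forward_flops_per_token(HYPOTHETICAL_PARAMS)
flops_per_reply = flops_per_token * 1000  # a reply of 1,000 tokens

print(f"{flops_per_token:.1e} FLOPs per token")
print(f"{flops_per_reply:.1e} FLOPs per 1,000-token reply")
```

Under those assumptions a single long reply is already on the order of 10^15 operations, before multiplying by the number of requests served.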