We propose an entirely data-driven approach to model inference based on a weighted Shannon entropy. We apply our method to recover the network interactions of stochastic systems from observed time series. Our approach performs significantly better than published methods in the difficult limit where the observed data comprise only a few time steps and the distribution of network couplings has large variance, while matching their performance in the limit of many observed time steps and small variance. Our parallel algorithm scales to the inference of large systems such as brain networks. By maximizing the weighted Shannon entropy, we can accurately recover coupling strengths in non-equilibrium systems from little observed data. Furthermore, our algorithm can be used to infer higher-order interactions systematically. A further advantage of our algorithm is that it has no tunable learning parameter.