The question is how I can utilize this knowledge of the upper bound on values to improve the regression result?

My intuition is to run a "normal" linear regression on all coordinates in $S$ giving $g(x)$ and then construct ${g}^{\prime}(x)=g(x)+c$, with $c$ being the lowest number such that $\mathrm{\forall}x,y((x,y)\in S\phantom{\rule{thickmathspace}{0ex}}\u27f9\phantom{\rule{thickmathspace}{0ex}}y\le {g}^{\prime}(x))$, e.g. such that ${g}^{\prime}(x)$ lies as high as it can whilst still touching at least point in $S$. I do, however, have absolutely no idea if this is the best way to do it, nor how to devise an algorithm that does this efficiently.