# Let's imagine I have a function f(x) whose evaluation cost (monetary or in time) is not constant: for instance, evaluating f(x) costs c(x)∈R for a known function c. I am not interested in minimizing it in few iterations, but rather in minimizing it using "cheap" evaluations. (E.g. maybe a few cheap evaluations are more "informative" to locate the minima, rather than using closer-to-the-minima and costly evaluations.) My questions are: Has this been studied or formalized? If so, can you point me to some references or the name under which this kind of optimization is known? Just to clarify, I am not interested in either of these: Standard optimization, since its "convergence rate" is linked to the total number of function evaluations. Constrained optimization, since my constraint on minimiz

Let's imagine I have a function $f$(x) whose evaluation cost (monetary or in time) is not constant: for instance, evaluating f(x) costs $c\left(x\right)\in \mathbb{R}$ for a known function c. I am not interested in minimizing it in few iterations, but rather in minimizing it using "cheap" evaluations. (E.g. maybe a few cheap evaluations are more "informative" to locate the minima, rather than using closer-to-the-minima and costly evaluations.)
My questions are: Has this been studied or formalized? If so, can you point me to some references or the name under which this kind of optimization is known?
Just to clarify, I am not interested in either of these:
Standard optimization, since its "convergence rate" is linked to the total number of function evaluations.
Constrained optimization, since my constraint on minimizing $f$(x) is not defined in terms of the subspace allowed to x, but in terms of the added costs of all the evaluations during the minimization of $f$.
Goal optimization, since using few resources to minimize $f$ can not be expressed (I think) as a goal.
Edit
Surrogate-based optimization has been proposed as a solution. While it faces the fact that the function is expensive to compute, it does not face (straightforwardly) the fact that different inputs imply different evaluations costs. I absolutely think that surrogate-based optimization could be used to solve the problem. In this line, is there a study of how to select points xi to estimate the "optimization gain"? In terms of surrogate-based optimization: what surrrogate model and what kind of Design of Experiment (DoE) could let me estimate the expected improvement?
Also, some remarks to contextualize. Originally, I was thinking on tuning the hyperparameters of machine learning algorithms. E.g., training with lots of data and lots of epochs is costly, but different architectures may be ranked based on cheaper trainings and then fine-tuned. Therefore:
Let us assume that c(x) can be estimated at zero cost.
There is some structure in c(x): large regions have higher cost, large regions have cheaper cost, the cost is smooth, the minima may be in either region.
Let us assume that $f$(x) is smooth. Also, if possible, let us assume it has some stochastic error that we can model (such as Gaussian error with zero mean and known variance).
You can still ask an expert for help

• Questions are typically answered in as fast as 30 minutes

Solve your problem for the price of one coffee

• Math expert for every subject
• Pay only if we can solve it

Blaine Day
Yes, this has been studied before.
The replacement of a (computationally) heavy cost function with a cheaper, but less accurate version, is known as surrogate cost/fitness function. You might look for Surrogate-Based Modeling and Optimization techniques (in this reference are quite some available).
###### Not exactly what you’re looking for?
Skye Vazquez
You could use a penalty function to add on the cost of evaluation onto the function itself. This would help an optimizer choose between a costly and cheap region when their objective values are roughly equal. Rather than increasing the penalty coefficient each iteration, as on Wikipedia, to force the optimizer to converge to low-penalty regions, we can instead decrease the penalty coefficient each iteration so that the optimizer may gradually explore further into high-penalty regions if it deems it necessary.
Formally, we minimize the objective function given by:
$\underset{x}{min}f\left(x\right)+{\sigma }_{k}c\left(x\right)$
where ${\sigma }_{k}\to 0$