Talk:Hyperparameter optimization
This is the talk page for discussing improvements to the Hyperparameter optimization article. This is not a forum for general discussion of the article's subject.
This article is rated Start-class on Wikipedia's content assessment scale.
Add Nelder-Mead?
A standard gradient-free method for iterative improvement is the Nelder-Mead method; it works on continuous multidimensional spaces. It should perhaps be mentioned here, along with the general area of black-box (= zeroth-order = gradient-free) optimization. Eclecticos (talk) 16:46, 23 September 2023 (UTC)
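For reference, a minimal sketch of what applying Nelder-Mead to hyperparameter tuning could look like, assuming SciPy is available; validation_error is a hypothetical placeholder for an actual train-and-evaluate routine, with a smooth quadratic standing in so the snippet runs on its own:

import numpy as np
from scipy.optimize import minimize

def validation_error(params):
    # Hypothetical objective: train a model with these (log-scale)
    # hyperparameters and return its validation loss. A quadratic
    # stand-in replaces the real training routine here.
    log_lr, log_reg = params
    return (log_lr + 3.0) ** 2 + (log_reg + 1.0) ** 2

x0 = np.array([-1.0, 0.0])  # initial guess in log space
result = minimize(validation_error, x0, method="Nelder-Mead")
print(result.x, result.fun)  # best hyperparameters found and their loss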
About DeepSwarm
At least for deep neural networks, I realize that this article is now partially conceptually obsolete, in that some modern tools can optimize both the architecture and the hyperparameters simultaneously, with the caveat that this combined optimization doesn't apply to transfer learning. That said, to the extent that we maintain a separation between the hyperparameter optimization and neural architecture search articles, the preferred location for DeepSwarm would definitely be the latter. I will try to add some prominent software for this to that article, of course including DeepSwarm. Meanwhile, I need an academic reference for DeepSwarm, preferably one listed in its readme. --Acyclic (talk) 23:09, 8 May 2019 (UTC)
Random search
The section about random search says: "Main article: Random search". Is this link actually correct? The linked article talks about Rastrigin's algorithm, as if that were the established meaning of the term "random search". (Maybe it is; I don't know.) But the statement on the current page is that "[Random Search] replaces the exhaustive enumeration of all combinations by selecting them randomly", which I think contradicts the algorithm in the linked article. Which one is it? --Matumio (talk) 20:45, 3 September 2019 (UTC)
- After reading some of the references, I think the link is just plain wrong, so I removed it. They are calling it "randomized search" in sklearn; maybe that would be the better term and would avoid the above confusion? --Matumio (talk) 21:10, 3 September 2019 (UTC)
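For reference, a minimal sketch of what the sklearn documentation means by "randomized search", i.e. sampling hyperparameter combinations at random rather than enumerating a full grid. This assumes scikit-learn and SciPy are installed; the SVC estimator and the parameter ranges are purely illustrative choices:

from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Sample 20 hyperparameter combinations at random instead of
# exhaustively enumerating every combination in a grid.
param_distributions = {
    "C": loguniform(1e-3, 1e3),
    "gamma": loguniform(1e-4, 1e1),
}
search = RandomizedSearchCV(SVC(), param_distributions, n_iter=20,
                            cv=5, random_state=0)
search.fit(X, y)
print(search.best_params_, search.best_score_)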