Solving the 2-level atom non-LTE problem using soft actor-critic reinforcement learning

Panos, Brandon; Milić, Ivan

Please use this identifier to cite or link to this item: https://research.matf.bg.ac.rs/handle/123456789/3184

DC Field	Value	Language
dc.contributor.author	Panos, Brandon	en_US
dc.contributor.author	Milić, Ivan	en_US
dc.date.accessioned	2026-02-25T08:09:43Z	-
dc.date.available	2026-02-25T08:09:43Z	-
dc.date.issued	2026-01-01	-
dc.identifier.uri	https://research.matf.bg.ac.rs/handle/123456789/3184	-
dc.description.abstract	We present a novel reinforcement learning (RL) approach for solving the classical 2-level atom non-LTE radiative transfer problem by framing it as a control task in which an RL agent learns a depth-dependent source function $S(\tau)$ that self-consistently satisfies the equation of statistical equilibrium (SE). The agent’s policy is optimized entirely via reward-based interactions with a radiative transfer engine, without explicit knowledge of the ground truth. This method bypasses the need for constructing approximate lambda operators ($\Lambda^{*}$) common in accelerated iterative schemes. Additionally, it requires no extensive precomputed labelled data sets to extract a supervisory signal, and avoids backpropagating gradients through the complex RT solver itself. Finally, we show through experiment that a simple feedforward neural network trained greedily cannot solve for SE, possibly due to the moving target nature of the problem. Our $\Lambda^{*}$-free method offers potential advantages for complex scenarios (e.g. atmospheres with enhanced velocity fields, multidimensional geometries, or complex microphysics) where $\Lambda^{*}$ construction or solver differentiability is challenging. Additionally, the agent can be incentivized to find more efficient policies by manipulating the discount factor, leading to a reprioritization of immediate rewards. If demonstrated to generalize past its training data, this RL framework could serve as an alternative or accelerated formalism to achieve SE. To the best of our knowledge, this study represents the first application of reinforcement learning in solar physics that directly solves for a fundamental physical constraint.	en_US
dc.language.iso	en	en_US
dc.publisher	Oxford University Press	en_US
dc.relation.ispartof	RAS Techniques and Instruments	en_US
dc.subject	algorithms	en_US
dc.subject	numerical methods	en_US
dc.subject	radiative transfer - machine learning - reinforcement learning	en_US
dc.subject	simulations	en_US
dc.title	Solving the 2-level atom non-LTE problem using soft actor-critic reinforcement learning	en_US
dc.type	Article	en_US
dc.identifier.doi	10.1093/rasti/rzag005	-
dc.identifier.scopus	2-s2.0-105029806197	-
dc.identifier.isi	001681352600001	-
dc.identifier.url	https://api.elsevier.com/content/abstract/scopus_id/105029806197	-
dc.contributor.affiliation	Astronomy	en_US
dc.relation.issn	2752-8200	en_US
dc.description.rank	M20/M50	en_US
dc.relation.firstpage	Article no. rzag005	en_US
dc.relation.volume	5	en_US
item.openairecristype	http://purl.org/coar/resource_type/c_18cf	-
item.cerifentitytype	Publications	-
item.grantfulltext	none	-
item.languageiso639-1	en	-
item.openairetype	Article	-
item.fulltext	No Fulltext	-
crisitem.author.dept	Astronomy	-
crisitem.author.orcid	0000-0002-0189-5550	-
Appears in Collections:	Research outputs
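
For reference, the statistical equilibrium constraint mentioned in the abstract is usually written, for the textbook two-level atom, as the relation below. This is a sketch of the standard form, not a formula taken from this record: the photon destruction probability $\epsilon$, the Planck function $B$, the frequency- and angle-averaged mean intensity $\bar{J}$, and the residual $R$ are notation assumed here for illustration.

$$
\begin{aligned}
\bar{J}(\tau) &= \Lambda[S](\tau), \\
S(\tau) &= (1-\epsilon)\,\bar{J}(\tau) + \epsilon\,B(\tau), \\
R(\tau) &= S(\tau) - (1-\epsilon)\,\Lambda[S](\tau) - \epsilon\,B(\tau).
\end{aligned}
$$

A source function satisfies SE when $R(\tau) = 0$ at every depth. Evaluating $\Lambda[S]$ requires only a formal solution of the transfer equation, which is consistent with the abstract's statement that no approximate operator $\Lambda^{*}$ is constructed. How the reward is actually defined is not stated in this record.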
