REINFORCE Policy
Cross-source consensus on the REINFORCE policy, drawn from 1 source and 4 claims.
Highlighted claims
- The policy used four parameters for per-gene scores and a fifth parameter to control the feature retention fraction.
- The learned retention fraction was bounded between 0.25 and 0.90 and initialized at 0.575.
- The REINFORCE component mainly adjusted the retention fraction rather than materially using per-gene penalty modulation.
- The model's gene-selection gradient assumes independent Bernoulli outcomes, which is an approximation because ElasticNet selections are correlated.

All claims from: StackFeat RL: Reinforcement Learning over Iterative Dual Criterion Feature Selection for Stable Biomarker Discovery.