Humans combine deliberate planning in novel contexts with fast habitual responses in familiar contexts.
This paper models that switching behavior in a unified active-inference framework where habits are represented
as interpretable symbolic rules.
The core training design is a biologically inspired wake-sleep cycle:
wake extracts candidate rules from real trajectories when they consistently reduce free energy;
sleep performs generative replay to consolidate, prune, and semantically anchor those rules.
Across sports trajectories, driving behavior, medical diagnosis, and Atari strategy, the approach improves
predictive accuracy and efficiency over logic-based, deep-learning, active-inference, model-based RL,
and LLM-based baselines, while yielding interpretable habit structures.
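
As a rough illustration of the wake-sleep cycle summarized above, the following Python sketch shows one way the two phases could fit together. It is a toy under stated assumptions, not the paper's implementation: the `Rule` type, the `wake` and `sleep` functions, the support and consistency thresholds, and the `toy_replay` stand-in for a generative model are all hypothetical names and values introduced here. Free energy is treated as a precomputed scalar per step, and semantic anchoring is omitted.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical symbolic habit: in `context`, do `action`."""
    context: str
    action: str


# One observed step: (context, action, free energy before, free energy after).
Step = Tuple[str, str, float, float]


def wake(steps: Iterable[Step],
         min_support: int = 3,
         min_consistency: float = 0.8) -> List[Rule]:
    """Propose rules whose (context, action) pair consistently lowered free energy.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    outcomes: Dict[Rule, List[bool]] = defaultdict(list)
    for context, action, fe_before, fe_after in steps:
        outcomes[Rule(context, action)].append(fe_after < fe_before)
    return [
        rule for rule, hits in outcomes.items()
        if len(hits) >= min_support and sum(hits) / len(hits) >= min_consistency
    ]


def sleep(rules: Iterable[Rule],
          replay: Callable[[Rule], float],
          n_replays: int = 20,
          min_mean_reduction: float = 0.0) -> List[Rule]:
    """Keep rules whose mean free-energy reduction under replay exceeds a threshold.

    `replay` stands in for sampling from a generative model; it returns the
    simulated free-energy reduction obtained by applying the rule once.
    """
    kept = []
    for rule in rules:
        reductions = [replay(rule) for _ in range(n_replays)]
        if sum(reductions) / n_replays > min_mean_reduction:
            kept.append(rule)
    return kept


# Toy driving-style trajectory data (fabricated purely for illustration).
steps = [
    ("red_light", "brake", 2.0, 0.5),
    ("red_light", "brake", 1.8, 0.4),
    ("red_light", "brake", 2.1, 0.6),
    ("green_light", "brake", 1.0, 1.5),
    ("green_light", "brake", 1.0, 0.9),
    ("green_light", "brake", 1.0, 1.4),
    ("fog", "slow_down", 1.5, 1.0),
    ("fog", "slow_down", 1.4, 1.1),
    ("fog", "slow_down", 1.6, 1.2),
]


def toy_replay(rule: Rule) -> float:
    """Deterministic stand-in for generative replay (an assumption of this sketch)."""
    return 0.5 if rule.context == "red_light" else -0.1


candidates = wake(steps)                # rules proposed from real trajectories
habits = sleep(candidates, toy_replay)  # rules that survive replay-based pruning
```

In this toy run, the wake phase proposes the `red_light` and `fog` rules and rejects the inconsistent `green_light` one; the sleep phase then prunes the `fog` rule because it does not reduce free energy under replay. Separating the two phases in this way keeps rule extraction tied to observed data while letting consolidation depend only on the generative model.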