Reinforcement learning for adaptive threshold control

Restorative brain-computer interfaces (BCIs) provide feedback of neuronal states to normalize pathological brain activity and achieve behavioral gains. Adaptive algorithms have proven powerful for assistive BCIs, but their inherent class switching clashes with the operant conditioning goal of restorative BCIs. Because of the treatment rationale of restorative BCIs, the classifier should be restricted to a constrained feature space, which limits the possibilities for classifier adaptation. I argued in an earlier post that regularizing for statistical reasons versus constraining features for clinical reasons is one of the major differences between restorative and assistive BCIs.

In this context, the possibilities for adapting the classifier are very limited. In fact, for a linear classifier, only the threshold can be changed in an attempt at optimization. Yet this would mean deviating from the threshold that yields maximum classification accuracy, something considered suboptimal for assistive BCIs. But could such adaptive threshold control nonetheless be beneficial, at least theoretically, for neurofeedback learning? To find out, we applied a Bayesian model of neurofeedback and reinforcement learning to different thresholds across a number of feedback iterations. After each feedback iteration, we determined the threshold that had resulted in minimal action entropy and maximal instructional efficiency. We found that without adaptation, i.e. under a fixed threshold regime, reinforcement learning is most efficient when performed at the threshold resulting in maximum classification accuracy.
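To make the role of the threshold concrete, here is a minimal sketch of how moving the threshold of a one-dimensional linear classifier trades hits against false alarms. It is not the Bayesian model from the paper. It assumes two equally likely classes ("rest" and "target") whose classifier outputs are Gaussian with equal variance; the function names and the means of 0 and 1 are my own illustrative choices.

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def rates(threshold, mean_rest=0.0, mean_target=1.0, sd=1.0):
    """Return (true positive rate, false positive rate, accuracy).

    Feedback is given whenever the classifier output exceeds the threshold.
    Assumes equal class priors and equal variance (illustrative only).
    """
    tpr = 1.0 - normal_cdf(threshold, mean_target, sd)
    fpr = 1.0 - normal_cdf(threshold, mean_rest, sd)
    accuracy = 0.5 * tpr + 0.5 * (1.0 - fpr)
    return tpr, fpr, accuracy


# Sweep thresholds from -2.00 to 3.00 in steps of 0.01
thresholds = [i / 100.0 for i in range(-200, 301)]
best_threshold = max(thresholds, key=lambda t: rates(t)[2])

print("threshold with maximum accuracy:", best_threshold)
print("accuracy at that threshold:", round(rates(best_threshold)[2], 4))
```

Under these assumptions the accuracy peaks midway between the two class means. Any other threshold lowers accuracy but changes how often feedback is given, and that is the only degree of freedom left to adapt.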

So far so good. But then we ran the simulation again, this time using the information from the earlier simulation about which threshold had the best instructional efficiency at which iteration to adapt the threshold online. Interestingly, we found that threshold adaptation does improve reinforcement learning in the long run, even when compared to the optimal learning trajectory achieved with a threshold fixed at maximum classification accuracy. This was particularly the case in simulations with low classification accuracy, as would be expected for BCI illiteracy or for the highly constrained feature space of restorative BCIs.
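The two-pass idea can be sketched as a simple lookup: the first pass records a score for every fixed threshold at every iteration, and the second pass follows whichever threshold scored best at each iteration. The sketch below only shows how such a schedule could be derived. The function name `build_schedule` is hypothetical, and the scores are arbitrary placeholders, not values from the paper.

```python
def build_schedule(efficiency_by_threshold):
    """Pick, for each iteration, the threshold with the highest score.

    efficiency_by_threshold maps a threshold to a list of per-iteration
    scores recorded in a first pass with that threshold held fixed.
    """
    n_iterations = len(next(iter(efficiency_by_threshold.values())))
    schedule = []
    for i in range(n_iterations):
        best = max(
            efficiency_by_threshold,
            key=lambda t: efficiency_by_threshold[t][i],
        )
        schedule.append(best)
    return schedule


# Arbitrary placeholder scores for three thresholds over three iterations.
# These are NOT results from the paper; they only exercise the lookup.
toy_scores = {
    0.3: [0.2, 0.4, 0.7],
    0.5: [0.6, 0.5, 0.4],
    0.7: [0.1, 0.3, 0.2],
}

schedule = build_schedule(toy_scores)
print(schedule)  # threshold to use at iteration 0, 1, 2
```

In a second simulation run, the threshold at iteration i would then be taken from `schedule[i]` instead of being held fixed.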

In short: adapting the threshold according to markers of instructional efficacy, instead of aiming for maximum classification accuracy, has the potential to improve reinforcement learning in neurofeedback and restorative BCIs.

You can find the full paper in Frontiers in Neuroscience: Bauer, Robert, and Alireza Gharabaghi. “Reinforcement Learning for Adaptive Threshold Control of Restorative Brain-Computer Interfaces: A Bayesian Simulation.” Front. Neurosci., 2015. doi:10.3389/fnins.2015.00036. It's open access!