Sage Journals: Discover world-class research

Abstract

The adaptive learning problem concerns how to create an individualized learning plan (also referred to as a learning policy) that chooses the most appropriate learning materials based on a learner’s latent traits. In this article, we study an important yet less-addressed adaptive learning problem: one that assumes continuous latent traits. Specifically, we formulate the adaptive learning problem as a Markov decision process. We assume the latent traits to be continuous with an unknown transition model and apply a model-free deep reinforcement learning algorithm, the deep Q-learning algorithm, which can effectively find the optimal learning policy from data on learners’ learning processes without knowing the actual transition model of the learners’ continuous latent traits. To use the available data efficiently, we also develop a transition model estimator that emulates the learner’s learning process using neural networks. The transition model estimator can be used within the deep Q-learning algorithm so that it discovers the optimal learning policy for a learner more efficiently. Numerical simulation studies verify that the proposed algorithm is very efficient in finding a good learning policy. In particular, with the aid of the transition model estimator, it can find the optimal learning policy after training on a small number of learners.
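To make the model-free idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single continuous latent trait in [0, 1], three hypothetical learning materials, and a made-up simulated learner (`step`) that the agent never inspects. For brevity it swaps the deep neural network for a linear approximator over quadratic features, so it illustrates the Q-learning update on a continuous state rather than deep Q-learning proper. All names and constants are illustrative assumptions.

```python
import math
import random

# Hypothetical toy setup: one continuous latent trait in [0, 1] and three
# learning materials, each most helpful near its own difficulty level.
DIFFICULTY = [0.2, 0.5, 0.8]
GAMMA = 0.9  # discount factor (assumed value)


def step(theta, action, rng):
    """Simulated learner (assumed dynamics); the agent only sees its outputs."""
    gain = 0.1 * math.exp(-((DIFFICULTY[action] - theta) ** 2) / 0.02)
    next_theta = min(1.0, max(0.0, theta + gain + rng.gauss(0.0, 0.01)))
    return next_theta, next_theta - theta  # reward = change in the trait


def features(theta):
    """Quadratic features of the continuous state."""
    return [1.0, theta, theta * theta]


def q_value(weights, theta, action):
    """Linear action-value estimate for one (state, action) pair."""
    return sum(w * f for w, f in zip(weights[action], features(theta)))


def td_target(reward, next_qs, gamma=GAMMA, done=False):
    """Q-learning target: r + gamma * max over next-state action values."""
    return reward if done else reward + gamma * max(next_qs)


def greedy_action(weights, theta):
    """Index of the material with the highest estimated value."""
    qs = [q_value(weights, theta, a) for a in range(len(weights))]
    return qs.index(max(qs))


def train(episodes=200, horizon=20, lr=0.05, eps=0.2, seed=0):
    """Semi-gradient Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    weights = [[0.0, 0.0, 0.0] for _ in DIFFICULTY]
    for _ in range(episodes):
        theta = rng.uniform(0.0, 0.3)  # each simulated learner starts low
        for t in range(horizon):
            if rng.random() < eps:
                action = rng.randrange(len(DIFFICULTY))
            else:
                action = greedy_action(weights, theta)
            next_theta, reward = step(theta, action, rng)
            next_qs = [q_value(weights, next_theta, a)
                       for a in range(len(DIFFICULTY))]
            target = td_target(reward, next_qs, done=(t == horizon - 1))
            error = target - q_value(weights, theta, action)
            # Move only the chosen action's weights toward the target.
            for i, f in enumerate(features(theta)):
                weights[action][i] += lr * error * f
            theta = next_theta
    return weights
```

Calling `weights = train()` and then `greedy_action(weights, 0.1)` returns the index of the material the learned policy would recommend for a learner whose trait is 0.1. A transition model estimator in the sense of the abstract would replace calls to `step` with a model fitted to observed transitions, which this sketch does not attempt.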

Keywords

adaptive learning system, transition model estimator, Markov decision process, deep reinforcement learning, deep Q-learning, neural networks, model free


