Zzong's Notes

June 14, 2026 · 1 min read

Non-stationary

The issue that the expected reward of an action changes over time.
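A minimal sketch of why this matters, assuming a single action whose expected reward jumps from 1.0 to -1.0 halfway through a run (the values, the noise level, and all function names here are hypothetical, chosen only for illustration). It compares a plain sample average, which weights every past reward equally, with a constant step-size update, a common way to track a non-stationary reward.

```python
import random


def sample_average_update(estimate, n, reward):
    """Incremental sample average: each of the n rewards gets weight 1/n."""
    return estimate + (reward - estimate) / n


def constant_step_update(estimate, reward, alpha=0.1):
    """Recency-weighted average: older rewards decay by (1 - alpha) per step."""
    return estimate + alpha * (reward - estimate)


def simulate(steps=2000, change_at=1000, seed=0):
    """Return (final true mean, sample-average estimate, constant-step estimate)."""
    rng = random.Random(seed)
    avg_est, const_est = 0.0, 0.0
    true_mean = 1.0
    for t in range(1, steps + 1):
        # Assumed drift: the expected reward flips sign after `change_at` steps.
        true_mean = 1.0 if t <= change_at else -1.0
        reward = rng.gauss(true_mean, 0.1)
        avg_est = sample_average_update(avg_est, t, reward)
        const_est = constant_step_update(const_est, reward)
    return true_mean, avg_est, const_est


if __name__ == "__main__":
    true_mean, avg_est, const_est = simulate()
    print(f"true mean now:      {true_mean:+.2f}")
    print(f"sample average:     {avg_est:+.2f}")
    print(f"constant step size: {const_est:+.2f}")
```

Under these assumptions the sample average ends up near the midpoint of the old and new means, while the constant step-size estimate stays close to the current mean. A larger `alpha` reacts faster to change but gives a noisier estimate.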

B) Related

C) References

Linked mentions (2)

Burst-induced Multi-Armed Bandit for Learning Recommendation

non-stationary MAB

epsilon-greedy algorithm

Tracking a non-stationary Problem

