Zzong's Notes

❯

machine_learning

❯

❯

❯

vicuna

2026년 6월 14일1 min read

Vicuna

ShareGPT 에서 모은 약 125K 개의 사용자 대화 데이터를 기반으로 파인튜닝한 Llama 기반 모델

B) Training

B.1) 데이터셋

ShareGPT 데이터셋은 공개하지 않음

B.1.1) Preprocessing

To ensure data quality, we convert the HTML back to markdown and filter out some inappropriate or low-quality samples.
Additionally, we divide lengthy conversations into smaller segments that fit the model’s maximum context length.

C) Related

D) References

GitHub - lm-sys/FastChat: An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

링크된 언급

1

CLIP Vicuna InstructBLIP GPT-4V

함께 보면 좋은 글

Llama

Llama B) Llama 2 모델 사이즈 7, 13, 34, 70 billion and a Llama chat variant with the same sizes B.1) 기존 모델과 비교 increased the size of the pretraining corpus by 40% doubled the...

Train Large Model

배경 Qwen2.5 의 72b 급 대용량 모델을 학습하는 방법에 대해 조사해보자.

LVLM 를 따라하는 정보 넣기

General LVLMs 학습 방법 Visual-encoder → cross-modal connector → LLM Vision Transformer (ViT) Component Image-text pairs 를 이용해서 학습 수행 주로 CLIP 모델을 인코더로 활용한다.

llm_as_classifier

llm as classifier 왜 전통적인 분류 모델을 사용하지 않고, LLM 을 통해 분류 문제를 해결하려 할까? B) LLM 만의 장점 학습 데이터셋이 많지 않은 경우, 빅 모델이 성능 면에서 더 효율적일 수 있다.

GPT-2

GPT-2 Let’s reproduce GPT-2 (124M) - YouTube B) Questions dropout 은 왜 softmax 이후에 적용하는 걸까? GPT 모델에서 cheating 방지를 위해 masking 하는 방식은 아직도 이해를 잘 못하겠음.

FastChat

FastChat LLM 파인 튜닝 용 라이브러리 B) Arguments Description tf32 C) Related D) References.

Prompt 종류

Prompt 종류 A.1) Instruction Prompt Extract the name of the author from the quotation below.

DistributedDataParallel

DistributedDataParallel a batch is sent to each GPU worker which has its own copy of the model.

DPO를 한 문장으로 잡기

DPO를 한 문장으로 잡기 DPO(Direct Preference Optimization)는 chosen/rejected 답변 쌍을 이용해 LLM을 선호도에 맞게 미세조정하는 방법이다.

GPT

GPT B) Related C) References.

Vicuna
B) Training
B.1) 데이터셋
B.1.1) Preprocessing
C) Related
D) References