Evaluating RAG performance is essential for measuring how efficient a system is and how satisfied its users are. Several metrics quantify how quickly and accurately the system finds the information a user wants; the most representative are mAP (Mean Average Precision), mRR (Mean Reciprocal Rank), and nDCG (Normalized Discounted Cumulative Gain). Each metric evaluates from a different angle and for a different purpose, so choose and interpret the one that fits the system's characteristics and your evaluation goals. 😊

1. mRR (Mean Reciprocal Rank)

Definition

mRR (Mean Reciprocal Rank) takes, for each of several queries, the reciprocal of the rank at which the first correct answer (a highly relevant item) appears in the result list (the Reciprocal Rank, RR), and averages those values. In other words, it focuses on how early the system surfaces the single answer the user is looking for.

Explanation

  • Reciprocal Rank (RR): If the first correct item in the result list for a query appears at rank k, RR is 1/k. For example, RR is 1/1 = 1 when the very first result is correct, and 1/3 when the third result is the first correct one. If the list contains no correct answer, RR is 0.
  • Mean: Compute RR for every query in the set Q, then take the arithmetic mean to get the final mRR: mRR = (1 / |Q|) * Σ(1 / rank_i), where rank_i is the rank of the first correct answer for the i-th query.
  • Characteristics:
    • 🤔 Simplicity: Relatively simple to compute and intuitive to read.
    • 🥇 First answer matters: Especially useful when the user wants one answer (e.g., question answering, known-item search). It looks only at the rank of the first relevant result.
    • 📉 Limitation: It ignores the ranks and the number of any relevant items after the first one. Even if several correct answers appear near the top, only the rank of the first is reflected.

Example

Suppose three queries return the following results, with the first correct answer at the ranks shown.

  • Query 1: [Result A (correct), Result B, Result C] → first correct rank = 1, RR = 1/1 = 1
  • Query 2: [Result D, Result E, Result F (correct)] → first correct rank = 3, RR = 1/3
  • Query 3: [Result G, Result H (correct), Result I] → first correct rank = 2, RR = 1/2

In this case, mRR = (1 + 1/3 + 1/2) / 3 = (6/6 + 2/6 + 3/6) / 3 = (11/6) / 3 = 11/18 ≈ 0.611. Since 1 / 0.611 ≈ 1.64, the first correct answer appears, in a harmonic-mean sense, between ranks 1 and 2.
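The mRR calculation above can be reproduced with a few lines of Python. This is a minimal sketch, not a reference implementation: the function names `reciprocal_rank` and `mean_reciprocal_rank` are illustrative, and each query is assumed to be given as a list of booleans in ranked order, where `True` marks a correct result.

```python
from typing import Sequence


def reciprocal_rank(relevance: Sequence[bool]) -> float:
    """Return 1/rank of the first relevant result, or 0.0 if none is relevant."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries: Sequence[Sequence[bool]]) -> float:
    """Average the reciprocal rank over all queries (0.0 for an empty set)."""
    if not queries:
        return 0.0
    return sum(reciprocal_rank(q) for q in queries) / len(queries)


# The three queries from the example; True marks a correct result.
example = [
    [True, False, False],   # first correct answer at rank 1
    [False, False, True],   # first correct answer at rank 3
    [False, True, False],   # first correct answer at rank 2
]
mrr = mean_reciprocal_rank(example)
print(round(mrr, 3))  # 0.611
```

Returning 0.0 for a query with no correct result matches the RR = 0 convention described in the explanation.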

2. mAP (Mean Average Precision)

Definition

mAP (Mean Average Precision) is the mean of the per-query Average Precision (AP) values. AP measures how well the relevant items are ranked toward the top of the result list, taking both precision and recall into account. In other words, it measures the ability to retrieve all of the relevant items and to rank them early.

Explanation

  • Precision: The fraction of retrieved items that are relevant (Precision = TP / (TP + FP)).
  • Recall: The fraction of all relevant items that were retrieved (Recall = TP / (TP + FN)).
  • Average Precision (AP): A metric for a single query. Walk down the result list from the top; each time a relevant item appears, compute the precision over the results up to that rank, then average those precision values. The average is taken over the total number of relevant documents, so a relevant document that is never retrieved contributes 0, and AP can reach 1 only when every relevant document is retrieved (recall = 1) with no irrelevant result ranked above any of them. The formula can be written as: AP = Σ(P(k) * rel(k)) / (number of relevant documents), where P(k) is the precision over the top k results and rel(k) is 1 if the k-th result is relevant and 0 otherwise.
  • Mean: Compute AP for every query, then take the arithmetic mean to get the final mAP.
  • Characteristics:
    • 📊 Order and count: Reflects both the order and the number of relevant items. The more relevant items appear near the top of the result list, the higher the AP.
    • ⚖️ Precision-recall balance: Implicitly accounts for the trade-off between precision and recall.
    • 🎯 General-purpose evaluation: One of the most widely used standard metrics in information retrieval. It suits cases where finding several relevant results matters (e.g., web search).
    • 🔢 Binary relevance assumption: By default it assumes each item is either relevant or not relevant (binary relevance).
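The precision and recall definitions above are usually applied to the top k results of a ranked list. The sketch below shows one way to do that in Python; `precision_at_k` and `recall_at_k` are illustrative names, the ranked list is hypothetical, and the total number of relevant items is assumed to be known from ground-truth labels.

```python
from typing import Sequence


def precision_at_k(relevance: Sequence[bool], k: int) -> float:
    """Fraction of the top-k slots filled with relevant results: TP / (TP + FP).

    Divides by k even if fewer than k results were returned (a common convention).
    """
    if k <= 0:
        return 0.0
    return sum(relevance[:k]) / k


def recall_at_k(relevance: Sequence[bool], k: int, total_relevant: int) -> float:
    """Fraction of all relevant items found within the top k: TP / (TP + FN)."""
    if total_relevant <= 0:
        return 0.0
    return sum(relevance[:k]) / total_relevant


# Hypothetical ranked list; True marks a relevant result.
ranked = [True, False, True, True, False]
p3 = precision_at_k(ranked, 3)                   # 2 of the top 3 are relevant
r3 = recall_at_k(ranked, 3, total_relevant=4)    # 2 of 4 relevant items found
print(p3, r3)
```

AP builds directly on this: it averages the precision-at-k values taken at the ranks where a relevant item appears.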

Example

Suppose a query has 5 relevant documents in total, and the system returns 10 results whose order and relevance are: [R, N, R, R, N, N, R, N, R, N] (R: Relevant, N: Not Relevant)

  1. 1st result (R): Precision = 1/1 = 1.0 (1 of 5 relevant documents found)
  2. 3rd result (R): Precision = 2/3 ≈ 0.67 (2 of 5 relevant documents found)
  3. 4th result (R): Precision = 3/4 = 0.75 (3 of 5 relevant documents found)
  4. 7th result (R): Precision = 4/7 ≈ 0.57 (4 of 5 relevant documents found)
  5. 9th result (R): Precision = 5/9 ≈ 0.56 (5 of 5 relevant documents found)

AP for this query = (1.0 + 0.67 + 0.75 + 0.57 + 0.56) / 5 ≈ 3.55 / 5 = 0.71. Averaging the AP values over several queries gives mAP. For example, if three queries have AP values of 0.71, 0.5, and 0.8, then mAP = (0.71 + 0.5 + 0.8) / 3 ≈ 0.67.
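The AP and mAP arithmetic above can be checked with a short Python sketch. The names `average_precision` and `mean_average_precision` are illustrative, and the code assumes binary relevance labels in ranked order plus a known total count of relevant documents for the query.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool], total_relevant: int) -> float:
    """AP = sum of P(k) at each relevant rank k, divided by total_relevant."""
    if total_relevant <= 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / k  # precision over the top k results
    return precision_sum / total_relevant


def mean_average_precision(ap_values: Sequence[float]) -> float:
    """Arithmetic mean of per-query AP values (0.0 for an empty set)."""
    return sum(ap_values) / len(ap_values) if ap_values else 0.0


# Ranked list from the example: R = True, N = False; 5 relevant documents exist.
ranked = [True, False, True, True, False, False, True, False, True, False]
ap = average_precision(ranked, total_relevant=5)
print(round(ap, 2))  # 0.71

map_score = mean_average_precision([0.71, 0.5, 0.8])
print(round(map_score, 2))  # 0.67
```

Dividing by `total_relevant` rather than by the number of hits is what makes missing relevant documents lower the score.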

3. nDCG (Normalized Discounted Cumulative Gain)

Definition

**nDCG (Normalized Discounted Cumulative Gain)** is a metric of ranking quality that is especially useful when relevance comes in grades (graded relevance). It gives more weight to results at higher ranks (Discounted) and divides by the score of the ideal (best possible) ordering (Normalized), which yields a value between 0 and 1.

Explanation

  • Cumulative Gain (CG): The plain sum of the relevance scores of the top k results. It ignores the order of the results. CG_k = Σ(rel_i) (i = 1 to k, where rel_i is the relevance score of the i-th result)
  • Discounted Cumulative Gain (DCG): CG with rank taken into account. The relevance scores of lower-ranked results receive a **penalty (discount)**. Typically the result at rank i is multiplied by a discount factor of 1 / log2(i+1), so the lower the rank (the larger i), the larger the denominator and the smaller the weight. DCG_k = Σ(rel_i / log2(i+1)) (i = 1 to k) (Other discount functions are also used.)
  • Normalized DCG (nDCG): DCG values vary with the query and with the number of results, so they are normalized for comparison. **Ideal DCG (IDCG)** is the DCG of the best possible ordering for the query (results sorted by decreasing relevance). nDCG is the actual DCG divided by IDCG: nDCG_k = DCG_k / IDCG_k. With non-negative relevance scores this value always lies between 0 and 1, and the closer it is to 1, the closer the ranking is to ideal.
  • Characteristics:
    • 💯 Graded relevance: Relevance can be judged on several levels rather than as plain binary (0 or 1), e.g., 0 = not relevant, 1 = somewhat relevant, 2 = highly relevant.
    • 📉 Rank weighting: Uses an intuitive scheme that gives higher-ranked results more importance. It reflects the fact that a good result at a high rank is worth more than the same result at a low rank.
    • 🌐 Standardized comparison: Normalization makes it easy to compare performance across different queries or systems.

Example

Suppose the relevance scores (on a 0 to 3 scale) of the top 5 results for a query are: [3, 2, 3, 0, 1]

  • DCG calculation:

    • DCG@1 = 3 / log2(1+1) = 3 / 1 = 3
    • DCG@2 = 3 + 2 / log2(2+1) ≈ 3 + 2 / 1.585 ≈ 4.26
    • DCG@3 = 4.26 + 3 / log2(3+1) = 4.26 + 3 / 2 = 5.76
    • DCG@4 = 5.76 + 0 / log2(4+1) = 5.76
    • DCG@5 = 5.76 + 1 / log2(5+1) ≈ 5.76 + 1 / 2.585 ≈ 6.15 (DCG@5 ≈ 6.15)
  • IDCG calculation: Assuming these five results are the only judged documents for the query, the ideal order sorts the scores from highest to lowest: [3, 3, 2, 1, 0].

    • IDCG@1 = 3 / log2(1+1) = 3
    • IDCG@2 = 3 + 3 / log2(2+1) ≈ 3 + 3 / 1.585 ≈ 4.89
    • IDCG@3 = 4.89 + 2 / log2(3+1) = 4.89 + 2 / 2 = 5.89
    • IDCG@4 = 5.89 + 1 / log2(4+1) ≈ 5.89 + 1 / 2.322 ≈ 6.32
    • IDCG@5 = 6.32 + 0 / log2(5+1) = 6.32 (IDCG@5 ≈ 6.32)
  • nDCG calculation:

    • nDCG@5 = DCG@5 / IDCG@5 ≈ 6.15 / 6.32 ≈ 0.973 (from the rounded values; about 0.972 without intermediate rounding)

The system presented its top 5 results in an order very close to ideal (nDCG@5 ≈ 0.97).
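The DCG, IDCG, and nDCG steps above can be sketched in Python as follows. The function names `dcg_at_k` and `ndcg_at_k` are illustrative, the discount is the 1 / log2(i+1) form from the explanation, and the ideal ordering is built only from the scores passed in, which assumes no other relevant documents exist for the query.

```python
import math
from typing import Sequence


def dcg_at_k(scores: Sequence[float], k: int) -> float:
    """DCG@k with the log2(i + 1) discount, where ranks i start at 1."""
    return sum(rel / math.log2(i + 1) for i, rel in enumerate(scores[:k], start=1))


def ndcg_at_k(scores: Sequence[float], k: int) -> float:
    """DCG@k divided by the DCG@k of the same scores sorted in descending order."""
    ideal = sorted(scores, reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(scores, k) / idcg if idcg > 0 else 0.0


scores = [3, 2, 3, 0, 1]  # graded relevance of the top 5 results from the example
dcg = dcg_at_k(scores, 5)
idcg = dcg_at_k(sorted(scores, reverse=True), 5)
ndcg = ndcg_at_k(scores, 5)
print(round(dcg, 2))   # 6.15
print(round(idcg, 2))  # 6.32
print(round(ndcg, 3))  # 0.972
```

The hand calculation rounds each intermediate value to two decimals, so its last digit can differ slightly from the unrounded result printed here.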

4. Which metric should you use, and when? 🤔

  • mRR: When the user is looking for one clear answer and what matters is how early it appears (e.g., "What is the capital of France?", finding a specific paper).
  • mAP: When the user wants several relevant results and both the order and the relevance of the results matter (e.g., "machine learning tutorial", "travel recommendations"). It is mostly used when binary relevance is sufficient.
  • nDCG: When relevance comes in degrees and the quality of the top ranks matters a great deal (e.g., web search results, movie/music recommendations). It allows a finer-grained evaluation than mAP.

Each metric emphasizes a particular aspect of a retrieval system, so it is best to choose the one that fits the system's goals and the users' needs, or to use several together and evaluate performance from multiple angles. 👍
