ML/MLOps Engineer
Distributed Job Scheduler system design

Distributed Job Scheduler system design ๋ฅผ ์ฝ๊ณ  ์ •๋ฆฌํ•œ ๋‚ด์šฉ์ž…๋‹ˆ๋‹ค.
๋Œ€๋Ÿ‰์˜ user-submitted jobs์„ ํ™•์žฅ ๊ฐ€๋Šฅํ•˜๊ณ  ๋‚ด๊ฒฐํ•จ์„ฑ ์žˆ๊ฒŒ ๊ด€๋ฆฌ, ์ถ”์  ๋ฐ ์‹คํ–‰ํ•  ์ˆ˜ ์žˆ๋Š” ๋ถ„์‚ฐ ์ž‘์—… ์Šค์ผ€์ค„๋Ÿฌ๋ฅผ ์„ค๊ณ„ํ•ฉ๋‹ˆ๋‹ค.

High level system design

๋‹ค์Œ์€ ๋ถ„์‚ฐ ์ž‘์—… ์Šค์ผ€์ค„๋Ÿฌ์˜ ๊ณ ์ˆ˜์ค€ ์‹œ์Šคํ…œ ์„ค๊ณ„(high level system design)์ž…๋‹ˆ๋‹ค.

๊ฐ ์„œ๋น„์Šค๋ฅผ ์ž์„ธํžˆ ์‚ดํŽด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

Job Submission Service

Job submission service๋Š” Jobs ํ…Œ์ด๋ธ”์— ๊ธฐ๋กํ•ฉ๋‹ˆ๋‹ค. Jobs ํ…Œ์ด๋ธ”์˜ ์Šคํ‚ค๋งˆ๋Š” UserId+JobId๋ฅผ ๊ธฐ๋ณธ ํ‚ค๋กœ, UserId๋ฅผ ์ƒค๋“œ ํ‚ค๋กœ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

์ƒค๋“œ ํ‚ค(shard key)๋Š” ๋ถ„์‚ฐ ์‹œ์Šคํ…œ์—์„œ ๋ฐ์ดํ„ฐ๋ฅผ ์—ฌ๋Ÿฌ ์„œ๋ฒ„(์ƒค๋“œ)์— ๋‚˜๋ˆ„์–ด ์ €์žฅํ•  ๋•Œ ๋ฐ์ดํ„ฐ๋ฅผ ์–ด๋–ป๊ฒŒ ๋ถ„๋ฐฐํ• ์ง€๋ฅผ ๊ฒฐ์ •ํ•˜๋Š” ๊ธฐ์ค€์ด ๋˜๋Š” ํ‚ค์ž…๋‹ˆ๋‹ค. ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค๊ฐ€ ์ˆ˜ํ‰์ ์œผ๋กœ ํ™•์žฅ๋˜๊ธฐ ์œ„ํ•ด์„œ๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ์—ฌ๋Ÿฌ ์ƒค๋“œ๋กœ ๋‚˜๋ˆ„์–ด ์ €์žฅํ•˜๋Š” ๊ฒƒ์ด ํšจ์œจ์ ์ด๋ฉฐ, ์ด๋ฅผ ํ†ตํ•ด ์‹œ์Šคํ…œ์˜ ์„ฑ๋Šฅ๊ณผ ์ฒ˜๋ฆฌ ์šฉ๋Ÿ‰์„ ๋†’์ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

11True0 */2 * * *3t1
12True0 */6 * * *1t2
13True0 0 * * *0t3
14FalseNULL3t4
25FalseNULL3t4

Job submission์€ ๋˜ํ•œ ์ž‘์—…์„ ๋ถ„์„ํ•˜๊ณ  ScheduledJob ํ์— ์ž‘์—… ์ฃผ๊ธฐ์— ๋”ฐ๋ผ TTL์„ ์„ค์ •ํ•˜์—ฌ ๋Œ€๊ธฐ์—ด์— ์ถ”๊ฐ€ํ•ฉ๋‹ˆ๋‹ค. ScheduledJob์€ ๋‹ค์Œ ์Šคํ‚ค๋งˆ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์˜ˆ์•ฝ๋œ ์ž‘์—…์„ ์ €์žฅํ•˜๋ฉฐ, NextExecutionTimestamp๊ฐ€ ์ƒค๋“œ ํ‚ค๋กœ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค. ํƒ€์ž„์Šคํƒฌํ”„ ๊ธฐ๋ฐ˜ ์ƒค๋“œ ํ‚ค๋Š” ๋Œ€๊ธฐ์—ด ์‹œ์Šคํ…œ์ด ๋งค ๋ถ„๋งˆ๋‹ค ๊ฐ™์€ ์œ„์น˜์— ์žˆ๋Š” ์ž‘์—…์„ ๊ฐ€์ ธ์˜ค๋„๋ก ๋•์Šต๋‹ˆ๋‹ค.

T11
T12
T13

Scheduler Service

์Šค์ผ€์ค„๋Ÿฌ ์„œ๋น„์Šค๋Š” ์ž‘์—…์„ ์‹คํ–‰ํ•  ๋•Œ ๋‹ค์Œ ๋‘ ๊ฐ€์ง€ ๊ฒฐ์ •์„ ๋‚ด๋ ค์•ผ ํ•ฉ๋‹ˆ๋‹ค.

  1. ์ž‘์—…์ด ์ฃผ๊ธฐ์ (interval)์ธ ๊ฒฝ์šฐ, ๋‹ค์Œ ์‹คํ–‰ ์‹œ๊ฐ„์„ ๊ณ„์‚ฐํ•˜๊ณ  ๋‹ค์‹œ ScheduledJob ํ์— ๋Œ€๊ธฐ์‹œํ‚ต๋‹ˆ๋‹ค.
  2. ์ž‘์—…์˜ ์œ ํ˜•, ์ข…์†์„ฑ ๋“ฑ์„ ๋ถ„์„ํ•˜๊ณ  ์ ์ ˆํ•œ ์›Œ์ปค ๋…ธ๋“œ๋ฅผ ์ฐพ์•„ ์ž‘์—…์„ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

Workers

์ž‘์—…์— ๋”ฐ๋ผ ์—ฌ๋Ÿฌ ์œ ํ˜•์˜ ์›Œ์ปค๋ฅผ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๋ฉฑ๋“ฑ์„ฑ ์›Œ์ปค(Idempotent workers)

๋ถ„์‚ฐ ์‹œ์Šคํ…œ์—์„œ๋Š” (์žฌ์‹œ๋„ ๋“ฑ์œผ๋กœ ์ธํ•ด) ์ค‘๋ณต ๋ฉ”์‹œ์ง€๊ฐ€ ์›Œ์ปค์— ์ „๋‹ฌ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ์›Œ์ปค๋Š” ๋ฉฑ๋“ฑ์„ฑ(idempotency)์„ ์œ ์ง€ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์ฆ‰, ๋‘ ์›Œ์ปค๊ฐ€ ๋™์ผํ•œ ์ž‘์—…์„ ๋ณ‘๋ ฌ๋กœ ์ฒ˜๋ฆฌํ•˜๋”๋ผ๋„ ์„œ๋กœ ์˜ํ–ฅ์„ ์ฃผ์ง€ ์•Š์•„์•ผ ํ•ฉ๋‹ˆ๋‹ค.

๋น„๋ฉฑ๋“ฑ์„ฑ ์›Œ์ปค(Non idempotent workers)

์ผ๋ถ€ ์ž‘์—…์€ ํŠน์„ฑ์ƒ ์—ฌ๋Ÿฌ ์›Œ์ปค๊ฐ€ ๋™์ผํ•œ ์ž‘์—…์„ ๋™์‹œ์— ์ฒ˜๋ฆฌํ•˜์ง€ ์•Š๋„๋ก ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์ด ๊ฒฝ์šฐ ์›Œ์ปค๋Š” ์™ธ๋ถ€ ์ €์žฅ์†Œ๋ฅผ ์ฐธ์กฐํ•ด ์ž‘์—… ์ง„ํ–‰ ์ƒํƒœ๋ฅผ ํ™•์ธํ•˜๊ณ , ์ฒ˜๋ฆฌ ์ค‘์ธ์ง€ ํ™•์ธํ•˜์—ฌ ํ•„์š”์— ๋”ฐ๋ผ ์ž‘์—…์„ ๊ณ„์†ํ•˜๊ฑฐ๋‚˜ ์ค‘๋‹จํ•ฉ๋‹ˆ๋‹ค.

๋‹จ๊ธฐ ์ž‘์—…(Short running job)

๋ช‡ ์ดˆ ์•ˆ์— ์™„๋ฃŒํ•  ์ˆ˜ ์žˆ๋Š” ๋‹จ๊ธฐ ์ž‘์—…์€ ํƒ€์ž„์•„์›ƒ ๋‚ด์—์„œ ๋Œ€๊ธฐ์—ด์— ์‘๋‹ตํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์žฅ๊ธฐ ์ž‘์—…(Long running job)

์ผ๋ถ€ ์ž‘์—…์€ ์‹คํ–‰์— ๋ช‡ ๋ถ„ ํ˜น์€ ๋ช‡ ์‹œ๊ฐ„ ์ด์ƒ ๊ฑธ๋ฆด ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋Œ€๊ธฐ์—ด ์‹œ์Šคํ…œ์€ ์ด ์ •๋„๋กœ ์˜ค๋ž˜ ๊ธฐ๋‹ค๋ฆด ์ˆ˜ ์—†์œผ๋ฏ€๋กœ ์ด๋ฒคํŠธ๊ฐ€ ํƒ€์ž„์•„์›ƒ๋˜์–ด ๋‹ค๋ฅธ ์›Œ์ปค์—๊ฒŒ ์žฌ์ „๋‹ฌ๋  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ์ด๋กœ ์ธํ•ด ๋™์ผํ•œ ์ž‘์—…์ด ์—ฌ๋Ÿฌ ์›Œ์ปค์—์„œ ์žฌ์ฒ˜๋ฆฌ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์žฅ๊ธฐ ์ž‘์—…์„ ์ฒ˜๋ฆฌํ•˜๋Š” ์—ฌ๋Ÿฌ ๋ฐฉ๋ฒ•์ด ์žˆ์œผ๋ฉฐ, ์ž‘์—…์˜ ํŠน์„ฑ์— ๋”ฐ๋ผ ๋‹ค๋ฆ…๋‹ˆ๋‹ค.

  • ์ž‘์—… ์ฒ˜๋ฆฌ ํŽ˜์ด์ง•(Paginate the job processing)
    ์ž‘์—… ๋ฒ”์œ„๋ฅผ ์ž‘์€ ์ฒญํฌ๋กœ ๋‚˜๋ˆ„๊ณ  ํŽ˜์ด์ง€ ๋ฐฉ์‹์œผ๋กœ ์ฒ˜๋ฆฌํ•ฉ๋‹ˆ๋‹ค. ์ด ์ ‘๊ทผ ๋ฐฉ์‹์€ ์ž‘์—…์„ ๋‹จ๊ธฐ ์ž‘์—…์œผ๋กœ ๋ณ€ํ™˜ํ•ฉ๋‹ˆ๋‹ค. Checkpoint ํ…Œ์ด๋ธ”์„ ์‚ฌ์šฉํ•ด ์ž‘์—… ์ง„ํ–‰ ์ƒํƒœ๋ฅผ ํ™•์ธํ•ฉ๋‹ˆ๋‹ค.

    JobId PageSize PageStart IsCompleted
    1 10 20 False
    2 10 0 False
    3 10 100 True

  • ์žฅ๊ธฐ ์‹คํ–‰ ์ž‘์—…์„ ์œ„ํ•œ ๋ฏธ๋ž˜ ๋ฉ”์‹œ์ง€ ๋Œ€๊ธฐ์—ด ์ถ”๊ฐ€(Enqueue message for long future)
    ์ž‘์—…์„ ์ž‘์€ ์ฒญํฌ๋กœ ๋‚˜๋ˆ„๊ธฐ ์–ด๋ ค์šด ๊ฒฝ์šฐ JobStatus ํ…Œ์ด๋ธ”์„ ์œ ์ง€ํ•˜์—ฌ ์ฒ˜๋ฆฌ ์ƒํƒœ๋ฅผ ํ™•์ธํ•ฉ๋‹ˆ๋‹ค.

    JobId Status
    1 INPROGRESS
    2 COMPLETED

์žฌ์‹œ๋„ ๋ฉ”์ปค๋‹ˆ์ฆ˜(Retry mechanism)

์ž‘์—…์˜ ์žฌ์‹œ๋„ ์˜ต์…˜์— ๋”ฐ๋ผ ์›Œ์ปค ๋…ธ๋“œ๋Š” ๋ฉ”์‹œ์ง€๋ฅผ ํ™•์ธํ•˜๊ฑฐ๋‚˜ ํ™•์ธํ•˜์ง€ ์•Š๊ณ  ์žฌ์ „์†กํ•ฉ๋‹ˆ๋‹ค. ์žฌ์‹œ๋„๊ฐ€ ๊ฐ€๋Šฅํ•œ ์ž‘์—…์˜ ๊ฒฝ์šฐ ์‹คํŒจ ์‹œ ๋ฉ”์‹œ์ง€๋ฅผ ํ™•์ธํ•˜๊ณ , ๋ฉ”์‹œ์ง€๋ฅผ ์ƒˆ๋กœ ๋ฐœํ–‰ํ•˜๋ฉฐ ์žฌ์‹œ๋„๋ฅผ ์ง„ํ–‰ํ•ฉ๋‹ˆ๋‹ค. ์žฌ์‹œ๋„๊ฐ€ ๋ถˆ๊ฐ€๋Šฅํ•œ ์ž‘์—…์€ ์‹คํŒจ ์‹œ ๋ฉ”์‹œ์ง€๋ฅผ ํ™•์ธํ•˜์ง€ ์•Š๊ณ  ์ค‘๋‹จํ•ฉ๋‹ˆ๋‹ค.

์ž‘์—… ์‹คํ–‰ ์ด๋ ฅ(Task execution history)

์ž‘์—… ์‹คํ–‰ ์™„๋ฃŒ ํ›„ ์›Œ์ปค๋Š” ์ž‘์—… ์‹คํ–‰ ์ด๋ ฅ์„ ๊ธฐ๋กํ•ฉ๋‹ˆ๋‹ค. TaskExecutionHistory ์Šคํ‚ค๋งˆ๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

JobId StartTime CompletedTime Status RetryCount

๋ณด๊ณ  ์„œ๋น„์Šค(Reporting Service)

๋ณด๊ณ  ์„œ๋น„์Šค๋Š” Jobs์™€ TaskExecutionHistory ํ…Œ์ด๋ธ”์„ ์‚ฌ์šฉํ•ด ์‚ฌ์šฉ์ž๋ณ„ ์ž‘์—… ์ƒํƒœ๋ฅผ ํ‘œ์‹œํ•ฉ๋‹ˆ๋‹ค.

์•Œ๋ฆผ ๋ฐ ๋ชจ๋‹ˆํ„ฐ๋ง ์„œ๋น„์Šค(Alerting and monitoring service)

์ž‘์—… ์‹คํŒจ์œจ์„ ๋ชจ๋‹ˆํ„ฐ๋ง ์‹œ์Šคํ…œ์— ๊ธฐ๋กํ•˜๊ณ  ์กฐ๊ฑด์— ๋”ฐ๋ผ ์•Œ๋ฆผ ๊ทœ์น™์„ ๊ตฌ์„ฑํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์Šค์ผ€์ผ๋ง ์ž‘์—… ์Šค์ผ€์ค„๋Ÿฌ ์‹œ์Šคํ…œ(Scale job scheduler system)

์‚ฌ์šฉ์ž์™€ ์ž‘์—…์˜ ์–‘์— ๋”ฐ๋ผ ๊ฐ ์ปดํฌ๋„ŒํŠธ๋ฅผ ์ˆ˜ํ‰์œผ๋กœ ํ™•์žฅํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ €์žฅ์†Œ๊ฐ€ ์ ์ ˆํžˆ ๋ถ„ํ• ๋˜์–ด ์žˆ์–ด ์ˆ˜ํ‰ ํ™•์žฅ์ด ์šฉ์ดํ•ฉ๋‹ˆ๋‹ค.


reference: Distributed Job Scheduler system design