Coordinator #47

Open
1 of 5 tasks
korczis opened this issue Nov 10, 2016 · 1 comment

@korczis
Member

korczis commented Nov 10, 2016

Features

  • Prevent Duplicate Work
  • Re-queue Work
    • Crashed Workers
    • Disconnected Workers
    • 429 Too many requests

Implementation

  • Do Not Use Couchbase for Sharing Global State
  • {crawler, processor, url} => {state, pid}

Braindump

sha({worker, processor, url}) => {state, timestamp}

state = requested | queued | done
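A minimal sketch of the key and record described above. The issue only says "sha", so SHA-256 is an assumption here, as are the NUL separator and the names `State`, `Record`, and `work_key`.

```python
import hashlib
import time
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    REQUESTED = "requested"
    QUEUED = "queued"
    DONE = "done"


@dataclass
class Record:
    state: State
    # Wall-clock time of the last state change (assumed meaning of "timestamp").
    timestamp: float = field(default_factory=time.time)


def work_key(worker: str, processor: str, url: str) -> str:
    # Join with a NUL separator so ("a", "bc") and ("ab", "c") hash differently.
    payload = "\x00".join((worker, processor, url)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# In-memory table: key -> Record (hypothetical example values below).
table = {}
key = work_key("crawler-1", "parser", "http://example.com/")
table[key] = Record(State.REQUESTED)
```

Hashing the tuple gives a fixed-size key regardless of URL length; a plain tuple would work just as well as a dictionary key if the table never leaves the coordinator process.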

Scenarios

  1. Process 1 wants some work, so it asks the coordinator.
    The coordinator finds no such record, so it stores requested and
    starts monitoring process 1.
  2. Process 2 wants the same work, so it asks the coordinator.
    The coordinator finds the requested record, adds the request to it, and does not reply yet.
  3. Process 1 crashes. The coordinator finds the requested record (by the pid that crashed),
    finds process 2 in the list, starts monitoring it, and sends it ok.
  4. Process 2 sends the work to RabbitMQ and tells the coordinator queued.
    If the coordinator finds a requested record, it tells everyone waiting in the queue
    no (the work is already queued) and creates a queued record.
    Note: If the worker crashes between sending the work to RabbitMQ and
    sending queued to the coordinator, the coordinator deletes the requested record,
    and if someone asks for the same work at that moment, it can run twice. But
    the probability of that is very small (there is no reason for it to crash).
  5. Process 3 wants the same work, but the coordinator says no, because it is queued.
  6. Worker 1 gets the work from RabbitMQ; if it crashes, RabbitMQ sends the work elsewhere.
  7. When the worker finishes the work, it:
    • saves the work to Couchbase
    • tells the coordinator that the work is done
    • acks the message to RabbitMQ
      Note: If the process crashes between done and the ack to RabbitMQ, RabbitMQ will have
      the work computed again. But again, this will happen rarely.
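The hand-over in scenarios 1 to 5 can be sketched roughly as below. This is an illustration, not the project's implementation: `Coordinator`, `process_down`, `mark_queued`, and the `replies` list (which stands in for sending messages to processes) are hypothetical names, and a waiter that crashes while waiting is not handled.

```python
from collections import deque


class RequestedRecord:
    """A 'requested' record: the current owner plus the processes waiting behind it."""

    def __init__(self, owner):
        self.owner = owner
        self.waiters = deque()


class Coordinator:
    def __init__(self):
        self.requested = {}  # key -> RequestedRecord
        self.queued = set()  # keys already pushed to the message queue
        self.replies = []    # (pid, answer) pairs; stands in for message sends

    def request(self, key, pid):
        if key in self.queued:
            # Scenario 5: work is already queued.
            self.replies.append((pid, "no"))
        elif key in self.requested:
            # Scenario 2: remember the request, do not answer yet.
            self.requested[key].waiters.append(pid)
        else:
            # Scenario 1: the first requester becomes the monitored owner.
            self.requested[key] = RequestedRecord(pid)
            self.replies.append((pid, "ok"))

    def process_down(self, pid):
        # Scenario 3: the monitored owner crashed; promote the next waiter.
        for key, rec in list(self.requested.items()):
            if rec.owner != pid:
                continue
            if rec.waiters:
                rec.owner = rec.waiters.popleft()
                self.replies.append((rec.owner, "ok"))
            else:
                # Nobody is waiting: drop the record (the window from the note in step 4).
                del self.requested[key]

    def mark_queued(self, key, pid):
        # Scenario 4: the owner pushed the work; everyone still waiting gets "no".
        rec = self.requested.pop(key, None)
        if rec is not None:
            for waiter in rec.waiters:
                self.replies.append((waiter, "no"))
            self.queued.add(key)


c = Coordinator()
c.request("k", "p1")      # p1 gets ok
c.request("k", "p2")      # p2 waits
c.process_down("p1")      # p2 is promoted and gets ok
c.mark_queued("k", "p2")  # record becomes queued
c.request("k", "p3")      # p3 gets no
```

In a real deployment `process_down` would be driven by whatever process monitoring the runtime provides; here it is called by hand to keep the sketch self-contained.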

Coordinator behavior

  1. request
    • work does not exist -> ok
    • work is requested -> add the caller to the queue of waiters on the requested record (then usually return no)
    • work is queued -> no
    • work is done and is not yet "stale" -> no
    • work is done, but that was a long time ago -> ok (and behave as if there were no record)
    • coordinator crashes - the requestor gets an exception and has to deal with it somehow,
      for example by crashing as well (either it is a call from a client, so the client retries it,
      or it is a job from RabbitMQ and RabbitMQ requeues it)
  2. queue
    • set the record to queued if there is a requested record from the same pid
    • return an error when there is no record or the record is anything else - that is either
      incorrect usage, or the coordinator crashed between the calls. The caller then has to
      deal with that, for example by trying request first again
  3. done
    • sets done in any case
In general, when the coordinator crashes, its state is simply lost and some computations may run
more than once. But after a while things settle down again...

@bossek
Contributor

bossek commented Nov 27, 2016

Currently the coordinator implements only the 'Prevent Duplicate Work' feature. The 'Re-queue Work' feature will be handled by RabbitMQ.

2 participants