[Week 7] Hadoop Ecosystem #43

Open

dani-lee-418 opened this issue Nov 14, 2023 · 1 comment

dani-lee-418 commented Nov 14, 2023

At my previous company I installed a big data preprocessing solution built on Hadoop/Spark. It would have been much more fun if I had known this material back then.

https://dodonam.tistory.com/390

myeongjae-kim commented Nov 14, 2023

Daeun: The Hadoop ecosystem is a set of open-source tools assembled for processing big data with Hadoop. You can swap individual tools in and out as needed.

At my previous company we used Cloudera with Hadoop. Back then I didn't know what these components did, but now I understand them a little.

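Processing speed changed depending on how much memory was allocated to the Spark slave (worker), but I didn't know why at the time. I just ran tests until I found the fastest setting.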
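One likely reason for the speed difference is that Spark executors keep cached data and shuffle buffers in memory, so an executor with too little memory tends to spill to disk more often. The trial-and-error search described above could be sketched roughly as below. This is only an illustration, not the setup actually used: the helper `build_spark_submit_commands`, the script name `preprocess_job.py`, and the candidate sizes are hypothetical. The only real Spark piece is the `--executor-memory` option of `spark-submit`.

```python
import shlex


def build_spark_submit_commands(app_path, executor_memories):
    """Return one spark-submit command line per candidate executor memory size.

    app_path and executor_memories are illustrative inputs; the function only
    builds command strings and does not launch Spark itself.
    """
    commands = []
    for memory in executor_memories:
        # --executor-memory sets the memory per executor (e.g. "4g").
        args = ["spark-submit", "--executor-memory", memory, app_path]
        commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    # Hypothetical job script and candidate sizes to compare by wall-clock time.
    for command in build_spark_submit_commands("preprocess_job.py", ["2g", "4g", "8g"]):
        print(command)
```

Running each printed command on the same input and timing it would reproduce the "find the fastest setting" experiment, under the assumption that executor memory is the only thing being varied.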
