Why is BigData done in Scala?

Sergey Nizhny Novgorod, 2017-06-22 01:46:40

Python

Hello everyone,
I have noticed something paradoxical.
1) Python is promoted everywhere for big data, and many tutorials are written for it.
2) In practice, Python is treated as a "sandbox", and real big data projects are done in Scala/Java.
Question: why is Python so bad for big data, and if it really is that bad, why is it promoted?

3 answer(s)

ivodopyanov, 2017-06-23
@ivodopyanov

It is not that Python is bad for big data; it is that Java is good for the enterprise.

⚡ Kotobotov ⚡, 2017-08-04
@angrySCV

Python's problem with resource efficiency is that it is a dynamically typed language, whereas Java and Scala are statically typed.
Static typing gives less flexibility but more efficient data handling; dynamic typing gives the reverse.
Scala gets much of the conciseness of dynamic typing (thanks to automatic type inference) while keeping the performance of static typing.
There are other details as well. Take Spark, which is implemented in Scala: if you know Scala, you already largely know Spark, because its API is almost the same as the one for working with Scala collections. To make your code run on a Spark cluster, you change the type of the collection you are processing, for example from Array[MyClass] to RDD[MyClass], and, roughly speaking, the rest of your code runs on the cluster unchanged. Python has nothing like this and never will; there you have to learn an additional API.
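
To make the collections point concrete, here is a minimal sketch in plain Scala with no Spark dependency. The fields of MyClass, the object name CollectionsSketch and the method totalOkBytes are all made up for illustration; only the type name MyClass comes from the answer above.

```scala
// Hypothetical record type standing in for the MyClass mentioned above.
final case class MyClass(user: String, bytes: Long, ok: Boolean)

object CollectionsSketch {
  // Sums the sizes of the successful records using ordinary collection methods.
  // The types of the intermediate values are inferred by the compiler,
  // so the code reads almost like a dynamically typed script.
  def totalOkBytes(records: Array[MyClass]): Long = {
    val successful = records.filter(_.ok) // inferred: Array[MyClass]
    val sizes = successful.map(_.bytes)   // inferred: Array[Long]
    sizes.fold(0L)(_ + _)                 // 0L for an empty input
  }

  // Small hand-written sample, purely illustrative.
  val sample: Array[MyClass] = Array(
    MyClass("a", 100L, ok = true),
    MyClass("b", 50L, ok = false),
    MyClass("c", 200L, ok = true)
  )
}
```

Spark's RDD also provides filter, map and fold, so under the assumption that Spark is on the classpath, changing the parameter type to RDD[MyClass] should leave the method body as it is. That holds for these three calls; not every Scala collection method has an RDD counterpart, which is why the answer says "roughly speaking".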

Pavel Ivanov, 2017-08-30
@eastywest

The JVM has good multithreading support, and Scala has a great collections library. Many big data tools are also written in Scala these days (Spark, Kafka, ...).