I want to start a project (startup) using Big Data

big data

Radmir, 2015-05-17 16:26:56

I want to start a project (startup) using Big Data - where to start?

I have an idea for a startup that uses big data and visualization. Where should I start studying the topic?
I also plan to analyze the collected statistical data (ideally, millions of records). Does anyone have experience with how this is usually implemented, and which direction should I look in (what to study)?

4 answers

lPolar, 2015-05-18
@RadmirZ

First you need to understand whether you need Big Data at all: habrahabr.ru/post/194434
If you don't have Big Data, you can use these tools:
1. Pandas - data processing, I/O
2. Sklearn - building models
3. For database storage, there are options:
3.1 SQL databases - SQLite, Postgres
3.2 NoSQL - Mongo, etc.
4. If you expect some of the data to be used more actively, i.e. you need hot caching, take Redis or one of its analogues
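To make the "no Big Data" route a bit more concrete, here is a minimal sketch of option 3.1 using only Python's built-in sqlite3 module. The table name, column names, and the synthetic data are all made up for illustration; your real records will look different.

```python
import random
import sqlite3

# Hypothetical schema: one row per collected event (names are illustrative).
conn = sqlite3.connect(":memory:")  # pass a file path instead to persist the data
conn.execute("CREATE TABLE events (user_id INTEGER, category TEXT, value REAL)")

# Synthetic data standing in for the collected statistics.
random.seed(0)
categories = ["a", "b", "c"]
rows = [
    (random.randrange(1000), random.choice(categories), random.random())
    for _ in range(100_000)
]
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
conn.commit()

# Let the database do the grouping instead of looping in Python.
summary = conn.execute(
    "SELECT category, COUNT(*), AVG(value) FROM events "
    "GROUP BY category ORDER BY category"
).fetchall()

for category, count, mean_value in summary:
    print(f"{category}: n={count}, mean={mean_value:.3f}")
```

If this kind of query stays fast on your real volume, a single SQL database is probably enough to start with, and you can load the query results into Pandas for further processing.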
If you do have Big Data, then:
Apache Hive - to store all of this in a digestible form
Apache Spark - to build predictive models and all sorts of non-classical groupings
Visualization is more complicated. First you need to understand what kind of visualization you need, static or dynamic, and which language you personally find more convenient for writing it.
For static visualization (.jpg files, for example):
R - lattice, ggplot2
Python - matplotlib, seaborn
For super cool real-time dashboards:
R - Shiny
Python - bokeh
If you describe what data sources you have, it will be easier to say what to dig into and which tools to use.

Andrey Burov, 2015-05-17
@BuriK666

A million records is not big data. For starters, I'd recommend watching www.youtube.com/watch?v=TEHdfPa1eJA
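A rough back-of-the-envelope check shows why. The sketch below assumes an average of 200 bytes per record, which is only a guess (measure your own data), and the function name is made up for illustration.

```python
def estimated_size_mb(n_records: int, bytes_per_record: int) -> float:
    """Rough raw data size in megabytes (1 MB = 10**6 bytes)."""
    return n_records * bytes_per_record / 10**6


# Assumed average record size; replace with a measurement from your own data.
BYTES_PER_RECORD = 200

for n in (10**6, 10**8, 10**10):
    size_mb = estimated_size_mb(n, BYTES_PER_RECORD)
    print(f"{n:>14,} records ~ {size_mb:,.0f} MB")
```

Under that assumption a million records is about 200 MB, which fits in the memory of an ordinary laptop. Only several orders of magnitude later does the data stop fitting on one machine.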

Sergey, 2015-05-17
@begemot_sun

Start by asking "who needs this?" and then offer them your services.

Dmitry, 2015-05-20
@vip1987

I agree with Sergey :) First determine your target audience, and only then think about what to serve, to whom, and with what. Well, you get what I mean :) Otherwise, Azure still rules ...