Saturday, 15 August 2015

parallel processing - What is the "task" in Storm parallelism?


I've been reading a great article, "".

However, I am a bit confused by the concept of a "task". Is a task a running instance of a component (spout or bolt)? And does an executor having multiple tasks mean that the same component is executed multiple times by that executor? Am I right?

Moreover, in a general parallelism sense, Storm will spawn a dedicated thread (executor) for a spout or bolt, but what does an executor (thread) having multiple tasks contribute to parallelism? Since a thread executes sequentially, I think having multiple tasks in one thread only makes the thread a kind of "cached" resource, which avoids spawning a new thread for the next task run. Am I right?

I could probably clear up this confusion myself after spending more time investigating, but you know, we both love Stack Overflow ;-)

Thanks in advance.

Disclaimer: I wrote the article you referenced in your question above.

However, I am a bit confused by the concept of a "task". Is a task a running instance of a component (spout or bolt)? And does an executor having multiple tasks mean that the same component is executed multiple times by that executor? Am I right?

Yes, and yes.

Moreover, in a general parallelism sense, Storm will spawn a dedicated thread (executor) for a spout or bolt, but what does an executor (thread) having multiple tasks contribute to parallelism?

Running more than one task per executor does not increase the level of parallelism: an executor always has exactly one thread that it uses for all of its tasks, which means that tasks run serially on an executor.
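To make the "one thread, many tasks" point concrete, here is a toy model in Python. It is only a sketch and not Storm's actual implementation: the names `Task` and `run_executor`, and the idea of an inbox of (task id, value) pairs, are assumptions made for illustration.

```python
import threading


class Task:
    """Hypothetical stand-in for one running instance of a spout or bolt."""

    def __init__(self, task_id):
        self.task_id = task_id

    def process(self, value, log):
        # Record which thread handled which value for which task.
        log.append((threading.current_thread().name, self.task_id, value))


def run_executor(name, tasks, inbox, log):
    """Model one executor: a single thread that serves all of its tasks.

    `inbox` is a list of (task_id, value) pairs. The thread handles them
    strictly one after another, so the tasks never run at the same time.
    """
    by_id = {task.task_id: task for task in tasks}

    def loop():
        for task_id, value in inbox:
            by_id[task_id].process(value, log)

    thread = threading.Thread(target=loop, name=name)
    thread.start()
    thread.join()


log = []
inbox = [(0, "a"), (1, "b"), (2, "c"), (0, "d")]
run_executor("executor-1", [Task(0), Task(1), Task(2)], inbox, log)

# Every entry was handled by the same thread, in arrival order.
for entry in log:
    print(entry)
```

Three tasks share one thread here, so adding a fourth task to this executor would add work to the same queue rather than add any concurrency.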

As I wrote in the article, please note that:

  • The number of executor threads can be changed after the topology has been started (see the storm rebalance command).
  • The number of tasks of a topology is static.

And by definition, there is the invariant #executors <= #tasks.

So one reason for having 2+ tasks per executor thread is to give you the flexibility to expand or scale up the topology through the storm rebalance command in the future without taking the topology offline. For instance, imagine you start out with a Storm cluster of 15 machines but already know that another 10 boxes will be added next week. Here you could opt to run the topology at the anticipated parallelism level of 25 machines already on the 15 initial boxes (which is of course slower than running on 25 boxes). Once the additional 10 boxes are integrated, you can storm rebalance the topology to make full use of all 25 boxes without any downtime.
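The arithmetic of that scenario can be sketched with a small model. This is an illustration under stated assumptions, not Storm's scheduler: it assumes one executor per machine and an even spread of tasks over executors, and `assign_tasks` is a hypothetical helper, not a Storm API.

```python
def assign_tasks(num_tasks, num_executors):
    """Spread a fixed number of tasks over executors as evenly as possible.

    Assumption for this sketch: an even spread. The invariant
    #executors <= #tasks is enforced explicitly.
    """
    if not 1 <= num_executors <= num_tasks:
        raise ValueError("need 1 <= #executors <= #tasks")
    base, extra = divmod(num_tasks, num_executors)
    assignment = []
    next_task = 0
    for i in range(num_executors):
        # The first `extra` executors take one task more than the rest.
        size = base + (1 if i < extra else 0)
        assignment.append(list(range(next_task, next_task + size)))
        next_task += size
    return assignment


NUM_TASKS = 25  # fixed when the topology is submitted, never changes

# Week 1: 15 machines, one executor each (assumption of this sketch).
before = assign_tasks(NUM_TASKS, 15)
# Week 2: after the rebalance onto 25 machines.
after = assign_tasks(NUM_TASKS, 25)

print([len(tasks) for tasks in before])  # 10 executors carry 2 tasks, 5 carry 1
print([len(tasks) for tasks in after])   # every executor carries exactly 1 task
```

Only the number of executors changes between the two calls. Had the topology been submitted with 15 tasks, `assign_tasks(15, 25)` would raise, which mirrors why the extra tasks have to be planned up front.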

Another reason to run 2+ tasks per executor is (primarily functional) testing. For instance, if your dev machine or CI server is only powerful enough to run, say, 2 executors alongside everything else running on the machine, you can still run 30 tasks (here: 15 per executor) to see whether the code works as expected.

PS: Note that Storm will actually spawn a few more threads behind the scenes. For instance, each executor has its own "send thread" that is responsible for handling outgoing tuples. There are also "system-level" background threads, e.g. for acking tuples, that run alongside "your" threads. IIRC, the Storm UI counts those acking threads in addition to "your" threads.
