My question is which scenario is right. Can my cluster tolerate one failure of the name node or secondary name node ((3-1)/2 = 1 possible failure) plus two failures of data nodes that do not host a journal node (because I have three data nodes and a replication factor that covers all of them)? Or can it tolerate only one failure in total, whether it is a data node failure or a NameNode or Secondary NameNode failure? | Your first statement is correct. Your 5-node cluster can tolerate the failure of 1 name node and 2 data nodes and still work properly. In that case, 1 active name node and 1 working data node remain available in the cluster. |
What does failure mean when it happens (is it the inability to continue working)? | Once the tolerance threshold is reached (one name node failure and two data node failures), you will have only one active name node and one working data node. The cluster then fails completely if either of the following happens: 1. If the working data node fails, the existing data will be lost. You can still add new data nodes and start working with new data. 2. If the active name node fails, the cluster becomes unusable and stays down until all the services are restarted. |
Can a Syncfusion cluster restart its services after this scenario? | Yes. If all the nodes are restarted properly, we can regain access to the cluster without any data loss. |
Is it impossible to restart the whole cluster once all nodes are working again? | If all the cluster node machines and the Big Data Agent are restarted and running properly, we can restart the whole cluster using Cluster Manager. To do this, follow these steps: 1. In Cluster Manager, open the Management page, where the status of all services is displayed. The following screenshot shows that the installer agent service is running on all nodes and the cluster's services are in the dead state. 2. Click Start all Services. 3. Once the cluster services have started properly, you can see the status as shown below. |
Or should I re-add the name node, secondary name node, and another data node to this cluster (I mean at least three journal nodes) and treat the old ones as gone? | It is not required to add all the nodes again if the cluster is brought up with Cluster Manager. |
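
The tolerance arithmetic discussed in the first question can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical and are not part of Syncfusion or Hadoop, the journal nodes are assumed to form a majority quorum, and the data node figure assumes every block is replicated to all three data nodes. It also ignores the fact that a data node hosting a journal node counts against both limits.

```python
def quorum_tolerance(members: int) -> int:
    """Failures a majority-quorum group of `members` nodes can survive.

    Hypothetical helper: a majority must stay alive, so the group
    tolerates floor((members - 1) / 2) failures.
    """
    return (members - 1) // 2


def replica_tolerance(data_nodes: int, replication_factor: int) -> int:
    """Data node failures that still leave one copy of every block.

    Hypothetical helper: assumes each block has
    min(data_nodes, replication_factor) replicas on distinct nodes.
    """
    return min(data_nodes, replication_factor) - 1


# Values taken from the question: 3 journal nodes, 3 data nodes,
# and a replication factor that covers all data nodes.
journal_failures = quorum_tolerance(3)
data_node_failures = replica_tolerance(3, 3)

print("Tolerated journal node failures:", journal_failures)
print("Tolerated data node failures:", data_node_failures)
```

With the values from the question, the first result matches the (3-1)/2 = 1 figure above and the second matches the two tolerated data node failures.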
My Question |
Your Answer |
Comment |
Is it impossible to restart the whole cluster once all nodes are working again? |
If all the cluster node machines and the Big Data Agent are restarted and running properly, we can restart the whole cluster using Cluster Manager. |
All cluster node machines and Big Data Agents are restarted and running properly, and of course I tried to start all the services. The result was that the name node, secondary name node, and first data node appear as dead. datanode2 and datanode3 (the data nodes that are not in the default high availability group, that is, not grouped with the name node and secondary name node) appear as active at first, but after some time they also appear as dead, presumably because the name node and secondary name node are dead.
|
Or should I re-add the name node, secondary name node, and another data node to this cluster (I mean at least three journal nodes) and treat the old ones as gone?
|
It is not required to add all the nodes again if the cluster is brought up with Cluster Manager. |
If it does not come up, should I remove the name node, secondary name node, and first data node and then re-add them?
I attached the log. Can you please check it and tell me whether I can bring the cluster up again before I try this scenario? This scenario will of course lose one data node, the data it contains, and the log files on the name node. What other files would be lost? |
Restarting all VMs together |
|
I want to mention that after restarting all VMs together (the state I described at first), the Big Data Agent on the name node got corrupted, and I was not able to stop or restart its service from services.msc. I uninstalled and reinstalled it, and everything was OK, but why did it get corrupted? |