Entire proxmox Xen Node has grey question marks + database container gone
First, I've recently taken over the management of a Proxmox cluster that I have no previous experience with (I'm completely new to cluster management, but not too bad at Linux).
pve-manager/5.1-46/ae8241d4 (running kernel: 4.13.13-6-pve)
I have 2 xen nodes which run a number of containers and VMs. Yesterday, a container on xen2 that runs a MySQL database stopped responding. I was able to log in to the container via SSH and tried to restart MySQL, only to get an error along the lines of being unable to connect to mysql.sock. So I decided to simply shut down the container and start it back up. I chose 'Shutdown' in the Proxmox UI for the container, and it shut down. Then I clicked 'Start', at which point the Proxmox logs recorded:
CT 110 - Start ERROR: command 'systemctl start pve-container@110' failed: exit code 1
So I tried running 'systemctl start ...' via SSH. It takes a while, and then I get the following:
Job for pve-container@110.service failed because a timeout was exceeded.
See "systemctl status pve-container@110.service" and "journalctl -xe" for details.
Here is the output of 'systemctl status ...':
● pve-container@110.service - PVE LXC Container: 110
   Loaded: loaded (/lib/systemd/system/pve-container@.service; static; vendor preset: enabled)
   Active: failed (Result: timeout) since Thu 2018-06-07 08:35:22 BST; 43s ago
     Docs: man:lxc-start
           man:lxc
           man:pct
  Process: 1603366 ExecStart=/usr/bin/lxc-start -n 110 (code=killed, signal=TERM)
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/system-pvex2dcontainer.slice/pve-container@110.service
           └─1532500 [lxc monitor] /var/lib/lxc 110
Jun 07 08:33:52 xen2 systemd[1]: Starting PVE LXC Container: 110...
Jun 07 08:35:22 xen2 systemd[1]: pve-container@110.service: Start operation timed out. Terminating.
Jun 07 08:35:22 xen2 systemd[1]: Failed to start PVE LXC Container: 110.
Jun 07 08:35:22 xen2 systemd[1]: pve-container@110.service: Unit entered failed state.
Jun 07 08:35:22 xen2 systemd[1]: pve-container@110.service: Failed with result 'timeout'.
and 'journalctl -xe':
Jun 07 08:35:22 xen2 systemd[1]: pve-container@110.service: Start operation timed out. Terminating.
Jun 07 08:35:22 xen2 systemd[1]: Failed to start PVE LXC Container: 110.
-- Subject: Unit pve-container@110.service has failed
-- Defined-By: systemd
--
-- Unit pve-container@110.service has failed.
--
-- The result is failed.
Jun 07 08:35:22 xen2 systemd[1]: pve-container@110.service: Unit entered failed state.
Jun 07 08:35:22 xen2 systemd[1]: pve-container@110.service: Failed with result 'timeout'.
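When the same check has to be repeated for several containers, the failure reason can be pulled out of the status text instead of read by eye. This is only a minimal sketch: `unit_result` is a hypothetical helper name, and it assumes the "Active: failed (Result: ...)" line format shown in the status output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox or systemd): reads `systemctl status`
# text on stdin and prints the value inside "(Result: ...)" on the Active: line.
# Assumes the line format shown in the output above; prints nothing if absent.
unit_result() {
    sed -n 's/^[[:space:]]*Active: .*(Result: \([^)]*\)).*/\1/p'
}

# Possible usage on the node (left commented out so nothing runs on load):
#   systemctl status pve-container@110.service | unit_result
```

For the output quoted above this would print the word after "Result:", which tells you systemd gave up waiting rather than that lxc-start itself reported an error.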
Shortly after I first attempted to restart the container, the entire xen2 node started displaying grey question marks alongside all its VMs/containers, and they lost their labels (see screenshot):
Despite this, all the other VMs/containers on xen2 are still functioning fine. I then ran the following commands to see what would happen:
service pvedaemon restart (nothing changed)
service pveproxy restart (nothing changed)
service pvestatd restart (the VMs, but not the containers, started showing their names in the Proxmox UI, but this lasted only 10-15 minutes)
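Before and after restarts like these, it can help to record what systemd itself reports for each daemon. The following is a rough sketch, not an official Proxmox tool: `pve_service_states` is a hypothetical function name, and the only real command it relies on is `systemctl is-active`.

```shell
#!/bin/sh
# Hypothetical helper: print "<unit>: <state>" for each unit name given,
# using `systemctl is-active`. Falls back to "unknown" if no state is printed.
pve_service_states() {
    for unit in "$@"; do
        # is-active exits non-zero for inactive/failed units, so ignore the status
        state=$(systemctl is-active "$unit" 2>/dev/null) || true
        printf '%s: %s\n' "$unit" "${state:-unknown}"
    done
}

# Possible usage on the node (left commented out so nothing runs on load):
#   pve_service_states pvedaemon pveproxy pvestatd
```

Running it once right after a restart and again 15 minutes later would show whether pvestatd itself stops or whether it keeps running while the UI loses its data.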
I'm hesitant to upgrade or restart the entire xen node because I don't know its configuration well or what pitfalls may lie ahead, and it's business critical to keep at least something running. Furthermore, I've gone through /var/log/syslog and didn't see anything that indicated why the container crashed.
Ideally, I want to achieve:
Determine why the database container crashed (110)
Successfully start up the database container again
Determine why the xen2 node isn't reporting data to the UI about its VMs/containers
Fix the reporting data in the UI for the node
Again, please appreciate that I'm new to Proxmox, but I do know my way around Linux.
Thank you for any tips/knowledge on troubleshooting this problem. If there is any other info you'd like me to share, please let me know.
Cheers,
David
linux mysql cluster proxmox percona
asked Jun 7 '18 at 8:08
David
1 Answer
I just stumbled over the same problem (one cluster node showed only grey question marks and the containers lost their labels). In my case this happened shortly after a Proxmox update (from 5.3 to 5.4). After trying similar things to the OP, I finally figured out that my sshd was no longer listening on port 22. After I restarted sshd, things did not recover immediately but took about 15 minutes. Then everything was fine again.
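A quick way to check for the same condition is to look at the listening TCP sockets on the affected node. The sketch below is an illustration under assumptions, not something taken from the setup described above: `port_listening` is a hypothetical helper name, and it expects the output of `ss -tln` (from iproute2) on stdin, where the fourth column is the local address and port.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the `ss -tln` output on stdin
# contains a LISTEN socket whose local address ends in ":<port>".
# $1 = port number to look for.
port_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Possible usage on the node (left commented out so nothing runs on load):
#   if ss -tln | port_listening 22; then
#       echo "something is listening on port 22"
#   else
#       echo "nothing is listening on port 22"
#   fi
```

The match is anchored on the end of the address column, so a daemon on port 2222 is not mistaken for one on port 22.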
answered 1 hour ago
Argl Bargl