Sept 27th, 2008

DALLAS CLUSTER LINUX ACCOUNTS "Out of Disk Space" errors when FTP'ing

The last week has been rough for customers on this Linux box, but the good news is that the issue should be resolved shortly. We felt it was important to explain the situation, because adding a hard drive usually sounds like a simple thing to do. What we were facing was not so simple.

The server in question is web.mywebsitepanel.com. CPU is fine, with lots of room to grow. RAM is fine, with lots of room. In fact, disk space is also fine, with 180+GB still free on the server. This is why we did not anticipate any disk space problems for the foreseeable future.

The issue concerns only the amount of disk space that Hsphere assigned to the web content partition (called the /hsphere partition). It was capped when psoft set up the server, and apparently it cannot be altered once set. This is something we were NOT told. In fact, after perusing the install docs, I still cannot find this information. We contacted psoft to see what we could do to use the space that was already free on the server (over 180GB worth!). Again, they cannot alter the cap without creating a whole new server. That would be quite a job and expense, so we asked for psoft's help in doing things a bit differently. The plan was to add a new large disk drive to the server, copy the /hsphere partition to the new drive, and reconfigure Hsphere so that it treats the new drive as the /hsphere partition. Psoft techs said it should work, so we went ahead.

We added the new drive and formatted it, which took about 5 hours. Then we started the copy process, which would be a job of 24 hours or longer. Halfway through, the copy process hung. We made some adjustments and tried again. 14 hours into copying, it hung again. We asked for psoft's help to find out why this was happening, which took 2 days. Things were worked out and we started the process again. 36 hours later it was all working and live.

But Hsphere did not work properly with the new setup, which was not apparent at first. Disk quota figures were not being reported correctly, and we had to revert to the original /hsphere partition. As a result, customer files that had been uploaded while the new drive was live reverted to their previous state. With more assistance from psoft we tried again and redirected to the new disk drive. This took a few days. All seemed to be working well, but it was no victory. In the last few days, customers' scripts were getting server 500 errors. At first we thought it affected only Miva sites, as no other complaints came in. We spent a couple of days working with Miva Corporation on this, and then other non-Miva customers started telling us about server 500 errors on some of their scripts.

At this point, we had had enough. So yesterday we moved everything back to the original /hsphere partition. We created a new partition on another server that has ample space. Today, we will be moving all of the largest web accounts to this new partition, and this will resolve the issue.

We will update customers below once we reach this step.

Since moving to the new cluster in Dallas back in January of this year, we have had no other major incidents or issues, and you can expect that to continue. We will never try a workaround with Hsphere again, even with the company that makes the Hsphere control panel software helping us. It was not an issue we foresaw, and it was unfortunate.

We apologize for the inconvenience this issue has caused over the last 1-2 weeks.

Don Gerrity
Owner, Web Service Specialist