Over Load


Wild Card


Hello, I recently had a problem with an overload on the server. I was using a PHP script to automatically check file-sharing sites to see if the links were still active. Here is the original problem report we received from the host.

Please remove the following hack: the automated link checker!

It is causing overload on the server, which is the reason the host suspended the site.

The PHP file causing the trouble is vbbot.php.

Here is what the host said:

Your site appears to be running a PHP script which is using large amounts of CPU and is continually running until we kill off the process. We have disabled the script /home/xxxxxxxxx/public_html/forums/vbbot.php by chmoding it to 0 to stop the issue. If you could please contact us when you believe you've solved the problem it would be much appreciated.

The file was deleted and uploaded again as autobot.php

I have disabled the script yet again and activated your account. Activation of this script again will result in termination of your account.

We have gotten rid of this, but I was wondering if you might have a look at the script and see how we can make it usable. We do not want to get booted from your service; we very much enjoy being a part of your organization.


There is no simple fix for a system like this. Let's say you have 10,000 threads, each with just 1 outside link to be checked. That turns into 10,000 outgoing requests. Run daily, that is 10,000 requests via cURL each day. It's pretty much a proxy, which we do not allow because of the CPU and bandwidth proxies use.

The script itself is pretty flawed. It just runs continually, and its way of reducing its load is to put itself to sleep. That does not change the fact that the process is sitting there, still using CPU time and memory. An ideal system would be one that stores the last checked thread and is run every few minutes. That way it does not sleep: it runs, does a small amount of work, and exits. That's a complete redesign of the system in order to make it more reasonable.
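To make the "store the last checked thread and run every few minutes" idea concrete, here is a minimal sketch. It is in Python only for illustration (the script in question is PHP), and every name in it (`run_batch`, `load_cursor`, `save_cursor`, `BATCH_SIZE`, the JSON cursor file) is hypothetical, not taken from vbbot.php. The real link check is passed in as `check_link`, so the sketch shows only the batching and the saved position.

```python
import json

BATCH_SIZE = 50  # hypothetical: how many links to check per scheduled run


def load_cursor(path):
    """Return the ID of the last thread checked, or 0 if no state exists yet."""
    try:
        with open(path) as f:
            return int(json.load(f)["last_checked_id"])
    except (FileNotFoundError, KeyError, TypeError, ValueError):
        return 0


def save_cursor(path, last_id):
    """Persist the position so the next run can pick up where this one ended."""
    with open(path, "w") as f:
        json.dump({"last_checked_id": last_id}, f)


def run_batch(threads, cursor_path, check_link, batch_size=BATCH_SIZE):
    """Check one small batch, save the position, and return dead thread IDs.

    threads: list of (thread_id, url) pairs sorted by thread_id.
    check_link: callable returning True if the URL is still alive.
    """
    last_id = load_cursor(cursor_path)
    batch = [t for t in threads if t[0] > last_id][:batch_size]
    if not batch:
        # Reached the end of the list: reset so the next run starts over.
        save_cursor(cursor_path, 0)
        return []
    dead = [thread_id for thread_id, url in batch if not check_link(url)]
    save_cursor(cursor_path, batch[-1][0])
    return dead
```

Each invocation (from cron, for example) does a bounded amount of work and then exits, so nothing sits in memory sleeping between checks. The total number of outgoing requests is unchanged; they are only spread out over time.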

It's one of those scripts where its advantages do not outweigh its disadvantages.


Wild Card:

Does your site have enough traffic that a significant portion of these threads are viewed between link checks? You might be better off running it only on demand, i.e. only when a thread is viewed, coupled with what Tony suggested: a "last checked" timestamp for each link, so that a link is checked at most once per specified interval.

Also, what method are you using to check the links? cURL?
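A rough sketch of the on-demand idea, in Python for illustration only. The names (`on_thread_view`, `links_due`, `CHECK_INTERVAL`) and the in-memory `last_checked` dictionary are assumptions; in a real forum the timestamps would live in the database. The actual check is passed in as `check_link`.

```python
import time

CHECK_INTERVAL = 24 * 60 * 60  # hypothetical: recheck a link at most once a day


def links_due(links, last_checked, now, interval=CHECK_INTERVAL):
    """Return the links that were never checked or whose last check is too old."""
    return [
        url for url in links
        if now - last_checked.get(url, float("-inf")) >= interval
    ]


def on_thread_view(links, last_checked, check_link, now=None):
    """Called when a thread is viewed; checks only the links that are due.

    last_checked: dict mapping URL -> timestamp of its last check (updated here).
    Returns a dict mapping each URL checked in this call -> True/False (alive).
    """
    now = time.time() if now is None else now
    results = {}
    for url in links_due(links, last_checked, now):
        results[url] = check_link(url)
        last_checked[url] = now
    return results
```

With this shape, threads nobody views never trigger a request, and a popular thread triggers at most one check per link per interval.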


Thank you, guys, for taking the time to have a look at this. The way the script is set up, it does timestamp the post and would not check it again until the next day. We did run it on an on-demand basis, but we could have made a cron job to check once a day. The problem, I think, was that it was the first time it ran: it had so many links to check, dead ones to bin, and automatic PMs to send to the original posters that the server went nuts and overloaded. I stripped out the PMs, but it was still too server-intensive. I would like to have an opportunity to rewrite it, and maybe set up a time when we can test it with your permission and get your feedback on whether this script is at all plausible and server-friendly.


There is really no way this thing is ever going to be server-friendly. Every check of whether a URL still works is intensive. If you run it on 1000 topics a day, that's quite a bit, and it only gets worse as you add more topics.

So, just to make things clear: the intensive part is the portion where you need to see if the URL is still working. fopen or cURL, it doesn't matter. They're all costly when you go outside onto the internet.
