File Removal after Database Comparison

on 03.04.2008 23:13:46 by Chris Owens

Hi all,

I'm trying to figure out the logic (and therefore the code) to enable the
clean-up of a site I've inherited. Basically I have a folder that has
around 6,666 images in it that are loosely related to the information
held in a database. I say loosely because the images are uploaded (via
FTP) manually. There is no direct reference to the JPEGs in the
database, as the code on the site references them via:





Each record in the database has three related images:

Image of the front - recordid_f.jpg
Image of the side - recordid_s.jpg
Image of the back - recordid_b.jpg

Now over time records have been removed from the database, but the
images haven't been removed and as such, short of removing all images,
there is no easy way that I can think of to clear the folder (several
GB of data) whilst keeping the images that have related records in the
db.

My question is: How can I through code remove any orphaned images (that
aren't needed anymore), whilst keeping the images that still relate to
records in the database?

Any assistance/pointers/snippets would be much appreciated.

Kind regards,

Chris

Re: File Removal after Database Comparison

on 04.04.2008 04:43:24 by Preventer of Work

Chris Owens wrote:
> Hi all,
>
> I'm trying to figure out the logic (and therefore the code) to enable the
> clean-up of a site I've inherited. Basically I have a folder that has
> around 6,666 images in it that are loosely related to the information
> held in a database. I say loosely because the images are uploaded (via
> FTP) manually. There is no direct reference to the JPEGs in the
> database, as the code on the site references them via:
>
>
>
>
>
> Each record in the database has three related images:
>
> Image of the front - recordid_f.jpg
> Image of the side - recordid_s.jpg
> Image of the back - recordid_b.jpg

If the image file names really contain the database id, like
2177_s.jpg
2177_b.jpg
.... etc
it would be a matter of (think Unix, but similar in a Windows shell)
ls -1 *_s.jpg > fnames.txt
to capture all file names.
After that, scratch some code together to collect the numbers
from each fnames.txt entry (easy in any language).
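
As a rough illustration of collecting the numbers, here is a minimal Python sketch. It assumes every name of interest in fnames.txt has the form <id>_s.jpg; the function name ids_from_listing is invented for this example and does not come from the thread.

```python
import re

# Assumed naming scheme: "<recordid>_s.jpg", e.g. "2177_s.jpg".
SIDE_IMAGE = re.compile(r"^(.+)_s\.jpg$")


def ids_from_listing(lines):
    """Return the record ids found in an iterable of file names."""
    ids = []
    for line in lines:
        match = SIDE_IMAGE.match(line.strip())
        if match:  # blank lines and unrelated names are skipped
            ids.append(match.group(1))
    return ids
```

Passing an open fnames.txt (or any list of names) to ids_from_listing gives the ids as strings, ready to be checked against the database.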

Create a new folder nearby.

Write a script that does this:
open database
for each number in list
.. query database for the number
.. if in the db:
.... stringify the number to recreate filenames
.... copy files to new dir

When this is done, ONLY the images you want to keep will be
in the new folder. Once you are sure it worked, delete all originals.
This is the safe way.
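
The copy-first approach might be sketched like this in Python. The names copy_live_images and record_exists are hypothetical: record_exists stands in for whatever database lookup is used, and the _f/_s/_b suffixes are taken from the original post.

```python
import shutil
from pathlib import Path

SUFFIXES = ("f", "s", "b")  # front, side, back


def copy_live_images(src_dir, dst_dir, ids, record_exists):
    """Copy the images of every id that is still in the database.

    record_exists is a caller-supplied function (hypothetical here) that
    takes an id and returns True if the database still has that record.
    Returns the names of the files that were copied.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for record_id in ids:
        if not record_exists(record_id):
            continue  # orphaned: leave it behind in the old folder
        for suffix in SUFFIXES:
            image = src / f"{record_id}_{suffix}.jpg"
            if image.is_file():  # a record may lack one of the three
                shutil.copy2(image, dst / image.name)
                copied.append(image.name)
    return copied
```

Nothing in the source folder is touched, so the result can be inspected before any original is deleted.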

Faster (but test carefully first):
open database
for each number in list
.. query database for the number
.. if NOT in the db:
.... stringify the number to recreate filenames
.... delete files
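
A hedged Python sketch of the delete variant follows, with a dry run as the default so it can be tested carefully first. It assumes live_ids is a set of id strings already fetched from the database (for instance with a single SELECT of the id column); find_orphans and delete_orphans are made-up names. Unlike a listing of *_s.jpg only, it looks at all three suffixes, so an id whose side image is missing is still caught.

```python
import re
from pathlib import Path

# Assumed naming scheme: "<recordid>_f.jpg", "<recordid>_s.jpg", "<recordid>_b.jpg".
IMAGE_NAME = re.compile(r"^(.+)_[fsb]\.jpg$")


def find_orphans(folder, live_ids):
    """Return image paths in folder whose id is not in live_ids."""
    orphans = []
    for path in sorted(Path(folder).iterdir()):
        match = IMAGE_NAME.match(path.name)
        if match and path.is_file() and match.group(1) not in live_ids:
            orphans.append(path)
    return orphans


def delete_orphans(folder, live_ids, dry_run=True):
    """Report orphaned images; remove them only when dry_run is False."""
    orphans = find_orphans(folder, live_ids)
    for path in orphans:
        print(("would delete " if dry_run else "deleting ") + path.name)
        if not dry_run:
            path.unlink()
    return orphans
```

Ids are compared as strings here, so numeric ids read from the database would need converting with str() first.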


Re: File Removal after Database Comparison

on 04.04.2008 19:30:00 by Shion

Chris Owens wrote:
> Hi all,
>
> I'm trying to figure out the logic (and therefore the code) to enable the
> clean-up of a site I've inherited. Basically I have a folder that has
> around 6,666 images in it that are loosely related to the information
> held in a database. I say loosely because the images are uploaded (via
> FTP) manually. There is no direct reference to the JPEGs in the
> database, as the code on the site references them via:
>
>
>
>
>
> Each record in the database has three related images:
>
> Image of the front - recordid_f.jpg
> Image of the side - recordid_s.jpg
> Image of the back - recordid_b.jpg
>
> Now over time records have been removed from the database, but the
> images haven't been removed and as such, short of removing all images,
> there is no easy way that I can think of to clear the folder (several GB
> of data) whilst keeping the images that have related records in the db.
>
> My question is: How can I through code remove any orphaned images (that
> aren't needed anymore), whilst keeping the images that still relate to
> records in the database?
>
> Any assistance/pointers/snippets would be much appreciated.
>
> Kind regards,
>
> Chris
>

<?php
// Flush any active output buffer so progress appears while the loop runs.
if (ob_get_level()) {
    ob_end_flush();
}
$mysqli = new mysqli('localhost', 'login', 'pass', 'db');
// A row comes back only if the record still exists.
$stmt = $mysqli->prepare('SELECT 1 FROM table_name WHERE recordid = ? LIMIT 1');
// In a browser, end each line with <br />; on the command line a newline is enough.
$eol = (empty($_SERVER['SERVER_ADDR']) ? '' : '<br />') . "\n";
// The side images (*_s.jpg) give one file name per record id.
foreach (glob('*_s.jpg') as $file) {
    $id = basename($file, '_s.jpg');
    echo "Checking {$id}: ";
    $stmt->bind_param('s', $id);
    $stmt->execute();
    $stmt->store_result();
    if (!$stmt->num_rows) {
        echo "delete old files.";
        // delete all three files (front, side, back)
        foreach (array('f', 's', 'b') as $side) {
            if (is_file("{$id}_{$side}.jpg")) {
                unlink("{$id}_{$side}.jpg");
            }
        }
    } else {
        echo "Still active, we keep them";
    }
    echo $eol;
}
$stmt->close();
$mysqli->close();
?>

This should take care of the whole thing for you. I do suggest you run it
once with the unlink() line commented out; that way
you can check whether the deletion would be correct. This one uses mysqli, but if
you don't use it, or even use another database, just modify the code a little bit.

You could change the unlink() line to a rename() and move the old images to a new
directory instead; that way, if there is something wrong with the script, you
still have the images and can move back those that were wrongly removed.
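
The move-instead-of-delete idea can be sketched as below. It is written in Python rather than PHP so that it stays separate from the script above; move_images is a hypothetical helper, and the directory names in the usage note are placeholders.

```python
import shutil
from pathlib import Path


def move_images(names, from_dir, to_dir):
    """Move the named files from from_dir to to_dir; return those moved."""
    source, target = Path(from_dir), Path(to_dir)
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in names:
        if (source / name).is_file():  # silently skip names that are absent
            shutil.move(str(source / name), str(target / name))
            moved.append(name)
    return moved
```

Calling move_images(names, "images", "images_removed") sets the suspected orphans aside, and swapping the two directory arguments puts back any that turn out to have been moved by mistake.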

--

//Aho