Am I correctly extracting JPEG binary data from this mysqldump?

Posted by Glenn on Stack Overflow
Published on 2010-04-27T18:39:09Z Indexed on 2010/04/28 0:23 UTC

I have a very old .sql backup of a vBulletin site that I ran around 8 years ago. I am trying to view the file attachments that are stored in the DB. The script below extracts them all, and I verified that the extracted files look like JPEGs by hex dumping them and checking for the SOI (start of image) and EOI (end of image) bytes (FFD8 and FFD9, respectively), as described on the JPEG wiki page.

But when I try to open them with evince, I get this message: "Error interpreting JPEG image file (JPEG datastream contains no image)".

What could be going on here?
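
The SOI/EOI check described above can be sketched in a few lines of Python. This is only an illustration; the function name `has_jpeg_markers` is hypothetical and not part of the original script.

```python
def has_jpeg_markers(data: bytes) -> bool:
    """Return True if data starts with SOI (FFD8) and ends with EOI (FFD9)."""
    return data.startswith(b"\xff\xd8") and data.endswith(b"\xff\xd9")
```

Note that this only checks the first and last two bytes. A file can pass it and still be undecodable if anything in between was altered on the way into or out of the dump.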

Some background info:

the SQL dump is around 8 years old
vBulletin 2.x was the software that stored the data
most likely PHP 4 was used
most likely MySQL 4.0, possibly even 3.x
the attachments are stored in a column of type mediumtext

My Python 3.1 script:

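#!/usr/bin/env python3.1

import re

# Drop the INSERT prefix and the three leading numeric columns.
trim_l = re.compile(br"""^INSERT INTO attachment VALUES\('\d+', '\d+', '\d+', '(.+)""")
# Drop the two trailing numeric columns and the closing ");".
trim_r = re.compile(br"""(.+)', '\d+', '\d+'\);$""")
# Split the remainder into the filename and the attachment data.
extractor = re.compile(br"""^(.*(?:\.jpe?g|\.gif|\.bmp))', '(.+)$""")

with open('attachments.sql', 'rb') as fh:
    for line in fh:
        match = trim_l.search(line)
        if not match:
            continue  # not an attachment INSERT line
        match = trim_r.search(match.group(1))
        if not match:
            continue
        match = extractor.search(match.group(1))
        if not match:
            continue  # not a .jpg/.jpeg/.gif/.bmp attachment
        name, data = match.groups()
        try:
            filename = 'files/%s' % str(name, 'UTF-8')
        except UnicodeDecodeError:
            continue  # skip attachments whose names are not valid UTF-8
        # The with block closes the output file even if the write fails.
        with open(filename, 'wb') as ah:
            ah.write(data)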

Update: The JPEG wiki page says FF bytes are section markers, with the next byte indicating the section type. I see some marker types that are not listed on the wiki page (specifically, I see a lot of 5C bytes following FF, so FF5C). But the list only covers "common markers", so I'm trying to find a more complete one. Any guidance here would also be appreciated.
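
One thing worth checking, offered as a possibility rather than a confirmed diagnosis: 0x5C is the ASCII backslash, which MySQL uses as the escape character inside string literals. JPEG entropy-coded data follows each literal FF byte with a 00 byte, and a NUL is one of the bytes that gets escaped in a dump (as `\0`), so a frequent FF5C could simply be FF followed by an escaped 00. The sketch below undoes the backslash escapes MySQL documents for string literals. The names `mysql_unescape`, `_ESCAPES` and `_ESCAPE_RE` are hypothetical, and it assumes the dump was written with standard backslash escaping.

```python
import re

# Escape letters that stand for a different byte in MySQL string literals.
_ESCAPES = {
    b"0": b"\x00",  # NUL
    b"b": b"\x08",  # backspace
    b"n": b"\n",    # newline
    b"r": b"\r",    # carriage return
    b"t": b"\t",    # tab
    b"Z": b"\x1a",  # Ctrl-Z
}

# A backslash followed by any single byte (DOTALL so "." also matches newline).
_ESCAPE_RE = re.compile(br"\\(.)", re.DOTALL)


def mysql_unescape(data: bytes) -> bytes:
    """Reverse backslash escaping in a dumped string value.

    Any other escaped byte (such as \\\\ or \\') maps to itself. The
    pattern-matching sequences \\% and \\_ are not special-cased, on the
    assumption that mysqldump does not emit them.
    """
    return _ESCAPE_RE.sub(
        lambda m: _ESCAPES.get(m.group(1), m.group(1)), data
    )
```

If this turns out to be the cause, the script would write `mysql_unescape(data)` instead of `data`. Whether the result is a valid image also depends on the bytes having survived storage in a mediumtext column, which this sketch cannot verify.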

Filed under: jpeg, mysql, python