Am I correctly extracting JPEG binary data from this mysqldump?

Posted by Glenn on Stack Overflow
Published on 2010-04-27T18:39:09Z Indexed on 2010/04/28 0:23 UTC

I have a very old .sql backup of a vBulletin site that I ran around 8 years ago, and I am trying to view the file attachments stored in the DB. The script below extracts them all. I verified that the extracted files are JPEGs by hex dumping them and checking for the SOI (start of image) and EOI (end of image) bytes (FFD8 and FFD9, respectively), as described on the JPEG wiki page.
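For reference, the SOI/EOI check described above can be sketched in Python roughly as follows. The helper name `has_jpeg_framing` is made up for illustration, and passing the check only shows that the first and last two bytes look right, not that everything in between is a valid JPEG stream.

```python
SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def has_jpeg_framing(data):
    """Return True if data begins with SOI and ends with EOI."""
    return data.startswith(SOI) and data.endswith(EOI)
```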

But when I try to open them with evince, I get this message: "Error interpreting JPEG image file (JPEG datastream contains no image)".

What could be going on here?

Some background info:

  • The SQL dump is around 8 years old
  • vBulletin 2.x was the software that stored the data
  • Most likely PHP 4 was used
  • Most likely MySQL 4.0, possibly even 3.x
  • The column these attachments are stored in has the datatype mediumtext

My Python 3.1 script:

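#!/usr/bin/env python3.1

import re

# Strip the INSERT prefix and the three leading numeric columns.
trim_l = re.compile(br"""^INSERT INTO attachment VALUES\('\d+', '\d+', '\d+', '(.+)""")
# Strip the two trailing numeric columns and the closing ");".
trim_r = re.compile(br"""(.+)', '\d+', '\d+'\);$""")
# Split what is left into the filename and the raw column value.
extractor = re.compile(br"""^(.*(?:\.jpe?g|\.gif|\.bmp))', '(.+)$""")

with open('attachments.sql', 'rb') as fh:
    for line in fh:
        left = trim_l.match(line)
        if not left:
            continue  # not an attachment INSERT line
        middle = trim_r.match(left.group(1))
        if not middle:
            continue
        found = extractor.match(middle.group(1))
        if not found:
            continue
        name, data = found.groups()
        try:
            filename = 'files/%s' % str(name, 'UTF-8')
        except UnicodeDecodeError:
            continue  # skip names that are not valid UTF-8
        # The column value is written exactly as it appears in the dump.
        with open(filename, 'wb') as ah:
            ah.write(data)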

Update: The JPEG wiki page says that an FF byte introduces a section marker, with the next byte indicating the section type. I see some that are not listed on the wiki page (specifically, a lot of 5C bytes, so FF5C). But that list only covers "common markers", so I'm trying to find a more complete one. Any guidance here would also be appreciated.
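One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: 5C is the ASCII code for a backslash, and mysqldump writes string values with backslash escapes (for example `\0`, `\n`, `\r`, `\\`, `\'`, `\"` and `\Z`). If the column value is written to disk still escaped, an FF byte followed by an escaped character would show up as FF5C. Below is a minimal sketch of undoing those escapes. The helper name `unescape_mysql` is hypothetical, and the escape table is my reading of MySQL's string-literal rules, so it should be checked against the MySQL documentation for the server version involved.

```python
import re

# Backslash escapes used in MySQL string literals (assumed table).
_ESCAPES = {
    b"0": b"\x00",   # NUL byte
    b"n": b"\n",     # newline
    b"r": b"\r",     # carriage return
    b"t": b"\t",     # tab
    b"b": b"\x08",   # backspace
    b"Z": b"\x1a",   # Control-Z
    b"\\": b"\\",    # literal backslash
    b"'": b"'",      # single quote
    b'"': b'"',      # double quote
}

# A backslash followed by any single byte, including a newline.
_ESCAPE_RE = re.compile(br"\\(.)", re.DOTALL)


def unescape_mysql(data):
    """Replace each backslash escape in data with the byte it stands for."""
    def replace(match):
        char = match.group(1)
        # Unknown escapes fall back to the escaped character itself.
        return _ESCAPES.get(char, char)
    return _ESCAPE_RE.sub(replace, data)
```

In the script above this would mean writing `unescape_mysql(data)` instead of `data`. Because the input is scanned left to right in a single pass, an escaped backslash followed by a plain `n` is not mistaken for an escaped newline.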

© Stack Overflow or respective owner
