Stumbling Toward 'Awesomeness'

A Technical Art Blog

Monday, April 19, 2010

Dealing with File Sequences in Python



I have been parsing through other people’s files a lot lately, and finally took the time to write a little function that gives me general information about a sequence of files. It uses regex to yank the numeric part out of a filename and figure out the padding, then uses glob to tell you how many files are in the sequence. Here’s the code:

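import glob
import os
import re

#returns [base name, padding, filetype, number of files, first file, last file]
def getSeqInfo(filePath):
	folder = os.path.dirname(filePath)
	fileName = os.path.basename(filePath)
	#the last run of digits in the name is the frame number
	segNum = re.findall(r'\d+', fileName)[-1]
	numPad = len(segNum)
	#rpartition splits on the last occurrence, so digits earlier in the name are left alone
	baseName, _, suffix = fileName.rpartition(segNum)
	fileType = fileName.split('.')[-1]
	#one [0-9] per digit, so only numbers with the same padding match
	globString = baseName + '[0-9]' * numPad + suffix
	#glob makes no promise about order, so sort before picking first and last
	theGlob = sorted(glob.glob(os.path.join(folder, globString)))
	numFrames = len(theGlob)
	firstFrame = theGlob[0]
	lastFrame = theGlob[-1]
	return [baseName, numPad, fileType, numFrames, firstFrame, lastFrame]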

Here is an example of usage:

print(getSeqInfo('E:\\data\\data\\Games\\Project\\CaptureOutput\\Frame000547.jpg'))
>>['Frame', 6, 'jpg', 994, 'E:\\data\\data\\Games\\Project\\CaptureOutput\\Frame000000.jpg', 'E:\\data\\data\\Games\\Project\\CaptureOutput\\Frame000993.jpg']
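
To make the individual steps easier to follow, here is a minimal sketch of just the string handling, run on the example filename with no disk access. The helper name `splitSeqName` is made up for this illustration and is not part of the function above. The sketch also assumes you only want digits to match, so it builds the glob pattern with one `[0-9]` per digit of padding.

```python
import re

def splitSeqName(fileName):
    """Split a name like 'Frame000547.jpg' around its last run of digits."""
    digitRuns = re.findall(r'\d+', fileName)
    if not digitRuns:
        raise ValueError('no digits found in %r' % fileName)
    segNum = digitRuns[-1]
    # rpartition cuts at the last occurrence, so 'v01_01.jpg' keeps 'v01_' as its base
    baseName, _, suffix = fileName.rpartition(segNum)
    return baseName, segNum, suffix

baseName, segNum, suffix = splitSeqName('Frame000547.jpg')
numPad = len(segNum)                               # 6
globString = baseName + '[0-9]' * numPad + suffix  # pattern handed to glob.glob
print(globString)  # Frame[0-9][0-9][0-9][0-9][0-9][0-9].jpg
```

Keeping this part separate from the glob call means you can check the parsing against awkward names without needing the files on disk.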

I know this is pretty simple, but I looked around a bit online and didn’t see anything readily available showing how to deal with differently numbered file sets. I have needed something like this for a while, something that will work with anything from OBJs sent by external contractors to images from After Effects…
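
When a folder holds several differently numbered sets at once, the same idea can be applied to a whole listing. What follows is only a rough sketch under a few assumptions: `groupSequences` and `SEQ_RE` are hypothetical names, the input is a plain list of filenames (for example from `os.listdir`), and two files belong to the same set when their base name, padding, and trailing text all agree.

```python
import re
from collections import defaultdict

# prefix, then the last run of digits, then whatever non-digit text follows it
SEQ_RE = re.compile(r'^(.*?)(\d+)(\D*)$')

def groupSequences(names):
    """Map (baseName, numPad, suffix) to the sorted frame numbers found."""
    groups = defaultdict(list)
    for name in names:
        match = SEQ_RE.match(name)
        if match is None:
            continue  # no digits at all, so not part of a numbered set
        baseName, segNum, suffix = match.groups()
        groups[(baseName, len(segNum), suffix)].append(int(segNum))
    return {key: sorted(frames) for key, frames in groups.items()}

listing = ['Frame000001.jpg', 'Frame000000.jpg', 'mesh_03.obj', 'mesh_01.obj', 'notes.txt']
for (baseName, numPad, suffix), frames in sorted(groupSequences(listing).items()):
    # base name, padding, suffix, number of files, first frame, last frame
    print(baseName, numPad, suffix, len(frames), frames[0], frames[-1])
```

Note that any name ending in a digit before its extension, such as a lone `clip.mp4`, would be reported as a one-file set here, so a real tool would probably want a minimum file count.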

posted by admin at 6:49 PM  

5 Comments »

  1. Hi Chris, thanks for sharing! Not sure you’ll be interested, but for the sake of it, as I’m also playing with a lot of files in Python, I wanted to show how I’d have done it:

    import os
    import re

    def getSeqInfo(fpath):
        folder, fname = os.path.split(fpath)
        match = re.compile(r"^(.*?)(\d+)\.(.*?)$").match(fname)
        if match is None:
            raise RuntimeError("Unable to find sequence number")
        baseName, sequenceNum, fileType = match.groups()
        numPad = len(sequenceNum)
        # escape the name parts so characters like '.' or '(' are matched literally
        seqPattern = re.compile(r"^%s\d{%d}\.%s$" % (re.escape(baseName), numPad, re.escape(fileType)))
        # fall back to the current directory when fpath has no folder part
        names = [ name for name in os.listdir(folder or ".") if seqPattern.match(name) ]
        if not names:
            raise RuntimeError("No matching file found")
        names.sort()
        numItems = len(names)
        firstItem = names[0]
        lastItem = names[-1]
        return [baseName, numPad, fileType, numItems, firstItem, lastItem]

    May not be perfect, but it’s only another version :)

    Comment by rotoglup — 2010/04/21 @ 12:14 AM

  2. Man, you’re awesome! I just bought my little regex cheat book, and this stuff is like Chinese to me. Do you know of a site that has regex tutorials based on solid scenarios? :D

    Comment by admin — 2010/04/25 @ 4:27 PM

  3. It often looks like Chinese to me too, when I look at a regexp some time after writing it! Sorry, I have no solid pointer to give… I only get by through sweat, pain, general knowledge of regexp principles, and the syntax reference in the Python docs. I limit myself to basic uses, as it can be painful to debug and re-read! Hang on, you’ll make it ;)

    Comment by rotoglup — 2010/05/03 @ 8:15 PM

  4. V.cool little snippet, throwing this into the farm control here :)

    Love Regex, got a couple of those generic cheat sheet pdfs. Takes a bit of getting into but once it’s in your script you wonder how you managed without it!

    Best

    TxRx
    ;)

    Comment by TxRx — 2010/05/31 @ 12:58 PM

  5. For me, this is a very helpful book:
    http://oreilly.com/catalog/9781565922570

    Comment by Marin Petrov — 2010/07/27 @ 3:45 AM
